Archive for the 'rebol' Category

Some conditional REBOL parse examples

Sunday, August 24th, 2008

I am still improving my web crawler, so I thought I would post some code that shows more of REBOL’s parse functionality. I needed to parse something with this pattern:

Name » Janko
City » Sevnica

<div>Name &raquo; Janko<br/> City &raquo; Sevnica<br/></div>

I assigned it to s-ok and parsed it with this code at first:

NAME: CITY: ""
parse s-ok [
	to "Name" thru "&raquo;" copy NAME to "<br/>"
	to "City" thru "&raquo;" copy CITY to "<br/>"
]
print NAME print CITY
; and this got printed:
 Janko
Sevnica
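
The rule above relies on the difference between TO and THRU, so here is a minimal sketch of just that part. The variable names s and rest are only for illustration and are not part of the crawler code.

```rebol
s: "Name &raquo; Janko<br/>"

; TO stops just before the match, so the match itself is still ahead of us
parse s [to "&raquo;" copy rest to end]
print mold rest    ; "&raquo; Janko<br/>"

; THRU moves past the match, so COPY starts right after it
parse s [thru "&raquo;" copy rest to "<br/>"]
print mold rest    ; " Janko"
```

This is also where the leading space in the printed name comes from: COPY keeps everything between "&raquo;" and "<br/>", including the space.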

So it works, but sometimes the page I parse includes only City. The code above then fails to get the City out, because the rule stops at to "Name" and never reaches the City part:

<div>City &raquo; Sevnica<br/></div>

OPT (optional) in the code below takes care of this. But the HTML is also sloppily written, so I have seen these variations of it:

s-ok: "<div>Name &raquo; Janko<br/> City &raquo; Sevnica<br/></div>"
s-1: "<div>Name &raquo; Janko<br/></div>"
s-2: "<div>City &raquo; Sevnica<br/></div>"
s-br: "<div>Name &raquo; Janko<br/> City &raquo; Sevnica</div>"
s-2brr: "<div>City &raquo;&raquo; Sevnica</div>"

We modify our code to parse all of these and put it into a function:

do-parse-all: func [ s ] [
	NAME: CITY: ""
	parse s [
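		; Name is optional: if "Name" is missing, OPT lets parse carry on
		OPT [ to "Name" thru "&raquo;" copy NAME to "<br/>" ]
		; City: skip one or more "&raquo;", then copy up to "<br/>" or, failing that, "</div>"
		OPT [ to "City" SOME [ thru "&raquo;" ] copy CITY [ to "<br/>" | to "</div>" ] ]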
	]
	print NAME print CITY
]

Now we go to the console and extract the info from all the variants:

>> do-parse-all s-ok
 Janko
Sevnica
>> do-parse-all s-1
 Janko

>> do-parse-all s-2

Sevnica
>> do-parse-all s-br
 Janko
Sevnica
>> do-parse-all s-2brr

Sevnica
>>

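BTW: The original parse would still extract the right values from s-ok and s-1. With s-1 the rule fails at to "City", but NAME has already been copied by then.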
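
If you want to use the values instead of just printing them, the same rules can be wrapped a bit differently. This is only a sketch: the name extract-fields, the local words, the trimming and the returned block are my assumptions here, not what the crawler actually does.

```rebol
; Hypothetical variant of do-parse-all: returns [name city] instead of printing
extract-fields: func [s [string!] /local name city] [
	; Fresh strings on every call, kept local to the function
	name: copy ""
	city: copy ""
	parse s [
		opt [to "Name" thru "&raquo;" copy name to "<br/>"]
		opt [to "City" some [thru "&raquo;"] copy city [to "<br/>" | to "</div>"]]
	]
	; COPY keeps the space after "&raquo;", so trim it off.
	; ANY guards against COPY leaving none behind for an empty match.
	reduce [trim any [name copy ""] trim any [city copy ""]]
]

probe extract-fields "<div>Name &raquo; Janko<br/> City &raquo; Sevnica</div>"
probe extract-fields "<div>City &raquo;&raquo; Sevnica</div>"
```

A missing field comes back as an empty string, so the caller can check it with empty? before storing anything.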

Simple send-to-all TCP server in LUA

Friday, July 25th, 2008

Make a quick html data extractor with REBOL (code)

Monday, July 21st, 2008

Make a quick html data extractor with REBOL (video)

Saturday, July 19th, 2008

OMG, OMG REBOL 3 is here!?

Wednesday, January 9th, 2008

Tiny half-evil REBOL script

Wednesday, October 31st, 2007

p. languages: REBOL the internet console

Thursday, September 13th, 2007