Some conditional REBOL parse examples

August 24th, 2008

I am still improving my webcrawler, so I thought I would post some code to show you some more of REBOL’s parse functionality. I needed to parse something with this pattern (first as it appears on the page, then as HTML source):

Name » Janko
City » Sevnica

<div>Name &raquo; Janko<br/> City &raquo; Sevnica<br/></div>

I assigned the HTML to s-ok and at first parsed it with this code:

NAME: CITY: ""
parse s-ok [
	to "Name" thru "&raquo;" copy NAME to "<br/>"
	to "City" thru "&raquo;" copy CITY to "<br/>"
]
print NAME print CITY
; which printed:
 Janko
Sevnica

So it works, but sometimes the page I parse includes only City. With the code above, to "Name" then finds nothing, the rule stops there, and we fail to get the City out:

<div>City &raquo; Sevnica<br/></div>
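The failure can be reproduced with a minimal sketch like the one below. It assumes REBOL 2 and reuses the rule from the first example unchanged; only the input differs.

```rebol
REBOL []

s-2: "<div>City &raquo; Sevnica<br/></div>"

NAME: CITY: ""
parse s-2 [
	to "Name" thru "&raquo;" copy NAME to "<br/>"   ; TO "Name" finds nothing, so the rule stops here
	to "City" thru "&raquo;" copy CITY to "<br/>"   ; never reached
]
print empty? CITY   ; true: CITY still holds the initial empty string
```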

OPT (optional) in the code below will take care of this. But the HTML is also sloppily written, so I have seen these variations of it:

s-ok: "<div>Name &raquo; Janko<br/> City &raquo; Sevnica<br/></div>"
s-1: "<div>Name &raquo; Janko<br/></div>"
s-2: "<div>City &raquo; Sevnica<br/></div>"
s-br: "<div>Name &raquo; Janko<br/> City &raquo; Sevnica</div>"
s-2brr: "<div>City &raquo;&raquo; Sevnica</div>"

We modify our code to parse all of these and put it into a function:

do-parse-all: func [ s ] [
	NAME: CITY: ""
	parse s [
		OPT [ to "Name" thru "&raquo;" copy NAME to "<br/>" ]
		OPT [ to "City" SOME [ thru "&raquo;" ] copy CITY [ to "<br/>" | to "</div>" ] ]
	]
	print NAME print CITY
]
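For use inside the crawler it may be handier to get the values back instead of printing them. The following is only a sketch under a few assumptions: REBOL 2, a hypothetical function name extract-fields, local words instead of the global NAME and CITY, and TRIM to drop the space that follows "&raquo;". The parse rule itself is the same as in do-parse-all.

```rebol
REBOL []

; Hypothetical variant of do-parse-all: returns [name city] instead of printing.
extract-fields: func [s [string!] /local name city] [
	name: city: none
	parse s [
		; OPT lets each field be missing without stopping the whole rule
		opt [to "Name" thru "&raquo;" copy name to "<br/>"]
		; SOME skips one or more "&raquo;"; the alternative accepts either terminator
		opt [to "City" some [thru "&raquo;"] copy city [to "<br/>" | to "</div>"]]
	]
	reduce [
		either name [trim name] [copy ""]   ; a missing field becomes an empty string
		either city [trim city] [copy ""]
	]
]

probe extract-fields "<div>Name &raquo; Janko<br/> City &raquo;&raquo; Sevnica</div>"
; prints ["Janko" "Sevnica"]
```

Keeping name and city local means repeated calls cannot see values left over from a previous page.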

Now we go to the console and extract the info from all variants:

>> do-parse-all s-ok
 Janko
Sevnica
>> do-parse-all s-1
 Janko

>> do-parse-all s-2

Sevnica
>> do-parse-all s-br
 Janko
Sevnica
>> do-parse-all s-2brr

Sevnica
>>

BTW: The original parse code would correctly handle only s-ok and s-1.
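Why s-br is not on that list can be checked with a small sketch (REBOL 2 assumed; the words start as none here so that a missing value is easy to test for). The name is copied, but no "<br/>" follows the city, so the last TO fails and city is never set.

```rebol
REBOL []

s-br: "<div>Name &raquo; Janko<br/> City &raquo; Sevnica</div>"

name: city: none
parse s-br [
	to "Name" thru "&raquo;" copy name to "<br/>"
	to "City" thru "&raquo;" copy city to "<br/>"   ; no "<br/>" after the city, so TO fails
]
print none? name   ; false: the name was copied before the rule failed
print none? city   ; true: the failed TO left city unset
```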

2 Responses to “Some conditional REBOL parse examples”

  1. BarryO Says:

    “OPT”
    Never came across OPT-ional before, guess I need to get re-reading some of the docs again. I have been using a [ “abc” | “def” | none ] format, which looks like it does the same, but OPT looks much cleaner. I'd guess there are a few more gems hidden in R3, if/when it is released.

  2. janko Says:

    I found out about “OPT” while looking at this: http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse#Multiple_Values_in_a_Block

    I am still using R2 for this, so I have to admit I have no idea what parse is like in R3.
