Common Parse Patterns - r3n/rebol-wiki GitHub Wiki
Post your favorite Rebol 3 PARSE blocks here. (Note, these will not work in Rebol 2 PARSE.)
The examples shown below are meant for and tested with Rebol 3 only. Here are a few links to related documents:
Removes all extra lines from a text file (that is, allow only one empty line between paragraphs.)
parse text [some [thru lf [lf some remove lf | none]]]Explanation:
-
some - loops on the block while its result is true
-
thru lf - advances just past the next lf
- new block - is used with | none to make an optional match
- lf - matches to a second lf (extra line) or not (none)
- some - loops on the remove while it matches lf
- remove - removes if a match to lf is made
- | none - makes the block optional (always true)
-
thru lf - advances just past the next lf
Removes extra spaces from all text.
parse text [some [thru space remove any space]]Note that space indented lines still include a single space at head of the lines.
Explanation:
-
some - loops on the block while its result is true
- thru space - advances just past the next space
- remove - if what follows is true, remove what matched
- any - loops zero or more times on matching what follows
- space - matches against space character
wspace: charset " ^-" ; space and tab char
parse text [some [thru wspace remove any wspace]]However, for most text files, it's better to first convert tabs to spaces with detab.
parse detab text [some [thru space remove any space]]Makes a sequence of adjacent lines into a single long line:
parse doc [
some [
not lf thru lf
some [s: not lf (change back s #" ") thru lf]
|
thru lf
]
](If you have a better way, feel free to modify the above.)
Explanation:
-
some - loops on the block while its result is true
- not lf - true if the next char is not an lf
- thru lf - advance just past the next lf
-
some - loops on the block while its result is true
- s: - save the current parse position
- not lf - true if the next char is not an lf
- (code) - evaluate the expression in parens
- thru lf - advance just past the next lf
- | thru lf - if above failed, advance past next lf
Also, if you put this in a function, make s a local variable.
Same as above, but leaves space-indented sections (like code examples) as-is:
parse doc [
some [
some space ; indented, leave alone
thru lf
|
not lf ; not an empty line
thru lf
some [s: not lf (change back s space) thru lf]
|
thru lf
]
]Explanation:
-
some - loops on the block while its result is true
- some space - match 1 or more spaces (indentation)
- thru lf - advance past next lf
- | - if above failed, try alternative
- not lf - true if the next char is not an lf
- thru lf - advance just past the next lf
-
some - loops on the block while its result is true
- s: - save the current parse position
- not lf - true if the next char is not an lf
- (code) - evaluate the expression in parens
- thru lf - advance just past the next lf
- | thru lf - if above failed, advance past next lf
Also, if you put this in a function, make s a local variable.
open-tag: none
close-tag: none
rules: [
copy open-tag tag! (close-tag: to-tag append "/" to-string body-open-tag)
thru close-tag
]This can be extended to match to arbitrary depth using a stack for the open/close tag pairs.