5.9 Regular Expressions - naver/lispe GitHub Wiki

Regular Expressions

LispE provides two kinds of regular expressions: traditional POSIX expressions and a pattern syntax designed specifically for LispE.

Posix Regular Expressions

(deflib prgx (exp (str)) Creates a POSIX regular expression from a string.)
(deflib prgx_find (exp str (pos 0)) Searches 'str', starting at 'pos', for the first sub-string that matches the POSIX regular expression.)
(deflib prgx_findall (exp str (pos 0)) Searches 'str', starting at 'pos', for all the sub-strings that match the POSIX regular expression.)
(deflib prgx_find_i (exp str (pos 0)) Returns the positions in 'str' of the first sub-string that matches the POSIX regular expression, starting at 'pos'.)
(deflib prgx_findall_i (exp str (pos 0)) Returns the positions in 'str' of all the sub-strings that match the POSIX regular expression, starting at 'pos'.)
(deflib prgx_match (exp str) Checks whether 'str' matches the POSIX regular expression.)
(deflib prgx_replace (exp str rep) Replaces with 'rep' the parts of 'str' that match the POSIX regular expression.)
(deflib prgx_split (exp str) Splits 'str' using the POSIX regular expression.)

Examples


; We create our regular expression
; Backquotes are preferable here: they avoid having to double the \
(setq r (prgx `\w+`))
true

; We're checking to see if it's a match
(prgx_match r "ABCD")
true

; We are looking for the first matching string.
(prgx_find r "This is a test: 123A here")
This


; We're looking for all occurrences
(prgx_findall r "This is a test: 123A and 45T and 67U here")
("This" "is" "a" "test" "123A" "and" "45T" "and" "67U" "here")
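
The other functions in this family are called the same way. The lines below are a minimal sketch based only on the signatures listed above. The pattern and the sample string are made up for illustration, and results are deliberately not shown, since this page does not describe their exact form (for example, how positions are reported).

```lisp
; Illustrative sketch: the pattern and the sample string are assumptions
(setq r (prgx `[0-9]+`))
(setq txt "order 12, then 345")

; Positions of the first match
(prgx_find_i r txt)

; Positions of all the matches
(prgx_findall_i r txt)

; Same search as prgx_find, but starting at position 9 (optional 'pos' argument)
(prgx_find r txt 9)

; Replace the matched text with "#"
(prgx_replace r txt "#")

; Split the string using the expression
(prgx_split r txt)
```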

LispE Regular Expressions

(deflib rgx (exp (str)) Creates a regular expression from a string.)
(deflib rgx_find (exp str (pos 0)) Searches 'str', starting at 'pos', for the first sub-string that matches the regular expression.)
(deflib rgx_findall (exp str (pos 0)) Searches 'str', starting at 'pos', for all the sub-strings that match the regular expression.)
(deflib rgx_find_i (exp str (pos 0)) Returns the positions in 'str' of the first sub-string that matches the regular expression, starting at 'pos'.)
(deflib rgx_findall_i (exp str (pos 0)) Returns the positions in 'str' of all the sub-strings that match the regular expression, starting at 'pos'.)
(deflib rgx_match (exp str) Checks whether 'str' matches the regular expression.)
(deflib rgx_replace (exp str rep) Replaces with 'rep' the parts of 'str' that match the regular expression.)
(deflib rgx_split (exp str) Splits 'str' using the regular expression.)

Description of regular expressions

       - %d is any digit
       - %x is a hexadecimal digit (abcdef0123456789ABCDEF)
       - %p represents any punctuation
       - %c represents any lowercase letter
       - %C is any uppercase letter
       - %a represents any letter
       - %h is a Greek letter
       - %H is an Asian character (Chinese, Korean or Japanese)
       - ? represents any character
       - %? is the character "?" itself
       - %% is the "%" character itself
       - %s represents any space character, including the non-breaking space
       - %r is the carriage return
       - %n represents a non-breaking space
       - ~ negation
       - \x escape character
       - \ddd character code written as 3 digits
       - \xFFFF character code written as exactly 4 hexadecimal digits
       - {...} character disjunction
       - character sequence
       - {a-z} between a and z included
       - ^ the expression must start at the beginning of the string
       - $ the expression must match to the end of the string

   Examples:
       - dog%c matches dogs or dogg
       - m%d matches m0, m1,...,m9
       - {%dab} matches 1, a, 2, b
       - {%dab}+ matches 1111a, a22a90ab
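
To show how these operators combine, here is a short sketch that builds a few patterns from the table above and passes them to the `rgx` functions. The patterns and sample strings are invented for illustration, `+` is used as in the examples above, and results are not shown.

```lisp
; Illustrative sketch: patterns and sample strings are assumptions

; An uppercase letter followed by lowercase letters, anchored at the start of the string
(setq cap (rgx "^%C%c+"))
(rgx_find cap "Paris is a city")

; One or more characters in the range a to z
(setq low (rgx "{a-z}+"))
(rgx_findall low "abc DEF ghi")

; One or more letters followed by a literal question mark
(setq q (rgx "%a+%?"))
(rgx_find q "Ready? Go")
```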

Examples


; We create our regular expression
(setq r (rgx "%d+%C"))
true

; We're checking to see if it's a match
(rgx_match r "123A")
true

; We are looking for the first matching string.
(rgx_find r "This is a test: 123A here")
123A

; We're looking for all occurrences
(rgx_findall r "This is a test: 123A and 45T and 67U here")
("123A" "45T" "67U")
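
The position, replacement, and split functions can be tried on the same expression. This is a minimal sketch that relies only on the signatures listed earlier in this section. Results are omitted because this page does not describe their exact form.

```lisp
; Illustrative sketch: same expression and sample string as in the examples above
(setq r (rgx "%d+%C"))
(setq txt "This is a test: 123A and 45T and 67U here")

; Positions of the first match
(rgx_find_i r txt)

; Positions of all the matches
(rgx_findall_i r txt)

; Search again, starting at position 20 (optional 'pos' argument)
(rgx_find r txt 20)

; Replace the matched text with "X"
(rgx_replace r txt "X")

; Split the string using the expression
(rgx_split r txt)
```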