Home - michellelally/regular-expression-matching GitHub Wiki
Welcome to the Graph-Theory-Project wiki!
Regular Expression Matching with Python
This project matches strings of text against a regular expression by building a non-deterministic finite automaton (NFA) for the expression. It combines several algorithms that can:
- Understand expressions in both infix and postfix notation and translate one to the other
- Read regular expressions and identify the special characters and their meanings
- Build an NFA from a postfix notation expression
- Match a string against the NFA built from the regular expression
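As an illustration of the infix-to-postfix translation, here is a minimal sketch based on the shunting-yard algorithm. The function name `infix_to_postfix`, the precedence values, and the support for grouping brackets are assumptions made for this example and are not taken from the project's code. The sketch also assumes that concatenation is already written explicitly as '.' and that any brackets are balanced.

```python
# Operator precedence is an assumption for this sketch: the unary
# operators bind tightest, then concatenation, then alternation.
PRECEDENCE = {"*": 3, "+": 3, "?": 3, ".": 2, "|": 1}


def infix_to_postfix(infix):
    """Convert an infix regular expression to postfix notation."""
    output = []  # postfix expression built so far
    stack = []   # operators and open brackets waiting to be emitted
    for ch in infix:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            # Emit everything back to the matching open bracket.
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "(" itself
        elif ch in PRECEDENCE:
            # Emit waiting operators of equal or higher precedence first.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]):
                output.append(stack.pop())
            stack.append(ch)
        else:
            output.append(ch)  # literal character
    # Emit whatever operators are still waiting.
    while stack:
        output.append(stack.pop())
    return "".join(output)


print(infix_to_postfix("a.b|c"))     # ab.c|
print(infix_to_postfix("(a|b)*.c"))  # ab|*c.
```

In postfix form every operator follows its operands, so an NFA can be built in a single left-to-right pass with a stack of partial automata.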
Steps:
- The user enters a single expression, or a list of expressions, in infix notation to match against a string of text. An expression may include any of the following characters:
  - '.' Concatenation: the two characters appear one after the other
  - '|' Alternation: one character or the other
  - '*' Zero or more of the character
  - '?' Zero or one of the character
  - '+' One or more of the character
  - 'a-z, A-Z, 0-9' Literal characters: the characters that appear in the strings
- The user then enters the string, or list of strings, to check the expression against. These strings can contain numbers or letters. The user should avoid using any other characters, as they may conflict with the special characters.
- Once these have been entered, the user runs the program, which outputs True or False for whether each string matched the expression, along with a list of all expressions and strings.
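To make the meaning of each operator concrete, the sketch below checks a few expressions against a few strings. It uses Python's built-in `re` module as a stand-in, not the project's NFA matcher. The helper names `to_python_regex` and `check_all` are hypothetical, and two things are assumed: that a match means the whole string matches, and that every expression is checked against every string.

```python
import re


def to_python_regex(expression):
    """Translate the notation above into Python `re` syntax.

    Assumption for this sketch: the only difference that matters is the
    explicit concatenation operator '.', which `re` would treat as a
    wildcard, so it is simply dropped.
    """
    return expression.replace(".", "")


def check_all(expressions, strings):
    """Return (expression, string, matched) for every combination."""
    results = []
    for expression in expressions:
        pattern = re.compile(to_python_regex(expression))
        for string in strings:
            # fullmatch: the whole string must match, not just a prefix.
            matched = pattern.fullmatch(string) is not None
            results.append((expression, string, matched))
    return results


expressions = ["a.b|c", "a*", "a?.b", "a+"]
strings = ["ab", "c", "", "aaa", "b"]
for expression, string, matched in check_all(expressions, strings):
    print(expression, repr(string), matched)
```

For example, 'a*' matches both the empty string and "aaa", while 'a+' matches "aaa" but not the empty string.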
Notes:
- The concatenation character does not have to be included in an expression; the program treats it as implied. It can still be written explicitly if the user prefers.
- Please keep expressions short for the moment. As described in the Issues section, the program may hit a recursion bug and enter an infinite loop. If this happens, Ctrl+C stops execution, and the expression that caused the bug should be altered or removed.
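One way to handle implied concatenation is to insert the '.' operator in a pre-processing pass before the expression is converted to postfix. The sketch below shows that idea. The function name `add_concatenation` and the handling of grouping brackets are assumptions for illustration, and the project may resolve implied concatenation differently.

```python
def add_concatenation(expression):
    """Insert an explicit '.' wherever concatenation is implied."""
    result = []
    for i, ch in enumerate(expression):
        result.append(ch)
        if i + 1 == len(expression):
            break  # nothing follows the last character
        nxt = expression[i + 1]
        # A '.' belongs between two symbols when the left one can end an
        # operand and the right one can start an operand.
        ends_operand = ch not in "(|."
        starts_operand = nxt not in ")|.*+?"
        if ends_operand and starts_operand:
            result.append(".")
    return "".join(result)


print(add_concatenation("ab|c?d"))  # a.b|c?.d
print(add_concatenation("a.b"))     # a.b (already explicit, unchanged)
```

Because an expression that already contains '.' passes through unchanged, both the implicit and the explicit style can be accepted.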
**How it's done**