Regular Expressions (regex) - robbiehume/CS-Notes GitHub Wiki

Links

Notes

  • .: any single character (in most engines, any character except a newline by default)
  • \.: period character
  • [abc]: any one of a, b, or c
  • [^abc]: any one character except a, b, or c
  • \d: any digit
  • \D: any non-digit character
  • [a-z]: any one character from a to z; any range works, e.g. [a-c]
  • [0-9]: any one digit from 0 to 9; any range works, e.g. [1-3]
  • \w: any word character (letter, digit, or underscore)
  • \W: any non-word character
  • {m}: exactly m repetitions; a{3} matches the character a exactly 3 times
    • Ex. [abc]{5}: 5 characters, each of which can be an a, b, or c
  • {m,n}: m to n repetitions; a{1,3} matches the character a at least once and at most 3 times
    • Ex. .{2,6}: between 2 and 6 of any character
  • *: 0 or more repetitions
    • Ex. .*: 0 or more of any character
  • +: 1 or more repetitions
    • Ex. [abc]+: 1 or more of any a, b, or c characters
  • ?: makes the preceding character or group optional (0 or 1 occurrence)
    • Ex. ab?c matches either "abc" or "ac" because the b is optional
  • \s: any whitespace character (space, tab, newline, etc.)
    • Ex. \s*: any amount of whitespace (including none)
  • \S: any non-whitespace character
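  • \b: word boundary; matches the position between a word character and a non-word character, without consuming any text, so it helps find words starting or ending with certain characters
    • Vim search uses \< (start of word) and \> (end of word) instead
    • Ex. \b[Aa]uth finds any word starting with "Auth" or "auth"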
  • ^: starts with; different from the ^ in [^abc], which negates the character class
    • Ex. ^Hello: only lines that start "Hello"
  • $: ends with
    • Ex. Bye\.$: only lines that end with "Bye."
  • Can combine ^ and $: ^mission: successful$ matches only lines that are exactly "mission: successful"
  • (): parentheses create match (capture) groups to extract info from a string
    • Ex. ^(IMG\d+)\.png$ captures only the image name, without the .png extension
  • Nested groups: extract multiple layers of info
    • Ex. (\w+ (\d+)) turns "Jan 1999" into 2 groups: "Jan 1999" (outer) and "1999" (inner)
  • |: can use a pipe to specify multiple different options
    • Ex: (cats|dogs)$ matches "I love cats" and "I love dogs"
  • (?i): case-insensitive mode; add it to beginning of regex
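
The notes above can be tried out with a short script. The following is a minimal sketch using Python's standard `re` module; the sample strings (such as "IMG0042.png") and the variable names are made up for illustration, and other regex engines may differ in details such as inline flags.

```python
import re

# Quantifiers: [abc]{5} needs exactly five characters drawn from a, b, c.
five_abc = re.fullmatch(r"[abc]{5}", "abcab") is not None

# Optional element: ab?c accepts both "abc" and "ac", but not "abbc".
optional_b = [s for s in ["abc", "ac", "abbc"] if re.fullmatch(r"ab?c", s)]

# Whitespace: \s+ splits on runs of spaces, tabs, and newlines.
words = re.split(r"\s+", "one  two\tthree\nfour")

# Word boundary: \b[Aa]uth only matches at the start of a word,
# so "unauthorized" is skipped.
auth_words = re.findall(r"\b[Aa]uth\w*", "Auth failed: unauthorized author")

# Anchors: ^ and $ together require the whole string to match.
exact = re.search(r"^mission: successful$", "mission: successful") is not None
partial = re.search(r"^mission: successful$", "last mission: successful") is not None

# Capture group: grab the image name without the extension.
image = re.search(r"^(IMG\d+)\.png$", "IMG0042.png").group(1)

# Nested groups: the outer group is number 1, the inner group is number 2.
date = re.search(r"(\w+ (\d+))", "Jan 1999")
outer, inner = date.group(1), date.group(2)

# Alternation: (cats|dogs)$ matches either word at the end of the string.
sentences = ["I love cats", "I love dogs", "I love birds"]
pets = [s for s in sentences if re.search(r"(cats|dogs)$", s)]

# Case-insensitive mode: (?i) goes at the very start of the pattern.
hello = re.search(r"(?i)^hello", "HELLO world") is not None

print(optional_b, words, auth_words, image, outer, inner, pets)
```

Raw strings (`r"..."`) keep Python from interpreting backslashes before the regex engine sees them, which matters for sequences like `\b` and `\d`.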