Regular Expressions - CameronAuler/python-devops GitHub Wiki

Regular expressions (regex) are used for pattern matching and text processing in Python. The re module provides powerful tools for searching, extracting, and replacing text using regex patterns.

Regular Expressions (re Module)
Basic Pattern Matching
String Manipulation (re.split() & re.sub())
- re.sub() (Replacing Text)
- re.split() (Splitting Strings)
Regex Patterns

`re` Module

The `re` module is part of Python's standard library and provides regular expression support, so nothing needs to be installed.

Import `re`

import re

Basic Pattern Matching

The re.search(), re.match(), and re.findall() functions are used for pattern matching.

`re.search()` (Find First Match Anywhere)

re.search() scans the whole string and returns a match object for the first match it finds, or None if there is no match. It is mainly used for extracting specific patterns like phone numbers, emails, or dates.

import re

text = "Hello, my number is 123-456-7890."
match = re.search(r"\d{3}-\d{3}-\d{4}", text)

if match:
    print("Phone number found:", match.group())

# Output:
# Phone number found: 123-456-7890
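The match object returned by re.search() carries more than the matched text. The sketch below reuses the phone number example and adds capture groups (the parentheses are an addition for illustration, not part of the original pattern) to show a few commonly used match object methods.

```python
import re

text = "Hello, my number is 123-456-7890."

# Parentheses create capture groups for the three parts of the number.
match = re.search(r"(\d{3})-(\d{3})-(\d{4})", text)

if match:
    whole = match.group(0)      # the entire match
    area_code = match.group(1)  # the first capture group
    parts = match.groups()      # all capture groups as a tuple
    position = match.span()     # (start, end) indices of the match in text
    print(whole, area_code, parts, position)

# When nothing matches, re.search() returns None, so check before calling .group().
missing = re.search(r"\d{3}-\d{3}-\d{4}", "no digits here")
print(missing is None)
```

Calling .group() on None raises AttributeError, which is why the `if match:` check appears in every example on this page.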

`re.match()` (Match Only at the Beginning)

re.match() only matches if the pattern is at the start of the string. It is mainly used for checking whether a string starts with a specific pattern.

import re

text = "123-456-7890 is my number."
match = re.match(r"\d{3}-\d{3}-\d{4}", text)

if match:
    print("Match found:", match.group())

# Output:
# Match found: 123-456-7890
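The difference between re.match() and re.search() is easiest to see when the pattern sits in the middle of the string. The sketch below also includes re.fullmatch(), which this page does not otherwise cover, for the case where the entire string must match. The sample strings are made up for illustration.

```python
import re

pattern = r"\d{3}-\d{3}-\d{4}"
text = "Call 123-456-7890 today."

at_start = re.match(pattern, text)    # None: the string starts with "Call"
anywhere = re.search(pattern, text)   # finds the number in the middle

# re.fullmatch() succeeds only if the pattern covers the entire string.
exact = re.fullmatch(pattern, "123-456-7890")
trailing = re.fullmatch(pattern, "123-456-7890 ext. 5")  # None: extra text

print(at_start, anywhere.group(), exact.group(), trailing)
```

For input validation, re.fullmatch() is usually the safer choice, since re.match() accepts any string that merely begins with a valid value.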

`re.findall()` (Find All Matches)

re.findall() returns all non-overlapping matches of the pattern as a list. It is mainly used for pulling every occurrence of something, such as all email addresses, out of a block of text.

import re

text = "Emails: alice@example.com, bob@example.org"  # placeholder addresses
emails = re.findall(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b", text)

print("Emails found:", emails)

# Output:
# Emails found: ['alice@example.com', 'bob@example.org']
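One detail of re.findall() that often surprises people is that its return shape depends on capture groups: with no groups it returns whole matches, and with two or more groups it returns tuples. The sketch below uses a made-up log line to show this, along with re.finditer(), which yields match objects when positions are also needed.

```python
import re

# Hypothetical log text, for illustration only.
log = "2024-01-15 ERROR disk full; 2024-01-16 WARN high load"

# No capture groups: a list of matched strings.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", log)

# Two capture groups: a list of (date, level) tuples.
entries = re.findall(r"(\d{4}-\d{2}-\d{2}) (ERROR|WARN)", log)

# re.finditer() yields match objects, so start positions are available.
positions = [(m.group(), m.start()) for m in re.finditer(r"ERROR|WARN", log)]

print(dates)
print(entries)
print(positions)
```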

String Manipulation (`re.split()` & `re.sub()`)

`re.sub()` (Replacing Text)

re.sub(pattern, replacement, string) returns a new string with every match replaced; the original string is left unchanged. It is mainly used for masking sensitive information like phone numbers or emails.

import re

text = "My phone is 123-456-7890."
new_text = re.sub(r"\d{3}-\d{3}-\d{4}", "XXX-XXX-XXXX", text)

print(new_text)

# Output:
# My phone is XXX-XXX-XXXX.
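re.sub() also accepts backreferences, a replacement function, and a `count` limit. The sketch below shows each on a made-up sentence; the helper name `keep_area_code` is hypothetical and chosen only for this example.

```python
import re

text = "Call 123-456-7890 or 987-654-3210."

# \1 in the replacement inserts capture group 1, keeping the last four digits.
masked = re.sub(r"\d{3}-\d{3}-(\d{4})", r"XXX-XXX-\1", text)

# count=1 replaces only the first match.
first_only = re.sub(r"\d{3}-\d{3}-\d{4}", "XXX-XXX-XXXX", text, count=1)

# The replacement can be a function that receives each match object.
def keep_area_code(m):
    return m.group(1) + "-XXX-XXXX"

by_function = re.sub(r"(\d{3})-\d{3}-\d{4}", keep_area_code, text)

print(masked)
print(first_only)
print(by_function)
```

A function replacement is the more flexible option when the new text depends on the matched value, at the cost of a little extra code.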

`re.split()` (Splitting Strings)

re.split(pattern, string) splits a string based on a regex pattern. It is mainly used for tokenizing text based on multiple delimiters.

import re

text = "apple, orange; banana | grape"
words = re.split(r"[,\s;|]+", text)  # Splitting on commas, spaces, semicolons, or pipes

print(words)

# Output:
# ['apple', 'orange', 'banana', 'grape']
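A few behaviors of re.split() are worth knowing beyond the basic case: `maxsplit` limits the number of splits, a capture group keeps the delimiters in the result, and a delimiter at the start of the string produces an empty first element. The sketch below reuses the fruit example to show all three.

```python
import re

text = "apple, orange; banana | grape"

# maxsplit=1 stops after the first split.
first_two = re.split(r"[,\s;|]+", text, maxsplit=1)

# A capture group around the delimiter keeps it in the output list.
with_delims = re.split(r"\s*([,;|])\s*", text)

# A leading delimiter yields an empty string, which can be filtered out.
leading = re.split(r"[,\s;|]+", ", apple, orange")
tokens = [t for t in leading if t]

print(first_two)
print(with_delims)
print(leading, tokens)
```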

Regex Patterns

Pattern	Description	Example Match
`\d`	Matches any digit (`0-9`)	`"123"` → `1`, `2`, `3`
`\D`	Matches non-digits	`"A1B2"` → `A`, `B`
`\w`	Matches letters, digits, and `_`	`"Hello_123"` → `Hello_123`
`\W`	Matches non-word characters	`"Hello@123"` → `@`
`\s`	Matches whitespace (spaces, tabs, newlines)	`"Hello World"` → `" "`
`\S`	Matches non-whitespace	`"Hello World"` → `"Hello", "World"` (with `\S+`)
`^`	Matches start of string	`"^Hello"` → `"Hello world"`
`$`	Matches end of string (or just before a trailing newline)	`"world!$"` → `"Hello world!"`
`.`	Matches any character except newline	`"abc"` → `a`, `b`, `c`
`*`	Matches 0 or more repetitions	`"ab*"` → `"a", "ab", "abb"`
`+`	Matches 1 or more repetitions	`"ab+"` → `"ab", "abb"`
`?`	Matches 0 or 1 occurrence	`"ab?"` → `"a", "ab"`
`{n}`	Matches exactly n times	`"\d{3}"` → `"123"`
`{n,}`	Matches at least n times	`"\d{2,}"` → `"12", "123"`
`{n,m}`	Matches between n and m times	`"\d{2,4}"` → `"12", "123", "1234"`
`|`	OR operator	`"cat|dog"` → `"cat"` or `"dog"`
`()`	Groups patterns	`"(ab)+"` → `"ab", "abab"`
`\b`	Matches a word boundary	`"\bword\b"` → `"word"` (but not `"wording"`)
`\B`	Matches non-word boundaries	`"\Bing"` → Matches `"wording"` but not `"ing"`
`\A`	Matches start of the string	`"\AHello"` → `"Hello world"`
`\Z`	Matches end of the string	`"world\Z"` → `"Hello world"`
`\G`	Matches position where last match ended	Not supported by Python's `re` module; use `re.finditer()` for iterative matching
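Several rows in the table are easier to grasp when run. The sketch below exercises word boundaries, quantifiers, alternation, grouping, and the end-of-string anchors on made-up sample strings; the variable names are chosen only for this example.

```python
import re

# Word boundaries: \b requires the edge of a word, \B forbids one.
whole_words = re.findall(r"\bword\b", "word wording sword")
inside_word = re.search(r"\Bing", "wording")   # "ing" inside a word
standalone = re.search(r"\Bing", "ing")        # None: "ing" starts at a boundary

# Quantifiers are greedy: {2,4} takes up to four digits when available.
digit_runs = re.findall(r"\d{2,4}", "1 12 123 12345")

# Alternation and a repeated group.
animals = re.findall(r"cat|dog", "cat, dog, bird")
repeated = re.fullmatch(r"(ab)+", "ababab")

# $ tolerates one trailing newline, \Z matches only at the very end.
dollar = re.search(r"world$", "Hello world\n")
strict_end = re.search(r"world\Z", "Hello world\n")

print(whole_words, inside_word.start(), standalone)
print(digit_runs, animals, repeated.group())
print(dollar is not None, strict_end is None)
```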
