9 ‐ REGREX - CloudScope/DevOpsWithCloudScope GitHub Wiki
Regex (regular expressions) in Linux is a powerful tool for searching, matching, and manipulating text. It is widely used in commands like `grep`, `sed`, `awk`, `find`, and many more. Here's an overview of the basics and some advanced concepts of regex in Linux:
Basic Regex Elements
- Literals: Characters that match themselves, e.g., `a` matches the character 'a'.
- Metacharacters: Special characters with specific meanings:
  - `.`: Matches any single character except a newline.
  - `^`: Anchors the pattern to the start of the line.
  - `$`: Anchors the pattern to the end of the line.
  - `\`: Escapes a metacharacter to treat it as a literal.
  - `[]`: Bracket expression; matches any single character inside the brackets, e.g., `[abc]` matches 'a', 'b', or 'c'.
  - `[^]`: Matches any single character not inside the brackets, e.g., `[^abc]` matches any character except 'a', 'b', or 'c'.
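- Quantifiers: Specify the number of occurrences.
  - `*`: Matches 0 or more occurrences of the preceding element.
  - `+`: Matches 1 or more occurrences of the preceding element.
  - `?`: Matches 0 or 1 occurrence of the preceding element.
  - `{n}`: Matches exactly n occurrences.
  - `{n,}`: Matches n or more occurrences.
  - `{n,m}`: Matches between n and m occurrences.
- Grouping and Alternation:
  - `()`: Groups patterns together, e.g., `(abc)` treats 'abc' as a single unit.
  - `|`: Alternation, works like a logical OR, e.g., `a|b` matches 'a' or 'b'.
- Note: `+`, `?`, `{}`, `()`, and `|` have these meanings as written only in extended regex (`grep -E`, `sed -E`, `awk`). In basic regex, the default for `grep` and `sed`, they are literal characters unless backslash-escaped (`\{n\}` and `\(...\)` are POSIX; `\+`, `\?`, and `\|` are GNU extensions).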
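The basic elements above can be tried out with a short sketch. The word list is made-up sample data, and `grep -E` is used so that `?`, `{}`, `()`, and `|` work without backslashes.

```shell
# Sample input: one word per line (illustrative data, not from the original page).
words=$(printf '%s\n' cat cot coat ct dog 'c.t')

# '.' matches any single character, so c.t matches cat, cot, and the literal "c.t".
dot_matches=$(printf '%s\n' "$words" | grep -E '^c.t$')

# '\.' matches only a literal dot.
literal_dot=$(printf '%s\n' "$words" | grep -E '^c\.t$')

# '[ao]' matches one character from the set; '?' makes it optional.
optional_vowel=$(printf '%s\n' "$words" | grep -E '^c[ao]?t$')

# '(oa|a)' groups two alternatives; '^' and '$' anchor the whole line.
grouped=$(printf '%s\n' "$words" | grep -E '^c(oa|a)t$')

# '[^c]{3}' requires exactly three characters, none of which is 'c'.
three_non_c=$(printf '%s\n' "$words" | grep -E '^[^c]{3}$')

printf '%s\n' "$dot_matches" "$literal_dot" "$optional_vowel" "$grouped" "$three_non_c"
```

Anchoring every pattern with `^` and `$` keeps the results easy to reason about, because each pattern has to account for the whole line rather than any substring of it.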
Advanced Regex Concepts
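- Character Classes:
  - `\d`: Matches any digit (equivalent to `[0-9]`).
  - `\D`: Matches any non-digit.
  - `\w`: Matches any word character (alphanumeric + underscore).
  - `\W`: Matches any non-word character.
  - `\s`: Matches any whitespace character (space, tab, newline).
  - `\S`: Matches any non-whitespace character.
  - Note: `\d` and `\D` are Perl-style (PCRE) shorthands and are not part of POSIX basic or extended regex. With GNU grep, use `grep -P`, or write `[0-9]` or `[[:digit:]]` instead. GNU grep and GNU sed accept `\w`, `\W`, `\s`, and `\S` as extensions.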
- Anchors and Boundaries:
  - `\b`: Matches a word boundary (position between a word and a non-word character).
  - `\B`: Matches a position that is not a word boundary.
- Lookahead and Lookbehind:
  - Lookahead: `(?=...)` ensures that the following text matches the expression inside the lookahead.
  - Negative Lookahead: `(?!...)` ensures that the following text does not match the expression inside.
  - Lookbehind: `(?<=...)` ensures that the preceding text matches the expression inside the lookbehind.
  - Negative Lookbehind: `(?<!...)` ensures that the preceding text does not match the expression inside.
  - Note: Lookarounds are a PCRE feature. Of the commands covered on this page, only GNU `grep -P` supports them; `grep -E`, `sed`, `awk`, and `find` do not.
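The sketch below shows the advanced constructs in action. It assumes GNU grep built with PCRE support (the `-P` option), and the sample line is invented for illustration. The first command uses the POSIX class `[[:digit:]]`, which also works where `-P` is unavailable.

```shell
# Illustrative log line (made up for this sketch).
line='user=alice id=42 price=$300 cost=150USD'

# Portable: POSIX character class instead of \d.
digits_posix=$(printf '%s\n' "$line" | grep -oE '[[:digit:]]+')

# GNU grep with PCRE: \d is available under -P.
digits_pcre=$(printf '%s\n' "$line" | grep -oP '\d+')

# Lookbehind: digits preceded by "id=" (the prefix is not part of the match).
id_value=$(printf '%s\n' "$line" | grep -oP '(?<=id=)\d+')

# Lookahead: digits followed by "USD" (the suffix is not part of the match).
usd_amount=$(printf '%s\n' "$line" | grep -oP '\d+(?=USD)')

# Negative lookbehind: digit runs not preceded by '$' or by another digit.
not_dollar=$(printf '%s\n' "$line" | grep -oP '(?<![$\d])\d+')

# Word boundary: "id" as a whole word, not inside "idle" or "valid".
whole_word=$(printf 'id idle valid\n' | grep -o '\bid\b')

printf '%s\n' "$digits_posix" "$id_value" "$usd_amount" "$not_dollar" "$whole_word"
```

The `-o` option prints only the matched text, which makes it visible that lookarounds check their context without consuming it.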
Common Commands Using Regex in Linux
- grep:
  - Basic usage: `grep 'pattern' file.txt`
  - Recursive search: `grep -r 'pattern' /path/to/directory`
  - Use extended regex: `grep -E 'pattern1|pattern2' file.txt`
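- sed:
  - Search and replace: `sed 's/old/new/' file.txt` (replaces the first match on each line; append `g` to replace all matches)
  - Use regex groups: `sed 's/\(pattern1\)/replacement/' file.txt` (the text captured by the group can be reused in the replacement as `\1`)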
- awk:
  - Pattern matching: `awk '/pattern/ {print $0}' file.txt`
  - Use regex in conditions: `awk '$1 ~ /pattern/' file.txt`
- find:
  - Find files with regex: `find /path -regex '.*pattern.*'` (the regex is matched against the whole path, not just the file name, hence the leading and trailing `.*`)
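To see the four commands side by side, here is a minimal sketch that runs each against small files in a temporary directory. The file names and contents (`app.log`, `users.txt`) are hypothetical and created only for this example.

```shell
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
printf '%s\n' 'error: disk full' 'info: all good' 'warning: disk at 90%' > "$workdir/app.log"
printf '%s\n' 'alice 30' 'bob 25' 'anna 41' > "$workdir/users.txt"

# grep -E: lines that match either alternative.
grep_out=$(grep -E 'error|warning' "$workdir/app.log")

# grep -r: search the directory recursively (-l lists matching file names only).
grep_files=$(grep -rl 'disk' "$workdir")

# sed: capture the level before the colon and reuse it as \1 in the replacement.
sed_out=$(sed 's/^\([a-z]*\): /[\1] /' "$workdir/app.log")

# awk: print the second field of rows whose first field starts with "a".
awk_out=$(awk '$1 ~ /^a/ {print $2}' "$workdir/users.txt")

# find: -regex is matched against the whole path, hence the leading '.*'.
find_out=$(find "$workdir" -regex '.*\.log')

printf '%s\n' "$grep_out" "$grep_files" "$sed_out" "$awk_out" "$find_out"
rm -rf "$workdir"
```

The `sed` command here uses basic regex, so the group is written `\(...\)`; with `sed -E` the same group would be written `(...)`.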
Tips
- Always be mindful of escaping special characters when needed.
- Use extended regex with `-E` in commands like `grep` and `sed` for more complex patterns.
- Test your regex with tools like regex101.com for validation and troubleshooting.
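The first two tips can be illustrated with a small sketch on invented sample lines. It also uses `grep -F`, an option not mentioned above, which treats the pattern as a fixed string and so sidesteps escaping altogether.

```shell
# Illustrative input (made up for this sketch).
text=$(printf '%s\n' 'a.b' 'axb' '1+1=2' '111=2')

# Unescaped '.' is a wildcard: both "a.b" and "axb" match.
unescaped=$(printf '%s\n' "$text" | grep 'a.b')

# Escaped '\.' matches only the literal dot.
escaped=$(printf '%s\n' "$text" | grep 'a\.b')

# grep -F treats the whole pattern as a fixed string, so nothing needs escaping.
fixed=$(printf '%s\n' "$text" | grep -F '1+1=2')

# Basic regex (no -E): '+' is an ordinary character.
bre_plus=$(printf '%s\n' "$text" | grep '1+1=2')

# Extended regex (-E): '+' is a quantifier, so the pattern means
# "one or more 1s, followed by 1=2".
ere_plus=$(printf '%s\n' "$text" | grep -E '1+1=2')

printf '%s\n' "$unescaped" "$escaped" "$fixed" "$bre_plus" "$ere_plus"
```

The last two commands use the same pattern and select different lines, which is why it helps to decide up front whether a command is running in basic or extended mode.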