8 ‐ Text Processing in Linux - CloudScope/DevOpsWithCloudScope GitHub Wiki

grep

The grep command is a powerful tool for searching text in files using regular expressions. It comes with various flags that modify its behavior.

Here’s a rundown of some commonly used grep flags and options:

Common grep Flags

-i: Ignore case (case-insensitive search).

grep -i "pattern" file.txt

Example:

grep -i "hello" file.txt will match "Hello", "HELLO", etc.

-v: Invert the match (show lines that do not match the pattern).

grep -v "pattern" file.txt

Example:

grep -v "error" log.txt will show lines that do not contain "error".
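The two flags above can be tried on a small file. This is a minimal sketch: the temporary directory, the file name sample.txt, and its contents are hypothetical sample data, not part of any real log.

```shell
# Hypothetical sample data, created in a temporary directory.
demo_dir=$(mktemp -d)
printf 'Hello world\nERROR: disk full\nhello again\nall good\n' > "$demo_dir/sample.txt"

# -i: matches "Hello world" and "hello again".
grep -i "hello" "$demo_dir/sample.txt"

# -v: prints every line that does not contain lowercase "error".
# "ERROR: disk full" is still printed because the match is case-sensitive.
grep -v "error" "$demo_dir/sample.txt"

# -i and -v combined: the ERROR line is dropped regardless of case.
grep -iv "error" "$demo_dir/sample.txt"
```

Short flags can be combined, so -iv behaves the same as -i -v.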

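-r or -R: Recursively search directories. In GNU grep, -R also follows symbolic links, while -r follows them only when they are named on the command line.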

grep -r "pattern" /path/to/dir/

Example:

grep -r "TODO" ~/projects/ will search for "TODO" in all files within the specified directory and its subdirectories.

-l: Show only the names of files with matching lines.

grep -l "pattern" *.txt

Example:

grep -l "fix" *.txt will list all .txt files containing "fix".

-n: Show line numbers with output lines.

grep -n "pattern" file.txt

Example:

grep -n "main" source.c will show the line numbers where "main" occurs.

-c: Count the number of matching lines.

grep -c "pattern" file.txt

Example:

grep -c "error" log.txt will display the number of lines containing "error".
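Note that -c counts matching lines, not individual matches. The sketch below shows the difference on hypothetical sample data. It assumes a grep that supports -o (GNU and BSD grep do, although POSIX does not require it).

```shell
# Hypothetical sample data: the first line contains "error" twice.
demo_dir=$(mktemp -d)
printf 'error error\nok\nerror\n' > "$demo_dir/log.txt"

# Two lines contain "error", so -c reports 2.
grep -c "error" "$demo_dir/log.txt"

# -o prints every match on its own line; wc -l then counts 3 occurrences.
grep -o "error" "$demo_dir/log.txt" | wc -l
```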

-w: Match whole words only.

grep -w "pattern" file.txt

Example:

grep -w "word" text.txt will match "word" but not "sword" or "words".
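A quick way to see where the word boundaries fall is to run -w against a few near misses. The file contents below are hypothetical sample data.

```shell
# Hypothetical sample data with several near misses.
demo_dir=$(mktemp -d)
printf 'word\nsword\nwords\na word here\nword-count\n' > "$demo_dir/text.txt"

# Prints "word", "a word here", and "word-count".
# Letters, digits, and underscores are word characters; the hyphen is not,
# so "word-count" still contains "word" as a whole word.
grep -w "word" "$demo_dir/text.txt"
```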

-x: Match the whole line.

grep -x "pattern" file.txt

Example:

grep -x "exact line" file.txt will match only lines that consist of exactly "exact line", with nothing before or after it.

-e: Specify multiple patterns.

grep -e "pattern1" -e "pattern2" file.txt

Example:

grep -e "foo" -e "bar" file.txt will match lines containing either "foo" or "bar".

-A: Show lines after a match (context lines).

grep -A 3 "pattern" file.txt

Example:

grep -A 2 "start" file.txt will show the matching line and the next 2 lines.

-B: Show lines before a match.

grep -B 3 "pattern" file.txt

Example:

grep -B 2 "error" log.txt will show the matching line and the 2 preceding lines.

-C: Show lines before and after a match (context around the match).

grep -C 3 "pattern" file.txt

Example:

grep -C 2 "function" code.c will show the matching line along with 2 lines before and 2 lines after.
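The three context flags can be compared side by side on a five-line file. This is a small sketch with hypothetical sample data.

```shell
# Hypothetical sample data: the match sits on line 3 of 5.
demo_dir=$(mktemp -d)
printf 'line1\nline2\nerror here\nline4\nline5\n' > "$demo_dir/log.txt"

# -A 1: the match plus one line after it ("error here", "line4").
grep -A 1 "error" "$demo_dir/log.txt"

# -B 1: one line before the match plus the match ("line2", "error here").
grep -B 1 "error" "$demo_dir/log.txt"

# -C 1: one line on each side ("line2", "error here", "line4").
grep -C 1 "error" "$demo_dir/log.txt"
```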

--color: Highlight matching text.

grep --color=auto "pattern" file.txt

Example:

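grep --color=always "keyword" document.txt will highlight "keyword" in the output even when the output is piped or redirected; --color=auto highlights only when writing to a terminal.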

Examples

Find all occurrences of "error" in a file, ignoring case:

grep -i "error" logfile.txt

Search recursively for "fix" in all .txt files under a directory:

grep -r --include="*.txt" "fix" /path/to/directory/
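A shell glob such as /path/to/directory/*.txt is expanded by the shell before grep runs, so it names only the .txt files directly inside that directory. One way to restrict a recursive search by file name is the --include option, which GNU grep and BSD grep provide but POSIX does not require. The sketch below assumes a hypothetical directory tree created in a temporary location.

```shell
# Hypothetical directory tree with one nested .txt file and one .md file.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'fix me\n' > "$demo_dir/top.txt"
printf 'fix me too\n' > "$demo_dir/sub/nested.txt"
printf 'fix this as well\n' > "$demo_dir/sub/notes.md"

# -r walks the whole tree; --include keeps only files whose names match *.txt.
# -l lists file names, so top.txt and sub/nested.txt are printed, notes.md is not.
grep -rl --include="*.txt" "fix" "$demo_dir"
```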

Count the number of lines containing "success" in a log file:

grep -c "success" logfile.log

Find lines with "pattern" and show 3 lines of context around each match:

grep -C 3 "pattern" file.txt

Show only the filenames that contain the string "confidential":

grep -l "confidential" *.txt

These flags provide a range of functionalities to tailor grep for various text-searching needs.

sed

Stream editor for filtering and transforming text.

sed 's/old/new/' filename.txt

This replaces the first occurrence of 'old' on each line with 'new' and writes the result to standard output. The original file is not modified.

sed 's/old/new/g' filename.txt

This replaces every occurrence in the file.

sed '1 s/old/new/g' filename.txt

This replaces every occurrence on the first line of the file only.
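The difference between the three substitution forms is easiest to see on a file where 'old' appears twice per line. The following is a minimal sketch using hypothetical sample data.

```shell
# Hypothetical sample data: two lines, each containing "old" twice.
demo_dir=$(mktemp -d)
printf 'old old\nold old\n' > "$demo_dir/filename.txt"

# No flag: only the first "old" on each line changes.
sed 's/old/new/' "$demo_dir/filename.txt"      # new old / new old

# g flag: every "old" on every line changes.
sed 's/old/new/g' "$demo_dir/filename.txt"     # new new / new new

# Address 1 plus g: every "old" on line 1, line 2 is left alone.
sed '1 s/old/new/g' "$demo_dir/filename.txt"   # new new / old old

# The file on disk is unchanged in all three cases.
cat "$demo_dir/filename.txt"
```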

Delete lines with sed

sed '/pattern/d' filename.txt

This deletes every line that matches pattern.
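Deletion in sed is done with the d command, applied to the lines selected by an address. The address can be a regular expression between slashes or a line number. A minimal sketch with hypothetical sample data:

```shell
# Hypothetical sample data.
demo_dir=$(mktemp -d)
printf 'keep this\ndebug: noisy\nkeep that\n' > "$demo_dir/filename.txt"

# /regex/d deletes every line matching the regex; the two "keep" lines remain.
sed '/debug/d' "$demo_dir/filename.txt"

# A line number works as the address too: delete only line 1.
sed '1d' "$demo_dir/filename.txt"
```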

sed -i 's/old/new/' filename.txt (replaces 'old' with 'new' and modifies the original file in place)
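Because -i overwrites the file, it can be worth keeping a backup. The sketch below assumes a sed that accepts a backup suffix attached directly to -i, which both GNU sed and BSD sed do. The file name and contents are hypothetical.

```shell
# Hypothetical sample data.
demo_dir=$(mktemp -d)
printf 'old value\n' > "$demo_dir/filename.txt"

# Edit in place, keeping the original as filename.txt.bak.
sed -i.bak 's/old/new/' "$demo_dir/filename.txt"

cat "$demo_dir/filename.txt"       # new value
cat "$demo_dir/filename.txt.bak"   # old value
```

The attached-suffix form is also the more portable spelling: BSD sed (as shipped with macOS) expects a suffix argument after -i, so a bare -i followed by the script does not behave the same way there.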

awk

awk is a versatile command-line tool for pattern scanning and processing. It is used for extracting and manipulating text data, often from text files or command outputs.

awk '{ print $1 }' file.txt

Commonly Used Options:

-F: Specify the field separator (default is whitespace).

awk -F ',' '{ print $1 }' file.csv

Example:

awk -F ':' '{ print $1, $2 }' /etc/passwd splits each line on colons and prints the first and second fields, separated by a space in the output.
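-F controls only how input is split. The comma in print joins fields with the output field separator OFS, which is a space by default. The sketch below uses a hypothetical passwd-style file rather than the real /etc/passwd.

```shell
# Hypothetical passwd-style records (seven colon-separated fields each).
demo_dir=$(mktemp -d)
printf 'alice:x:1001:1001::/home/alice:/bin/bash\nbob:x:1002:1002::/home/bob:/bin/sh\n' > "$demo_dir/passwd.sample"

# Fields are joined with the default OFS, a space: "alice x", "bob x".
awk -F ':' '{ print $1, $2 }' "$demo_dir/passwd.sample"

# Setting OFS keeps the colon in the output: "alice:/bin/bash", "bob:/bin/sh".
awk -F ':' -v OFS=':' '{ print $1, $7 }' "$demo_dir/passwd.sample"
```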

-f: Read awk commands from a file.

awk -f script.awk file

Example:

If script.awk contains { print $1 }, awk -f script.awk file.txt prints the first field of each line.
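A program file holds the same text that would otherwise go between the single quotes on the command line, including the braces around the action. A minimal sketch, with hypothetical file names and contents:

```shell
# Hypothetical input file and program file.
demo_dir=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$demo_dir/file.txt"

# The action must be wrapped in braces, exactly as on the command line.
printf '{ print $1 }\n' > "$demo_dir/script.awk"

awk -f "$demo_dir/script.awk" "$demo_dir/file.txt"   # alpha / beta
```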

-v: Assign values to awk variables from the command line.

awk -v var=value '{ print var }' file

Example:

awk -v max=10 '{ if ($1 > max) print $1 }' file.txt prints the first field of every line where that field is greater than 10.
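-v is also the usual way to pass a shell variable into an awk program without splicing it into the quoted script. The sketch below assumes a hypothetical shell variable named threshold and hypothetical numeric sample data.

```shell
# Hypothetical sample data: one number per line.
demo_dir=$(mktemp -d)
printf '5\n12\n10\n42\n' > "$demo_dir/file.txt"

threshold=10   # hypothetical shell variable

# Prints 12 and 42; 10 is not strictly greater than max.
awk -v max="$threshold" '$1 > max { print $1 }' "$demo_dir/file.txt"
```

Writing the condition as a pattern ($1 > max) in front of the action is equivalent to the if statement inside the braces.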

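-W: Pass implementation-specific options (e.g., compat, re-interval). The accepted values depend on which awk is installed; interactive, for instance, is a mawk option.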

awk -W compat 'commands' file

Example:

awk -W interactive 'BEGIN { print "Hello, World!" }'

awk -F '.' '{ print $1 $2 "," $3 }' file.txt

This splits each line on dots (.), prints the first and second fields joined together, and puts a comma (,) between the second and third fields.
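In awk, fields written next to each other with no comma are concatenated directly, and a quoted string such as "," is inserted literally. A one-line sketch with a hypothetical dotted input string:

```shell
# Hypothetical input: a dotted name piped in on standard input.
printf 'www.example.com\n' | awk -F '.' '{ print $1 $2 "," $3 }'
# prints: wwwexample,com
```

The program must be wrapped in single quotes so that the shell passes the braces, the $ signs, and the double quotes to awk unchanged.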

Count the number of lines in a file:

awk 'END { print NR }' file.txt
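NR holds the number of records read so far, so inside an END block it equals the total line count. The same pattern extends to totals accumulated per line. A minimal sketch with hypothetical sample data:

```shell
# Hypothetical sample data: a label and a number on each line.
demo_dir=$(mktemp -d)
printf 'a 10\nb 20\nc 30\n' > "$demo_dir/file.txt"

# Total number of lines.
awk 'END { print NR }' "$demo_dir/file.txt"                     # 3

# Line count together with the sum of the second column.
awk '{ sum += $2 } END { print NR, sum }' "$demo_dir/file.txt"  # 3 60
```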