Processing Text Using Filters - Paiet/Tech-Journal-for-Everything GitHub Wiki
- File-Combining Commands
- cat (Concatenate)
- Combines files together
- Files combined one after another
- Can also display the contents of a file
- Writes the combined result to STDOUT
- Using cat:
- Combine two files together
cat first.txt second.txt > combined.txt
- Display the contents of first.txt
cat first.txt
- Display the contents of second.txt
cat second.txt
- Display the contents of combined.txt
cat combined.txt
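The steps above can be tried end to end with a sketch like the following. The scratch directory from mktemp is an assumption added here so that no existing files are overwritten; the file names and contents come from the Lab requirements.

```shell
# Work in a throwaway directory (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the two lab files.
printf 'Data from first file.\n' > first.txt
printf 'Data from second file.\n' > second.txt

# Concatenate them in the order given and save the result.
cat first.txt second.txt > combined.txt

# Display the combined file: the first file's line, then the second's.
cat combined.txt
```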
- join
- Combines files together
- Files combined based on fields
- Useful for building tables
- Joins on the first field by default; both files must be sorted on that field
- Using join:
- Display two files joined on their first field
join listing1.1.txt listing1.2.txt
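A minimal sketch of that join, using the lab listings. The scratch directory and the output file name joined.txt are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Lab files: both are already sorted on the phone number in field 1.
printf '%s\n' '555-2397 Beckett, Barry' '555-5116 Carter, Gertrude' \
              '555-7929 Jones, Theresa' '555-9871 Orwell, Samuel' > listing1.1.txt
printf '%s\n' '555-2397 unlisted' '555-5116 listed' \
              '555-7929 listed' '555-9871 unlisted' > listing1.2.txt

# Each output line is the shared field, then the rest of the line from
# the first file, then the rest of the line from the second file.
join listing1.1.txt listing1.2.txt > joined.txt
cat joined.txt
# First line: 555-2397 Beckett, Barry unlisted
```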
- paste
- Similar to join, except that it merges corresponding lines side by side, separated by a tab
- No column is used for comparison
- Using paste:
- Display two files side by side
paste listing1.1.txt listing1.2.txt
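For comparison with join, here is a small sketch of paste on the same lab listings. The scratch directory and the output file names are assumptions; the -d option, which picks a delimiter other than the tab, is shown as an extra.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '555-2397 Beckett, Barry' '555-5116 Carter, Gertrude' \
              '555-7929 Jones, Theresa' '555-9871 Orwell, Samuel' > listing1.1.txt
printf '%s\n' '555-2397 unlisted' '555-5116 listed' \
              '555-7929 listed' '555-9871 unlisted' > listing1.2.txt

# Line N of the first file, a tab, then line N of the second file.
# Nothing is compared, so the phone number appears twice on each line.
paste listing1.1.txt listing1.2.txt > pasted.txt
cat pasted.txt

# -d replaces the tab with another delimiter, here a colon.
paste -d ':' listing1.1.txt listing1.2.txt > pasted_colon.txt
cat pasted_colon.txt
```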
- File-Transforming Commands
- expand
- Converts tabs into spaces
- unexpand
- Converts spaces to tabs (only leading spaces by default; use -a to convert all runs)
- The opposite of expand
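A short round trip between the two commands might look like this. The sample text and the variable names are made up for illustration, and a tab stop interval of 4 is chosen with -t only to keep the output short.

```shell
# expand: the tab after "a" becomes spaces up to the next tab stop.
# With stops every 4 columns, that is 3 spaces.
expanded=$(printf 'a\tb\n' | expand -t 4)
printf '[%s]\n' "$expanded"

# unexpand -a: convert runs of blanks anywhere in the line back to tabs,
# using the same tab stops.
restored=$(printf '%s\n' "$expanded" | unexpand -a -t 4)

# od -c makes the restored tab visible as \t.
printf '%s\n' "$restored" | od -c
```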
- od (Octal Dump)
- Displays a file in octal (base 8) by default; other formats can be selected with -t
- Useful for viewing binaries
- Using od:
- Display a file in octal format
od listing1.2.txt
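The default octal words can be hard to read, so a few common variations are sketched here on a three-byte sample. The input string and the variable name hex are assumptions for illustration.

```shell
# Default output: a byte offset followed by 2-byte words in octal.
printf 'Hi\n' | od

# -c prints each byte as a character or escape, so the newline shows as \n.
printf 'Hi\n' | od -c

# -A n drops the offset column; -t x1 prints one byte at a time in hex.
hex=$(printf 'Hi\n' | od -A n -t x1)
echo "$hex"
```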
- sort
- Displays data reorganized to suit your needs
- Using sort
- Display the contents of listing1.1.txt sorted by first name
sort -k 3 listing1.1.txt
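The key option can be sketched on the lab listing as follows. The scratch directory and output file names are assumptions; the second command, which limits the key to the last name and reverses the order, is an extra variation not in the original steps.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '555-2397 Beckett, Barry' '555-5116 Carter, Gertrude' \
              '555-7929 Jones, Theresa' '555-9871 Orwell, Samuel' > listing1.1.txt

# -k 3 sorts on field 3 (the first name) through the end of the line.
sort -k 3 listing1.1.txt > by_first_name.txt
cat by_first_name.txt

# -k 2,2 limits the key to field 2 (the last name); -r reverses the order.
sort -r -k 2,2 listing1.1.txt > by_last_name_desc.txt
cat by_last_name_desc.txt
```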
- split
- Divides a file based on criteria
- Useful for dividing up large files across smaller media
- Can split by line count (-l) or by size in bytes (-b)
- Output files will have two letters attached to indicate sequence
- filenameaa
- filenameab
- ...
- filenamezy
- filenamezz
- Can use cat to recombine
- Using split
- Divide a file every 2 lines
split -l 2 listing1.1.txt numbers
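A sketch of splitting and then recombining with cat, as mentioned above. The scratch directory and the name rejoined.txt are assumptions for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '555-2397 Beckett, Barry' '555-5116 Carter, Gertrude' \
              '555-7929 Jones, Theresa' '555-9871 Orwell, Samuel' > listing1.1.txt

# Four lines split every 2 lines gives two pieces: numbersaa and numbersab.
split -l 2 listing1.1.txt numbers
ls numbers*

# The shell expands the glob in alphabetical order, so cat restores the
# original sequence.
cat numbers* > rejoined.txt

# cmp exits with status 0 when the two files are byte-for-byte identical.
cmp listing1.1.txt rejoined.txt && echo "identical"
```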
- tr (Translate)
- Converts or removes characters from a file
- Using tr
- Replace every instance of B in a file with b. In the same command, replace the characters C and J with the character c
tr BCJ bc < listing1.1.txt
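The command above relies on GNU tr padding the shorter second set by repeating its last character. A sketch that spells the second set out in full, and also shows the removal mode with -d, could look like this; the sample line is taken from the lab listing and the variable names are assumptions.

```shell
line='555-2397 Beckett, Barry'

# Sets of equal length map position by position: B->b, C->c, J->c.
translated=$(printf '%s\n' "$line" | tr BCJ bcc)
echo "$translated"

# -d deletes every character in the set instead of translating it.
no_commas=$(printf '%s\n' "$line" | tr -d ',')
echo "$no_commas"

# Character classes convert whole ranges, here lowercase to uppercase.
upper=$(printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]')
echo "$upper"
```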
- uniq (Unique)
- Displays data excluding adjacent duplicate lines, so input is usually sorted first
- Using uniq
- Display the contents of a file, excluding duplicate entries. Sort the entries alphabetically first so that duplicates end up next to each other.
sort shakespeare.txt | uniq
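To see why the sort matters, the following sketch runs uniq on the lab file with and without it. The scratch directory is an assumption; the -c option, which counts occurrences, is shown as an extra.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' to be or not to be that is the question > shakespeare.txt

# Unsorted: no two identical lines are adjacent, so nothing is removed.
uniq shakespeare.txt

# Sorted: the repeated "to" and "be" lines become adjacent and collapse.
sort shakespeare.txt | uniq

# -c prefixes each distinct line with the number of times it occurred.
sort shakespeare.txt | uniq -c
```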
- File-Formatting Commands
- fmt (Format)
- Applies manual word-wrapping to a file
- Defaults to 75 character width
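A minimal sketch of overriding the default width with -w. The sample sentence and the variable names are made up; exactly where the line breaks fall depends on the fmt implementation, so no particular output is shown.

```shell
text='the quick brown fox jumps over the lazy dog near the quiet river bank'

# Re-wrap the single long line so that no output line exceeds 20 characters.
wrapped=$(printf '%s\n' "$text" | fmt -w 20)
printf '%s\n' "$wrapped"
```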
- nl (Numbered Lines)
- Adds line numbers to the lines of a file (blank lines are not numbered by default)
- Useful for readability
- Useful for troubleshooting script errors that return a line number
- Similar to cat -b but with advanced options
nl -b a filename.txt
- The a style given to -b causes all lines to be numbered, including blank lines
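The difference between the default numbering and -b a can be sketched on a three-line sample with a blank line in the middle. The sample text is an assumption; -w (number width) and -s (separator) are standard nl options shown as extras.

```shell
# Default: only non-empty lines are numbered, so "three" gets number 2.
printf 'one\n\nthree\n' | nl

# -b a: every line is numbered, so "three" gets number 3.
printf 'one\n\nthree\n' | nl -b a

# -w sets the width of the number column and -s the text that follows it.
printf 'one\n\nthree\n' | nl -b a -w 2 -s '. '
```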
- pr (Prepare for Printing)
- Formats a file for output to a line printer
- Assumes 80 character width and mono-space font
- Can also set headers, footers, margins, etc.
- Using pr
- Display the contents of a file double-spaced and with line numbering
cat -n /etc/profile | pr -d
- Repeat the previous command, but separate pages with form feeds (-f) and set the page length to 50 lines (-l 50)
cat -n /etc/profile | pr -dfl 50
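The individual pr options can be explored on a tiny input, as in this sketch. The sample text and header string are made up for illustration; /etc/profile is avoided only so the example does not depend on a particular system.

```shell
# -d double-spaces the text; -t omits the page header and trailer,
# which keeps short output readable on a terminal.
printf 'alpha\nbeta\n' | pr -d -t

# -h replaces the file name in the page header with custom text, and
# -l sets the page length in lines.
printf 'alpha\nbeta\n' | pr -h 'Sample header' -l 20
```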
- File-Viewing Commands
- head
- Displays the first 10 lines of a file by default
- Use -n option to set the number of lines
- tail
- Displays the last 10 lines of a file by default
- Use -n option to set the number of lines
- less
- Displays the contents of a file
- Allows for scrolling and searching
- Replacement for more command
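The -n option for head and tail can be sketched on a generated list of numbers. Using seq to produce the input is an assumption made for illustration; any 20-line file would do.

```shell
# First three lines of the numbers 1 through 20.
seq 1 20 | head -n 3

# Last two lines.
seq 1 20 | tail -n 2

# Combining both extracts a range from the middle: lines 10 through 12.
seq 1 20 | head -n 12 | tail -n 3
```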
- File-Summarizing Commands
- cut
- Extracts portions of a file
- Usually combined with other commands
- Using cut
- Display only the MAC address for each network interface on your system
ifconfig | grep ether | cut -d " " -f 10
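Because ifconfig output differs between systems, the field and character selection can be sketched more predictably on a line from the lab listing. The variable name line is an assumption for illustration.

```shell
line='555-2397 Beckett, Barry'

# -d sets the delimiter and -f picks fields: field 1 is the phone number.
printf '%s\n' "$line" | cut -d ' ' -f 1

# Several fields can be listed; the delimiter is kept between them.
printf '%s\n' "$line" | cut -d ' ' -f 2,3

# -c selects by character position instead: the first three characters.
printf '%s\n' "$line" | cut -c 1-3
```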
- wc (Word Count)
- Displays the line, word, and byte counts for a file
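A short sketch of the individual counts, using the lab file shakespeare.txt. The scratch directory and variable names are assumptions; reading the file through a redirect keeps the file name out of the output.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' to be or not to be that is the question > shakespeare.txt

# With no options, wc prints lines, words, and bytes, then the file name.
wc shakespeare.txt

# -l, -w, and -c select a single count each.
lines=$(wc -l < shakespeare.txt)
words=$(wc -w < shakespeare.txt)
bytes=$(wc -c < shakespeare.txt)
echo "lines=$lines words=$words bytes=$bytes"
```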
Lab
- Requirements:
- Text file named first.txt
- Contents:
Data from first file.
- Text file named second.txt
- Contents:
Data from second file.
- Text file named listing1.1.txt
- Contents:
555-2397 Beckett, Barry
555-5116 Carter, Gertrude
555-7929 Jones, Theresa
555-9871 Orwell, Samuel
- Text file named listing1.2.txt
- Contents:
555-2397 unlisted
555-5116 listed
555-7929 listed
555-9871 unlisted
- Text file named shakespeare.txt
- Contents:
to
be
or
not
to
be
that
is
the
question