Unix Tools - oils-for-unix/oils GitHub Wiki
The shell interacts with a set of Unix tools in /bin and so forth. However, in many cases, those tools have grown functionality that overlaps with shell.
Unix Tools ...
Related: Ad Hoc Protocols in Unix
That Start Processes (in parallel)
- make and other build tools. make -j for parallel builds.
- xargs, -P for parallel execution, -I {} for substitution
  - Also GNU Parallel, which is mentioned in the bash manual.
- find -exec and -exec +
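A minimal sketch of the two process-starting styles, assuming bash, an xargs that supports -0 and -P (GNU and BSD do, POSIX doesn't require them), and a basename that supports -a. The scratch directory and file names are made up for illustration.

```shell
# Hypothetical scratch directory with three files.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# xargs: -0 reads NUL-separated names, -P 2 runs up to 2 processes at once,
# -I {} substitutes each name into the command (one process per name).
parallel_out=$(find "$tmp" -name '*.txt' -print0 |
  xargs -0 -P 2 -I {} basename {} | sort)

# find -exec ... {} +: batches many names into as few processes as possible.
batched_out=$(find "$tmp" -name '*.txt' -exec basename -a {} + | sort)

printf '%s\n' "$parallel_out"
rm -r "$tmp"
```

The sort is there because parallel jobs can finish in any order.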
That Have Expression Languages
Expression languages must be fully recursive to count here.
With no lexer:
- find -- -a -o ! ( )
- test -- -a -o ! ( )
- expr -- arithmetic, subsumed by $(())
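"No lexer" here means every operator has to arrive as its own argv entry, so the shell's word splitting does the tokenizing. A small sketch of that, assuming bash and a made-up scratch directory:

```shell
# Hypothetical scratch files.
tmp=$(mktemp -d)
touch "$tmp/keep.c" "$tmp/keep.h" "$tmp/skip.o"

# find: ( ) -o ! are separate arguments; the parens are escaped from the shell.
found=$(find "$tmp" -type f \( -name '*.c' -o -name '*.h' \) ! -name 'skip*' | sort)

# test / [ : the same operators, again one token per argument.
f="$tmp/keep.c"
if [ \( -f "$f" -a -r "$f" \) -o ! -e "$f" ]; then
  verdict=matched
else
  verdict=unmatched
fi

# expr does arithmetic the same way; $(( )) covers it inside the shell.
from_expr=$(expr 2 + 3 \* 4)
from_shell=$(( 2 + 3 * 4 ))

rm -r "$tmp"
```

Writing `\(-name` without the space would hand find a single unknown token, which is the cost of having no lexer.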
Languages with lexers:
- awk
- dtrace -- modelled after awk.
Honorable mention:
- strace also has a little expression language, but it's not fully recursive.
That Use Regexes
- grep, grep -E
- sed, sed --regexp-extended in GNU sed
- awk (extended only)
- expr
- find -regex
- bash itself.
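To make the overlap concrete, here is a sketch that pulls the same number out of one line with each tool. It assumes bash, a grep with -o, and a sed that accepts -E (the short spelling of --regexp-extended); the sample line is made up.

```shell
line='id=42 name=oil'

# grep -E: extended regex; -o prints only the matched part.
from_grep=$(printf '%s\n' "$line" | grep -E -o 'id=[0-9]+')

# sed -E: extended regex with a numbered capture group.
from_sed=$(printf '%s\n' "$line" | sed -E 's/^id=([0-9]+).*/\1/')

# awk: extended regexes only; match() sets RSTART and RLENGTH.
from_awk=$(printf '%s\n' "$line" |
  awk 'match($0, /[0-9]+/) { print substr($0, RSTART, RLENGTH) }')

# expr: basic regex, implicitly anchored at the start; \( \) captures.
from_expr=$(expr "$line" : 'id=\([0-9]*\)')

# bash itself: [[ =~ ]] uses extended regexes; groups land in BASH_REMATCH.
if [[ $line =~ id=([0-9]+) ]]; then
  from_bash=${BASH_REMATCH[1]}
fi
```

Five tools, three ways of spelling a capture group.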
That Receive Code Snippets (Remote Evaluation)
- tar has a --sed option.
That Have Printf-Style Formatting
See Appendix A: How to Quickly and Correctly* Generate a Git Log in HTML
- find -printf (arbitrary filenames)
- stat -c (arbitrary filenames)
- curl --write-out %{response_code} -- URLs can't have arbitrary characters?
- printf itself (coreutils)
- time (/usr/bin/time) -- mostly numbers
- date -- mostly numbers
- bash
  - the printf builtin
  - the time builtin and the TIMEFORMAT string -- mostly numbers
  - the prompt string: \h \W
- ps --format
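A sketch of four of these format languages side by side, assuming GNU find, GNU stat, and GNU date (find -printf, stat -c, and date -d are not in POSIX). The scratch file is made up. Note that each tool assigns its own meaning to the % letters.

```shell
# Hypothetical scratch file, 5 bytes long.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/five.txt"

# find -printf: %f is the basename, %s the size in bytes; \n is a backslash escape.
find_out=$(find "$tmp" -type f -printf '%f %s\n')

# stat -c: %n is the name as given, %s the size in bytes.
stat_out=$(cd "$tmp" && stat -c '%n %s' five.txt)

# printf: %-6s pads left-justified, %03d zero-pads.
printf_out=$(printf '%-6s|%03d' ab 7)

# date: +FORMAT with % directives; -u -d @0 is the Unix epoch in UTC.
date_out=$(date -u -d @0 '+%Y-%m-%d')

rm -r "$tmp"
```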
That Have Backslash Escaping
- awk -F '\t' -- same as awk -F $'\t'
- xargs -d '\t'
  - (GNU cut -d doesn't understand '\t'; it needs a literal tab)
- find -printf
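A sketch of who interprets the backslash, assuming bash (for $'...'), GNU xargs (for -d), and a made-up tab-separated row:

```shell
row=$(printf 'a\tb\tc')

# awk interprets the \t escape in -F itself...
awk_escape=$(printf '%s\n' "$row" | awk -F '\t' '{ print $2 }')

# ...so it behaves the same as when bash's $'...' quoting passes a real tab.
awk_literal=$(printf '%s\n' "$row" | awk -F $'\t' '{ print $2 }')

# GNU xargs -d also understands \t.
xargs_out=$(printf 'x\ty' | xargs -d '\t' -n 1 echo)

# cut does not, so the shell has to produce the tab
# (tab is also cut's default delimiter).
cut_out=$(printf '%s\n' "$row" | cut -d $'\t' -f 2)
```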
Non-standard tools:
- git log --pretty=format: (arbitrary descriptions)
- hg log --template -- http://hgbook.red-bean.com/read/customizing-the-output-of-mercurial.html (doesn't have \0 as far as I can tell.) Mercurial has its own template language like date: {date|isodate}\n\n (no $).
NOTE: grep should have a syntax for captures, like $1 $2 name: $name age: $age. sed has & for the whole match and \1 through \9 for numbered groups, but no named captures.
With Quoting/Escaping Algorithms
- ls -q -b for unprintable chars in filenames
- printf %q for spaces in args
- ${var@Q} which is different than printf %q!!! See help-bash@ thread.
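A sketch of the printf %q versus ${var@Q} difference, assuming bash 4.4 or later (where the @Q transformation exists). The sample string is made up.

```shell
var="it's a file"

# printf %q quotes the value so the shell can read it back.
q_printf=$(printf '%q' "$var")

# ${var@Q} does the same job but picks a different spelling.
q_expand=${var@Q}

# Both round-trip through eval even though the spellings differ.
eval "from_printf=$q_printf"
eval "from_expand=$q_expand"

printf '%s\n%s\n' "$q_printf" "$q_expand"
```

So there are two quoting algorithms in one shell, with output that is equivalent but not identical.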
Arg Substitution
These are like "$@" in shell.
- xargs -I {} -- echo {}
- find -exec {} +
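A sketch of how the two substitutions compare with "$@", counting the arguments each one hands to the command. The count_args helper and the scratch files are hypothetical.

```shell
tmp=$(mktemp -d)
touch "$tmp/one" "$tmp/two"

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# "$@" in shell: every argument is passed through as its own word.
set -- "$tmp/one" "$tmp/two"
shell_count=$(count_args "$@")

# find -exec {} +: {} expands to all matched paths at once, like "$@".
find_count=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh {} +)

# xargs -I {}: {} expands to one input line at a time, one process each.
xargs_counts=$(printf '%s\n' "$tmp/one" "$tmp/two" |
  xargs -I {} sh -c 'echo "$#"' sh {})

rm -r "$tmp"
```

So find's `{} +` is the closer match to "$@", while xargs -I {} is more like a loop variable.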
Could be replaced with $_ or @_ ("it").
With Tabular Output
- find / ls
- ps
- df (has -h and -H human-readable option, --output[=FIELD_LIST] but no format string)
- du -- has -0 for NUL output
- TODO: look at netstat, iostat, lsof, etc. Brendan Gregg's pages.
With File System Path Matching
- du --exclude
- rsync --include --exclude
- find -name, -regex, -wholename, etc.
That Format Binary Data
- od
- xxd
- hexdump -- has a % format language.
Honestly I don't understand the difference between these!
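For a rough comparison, here is a sketch that dumps the same four bytes with each tool. It assumes xxd and hexdump may not be installed (xxd ships with vim), so those two calls are guarded; only od is in POSIX.

```shell
# od: -An drops the offset column, -tx1 prints one hex byte at a time.
od_out=$(printf 'ABCD' | od -An -tx1)

# xxd: -p is its plain hex mode, with no offsets or ASCII column.
if command -v xxd >/dev/null; then
  xxd_out=$(printf 'ABCD' | xxd -p)
fi

# hexdump: -e takes a format string; 4/1 means 4 units of 1 byte each,
# each printed with the printf-style %02x.
if command -v hexdump >/dev/null; then
  hexdump_out=$(printf 'ABCD' | hexdump -e '4/1 "%02x " "\n"')
fi

printf '%s\n' "$od_out"
```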
Misc Expression Languages
- getopts builtin spec, and /usr/bin/getopt
  - leading : means to do different error handling! Instead of the arg. Gah.
  - leading
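A sketch of what the leading : changes in a getopts spec. The parse function and its option letters (-a as a flag, -b taking an argument) are hypothetical.

```shell
# Hypothetical parser: the first argument is the getopts spec,
# the rest are the arguments to parse.
parse() {
  local optstring=$1 opt
  shift
  OPTIND=1
  while getopts "$optstring" opt; do
    echo "opt=$opt OPTARG=${OPTARG-}"
  done
}

# Without a leading ':', getopts prints its own diagnostic on stderr
# and sets opt to '?'.
loud=$(parse 'ab:' -x 2>/dev/null)

# With a leading ':', it stays silent, sets opt to '?', and puts the
# offending option letter in OPTARG.
silent_unknown=$(parse ':ab:' -x)

# A missing argument is reported as opt=':' in silent mode.
silent_missing=$(parse ':ab:' -b)
```

The same character that marks "takes an argument" after a letter switches the error protocol when it comes first.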
The Worst Offender
find starts processes (with -exec), has a recursive boolean expression language, regexes, globs, has % and backslash escapes (in -printf), and arg substitution ({} is like "$@"). It should be part of the shell!
It also doesn't give good parse error messages. Sometimes it just says "find: invalid expression" with no location information.
Wow this is crazy too:
The regular expressions understood by find are by default Emacs Regular Expressions, but this can be changed with the -regextype option.
$ find -regextype -help
find: Unknown regular expression type ‘-help’; valid types are ‘findutils-default’, ‘awk’, ‘egrep’, ‘ed’, ‘emacs’, ‘gnu-awk’, ‘grep’, ‘posix-awk’, ‘posix-basic’, ‘posix-egrep’, ‘posix-extended’, ‘posix-minimal-basic’, ‘sed’.
I didn't know there were that many regex types! And emacs is a really bad default!
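A sketch of what the default costs in practice, assuming GNU find and made-up file names: with the Emacs-style default, grouping and alternation need backslashes, while posix-extended takes the spelling used by grep -E.

```shell
tmp=$(mktemp -d)
touch "$tmp/main.c" "$tmp/main.h" "$tmp/main.o"

# Default (Emacs-style) syntax: \( \) and \| carry the special meaning.
# Note that -regex matches against the whole path, not just the basename.
emacs_style=$(find "$tmp" -type f -regex '.*\.\(c\|h\)' | sort)

# posix-extended: -regextype has to come before the -regex it applies to.
ere_style=$(find "$tmp" -type f -regextype posix-extended -regex '.*\.(c|h)' | sort)

rm -r "$tmp"
```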
Families of Unix Tools
- CSV, JSON, HTML, XML, recfile, 1991 paper!
Misc Problems
- coreutils
  - time doesn't have millisecond resolution! https://stackoverflow.com/questions/16959337/usr-bin-time-format-output-elapsed-time-in-milliseconds. It's hard-coded right in time.c -- usecs/10000. Very arbitrary.
- bash TIMEFORMAT has precision like %3R, but it doesn't have the exit code! Annoying.
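A sketch of a workaround for the bash side, assuming bash: TIMEFORMAT supplies the millisecond precision, and the exit code can be read from $? afterwards, since the status of a time command is the status of the pipeline it timed.

```shell
TIMEFORMAT='%3R'   # elapsed real time in seconds, 3 decimal places

# The time keyword reports on the shell's stderr: silence the timed
# command itself, then capture what the braces write to stderr.
elapsed=$( { time sleep 0 >/dev/null 2>&1; } 2>&1 )

# The exit code is not a TIMEFORMAT field, but $? still has it.
{ time false; } 2>/dev/null
status=$?

echo "elapsed=$elapsed status=$status"
```

That still means gluing two mechanisms together to get one record per command.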