Unix Tools - oilshell/oil GitHub Wiki

The shell interacts with a set of Unix tools in /bin, /usr/bin, and so forth. However, in many cases those tools have grown functionality that overlaps with the shell's.

Unix Tools ...

Related: Ad Hoc Protocols in Unix

That Start Processes (in parallel)

  • make and other build tools. make -j for parallel builds.
  • xargs, -P for parallel execution, -I {} for substitution
    • Also GNU Parallel, which is mentioned in the bash manual.
  • find -exec and -exec +
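As a rough illustration of tools acting as process launchers, here's a minimal sketch with xargs -P. It assumes an xargs that supports -P (GNU and BSD do; it isn't in older POSIX), and the squaring job is just a made-up stand-in for real work.

```shell
# Start up to 4 'sh -c' processes at once, one per input line.
# Each child receives its input as $1 ("_" fills the $0 slot).
# Parallel jobs can finish in any order, so sort to get stable output.
squares=$(printf '%s\n' 1 2 3 4 |
  xargs -P 4 -n 1 sh -c 'echo $(( $1 * $1 ))' _ |
  sort -n)
echo "$squares"
```

Passing the value as a positional argument rather than splicing it into the sh -c string avoids having the input parsed as shell code.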

That Have Expression Languages

Expression languages must be fully recursive to count here.

With no lexer:

  • find -- -a -o ! ( )
  • test -- -a -o ! ( )
  • expr -- arithmetic, subsumed by $(())
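To show what "no lexer" means for the three tools above, here's a small sketch: every operator and operand arrives as its own argv entry, so the shell does the tokenizing. The file names and the value of n are invented for the example.

```shell
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.log" "$tmp/c.txt"

# find: each operator is a separate argument. The parens are quoted only to
# hide them from the shell; find never splits a string into tokens itself.
matches=$(cd "$tmp" && find . -type f '(' -name '*.txt' -o -name '*.log' ')' ! -name 'c*' | sort)

# test: the same -a -o ! ( ) vocabulary, again one token per argument.
n=5
if test '(' "$n" -gt 1 -a "$n" -lt 10 ')' -o "$n" -eq 0; then
  in_range=yes
else
  in_range=no
fi

# expr: arithmetic by argv, with * escaped from the shell. $(( )) covers this.
by_expr=$(expr 2 + 3 \* 4)
by_shell=$(( 2 + 3 * 4 ))

rm -rf "$tmp"
echo "$matches"; echo "$in_range $by_expr $by_shell"
```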

Languages with lexers:

  • awk
  • dtrace -- modelled after awk.

Honorable mention:

  • strace also has a little expression language, but it's not fully recursive.

That Use Regexes

  • grep, grep -E
  • sed, sed --regexp-extended in GNU sed
  • awk (extended only)
  • expr
  • find -regex
  • bash itself.
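Here's a sketch of the same extraction done four ways, to show how much the regex dialects overlap and where they differ. The input line is made up. grep -o is a GNU/BSD extension, and expr uses basic regexes that are implicitly anchored at the start.

```shell
line='user=alice id=42'

# grep -E: extended regex; -o prints only the matched text.
id_grep=$(printf '%s\n' "$line" | grep -E -o 'id=[0-9]+')

# sed -E: extended regex with a numbered capture group.
id_sed=$(printf '%s\n' "$line" | sed -E 's/.*id=([0-9]+).*/\1/')

# awk: extended regex only; match() sets RSTART and RLENGTH.
id_awk=$(printf '%s\n' "$line" |
  awk 'match($0, /id=[0-9]+/) { print substr($0, RSTART + 3, RLENGTH - 3) }')

# expr: basic regex, anchored at the start; prints the \( \) group.
id_expr=$(expr "$line" : '.*id=\([0-9]*\)')

printf '%s\n' "$id_grep" "$id_sed" "$id_awk" "$id_expr"
```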

That Receive Code Snippets (Remote Evaluation)

  • GNU tar has a --transform (--xform) option that takes a sed-style replace expression.

That Have Printf-Style Formatting

See Appendix A: How to Quickly and Correctly* Generate a Git Log in HTML

  • find -printf (arbitrary filenames)
  • stat -c (arbitrary filenames)
  • curl --write-out %{response_code} -- URLs can't have arbitrary characters?
  • printf itself (coreutils)
  • time (/usr/bin/time) -- mostly numbers
  • date -- mostly numbers
  • bash
    • the printf builtin
    • the time builtin and the TIMEFORMAT string -- mostly numbers
    • the prompt string: \h \W
  • ps --format
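A quick sketch of three of the % dialects above side by side. Each tool gives its own meaning to the directives. The find -printf line assumes GNU findutils, and the file name is invented.

```shell
# printf: C-style directives (%s, %d, width and padding flags).
row=$(printf '%-5s|%03d' ab 7)

# date: % directives mean calendar fields, not argument types.
year=$(date '+%Y')

# find -printf (GNU): % directives mean file attributes, here name and size.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/greeting.txt"
listing=$(find "$tmp" -type f -printf '%f is %s bytes\n')
rm -rf "$tmp"

printf '%s\n' "$row" "$year" "$listing"
```

Note that %s is a string in printf, seconds since the epoch in date, and a size in bytes in find -printf.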

That Have Backslash Escaping

  • awk -F '\t' -- same as awk -F $'\t'
  • xargs -d '\t'
  • (GNU cut -d doesn't understand the \t escape; it needs a literal tab, e.g. cut -d $'\t'. Tab is also its default delimiter.)
  • find -printf
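A sketch of who interprets the backslash in each case. The sample record is made up, and xargs -d assumes GNU xargs.

```shell
# awk interprets the two characters \t itself, so plain single quotes work.
field_awk=$(printf 'a b\tc d\n' | awk -F '\t' '{ print $2 }')

# cut wants a real tab character, so something else has to produce it.
tab=$(printf '\t')
field_cut=$(printf 'a b\tc d\n' | cut -d "$tab" -f 2)

# GNU xargs -d also understands C-style escapes in its delimiter argument.
items=$(printf 'a b\tc d' | xargs -d '\t' -n 1 echo)

printf '%s\n' "$field_awk" "$field_cut" "$items"
```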

Non-standard tools:

NOTE: grep should have a syntax for captures, like $1 $2 name: $name age: $age. sed only has & for the whole match and \1 through \9 for numbered groups.

With Quoting/Escaping Algorithms

  • ls -q -b for unprintable chars in filenames
  • printf %q for spaces in args
  • ${var@Q} which is different than printf %q!!! See help-bash@ thread.
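A small sketch of the last two items, assuming bash 4.4 or later (both printf %q and ${var@Q} are bash features, not POSIX). The sample string is made up. The two produce different spellings that both evaluate back to the original.

```shell
s="it's a test"

q_printf=$(printf '%q' "$s")   # backslash style:    it\'s\ a\ test
q_at=${s@Q}                    # single-quote style: 'it'\''s a test'
printf '%s\n' "$q_printf" "$q_at"

# Both forms are meant to be reusable as shell input.
eval "back1=$q_printf"
eval "back2=$q_at"
```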

Arg Substitution

These are like "$@" in shell.

  • xargs -I {} -- echo {}
  • find -exec echo {} +

Could be replaced with $_ or @_ ("it").
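For comparison, here's a sketch of both substitution styles. xargs -I {} splices one input line into each command, while find -exec ... {} + appends many paths to one command, much like "$@". File and user names are hypothetical.

```shell
# xargs -I {}: one command per input line, {} replaced inside the argument.
greetings=$(printf '%s\n' alice bob | xargs -I {} echo 'hello, {}!')

# find -exec ... {} +: a single command receives all matching paths at once.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b"
argc=$(find "$tmp" -type f -exec sh -c 'echo "$#"' _ {} +)
rm -rf "$tmp"

printf '%s\n' "$greetings" "$argc"
```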

With Tabular Output

  • find / ls
  • ps
  • df (has -h and -H human-readable option, --output[=FIELD_LIST] but no format string)
  • du -- has -0 for NUL output
  • TODO: look at netstat, iostat, lsof, etc. Brendan Gregg's pages.

With File System Path Matching

  • du --exclude
  • rsync --include --exclude
  • find -name, -regex, -wholename, etc.

That Format Binary Data

  • od
  • xxd
  • hexdump -- has a % format language.

Honestly I don't understand the difference between these!
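For what it's worth, here's a sketch that gets the same hex bytes out of all three. od is in POSIX. xxd and hexdump may not be installed, so they're guarded with command -v. The hexdump format string is my reading of its -e syntax (a byte count followed by a printf-like format).

```shell
# od: -An drops the address column, -tx1 prints one hex byte per unit.
hex_od=$(printf 'AB\n' | od -An -tx1 | tr -d ' \n')

# xxd -p: "plain" hex dump with no addresses or ASCII column.
if command -v xxd >/dev/null 2>&1; then
  hex_xxd=$(printf 'AB\n' | xxd -p)
fi

# hexdump -e: '/1 "%02x"' formats each single byte with a % directive.
if command -v hexdump >/dev/null 2>&1; then
  hex_hexdump=$(printf 'AB\n' | hexdump -v -e '/1 "%02x"')
fi

echo "$hex_od"
```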

Misc Expression Languages

  • getopts builtin spec, and /usr/bin/getopt
    • A leading : in the spec switches getopts to silent error handling, even though a : after an option letter means that option takes an arg. Gah.
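A sketch of the two meanings of : in a getopts spec. In ':ab:' the leading colon selects silent error reporting, and the colon after b says -b takes an argument. The function name parse and the option letters are made up.

```shell
parse() {
  OPTIND=1   # reset so the function can be called more than once
  while getopts ':ab:' opt "$@"; do
    case $opt in
      a)  echo "flag a" ;;
      b)  echo "b=$OPTARG" ;;
      :)  echo "missing arg for -$OPTARG" ;;   # silent mode only
      \?) echo "unknown option -$OPTARG" ;;
    esac
  done
}

out=$(parse -a -b val -x -b)
echo "$out"
```

Without the leading colon, getopts prints its own diagnostics, and a missing argument is reported as ? rather than :.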

The Worst Offender

find starts processes (with -exec), has a recursive boolean expression language, regexes, globs, has % and backslash escapes (in -printf), and arg substitution ({} is like "$@"). It should be part of the shell!

It also doesn't give good parse error messages. Sometimes it just says "find: invalid expression" with no location information.
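To make the overlap concrete, here's a sketch of a single find invocation that uses a boolean expression, a glob, a regex, % and backslash escapes, and arg substitution all at once. It assumes GNU find (for -regex and -printf), and the file names are invented.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.log" "$tmp/notes.md"

# ( glob -o regex ) selects files; -printf formats them with % and \ escapes;
# -exec ... {} + starts a process that gets all matches, like "$@".
# Output from find and from the child may interleave, hence the sort.
report=$(cd "$tmp" && find . -type f \
  '(' -name '*.txt' -o -regex '.*\.log' ')' \
  -printf '%f\t%s\n' \
  -exec sh -c 'echo "ran with $# arg(s)"' _ {} + | sort)

rm -rf "$tmp"
echo "$report"
```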

Wow this is crazy too:

The regular expressions understood by find are by default Emacs Regular Expressions, but this can be changed with the -regextype option.

$ find -regextype -help
find: Unknown regular expression type ‘-help’; valid types are ‘findutils-default’, ‘awk’, ‘egrep’, ‘ed’, ‘emacs’, ‘gnu-awk’, ‘grep’, ‘posix-awk’, ‘posix-basic’, ‘posix-egrep’, ‘posix-extended’, ‘posix-minimal-basic’, ‘sed’.

I didn't know there were that many regex types! And emacs is a really bad default!
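Here's a sketch of what the default costs you, assuming GNU find. In the Emacs dialect, grouping and alternation are spelled \( \| \), while posix-extended uses the ( | ) that grep -E users expect. File names are hypothetical.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.log" "$tmp/c.md"

# Default (Emacs) dialect: backslashes on the grouping and alternation.
emacs_style=$(cd "$tmp" && find . -regex '.*\.\(txt\|log\)' | sort)

# -regextype must come before the -regex it is meant to affect.
ere_style=$(cd "$tmp" && find . -regextype posix-extended -regex '.*\.(txt|log)' | sort)

rm -rf "$tmp"
printf '%s\n' "$emacs_style" "$ere_style"
```

Both patterns must match the whole path, not just the file name, which is why they start with .*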

Families of Unix Tools

Misc Problems