AWK and Other Command Line Tools

How to (& how not to) parse 25TB of data using awk and #rstats - A long blog post on setting up a query system for big data generated for the author's lab. The author reports speeding up queries by 4,800 times using old-school, simple techniques.

Extract data from a file and place in different files based on one column value
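
One common way to do this kind of split is with awk's output redirection. The following is a minimal sketch, not the method from the linked page: the file `data.csv`, its columns, and the `group_` output prefix are all hypothetical, and it assumes a comma-separated file with a single header row and no quoted commas.

```shell
# Hypothetical sample data: column 2 holds the value to split on.
workdir=$(mktemp -d)
printf '%s\n' 'id,group,value' '1,a,10' '2,b,20' '3,a,30' > "$workdir/data.csv"

# Skip the header row (NR > 1) and write each record to a file named after
# its column-2 value. Within one awk run, ">" truncates a file the first time
# it is opened and appends on later writes to the same name.
awk -F',' -v dir="$workdir" 'NR > 1 { print > (dir "/group_" $2 ".csv") }' "$workdir/data.csv"

cat "$workdir/group_a.csv"
# 1,a,10
# 3,a,30
```

The output file name is wrapped in parentheses because unparenthesized concatenation after `>` is parsed differently across awk implementations. If the column has many distinct values, some awk implementations may run out of open file handles; in that case, appending with `>>` and calling `close()` after each write is a possible workaround.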

Janssens books and CLI tools

Data Science at the Command Line - Jeroen Janssens, 2019 (book)
Docker image - for the above book
Data Science at the Command Line GitHub
7 Command-Line Tools for Data Science

xml2json

jq - jq is a lightweight and flexible command-line JSON processor
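
As a small illustration of what jq does, the sketch below filters an array of JSON records and prints one field. The input data and the `name` and `lang` field names are hypothetical; only the `-r` option and the `.[]`, `select`, and field-access filters are standard jq features. It assumes jq is installed and on the `PATH`.

```shell
# Hypothetical input: an array of records with "name" and "lang" fields.
json='[{"name":"awk","lang":"C"},{"name":"jq","lang":"C"},{"name":"csvkit","lang":"Python"}]'

# Keep records whose lang is "C" and print one name per line.
# -r emits raw strings instead of JSON-quoted ones.
names=$(printf '%s' "$json" | jq -r '.[] | select(.lang == "C") | .name')
printf '%s\n' "$names"
# awk
# jq
```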

json2csv

A “full-stack” data science project