csv_join - bruno-beloff/scs_analysis GitHub Wiki

docs > software repositories > scs_analysis > commands > file-based data management


DESCRIPTION

The csv_join utility performs an SQL join operation on a pair of CSV files. This is particularly useful where a comparison is to be made on timeline data from two devices.

All four join types are supported: inner, left outer, right outer and full outer. Inner joins are used when the output should only contain rows that match on their primary key.

Output is in the form of a sequence of JSON documents containing three fields: rec, left table contents (labelled for the left PREFIX) and right table contents (labelled for the right PREFIX). The left and right contents do not contain their primary key fields. For outer joins, content nodes may be null.

The --iso8601 flag is provided to indicate that the primary key should be interpreted as a ISO 8601 datetime. This is useful where data sets use alternate datetime formats such as 2019-02-22T01:00:00Z and 2019-02-22T01:00:00+00:00.

If the --verbose flag is used, a summary of the join operation is written to stderr.

Warning: if LEFT, RIGHT or FULL joins are used, the first document written to stdout may not contain all of the fields of the input CSV pair. In these cases, if the output is piped to the scs_analysis/csv_writer utility, then the csv_writer must be used in header-scan mode.

SYNOPSIS

csv_join.py [-t TYPE] -l PREFIX PK FILENAME -r PREFIX PK FILENAME [-i] [-v]

Options
--version show program's version number and exit
-h, --help show this help message and exit
-t TYPE, --type=TYPE { 'INNER' | 'LEFT' | 'RIGHT' | 'FULL' } (default 'INNER')
-l LEFT, --left=LEFT output path prefix, primary key and filename for left-hand set
-r RIGHT, --right=RIGHT output path prefix, primary key and filename for right-hand set
-i, --iso8601 interpret the primary key as an ISO 8601 datetime
-v, --verbose report narrative to stderr

EXAMPLES

csv_join.py -i -l praxis rec praxis_301/praxis_301_2018-08.csv -r ref rec ref/ref_2018-08.csv -v

DOCUMENT EXAMPLE - INPUT

left:

{"rec": "2019-02-01T02:00:00Z", "val": {"NO2": {"weV": 0.297185, "cnc": 40.8, "aeV": 0.298467, "weC": 0.002271}}}

right:

{"rec": "2019-02-01T02:00:00Z", "val": {"NO2": {"status": "P", "units": "ugm-3", "dns": 34.0}}}

DOCUMENT EXAMPLE - OUTPUT

{"rec": "2019-02-01T02:00:00Z", "praxis": {"val": {"NO2": {"weV": 0.297185, "cnc": 40.8, "aeV": 0.298467, "weC": 0.002271}}}, "ref": {"val": {"NO2": {"status": "P", "units": "ugm-3", "dns": 34.0}}}}

SEE ALSO

scs_analysis/csv_reader
scs_analysis/csv_writer

RESOURCES

https://www.w3schools.com/sql/sql_join.asp