csv_collation_summary - bruno-beloff/scs_analysis GitHub Wiki


DESCRIPTION

The csv_collation_summary utility is used alongside csv_collator to report on the effect of one independent variable on one or more dependent variables, for a range of independent variable deltas.

Input is read from a sequence of CSV files, as created by the csv_collator utility. Input rows with missing or malformed values for any of the variables are ignored. If the column for any of the named variables is missing from a file, the utility terminates.

The output of csv_collation_summary is a sequence of JSON documents, one per input file, detailing the min, average and max values for the independent variable, with the average and standard deviation of each of the dependent variables.
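The row filtering and per-file statistics described above can be sketched as follows. This is a minimal illustration, not the utility's own code: the function name summarise, the flat column names and the result layout are assumptions made for the example. It uses the arithmetic mean, matching the avg fields in the output example, and the sample standard deviation (statistics.stdev). Whether the utility uses the sample or the population form is not stated here.

```python
import csv
import io
import statistics


def summarise(csv_file, ind_path, dep_paths, ind_prec=1, dep_prec=3):
    """Summarise one collated CSV; hypothetical helper, not part of scs_analysis."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []

    # A missing column is fatal, mirroring the documented behaviour.
    missing = [path for path in [ind_path, *dep_paths] if path not in columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")

    ind_values = []
    dep_values = {path: [] for path in dep_paths}

    for row in reader:
        try:
            # float() rejects '' (ValueError) and None (TypeError), so missing
            # and malformed values are handled together.
            ind = float(row[ind_path])
            deps = [float(row[path]) for path in dep_paths]
        except (TypeError, ValueError):
            continue  # ignore the whole row

        ind_values.append(ind)
        for path, value in zip(dep_paths, deps):
            dep_values[path].append(value)

    if len(ind_values) < 2:
        return None  # statistics.stdev needs at least two samples

    return {
        "ind": {
            "min": round(min(ind_values), ind_prec),
            "avg": round(statistics.mean(ind_values), ind_prec),
            "max": round(max(ind_values), ind_prec),
        },
        "dep": {
            path: {
                "avg": round(statistics.mean(values), dep_prec),
                "stdev": round(statistics.stdev(values), dep_prec),
            }
            for path, values in dep_values.items()
        },
        "samples": len(ind_values),
    }


# Two usable rows; the third has a missing value, the fourth a malformed one.
data = "hmd,pm1\n70.0,1.0\n72.0,3.0\n,5.0\n74.0,abc\n"
summary = summarise(io.StringIO(data), "hmd", ["pm1"])
```

A row is dropped as a whole when any one of its variables is unusable, so every statistic in a summary is computed over the same set of samples.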

If the --verbose flag is used, a summary of the data processed is written to stderr.

SYNOPSIS

csv_collation_summary.py -f FILE_PREFIX -i IND_PATH [-p IND_PRECISION DEP_PRECISION] [-v] DEP_PATH_1 .. DEP_PATH_N

Options
--version show program's version number and exit
-h, --help show this help message and exit
-f FILE_PREFIX, --file-prefix=FILE_PREFIX file prefix for collated CSVs
-i IND_PATH, --ind-path=IND_PATH path to independent variable
-p PRECISIONS, --prec=PRECISIONS precision for independent and dependent variables (default 1 and 3 decimal places respectively)
-v, --verbose report narrative to stderr

EXAMPLES

csv_collation_summary.py -v -f collated_5rH/joined_PM_meteo_data_2019-02_2019-07_15min_ -i th.praxis.val.hmd pm1_scaling pm2p5_scaling pm10_scaling | csv_writer.py -v collated_5rH/summary.csv

FILES

Input file names must be of the form: FILE_PREFIX_DOMAIN_LOW_DOMAIN_HIGH.csv
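As an illustration of the naming rule, the sketch below recovers the domain bounds from a file name. It assumes that DOMAIN_LOW and DOMAIN_HIGH are the last two underscore-separated fields and are written as plain decimal numbers such as 70.0. The exact rendering used by csv_collator is not specified here, and the function name and file name are hypothetical.

```python
import re
from pathlib import Path

# Two trailing numeric fields, each preceded by an underscore, then ".csv".
BOUNDS = re.compile(r"_(-?\d+(?:\.\d+)?)_(-?\d+(?:\.\d+)?)\.csv$")


def domain_bounds(path):
    """Return (low, high) parsed from a collated file name, or None if it does not fit."""
    match = BOUNDS.search(Path(path).name)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Hypothetical file name built from the prefix used in the EXAMPLES section.
name = "collated_5rH/joined_PM_meteo_data_2019-02_2019-07_15min_70.0_75.0.csv"
bounds = domain_bounds(name)
```

Anchoring the pattern at the end of the name keeps digits inside the prefix (dates, averaging periods) from being mistaken for bounds.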

DOCUMENT EXAMPLE - OUTPUT

{"domain": "70.0 - 75.0", "praxis": {"climate": {"val": {"hmd": {"min": 70.0, "avg": 72.6, "max": 74.9}}}}, "error": {"pm1": {"avg": 2.648, "stdev": 2.458}, "pm2p5": {"avg": 2.992, "stdev": 2.652}, "pm10": {"avg": 2.77, "stdev": 2.456}}, "samples": 2307}
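When working with these documents downstream, it can help to view the nested fields as dotted paths, the same notation used for IND_PATH and DEP_PATH. The sketch below flattens the example document in that way. The flatten function is a hypothetical helper written for this illustration; it is not a description of how csv_writer is implemented.

```python
import json


def flatten(node, prefix=""):
    """Map a nested JSON object to a flat dict keyed by dotted paths."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend into nested objects
        else:
            flat[path] = value  # leaf value: number or string
    return flat


# The document from the example above.
line = (
    '{"domain": "70.0 - 75.0", '
    '"praxis": {"climate": {"val": {"hmd": {"min": 70.0, "avg": 72.6, "max": 74.9}}}}, '
    '"error": {"pm1": {"avg": 2.648, "stdev": 2.458}, '
    '"pm2p5": {"avg": 2.992, "stdev": 2.652}, '
    '"pm10": {"avg": 2.77, "stdev": 2.456}}, "samples": 2307}'
)

flat = flatten(json.loads(line))
low, high = (float(bound) for bound in flat["domain"].split(" - "))
```

Each summary document yields one flat record, with keys such as praxis.climate.val.hmd.avg and error.pm1.stdev alongside domain and samples.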

SEE ALSO

scs_analysis/csv_collator
scs_analysis/sample_error

RESOURCES

https://en.wikipedia.org/wiki/Dependent_and_independent_variables