sample_collator - bruno-beloff/scs_analysis GitHub Wiki
DESCRIPTION
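The sample_collator utility separates the values of a dependent field in the input JSON documents into columns, according to the value of an independent field. Each column covers a range of the independent value, bounded by a lower and an upper bound. For each column, assignment follows the rule: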
lower bound <= value < upper bound
The upper and lower bounds for the data set should be specified (the lower bound defaults to 0), along with a delta, which is the width of each column's domain. The number of columns required to service this domain is calculated automatically. The paths identifying the leaf nodes in the input document for the independent and dependent fields must both be specified.
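The assignment rule above can be sketched in Python. This is an illustrative sketch, not the scs_analysis implementation: `column_bounds` and `collate` are hypothetical names, the independent value 8.4 is chosen for illustration, and the extra column starting at the upper bound is an assumption taken from the OUTPUT example on this page, where `-u 60 -d 10` yields a `PM25_60-70` column.

```python
def column_bounds(lower, upper, delta):
    """Return (low, high) pairs covering lower..upper in steps of delta.

    Assumption: a final column starting at the upper bound is included,
    to match the OUTPUT example on this page.
    """
    bounds = []
    low = lower
    while low <= upper:
        bounds.append((low, low + delta))
        low += delta
    return bounds


def collate(ind_value, dep_value, name, lower, upper, delta):
    """Place dep_value in the column whose domain contains ind_value.

    Every other column is set to None (null in JSON). The original
    dependent value is kept under "src", as in the OUTPUT example.
    """
    columns = {"src": dep_value}
    for low, high in column_bounds(lower, upper, delta):
        key = f"{name}_{low}-{high}"
        # rule: lower bound <= value < upper bound
        columns[key] = dep_value if low <= ind_value < high else None
    return columns


# hypothetical independent value 8.4, dependent value 3.561
result = collate(8.4, 3.561, "PM25", 0, 60, 10)
print(result)
```

With integer bounds, the generated keys follow the `NAME_low-high` pattern seen in the OUTPUT example; how the real utility formats non-integer bounds is not covered by this sketch.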
Two collators are provided in this package: sample_collator collates into separate columns (collate to columns), whereas csv_collator collates into separate CSV files (collate to rows).
SYNOPSIS
sample_collator.py -x IND_PATH [-n NAME] -y DEP_PATH [-l LOWER_BOUND] -u UPPER_BOUND -d DELTA [-v]
Options | |
---|---|
--version | show program's version number and exit |
-h, --help | show this help message and exit |
-x IND_PATH, --ind-path=IND_PATH | path to independent variable |
-n NAME, --name=NAME | name for the independent variable |
-y DEP_PATH, --dep-path=DEP_PATH | path to dependent variable |
-u UPPER, --upper=UPPER | upper bound of dataset |
-d DELTA, --delta=DELTA | width of column domain |
-l LOWER, --lower=LOWER | lower bound of dataset (default 0) |
-v, --verbose | report narrative to stderr |
EXAMPLES
csv_reader.py -v -l 1 scs-pb1-3-ref-opc-r1-error-2019-09-27T11-03-41+01-00-15min-exegesis-error.csv | \
sample_collator.py -v -x "ref.pmx.PM25 Processed Measurement (µg/m³)" -n PM25 -y error.pm2p5 -u 60 -d 10
DOCUMENT EXAMPLE - INPUT
{"rec": "2019-10-11T10:15:00Z", "opc": {"pmx": {"tag": "scs-pb1-3", "src": "R1", "val": {"per": 9.9, "pm1": 2.2, "pm2p5": 8.4, "pm10": 12.0}}}, "error": {"pm1": 1.517, "pm2p5": 3.561, "pm10": 3.429}}
DOCUMENT EXAMPLE - OUTPUT
{"rec": "2019-10-11T10:15:00Z", "opc": {"pmx": {"tag": "scs-pb1-3", "src": "R1", "val": {"per": 9.9, "pm1": 2.2, "pm2p5": 8.4, "pm10": 12.0}}}, "error": {"pm1": 1.517, "pm2p5": {"src": 3.561, "PM25_0-10": 3.561, "PM25_10-20": null, "PM25_20-30": null, "PM25_30-40": null, "PM25_40-50": null, "PM25_50-60": null, "PM25_60-70": null}, "pm10": 3.429}}
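The input and output examples differ only at the dependent leaf (`error.pm2p5`), which is replaced by an object holding the original value under `src` plus one field per column. A minimal sketch of that replacement follows, assuming dot-separated paths. `get_leaf` and `set_leaf` are hypothetical helpers rather than part of scs_analysis, and the document and column set are abbreviated for brevity.

```python
import json


def get_leaf(document, path):
    """Follow a dot-separated path down to a leaf value."""
    node = document
    for key in path.split("."):
        node = node[key]
    return node


def set_leaf(document, path, value):
    """Replace the leaf at a dot-separated path with a new value."""
    *parents, leaf = path.split(".")
    node = document
    for key in parents:
        node = node[key]
    node[leaf] = value


# abbreviated version of the INPUT example
line = '{"error": {"pm1": 1.517, "pm2p5": 3.561, "pm10": 3.429}}'
doc = json.loads(line)

dep = get_leaf(doc, "error.pm2p5")

# abbreviated column set: the full OUTPUT example has seven columns
set_leaf(doc, "error.pm2p5", {"src": dep, "PM25_0-10": dep, "PM25_10-20": None})

print(json.dumps(doc))
```

Sibling fields such as `error.pm1` and `error.pm10` are left untouched, as in the OUTPUT example.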