Run the pipeline - NCAR/kcor-pipeline GitHub Wiki
Configuration file
The new pipeline uses a configuration file to specify how it should be run: paths, which actions to perform, and so on. You always need a configuration file to run the pipeline. The configuration filename should be kcor.FLAGS.cfg, where FLAGS is a name identifying the run. See config/kcor.spec.cfg for the definition of each possible option. You must change at least the paths in your configuration file in order to run the pipeline.
Database credentials
If you want to use the database, you must have your database login credentials in a file. I recommend creating a file that is not viewable by anyone but yourself. The format of the file should be:
[pipeline]
host : {database URL}
user : {actual username here}
password : {actual password here}
port : 3306
database : MLSO
The file can contain multiple logins, each in its own section. The config file specifies the location of the credentials file and which section of it to use.
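The layout above happens to be readable by Python's standard configparser module, which accepts ":" as a key/value delimiter. The following is a minimal sketch, not the pipeline's own reader: it writes a sample credentials file with owner-only permissions and reads one section back. The file name kcor-credentials.cfg, the helper names, and the sample values are hypothetical.

```python
import configparser
import os
import tempfile

# Sample contents in the format shown above; all values are placeholders.
SAMPLE = """\
[pipeline]
host     : db.example.org
user     : someuser
password : not-a-real-password
port     : 3306
database : MLSO
"""


def write_private_file(path, text):
    """Create the file readable and writable by the owner only (POSIX mode 600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(text)


def read_credentials(path, section):
    """Return the login stored in one section of the credentials file."""
    # interpolation=None so that a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    with open(path) as f:
        parser.read_file(f)
    s = parser[section]
    return {
        "host": s["host"],
        "user": s["user"],
        "password": s["password"],
        "port": s.getint("port"),
        "database": s["database"],
    }


tmpdir = tempfile.mkdtemp()
cred_path = os.path.join(tmpdir, "kcor-credentials.cfg")  # hypothetical name
write_private_file(cred_path, SAMPLE)
creds = read_credentials(cred_path, "pipeline")
print(creds["host"], creds["port"], creds["database"])
```

Additional logins would go in further sections of the same file, each with its own [name] header, and would be selected by passing that name as the section argument.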
Running the pipeline
Running the pipeline requires a Python 3 installation with the psutil package installed (it can be installed via pip).
Use the kcor script in the bin directory of a kcor-pipeline installation to run the pipeline. As stated above, you will need to have a valid and appropriately named configuration file.
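Before launching a run, it can be handy to confirm these prerequisites from Python. The sketch below is only an illustration: the function names are hypothetical, and the directory holding the configuration file is passed in as an assumption, since where the kcor script looks for it is not stated here.

```python
import importlib.util
import os
import sys


def config_filename(flags):
    """Build the configuration filename for a given FLAGS value."""
    return f"kcor.{flags}.cfg"


def check_prerequisites(flags, config_dir):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info.major < 3:
        problems.append("Python 3 is required")
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("psutil") is None:
        problems.append("psutil is not installed (try: pip install psutil)")
    # config_dir is an assumption: pass whichever directory holds your config files.
    cfg = os.path.join(config_dir, config_filename(flags))
    if not os.path.isfile(cfg):
        problems.append(f"configuration file not found: {cfg}")
    return problems


for problem in check_prerequisites("reprocess-2018", "."):
    print("problem:", problem)
```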
Running the realtime and end-of-day pipeline
To run the pipeline for a day, or range of days, do something like the following:
$ kcor process -f reprocess-2018 20180702
where reprocess-2018 is the "FLAGS" portion of the configuration filename to be used for the run. To run the pipeline for multiple days, use "-" for a range (start date inclusive, end date exclusive) and "," to combine dates, like:
$ kcor process -f reprocess-2018 20180601-20180701,20180704
which runs the pipeline for all of June 2018 plus 20180704 (but not 20180701).
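To make the range semantics concrete, here is a small Python sketch that expands such a date expression into individual days. It only illustrates the rule described above (start inclusive, end exclusive) and is not the parser the kcor script uses; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_day(text):
    """Parse a YYYYMMDD string into a date."""
    return datetime.strptime(text, "%Y%m%d").date()


def expand_dates(expr):
    """Expand an expression like '20180601-20180701,20180704' into YYYYMMDD strings."""
    days = []
    for part in expr.split(","):
        if "-" in part:
            start_text, end_text = part.split("-")
            day, end = parse_day(start_text), parse_day(end_text)
            # The start date is included; the end date is excluded.
            while day < end:
                days.append(day.strftime("%Y%m%d"))
                day += timedelta(days=1)
        else:
            days.append(parse_day(part).strftime("%Y%m%d"))
    return days


days = expand_dates("20180601-20180701,20180704")
print(len(days), days[0], days[-2], days[-1])
# prints: 31 20180601 20180630 20180704
```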
Just producing a calibration
To run the calibration for a day, or range of days, do something like the following:
$ kcor cal -f reprocess-2018 20180702
where reprocess-2018 is the "FLAGS" portion of the configuration filename to be used for the run. You can also do:
$ kcor cal --list files.txt -f reprocess-2018 20180702
where files.txt is a file containing a list of files to use for the calibration.