Level 3 - Post-processing of data

Overview

Level 3, or L3 for short, is the stage where PyFluxPro performs any required calculations, corrections, merging or averaging of data streams from multiple instruments. The calculations and corrections applied to the data depend on the type of data being used with PyFluxPro and the type of measurements at the site. See below for a summary of what L3 can do if required.

One of the primary functions of L3 processing is to combine data from multiple instruments into a single data stream where it is appropriate to do so. For example, most flux towers have 2 measurements of humidity at the top of the flux tower, one from a slow response instrument such as a Vaisala HMP-155 and another from a fast response IRGA such as an LI-7500RS. As long as the IRGA is calibrated, the data from these 2 instruments can be merged so that any gaps in the slow response instrument are filled with data from the IRGA. The same can be done for temperature (after converting the sonic temperature to air temperature), wind speed and direction, CO2 concentration etc. In some cases, it is more appropriate to average the data from several sensors instead of merging. Many sites have several soil pits to provide spatial replicates of soil moisture, soil temperature and soil heat flux. In this case, averaging the data from several sensors provides a better estimate of the spatial average of the data.

It is very important to understand that PyFluxPro assumes that some variables will exist and have specific names so they can be used at later stages. For example, PyFluxPro will look for variables called Ta, AH (or RH or SH) and ps so that it can calculate various meteorological quantities such as air density, saturation vapour pressure etc. The tables below list the required (reqd) and recommended variables at L3.

Variable name	Quantity	Merge from examples (Loxton)
AH	Absolute humidity	AH_HMP_10m, AH_IRGA_Av
ps	Air pressure	ps
Ta	Air temperature	Ta_HMP_10m, Ta_SONIC_Av

Table 1: Required variables that must be created at L3 if they do not exist at L2.
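
To illustrate why Ta, AH and ps are required, the sketch below derives vapour pressure, saturation vapour pressure, relative humidity and air density from them. It is a minimal illustration and not PyFluxPro's own code: the function names, the assumed units (Ta in degC, AH in g/m^3, ps in kPa) and the choice of the Tetens formula for saturation vapour pressure are all assumptions made for this example.

```python
import math

R_DRY = 287.05    # specific gas constant for dry air, J/kg/K
R_VAPOUR = 461.5  # specific gas constant for water vapour, J/kg/K


def saturation_vapour_pressure(ta_c):
    """Saturation vapour pressure in kPa from the Tetens formula (Ta in degC)."""
    return 0.6108 * math.exp(17.27 * ta_c / (ta_c + 237.3))


def vapour_pressure(ah_g_m3, ta_c):
    """Vapour pressure in kPa from absolute humidity using the ideal gas law."""
    ah_kg_m3 = ah_g_m3 / 1000.0
    return ah_kg_m3 * R_VAPOUR * (ta_c + 273.15) / 1000.0


def relative_humidity(ah_g_m3, ta_c):
    """Relative humidity in percent."""
    return 100.0 * vapour_pressure(ah_g_m3, ta_c) / saturation_vapour_pressure(ta_c)


def moist_air_density(ta_c, ah_g_m3, ps_kpa):
    """Density of moist air in kg/m^3: dry air density plus water vapour density."""
    e_pa = vapour_pressure(ah_g_m3, ta_c) * 1000.0
    dry_air = (ps_kpa * 1000.0 - e_pa) / (R_DRY * (ta_c + 273.15))
    return dry_air + ah_g_m3 / 1000.0
```

None of these quantities can be computed if any one of Ta, AH or ps is missing, which is why all three must exist at L3.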

Variable name	Quantity	Merge from examples (Loxton)
CO2	CO2 concentration	CO2_IRGA_Av
Fn	Net radiation	Fn_4cmpt
Wd	Wind direction	Wd_034B_Av, Wd_SONIC_Av
Ws	Wind speed	Ws_034B_Av, Ws_SONIC_Av

Table 2: Recommended variables that should be created at L3 by merging data from different sensors.

Note that in the above example of recommended variables, the net radiation, variable name Fn, is created by merging a single variable Fn_4cmpt. The Fn_4cmpt variable is created automatically by PyFluxPro if it finds the 4 components of radiation, Fsd, Fsu, Fld and Flu, in the L2 data.
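
The 4 component calculation is a simple radiation balance. The sketch below assumes that Fsd and Fld are the incoming (downwelling) shortwave and longwave radiation and that Fsu and Flu are the outgoing (upwelling) equivalents, all in W/m^2. The function name and the use of None for missing data are choices made for this example, not PyFluxPro's implementation.

```python
def net_radiation(fsd, fsu, fld, flu):
    """Net radiation (W/m^2): net shortwave plus net longwave.

    Returns None if any of the 4 components is missing.
    """
    if any(component is None for component in (fsd, fsu, fld, flu)):
        return None
    return (fsd - fsu) + (fld - flu)
```

For example, net_radiation(800.0, 160.0, 350.0, 450.0) gives 540.0 W/m^2.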

Variable name	Quantity	Merge from examples (Loxton)
Fg	Ground heat flux	Fg_10cma,Fg_10cmb,Fg_10cmc, Fg_10cmd,Fg_10cme,Fg_10cmf
Sws	Soil moisture	Sws_005cm
Ts	Soil temperature	Ts_2cma,Ts_2cmb,Ts_2cmc, Ts_2cmd,Ts_6cma,Ts_6cmb, Ts_6cmc,Ts_6cmd

Table 3: Recommended variables that should be created at L3 by averaging data from different sensors.

PyFluxPro can either read the turbulent fluxes calculated by an external program (e.g. EddyPro, EasyFlux-PC) at L1 or it can calculate the fluxes from the covariances at L3 if these are available. The user can choose which method to use with the UseL2Fluxes option in the Options section, as described below. First, however, a little history to explain why these 2 options exist.

A Little History

The predecessor to PyFluxPro was called OzFluxQC and the predecessor to OzFluxQC was just called qc. It had no GUI and was simply a collection of Python scripts written to help process the data from the 8 sites of the North Australian Tropical Transect (NATT) during my failed post-doc from 2007 to 2010. In fact, qc and later OzFluxQC were what I did instead of writing and publishing papers and that explains why today I'm a lowly technician and not a much vaunted academic telling technicians what they should do. So it goes. Let it be a lesson to all young post-docs that you must publish, publish, publish.

From the beginning, it was obvious that a huge amount of work would be required to find and process the 10 Hz data to get the turbulent fluxes from 8 sites. The 10 Hz data was patchy (many CF cards failed), it was stored on CDs with helpful names like ts_2763.dat, there was no documentation of what was in the binary files (TOB1 and TOB3) and so on. Even just collecting all of the CDs, copying them, converting the binary files to TOA5 and documenting what data was where would have taken several weeks. Then the data would have to be pushed through EddyPro or similar, in chunks small enough to deal with the changes in file formats, instruments etc. What to do?

The Campbell data logger systems being used in those days output all of the variances and covariances from all of the instruments sampled at 10 Hz (CSAT3 and IRGA) averaged over 30 minutes. Aha! A quick and dirty way to get the fluxes would be to do a 2D coordinate rotation on the covariances (all the sites were flat anyway), apply the frequency and density (WPL) correction to the rotated covariances, convert virtual heat flux to sensible heat flux and boom, pretty good fluxes for 5% of the effort required to find, collate and process all of the 10 Hz data.

Later on, when it became obvious that site PIs were going to continue using the quick and dirty fluxes-from-covariances method because it saved time, I got worried. After all, I'm a Physicist, a Perfectionist and a Purist in that order and that meant TURBULENT FLUXES SHOULD BE CALCULATED FROM THE RAW DATA! So, I added the ability to read fluxes, calculated by EddyPro, at L1 instead of calculating fluxes from covariances at L3. No doubt there is a small, particularly hot corner of Hell presided over by a minor demon called Papale to which I will be sent for my early sins.

L3 Processing Steps

The processing steps available at L3 are as follows:

Merge data streams from multiple instruments into a single variable. For example, merge the humidity measurements from an IRGA into the data stream from a slow response sensor such as a Vaisala HMP-45 to fill any gaps in the slow response data caused by instrument failures. Merging can be done for any variables.
Calculate a standard set of meteorological variables and add these to L3 data.
Calculate the turbulent fluxes (Fco2, Fe, Fh, Fm and ustar) from the variances and covariances after applying the 2D coordinate rotation, correcting for low- and high-frequency losses, converting from virtual heat flux to sensible heat flux and applying the Webb Pearman Leuning (WPL) correction.
Convert CO2 concentration and flux to mixing ratio and molar units respectively and correct the CO2 flux for storage if requested.
Calculate net radiation from the 4 components, if available.
Combine other data streams such as wind speed, wind direction, soil temperature, ground heat flux, soil water content etc. Some of these are merged, some are spatial replicates that can be averaged.
Correct the ground heat flux for storage of heat above the heat flux plates.
Calculate the available energy from Fn-Fg.
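
The 2D coordinate rotation mentioned in the steps above can be sketched as follows. This is a generic textbook double rotation (first rotate about the vertical axis so the mean crosswind is zero, then tilt so the mean vertical wind is zero) and is not taken from the PyFluxPro source. The function names are hypothetical. Because a covariance is linear in the wind components, the covariances of a scalar with u, v and w rotate in the same way as the wind vector itself.

```python
import math


def rotation_angles(u_mean, v_mean, w_mean):
    """Return (theta, eta): the yaw and pitch angles, in radians, of the mean wind."""
    theta = math.atan2(v_mean, u_mean)
    eta = math.atan2(w_mean, math.hypot(u_mean, v_mean))
    return theta, eta


def rotate_vector(u, v, w, theta, eta):
    """Apply the yaw rotation (theta) and then the pitch rotation (eta)."""
    u1 = u * math.cos(theta) + v * math.sin(theta)
    v1 = -u * math.sin(theta) + v * math.cos(theta)
    u2 = u1 * math.cos(eta) + w * math.sin(eta)
    w2 = -u1 * math.sin(eta) + w * math.cos(eta)
    return u2, v1, w2


def rotate_scalar_covariance(cov_us, cov_vs, cov_ws, theta, eta):
    """Rotated vertical covariance w's' from the unrotated u's', v's' and w's'."""
    return rotate_vector(cov_us, cov_vs, cov_ws, theta, eta)[2]
```

After rotation, the mean wind lies entirely along the first axis, so the rotated w's' covariance is the flux normal to the mean streamline.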

Most of these processing steps can be activated, changed or suppressed through options in the Options section. Options for the Massman frequency correction and soil parameters used to correct the measured ground heat flux for storage are contained in separate Massman and Soil sections.

The L3 Control File

The L3 control file consists of the following sections:

Files
Options
Variables
Plots

The contents of these sections and how to edit them are described below.
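
As a rough mental model, the control file can be pictured as a nested set of sections and entries. The Python dictionary below mirrors that structure using section and entry names from this page. It is only an illustration: the paths, file names, plot title and the "Yes"/"No" option values are hypothetical, and the real control file is edited through the PyFluxPro GUI rather than written as Python.

```python
import os

control = {
    "Files": {
        "file_path": "/path/to/site/data/",   # hypothetical path
        "in_filename": "Site_L2.nc",          # hypothetical file name
        "out_filename": "Site_L3.nc",         # hypothetical file name
        "plot_path": "/path/to/site/plots/",  # hypothetical path
    },
    # Option values shown here are assumed for illustration.
    "Options": {"UseL2Fluxes": "Yes", "ApplyFco2Storage": "No"},
    "Variables": {
        "Ta": {"MergeSeries": {"source": "Ta_HMP_10m,Ta_SONIC_Av"}},
        "Fg": {"AverageSeries": {"source": "Fg_10cma,Fg_10cmb,Fg_10cmc"}},
    },
    # Each plot sub-section name is the plot title.
    "Plots": {"Air temperature": {"variables": "Ta,Ta_HMP_10m,Ta_SONIC_Av"}},
}


def split_list(text):
    """Split a comma separated entry into a list of variable names."""
    return [item.strip() for item in text.split(",") if item.strip()]


def input_file(ctl):
    """Full path of the input file named in the Files section."""
    files = ctl["Files"]
    return os.path.join(files["file_path"], files["in_filename"])
```

For instance, split_list(control["Variables"]["Ta"]["MergeSeries"]["source"]) returns the two source variables for Ta in merge order.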

The Files Section

Description of the Files section

The Files section allows the user to specify the path to the input and output files, the names of the input and output files and the path for plots generated by the L3 processing, see the screenshot below.

Image of the Files section in an L3 control file

The entries in the Files section are as follows:

file_path - the path to the data files
in_filename - the input file name
out_filename - the output file name
plot_path - the path for plots generated by the L3 processing

Editing the Files Section

The entries in Files section can be edited by right clicking on the entry in the Value column and using the Browse... feature or by double clicking on the entry in the Value column and manually entering the required text.

The Options Section

Description of the Options Section

The Options section allows the user to specify the options that control aspects of the L3 processing.

The example above shows the Options section for the L3 control file for the Loxton site in PFP_examples. More options can be displayed by right clicking on the Options section heading to display a context menu of available options, see the screenshot below.

The available options are described below:

zms is the measurement height of the CO2 concentration in metres.
CO2Units are the units for the CO2 concentration.
Fco2Units are the units for the CO2 flux.
ApplyFco2Storage controls whether or not the turbulent CO2 flux, Fco2, is corrected for storage. The storage term to be used is decided as follows:
1. Use Fco2_storage if that is available. This variable must be read at L1; it cannot be calculated by PyFluxPro at present.
2. Use Fco2_profile if that is available and Fco2_storage is not available. This variable must be read at L1; it cannot be calculated by PyFluxPro at present.
3. Use Fco2_single if neither Fco2_storage nor Fco2_profile are available. Fco2_single is calculated by PyFluxPro.
UseL2Fluxes controls whether PyFluxPro uses the turbulent fluxes read in at L1 (for example, from EddyPro) or calculates the turbulent fluxes from the variances and covariances.
MassmanCorrection controls whether or not PyFluxPro will apply the Massman frequency correction to the turbulent fluxes calculated from the variances and covariances.
2DCoordRotation controls whether or not PyFluxPro will apply the 2D coordinate rotation to the variances and covariances before calculating the turbulent fluxes.
CorrectIndividualFg controls whether or not PyFluxPro will correct individual ground heat flux measurements (Fg) for heat storage in the layer above the heat flux plates. The alternative is to average the individual Fg, soil temperature (Ts) and volumetric water content (Sws) measurements first and then correct these spatial averages for heat storage.
CorrectFgForStorage controls whether or not PyFluxPro will apply the ground heat flux correction.
KeepIntermediateSeries controls whether or not PyFluxPro will delete intermediate series generated during processing. It can be useful to keep the intermediate series when debugging problems with the data.
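
The order of preference for the CO2 storage term described under ApplyFco2Storage can be sketched as below. The function names are hypothetical and the use of None for missing data is an assumption made for this example. Adding the storage term to the turbulent flux is the usual form of the correction, but this is not a copy of PyFluxPro's code.

```python
STORAGE_PREFERENCE = ("Fco2_storage", "Fco2_profile", "Fco2_single")


def choose_storage_term(available):
    """Return the name of the first preferred storage series that is available."""
    for name in STORAGE_PREFERENCE:
        if name in available:
            return name
    return None


def apply_storage(fco2, storage):
    """Add the storage term to the turbulent flux, keeping gaps as gaps."""
    if fco2 is None or storage is None:
        return None
    return fco2 + storage
```

If only Fco2_profile and Fco2_single are present, choose_storage_term returns "Fco2_profile".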

Right clicking on most option values will display a context menu containing alternate values for this option, see below.

The Soil Section

The Soil section allows the user to specify the following soil properties:

FgDepth is the depth of the soil heat flux plates.
BulkDensity is the soil bulk density in units of kg/m^3. This is used when calculating the specific heat capacity of the soil.
OrganicContent is the organic content of the soil as a fraction (0 to 1) and is used when calculating the specific heat capacity of the soil.
SwsDefault is the default soil moisture in units of m^3/m^3, to be used when the measured soil moisture is missing.
SwsSeries is the name of the soil moisture variable used to calculate the specific heat capacity of the soil.
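
The sketch below shows how these soil properties could enter the ground heat flux storage correction. It follows the common approach of adding the rate of change of heat stored in the soil layer above the plates to the plate measurement. The specific heat values, the function names and the use of None for missing soil moisture are assumptions made for this example and may differ from what PyFluxPro does. Note that the quantity computed here is the volumetric heat capacity in J/m^3/K.

```python
C_MINERAL = 840.0   # assumed specific heat of dry mineral soil, J/kg/K
C_ORGANIC = 1920.0  # assumed specific heat of organic matter, J/kg/K
C_WATER = 4180.0    # specific heat of liquid water, J/kg/K
RHO_WATER = 1000.0  # density of liquid water, kg/m^3


def soil_heat_capacity(bulk_density, organic_content, sws):
    """Volumetric heat capacity of moist soil in J/m^3/K."""
    dry = bulk_density * ((1.0 - organic_content) * C_MINERAL
                          + organic_content * C_ORGANIC)
    return dry + sws * RHO_WATER * C_WATER


def corrected_fg(fg_plate, ts_prev, ts_now, dt_seconds, fg_depth,
                 bulk_density, organic_content, sws, sws_default):
    """Plate heat flux plus storage in the layer above the plates (W/m^2)."""
    if sws is None:
        sws = sws_default  # fall back to SwsDefault when Sws is missing
    cs = soil_heat_capacity(bulk_density, organic_content, sws)
    storage = cs * (ts_now - ts_prev) / dt_seconds * fg_depth
    return fg_plate + storage
```

With a bulk density of 1600 kg/m^3, no organic matter, a soil moisture of 0.2 m^3/m^3, plates at 0.08 m and a 1 degC rise in soil temperature over 30 minutes, the storage term is roughly 97 W/m^2, which shows why the correction matters.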

The Massman Section

The Massman section allows the user to specify the following parameters used in the Massman frequency correction:

zmd is the measurement height above the displacement height i.e. (z-d) in units of metres.
north_separation is the separation between the sonic anemometer and the IRGA in the North direction in units of metres.
east_separation is the separation between the sonic anemometer and the IRGA in the East direction in units of metres.
vertical_separation is the separation between the sonic anemometer and the IRGA in the vertical direction in units of metres.
z0 is the roughness length in units of metres.

If you are using fluxes calculated by EddyPro or similar and read in at L1 then you do not need to apply any frequency correction at L3 and the Massman section is not needed.

The Variables Section

Description of the Variables Section

The Variables section is where the user specifies the processing, merging and averaging options for the variables. As explained in the Overview section, the variables Ta, AH and ps are assumed to exist by PyFluxPro at L3. These variables need to be created at L3 if they are not present in the L2 data. The required variables are usually created by merging data from 1 or more instruments.

An example of the Variables subsection from the Loxton example control file is shown below with the AH and Ta subsections expanded.

The 2 variables shown in the example above demonstrate the use of MergeSeries to create 2 variables, AH and Ta respectively, during the L3 processing. It also shows the RangeCheck quality control test being applied to the new variable created. All of the quality control tests available at L2 are also available at L3 but, strictly speaking, it is only necessary to use them if any of the variables being merged was not cleaned up at L2.

The user can add or remove variables and add or remove merge, average or quality control checks by right clicking on a variable name. This will display a context menu populated with the available options, see the example below.

The quality control checks are described in the L2 section of this wiki. The sections below describe the operation of the MergeSeries and AverageSeries L3 options.

Description of L3 merging and averaging of variables

MergeSeries

The screenshot below shows the use of MergeSeries for the Ta variable in the Loxton L3 processing.

The example shows that the variables Ta_HMP_10m and Ta_SONIC_Av will be merged to produce a single variable called Ta. The variables listed under the source key of the MergeSeries instruction are merged in order from left to right. In this case, Ta_HMP_10m will be used first to create Ta and then wherever there are gaps, these will be filled with data from Ta_SONIC_Av.

The MergeSeries instruction can be used with multiple variables which will be merged from left to right in the order specified in source. It can also be used with only 1 variable to create a new variable identical to the original in all but name.
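
The left-to-right behaviour of MergeSeries can be sketched in a few lines of Python. This is an illustration of the idea only, not PyFluxPro's code: the function name is hypothetical and missing values are represented by None for simplicity.

```python
def merge_series(sources):
    """Merge equal-length series, giving priority to earlier sources.

    At each time step the value comes from the first source that has data.
    If every source is missing, the merged value is also missing (None).
    """
    merged = []
    for values in zip(*sources):
        merged.append(next((v for v in values if v is not None), None))
    return merged


# Hypothetical data: the HMP has gaps at the 2nd and 3rd time steps.
ta_hmp_10m = [20.1, None, None]
ta_sonic_av = [19.8, 18.5, None]
ta = merge_series([ta_hmp_10m, ta_sonic_av])
```

Here ta takes its first value from Ta_HMP_10m, fills the second from Ta_SONIC_Av and remains missing at the third time step because neither source has data.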

AverageSeries

The AverageSeries instruction works in a similar way to the MergeSeries instruction but averages the data across multiple variables instead of merging them, see the example from Loxton below.

The example shows that 6 ground heat flux measurements (Fg_10cma, Fg_10cmb, Fg_10cmc, Fg_10cmd, Fg_10cme and Fg_10cmf) will be averaged into 1 variable Fg. The order of the variables in source is not important in the AverageSeries instruction.
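
A minimal sketch of the averaging idea is given below. It assumes that the average at each time step is taken over whichever sensors have data at that time, which may not match PyFluxPro's handling of gaps exactly. The function name is hypothetical and missing values are represented by None.

```python
def average_series(sources):
    """Average equal-length series at each time step, ignoring missing values."""
    averaged = []
    for values in zip(*sources):
        present = [v for v in values if v is not None]
        averaged.append(sum(present) / len(present) if present else None)
    return averaged


# Hypothetical data from 2 heat flux plates.
fg_10cma = [10.0, None, None]
fg_10cmb = [30.0, 40.0, None]
fg = average_series([fg_10cma, fg_10cmb])
```

Swapping the order of the sources gives the same result, in contrast to MergeSeries where the order sets the priority.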

Editing of the Variables Section

Editing the contents of the Variables section is similar to editing other sections in the L3 control file. Items can be added to or removed from the section using a context-sensitive menu that is displayed when the user right clicks on the section or sub-section titles in the Parameter column. Entries in the Value column can be edited by double clicking on the text in the Value column and editing the text.

Removing a Variable

Variables can be removed from the Variables section by right clicking on the variable name and selecting Remove variable, see the screenshot below.

Adding a New Variable

New variables can be added to this section by right clicking on the Variables section title in the Parameter column and selecting Add variable from the displayed context menu. The new variable is added after the last entry in the Variables section and is given the name New variable, see the screenshot below.

You can also add a new variable immediately above an existing variable by right clicking on the variable name and selecting New variable, see the screenshot below.

The Plots Section

The Plots section is used to specify the number and type of plots to be produced at the end of the L3 processing. Visualising data through plots is an important part of processing data using PyFluxPro. At L3, the plots are designed to show the results of the post-processing and the merged and averaged variables. As with the L2 processing, it may be necessary to repeat the L3 step several times to get the required results. The plots are designed to visualise the L3 data and to aid the user in deciding when the L3 data is fit for purpose.

The Plots section consists of an arbitrary number of sub-sections and each sub-section name is the title of a plot. Each plot sub-section has a single entry called variables which is a comma separated list of variables to be plotted, see the screenshot below.

You can edit the list of variables to be plotted by double clicking on the variables entry in the Value column.

Adding and Removing Plots

Plots can be added to the Plots section by right clicking on the Plots section title and selecting the type of plot to be added, see below.

To remove a plot, right click on the plot in the Plots section and select Remove plot, see the screenshot below.

Disable and Enable Plots

It can take a long time to render all of the requested plots to the screen and this can be frustrating if the user is iterating around a particular set of results for a small number of variables.

Plots defined in the Plots section can be disabled or enabled as required. Disabled plots won't be rendered to the screen and this speeds up the process of examining a small group of variables. To disable or enable a plot, right click on the plot name and select the appropriate option, see the screenshot above. Disabled plots are labelled with (disabled), see below.

Running L3

Once the user has finished editing the L3 control file, it can be run by using the Current option of the Run entry on the main menu. The shortcut to run the current control file is Ctrl+R (press and hold down the control key and press the R key).

Output from Running L3

Files Produced During L3

The L3 processing produces 2 output files:

The L3 netCDF file, containing the variables read in from L2 plus the new variables created at L3 by merging or averaging variables.
An Excel workbook containing summaries of the quality control flag values for each variable.

Plots Produced During L3

PyFluxPro produces plots of the data at L3. An example of one of these plots is given below.

The L3 plots have several components and are similar to the L2 plots. Each plot has 3 columns and can have one or more rows with each row representing data for a particular variable.

The left-most column is time series plots of the L3 and, where it exists, the L2 data. This allows the user to quickly see the effect of any quality control checks at L3 and to compare variables calculated at L3 with L2 variables. The L2 data is plotted in blue and uses the left hand Y axis which is labelled in blue. The L3 data is plotted in red and uses the right hand Y axis which is labelled in red. Where the L2 and L3 data are identical (Flu, Fld, Fsu and Fsd in the example above), the L3 data (red) is plotted over the L2 data (blue) and obscures the L2 data. Where the L2 data is not present (Fn in the example above), only the L3 data (red) is plotted.

Other features of the L3 plots are the same as described for the L2 plots.