1.4 NetCDF file format and structure - wolfiex/AerVis GitHub Wiki
General structure
The generated netCDF file has five main components. These are:
- Dimensions
- Coordinates
- Attributes
- Variables (datasets)
- Groups
Dimensions
These give the name and length of each axis, and so define the shape of any datasets (variables) contained within the file. In general these are longitude, latitude, elevation (model_level_number), pressure and pseudo_level.
Dimensions: (latitude: 144, longitude: 192, model_level_number: 85, pressure: 3, pseudo_level: 27)
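As a minimal sketch of how dimensions determine the shape of a variable, the snippet below builds a small in-memory dataset with xarray and reads the dimension sizes back. The sizes are deliberately smaller than those of the real file, and the variable name `example_field` is hypothetical.

```python
import numpy as np
import xarray as xr

# Toy sizes; the real file uses latitude: 144, longitude: 192, and so on.
n_lat, n_lon, n_pressure = 4, 6, 3

ds = xr.Dataset(
    data_vars={
        # 'example_field' is a hypothetical variable name used for illustration.
        "example_field": (
            ("pressure", "latitude", "longitude"),
            np.zeros((n_pressure, n_lat, n_lon), dtype="float32"),
        ),
    },
    coords={
        "latitude": np.linspace(-89.375, 89.375, n_lat, dtype="float32"),
        "longitude": np.linspace(0.9375, 359.0625, n_lon, dtype="float32"),
        "pressure": np.array([250.0, 500.0, 850.0], dtype="float32"),
    },
)

# Dimension names map to their lengths.
dim_sizes = dict(ds.sizes)
print(dim_sizes)

# A variable's shape follows the order of the dimensions it was declared with.
field_shape = ds["example_field"].shape
print(field_shape)
```

Every variable in the file draws its axes from this shared pool of named dimensions, which is why two variables on the same grid can be combined without restating the grid.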
Coordinates
These are constant 'attributes' which describe the data. Examples include 1D arrays giving the latitude, longitude and elevation values along each dimension, the timestamps for each column of data, or the levels. Coordinates can be multidimensional, with their shape defined by the dimensions.
Coordinates:
* latitude (latitude) float32 -89.375 -88.125 ... 89.375
* longitude (longitude) float32 0.9375 2.8125 ... 359.0625
* pressure (pressure) float32 250.0 500.0 850.0
* pseudo_level (pseudo_level) int32 3 4 6 7 8 ... 907 908 909 910
* model_level_number (model_level_number) int32 1 2 3 4 ... 82 83 84 85
forecast_period timedelta64[ns] ...
forecast_reference_time datetime64[ns] ...
time datetime64[ns] ...
level_height (model_level_number) float32 ...
sigma (model_level_number) float32 ...
surface_altitude (latitude, longitude) float32 ...
altitude (model_level_number, latitude, longitude) float32 ...
Additionally, any calculation constants are appended as coordinates. This is for easy referencing, as coordinates are where we store all information used to describe the data:
r_specific float64 287.1
molar_mass_air float64 0.02899
avogadro float64 6.022e+23
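A minimal sketch of why scalar coordinates are convenient: once attached, they travel with every variable selected from the dataset. The temperature values, the surface pressure and the number-density formula below are illustrative assumptions (treating `r_specific` as the specific gas constant of dry air in J kg^-1 K^-1 and `molar_mass_air` as kg mol^-1); they are not taken from AerVis.

```python
import numpy as np
import xarray as xr

# Hypothetical temperature field in kelvin, for illustration only.
ds = xr.Dataset(
    {"temperature": (("latitude",), np.array([288.15, 250.0]))},
    coords={"latitude": [-45.0, 45.0]},
)

# Attach the constants as scalar (zero-dimensional) coordinates.
ds = ds.assign_coords(r_specific=287.1, molar_mass_air=0.02899, avogadro=6.022e23)

# Scalar coordinates are carried along when a variable is selected.
temperature = ds["temperature"]
print(list(temperature.coords))

# Read the constants back as plain Python floats.
r_specific = temperature["r_specific"].item()
molar_mass_air = temperature["molar_mass_air"].item()
avogadro = temperature["avogadro"].item()

# Assumed example: ideal-gas number density of air, n = p * N_A / (R_specific * M * T).
pressure_pa = 101325.0  # hypothetical surface pressure in Pa
number_density = pressure_pa * avogadro / (r_specific * molar_mass_air * temperature)
print(number_density.values)
```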
Attributes
This is where we place information relevant to the file as a whole, such as its source files or the time taken to concatenate all pp files.
Attributes:
avg_cube_delta: 0.04959738963498088
files: ['/Users/wolfiex/UKCA_postproc/data/n96_hadgem1_qrparm....
iris_cube_delta: 120.22364183700003
L0_delta: 133.26909520400002
stashname: ~Users~wolfiex~UKCA_postproc~data~AerVis~aervis~variabl...
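As a rough sketch of how file-level metadata of this kind can be recorded, xarray exposes attributes as a plain dictionary on the dataset. The file names and the attribute name `concat_seconds` below are hypothetical and do not correspond to the attribute names AerVis writes.

```python
import time

import numpy as np
import xarray as xr

start = time.perf_counter()

# Stand-in for the real work of reading and concatenating pp files.
ds = xr.Dataset(
    {"m01s00i096": (("latitude", "longitude"), np.zeros((2, 3), dtype="float32"))}
)

elapsed = time.perf_counter() - start

# File-level attributes live in an ordinary dict.
ds.attrs["files"] = ["example_a.pp", "example_b.pp"]  # hypothetical file names
ds.attrs["concat_seconds"] = elapsed  # hypothetical attribute name

for key, value in ds.attrs.items():
    print(f"{key}: {value}")
```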
Data Variables
These contain our datasets, with the dimensions defined earlier. In our case we have named these using the stash codes, with the full names being easily extractable from the VariableReference class.
Data variables:
m01s00i096 (latitude, longitude) float32 ...
m01s00i509 (latitude, longitude) float32 ...
m01s01i202 (latitude, longitude) float32 ...
m01s01i270 (pseudo_level, latitude, longitude) float32 ...
m01s01i271 (pseudo_level, latitude, longitude) float32 ...
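The variable names follow the pattern m(model)s(section)i(item). A minimal sketch of splitting such a name into its three numbers is shown below; the helper `parse_stash` is hypothetical and is not part of AerVis or its VariableReference class.

```python
import re

# Two digits for the model, two for the section, three for the item.
STASH_PATTERN = re.compile(r"^m(\d{2})s(\d{2})i(\d{3})$")


def parse_stash(name):
    """Split a stash-style variable name into (model, section, item) integers."""
    match = STASH_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a stash-style variable name: {name!r}")
    model, section, item = (int(group) for group in match.groups())
    return model, section, item


print(parse_stash("m01s00i096"))  # (1, 0, 96)
print(parse_stash("m01s01i270"))  # (1, 1, 270)
```

The three integers correspond to the values held in each variable's STASH attribute.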
Each data variable consists of a dataset, together with the coordinates and attributes that describe it:
In [17]: d.data['m01s00i096']
Out[17]:
<xarray.DataArray 'm01s00i096' (latitude: 144, longitude: 192)>
array([[0.000000e+00, 0.000000e+00, 0.000000e+00, ..., 0.000000e+00,
0.000000e+00, 0.000000e+00],
...,
[7.713828e-07, 7.713828e-07, 7.713827e-07, ..., 7.713829e-07,
7.713829e-07, 7.713828e-07]], dtype=float32)
Coordinates:
* latitude (latitude) float32 -89.375 -88.125 ... 89.375
* longitude (longitude) float32 0.9375 2.8125 ... 359.0625
forecast_period timedelta64[ns] ...
forecast_reference_time datetime64[ns] ...
time datetime64[ns] ...
surface_altitude (latitude, longitude) float32 ...
height float64 ...
Attributes:
long_name:
source: Data from Met Office Unified Model
um_version: 11.1
STASH: [ 1 0 96]
cell_methods: time: mean (interval: 1 hour)
Groups
Generally unused in this case, but since netCDF-4 files are built on the HDF5 format, groups can be used to separate the data into a hierarchical structure. A group can contain all of the components above, and one example of their usage is comparing several different runs that contain the same variable names.