# Downloading and Interpreting Radar Level II Data

*Flight-Path-Analysis/FlightPathAnalysis GitHub Wiki*
Radar Level II data is nearly raw radar data, organized just enough for us humans. It comes from NEXRAD weather radars, which routinely collect reflectivity and Doppler measurements of the area around them.
## Downloading Radar Level II data
The primary source of historical weather radar data for this project is the University Corporation for Atmospheric Research's THREDDS S3 NEXRAD Level II DataServer. You can go to the link and browse it; this resource will be used to list the available radars and datasets for given dates.
### Listing all available radars for a given date
As you can see at the link, not all radars operate on every date. The data/weather/stations_database.csv file (example file found here) contains relevant information about all weather stations and radars; the NEXRAD radars have IDs of the form NEXRAD:[RADAR] in the CSV.
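For instance, the radar code can be recovered from an ID of that form with a simple split. The sample IDs below are illustrative, not actual rows of stations_database.csv:

```python
# Sample station IDs in the NEXRAD:[RADAR] format (illustrative, not real CSV rows)
station_ids = ["NEXRAD:KABX", "NEXRAD:KTLX", "ASOS:KJFK"]

# Keep only NEXRAD entries and strip the "NEXRAD:" prefix to get the radar code
nexrad_radars = [sid.split(":", 1)[1] for sid in station_ids if sid.startswith("NEXRAD:")]
print(nexrad_radars)  # ['KABX', 'KTLX']
```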
To list all available radars for a given date, we can take advantage of the catalog.xml file that THREDDS serves for each directory, which describes its contents. The code below does just that.
```python
# For a given date, get the list of possible radars:
import requests
from xml.etree import ElementTree
import datetime

date = datetime.datetime(2018, 1, 2)  # Some example date
year = str(date.year).zfill(4)
month = str(date.month).zfill(2)
day = str(date.day).zfill(2)

# URL for the THREDDS catalog page you are interested in.
catalog_url = f'https://thredds-aws.unidata.ucar.edu/thredds/catalog/nexrad/level2/S3/{year}/{month}/{day}/catalog.xml'

radars = []
try:
    # Send a request to the server
    response = requests.get(catalog_url)
    response.raise_for_status()  # Check that request was successful

    # Parse the returned XML content
    tree = ElementTree.fromstring(response.content)

    # Define the namespace - this is used to correctly identify elements in the XML
    namespace = {'thredds': 'http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0'}

    # Find all catalogRef elements; these contain references to other catalogs or datasets
    catalog_refs = tree.findall('.//thredds:catalogRef', namespace)

    # Collect the radar names (or other available metadata)
    for catalog_ref in catalog_refs:
        radars.append(catalog_ref.attrib['{http://www.w3.org/1999/xlink}title'])
        # You can also extract the URL path or other attributes if needed
except Exception as e:
    print(f"An error occurred when requesting radar data for {date.strftime('%Y-%m-%d')}: {e}")

print(f'{len(radars)} radars found!')
```
A convenience function that does just that, query_radar_list, can be found in /src/backend/nexrad_query.py.
### Listing all available datasets for a given date and radar

Following the same approach as above, we can list the available datasets for a given date and radar:
```python
# Finding all available datasets in a given date for a given radar
radar = radars[0]
radar_catalog_url = f'https://thredds-aws.unidata.ucar.edu/thredds/catalog/nexrad/level2/S3/{year}/{month}/{day}/{radar}/catalog.xml'

datasets = []
try:
    # Send a request to the server
    response = requests.get(radar_catalog_url)
    response.raise_for_status()  # Check that request was successful

    # Parse the returned XML content
    tree = ElementTree.fromstring(response.content)

    # Define the namespace
    namespace = {'thredds': 'http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0'}

    # Find all dataset elements; these contain information about individual dataset files
    datasets_in_tree = tree.findall('.//thredds:dataset', namespace)

    # Collect information for each dataset
    for dataset in datasets_in_tree:
        if isinstance(dataset, ElementTree.Element):
            name = dataset.attrib.get('name')  # Using 'get' prevents a KeyError if 'name' doesn't exist
            # Ensure the name is a string and is not a path (does not contain '/')
            if isinstance(name, str) and '/' not in name:
                datasets.append(name)
except requests.exceptions.HTTPError as e:
    print(f"HTTP error occurred: {e}")
except Exception as e:
    print(f"An error occurred: {e}")

print(f'{len(datasets)} datasets found!')
```
Note that dataset filenames follow the format [RADAR][YEAR][MONTH][DAY]_[HOUR][MINUTE][SECOND], with a .gz extension appended if the data is old enough. This can help narrow the list down to specific times of day.
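As a sketch, that naming convention can be used to filter the dataset list to a given hour. This assumes a four-letter radar ID and exactly the format above (with an optional .gz suffix); adjust the pattern if your filenames differ:

```python
import re

# Hypothetical helper: select datasets recorded within a given hour of the day,
# based on the [RADAR][YYYYMMDD]_[HHMMSS] filename convention described above.
def datasets_in_hour(datasets, hour):
    pattern = re.compile(r'^[A-Z]{4}\d{8}_(\d{2})\d{4}(\.gz)?$')
    selected = []
    for name in datasets:
        match = pattern.match(name)
        if match and int(match.group(1)) == hour:
            selected.append(name)
    return selected

example = ['KABX20080202_000314.gz', 'KABX20080202_010159.gz', 'KABX20080202_013022.gz']
print(datasets_in_hour(example, 1))  # ['KABX20080202_010159.gz', 'KABX20080202_013022.gz']
```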
Again, a convenience function, query_dataset_list, can be found in /src/backend/nexrad_query.py.
## Loading dataset information
Once you select the desired dataset, you'll need to build the OPeNDAP link. The format is already in the code below, but you can also find it by looking at the OPeNDAP link when you visit a dataset on the archive.

If you click the OPeNDAP link on a dataset's page, you'll see a lot of valuable information about the station and the data available. We need that in order to interpret the data properly. Plus, it gives us location information about the radar, which is a bonus.

To parse that page, we'll use BeautifulSoup, which is great for interpreting HTML.
```python
import requests
from bs4 import BeautifulSoup
import datetime

def my_eval(s):
    try:
        # Try to convert the string to a number (int or float)
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            try:
                # Try to convert the string to a datetime date
                return datetime.datetime.strptime(s, '%Y-%m-%d').date()
            except ValueError:
                # If all conversions fail, return the original string
                return s

url = "https://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/2008/02/02/KABX/KABX20080202_000314.gz.html"
response = requests.get(url)

specs_dict = {}
# Check if the request was successful
if response.status_code == 200:
    soup = BeautifulSoup(response.content, 'html.parser')
    # Find all text areas in the webpage
    textareas = soup.find_all('textarea')
    for textarea in textareas:
        key = textarea['name'].replace('_attr', '')
        value = textarea.get_text()
        specs_dict[key] = value.strip()  # Using strip() to remove any leading/trailing whitespace
    for key in specs_dict.keys():
        lines = specs_dict[key].split('\n')
        specs_dict[key] = {}
        for line in lines:
            if ':' not in line:
                continue  # Skip lines that are not an "attribute: value" pair
            attr, val = line.split(':', 1)  # Split on the first colon only
            specs_dict[key][attr] = my_eval(val.strip())
else:
    print(f"Failed to retrieve the webpage. Status code: {response.status_code}")

specs_dict['global']
```
The function my_eval just converts each value to its proper type (it could be made more robust).
## Loading the dataset
Very similar to the previous step, but now we'll send requests to that OPeNDAP endpoint via pydap. The code below takes care of that.
```python
from pydap.client import open_url
import numpy as np

dataset = datasets[0]
# Format:
# url = 'https://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/[YYYY]/[MM]/[DD]/[RADAR]/[DATASET_FILE]'
dataset_url = f'https://thredds-aws.unidata.ucar.edu/thredds/dodsC/nexrad/level2/S3/{year}/{month}/{day}/{radar}/{dataset}'

try:
    data = open_url(dataset_url)
except Exception as e:
    print(f"An error occurred when getting the dataset for {radar} radar {date.strftime('%Y-%m-%d')}: {e}")

data_dict = {}
for key in data.keys():
    data_dict[key] = np.array(data[key][:])

print([key for key in data_dict.keys()])
```
This should print all the entries available in the data.
## Interpreting radar data
### Table of Radar Data Keys

| Key | Description |
|---|---|
| Reflectivity | Measures the power returned to the radar from targets. |
| RadialVelocity | Velocity of targets relative to the radar, towards or away from it. |
| SpectrumWidth | Width of the Doppler velocity distribution, indicating the variability of velocities within the radar volume. |
| DifferentialReflectivity | Difference in reflectivity between horizontal and vertical polarizations. |
| CorrelationCoefficient | Measure of the similarity between horizontal and vertical polarizations. |
| DifferentialPhase | Difference in phase between horizontal and vertical polarizations. |
| time | Timestamp indicating when the radar sweep/data was collected. |
| elevation | Angle above the horizon at which the radar is pointing. |
| azimuth | Horizontal angle at which the radar is pointing, usually measured clockwise from north. |
| distance | Distance from the radar to the target or particular point of interest. |
| numRadials | Number of radials (lines extending out from the center of the radar) in the data. |
| numGates | Number of gates (successive intervals along a radial) in the data. |
| _HI (suffix) | Denotes high-resolution data or data from a particular radar mode with higher resolution. |
For each type of measurement, such as Reflectivity, there's also a high-resolution variant indicated by the _HI suffix. Keys with this suffix pertain to the same type of measurement but at finer resolution.

The letter and suffix after time, elevation, azimuth, distance, numRadials, and numGates correspond to the measurement they relate to; for example, elevationR_HI is the elevation at which the measurement for Reflectivity_HI was taken.
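A small illustrative helper makes the convention concrete: given a measurement's tag letter, it lists the coordinate keys that should accompany it. Which tag each measurement actually uses (e.g. 'R' for Reflectivity, 'V' for RadialVelocity) should be confirmed against data_dict.keys() for your file:

```python
# Hypothetical helper: build the coordinate key names for a given measurement
# tag letter, following the naming convention described above.
def coordinate_keys(tag, hi=False):
    suffix = '_HI' if hi else ''
    return [f'{name}{tag}{suffix}'
            for name in ('time', 'elevation', 'azimuth', 'distance',
                         'numRadials', 'numGates')]

print(coordinate_keys('R', hi=True))
# ['timeR_HI', 'elevationR_HI', 'azimuthR_HI', 'distanceR_HI',
#  'numRadialsR_HI', 'numGatesR_HI']
```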
### Units and conversions
The units, offsets, scale factors, and missing_value for each measure can be found in the specs_dict we built in one of the steps above.
The offset and scale_factor come in when converting the data to the proper values, that is:
$$\text{actualValue} = \text{rawValue} \times \text{scaleFactor} + \text{offset}$$
And the missing_value tells you which values should not be used for analysis.
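A minimal sketch of that conversion, assuming the scale factor, offset, and missing_value have already been read out of specs_dict for the variable in question (the sample numbers below are made up for illustration):

```python
import numpy as np

def to_physical(raw, scale_factor, offset, missing_value):
    raw = np.asarray(raw, dtype=float)
    # actualValue = rawValue * scaleFactor + offset
    values = raw * scale_factor + offset
    # Mask samples flagged as missing so they are excluded from analysis
    values[raw == missing_value] = np.nan
    return values

raw = np.array([0, 66, 129, 255])  # made-up raw samples; 0 flags "missing" here
print(to_physical(raw, scale_factor=0.5, offset=-33.0, missing_value=0))
```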
That's it for now. Oof, that's a lot; more will be added later.