02. Introduction to SpRIT - RJbalikian/SPRIT-HVSR GitHub Wiki

The SpRIT HVSR data processing package is intended to enable speedy, accurate, informative, flexible, and user-friendly processing and analysis methods for HVSR. For speedy processing of HVSR data, it is recommended to use the sprit.run() function, which calls a series of functions to process the data.

NOTE: The arguments of any of the functions used to process the HVSR data (listed in the Processing Parameters section below) can be passed directly into the sprit.run() function. Use the help(sprit.run) command (without parentheses after run, so the function is inspected rather than called) to get a description and list of these parameters.
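
The idea of passing any step's arguments through a single wrapper can be illustrated with a small sketch. This is only an illustration of the general pattern, not SpRIT's implementation: the functions fetch_step, process_step, and run_all and all of their parameters are hypothetical stand-ins.

```python
import inspect


# Hypothetical stand-in processing steps (not SpRIT's real signatures).
def fetch_step(source="file", verbose=False):
    return {"source": source}


def process_step(method="geometric_mean", smooth=True):
    return {"method": method, "smooth": smooth}


def run_all(**kwargs):
    """Forward each keyword argument to whichever step accepts it."""
    results = {}
    for step in (fetch_step, process_step):
        # Names of the parameters this step's signature declares
        accepted = inspect.signature(step).parameters
        step_kwargs = {k: v for k, v in kwargs.items() if k in accepted}
        results[step.__name__] = step(**step_kwargs)
    return results


# 'source' reaches fetch_step and 'smooth' reaches process_step
outputs = run_all(source="raw", smooth=False)
print(outputs)
```

In this sketch, arguments that no step declares are silently ignored; a real wrapper might instead warn about them.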

Inputs

To use SpRIT, the following inputs are needed: Data, Metadata, and Processing Parameters.

Data

  • Ambient seismic data acquired by a 3-component seismometer
  • Instrument response information

Processing Parameters

After calculating HVSR curves, you may also choose to convert the frequency domain to depth using the following functions:

  • sprit.calculate_depth() (in development): Use frequency-depth models to calculate the depth to the interface
  • sprit.plot_depth_curve() (in development): After calculate_depth, plot the H/V curve with depth on the Y axis (instead of frequency on the X axis)
  • sprit.plot_cross_section() (in development): With multiple HVSRData objects, plot depths and interpolate H/V curve values between them
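
As a rough illustration of what a frequency-depth model does, the sketch below uses the quarter-wavelength relation h = Vs / (4 * f0), one commonly used relation between a resonance frequency and the depth to an interface. It is not necessarily the model used by sprit.calculate_depth(); the function name quarter_wavelength_depth and the shear-wave velocity value are assumptions made for this example.

```python
def quarter_wavelength_depth(peak_freq_hz, vs_m_per_s):
    """Depth (m) to an interface from the quarter-wavelength relation h = Vs / (4 * f0)."""
    if peak_freq_hz <= 0 or vs_m_per_s <= 0:
        raise ValueError("frequency and velocity must be positive")
    return vs_m_per_s / (4.0 * peak_freq_hz)


# Assumed values: a 2 Hz H/V peak and an average shear-wave velocity of 400 m/s
depth_m = quarter_wavelength_depth(peak_freq_hz=2.0, vs_m_per_s=400.0)
print(f"Estimated depth: {depth_m:.1f} m")
```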

What does it do?

SpRIT is intended to take ambient seismic data and generate HVSR curves for site analysis. It is intended for rapid acquisition of data at several sites (20-60 minutes per site), though the underlying code is derived from code developed by IRIS that worked with data acquired over several days. (In IRIS's case, the PPSDs were already calculated; with SpRIT, the size of the data object may become cumbersome for data of that length. However, with a sufficiently large PPSD length (i.e., longer individual time windows), it may be possible to use SpRIT on longer timescales as well.)

Currently, SpRIT only supports the calculation of H/V curves for a site. Future work is intended to focus on a) increasing the reliability and statistical significance of those results, b) enabling spatial analysis of the results, and c) deriving further products from the H/V curve, including shear-wave velocities.

How does it do it?

SpRIT first requires inputs to know where and how to read in data. It will then attempt to "fetch" or read seismic data, reading it as an obspy stream. Probabilistic Power Spectral Densities are then generated for each component (Z, E, and N). These are used to process H/V curves at each time-window used for the PPSD generation. Peaks are scored and checked from those H/V curves, and various reports, plots, and outputs can be generated as well.
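
The step from per-component spectra to an H/V curve can be sketched numerically. The example below is a minimal illustration, not SpRIT's code: it assumes the PSD values are linear power (values in dB would need converting first) and combines the horizontals as the root-mean-square of the E and N amplitudes, which is only one of several possible combination methods. The function name hv_ratio and the sample values are made up for this example.

```python
import math


def hv_ratio(psd_z, psd_e, psd_n):
    """H/V per frequency bin from linear-power PSD values of the Z, E, and N components."""
    ratios = []
    for z, e, n in zip(psd_z, psd_e, psd_n):
        horizontal = math.sqrt((e + n) / 2.0)  # RMS of the two horizontal amplitudes
        vertical = math.sqrt(z)                # amplitude of the vertical component
        ratios.append(horizontal / vertical)
    return ratios


# Made-up PSD values for three frequency bins in a single time window
psd_z = [1.0, 1.0, 4.0]
psd_e = [4.0, 9.0, 4.0]
psd_n = [4.0, 9.0, 4.0]
hv_curve = hv_ratio(psd_z, psd_e, psd_n)
print(hv_curve)
```

Repeating this for every time window would give one H/V curve per window, which can then be combined and checked for peaks.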

A basic overview of this is shown in the diagram below:

    graph TB
        subgraph IP ["Input Parameters"]
            direction TB
            A0["sprit.input_params()"]
            A3
        end
        A1[(Data)] --> IP
        A2[(Metadata)] --> IP

        A3((Metadata\nParameters\nSettings)) --> A0

        IP --> FD
        subgraph FD [Fetch Data]
            direction LR
            B0["sprit.fetch_data()"]
        end
        FD -.Optional.->RSN
        FD -.Optional.-> GAD
        FD --> GP

        subgraph GAD [Generate Azimuth Data]
            direction LR
            CA0["calculate_azimuth()"]
        end
        GAD -.Optional.-> RSN
        GAD --> GP
        subgraph RSN [Remove Signal Noise]
            direction LR
            RN0["remove_noise()"]
        end
        RN1["Manual Window Selection (if desired)"]
        RN0 -.-> RN1
        RN1 -.-> RN0

        RSN --> GP

        C1((obspy.PPSD \nparameters)) -.Optional.-> C0
        subgraph GP [Generate PSDs]
            direction LR
            C0["sprit.generate_psds()"]
            C1
        end

        GP --> PHVC
        subgraph PHVC [Process HVSR Curves]
            direction LR
            D0["sprit.process_hvsr()"]
        end
        PHVC --> CP

        D0 -.Optional.-> ROC
        subgraph ROC [Remove Outlier Curves]
            direction LR
            ROC0["sprit.remove_outlier_curves()"]
        end
        ROC --> CP
        subgraph CP [Check and Score Peaks]
            direction LR
            CP0["sprit.check_peaks()"]
        end
        CP -.Optional.-> F0[/Export Data and Reports\]
        

Data objects

The SpRIT HVSR package uses two main classes: HVSRData and HVSRBatch. Summaries of these are included at the links provided as well as in the tables below:

HVSRData Objects

The HVSRData class is the basic data class of the sprit package. It contains all the processed data, input parameters, and reports. Some of the methods that work on the HVSRData object (e.g., .plot() and .get_report()) are essentially wrappers for some of the main sprit package functions (sprit.plot_hvsr() and sprit.get_report(), respectively).

These attributes and objects can be accessed using square brackets or the dot accessor. For example, to access the site name, HVSRData['site'] and HVSRData.site will both return the site name.
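
The two access styles can be illustrated with a small stand-in class. This is a sketch of the general pattern only, not the actual HVSRData implementation; the class name DualAccessRecord is hypothetical.

```python
class DualAccessRecord:
    """Stand-in container (not SpRIT's HVSRData) supporting dot and bracket access."""

    def __init__(self, **attributes):
        for name, value in attributes.items():
            setattr(self, name, value)

    def __getitem__(self, key):
        # Bracket access simply delegates to attribute lookup
        return getattr(self, key)

    def __setitem__(self, key, value):
        setattr(self, key, value)

    def keys(self):
        return vars(self).keys()

    def items(self):
        return vars(self).items()


record = DualAccessRecord(site="Site1", depth=0)
print(record["site"], record.site)  # both styles return the same value
print(list(record.keys()))
```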

The following table shows the attributes belonging to HVSRData:

| Name | Description |
|------|-------------|
| acq_date | The acquisition date of the data, usually derived from the data itself but may be from sprit.input_params() |
| batch | Whether the data is part of a batch read. This is primarily used during processing |
| BestPeak | The information on the most highly scored peak (what a basic peak-scoring algorithm determines is the "best" peak on your H/V curve) |
| cha | The channels used by the instrument. See here for naming conventions |
| copy() | METHOD Makes a copy of the data using the python copy module. See here and here for more information. |
| current_times_used | The original start times of the PPSD windows generated using the obspy PPSD class. This is not updated as windows are excluded during processing; it only represents the original windows used. |
| datapath | The original input filepath of the data, usually from sprit.run() or sprit.input_params(). |
| datastream | A copy of the obspy Stream of your data |
| depth | The depth at which the seismometer was installed, usually 0 |
| elevation | The surface elevation of the seismometer at the site of interest |
| endtime | The end time of the data in the timezone indicated by tzone. Only needed if trimming the data or using source='raw'; otherwise, this is read from the data. |
| export() | METHOD Exports the data to a pickle object with .hvsr extension; see here for more information |
| get_report() | METHOD Generates a report based on parameters for sprit.get_report(); see here for more information. |
| hvsr_band | A tuple or list of the frequencies within which to carry out the HVSR processing |
| hvsr_curve | A numpy array with the values of the final HVSR curve |
| hvsr_df | The primary dataframe (pandas dataframe) containing data and information (including whether to use them in the final analysis) for each of the H/V curves generated for each window of analysis. Each row of hvsr_df represents a time window as divided up using the obspy PPSD parameters |
| hvsr_log_std | The log standard deviation of the H/V curves for all time windows |
| hvsr_peak_freqs | The frequencies of the peaks detected by sprit.check_peaks() |
| hvsr_peak_indices | The indices (into the x_freqs attribute) of the peaks detected by sprit.check_peaks() |
| hvsrm | The values 1 standard deviation below the main HVSR curve |
| hvsrm2 | The values 1 log standard deviation below the main HVSR curve |
| hvsrp | The values 1 standard deviation above the main HVSR curve |
| hvsrp2 | The values 1 log standard deviation above the main HVSR curve |
| ind_hvsr_curves | A 2D numpy array containing the values of each of the HVSR curves for all time windows |
| ind_hvsr_peak_indices | A list of 1D numpy arrays containing indices of the peaks of the HVSR curves of all time windows |
| ind_hvsr_stdDev | The standard deviation values of the HVSR curves of all time windows |
| input_crs | The input CRS of the location data, in a format readable by pyproj's CRS.from_user_input() |
| input_params | An HVSRData object with just the input parameters from sprit.input_params() |
| input_stream | A copy of the raw obspy Stream initially read in to sprit |
| instrument | The type of instrument used to collect the data |
| inv | The inventory object generated by obspy.read_inventory() from the metapath parameter of sprit.input_params() |
| items() | METHOD Returns the "items" of the HVSRData object. For HVSRData objects, this is a dict_items object with the keys and values in tuples. Functions similar to dict.items(). See here for more information. |
| keys() | METHOD Returns the "keys" of the HVSRData object. For HVSRData objects, these are the attributes and parameters of the object. Functions similar to dict.keys(). See here for more information. |
| latitude | The y coordinate of the data, generated from the ycoord parameter of sprit.input_params() |
| loc | The location of the seismometer, often used when multiple sensors are at the same location. Usually 00 or a null value. See here for more information on location codes. |
| longitude | The x coordinate of the data, generated from the xcoord parameter of sprit.input_params() |
| metapath | The filepath of the metadata containing the relevant instrument response information needed to generate the PPSDs |
| method | Which method was used to combine the horizontal components; see the sprit.process_hvsr() parameters for more information. |
| net | The network this sensor is a part of, as entered into sprit.input_params(). May not be necessary except to match instrument response data to the sensor. |
| output_crs | The CRS you would like the location information projected to for outputs, from sprit.input_params() |
| params | A dictionary containing the parameters, both input and calculated, used to process the data |
| paz | A dictionary containing the "Poles and Zeros"/instrument response information used by obspy's PPSD class to generate the PPSD output for each component. |
| peak_freq_range | The frequency range within which to search for peaks. This does not affect the data processing except to constrain the range of the final "BestPeak" |
| PeakReport | A list of dictionaries containing the reports for all the peaks found by sprit.check_peaks() on your H/V curve |
| plot() | METHOD Uses sprit.plot_hvsr() to plot your HVSR curve output. See here for more information. |
| ppsd_std | Dictionary of numpy arrays containing the standard deviation of the PPSDs |
| ppsd_std_vals_m | Dictionary of numpy arrays containing the value of the curve one standard deviation below the main PPSD curve for each component |
| ppsd_std_vals_p | Dictionary of numpy arrays containing the value of the curve one standard deviation above the main PPSD curve for each component |
| ppsds | Dictionary containing an editable version of the obspy PPSD class for each component |
| ppsds_obspy | Dictionary containing an obspy PPSD class for each component. See here for more information. |
| Print_Report | A copy of the report generated using get_report(report_format='print'). This may include escape characters rather than being formatted correctly. |
| ProcessingStatus | The status (True/False) of each of the main processing steps |
| psd_raw | A dictionary containing 2D numpy arrays, with the PSD values for each time window and component (this is read into the 'hvsr_df' attribute for better organization and utility, with each time window being a dataframe row) |
| psd_values_tavg | A dictionary of 1D arrays containing the time-averaged value of the PSD of each component |
| report() | METHOD Performs the same function as the get_report() function and method. See here for more details. |
| site | The site name of the data |
| sta | The station code (i.e., the name of the instrument) on which the data was recorded, if applicable. See here for more details. |
| starttime | The start time of the data in the timezone indicated by tzone. Only needed if trimming the data or using source='raw'; otherwise, this is read from the data. |
| timezone | The timezone being used for the input data. See here for more information. |
| tsteps_used | DEPRECATED Used to calculate how many time windows are used in the final analysis. Now recorded in the "Use" column of the hvsr_df attribute/dataframe. |
| x_freqs | A dictionary containing a 1D numpy array of the frequencies used for the generation of the HVSR curve, one entry per component. These should be the same, but since they are generated separately, they are currently stored as separate dictionary items |
| x_period | A dictionary containing a 1D numpy array of the periods used for the generation of the HVSR curve, one entry per component. These should be the same, but since they are generated separately, they are currently stored as separate dictionary items |
| xwindows_out | A list that helps keep track of which windows are excluded during noise removal |
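
The relationship between hvsr_curve, x_freqs, hvsr_peak_indices, and hvsr_peak_freqs can be sketched as follows. This is a simplified illustration using a plain local-maximum search, not the peak detection or scoring that sprit.check_peaks() performs; the function name local_maxima_indices, the "Z" dictionary key, and the sample values are assumptions made for this example.

```python
def local_maxima_indices(curve):
    """Indices of points strictly higher than both of their neighbours."""
    return [
        i for i in range(1, len(curve) - 1)
        if curve[i - 1] < curve[i] > curve[i + 1]
    ]


# Made-up values; x_freqs is keyed by component (the key name "Z" is assumed here)
x_freqs = {"Z": [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]}
hvsr_curve = [1.0, 1.4, 3.2, 1.1, 1.8, 0.9]

peak_indices = local_maxima_indices(hvsr_curve)       # positions along the curve
peak_freqs = [x_freqs["Z"][i] for i in peak_indices]  # the same peaks as frequencies
print(peak_indices, peak_freqs)
```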

HVSRBatch Objects

HVSRBatch is the data container used for batch processing. It contains several HVSRData objects (one for each site). These can be accessed by site name, using either square brackets (HVSRBatchVariable["SiteName"]) or the dot accessor (HVSRBatchVariable.SiteName). The dot accessor may not work if there is a space in the site name.
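
The limitation of the dot accessor for site names containing spaces follows from Python syntax, as the stand-in below illustrates. This is a sketch only, not the real HVSRBatch class; the class name SiteBatch and the site data are hypothetical.

```python
class SiteBatch:
    """Stand-in batch container (not SpRIT's HVSRBatch) keyed by site name."""

    def __init__(self, batch_dict):
        self.batch_dict = dict(batch_dict)
        self.sites = list(self.batch_dict)
        # Expose every site as an attribute as well
        for site_name, site_data in self.batch_dict.items():
            setattr(self, site_name, site_data)

    def __getitem__(self, site_name):
        return self.batch_dict[site_name]


batch = SiteBatch({"SiteA": {"peak_hz": 2.0}, "Site B": {"peak_hz": 5.5}})

print(batch.SiteA)               # dot access works for a name without spaces
print(batch["Site B"])           # square brackets work for any site name
print(getattr(batch, "Site B"))  # 'batch.Site B' would be a SyntaxError
```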

All of the functions in the sprit package are designed to perform the bulk of their operations iteratively on the individual HVSRData objects contained in the HVSRBatch object, and do little with the HVSRBatch object itself besides using it to determine which sites are contained within it. Therefore, the HVSRBatch object has fewer attributes than the HVSRData object (though each HVSRData object within the HVSRBatch object contains all the attributes/methods listed above and in the API).

| Name | Description |
|------|-------------|
| SITENAME1 | Each site name (the same as the "site" attribute of the HVSRData object listed above) is its own attribute |
| SITENAME2 | Each site name (the same as the "site" attribute of the HVSRData object listed above) is its own attribute |
| SITENAME... | Each site name (the same as the "site" attribute of the HVSRData object listed above) is its own attribute |
| SITENAME_N | Each site name (the same as the "site" attribute of the HVSRData object listed above) is its own attribute |
| batch | Whether this object was generated during batch processing (always True for HVSRBatch objects) |
| batch_dict | Dictionary containing the names of the sites as keys and the HVSRData object for that site as the value |
| copy() | METHOD Makes a copy of the HVSRBatch object using the python copy module. See here and here for more information. |
| export() | METHOD Exports the data to a pickle object with .hvsr extension; see here for more information |
| get_report() | METHOD Generates a report based on parameters for sprit.get_report(); see here for more information. |
| items() | METHOD Returns the "items" of the HVSRBatch object. For HVSRBatch objects, this is a dict_items object with the keys and values in tuples. Functions similar to dict.items(). See here for more information. |
| keys() | METHOD Returns the "keys" of the HVSRBatch object. For HVSRBatch objects, these are the site names in a dict_keys format. Functions similar to dict.keys(). See here for more information. Same output as the sites attribute, except that the sites output is a list |
| plot() | METHOD Uses sprit.plot_hvsr() to plot your HVSR curve output for each site in the HVSRBatch object. See here for more information. |
| report() | METHOD Performs the same function as the get_report() function and method. See here for more details. |
| sites | A list of the site names of each of the HVSRData objects contained in the HVSRBatch object |