02. Introduction to SpRIT - RJbalikian/SPRIT-HVSR GitHub Wiki
The SpRIT HVSR data processing package is intended to enable speedy, accurate, informative, flexible, and user-friendly processing and analysis methods for HVSR. For speedy processing of HVSR data, it is recommended to use the sprit.run() function, which calls a series of functions to process the data.
NOTE: The arguments of any of the functions used to process the HVSR data (listed in the Processing Parameters section below) can be passed directly into the sprit.run() function. Use the
help(sprit.run)
command (without parentheses after run, so that the function is not called) to get a description and list of these parameters.
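One way a wrapper function can accept the arguments of the functions it calls is to inspect each function's signature and forward only the keyword arguments that function accepts. The sketch below illustrates that idea with hypothetical stand-in functions (`fetch_step`, `process_step`, `run_all`) and made-up parameter names; it is not SpRIT's actual implementation of sprit.run().

```python
import inspect

# Hypothetical stand-ins for processing steps; the names and parameters are
# illustrative only and do not mirror SpRIT's real signatures.
def fetch_step(source="file", trim_dir=None):
    return {"source": source, "trim_dir": trim_dir}

def process_step(method="geometric_mean", smooth=True):
    return {"method": method, "smooth": smooth}

def run_all(**kwargs):
    """Forward each keyword argument to whichever step accepts it."""
    results = {}
    for step in (fetch_step, process_step):
        accepted = inspect.signature(step).parameters
        step_kwargs = {k: v for k, v in kwargs.items() if k in accepted}
        results[step.__name__] = step(**step_kwargs)
    return results

out = run_all(source="raw", smooth=False)
print(out["fetch_step"])
print(out["process_step"])
```

Note that `help(run_all)` receives the function object itself, whereas `help(run_all())` would first execute the function and then describe its return value.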
Inputs
To use SpRIT, the following inputs are needed: Data, Metadata, and Processing Parameters.
Data
- Ambient seismic data acquired by a 3-component seismometer
  - Data must be in a format readable using the obspy.read() function (supported formats).
  - See here for more details and recommendations on acquiring this data.
- Instrument response information
  - This will need to be readable using obspy.read_inventory()
  - IRIS maintains a "library" of select instrument response files here.
Processing Parameters
- Input parameters (or you may use the defaults, if applicable) to the fundamental functions doing the work. The sprit.run() function uses the collective parameters from all of these (see here for the API documentation for all SpRIT functions). These functions include:
After calculating HVSR curves, you may also choose to convert the frequency domain to depth using the following functions:
- sprit.calculate_depth() (in development): Use frequency-depth models to calculate the depth to the interface
- sprit.plot_depth_curve() (in development): After calculate_depth, plot the H/V curve with depth on the Y axis (instead of frequency on the X axis)
- sprit.plot_cross_section() (in development): With multiple HVSRData objects, plot depths and interpolate H/V curve values between them
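
As an illustration of what a frequency-depth conversion involves, the sketch below uses the generic quarter-wavelength relation h = Vs / (4 * f0), where f0 is the peak frequency and Vs is an average shear-wave velocity of the layer above the interface. This relation is an assumption chosen for the example; the models used by sprit.calculate_depth() may differ. The function name and the input values are hypothetical.

```python
def quarter_wavelength_depth(peak_freq_hz, vs_m_per_s):
    """Depth (m) to an impedance contrast from h = Vs / (4 * f0)."""
    if peak_freq_hz <= 0 or vs_m_per_s <= 0:
        raise ValueError("frequency and velocity must be positive")
    return vs_m_per_s / (4.0 * peak_freq_hz)

# Illustrative values only: a 2.5 Hz peak over sediments with Vs = 300 m/s.
depth_m = quarter_wavelength_depth(2.5, 300.0)
print(f"{depth_m:.1f} m")
```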
What does it do?
SpRIT is intended to take ambient seismic data and generate HVSR curves for site analysis. It is designed for rapid acquisition of data at several sites (20-60 minutes per site), though the underlying code is derived from code developed by IRIS that worked with data acquired over several days. (In the case of IRIS's data, the PPSDs were already calculated, so the size of the data object may become cumbersome if attempting to use SpRIT for recordings of that length. However, with a sufficiently large PPSD length (i.e., individual time windows), it may be possible to use SpRIT on longer timescales as well.)
Currently, SpRIT only supports the calculation of H/V curves for a site. Future work is intended to focus on a) increasing the reliability and statistical significance of those results, b) enabling spatial analysis of the results, and c) deriving further products from the H/V curve, including shear-wave velocities.
How does it do it?
SpRIT first requires inputs to know where and how to read in data. It will then attempt to "fetch" or read seismic data, reading it as an obspy stream. Probabilistic Power Spectral Densities are then generated for each component (Z, E, and N). These are used to process H/V curves at each time-window used for the PPSD generation. Peaks are scored and checked from those H/V curves, and various reports, plots, and outputs can be generated as well.
A basic overview of this is shown in the diagram below:
graph TB
subgraph IP ["Input Parameters"]
direction TB
A0["sprit.input_params()"]
A3
end
A1[(Data)] --> IP
A2[(Metadata)] --> IP
A3((Metadata\nParameters\nSettings)) --> A0
IP --> FD
subgraph FD [Fetch Data]
direction LR
B0["sprit.fetch_data()"]
end
FD -.Optional.->RSN
FD -.Optional.-> GAD
FD --> GP
subgraph GAD [Generate Azimuth Data]
direction LR
CA0["calculate_azimuth()"]
end
GAD -.Optional.-> RSN
GAD --> GP
subgraph RSN [Remove Signal Noise]
direction LR
RN0["remove_noise()"]
end
RN1["Manual Window Selection (if desired)"]
RN0 -.-> RN1
RN1 -.-> RN0
RSN --> GP
C1((obspy.PPSD \nparameters)) -.Optional.-> C0
subgraph GP [Generate PSDs]
direction LR
C0["sprit.generate_psds()"]
C1
end
GP --> PHVC
subgraph PHVC [Process HVSR Curves]
direction LR
D0["sprit.process_hvsr()"]
end
PHVC --> CP
D0 -.Optional.-> ROC
subgraph ROC [Remove Outlier Curves]
direction LR
ROC0["sprit.remove_outlier_curves()"]
end
ROC --> CP
subgraph CP [Check and Score Peaks]
direction LR
CP0["sprit.check_peaks()"]
end
CP -.Optional.-> F0[/Export Data and Reports\]
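
The core of the "Process HVSR Curves" step can be sketched in plain Python: for each time window, the two horizontal components are combined and divided by the vertical component, and the per-window curves are then averaged. The sketch assumes linear power values (values stored in dB would need converting first) and combines the horizontals with a geometric mean, which is only one possible choice; see the method parameter of sprit.process_hvsr() for what SpRIT actually offers. The function names and numbers are made up for illustration.

```python
import math

def hv_curve(psd_z, psd_e, psd_n):
    """H/V ratio per frequency bin from linear power values of one window.

    Assumption: horizontals are combined with the geometric mean of their
    amplitudes (amplitude = sqrt(power)).
    """
    curve = []
    for z, e, n in zip(psd_z, psd_e, psd_n):
        horizontal = math.sqrt(math.sqrt(e) * math.sqrt(n))
        curve.append(horizontal / math.sqrt(z))
    return curve

def mean_curve(curves):
    """Average the per-window curves bin by bin."""
    return [sum(vals) / len(vals) for vals in zip(*curves)]

# Two hypothetical time windows, three frequency bins each (made-up numbers).
windows = [
    {"Z": [1.0, 1.0, 4.0], "E": [4.0, 16.0, 4.0], "N": [4.0, 16.0, 4.0]},
    {"Z": [1.0, 4.0, 4.0], "E": [4.0, 16.0, 4.0], "N": [4.0, 16.0, 4.0]},
]
per_window = [hv_curve(w["Z"], w["E"], w["N"]) for w in windows]
final_curve = mean_curve(per_window)
print(final_curve)
```

Keeping the per-window curves separate until the final averaging step is what makes it possible to exclude individual windows (for example, noisy ones or outlier curves) before the final curve is computed.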
Data objects
The SpRIT HVSR package uses two main classes: HVSRData and HVSRBatch. Summaries of these are included at the links provided as well as in the tables below:
HVSRData Objects
The HVSRData class is the basic data class of the sprit package. It contains all the processed data, input parameters, and reports. Some of the methods that work on the HVSRData object (e.g., .plot() and .get_report()) are essentially wrappers for some of the main sprit package functions (sprit.plot_hvsr() and sprit.get_report(), respectively).
These attributes and objects can be accessed using square brackets or the dot accessor. For example, to access the site name, HVSRData['site'] and HVSRData.site will both return the site name.
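
The dual access pattern described above can be sketched with a small class that routes square-bracket access to ordinary attributes. `DualAccess` is a hypothetical stand-in written for this example and is not SpRIT's HVSRData implementation.

```python
class DualAccess:
    """Hypothetical container, not SpRIT's HVSRData implementation."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def __getitem__(self, key):
        # record["site"] is routed to the attribute record.site
        return getattr(self, key)

    def __setitem__(self, key, value):
        setattr(self, key, value)

    def keys(self):
        return vars(self).keys()

    def items(self):
        return vars(self).items()

record = DualAccess(site="HVSR Site 1", hvsr_band=(0.4, 40))
assert record["site"] == record.site
record["elevation"] = 755
print(record.elevation, list(record.keys()))
```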
The following table shows the attributes belonging to HVSRData:
Name | Description |
---|---|
acq_date | The acquisition date of the data, usually derived from the data itself but may be from sprit.input_params() |
batch | Whether the data is part of a batch read. This is primarily used during the processing |
BestPeak | The information on the most highly scored peak (what a basic peak-scoring algorithm determines is the "best" peak on your H/V Curve) |
cha | The channels used by the instrument. See here for naming conventions |
copy() | METHOD Makes a copy of the data using the python copy module. See here and here for more information. |
current_times_used | The original start times of the PPSD windows generated using the obspy PPSD class. This is not updated as windows are excluded throughout the processing; the values just represent the original windows used. |
datapath | The original input filepath of the data, usually from sprit.run() or sprit.input_params() . |
datastream | A copy of the obspy Stream of your data |
depth | The depth at which the seismometer was installed, usually 0 |
elevation | The surface elevation of the seismometer at the site of interest |
endtime | The endtime of the data in the timezone indicated by tzone. Only needed if trimming the data or using source='raw' , otherwise, this is read from the data. |
export() | METHOD Exports the data to a pickle object with .hvsr extension, see here for more information |
get_report() | METHOD Generates a report based on parameters for sprit.get_report() , see here for more information. |
hvsr_band | A tuple or list of the frequencies within which to carry out the HVSR processing |
hvsr_curve | A numpy array with the values of the final HVSR curve |
hvsr_df | The primary dataframe (pandas dataframe) containing data and information (including whether to use them in the final analysis) for each of the H/V curves generated for each window of analysis. Each row of hvsr_df represents a time-window as divided up using the obspy PPSD parameters |
hvsr_log_std | The log standard deviation of the H/V curves for all time windows |
hvsr_peak_freqs | The frequencies of the peaks detected by sprit.check_peaks() |
hvsr_peak_indices | The index of the x_freqs attribute of the peaks detected by sprit.check_peaks() |
hvsrm | The values 1 standard deviation below the main HVSR curve |
hvsrm2 | The values 1 log standard deviation below the main HVSR curve |
hvsrp | The values 1 standard deviation above the main HVSR curve |
hvsrp2 | The values 1 log standard deviation above the main HVSR curve |
ind_hvsr_curves | A 2D numpy array containing the values of each of the HVSR curves for all time windows |
ind_hvsr_peak_indices | A list of 1D numpy arrays containing indices of the peaks of the HVSR curves of all time windows |
ind_hvsr_stdDev | The standard deviation values of the HVSR curves of all time windows |
input_crs | The input CRS of the location data, in a format readable by pyproj's CRS.from_user_input() |
input_params | An HVSRData object with just the input parameters from sprit.input_params() |
input_stream | A copy of the raw obspy Stream initially read in to sprit |
instrument | The type of instrument used to collect the data |
inv | The inventory object generated from the obspy.read_inventory() on the metapath parameter of sprit.input_params() |
items() | METHOD Returns the "items" of the HVSRData object. For HVSRData objects, this is a dict_items object with the keys and values in tuples. Functions similarly to dict.items(). See here for more information. |
keys() | METHOD Returns the "keys" of the HVSRData object. For HVSRData objects, these are the attributes and parameters of the object. Functions similar to dict.keys(). See here for more information. |
latitude | The y coordinate of the data, generated from the ycoord parameter of sprit.input_params() |
loc | The location of the seismometer, often used when multiple sensors are at the same location. Usually 00 or a null value. See here for more information on location codes. |
longitude | The x coordinate of the data, generated from the xcoord parameter of sprit.input_params() |
metapath | The filepath of the metadata containing the relevant instrument response information needed to generate the PPSDs |
method | Which method was used to combine the horizontal components, see the sprit.process_hvsr() parameters for more information. |
net | The network this sensor is a part of, as entered into sprit.input_params() . May not be necessary except to match instrument response data to the sensor. |
output_crs | The CRS you would like the location information projected to for outputs, from sprit.input_params() |
params | A dictionary containing the parameters (both input and calculated) used to process the data |
paz | A dictionary containing the "Poles and Zeros"/instrument response information used by obspy's PPSD class to generate the PPSD output for each component. |
peak_freq_range | The frequency range within which to search for peaks. This does not affect the data processing except to constrain the range of the final "BestPeak" |
PeakReport | A list of dictionaries containing the reports for all the peaks found by sprit.check_peaks() on your H/V curve |
plot() | METHOD Uses sprit.plot_hvsr() to plot your HVSR curve output. See here for more information. |
ppsd_std | Dictionary of numpy arrays containing the standard dev of the PPSDs |
ppsd_std_vals_m | Dictionary of numpy arrays containing the value of the curve one standard dev below the main PPSD curve for each component |
ppsd_std_vals_p | Dictionary of numpy arrays containing the value of the curve one standard dev above the main PPSD curve for each component |
ppsds | Dictionary containing an editable version of the obspy PPSD class for each component |
ppsds_obspy | Dictionary containing an obspy PPSD class for each component. See here for more information. |
Print_Report | A copy of the report generated using get_report(report_format='print') . This may include escape characters rather than being formatted correctly. |
ProcessingStatus | The status (True/False) of each of the main processing steps |
psd_raw | A dictionary containing 2D numpy arrays, with the psd values for each time window and component (this is read into the 'hvsr_df' attribute for better organization and utility, with each time window being a dataframe row) |
psd_values_tavg | A dictionary of 1D arrays containing the time-averaged value of the PSD of each component |
report() | METHOD Does the same function as the get_report() function and method. See here for more details. |
site | The site name of the data |
sta | The station code (i.e., the name of the instrument) on which the data was recorded, if applicable. See here for more details. |
starttime | The start time of the data in the timezone indicated by tzone. Only needed if trimming the data or using source='raw' , otherwise, this is read from the data. |
timezone | The timezone being used for the input data. See here for more information. |
tsteps_used | DEPRECATED Used to calculate how many time windows are used in the final analysis. Now recorded in the "Use" column of the hvsr_df attribute/dataframe. |
x_freqs | A dictionary containing a 1D numpy array of the frequencies used for the generation of the HVSR curve, one entry per component. These should be the same, but since they are generated separately, they are currently stored as separate dictionary items |
x_period | A dictionary containing a 1D numpy array of the periods used for the generation of the HVSR curve, one entry per component. These should be the same, but since they are generated separately, they are currently stored as separate dictionary items |
xwindows_out | A list that helps keep track of which windows are excluded during noise removal |
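
To show how attributes such as hvsr_curve, x_freqs, hvsr_peak_indices, hvsr_peak_freqs, and peak_freq_range relate to one another, the sketch below finds local maxima on a curve, maps their indices to frequencies, and picks the highest peak inside a frequency range. It is a simplified stand-in: the peak scoring done by sprit.check_peaks() is more involved, x_freqs is a plain list here rather than a per-component dictionary, and all values and function names are made up.

```python
def find_peaks(curve):
    """Indices of simple local maxima (strictly higher than both neighbors)."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i - 1] < curve[i] > curve[i + 1]]

def best_peak(freqs, curve, peak_freq_range):
    """Highest local maximum whose frequency lies inside peak_freq_range."""
    low, high = peak_freq_range
    candidates = [i for i in find_peaks(curve) if low <= freqs[i] <= high]
    if not candidates:
        return None
    idx = max(candidates, key=lambda i: curve[i])
    return {"index": idx, "f0": freqs[idx], "amplitude": curve[idx]}

# Made-up frequencies (Hz) and H/V amplitudes for illustration.
x_freqs = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
hvsr_curve = [1.1, 3.2, 1.4, 1.0, 4.5, 1.2]

peak_indices = find_peaks(hvsr_curve)
peak_freqs = [x_freqs[i] for i in peak_indices]
print(peak_indices, peak_freqs)
print(best_peak(x_freqs, hvsr_curve, (0.4, 5.0)))
```

In this example the larger peak at 8 Hz is still detected, but it is not selected because it falls outside the chosen range. This mirrors the description of peak_freq_range as constraining only the final choice, not the processing.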
HVSRBatch Objects
HVSRBatch is the data container used for batch processing. It contains several HVSRData objects (one for each site). These can be accessed using their site name, either square brackets (HVSRBatchVariable["SiteName"]) or the dot (HVSRBatchVariable.SiteName) accessor. The dot accessor may not work if there is a space in the site name.
All of the functions in the sprit package are designed to perform the bulk of their operations iteratively on the individual HVSRData objects contained in the HVSRBatch object, and do little with the HVSRBatch object itself, besides using it to determine which sites are contained within it. Therefore, the attributes of the HVSRBatch object are fewer than those of the HVSRData object (though each HVSRData object within the HVSRBatch object contains all the attributes/methods listed above and in the API).
Name | Description |
---|---|
SITENAME1 | Each of the site names (same as the "site" attribute of the HVSRData object listed above) is its own attribute |
SITENAME2 | Each of the site names (same as the "site" attribute of the HVSRData object listed above) is its own attribute |
SITENAME... | Each of the site names (same as the "site" attribute of the HVSRData object listed above) is its own attribute |
SITENAME_N | Each of the site names (same as the "site" attribute of the HVSRData object listed above) is its own attribute |
batch | Whether this object was generated during batch processing (always True for HVSRBatch objects) |
batch_dict | Dictionary containing the names of the sites as keys and the HVSRData object for that site as the value |
copy() | METHOD Makes a copy of the HVSRBatch object using the python copy module. See here and here for more information. |
export() | METHOD Exports the data to a pickle object with .hvsr extension, see here for more information |
get_report() | METHOD Generates a report based on parameters for sprit.get_report() , see here for more information. |
items() | METHOD Returns the "items" of the HVSRBatch object. For HVSRBatch objects, this is a dict_items object with the keys and values in tuples. Functions similarly to dict.items(). See here for more information. |
keys() | METHOD Returns the "keys" of the HVSRBatch object. For HVSRBatch objects, these are the site names in a dict_keys format. Functions similarly to dict.keys(). See here for more information. Same output as the sites attribute, except that the sites output is a list |
plot() | METHOD Uses sprit.plot_hvsr() to plot your HVSR curve output for each site in the HVSRBatch object. See here for more information. |
report() | METHOD Does the same function as the get_report() function and method. See here for more details. |
sites | A list of the site names of each of the HVSRData objects contained in the HVSRBatch object |
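
The batch pattern described in this section (a container keyed by site name, with functions applied to each contained object in turn) can be sketched as follows. `Record`, `Batch`, and `apply_to_each` are hypothetical names written for this example and are not SpRIT's classes or functions; the sketch also shows why the dot accessor cannot be used for a site name containing a space.

```python
class Record:
    """Hypothetical per-site record standing in for an HVSRData object."""
    def __init__(self, site, peak_freq):
        self.site = site
        self.peak_freq = peak_freq

class Batch:
    """Hypothetical container keyed by site name (not SpRIT's HVSRBatch)."""
    def __init__(self, records):
        self.batch = True
        self.batch_dict = {rec.site: rec for rec in records}
        self.sites = list(self.batch_dict)

    def __getitem__(self, site):
        return self.batch_dict[site]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["batch_dict"][name]
        except KeyError:
            raise AttributeError(name) from None

    def keys(self):
        return self.batch_dict.keys()

    def items(self):
        return self.batch_dict.items()

def apply_to_each(batch, func):
    """Run func on every per-site record and collect results by site."""
    return {site: func(rec) for site, rec in batch.items()}

batch = Batch([Record("SiteA", 1.5), Record("Site B", 4.0)])
print(batch.SiteA.peak_freq)        # dot accessor works for "SiteA"
print(batch["Site B"].peak_freq)    # a name with a space needs brackets
print(apply_to_each(batch, lambda rec: rec.peak_freq))
```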