6. DM‐SED - ufarrell/sgp_phase2 GitHub Wiki

The Deep-Time Marine Sedimentary Element Database (DM-SED) is a compilation study published in April 2025. The project built on SGP Phase 1, adding new data from studies that span a broad range of ages (Phanerozoic and Proterozoic) and have global coverage.

Publication: Lai, J., Song, H., Chu, D., Dal Corso, J., Sperling, E. A., Wu, Y., Liu, X., Wei, L., Li, M., Song, H., Du, Y., Jia, E., Feng, Y., Song, H., Yu, W., Liang, Q., Li, X., & Yao, H. (2025). Deep-Time Marine Sedimentary Element Database. Earth System Science Data, 17(4), 1613–1626. https://doi.org/10.5194/essd-17-1613-2025

Dataset: Lai, J., Song, H., Chu, D., Dal Corso, J., Sperling, E. A., Wu, Y., Liu, X., Wei, L., Li, M., Song, H., Du, Y., Jia, E., Feng, Y., Song, H., Yu, W., Liang, Q., Li, X., & Yao, H. (2025). Deep-Time Marine Sedimentary Element Database [Dataset]. Zenodo. https://doi.org/10.5281/ZENODO.13131676

DM-SED samples were grouped into two projects: "New Compilation" and SGP. A decision was made to incorporate the New Compilation data, restricted to a) publications from 2017 or earlier and b) data other than ODP/DSDP.

DM-SED includes 8029 samples with 212506 results from 456 sites.

Geography

The samples are from 36 countries/oceans, with 25% from Canada, 21% from China and 14% from the United States.

Lithology

56% of samples are simply coded as 'siliciclastic', 23% are shale, and 8% are carbonates (limestone, dolomite, carbonate).

Age

33% of samples are from the Palaeozoic, 35% from the Mesozoic, 15% from the Cenozoic and 17% are Precambrian.

Data

Data were entered in batches based on publications; no methodological information was available at the time of download. Summary of batches here.

Categories below are based on those used on our search website (http://sgp-search.io/).

Completeness

Data Collection/Processing

DM-SED Version 3 data was imported into a local SQL database for processing.

Some overlap was identified between the DM-SED New Compilation and SGP, based on sample names and publications. This included 2662 samples and ~39 references, some of them from Phase 2 (i.e. not publicly available before this update).

427 distinct combinations of sitename, region, elevation_m, modlat, modlon were identified. 161 samples had no sitename.
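A minimal sketch of how such a count could be reproduced, assuming the samples are available as a list of dictionaries keyed by the column names above. The helper name `summarise_sites` and the example rows are hypothetical and are not taken from the DM-SED data.

```python
from typing import Iterable, Mapping, Tuple

# Columns that together define one distinct site combination (from the text above).
SITE_KEYS = ("sitename", "region", "elevation_m", "modlat", "modlon")


def summarise_sites(rows: Iterable[Mapping]) -> Tuple[int, int]:
    """Return (number of distinct site combinations, samples with no sitename)."""
    combos = set()
    missing_name = 0
    for row in rows:
        combos.add(tuple(row.get(key) for key in SITE_KEYS))
        name = row.get("sitename")
        if name is None or not str(name).strip():
            missing_name += 1
    return len(combos), missing_name


# Hypothetical rows for illustration only.
rows = [
    {"sitename": "Section A", "region": "Canada", "elevation_m": 120.0,
     "modlat": 51.20, "modlon": -116.50},
    {"sitename": "Section A", "region": "Canada", "elevation_m": 120.0,
     "modlat": 51.20, "modlon": -116.50},
    {"sitename": None, "region": "South China", "elevation_m": None,
     "modlat": 30.10, "modlon": 110.40},
]
n_sites, n_missing = summarise_sites(rows)
```

The same summary could equally be written as a `SELECT DISTINCT` query in the local SQL database; the Python form is used here only to keep the example self-contained.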

DM-SED stores general geographical information in the "Region" column, with a variable level of detail that often does not include the country. Latitude and longitude were reverse-geocoded (using OpenRefine and the OpenStreetMap Nominatim API) to find country names. State/Province was also added where possible.
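The reverse-geocoding step was done in OpenRefine, so the following is only an illustrative sketch of the same idea in Python. It builds a request URL for the public Nominatim `reverse` endpoint and pulls the country and state out of a response. The function names are hypothetical, the sample response is hand-written to resemble a Nominatim reply rather than real output, and no network call is made here.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import urlencode

NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"


def reverse_url(lat: float, lon: float) -> str:
    """Build a Nominatim reverse-geocoding URL for one coordinate pair."""
    params = {
        "format": "jsonv2",
        "lat": f"{lat:.2f}",   # DM-SED coordinates are rounded to two decimals
        "lon": f"{lon:.2f}",
        "addressdetails": 1,
    }
    return f"{NOMINATIM_REVERSE}?{urlencode(params)}"


def country_and_state(response: Mapping) -> Tuple[Optional[str], Optional[str]]:
    """Extract (country, state) from a decoded JSON response, if present."""
    address = response.get("address") or {}
    return address.get("country"), address.get("state")


# Hand-written example shaped like a Nominatim reply (not real output).
sample_response = {"address": {"state": "Alberta", "country": "Canada"}}
country, state = country_and_state(sample_response)
```

Responses without an address block (for example, points in the open ocean) yield `(None, None)`, which leaves those sites to be coded by hand. Any real use would also need to respect the Nominatim usage policy on request rates.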

Site type was coded based on the site names (e.g. assuming core if "core" was present in the site name), and by looking up the original publications. Some samples are compilation samples from both core and outcrop.
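The name-based part of this coding can be sketched as a simple keyword check. This is an assumption-laden illustration: only the "core" keyword comes from the text above, while the function name, the returned label and the whole-word matching are choices made for the example. Anything the check cannot decide is left as `None` for lookup in the original publication.

```python
import re
from typing import Optional


def guess_site_type(site_name: Optional[str]) -> Optional[str]:
    """Return "core" if the site name contains the word "core", else None."""
    # Whole-word match is an assumption, so that names such as "Score Ridge"
    # are not coded as cores by accident.
    if site_name and re.search(r"\bcore\b", site_name, flags=re.IGNORECASE):
        return "core"
    return None  # undecided: check the original publication
```

For instance, `guess_site_type("Hypothetical Core 1")` returns `"core"`, while a name without the keyword returns `None`.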

Missing site names were also coded from original publications where possible. Where no obvious site name was available or accessible, the DM-SED Region was used as the site name. 1824 samples were given revised site names. In some cases original sample names were also added or revised, and lithology types were updated where more detail was easily available.
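The fallback order described above can be written down as a small helper. This is a sketch under stated assumptions: the function name, the argument names and the source labels are hypothetical, and manual revisions of existing names are not modelled.

```python
from typing import Optional, Tuple


def resolve_site_name(
    dmsed_name: Optional[str],
    publication_name: Optional[str],
    region: Optional[str],
) -> Tuple[Optional[str], str]:
    """Pick a site name and report where it came from.

    Order of preference: DM-SED SiteName, then a name found in the
    original publication, then the DM-SED Region.
    """
    candidates = (
        (dmsed_name, "dm-sed"),
        (publication_name, "publication"),
        (region, "region"),
    )
    for value, source in candidates:
        if value and value.strip():
            return value.strip(), source
    return None, "missing"
```

Returning the source alongside the name keeps a record of which samples ended up with a Region-derived site name.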

A summary of revisions here.

In keeping with CARE (Collective benefit, Authority to control, Responsibility, and Ethics) principles, 48 DM-SED samples from 4 sites, with 2428 results, are not publicly available in Phase 2. These are from sites where the decimal latitude and longitude intersect with Native-held land (identified using public TIGER/Line shapefiles provided by the U.S. Census Bureau, accessed through QGIS via https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/AIANNHA/MapServer).
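The intersection itself was done in QGIS. As an illustration of the underlying test only, the sketch below uses a standard ray-casting point-in-polygon check to split sites into public and withheld groups. The function names are hypothetical, and the polygon and site coordinates are made up and do not correspond to any real boundary.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def point_in_polygon(lon: float, lat: float, ring: Sequence[Point]) -> bool:
    """Ray-casting test: is (lon, lat) inside the polygon described by ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def split_sites(sites: Dict[str, Point],
                rings: List[Sequence[Point]]) -> Tuple[List[str], List[str]]:
    """Return (public site names, withheld site names)."""
    public, withheld = [], []
    for name, (lon, lat) in sites.items():
        hit = any(point_in_polygon(lon, lat, ring) for ring in rings)
        (withheld if hit else public).append(name)
    return public, withheld


# Made-up unit square and sites, for illustration only.
rings = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]]
sites = {"site_inside": (0.5, 0.5), "site_outside": (1.5, 0.5)}
public, withheld = split_sites(sites, rings)
```

A planar test like this ignores holes, multipart polygons and coordinate reference systems, all of which a GIS handles; it is meant only to show the logic of the check.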

DM-SED references were extracted, cleaned (some variations of the same reference removed, some DOIs updated where they didn't match the reference), and added to SGP. Samples are linked to original references and grouped into projects accordingly (e.g. Absar et al. 2009 (DM-SED)).
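A short sketch of the grouping step, assuming each sample already carries a short citation for its original reference. The helper names and sample identifiers are hypothetical; only the "(DM-SED)" suffix and the Absar et al. 2009 label come from the text above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def project_label(short_citation: str) -> str:
    """Build a project name such as "Absar et al. 2009 (DM-SED)"."""
    return f"{short_citation} (DM-SED)"


def group_by_project(samples: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (sample_id, short_citation) pairs by project name."""
    projects = defaultdict(list)
    for sample_id, short_citation in samples:
        projects[project_label(short_citation)].append(sample_id)
    return dict(projects)


# Hypothetical sample identifiers; the second citation is invented.
samples = [
    ("S-001", "Absar et al. 2009"),
    ("S-002", "Absar et al. 2009"),
    ("S-003", "Hypothetical et al. 2015"),
]
projects = group_by_project(samples)
```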

Interpreted Age

Interpreted ages are from DM-SED - see publication for details.

Data entry - DM-SED vs. SGP

| DM-SED field name | DM-SED description of the field (units) | SGP table_name.column_name | Notes |
| --- | --- | --- | --- |
| SampleID | Unique sample identification code | alternate_num.alternate_num | |
| SampleName | Author-denoted title for the sample (often non-unique) | sample.original_num | Cleaned up/found from papers in some cases. |
| SiteName | Name of the drill core site or section | site.section_name | Cleaned up/found from papers in some cases. |
| Region | Country or ocean of the data collection site | SEE NOTES | Cleaned up, combined with lat-long to find country. Verbatim in site.site_desc. |
| Elevation | Distance between the sampling location and sea level (m) | sample.elevation_m | |
| SampleDepth | Stratigraphic height or depth (m) | sample.height_depth_m | |
| ModLat | Modern latitude of the collection site rounded to two decimals, with negative values indicating the Southern Hemisphere (decimal degrees) | site.lat_orig, site.lat_dec | |
| ModLon | Modern longitude of the collection site rounded to two decimals, with negative values indicating the Western Hemisphere (decimal degrees) | site.long_orig, site.long_dec | |
| PalaeoLat | Palaeo-latitude of the collection site rounded to two decimals, with negative values indicating the Southern Hemisphere (decimal degrees) | NOT IMPORTED | |
| PalaeoLon | Palaeo-longitude of the collection site rounded to two decimals, with negative values indicating the Western Hemisphere (decimal degrees) | NOT IMPORTED | |
| Age | Absolute age, in reference to GTS2020 (Ma) | interpreted_age.interpreted_age | |
| Period | The geological period | SEE NOTES | geol_age (with below) |
| Stage | The geological stage (i.e. geochronological age) | SEE NOTES | geol_age (with above) |
| Biozone | Conodont, graptolite, and ammonite biozone | sample.verbatim_biostrat | |
| LithName | Lithological name of the sample, as originally published | sample.lith_id, sample.verbatim_lith | Cleaned to match SGP dic_lithology, verbatim stored in verbatim_lith. |
| LithType | Lithology type of the sample (e.g. carbonate or siliciclastic) | NOT INCLUDED | Siliciclastic/carbonate only; used to code lithology along with LithName. |
| Formation | Geological formation name | lithostrat.verbatim_strat, lithostrat.strat_id | Matched to dic_strat, verbatim stored in verbatim_strat. |
| Facies | Depositional environment (e.g. mid-shelf or ramp) | environment.env_notes | Some coded to match SGP environmental bins. |
| Reference | Data sources, including the published literature or other databases | SEE NOTES | Added full references to reference_work table. |
| Project | Two parts: new compilation and SGP | NOT INCLUDED | |
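For anyone scripting against both schemas, the field mapping above can be held in a plain lookup table. The sketch below transcribes the direct mappings from the table; the constant names and the `classify_field` helper are hypothetical, and the fields marked SEE NOTES are kept apart because they were handled by hand rather than copied to a single column.

```python
from typing import Dict, List

# DM-SED field -> SGP table_name.column_name targets (direct mappings only).
FIELD_MAP: Dict[str, List[str]] = {
    "SampleID": ["alternate_num.alternate_num"],
    "SampleName": ["sample.original_num"],
    "SiteName": ["site.section_name"],
    "Elevation": ["sample.elevation_m"],
    "SampleDepth": ["sample.height_depth_m"],
    "ModLat": ["site.lat_orig", "site.lat_dec"],
    "ModLon": ["site.long_orig", "site.long_dec"],
    "Age": ["interpreted_age.interpreted_age"],
    "Biozone": ["sample.verbatim_biostrat"],
    "LithName": ["sample.lith_id", "sample.verbatim_lith"],
    "Formation": ["lithostrat.verbatim_strat", "lithostrat.strat_id"],
    "Facies": ["environment.env_notes"],
}

# Fields listed as NOT IMPORTED or NOT INCLUDED.
NOT_CARRIED_OVER = {"PalaeoLat", "PalaeoLon", "LithType", "Project"}

# Fields listed as SEE NOTES (processed manually).
SEE_NOTES = {"Region", "Period", "Stage", "Reference"}


def classify_field(name: str) -> str:
    """Return "direct", "not_carried_over" or "see_notes" for a DM-SED field."""
    if name in FIELD_MAP:
        return "direct"
    if name in NOT_CARRIED_OVER:
        return "not_carried_over"
    if name in SEE_NOTES:
        return "see_notes"
    raise KeyError(f"Unknown DM-SED field: {name}")
```

For example, `FIELD_MAP["ModLat"]` gives the two latitude columns in the `site` table, and `classify_field("PalaeoLat")` returns `"not_carried_over"`.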