C2M2 Table Summary - nih-cfde/published-documentation GitHub Wiki

Crosscut Metadata Model (C2M2) Common Vocabulary (CV) Tables

Most of these files can be assembled automatically. Please see the Submission Guide for instructions on assembling them.

All table files listed in this summary will be bundled together, along with the C2M2 datapackage JSON Schema file which defines them, to create a valid C2M2 datapackage for submission to CFDE
TSV files for any empty (unused) tables must still be submitted, with only the (tab-separated) column-header row filled in
Table (TSV) filenames must exactly match those listed in the JSON Schema file (and in these docs)
Table column headers must exactly match those listed in the JSON Schema file (and in these docs)
Table columns must appear in the order given in the JSON Schema file (and in these docs)
Tables marked "CV term table" will be built automatically with the CFDE tools (wiki)
Table (TSV) files must not contain any empty rows or extra lines
Every TSV file must end with the final row of table data, terminated by a newline
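
The formatting rules above can be checked before submission. The following is a minimal sketch, not part of the CFDE tools: the function name `check_tsv_text` and the example column names are hypothetical, and the expected header list is assumed to have been copied by hand from the JSON Schema file. It checks only for an exact header match (names and order), empty rows, and a final newline.

```python
def check_tsv_text(text, expected_header):
    """Return a list of problems found in the text of one table (TSV) file.

    expected_header is the ordered list of column names for this table,
    assumed here to be copied from the C2M2 JSON Schema file.
    """
    problems = []

    # The file must end with the final row, terminated by a newline.
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")

    lines = text.split("\n")
    if text.endswith("\n"):
        lines = lines[:-1]  # drop the empty string left after the final newline

    # Column headers must match the schema exactly, in the same order.
    if not lines or lines[0].split("\t") != list(expected_header):
        problems.append("header row does not match the expected columns")

    # No empty rows or extra lines are allowed anywhere in the file.
    for number, line in enumerate(lines, start=1):
        if line.strip() == "":
            problems.append(f"line {number} is empty")

    return problems


# Hypothetical two-column table, used only for illustration.
header = ["id", "name"]
print(check_tsv_text("id\tname\n", header))            # unused table: header row only
print(check_tsv_text("id\tname\nA\tfoo\n\n", header))  # extra trailing line
```

An unused table that contains only its header row passes this check, which matches the rule that empty tables must still be submitted with the column-header row filled in. The sketch does not handle Windows-style line endings.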

Table (click for detailed information)	Construction	Can be empty?	Notes
analysis_type.tsv	Built by script	Y	CV term table
anatomy.tsv	Built by script	Y	CV term table
assay_type.tsv	Built by script	Y	CV term table
biofluid.tsv	Built by script	Y	CV term table
biosample.tsv	Prepared by submitter	Y	This table will have one row for each biosample
biosample_disease.tsv	Prepared by submitter	Y	For biosamples with disease metadata, this table will have one row for each disease associated with each biosample, along with a field distinguishing "exemplar of disease" from "disease specifically ruled out"
biosample_from_subject.tsv	Prepared by submitter	Y	This table will have one row for each attribution of a biosample to a subject
biosample_gene.tsv	Prepared by submitter	Y	For each biosample with a small group of associated genes (e.g. knockdown targets), this table will have one row for each association of a gene with a biosample
biosample_in_collection.tsv	Prepared by submitter	Y	This table will have one row for each assignment of a biosample as a member of a collection
biosample_protein.tsv	Prepared by submitter	Y	For each biosample with a small group of associated proteins, this table will have one row for each association of a protein with a biosample
biosample_ptm.tsv	Prepared by submitter	Y	For each biosample with a small group of associated PTMs, this table will have one row for each association of a PTM with a biosample
biosample_substance.tsv	Prepared by submitter	Y	For biosamples with substance metadata, this table will have one row for each association of a substance with a biosample
collection.tsv	Prepared by submitter	Y	This table will have one row for each collection
collection_anatomy.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of anatomy Y", for one particular (collection X, anatomy Y) pair
collection_biofluid.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of biofluid Y", for one particular (collection X, biofluid Y) pair
collection_compound.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of compound Y", for one particular (collection X, compound Y) pair
collection_defined_by_project.tsv	Prepared by submitter	Y	This table will have one row for each collection that was generated directly by a project listed in the project.tsv table
collection_disease.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of disease Y", for one particular (collection X, disease Y) pair
collection_gene.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of gene Y", for one particular (collection X, gene Y) pair
collection_in_collection.tsv	Prepared by submitter	Y	This table will have one row for each parent->child (collection->subcollection) relationship
collection_phenotype.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of phenotype Y", for one particular (collection X, phenotype Y) pair
collection_protein.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of protein Y", for one particular (collection X, protein Y) pair
collection_ptm.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of PTM Y", for one particular (collection X, PTM Y) pair
collection_substance.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of substance Y", for one particular (collection X, substance Y) pair
collection_taxonomy.tsv	Prepared by submitter	Y	Each row in this table is equivalent to the statement "the contents of collection X directly relate to the study of taxonomy Y", for one particular (collection X, taxonomy Y) pair
compound.tsv	Built by script	Y	CV term table
data_type.tsv	Built by script	Y	CV term table
dcc.tsv (formerly `primary_dcc_contact.tsv`)	Prepared by submitter	N	This table will have exactly one row
disease.tsv	Built by script	Y	CV term table
domain_location.tsv	Prepared by submitter	Y	This table will have one row for each unique domain_location term in the ptm table
file.tsv	Prepared by submitter	Y	This table will have one row for each file
file_describes_biosample.tsv	Prepared by submitter	Y	This table will have one row for each association of a biosample with a describing file
file_describes_collection.tsv	Prepared by submitter	Y	This table will have one row for each association of a collection with a describing file
file_describes_subject.tsv	Prepared by submitter	Y	This table will have one row for each association of a subject with a describing file
file_format.tsv	Built by script	Y	CV term table
file_in_collection.tsv	Prepared by submitter	Y	This table will have one row for each assignment of a file as a member of a collection
gene.tsv	Built by script	Y	CV term table
id_namespace.tsv	Prepared by submitter	N	This table will have one row for each C2M2 identifier namespace registered with CFDE
ncbi_taxonomy.tsv	Built by script	Y	CV term table
phenotype.tsv	Built by script	Y	CV term table
phenotype_disease.tsv	Built by script	Y	Each row in this table is equivalent to the statement "phenotype X is known to be associated with disease Y", for one particular (phenotype X, disease Y) pair; contents are autoloaded from HPO by the submission prep script, which will add relevant rows for every phenotype term and every disease term used in submitter-prepared tables
phenotype_gene.tsv	Built by script	Y	Each row in this table is equivalent to the statement "phenotype X is known to be associated with gene Y", for one particular (phenotype X, gene Y) pair; contents are autoloaded from HPO by the submission prep script, which will add relevant rows for every phenotype term and every gene term used in submitter-prepared tables
project.tsv	Prepared by submitter	N	This table will have one row for each project
project_in_project.tsv	Prepared by submitter	Y*	This table will have one row for each parent->child (project->subproject) relationship. *If you have more than one project in your project.tsv table, then you must populate this table with all of your program's top-level projects, listed as children of your program's root project.
protein.tsv	Built by script	Y	CV term table
protein_gene.tsv	Built by script	Y	Each row in this table is equivalent to the statement "protein X is known to be associated with gene Y", for one particular (protein X, gene Y) pair; contents are autoloaded from HPO by the submission prep script, which will add relevant rows for every protein term and every gene term used in submitter-prepared tables
ptm.tsv	Prepared by submitter	Y	This table will have one row for each PTM
ptm_type.tsv	Prepared by submitter	Y	This table will have one row for each unique ptm_type term in the ptm table
ptm_subtype.tsv	Prepared by submitter	Y	This table will have one row for each unique ptm_subtype term in the ptm table
sample_prep_method.tsv	Built by script	Y	CV term table
subject.tsv	Prepared by submitter	Y	This table will have one row for each subject
subject_disease.tsv	Prepared by submitter	Y	For subjects with disease metadata, this table will have one row for each disease associated with each subject, along with a field distinguishing "disease detected" from "disease specifically ruled out"
subject_in_collection.tsv	Prepared by submitter	Y	This table will have one row for each assignment of a subject as a member of a collection
subject_phenotype.tsv	Prepared by submitter	Y	For every subject with phenotype metadata, this table will have one row for each phenotype associated with each subject, along with a field distinguishing "exemplar of phenotype" from "phenotype specifically ruled out"
subject_race.tsv	Prepared by submitter	Y	This table will have one row for each subject with a race assertion
subject_role_taxonomy.tsv	Prepared by submitter	Y	This table will have one row for each taxon assigned to a subject
subject_substance.tsv	Prepared by submitter	Y	For subjects with substance metadata, this table will have one row for each substance associated with each subject
substance.tsv	Built by script	Y	CV term table
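
The "Can be empty?" column can also be checked mechanically. The sketch below is illustrative only and assumes the submitter has already counted the data rows (header excluded) in each file. The function name `check_row_counts` and its input dictionary are hypothetical. The rules it encodes come from the table above: dcc.tsv, id_namespace.tsv and project.tsv may not be empty, dcc.tsv has exactly one row, and project_in_project.tsv must be populated when project.tsv lists more than one project.

```python
# Tables marked "N" in the "Can be empty?" column of the summary table.
REQUIRED_NONEMPTY = ("dcc.tsv", "id_namespace.tsv", "project.tsv")


def check_row_counts(row_counts):
    """Return a list of problems, given a mapping of filename -> data-row count.

    Row counts exclude the header row; a missing filename is treated as 0 rows.
    """
    problems = []

    for name in REQUIRED_NONEMPTY:
        if row_counts.get(name, 0) == 0:
            problems.append(f"{name} must not be empty")

    # dcc.tsv is described as having exactly one row.
    if row_counts.get("dcc.tsv", 0) > 1:
        problems.append("dcc.tsv must have exactly one row")

    # With more than one project, project_in_project.tsv must be populated.
    if (row_counts.get("project.tsv", 0) > 1
            and row_counts.get("project_in_project.tsv", 0) == 0):
        problems.append(
            "project_in_project.tsv must be populated when project.tsv "
            "has more than one row"
        )

    return problems


print(check_row_counts({"dcc.tsv": 1, "id_namespace.tsv": 2, "project.tsv": 3}))
```

This sketch does not verify that the rows in project_in_project.tsv actually link every top-level project to the root project; it only flags the case where the table has been left empty.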
