ChangelogMethylation - folkehelseinstituttet/mobagen GitHub Wiki
This page contains a history of, and to a certain degree a road map for, the methylation data from MoBa.
Table of contents
- Status
- Changelog
- What to expect
Status
TSD p229methylation is complete with respect to data returned to us.
Changelog
October 2023
met011 + info on met010
05.10.23 met011 has now had its raw data published.
met010 has had its QC published, but the changelog was not updated at the time. See the separate entry for September 2022.
February 2023
Participant withdrawal
13.02.23 See the updated documentation on withdrawal. The documentation also tells you how to identify the sets and which samples have been deleted (12 samples were deleted this time).
September 2022
QC of met010 available
02.09.22 met010 has now had its QC results published.
QC documentation is now found on MethylationQC, and the contents have been restructured. The total QC will be published within a week; we first need to manually remove possible 'bad' samples.
August 2022
New set met010
met010 has now had its raw data published, as well as parts of the QC (up to identification of bad samples).
QC documentation is now found on MethylationQC, and the contents have been restructured. The total QC will be published within a week; we first need to manually remove possible 'bad' samples.
July 2022
Participant withdrawal
27.07.22 A new round of participant deletion has been run. Details:
- met001: 12 samples (first delete)
- met002: 1 sample (first delete)
- met003: 1 sample
- met004: 3 samples (2 children and 1 parent)
- met008: 3 samples (child)
Documentation of deleted samples has now been improved, as described on the withdrawal page.
May 2022
New dataset met009 published
02.05.2022 met009 has now been published, including QC results. The (common) documentation on the QC pipeline is clearer/better.
Feel free to report anything strange; a lot of new scripts have been created to automate new parts of the production/publishing process!
February 2022
Sample delete and new QC
22.2.2022 The official methylation documentation has been updated, and a brand new page, withdrawal, is now part of the documentation.
Together these describe how the methylation directories will be cleaned up as participants ask for their data to be deleted.
Also, the first batch of deletions has started, and within a few days a new data-set will replace the existing one. A full QC has been run on the data, from which 14 samples have been deleted. For various reasons we lack the sentrixID for 12 more samples (belonging to met001 and met002). These will be deleted as soon as we can track down their IDs.
Furthermore, the documentation now shows which samples have been, or are known to be, deleted. If you have analyses that have not yet started, you are obliged to delete these samples yourself.
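As an illustration of removing withdrawn samples yourself before starting an analysis, here is a minimal Python sketch. It assumes the sample-sheet is a plain CSV with a header row and that samples are identified by a column named SentrixID; that column name, the function drop_withdrawn and the toy data are hypothetical, so check the withdrawal documentation for how deleted samples are actually listed.

```python
import csv
import io


def drop_withdrawn(rows, withdrawn_ids, id_column="SentrixID"):
    """Return only the rows whose ID is not among the withdrawn IDs.

    id_column is an assumed column name, not taken from the real sample-sheets.
    """
    withdrawn = set(withdrawn_ids)
    return [row for row in rows if row[id_column] not in withdrawn]


# Toy sample-sheet with made-up IDs (not real MoBa data).
sample_sheet = io.StringIO(
    "SentrixID,Sentrix_Position\n"
    "1001,R01C01\n"
    "1002,R02C01\n"
    "1003,R03C01\n"
)
rows = list(csv.DictReader(sample_sheet))

# Hypothetical list of withdrawn sample IDs.
kept = drop_withdrawn(rows, ["1002"])
print([row["SentrixID"] for row in kept])
```

Filtering on the sample-sheet before loading any idat files keeps withdrawn samples out of every later step of the analysis.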
QC published
5.2.2022 The totality of the QC was published to TSD, as well as some updates on the official methylation documentation.
There can still be changes to the data/structure - for example, there are certain results files that only exist in .rds format. Also note the warning that some participants who have recently withdrawn from MoBa will soon have their methylation data removed from p22Methylation.
January 2022
QC plots (pilot) updated with public plots
2022.01.17 See link under plots
QC-result pilot updated
2022.01.14 plots have been added. Documented on QC results
met006 - confirmed correct idat files
2022.01.14 The previous met006 idat-files were also corrupt, but the current ones are good (a successful QC has been run on them, soon to be published).
met006 - updated raw-data
The set turned out to be corrupt, and we have now published a better candidate.
met003 qc-results for discussion
2022.01.05 On TSD, under datasets/met003/QC/results you will find a very rough example of QC results. To be discussed on slack ...
Also, the general section on QC results has been updated with current status/prototype info.
met004 split in child and parent
2022.01.05 For various reasons, it is an advantage to have child and parent split within a set. This is now documented under the methylation page. Note the use of soft-links ...
You can see this for the met004 and met008 data-sets. (Vigilant users might have observed this in December 2021, but it is only documented now).
December 2021
Tentative QC-directory set up
Finalization of the QC is getting closer. The QC needs certain parameters to run. These have been tentatively described on QC. The files described are available on TSD as well.
Updated sample-sheets (met001-met002)
2021.12.08 The samplesheets of met001 and met002 have been edited.
- SentrixPosition is renamed to Sentrix_Position.
- A column Sex has been added.
November 2021
Updated sample-sheets (met003-met008)
2021.11.10
- The sets met005, met006 and met007 have had their sample-sheets updated. These now contain information on the underlying sample-types, coded as described on sampleTypes. They also contain sex information for the individual, as recorded in the NIPH biobank.
- Due to an unfortunate typo in the Illumina documentation, the column Sentrix_Position used to be named SentrixPosition. This is fixed for almost all sample-sheets. The exceptions are the sets met001 and met002, which still need some work, but they can be used anyway.
- The sample-sheets that contained a column for sex had the column wrongly named 'gender'. This has been fixed.
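Scripts written against older copies of the sample-sheets may still expect the previous column names. The following Python sketch maps the two renames described above (SentrixPosition to Sentrix_Position, gender to Sex) when reading a sheet. It assumes a plain CSV with a single header row; the helper names, the Sample_Name column and the toy data are hypothetical.

```python
import csv
import io

# Old column name -> current column name, as described in the changelog.
LEGACY_COLUMN_NAMES = {
    "SentrixPosition": "Sentrix_Position",
    "gender": "Sex",
}


def normalize_header(fieldnames):
    """Replace legacy column names; leave all other names untouched."""
    return [LEGACY_COLUMN_NAMES.get(name, name) for name in fieldnames]


def read_sample_sheet(handle):
    """Read a CSV sample-sheet into dicts keyed by the current column names."""
    reader = csv.reader(handle)
    header = normalize_header(next(reader))
    return [dict(zip(header, row)) for row in reader]


# Toy sheet using the old names (made-up values, not real MoBa data).
old_sheet = io.StringIO(
    "Sample_Name,SentrixPosition,gender\n"
    "S1,R01C01,F\n"
)
rows = read_sample_sheet(old_sheet)
print(sorted(rows[0]))
```

Sheets that already use the current names pass through unchanged, so the same reader can be used for both old and new copies.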
Updated sample-sheets (met003, met004, met008)
The sets met003, met004 and met008 have had their sample-sheets updated. They now contain information on the underlying sample-type, coded as described on sampleTypes. They also contain gender information for the individual.
The whole setup is experimental and might be changed later. The sample sheets are the same as before, but new columns have been added at the end.
October 2021
met008
2021.10.19 2021.10.13 Raw data/idats for met008 are published.
met004 and met005
2021.10.13 Raw data/idats for met004 are published.
met005 was already published but has been restructured so that it is organized just like the other sets:
- The samplesheet now carries the chip-name.
- The idats are now directly under the idats directory; they used to be in subdirectories.
- For a very limited time, the old dataset is found on met005.old. It will be deleted without warning.
met003
2021.10.11 Raw data/idats for met003 are published.
met002
2021.10.08 Raw data/idats for met002 are published.
met001
2021.10.07 Raw data/idats for met001 are published.
met006 and met007
2021.10.06 idat-files for these sets have been published. See Methylation for details. met006 might be corrupted.
What to expect
Raw data sets
We hope to publish these during October 2020.
met005 is going to have its directory-structure changed.
QC
Every raw data set will have a QC'ed version; see QC.