ChangelogMethylation - folkehelseinstituttet/mobagen GitHub Wiki

This page contains a history of, and to a certain degree a road map for, methylation data from MoBa.

Status

TSD p229methylation is complete with respect to data returned to us.

Changelog

October 2023

met011 + info on met010

05.10.23 met011 has now had its raw data published.

met010 has had its QC published, but the changelog was not updated at the time. See the separate entry for September 2022.

February 2023

Participant withdrawal

13.02.23 See the updated documentation on withdrawal. The documentation also tells you how to identify the sets and which samples have been deleted (12 samples were deleted this time).

September 2022

QC of met010 available

02.09.22 met010 has now had its QC results published.

QC documentation is now found on MethylationQC, and its contents have been restructured. The total QC will be published within a week; we first need to manually remove possible 'bad' samples.

August 2022

New set met010

met010 has now had its raw data published, as well as parts of the QC (up to identification of bad samples).

QC documentation is now found on MethylationQC, and its contents have been restructured. The total QC will be published within a week; we first need to manually remove possible 'bad' samples.

July 2022

Participant withdrawal

27.07.22 A new round of participant deletion has been run. Details:

  • met001: 12 samples (first delete)
  • met002: 1 sample (first delete)
  • met003: 1 sample
  • met004: 3 samples (2 children and 1 parent)
  • met008: 3 samples (child)

Documentation of deleted samples has now been improved, as described on the withdrawal page.

May 2022

New dataset met009 published

02.05.2022 met009 has now been published, including QC results. The (common) documentation on the QC pipeline is clearer/better.

Feel free to report anything strange: a lot of new scripts have been created to automate new parts of the production/publishing process!

February 2022

Sample delete and new QC

22.2.2022 The official methylation documentation has been updated, and a brand new withdrawal page has been added to the documentation.

Together these describe how the methylation directories will be cleaned up as participants ask for their data to be deleted.

Also, the first batch of deletions has started, and within a few days a new data-set will replace the existing one. A full QC has been run on the data from which 14 samples have been deleted. For various reasons we lack the sentrixID for 12 more samples (belonging to met001 and met002). These will be deleted as soon as we can track down their IDs.

Furthermore, the documentation now shows you which samples have been, or are known to be, deleted. If you have analyses that have not yet started, you are obliged to delete these samples yourself.
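For an analysis that has not yet started, one way to leave out the listed samples is to filter the sample-sheet before any idat file is read. The sketch below is only an illustration under stated assumptions: the column names `Sentrix_ID` and `Sentrix_Position`, the `<Sentrix_ID>_<Sentrix_Position>` sample key, and the example identifiers are hypothetical and are not taken from the documentation on TSD.

```python
import csv
import io


def sample_key(row):
    """Build a '<Sentrix_ID>_<Sentrix_Position>' key (assumed column names)."""
    return f"{row['Sentrix_ID']}_{row['Sentrix_Position']}"


def drop_deleted(rows, deleted):
    """Return only the rows whose sample key is not in the set of deleted samples."""
    return [row for row in rows if sample_key(row) not in deleted]


# Hypothetical sample-sheet content; real sheets contain more columns.
sheet = io.StringIO(
    "Sample_Name,Sentrix_ID,Sentrix_Position\n"
    "s1,200000000001,R01C01\n"
    "s2,200000000001,R02C01\n"
    "s3,200000000002,R01C01\n"
)
rows = list(csv.DictReader(sheet))

# Hypothetical identifiers, as one might copy them from the withdrawal documentation.
deleted = {"200000000001_R02C01"}

kept = drop_deleted(rows, deleted)
print([row["Sample_Name"] for row in kept])
```

Filtering the sample-sheet rather than the idat directory keeps the check independent of how the files are laid out on disk.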

QC published

5.2.2022 The complete QC was published to TSD, along with some updates to the official methylation documentation.

There can still be changes to the data/structure; for example, certain result files exist only in .rds format. Also note the warning that some participants who have recently withdrawn from MoBa will soon have their methylation data removed from p22Methylation.

January 2022

QC plots (pilot) updated with public plots

2022.01.17 See link under plots

QC-result pilot updated

2022.01.14 Plots have been added. Documented on QC results.

met006 - confirmed correct idat files

2022.01.14 The previous met006 idat-files were also corrupt, but the current ones are good (a successful QC has been run on them and will soon be published).

met006 - updated raw-data

The set turned out to be corrupt, and we have now published a better candidate.

met003 qc-results for discussion

2022.01.05 On TSD, under datasets/met003/QC/results, you will find a very rough example of QC results. To be discussed on Slack ...

Also, the general section on QC results has been updated with current status/prototype info.

met004 split in child and parent

2022.01.05 For various reasons, it is an advantage to have child and parent split within a set. This is now documented under the methylation page. Note the use of soft-links ...

You can see this for the met004 and met008 data-sets. (Vigilant users might have observed this in December 2021, but it is only documented now).

December 2021

Tentative QC-directory set up

Finalization of the QC is getting closer. The QC needs certain parameters to run. These have been tentatively described on QC. The files described are available on TSD as well.

Updated sample-sheets (met001-met002)

2021.12.08 The samplesheets of met001 and met002 have been edited.

  • SentrixPosition has been renamed to Sentrix_Position.
  • A column Sex has been added.

November 2021

Updated sample-sheets (met003-met008)

2021.11.10

  • The sets met005, met006 and met007 have had their sample-sheets updated. These now contain information on the underlying sample types, coded as described on sampleTypes. They also contain sex information for the individual, as recorded in the NIPH biobank.
  • Due to an unfortunate typo in Illumina-documentation, the column Sentrix_Position used to be named SentrixPosition. This is fixed for most sample-sheets. The exceptions are the sets met001 and met002, which still need some work but can be used anyway.
  • The sample-sheets that contained a column for sex had the column wrongly named 'gender'. This has been fixed.
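Scripts written against an older copy of a sample-sheet may still expect the old column names. A minimal sketch of bringing such a copy in line with the renames listed above follows. It assumes the sample-sheet is a plain CSV file with the column names in its first row, and that 'gender' was replaced by 'Sex'; neither assumption is confirmed on this page.

```python
import csv
import io

# Old header -> current header. Only the renames mentioned in this changelog;
# 'Sex' as the replacement for 'gender' is an assumption.
RENAMES = {"SentrixPosition": "Sentrix_Position", "gender": "Sex"}


def normalise_header(fieldnames):
    """Map old column names to current ones and leave all others untouched."""
    return [RENAMES.get(name, name) for name in fieldnames]


def normalise_sheet(text):
    """Return the CSV text with a normalised header row; data rows are unchanged."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    rows[0] = normalise_header(rows[0])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical old-style sample-sheet.
old_sheet = "Sample_Name,SentrixPosition,gender\ns1,R01C01,F\n"
print(normalise_sheet(old_sheet))
```

Because unknown column names pass through unchanged, the same function can be applied to sample-sheets that have already been updated.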

Updated sample-sheets (met003, met004, met008)

The sets met003, met004 and met008 have had their sample-sheets updated. They now contain information on the underlying sample-type, coded as described on sampleTypes. They also contain gender information for the individual.

The whole setup is experimental and might be changed later. The sample sheets are the same as before, but new columns have been added at the end.

October 2021

met008

2021.10.19 2021.10.13 Raw data/idats for met008 are published.

met004 and met005

2021.10.13 Raw data/idats for met004 are published.

met005 was already published but has been restructured so that it is organized just like the other sets:

  • The samplesheet now sports the chip-name.
  • The idats are now directly under the idats directory; they used to be in subdirectories.
  • For a very limited time, the previous version of the dataset is found under met005.old. It will be deleted without warning.
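Code that lists idat files can be written so that it works with both the old layout (idats in subdirectories) and the new one (idats directly under the idats directory). The sketch below uses a recursive glob for this. The directory and file names are made up for the demonstration and do not reflect actual paths on TSD.

```python
from pathlib import Path
import tempfile


def find_idats(idat_dir):
    """Return all .idat files below idat_dir, with or without subdirectories."""
    return sorted(Path(idat_dir).rglob("*.idat"))


# Demonstration on a temporary directory that mixes both layouts.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "idats"
    (root / "200000000001").mkdir(parents=True)
    # Old layout: file inside a subdirectory (hypothetical names).
    (root / "200000000001" / "200000000001_R01C01_Grn.idat").touch()
    # New layout: file directly under the idats directory.
    (root / "200000000002_R01C01_Red.idat").touch()
    found = [path.name for path in find_idats(root)]

print(found)
```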

met003

2021.10.11 Raw data/idats for met003 are published.

met002

2021.10.08 Raw data/idats for met002 are published.

met001

2021.10.07 Raw data/idats for met001 are published.

met006 and met007

2021.10.06 idat-files for these sets have been published. See Methylation for details. met006 might be corrupted.

What to expect

Raw data sets

We hope to publish these during October 2020.

met005 is going to have its directory-structure changed.

QC

Every raw data set will have a QC'ed version; see QC.