TiSM (This is Serious Mum) or Targeted Somatic Mutation - HealthHackAu2013/wiki GitHub Wiki

(update 25 October 2016) Robyn Lindley presents Lecture and receives R Douglas Wright Award http://melbuni.e-newsletter.com.au/link/id/zzzz57f1c2d6da225135Pzzzz57b2b0d720491222/page.html

THE HUNT FOR THE SOURCE OF MUTATIONS IN CANCER:

A story of discovery and innovation at the coalface

Presented by

Dr Robyn A. Lindley Department of Pathology, Faculty of Medicine, Dentistry & Health Sciences, The University of Melbourne and GMDx Co Pty Ltd

Dr. Robyn A. Lindley is an Honorary Senior Fellow in the Department of Pathology, Faculty of Medicine, Dentistry & Health Sciences at the University of Melbourne. She is also an inventor, and Chief Scientific Officer (CSO) and Director of GMDx Co Pty Ltd. Her first degrees are in physics and informatics, but Robyn has morphed into an internationally recognized immunogeneticist who has been publishing on the molecular mechanisms of evolution and somatic hypermutation (SHM) in the immune system for almost two decades. SHM is the cellular mechanism responsible for antibody diversity.

This journey began in 1996 when she joined with ANU Immunologists Edward J Steele and Robert V Blanden as co-authors of Lamarck’s Signature, a best-selling science book that laid out the scientific evidence for antibody diversity and the evolution of immune recognition via reverse transcription-coupled feedback loops.1 This was followed by a breakthrough paper in 2006 with ANU’s Edward J Steele and Georg F Weiller that provided the first data-driven evidence for the reverse transcription-based mechanism in antibody-producing cells.2 This was a crucial step for understanding the processes involved in the accumulation of unwanted genetic mutations in somatic cells that may ultimately give rise to cancer. In 2010, she published the highly regarded The Soma, a synthesis of evolutionary genetic mechanisms.3 In the same year, she and Edward J Steele showed that the diagnostic strand-biased mutation patterns are the same for mouse antibody genes and for all human cancers analyzed (in part or in toto).4 The overwhelming conclusion was that all cancers displayed a similar form of dysregulated, error-prone somatic hypermutation that is normally tightly edited in normal hypermutating B cells in the immune system.

This research prompted some key questions: “What are the root causes of cancer in humans?” and “How can these be identified?” In mid-2012, Robyn Lindley began to answer these questions by interrogating several public mutation databases. Using her skills in recognizing patterns amongst complex series of numbers, she shook off the pre-existing dogma that ‘mutations occur randomly’. She discovered that most mutations in the genes of cancer cells occur in a highly non-random fashion, and that the patterns observed are associated with proteins called deaminases that result in a high level of mutagenesis in cancer cells. Most mutations were found to occur at specific sites, codon-context motif signatures. The processes involved are referred to as Targeted Somatic Mutation (TSM). Realizing the diagnostic and prognostic potential of these findings, she began filing patent applications in late 2012.5 The first scientific paper describing TSM processes was published in Cancer Genetics in May 2013.6 Recently, in 2016, she published a paper showing that some changes in TSM signatures arising in an individual can be used to predict progression or recurrence in ovarian cancer.7

To fund this research Robyn has developed several scientific and commercial partnerships. The Victorian Government Department of State Development & Business Innovation has supported the development of a new clinical grade TSM test platform. GMDx Co Pty Ltd was established, and is now involved in clinical trials in cancer and viral disease to lay the foundations for a global genetic testing company. Using next generation genome-wide sequencing, the TSM test platform can identify the likely source of unwanted new mutations arising in an individual. These may be associated with the development of pre-cancerous conditions or cancer progression, such as might occur during and after immunotherapy for late stage cancers, or while suffering from a chronic viral infection such as hepatitis B virus (HBV). This is important because by identifying changes in the source of new mutations arising in an individual, clinicians will be better able to monitor and personalize treatment.

With difficulties funding research, and scientific discoveries that break the rules of long-standing scientific dogma, the journey has not been easy. Yet, in this era of rapid and relatively cheap genome-wide sequencing, Robyn Lindley’s discoveries give straightforward biological meaning to cancer mutation signatures. She will tell this unfolding scientific story of the discovery of the underlying rules governing TSM, and of the path to funding clinical applications, in a way all those interested will understand.

References: 1 Steele EJ, Lindley RA, Blanden RV 1998. Lamarck’s Signature: How retrogenes are changing Darwin’s Natural Selection. Ed. Paul Davies, Allen & Unwin.

2 Steele EJ, Lindley RA, Wen J, Weiller GF 2006. Computational analyses show A-to-G mutations correlate with nascent mRNA hairpins at somatic hypermutation hotspots. DNA Repair 5:1346.

3 Lindley RA 2010. The Soma: How our genes really work and why that changes everything. CYO Foundation, ISBN 1451525648.

4 Steele EJ, Lindley RA 2010. Somatic mutation patterns in non-lymphoid cancers resemble the strand biased somatic hypermutation spectra of antibody genes. DNA Repair 9:600.

5 Lindley RA 2014. Methods for determining the causes of somatic mutagenesis. PCT/AU2013/001275.

6 Lindley RA 2013. The importance of codon context for understanding the Ig-like somatic hypermutation strand-biased patterns in TP53 mutations in breast cancer Codon structure flags TP53 mutation targets in breast cancer. Cancer Genetics 206:222.

7 Lindley RA et al 2016. Targeted Somatic Mutation (TSM) profiles for serous ovarian adenocarcinoma predict the likelihood of recurrence. Cancer Medicine, In Press August 2016.

Overview (updated 17 July 2016)

Our researcher-founder and CSO/CEO, Dr. Robyn A. Lindley, has a problem with the automation of genomic data. Robyn has manually entered and managed sample and patient records with cancer mutation data, and then created comparison frameworks for the TSM Test approach outlined in her paper published in Cancer Genetics, Volume 206, Issue 6, Pages 222-226, June 2013:


The importance of codon context for understanding the Ig-like somatic hypermutation strand-biased patterns in TP53 mutations in breast cancer

Dr. Robyn A. Lindley

Cancer Genetics 1 June 2013 (volume 206 issue 6 Pages 222-226 DOI: 10.1016/j.cancergen.2013.05.016)


See opportunity & vision for the future... http://venturebeat.com/2013/12/16/ibm-reveals-its-top-five-predictions-for-the-next-five-years/2/

Doctors will use your DNA to keep you well Global cancer rates are expected to jump by 75 percent by 2030. IBM wants computers to help doctors understand how a tumor affects a patient down to their DNA. They could then figure out what medications will best work against the cancer, and fulfill it with a personalized cancer treatment plan. The hope is that genomic insights will reduce the time it takes to find a treatment down from weeks to minutes.

“The ability to correlate a person’s DNA against the results of treatment with a certain protocol could be a huge breakthrough,” Meyerson said. It’ll be able to scan your DNA and find out if any magic bullet treatments exist that will address your particular ailment.

HealthHack Team (name & bio's & mobile)

Team Lead - Leon-Gerard Vandenberg at [email protected] CXO -Silicon Valley experience 03 9024 3483 -see http://bit.ly/LGVandenberg

GMDx Pty Ltd - formerly known as Melville Diagnostics P/L Executive

The Problem

The TSM Test has been presented in her paper and validated by Dr. Robyn A. Lindley. The veracity of the TSM Test is directly verified by several large databases of cancer patient mutations (e.g. the Wellcome Trust Sanger Institute's COSMIC database in the UK, and the WHO’s IARC database of TP53 gene mutations in France). Robyn's research has relied on these sources to validate the test and to publish the results in Cancer Genetics.

Each mutagenic protein included in the test is revealed in the priority publication. The technical veracity of the methods used is revealed in the research database used by Dr. Lindley, which is published online as supplementary material in an Excel file. All source cancer mutation data are publicly available in the WHO’s IARC database.

For future presentations to industry and researchers, it is extremely wasteful for Robyn and others to perform these manual, time-consuming data-wrangling steps to pre-format genomic data and prepare the files, gene references and codon analysis needed to run a TSM Test on patient or sample data.

The Solution

The TiSM team devised a quick, partially hosted solution for Robyn and others. It uses a combination of file handling and string manipulation routines to manage, pre-process, error-check and then manipulate CSV files downloaded directly from the COSMIC (Catalogue of Somatic Mutations in Cancer) database, and adds codon data from cDNA structures (per gene) retrieved from COSMIC's FTP service.

The criterion for success was to reach MVP status (with minimal testing) over the HealthHack weekend, with the following deliverables:

  1. Open API and Source Code for dealing with COSMIC Gene and Mutation CSV Files

  2. Source Code and methods to re-structure Codon information around the Mutation point using industry standard Ensembl format

  3. TSM Test Tabulation and Statistics result page hosted by using R and RStudio - Shiny
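As an illustration of the second deliverable, here is a minimal sketch of locating the codon around a mutation point. It is written in Python for brevity and is not the team's C#/.NET code. It assumes substitutions are written in the form `c.742C>T` and that the coding sequence for the gene is already in hand as a string; the names `codon_context` and `CodonContext` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed notation for a simple substitution, e.g. "c.742C>T".
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


class CodonContext(NamedTuple):
    codon_number: int    # 1-based index of the codon in the coding sequence
    codon_position: int  # 1, 2 or 3: where the mutated base sits in the codon
    ref_codon: str       # codon before the substitution
    mut_codon: str       # codon after the substitution


def codon_context(cds: str, mutation: str) -> Optional[CodonContext]:
    """Return the codon around a substitution, or None if it cannot be placed."""
    match = _SUBSTITUTION.match(mutation.strip())
    if match is None:
        return None  # not a simple substitution (indel, unknown position, ...)
    pos = int(match.group(1))
    ref, alt = match.group(2), match.group(3)
    cds = cds.upper()
    if pos < 1 or pos > len(cds):
        return None  # position falls outside the sequence
    if cds[pos - 1] != ref:
        return None  # reference base disagrees with the sequence: flag as an error
    start = (pos - 1) // 3 * 3           # 0-based index of the codon's first base
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        return None  # incomplete trailing codon
    offset = (pos - 1) % 3               # 0-based position inside the codon
    mut_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    return CodonContext(start // 3 + 1, offset + 1, ref_codon, mut_codon)
```

Returning `None` for anything that cannot be placed keeps the error-checking step separate from the tabulation step, so rejected rows can be counted and reviewed rather than silently dropped.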

Application/Relevance

GMDx Diagnostic's TSM Test identifies the protein source of mutations using a proprietary Targeted Somatic Mutation (TSM) test, as described in Cancer Genetics.

The TSM Test provides early diagnosis → PREVENTION OR TREATMENT before symptoms become evident. It gives people with suspect breast, colon, cervical or other tissue samples, or with an undiagnosed illness, an indication as to whether or not rogue AID/APOBEC deaminase processes are actively causing mutations, so that they can receive more targeted treatment and possibly avoid more invasive procedures or surgery. The test will identify which proteins are actively causing mutations to be generated.

Alternative genetic tests that rely on SNPs (single nucleotide polymorphisms) can only indicate cancer predisposition. Cytology tests are almost always used to diagnose cancer, with many results ‘uncertain’. These tests are only used once the mutations generated have resulted in abnormal cell growth.

Clinicians and patients will therefore prefer to request our Targeted Somatic Mutation (TSM) Test to:

  • Possibly prevent unnecessary invasive and costly procedures and treatments;
  • Provide information on active mutagenic agents that will enable more personalised treatment;
  • Provide patient reassurance and more certainty around a diagnosis that a suspect tissue is ‘benign’.

Four patents are pending for the suite of TSM tests, and these applications are owned by GMDx P/L. The GMDx executive is seeking partnerships, licensing and commercialisation funding.

Datasets

The exported sheets are updated with each COSMIC release. The file can be found here: [ftp://ftp.sanger.ac.uk/pub/CGP/cosmic/data_export/CosmicCompleteExport_v[xx][release date].tsv.gz](ftp://ftp.sanger.ac.uk/pub/CGP/cosmic/data_export/CosmicCompleteExport_v[xx][release date].tsv.gz) This file contains all the samples analysed for every gene in COSMIC, whether found with or without mutation.

see also ftp://ftp.sanger.ac.uk/pub/CGP/cosmic/ -> fasta.tgz
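Because the complete export is a large gzipped, tab-separated file, it can be streamed row by row rather than loaded whole. The sketch below, in Python rather than the team's C#/.NET, shows one way to do that once the file has been downloaded. The column headers `Gene name` and `Mutation CDS` are assumptions to be checked against the header line of the release actually used, and `iter_gene_rows` is a hypothetical helper name.

```python
import csv
import gzip
from typing import Dict, Iterator

# Assumed column headers; verify against the first line of the real export.
GENE_COLUMN = "Gene name"
CDS_COLUMN = "Mutation CDS"


def iter_gene_rows(path: str, gene: str) -> Iterator[Dict[str, str]]:
    """Yield rows for one gene from a gzipped, tab-separated export file."""
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # Fail early if the expected columns are not present.
        missing = {GENE_COLUMN, CDS_COLUMN} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"export is missing columns: {sorted(missing)}")
        for row in reader:
            # Skip other genes and samples recorded without a coding mutation.
            if row[GENE_COLUMN] == gene and row[CDS_COLUMN]:
                yield row
```

A call such as `for row in iter_gene_rows("export.tsv.gz", "TP53"): ...` (with a placeholder file name) would then visit only the mutated TP53 rows, one at a time.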

Restrictions/Data Sharing:

Code of Access COSMIC Data sharing policies will only be sustainable if researchers and users of this demonstration commit to:

  • conducting appropriate research
  • protecting the confidentiality of managed access data sets
  • carefully communicating research results
  • respecting rights to first publication and to acknowledgement
  • sharing, in turn, resulting data and analysis with the research community

We expect developers & researchers downloading and using TSM API & source code will respect the same data sharing policies.

Links

Tech stack

  • MS Visual Studio / .NET
  • R, RStudio Shiny

Tradeoffs/analysis

Retrieving CSV files from COSMIC went relatively well after we understood Ensembl formats and cDNA.

After the team leader's presentation and a variety of team discussions, some of which required cross-pollination, we were underway. Further, after some Q&A with Robyn, the basic requirements for codon manipulation and analysis were understood. We are not researchers, so interpreting results was not our job; we just needed to validate Robyn's test subject data and look for the same expected results.

The Ensembl Perl API was a complete waste of coding and scripting time, and of the machine setup needed to support Perl. The data retrieved was not structured quite as expected and some files could not be retrieved, i.e. the API returned NULL because some Ensembl accession IDs from COSMIC had been retired and no longer matched Ensembl IDs. This was not a risk we wanted to entertain over a weekend work effort, so we reverted to COSMIC gene references via FTP.
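One way to work from locally downloaded gene references, and to see up front which genes have no reference sequence, is sketched below in Python (not the team's C#/.NET code). It assumes the references are plain FASTA text in which the first word of each `>` header is the gene name, which should be checked against the real files; `parse_fasta` and `split_resolved` are hypothetical helper names.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map the first word of each '>' header to its concatenated sequence."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            current = words[0] if words else None
            if current is not None:
                records[current] = []  # a repeated header replaces the earlier record
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def split_resolved(
    names: Iterable[str], references: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Separate names that have a reference sequence from those that do not."""
    resolved: Dict[str, str] = {}
    unresolved: List[str] = []
    for name in names:
        if name in references:
            resolved[name] = references[name]
        else:
            unresolved.append(name)
    return resolved, unresolved
```

Collecting the unresolved names in a list, instead of failing on the first missing one, gives a single report of which genes need a manual look before a run starts.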

Using MS C# and .NET was not our first choice, but the available resources were comfortable with this environment (for this early TSM Test in a test harness or lab environment). Options including online delivery over the web would carry security requirements for trust and patient confidentiality, so local operation on a lab PC is a good fit at this time.

Future functionality

What we really want to do next is finalise the MVP for the TSM Test process with a more fleshed-out user interface, and provide an API library so that integrations with hospital or researcher patient data can be streamlined.

This will provide Robyn, our researchers and external researchers with the support required for ongoing efficacy trials. Patient data handling can then be streamlined, and the TSM Test can start to be used by more novice clinicians.