CNV - deaconjs/ThousandVariantCallersRepo GitHub Wiki

caller orig pub from study source
bic-seq2 2016 Harvard, Park lab study source
clamms 2016 Regeneron Genetics Center study source
cnvkit 2016 UCSF study source
erds-pe 2016 Harbin Institute of Technology, China study source
ioncopy 2016 Charité University Hospital, Berlin study source
popsv 2016 McGill U Canada, Bourque lab study source
srbreak 2016 University of Otago, New Zealand study source
triocnv 2016 Harbin Institute of Technology, China study source
as-genseng 2015 UNC Chapel Hill study source
codex 2015 U Penn Philadelphia study source
conserting 2015 St Jude study source
copywriter 2015 Netherlands Cancer Institute study source
falcon 2015 UC Davis study source
grom-rd 2015 Grigoriev-Lab study source
modsara 2015 Yale, Zhang lab study source
abscn-seq 2014 UCSD, Messer study source
adtex 2014 U Melbourne, Halgamuge study source
canoes 2014 Columbia U study source
climat 2014 U Hefei, China study source
cnvcapseq 2014 Imperial College London study source
cnvoffseq 2014 Imperial College London, Coin study source
cnvrd2 2014 University of Otago Dunedin, New Zealand study source
m-hmm 2014 NHGRI study source
oncocnv 2014 Curie Institute study source
patterncnv 2014 Mayo study source
pyloh 2014 UC Irvine study source
qdnaseq 2014 VU University Medical Center study source
cnvem 2013 UCLA study source
excavator 2013 U Florence study source
fishingcnv 2013 McGill U Canada, Majewski study source
matchclip 2013 U Pennsylvania, Philadelphia study source
oncosnpseq 2013 Imperial College London study source
patchwork 2013 Uppsala University, Sweden study source
theta2 2013 Brown, Raphael study source
absolute 2012 Broad, Getz study source
apolloh 2012 British Columbia Cancer Agency study source
cnanorm 2012 U Leeds study source
cnvhitseq 2012 Imperial College London study source
conifer 2012 U Wash Seattle study source
contra 2012 Peter MacCallum Cancer Centre study source
cops 2012 Institute of Applied Bioinformatics, Bangalore India study source
erds 2012 Duke University study N/A
exomedepth 2012 U Cambridge, Nejentsev study source
magnolya 2012 Netherlands Bioinformatics Centre study source
seqcbs 2012 Stanford University study source
xhmm 2012 Mt Sinai, Purcell study source
bic-seq 2011 Harvard study source
cnvnator 2011 Yale, Gerstein study source
exomecnv 2011 Dana-Farber Cancer Institute study source
exomecopy 2011 U Oslo study source
jointslm 2011 Careggi Hospital, Italy study source
readdepth 2011 Baylor College of Medicine study source
cnaseg 2010 Li Ka Shing Centre, UK study source
cnver 2010 U Toronto study source
copyseq 2010 EMBL study source
freec 2010 Institut Curie, Barillot study source
novelseq 2010 Simon Fraser University, Canada study source
rsw-seq 2010 Harvard Medical School, Park lab study source
cmds 2009 WashU St Louis, Province study source
cnv-seq 2009 National University of Singapore study source
rdxplorer 2009 Cold Spring Harbor study source
segseq 2009 Broad Institute study

bic-seq2

Notes: somatic calls.

clamms

Notes: exome capture input. normalizes GC content

Algorithm: HMM/mixture model

Description: Copy number estimation using Lattice-Aligned Mixture Models. Evaluate the adherence of CNV calls from CLAMMS and four other algorithms to Mendelian inheritance patterns on a pedigree

cnvkit

Notes: targeted sequencing input. somatic calls. uses both on-target and off-target reads

Used by: biocondor

Algorithm: CBS algorithm (circular binary segmentation)

Description: CNV detection that takes advantage of both on- and off-target sequencing reads and applies a series of corrections to improve accuracy in copy number calling.
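
To illustrate the segmentation step, here is a minimal sketch of plain recursive binary segmentation on log2 copy ratios. It is a simpler relative of CBS, not CNVkit's implementation: real CBS tests circular (two-breakpoint) splits and uses a permutation test, whereas the `min_gain` threshold and the function names here are hypothetical.

```python
from statistics import mean

def best_split(values):
    """Return (index, gain) of the split that most reduces the residual sum of squares."""
    n = len(values)
    total = sum(values)
    best_index, best_gain = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        right_sum = total - left_sum
        # Reduction in squared error when one mean is replaced by two means.
        gain = left_sum ** 2 / i + right_sum ** 2 / (n - i) - total ** 2 / n
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain

def segment(values, min_gain, offset=0):
    """Recursively split until no split gains at least min_gain (assumed threshold)."""
    index, gain = best_split(values)
    if index is None or gain < min_gain:
        return [(offset, offset + len(values), mean(values))]
    left = segment(values[:index], min_gain, offset)
    right = segment(values[index:], min_gain, offset + index)
    return left + right

# Illustrative log2 ratios: a four-bin gain flanked by neutral bins.
log_ratios = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0, 0.0, -0.1, 0.1, 0.0]
segments = segment(log_ratios, min_gain=0.5)
```

Each returned tuple is (start bin, end bin, mean log2 ratio), with the end index exclusive.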

erds-pe

Algorithm: paired HMM

ioncopy

Notes: for panel/amplicon, tumor population only; no normal controls used

Algorithm: amplicon read depth population statistics calling from tumor-only cohort.

Description: estimate a null distribution of copy numbers using outlier-robust statistics and assess the significance of CNAs by comparison with this null distribution. In this way, p-values are obtained for each amplicon in each tumor that are subsequently corrected for multiple hypothesis testing. ... For all simulated situations, CN gains of 5 and more can be detected with high sensitivity and specificity. Detection of CN gains of 4 is feasible in some situations, for example when the number of genes under investigation is low.
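
The null-distribution idea can be sketched as follows. This is an illustration under stated assumptions, not ioncopy's code: the null is summarised by the median and MAD, p-values assume a normal null, and Benjamini-Hochberg is used as one possible multiple-testing correction. The function names and the example values are hypothetical.

```python
import math
import statistics

def robust_gain_pvalues(log_ratios):
    """One-sided p-values for copy number gain against a median/MAD null."""
    center = statistics.median(log_ratios)
    mad = statistics.median([abs(x - center) for x in log_ratios])
    if mad == 0:
        raise ValueError("MAD is zero; cannot scale the null distribution")
    sigma = 1.4826 * mad  # MAD rescaled to match a normal standard deviation
    # Upper-tail probability of a standard normal at each z-score.
    return [0.5 * math.erfc((x - center) / (sigma * math.sqrt(2)))
            for x in log_ratios]

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Illustrative per-amplicon log2 ratios; index 8 carries a strong gain.
amplicon_log_ratios = [0.02, -0.05, 0.01, 0.03, -0.02, 0.00, 0.04, -0.01, 1.30, -0.03]
adjusted = benjamini_hochberg(robust_gain_pvalues(amplicon_log_ratios))
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
```

Using the median and MAD keeps the amplified amplicon from inflating the estimated spread of the null, which is the point of outlier-robust statistics here.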

popsv

Notes: population-based calls.

Description: Population-based detection of structural variation from High-Throughput Sequencing

srbreak

Algorithm: read-depth first for CNV-region detection, followed by split read analysis to locate breakpoints.

Compared to: Pindel, DELLY, MATCHCLIP, SoftSearch, CNVnator

Description: combines a read-depth-based approach and a split-read-based approach to identify breakpoints for different duplication/deletion events inside a large CNVR. The strength of this pipeline comes through its use of multiple samples in one CNV genotype group to identify common breakpoints for that group. It is able to use both single-end and paired-end reads from HTS data.

triocnv

Notes: trio calls.

as-genseng

Notes: combines allele-specific RC with total RC

Algorithm: HMM; models total and allele-specific read counts separately

Description: incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data

codex

Notes: exome only. population-based calls. includes terms that specifically remove biases due to GC content, exon length and capture and amplification efficiency, and latent systematic artifacts

Algorithm: Poisson latent factor/recursive segmentation

Description: relies on the availability of multiple samples processed using the same sequencing pipeline. Unlike current approaches, CODEX uses a Poisson log-linear model that is more suitable for discrete count data. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and capture and amplification efficiency, and latent systematic artifacts

conserting

Notes: Produces CNV/SV calls. depends on Bambino and Picard

Algorithm: regression tree segmentation

Description: integrates read-depth change with structural variation (SV) identification through an iterative process of segmentation by read depth, segment merging, and localized SV detection. Uses recursive partitioning techniques to find the transition point for read depth changes.

copywriter

Notes: for targeted sequencing. reference-free

Algorithm: CBS algorithm, uses off-target sequencing reads

Description: exploiting ‘off-target’ sequence reads. CopywriteR allows for extracting uniformly distributed copy number information, can be used without reference, and can be applied to sequencing data obtained from various techniques including chromatin immunoprecipitation and target enrichment on small gene panels

falcon

Notes: somatic calls. calculates allele-specific copy numbers, deduce clonal history

Algorithm: bivariate mixed Binomial, Bayesian criterion for count estimates

Description: based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage

grom-rd

Notes: wgs, no control req

Description: excessive coverage masking, GC bias mean and variance normalization, GC weighting, dinucleotide repeat bias detection and adjustment, and a size-varying sliding window CNV search.
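
As a rough illustration of GC bias mean normalization only (the variance normalization, GC weighting and repeat handling listed above are not covered), the sketch below rescales each window's depth by the ratio of the global mean depth to the mean depth of windows with similar GC content. The function name, the bin width, and the example values are assumptions, not GROM-RD's parameters.

```python
from collections import defaultdict
from statistics import mean

def gc_normalize(depths, gc_fractions, bin_width=0.05):
    """Scale each window so every GC bin ends up with the global mean depth."""
    global_mean = mean(depths)
    by_bin = defaultdict(list)
    for depth, gc in zip(depths, gc_fractions):
        by_bin[int(gc / bin_width)].append(depth)
    bin_mean = {b: mean(values) for b, values in by_bin.items()}

    normalized = []
    for depth, gc in zip(depths, gc_fractions):
        expected = bin_mean[int(gc / bin_width)]
        # A bin with zero mean depth carries no signal to rescale.
        normalized.append(depth * global_mean / expected if expected > 0 else 0.0)
    return normalized

# Illustrative windows: high-GC windows are sequenced about twice as deeply.
depths = [10, 12, 20, 24]
gc_fractions = [0.32, 0.33, 0.61, 0.62]
normalized = gc_normalize(depths, gc_fractions)
```

After normalization the low-GC and high-GC windows sit on the same scale, so remaining depth differences are less likely to reflect GC content alone.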

modsara

Algorithm: screening and ranking algorithm

abscn-seq

Notes: estimates purity & ploidy. exome only.

adtex

Notes: estimates purity and ploidy. exome only. testing somatic calling.

Algorithm: HMM

Description: uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method makes explicit detection on chromosome arm level events, which are commonly found in tumour samples

canoes

Notes: exome only.

Algorithm: models sequence coverage using the negative binomial distribution
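
A minimal sketch of how a negative binomial coverage model can be used to score copy number states for one exon. It is not CANOES's implementation: the mean/variance parameterisation is standard, but holding the dispersion fixed across copy number states and scaling the mean by copy number / 2 are simplifying assumptions, and all names and values are hypothetical.

```python
import math

def nb_log_pmf(count, mean, variance):
    """Log probability of a read count under a negative binomial with given mean and variance."""
    if variance <= mean:
        raise ValueError("negative binomial requires variance > mean")
    size = mean * mean / (variance - mean)  # dispersion ("size") parameter
    prob = mean / variance                  # success probability
    return (math.lgamma(count + size) - math.lgamma(size) - math.lgamma(count + 1)
            + size * math.log(prob) + count * math.log1p(-prob))

def most_likely_copy_number(count, diploid_mean, diploid_variance, states=(1, 2, 3)):
    """Pick the copy number whose scaled negative binomial best explains the count."""
    size = diploid_mean ** 2 / (diploid_variance - diploid_mean)
    best_state, best_loglik = None, -math.inf
    for copies in states:
        mean = diploid_mean * copies / 2    # assumed: coverage scales with copy number
        variance = mean + mean * mean / size  # assumed: dispersion stays constant
        loglik = nb_log_pmf(count, mean, variance)
        if loglik > best_loglik:
            best_state, best_loglik = copies, loglik
    return best_state

# Illustrative exon with an expected diploid depth of 100 reads and variance 400.
calls = [most_likely_copy_number(c, 100, 400) for c in (52, 100, 160)]
```

The negative binomial allows the variance to exceed the mean, which a Poisson model cannot, so overdispersed exome coverage is less likely to be mistaken for a copy number change.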

climat

Notes: estimates LOH. robust to contamination and aneuploidy

Description: takes integrated analysis of read count and allele frequency derived from sequenced tumor samples, and provides extensive data processing procedures including GC-content and mappability correction of read count and quantile normalization of B allele frequency

cnvcapseq

Notes: targeted resequencing input.

Description: cnvCapSeq integrates evidence from both RD and read pairs (RP) to achieve high breakpoint resolution regardless of coverage uniformity

cnvoffseq

Description: normalization framework for off-target read depth that is based on local adaptive singular value decomposition (SVD). This method is designed to address the heterogeneity of the underlying data and allows for accurate and precise CNV detection and genotyping in off-target regions.

cnvrd2

Validated vs: cnvnator, cn.mops

Description: first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts.

m-hmm

Algorithm: HMM

oncocnv

Notes: amplicon input.

Description: (i) defining a method to normalize read coverage with a small set of normal control samples and (ii) assigning statistical significance to putative CNAs resulting from the segmentation of normalized profiles

patterncnv

Notes: exome only. somatic calls. WIG format bams for speed

Algorithm: compares paired samples

Description: accounts for the read coverage variations between exons while leveraging the consistencies of this variability across different samples

pyloh

Notes: estimates LOH, purity, ploidy.

Used by: biocondor

Description: deconvolve read mixture to identify reads associated with tumor cells or a particular subclone of tumor cells. Integrate somatic copy number alterations and loss of heterozygosity in a unified probabilistic framework.

qdnaseq

Notes: shallow depth ok, robust to FFPE

Algorithm: read-depth, no paired analysis needed

cnvem

Description: use maximum likelihood to estimate locations and copy numbers of copied regions and implement an expectation-maximization (EM) algorithm

convex

Notes: exome only.

Algorithm: HMM

Description: uses ratio of tumour and matched normal average read depths at each exonic region, to predict the copy gain or loss

excavator

Notes: exome only. efficient processing

Algorithm: read-count/HMM based with 3-step normalization, segmentation

Description: combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states

fishingcnv

Description: compares coverage depth in a test sample against a background distribution of control samples and uses principal component analysis to remove batch effects

matchclip

Description: Our method searches for reads that potentially span the breakpoints of a CNV by screening CIGAR strings. If a long S part is at the 3′(right)-side, we can use its alignment to determine the 5′(left)-side of the breakpoint, and vice versa. Our method searches for two reads that span the same CNV with the long soft-clipped parts at either end in order to locate both breakpoints of the CNV. To ensure the two reads indeed cover the same CNV, we require that they overlap in a certain orientation and their common string includes both of the soft-clipped parts.
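
The CIGAR screening step can be sketched as below. This covers only the first stage (finding reads with a long soft-clipped part and noting which side it is on), not the pairing and overlap checks. The function name and the `min_clip` threshold are hypothetical, not MATCHCLIP's parameters.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_side(cigar, min_clip=20):
    """Return 'left', 'right', or None depending on where a long soft clip sits."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Hard clips may flank soft clips, so drop them before looking at the ends.
    ops = [item for item in ops if item[1] != "H"]
    if len(ops) < 2:
        return None
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if ops[-1][1] == "S" else 0
    if max(left, right) < min_clip:
        return None
    return "left" if left >= right else "right"

# Illustrative reads: two candidates that could bracket the same CNV, and two rejects.
sides = [soft_clip_side(c) for c in ("30S70M", "70M30S", "100M", "5S90M5S")]
```

A read clipped on the right and a read clipped on the left are the kind of pair the description goes on to match against each other.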

oncosnpseq

Algorithm: mixed Binomial model for multiple tumor genotypes contaminated with normal cells, HMM resolves most likely set of mixtures.

patchwork

Notes: uses WGS input.

theta

Notes: estimates purity & ploidy. somatic calls. efficient

Used by: biocondor

Algorithm: maximum likelihood mixture decomposition problem

Description: infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations

absolute

Notes: estimates LOH.

Algorithm: estimates loss of heterozygosity

Description: detect subclonal heterogeneity and somatic homozygosity, and it can calculate statistical sensitivity for detection of specific aberrations

apolloh

Notes: estimates LOH. somatic calls.

Algorithm: estimates loss of heterozygosity.

Description: a hidden Markov model (HMM) for predicting somatic loss of heterozygosity and allelic imbalance in whole tumour genome sequencing data.

cnanorm

Notes: somatic calls.

Description: identify the multi-modality of the distribution of smoothed ratios. Then we use the estimates of the mean (modes) to identify underlying ploidy and the contamination level, and finally we perform the correction.

cnvhitseq

Description: jointly models evidence from RD, RPs and SRs at the population level. pool information across individual samples and reconcile copy number differences among data sources

conifer

Notes: exome only. population-based calls.

Description: this method can be used to reliably predict (94% overall precision) both de novo and inherited rare CNVs involving three or more consecutive exons

contra

Notes: exome only.

Algorithm: CBS algorithm

Description: calls copy number gains and losses for each target region based on normalized depth of coverage. Our key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and the estimation of log-ratio variations via binning and interpolation
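
The log-ratio and library-size ideas can be sketched as below. This is a simplification, not CONTRA's method: it works on per-region mean depths rather than base-level log-ratios, omits the binning and interpolation step, and the function name and `pseudocount` are hypothetical.

```python
import math

def region_log_ratios(test_depths, control_depths, pseudocount=1.0):
    """Log2 ratio of test to control depth per region after library-size scaling."""
    # Rescale the test sample so both libraries have the same total depth.
    scale = sum(control_depths) / sum(test_depths)
    return [
        math.log2((test * scale + pseudocount) / (control + pseudocount))
        for test, control in zip(test_depths, control_depths)
    ]

# A test library sequenced twice as deeply with no copy number change.
balanced = region_log_ratios([100, 200], [50, 100])

# Same control, but the third region is gained in the test sample.
with_gain = region_log_ratios([100, 100, 200, 100], [50, 50, 50, 50])
```

Without the scaling step, the deeper test library in the first example would look like a genome-wide gain, which is the imbalance the correction is meant to remove.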

erds

Notes: available?

Description: starts from read depth (RD) information, and integrates other information including paired end mapping (PEM) and soft-clip signature to call CNVs

exomedepth

Notes: exome only. de novo? mendelian

Used by: biocondor

Description: Calls copy number variants (CNVs) from targeted sequence data, typically exome sequencing experiments designed to identify the genetic basis of Mendelian disorders.

magnolya

Notes: reference-free.

Description: enables copy number variation (CNV) detections without using a reference genome. Magnolya directly compares two next-generation sequencing datasets.

seqcbs

Description: based on a simple and flexible inhomogeneous Poisson Process model for sequenced reads. We derive the score and generalized likelihood ratio statistics for this model to detect regions where the read intensity shifts in the target sample, as compared to a reference. We construct a modified Bayes information criterion (mBIC) to select the appropriate number of change points and propose Bayesian point-wise confidence intervals as a way to assess the confidence in the copy number estimates.

xhmm

Notes: exome only.

Algorithm: PCA/HMM

Description: uses principal component analysis (PCA) normalization and a hidden Markov model (HMM) to detect and genotype copy number variation (CNV) from normalized read-depth data from targeted sequencing experiments.
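
The HMM step can be illustrated with a small Viterbi decoder over normalized read-depth z-scores. This is a sketch under assumptions rather than XHMM's model: the three states, the emission means of -3, 0 and +3, the symmetric transition probability `p_change`, and the uniform start are illustrative choices, and PCA normalization is assumed to have happened upstream.

```python
import math

STATES = ("DEL", "DIP", "DUP")
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}  # assumed emission means for z-scores

def log_normal(x, mean, sd=1.0):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def viterbi(zscores, p_change=0.01):
    """Most likely state path for a non-empty list of per-target z-scores."""
    if not zscores:
        return []
    log_stay = math.log(1 - 2 * p_change)
    log_switch = math.log(p_change)

    def step(prev, cur):
        return log_stay if prev == cur else log_switch

    score = {s: math.log(1 / 3) + log_normal(zscores[0], MEANS[s]) for s in STATES}
    backpointers = []
    for z in zscores[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            prev = max(STATES, key=lambda q: score[q] + step(q, cur))
            new_score[cur] = score[prev] + step(prev, cur) + log_normal(z, MEANS[cur])
            pointers[cur] = prev
        backpointers.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Three consecutive low targets are called as a deletion; a lone outlier is not.
run = viterbi([0.1, -0.2, 0.3, -3.1, -2.8, -3.3, 0.2, 0.0])
lone = viterbi([0.0, 0.0, -3.0, 0.0, 0.0])
```

The transition penalty is what separates an HMM from per-target thresholding: a single noisy target does not pay for entering and leaving the deletion state, but a run of them does.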

bic-seq

Notes: somatic

Description: Combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline copy number variations

cnvnator

Notes: no control req

Used by: metasv

Algorithm: read coverage

Description: CNVnator is able to discover CNVs in a vast range of sizes, from a few hundred bases to megabases in length

exomecnv

Notes: exome only.

Description: a statistical method to detect CNV and LOH using depth-of-coverage and B-allele frequencies, from mapped short sequence reads

exomecopy

Notes: exome only.

Description: an HMM for predicting copy number state in exome and other targeted sequencing data using observed read counts and positional covariates

jointslm

Notes: population-based.

readdepth

Notes: no control req

cnaseg

Notes: somatic calls.

cnver

Description: supplements the depth-of-coverage with paired-end mapping information, where mate pairs mapping discordantly to the reference serve to indicate the presence of variation.

copyseq

freec

Notes: no control req

Description: The tool deals with two frequent problems in the analysis of cancer deep-sequencing data: absence of control sample and possible polyploidy of cancer cells.

novelseq

Description: discover the content and location of long novel sequence insertions

rsw-seq

Notes: somatic calls.

cmds

Description: correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis.

cnv-seq

Notes: somatic calls

rdxplorer

Notes: no control req

Description: copy number variants (CNV) detection in whole human genome sequence data using read depth (RD) coverage. CNV detection is based on the Event-Wise Testing (EWT) algorithm

segseq

Notes: somatic calls