CNV - deaconjs/ThousandVariantCallersRepo GitHub Wiki

caller	orig pub	from	study	source
bic-seq2	2016	Harvard, Park lab	study	source
clamms	2016	Regeneron Genetics Center	study	source
cnvkit	2016	UCSF	study	source
erds-pe	2016	Harbin Institute of Technology, China	study	source
ioncopy	2016	Charité University Hospital, Berlin	study	source
popsv	2016	McGill U Canada, Bourque lab	study	source
srbreak	2016	University of Otago, New Zealand	study	source
triocnv	2016	Harbin Institute of Technology, China	study	source
as-genseng	2015	UNC Chapel Hill	study	source
codex	2015	U Penn Philadelphia	study	source
conserting	2015	St Jude	study	source
copywriter	2015	Netherlands Cancer Institute	study	source
falcon	2015	UC Davis	study	source
grom-rd	2015	Grigoriev-Lab	study	source
modsara	2015	Yale, Zhang lab	study	source
abscn-seq	2014	UCSD, Messer	study	source
adtex	2014	U Melbourne, Halgamuge	study	source
canoes	2014	Columbia U	study	source
climat	2014	U Hefei, China	study	source
cnvcapseq	2014	Imperial College London	study	source
cnvoffseq	2014	Imperial College London, Coin	study	source
cnvrd2	2014	University of Otago Dunedin, New Zealand	study	source
m-hmm	2014	NHGRI	study	source
oncocnv	2014	Curie Institute	study	source
patterncnv	2014	Mayo	study	source
pyloh	2014	UC Irvine	study	source
qdnaseq	2014	VU University Medical Center	study	source
cnvem	2013	UCLA	study	source
excavator	2013	U Florence	study	source
fishingcnv	2013	McGill U Canada, Majewski	study	source
matchclip	2013	U Pennsylvania, Philadelphia	study	source
oncosnpseq	2013	Imperial College London	study	source
patchwork	2013	Uppsala University, Sweden	study	source
theta2	2013	Brown, Raphael	study	source
absolute	2012	Broad, Getz	study	source
apolloh	2012	British Columbia Cancer Agency	study	source
cnanorm	2012	U Leeds	study	source
cnvhitseq	2012	Imperial College London	study	source
conifer	2012	U Wash Seattle	study	source
contra	2012	Peter MacCallum Cancer Centre	study	source
cops	2012	Institute of Applied Bioinformatics, Bangalore India	study	source
erds	2012	Duke University	study	N/A
exomedepth	2012	U Cambridge, Nejentsev	study	source
magnolya	2012	Netherlands Bioinformatics Centre	study	source
seqcbs	2012	Stanford University	study	source
xhmm	2012	Mt Sinai, Purcell	study	source
bic-seq	2011	Harvard	study	source
cnvnator	2011	Yale, Gerstein	study	source
exomecnv	2011	Dana-Farber Cancer Institute	study	source
exomecopy	2011	U Oslo	study	source
jointslm	2011	Careggi Hospital, Italy	study	source
readdepth	2011	Baylor College of Medicine	study	source
cnaseg	2010	Li Ka Shing Centre, UK	study	source
cnver	2010	U Toronto	study	source
copyseq	2010	EMBL	study	source
freec	2010	Institut Curie, Barillot	study	source
novelseq	2010	Simon Fraser University, Canada	study	source
rsw-seq	2010	Harvard Medical School, Park lab	study	source
cmds	2009	WashU St Louis, Province	study	source
cnv-seq	2009	National University of Singapore	study	source
rdxplorer	2009	Cold Spring Harbor	study	source
segseq	2009	Broad Institute	study

bic-seq2

Notes: somatic calls.

clamms

Notes: exome capture input. normalizes GC content

Algorithm: HMM/mixture model

Description: Copy number estimation using Lattice-Aligned Mixture Models. Evaluate the adherence of CNV calls from CLAMMS and four other algorithms to Mendelian inheritance patterns on a pedigree

cnvkit

Notes: targeted sequencing input. somatic calls. uses off-target reads

Used by: biocondor

Algorithm: CBS algorithm (circular binary segmentation)

Description: CNV detection that takes advantage of both on- and off-target sequencing reads and applies a series of corrections to improve accuracy in copy number calling.
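
The core idea of circular binary segmentation can be sketched in a few lines. This is a minimal illustration, not CNVkit's implementation: it performs one search for the arc of bins whose mean differs most from the rest, and it skips the noise scaling and permutation test that real CBS uses. The function name and the toy log-ratio values are assumptions made for the example.

```python
from itertools import accumulate
from math import sqrt


def best_circular_segment(log_ratios):
    """Return (start, end, score) for the half-open arc [start, end) whose
    mean differs most from the mean of the remaining bins."""
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two bins")
    prefix = [0.0] + list(accumulate(log_ratios))  # prefix[k] = sum of first k bins
    total = prefix[n]
    best = (0, 0, 0.0)
    for start in range(n):
        for end in range(start + 1, n + 1):
            inside = end - start
            outside = n - inside
            if outside == 0:
                continue  # the arc must leave at least one bin outside
            arc_sum = prefix[end] - prefix[start]
            mean_in = arc_sum / inside
            mean_out = (total - arc_sum) / outside
            # Unscaled two-sample statistic. Real CBS also divides by a noise
            # estimate and assesses significance by permutation.
            score = abs(mean_in - mean_out) / sqrt(1 / inside + 1 / outside)
            if score > best[2]:
                best = (start, end, score)
    return best


# Hypothetical per-bin log2 ratios with a gain spanning bins 3..5.
bins = [0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 0.0, -0.1]
start, end, score = best_circular_segment(bins)
print(f"candidate segment: bins {start}..{end - 1}, score {score:.3f}")
```

A full segmenter would apply this step recursively to each resulting piece until no significant change remains.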

erds-pe

Algorithm: paired HMM

ioncopy

Notes: for panel/amplicon, tumor population only; no normal controls used

Algorithm: amplicon read depth population statistics calling from tumor-only cohort.

Description: estimate a null distribution of copy numbers using outlier-robust statistics and assess the significance of CNAs by comparison with this null distribution. In this way, p-values are obtained for each amplicon in each tumor that are subsequently corrected for multiple hypothesis testing. ... For all simulated situations, CN gains of 5 and more can be detected with high sensitivity and specificity. Detection of CN gains of 4 is feasible in some situations, for example when the number of genes under investigation is low.
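
The statistical recipe in this description (robust null, per-amplicon p-values, multiple-testing correction) can be sketched as follows. This is a hedged illustration rather than ioncopy's actual code: it assumes a normal null centred on the median with spread taken from the MAD, and it uses Benjamini-Hochberg as one possible correction. The function names and cohort values are hypothetical.

```python
from math import erfc, sqrt
from statistics import median


def robust_gain_pvalues(values):
    """One-sided p-values for 'value is higher than the null'. The null centre
    and spread come from the median and MAD (assumed normal null)."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    scale = 1.4826 * mad  # converts MAD to a standard deviation under normality
    if scale == 0:
        raise ValueError("zero spread; cannot build a null distribution")
    return [0.5 * erfc((v - centre) / (scale * sqrt(2))) for v in values]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical copy-number estimates for one amplicon across ten tumors.
copy_numbers = [2.0, 2.1, 1.9, 2.05, 1.95, 2.0, 2.1, 1.9, 6.0, 2.0]
adjusted = benjamini_hochberg(robust_gain_pvalues(copy_numbers))
gains = [i for i, p in enumerate(adjusted) if p < 0.05]
print("tumors with a significant gain:", gains)
```

Because the median and MAD ignore a minority of extreme values, the amplified tumor does not inflate the null it is tested against.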

popsv

Notes: population-based calls.

Description: Population-based detection of structural variation from High-Throughput Sequencing

srbreak

Algorithm: read-depth first for CNV-region detection, followed by split read analysis to locate breakpoints.

Compared to: Pindel, DELLY, MATCHCLIP, SoftSearch, CNVnator

Description: combines a read-depth-based approach and a split-read-based approach to identify breakpoints for different duplication/deletion events inside a large CNVR. The strength of this pipeline comes through its use of multiple samples in one CNV genotype group to identify common breakpoints for that group. It is able to use both single-end and paired-end reads from HTS data.

triocnv

Notes: trio calls.

as-genseng

Notes: combines allele-specific RC with total RC

Algorithm: HMM/model total and allele specific separate

Description: incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data

codex

Notes: exome only. population-based calls. includes terms that specifically remove biases due to GC content, exon length and capture and amplification efficiency, and latent systematic artifacts

Algorithm: Poisson latent factor/recursive segmentation

Description: relies on the availability of multiple samples processed using the same sequencing pipeline. Unlike current approaches, CODEX uses a Poisson log-linear model that is more suitable for discrete count data. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and capture and amplification efficiency, and latent systematic artifacts

conserting

Notes: produces CNV/SV calls. depends on Bambino and Picard

Algorithm: regression tree segmentation

Description: integrates read-depth change with structural variation (SV) identification through an iterative process of segmentation by read depth, segment merging, and localized SV detection. Uses recursive partitioning techniques to find the transition point for read depth changes.

copywriter

Notes: for targeted sequencing. reference-free

Algorithm: CBS algorithm, uses off-target sequencing reads

Description: exploiting ‘off-target’ sequence reads. CopywriteR allows for extracting uniformly distributed copy number information, can be used without reference, and can be applied to sequencing data obtained from various techniques including chromatin immunoprecipitation and target enrichment on small gene panels

falcon

Notes: somatic calls. calculates allele-specific copy numbers, deduce clonal history

Algorithm: bivariate mixed Binomial, Bayesian criterion for count estimates

Description: based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage

grom-rd

Notes: WGS input. no control req

Description: excessive coverage masking, GC bias mean and variance normalization, GC weighting, dinucleotide repeat bias detection and adjustment, and a size-varying sliding window CNV search.

modsara

Algorithm: screening and ranking algorithm

abscn-seq

Notes: estimates purity & ploidy. exome only.

adtex

Notes: estimates purity and ploidy. exome only. testing somatic calling.

Algorithm: HMM

Description: uses two Hidden Markov Models to predict copy number and genotypes and computationally resolves polyploidy/aneuploidy, normal cell contamination and signal baseline shift. Our method makes explicit detection on chromosome arm level events, which are commonly found in tumour samples

canoes

Notes: exome only.

Algorithm: models sequence coverage using the negative binomial distribution
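
As a rough illustration of why a negative binomial suits coverage data, the sketch below scores a read count under three copy-number states. It is not taken from CANOES: the mean/size parameterization, the state means (0.5x, 1x, 1.5x the expected count), and the function names are assumptions for this example, and a real caller would estimate the dispersion from reference samples and smooth the states with an HMM.

```python
from math import exp, lgamma, log


def nb_log_pmf(count, mean, size):
    """log P(count) for a negative binomial with the given mean and size,
    where variance = mean + mean**2 / size (overdispersed Poisson)."""
    return (
        lgamma(count + size) - lgamma(size) - lgamma(count + 1)
        + size * log(size / (size + mean))
        + count * log(mean / (size + mean))
    )


def most_likely_state(count, expected, size):
    """Pick the copy-number state whose scaled mean best explains the count."""
    # Assumed state means: half, equal to, and 1.5x the diploid expectation.
    state_scale = {"deletion": 0.5, "normal": 1.0, "duplication": 1.5}
    return max(
        state_scale,
        key=lambda s: nb_log_pmf(count, expected * state_scale[s], size),
    )


# Hypothetical exon with an expected count of 100 reads and size (dispersion) 50.
for observed in (50, 100, 150):
    prob = exp(nb_log_pmf(observed, 100.0, 50.0))
    print(observed, most_likely_state(observed, 100.0, 50.0), f"{prob:.2e}")
```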

climat

Notes: estimates LOH. robust to contamination and aneuploidy

Description: takes integrated analysis of read count and allele frequency derived from sequenced tumor samples, and provides extensive data processing procedures including GC-content and mappability correction of read count and quantile normalization of B allele frequency

cnvcapseq

Notes: targeted resequencing input.

Description: cnvCapSeq integrates evidence from both RD and read pairs (RP) to achieve high breakpoint resolution regardless of coverage uniformity

cnvoffseq

Description: normalization framework for off-target read depth that is based on local adaptive singular value decomposition (SVD). This method is designed to address the heterogeneity of the underlying data and allows for accurate and precise CNV detection and genotyping in off-target regions.

cnvrd2

Validated vs: cnvnator, cn.mops

Description: first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts.

m-hmm

Algorithm: HMM

oncocnv

Notes: amplicon input.

Description: (i) defining a method to normalize read coverage with a small set of normal control samples and (ii) assigning statistical significance to putative CNAs resulting from the segmentation of normalized profiles

patterncnv

Notes: exome only. somatic calls. converts BAMs to WIG format for speed

Algorithm: compares paired samples

Description: accounts for the read coverage variations between exons while leveraging the consistencies of this variability across different samples

pyloh

Notes: estimates LOH, purity, ploidy.

Used by: biocondor

Description: deconvolve read mixture to identify reads associated with tumor cells or a particular subclone of tumor cells. Integrate somatic copy number alterations and loss of heterozygosity in a unified probabilistic framework.

qdnaseq

Notes: shallow depth ok, robust to FFPE

Algorithm: read-depth, no paired analysis needed

cnvem

Description: use maximum likelihood to estimate locations and copy numbers of copied regions and implement an expectation-maximization (EM) algorithm

convex

Notes: exome only.

Algorithm: HMM

Description: uses ratio of tumour and matched normal average read depths at each exonic region, to predict the copy gain or loss

excavator

Notes: exome only. efficient processing

Algorithm: read-count/HMM based with 3-step normalization, segmentation

Description: combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states

fishingcnv

Description: compares coverage depth in a test sample against a background distribution of control samples and uses principal component analysis to remove batch effects

matchclip

Description: Our method searches for reads that potentially span the breakpoints of a CNV by screening CIGAR strings. If a long S part is at the 3′(right)-side, we can use its alignment to determine the 5′(left)-side of the breakpoint, and vice versa. Our method searches for two reads that span the same CNV with the long soft-clipped parts at either end in order to locate both breakpoints of the CNV. To ensure the two reads indeed cover the same CNV, we require that they overlap in a certain orientation and their common string includes both of the soft-clipped parts.
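
The CIGAR screening step can be illustrated with a short sketch. This is not MATCHCLIP's code: the function names, the 20-base threshold, and the 1-based coordinate convention are assumptions for the example, and the pairing of two reads by overlap that the description requires is left out.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def long_soft_clip(cigar, min_clip=20):
    """Return ('left' | 'right', clip_length) for the longest soft clip of at
    least min_clip bases at either end of the read, or None."""
    # Hard clips (H) may sit outside a soft clip; ignore them here.
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    candidates = []
    if ops and ops[0][1] == "S":
        candidates.append(("left", ops[0][0]))
    if len(ops) > 1 and ops[-1][1] == "S":
        candidates.append(("right", ops[-1][0]))
    candidates = [c for c in candidates if c[1] >= min_clip]
    return max(candidates, key=lambda c: c[1]) if candidates else None


def breakpoint_from_read(pos, cigar, min_clip=20):
    """Map a clipped read to the breakpoint it supports (1-based coordinates).
    A right-side clip marks the left breakpoint, and vice versa."""
    clip = long_soft_clip(cigar, min_clip)
    if clip is None:
        return None
    if clip[0] == "left":
        return ("right_breakpoint", pos)  # first aligned base
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    return ("left_breakpoint", pos + ref_len - 1)  # last aligned base


print(breakpoint_from_read(1000, "60M40S"))
print(breakpoint_from_read(500, "30S50M2D20M"))
```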

oncosnpseq

Algorithm: mixed Binomial model for multiple tumor genotypes contaminated with normal cells, HMM resolves most likely set of mixtures.

patchwork

Notes: uses WGS input.

theta2

Notes: estimates purity & ploidy. somatic calls. efficient

Used by: biocondor

Algorithm: maximum likelihood mixture decomposition problem

Description: infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations

absolute

Notes: estimates LOH.

Algorithm: estimates loss of heterozygosity

Description: detect subclonal heterogeneity and somatic homozygosity, and it can calculate statistical sensitivity for detection of specific aberrations

apolloh

Notes: estimates LOH. somatic calls.

Algorithm: estimates loss of heterozygosity.

Description: a hidden Markov model (HMM) for predicting somatic loss of heterozygosity and allelic imbalance in whole tumour genome sequencing data.

cnanorm

Notes: somatic calls.

Description: identify the multi-modality of the distribution of smoothed ratios. Then we use the estimates of the mean (modes) to identify underlying ploidy and the contamination level, and finally we perform the correction.

cnvhitseq

Description: jointly models evidence from RD, RPs and SRs at the population level. pool information across individual samples and reconcile copy number differences among data sources

conifer

Notes: exome only. population-based calls.

Description: this method can be used to reliably predict (94% overall precision) both de novo and inherited rare CNVs involving three or more consecutive exons

contra

Notes: exome only.

Algorithm: CBS algorithm

Description: calls copy number gains and losses for each target region based on normalized depth of coverage. Our key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and the estimation of log-ratio variations via binning and interpolation
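
To make the log-ratio and library-size ideas concrete, here is a minimal sketch. It is a simplification and not CONTRA's method: it works on per-region depths rather than base-level values, rescales the test sample by total depth as a crude library-size correction, and omits the GC handling and variance estimation. The function name and the pseudocount are assumptions.

```python
from math import log2


def library_scaled_log_ratios(test_depth, control_depth, pseudocount=0.5):
    """Per-region log2(test / control) after scaling the test sample to the
    control's total depth. The pseudocount avoids log(0) in empty regions."""
    if len(test_depth) != len(control_depth):
        raise ValueError("depth lists must cover the same regions")
    total_test = sum(test_depth)
    total_control = sum(control_depth)
    if total_test <= 0 or total_control <= 0:
        raise ValueError("both samples need non-zero total depth")
    scale = total_control / total_test
    return [
        log2((t * scale + pseudocount) / (c + pseudocount))
        for t, c in zip(test_depth, control_depth)
    ]


# Hypothetical depths: the test sample was sequenced half as deeply as the
# control but has no copy number change, so every ratio should be near zero.
print(library_scaled_log_ratios([100, 200, 300], [200, 400, 600]))
```

Total-depth scaling is only a first approximation: a large gain inflates the test library and pushes unaffected regions slightly negative, which is one reason callers apply more careful corrections.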

erds

Notes: available?

Description: starts from read depth (RD) information, and integrates other information including paired end mapping (PEM) and soft-clip signature to call CNVs

exomedepth

Notes: exome only. de novo? mendelian

Used by: biocondor

Description: Calls copy number variants (CNVs) from targeted sequence data, typically exome sequencing experiments designed to identify the genetic basis of Mendelian disorders.

magnolya

Notes: reference-free.

Description: enables copy number variation (CNV) detections without using a reference genome. Magnolya directly compares two next-generation sequencing datasets.

seqcbs

Description: based on a simple and flexible inhomogeneous Poisson Process model for sequenced reads. We derive the score and generalized likelihood ratio statistics for this model to detect regions where the read intensity shifts in the target sample, as compared to a reference. We construct a modified Bayes information criterion (mBIC) to select the appropriate number of change points and propose Bayesian point-wise confidence intervals as a way to assess the confidence in the copy number estimates.

xhmm

Notes: exome only.

Algorithm: PCA/HMM

Description: uses principal component analysis (PCA) normalization and a hidden Markov model (HMM) to detect and genotype copy number variation (CNV) from normalized read-depth data from targeted sequencing experiments.
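
The PCA normalization step can be sketched without external libraries. This is an illustrative simplification, not XHMM's implementation: it removes only the single strongest component from a samples-by-targets depth matrix, using power iteration, whereas a real pipeline would choose how many components to remove and then run the HMM on the result. The function name, iteration count, and seed are assumptions.

```python
import random
from math import sqrt


def remove_top_component(matrix, iterations=100, seed=0):
    """Centre each target (column), then subtract the first principal
    component from a samples x targets read-depth matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centred = [[row[j] - col_means[j] for j in range(n_cols)] for row in matrix]

    # Power iteration on X^T X finds the leading right singular vector.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
        v_new = [sum(scores[i] * centred[i][j] for i in range(n_rows))
                 for j in range(n_cols)]
        norm = sqrt(sum(w * w for w in v_new))
        if norm == 0.0:
            return centred  # no variance left to remove
        v = [w / norm for w in v_new]

    # Project each sample onto the component and subtract that projection.
    scores = [sum(x * w for x, w in zip(row, v)) for row in centred]
    return [[centred[i][j] - scores[i] * v[j] for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical depths driven entirely by one batch effect (a rank-1 matrix):
# after removing the top component, essentially nothing should remain.
depth = [[b * p for p in (1.0, 2.0, 3.0)] for b in (-1.0, 0.0, 1.0, 2.0)]
residual = remove_top_component(depth)
print(max(abs(x) for row in residual for x in row))
```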

bic-seq

Notes: somatic

Description: Combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline copy number variations

cnvnator

Notes: no control req

Used by: metasv

Algorithm: read coverage

Description: CNVnator is able to discover CNVs in a vast range of sizes, from a few hundred bases to megabases in length

exomecnv

Notes: exome only.

Description: a statistical method to detect CNV and LOH using depth-of-coverage and B-allele frequencies, from mapped short sequence reads

exomecopy

Notes: exome only.

Description: an HMM for predicting copy number state in exome and other targeted sequencing data using observed read counts and positional covariates

jointslm

Notes: population-based.

readdepth

Notes: no control req

cnaseg

Notes: somatic calls.

cnver

Description: supplements the depth-of-coverage with paired-end mapping information, where mate pairs mapping discordantly to the reference serve to indicate the presence of variation.

copyseq

freec

Notes: no control req

Description: The tool deals with two frequent problems in the analysis of cancer deep-sequencing data: absence of control sample and possible polyploidy of cancer cells.

novelseq

Description: discover the content and location of long novel sequence insertions

rsw-seq

Notes: somatic calls.

cmds

Description: correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis.

cnv-seq

Notes: somatic calls

rdxplorer

Notes: no control req

Description: copy number variants (CNV) detection in whole human genome sequence data using read depth (RD) coverage. CNV detection is based on the Event-Wise Testing (EWT) algorithm

segseq

Notes: somatic calls