Variomics and CNV References

Tools
1. Affymetrix platform
CNAG: http://www.genome.umin.jp/
CNAT: http://www.affymetrix.com/products/software/specific/genotyping_console_software.affx
ITALICS:

2. Illumina platform
PennCNV

Validation and extension of an empirical Bayes method for SNP calling on Affymetrix microarrays, Shin Lin, Benilton Carvalho, David J Cutler, Dan E Arking, Aravinda Chakravarti and Rafael A Irizarry, Genome Biology 2008, 9:R63
Multiple algorithms have been developed for the purpose of calling SNPs from Affymetrix microarrays. We extend and validate the algorithm CRLMM, which incorporates HapMap information within an empirical Bayes framework. We find CRLMM to be more accurate than the Affymetrix default programs (BRLMM and Birdseed). Also, we tie our call confidence metric to percent accuracy. We intend that our validation datasets and methods serve as standard benchmarks for future SNP calling algorithms.

Comparison results
Assessment of algorithms for high throughput detection of genomic copy number variation in oligonucleotide microarray data, Ágnes Baross, Allen D Delaney, H Irene Li, Tarun Nayar, Stephane Flibotte, Hong Qian, Susanna Y Chan, Jennifer Asano, Adrian Ally, Manqiu Cao, Patricia Birch, Mabel Brown-John, Nicole Fernandes, Anne Go, Giulia Kennedy, Sylvie Langlois, Patrice Eydoux, JM Friedman and Marco A Marra, BMC Bioinformatics 2007, 8:368
Background
Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.
Results
We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regards to overall suitability for high-throughput 100 K SNP array data analysis, as well as effectiveness of normalization, scaling with various reference sets and feature extraction, as well as true and false positive rates of genomic copy number variant (CNV) detection.
Conclusion
We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
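The genotype-based confirmation mentioned in the conclusion can be illustrated with a minimal sketch. It is not code from the paper: the function name, the call encoding ("AA", "AB", "BB", "NC" for no call) and both thresholds are assumptions made for illustration. The idea is that a heterozygous deletion leaves a single allele, so a candidate region containing heterozygous calls is unlikely to be a real deletion.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int]  # (start, end) positions on one chromosome, inclusive


def confirm_deletions(
    candidates: List[Region],
    genotypes: Dict[int, str],
    min_snps: int = 3,               # hypothetical threshold
    max_het_fraction: float = 0.0,   # hypothetical threshold
) -> List[Region]:
    """Keep candidate deletions whose SNP calls are consistent with LOH."""
    confirmed = []
    for start, end in candidates:
        # Collect called genotypes inside the region, ignoring no-calls.
        calls = [g for pos, g in genotypes.items()
                 if start <= pos <= end and g != "NC"]
        if len(calls) < min_snps:
            continue  # too few SNPs to judge; left unconfirmed
        het = sum(1 for g in calls if g == "AB")
        if het / len(calls) <= max_het_fraction:
            confirmed.append((start, end))
    return confirmed


if __name__ == "__main__":
    calls = {100: "AA", 200: "BB", 300: "AA", 400: "AB", 500: "BB", 600: "AA"}
    print(confirm_deletions([(100, 300), (300, 500), (500, 600)], calls))
    # -> [(100, 300)]
```

In practice a small non-zero max_het_fraction may be preferable, since genotype calls themselves contain errors.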
   
ITALICS: an algorithm for normalization and DNA copy number calling for Affymetrix SNP arrays, Rigaill G, Hupé P, Almeida A, La Rosa P, Meyniel JP, Decraene C, Barillot E, Bioinformatics. 2008 Mar 15;24(6):768-74. Epub 2008 Feb 5.
Motivation
Affymetrix SNP arrays can be used to determine the DNA copy number measurement of 11 000-500 000 SNPs along the genome. Their high density facilitates the precise localization of genomic alterations and makes them a powerful tool for studies of cancers and copy number polymorphism. Like other microarray technologies it is influenced by non-relevant sources of variation, requiring correction. Moreover, the amplitude of variation induced by non-relevant effects is similar or greater than the biologically relevant effect (i.e. true copy number), making it difficult to estimate non-relevant effects accurately without including the biologically relevant effect.
Results
We addressed this problem by developing ITALICS, a normalization method that estimates both biological and non-relevant effects in an alternate, iterative manner, accurately eliminating irrelevant effects. We compared our normalization method with other existing and available methods, and found that ITALICS outperformed these methods for several in-house datasets and one public dataset. These results were validated biologically by quantitative PCR.
Availability
The R package ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) has been submitted to Bioconductor.
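The idea of estimating biological and non-relevant effects alternately can be shown with a toy sketch. This is not the ITALICS algorithm or its R API: it assumes the copy number segments are already known, that there is a single non-relevant covariate x (for example a GC-content score), and that the signal is y = segment level + beta * x. All names are hypothetical. Each pass estimates the segment levels from the covariate-corrected signal, then re-estimates beta from the segment-corrected signal.

```python
from typing import List, Sequence, Tuple


def alternate_fit(
    y: Sequence[float],        # observed log-ratios
    x: Sequence[float],        # one non-relevant covariate per probe (assumed)
    segment: Sequence[int],    # known segment label per probe, 0..k-1 (assumed)
    n_iter: int = 500,
    tol: float = 1e-12,
) -> Tuple[List[float], float]:
    """Alternate between segment levels and the covariate slope."""
    n_seg = max(segment) + 1   # every label 0..k-1 must occur at least once
    beta = 0.0
    seg_mean = [0.0] * n_seg
    sum_xx = sum(xi * xi for xi in x)
    for _ in range(n_iter):
        # Step 1: biological effect, given the current non-relevant estimate.
        sums = [0.0] * n_seg
        counts = [0] * n_seg
        for yi, xi, s in zip(y, x, segment):
            sums[s] += yi - beta * xi
            counts[s] += 1
        seg_mean = [t / c for t, c in zip(sums, counts)]
        # Step 2: non-relevant effect, given the current biological estimate
        # (least-squares slope through the origin on the residuals).
        num = sum((yi - seg_mean[s]) * xi for yi, xi, s in zip(y, x, segment))
        new_beta = num / sum_xx
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return seg_mean, beta


if __name__ == "__main__":
    x = [0.1, 0.5, 0.9, 0.2, 0.6, 0.7]
    seg = [0, 0, 0, 1, 1, 1]
    # Noise-free data built from levels [0.0, -1.0] and slope 0.4.
    y = [c + 0.4 * xi for c, xi in zip([0.0, 0.0, 0.0, -1.0, -1.0, -1.0], x)]
    levels, slope = alternate_fit(y, x, seg)
    print([round(v, 6) for v in levels], round(slope, 6))
```

Estimating the slope without first removing the segment levels would bias it whenever the covariate differs between segments, which is the difficulty the abstract describes.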