ONCOCNV is a package for detecting copy number changes in deep sequencing data. It was developed by OncoDNA in collaboration with the Bioinformatics Laboratory of Institut Curie (Paris) and is now supported by the group of Valentina Boeva at Inserm U1016.

ONCOCNV automatically computes, normalizes, and segments copy number profiles, then calls copy number alterations. The user can provide any positive number of control samples to construct the baseline; however, we recommend using at least three control samples. The more, the better.

ONCOCNV can be applied to exome-seq data. You just need to provide probe coordinates instead of amplicon coordinates, and you will get beautiful copy number profiles for your data.

Input for CNV detection: aligned single-end or paired-end data in the BAM format.
Output: Annotation of genes with copy number changes + visualization of the profile (.png).

Citation: Boeva, V. et al. (2014) Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data. Bioinformatics, 30(24):3443-3450.

SeqAnswers forum:


Read about the requirements in the README file.

Download the latest version of ONCOCNV.

Data access

Raw or processed input files:

  • Spreadsheet with the raw read count data for the control samples (dataset A): Control.stats.txt
  • Spreadsheet with the normalized read count data for the control samples with additional information: amplicon length, GC-content, PC1, PC2, PC3, standard deviation (dataset A; generated by processControl.v5.3.R): Control.stats.Processed.txt
  • Spreadsheet with the raw read count data for the tumor samples (dataset A): Test.stats.samplesA1_A3.txt and Test.stats.samplesA4_A8.txt
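
To check which columns a downloaded spreadsheet actually contains, a small shell helper along the following lines may be useful. This is only a sketch: it assumes the .txt files are tab-separated with a header row, which is not stated above, and the function name list_columns is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the numbered column names from the first line of a file.
# Assumption: the file is tab-separated and its first line is a header row.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ printf "%d\t%s\n", NR, $0 }'
}

# Inspect the processed control file only if it is present in the current directory.
if [ -f Control.stats.Processed.txt ]; then
    list_columns Control.stats.Processed.txt
fi
```

If the output shows a single long column name, the file is probably not tab-separated and the delimiter passed to tr needs to be changed.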

Result files:


The following members of the ONCOCNV working group are pleased to answer any question or address any concerns you may have with the ONCOCNV software:


Example of output .png file:

Additionally, you can visualize the output per chromosome. In this case, gene names will appear on the graph. To run the script:

   cat perChrVisualization.R | R --slave --args myTestSample.profile.txt 17
   cat perChrVisualization.R | R --slave --args myTestSample.profile.txt chr17
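
To produce the per-chromosome plots for a whole sample in one go, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the ONCOCNV distribution. It assumes a human genome (chromosomes 1 to 22, X and Y), and the variables PROFILE, CHR_PREFIX and DRY_RUN as well as the function names are hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: run perChrVisualization.R once per chromosome.
# Assumptions (not from the ONCOCNV documentation): human chromosomes 1-22, X, Y.
PROFILE="${PROFILE:-myTestSample.profile.txt}"  # profile file of the test sample
CHR_PREFIX="${CHR_PREFIX:-}"                    # set to "chr" if your data uses chr17-style names
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print commands, 0 = run R

plot_chr() {
    chr_name="${CHR_PREFIX}$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "cat perChrVisualization.R | R --slave --args $PROFILE $chr_name"
    else
        # Same invocation as the single-chromosome command shown above.
        cat perChrVisualization.R | R --slave --args "$PROFILE" "$chr_name"
    fi
}

plot_all() {
    for chr in $(seq 1 22) X Y; do
        plot_chr "$chr"
    done
}

plot_all
```

Run it with DRY_RUN=0 from the directory containing perChrVisualization.R to actually generate the plots, and set CHR_PREFIX=chr if the chromosome names in your profile carry the "chr" prefix.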

Blue vertical lines mark predicted breakpoints. Orange vertical lines mark breakpoints that resulted from the segmentation and were later corrected by a t-test; you may ignore them.
Example of output .png file: