Introduction

Control-FREEC is a tool for detecting copy-number changes and allelic imbalances (including loss of heterozygosity, LOH) in deep-sequencing data. It was originally developed by the Bioinformatics Laboratory of Institut Curie (Paris). Nowadays, Control-FREEC is supported by the team of Valentina Boeva at Institut Cochin, Inserm (Paris).

Control-FREEC automatically computes, normalizes, and segments copy number and beta allele frequency (BAF) profiles, then calls copy number alterations (CNAs) and LOH. The control (matched normal) sample is optional for whole genome sequencing data but mandatory for whole exome or targeted sequencing data. For whole genome sequencing data analysis, the program can also use mappability data (files created by GEM).

Starting from version v8.0, Control-FREEC can detect subclonal gains and losses and evaluate the likeliest average ploidy of the sample. The procedure for evaluating tumor purity has also been improved.


Input for CNA detection: aligned single-end, paired-end or mate-pair data in SAM, BAM or SAMtools pileup format.
Control-FREEC accepts gzipped (.gz) files. Support for the Eland, BED, SOAP, arachne, psl (BLAT) and Bowtie formats has been discontinued starting from version v8.0.
Input for CNA+LOH detection: there are two options: (a) provide aligned reads in SAMtools pileup format (files can be gzipped); (b) provide BAM files together with the options "makePileup" and "fastaFile" (see How to create a config file?).
Output: Regions of gain, loss and LOH, normalized copy number and BAF profiles.
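
Option (b) above can be illustrated with a short Python sketch that writes a config file for CNA+LOH detection from a BAM file. Only the option names "makePileup" and "fastaFile" come from the text above. The section names ([general], [sample], [BAF]), the other option names (chrLenFile, ploidy, mateFile, inputFormat) and all file names are assumptions made for illustration and should be checked against the "How to create a config file?" documentation.

```python
import configparser
import io


def build_cna_loh_config(tumor_bam, snp_positions, genome_fasta, chr_len_file):
    """Return the text of a config file for CNA+LOH detection from a BAM file."""
    config = configparser.ConfigParser()
    # Keep option names case-sensitive; configparser lowercases them by default.
    config.optionxform = str

    # Assumed layout: section and option names below are not taken from this
    # page, except "makePileup" and "fastaFile".
    config["general"] = {"chrLenFile": chr_len_file, "ploidy": "2"}
    config["sample"] = {"mateFile": tumor_bam, "inputFormat": "BAM"}
    # Option (b): BAM input plus a SNP position file and the genome FASTA.
    config["BAF"] = {"makePileup": snp_positions, "fastaFile": genome_fasta}

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical file names, for illustration only.
config_text = build_cna_loh_config(
    tumor_bam="tumor.bam",
    snp_positions="snps.bed",
    genome_fasta="genome.fa",
    chr_len_file="genome.len",
)
print(config_text)
```

Generating the file from a script rather than editing it by hand makes it easier to keep one template per project and to vary only the sample paths.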


Starting from Control-FREEC v5.0, the program can be used on exome-sequencing data. Starting from version v8.0, read counts are calculated by exon and not per window (set "window=0").
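
As a rough sketch of the exome setup described above, the following Python function collects the relevant settings and enforces the rule that a control sample is mandatory for exome or targeted data. The "window=0" setting is from the text. The section names and the options "mateFile" and "captureRegions", as well as the file names, are assumptions for illustration, not a confirmed description of the config format.

```python
def exome_settings(capture_regions_bed, control_file):
    """Return config settings for exome data, grouped by assumed section name."""
    if not control_file:
        # A control (matched normal) sample is mandatory for exome data.
        raise ValueError("a control sample is required for exome or targeted data")
    return {
        "general": {"window": "0"},  # window=0: reads are counted per exon
        "control": {"mateFile": control_file},  # assumed option name
        "target": {"captureRegions": capture_regions_bed},  # assumed option name
    }


# Hypothetical file names, for illustration only.
settings = exome_settings("exons.bed", "normal.pileup.gz")
for section, options in settings.items():
    print(f"[{section}]")
    for key, value in options.items():
        print(f"{key} = {value}")
```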

Starting from Control-FREEC v6.0, the user can use multiple threads to run Control-FREEC. 30x coverage WGS data with a control (i.e., two pileup.gz files) will be fully processed (CNA and LOH info) in one hour using 6 threads.
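
A multi-threaded run could be prepared along the following lines. This is a sketch under stated assumptions: the option name "maxThreads", the executable name "freec" and the "-conf" flag are not given on this page and should be verified against the manual, and the config file name is hypothetical. The code only builds the setting and the command line; it does not launch the program.

```python
import shlex


def thread_setting(threads):
    """Return the config line that requests the given number of threads."""
    if threads < 1:
        raise ValueError("the number of threads must be at least 1")
    return f"maxThreads = {threads}"  # assumed option name


def freec_command(config_path, executable="freec"):
    """Return the command line as an argument list (assumed CLI syntax)."""
    return [executable, "-conf", config_path]


print(thread_setting(6))
print(shlex.join(freec_command("config_WGS.txt")))
```

Returning the command as a list keeps it safe to pass to subprocess.run without shell quoting issues.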

Control-FREEC publications

  • Control-FREEC: a tool for assessing copy number and allelic content using next generation sequencing data. V. Boeva, T. Popova, K. Bleakley, P. Chiche, I. Janoueix-Lerosey, O. Delattre and E. Barillot. Bioinformatics, 2012, 28(3):423-5. PMID: 22155870.

    CNA detection part of Control-FREEC (simply FREEC)

  • Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. V. Boeva, A. Zinovyev, K. Bleakley, J.-P. Vert, I. Janoueix-Lerosey, O. Delattre and E. Barillot. Bioinformatics, 2011, 27(2):268-9. PMID: 21081509.
    LOH detection part of Control-FREEC



Downloads

Starting from Control-FREEC v5.7, Windows is no longer supported. However, you can still download Control-FREEC v5.6 for Windows 32-bit (an archive with a Win32 binary) or contact me for support.

Download the latest release of Control-FREEC for Linux from its GitHub page:


Download test datasets:

  • Data for HCC1143 and HCC1143-BL (from Chiang et al., 2009) to test CNA predictions: test.zip (143M)
  • Dataset (cancer, unpublished) to test LOH predictions: testChr19.zip (1334M)

Download mappability tracks if you want to include mappability information:


Do not forget to extract files from the archive! You can also generate a mappability track for other genomes using GEM.

Download files with SNPs (only needed if you have high-coverage data and want to detect allelic status; in that case, you must convert read files into pileup format)
Starting from Control-FREEC v9.3, .txt.gz, .vcf and .vcf.gz files are also accepted! For the .txt files with SNPs, please refer to FREEC FAQ Q19 to understand how these files are generated.

Links to Documentation



People who contributed to the Control-FREEC idea and code:



Contacts

I will be pleased to address any question or concern about the Control-FREEC software:


IMPORTANT: In case of a Control-FREEC error, please share your config file and the program's command-line output (log file).


Acknowledgements

This work was supported by grants from the Institut National de la Santé et de la Recherche Médicale, the Institut Curie, and the Ligue Nationale contre le Cancer (Equipe labellisée and CIT program).