Aims and Objectives
Driven by biological questions, we aim to understand the heterogeneity and plasticity of cancer cells, which can lead to treatment resistance and cancer relapse. In particular, we seek to discover how cancer cells hijack mechanisms of transcriptional control and thus gain oncogenic and treatment-resistance properties encoded in the human genome. The ability to correctly formalize these biological questions in computational and mathematical terms and, together with biologists and clinicians, to design experiments and trials is paramount to achieving our goal.
1. Methods for modeling effects of genetic changes in cancer. Many studies, including ours, have highlighted the effects of coding mutations on cancer cell phenotypes. However, the contribution of noncoding mutations, which constitute more than 98% of somatic mutations in cancer DNA, remains far less studied. We have now developed a method to predict 3D interactions between active DNA elements based on their nucleotide sequence [1]. In this work, we take into account epigenetic context to make cell type-specific predictions relevant to studying the effects of mutations in a particular cancer type or tissue.
2. Methods for molecular signal deconvolution. "Omics" data are generally obtained from a mixed population of malignant and non-malignant cells constituting tumors. In addition, malignant cells from different patients (and often even within the same tumor) show a certain degree of signal variation, which we refer to as heterogeneity. Signal deconvolution, which allows us to extract molecular properties specific to cancer cells, is one of the major challenges of working with bulk tumor data (production of bulk data is labor- and cost-efficient compared to single-cell data and can be implemented for cancer patients in clinics). In our group, we have successfully developed methods to detect molecular (omics) characteristics of cancer cells from tumor samples largely infiltrated with non-malignant cells, e.g., cells from blood vessels and immune and stromal cells found in human tumors [2, 3, 4]. Taking a step further, analyzing the cancer-specific deconvolved signals allowed us to gain insights into certain oncogenic mechanisms, for instance, to address the causes and consequences of DNA hypermethylation in 19 human cancer types [4].
Currently, our group is developing methods for the signal deconvolution of bulk genomic data. The first project aims to build a time- and memory-efficient solution for accurately estimating absolute gene copy numbers and genotypes from whole-genome or whole-exome DNA sequencing experiments on bulk human tumors. Our second ongoing project on signal deconvolution aims to characterize shared intratumor transcriptional heterogeneity from bulk RNA data without using any reference profiles as additional input (project supported by an SNF project grant).
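To make the idea of signal deconvolution concrete, the following is a minimal sketch under a strongly simplified assumption: the bulk signal is a two-component linear mixture, bulk = purity × tumor + (1 − purity) × normal, with a known tumor purity and a known non-malignant reference profile. This is an illustration only, not the method used in [2, 3, 4]; the function name `deconvolve_tumor_signal` and the toy values are hypothetical.

```python
def deconvolve_tumor_signal(bulk, normal_reference, purity):
    """Recover a cancer-cell-specific signal from bulk measurements.

    Assumed (illustrative) model for each feature:
        bulk = purity * tumor + (1 - purity) * normal
    Signals are treated as fractions in [0, 1], e.g. methylation levels.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in the interval (0, 1]")
    if len(bulk) != len(normal_reference):
        raise ValueError("bulk and normal_reference must have the same length")

    tumor = []
    for bulk_value, normal_value in zip(bulk, normal_reference):
        # Subtract the non-malignant contribution, then rescale by purity.
        value = (bulk_value - (1.0 - purity) * normal_value) / purity
        # Noise can push estimates outside the valid range, so clip them.
        tumor.append(min(1.0, max(0.0, value)))
    return tumor


# Toy data (hypothetical): three features, tumor purity of 40%.
true_tumor = [0.9, 0.1, 0.5]
normal = [0.1, 0.1, 0.8]
purity = 0.4
bulk = [purity * t + (1.0 - purity) * n for t, n in zip(true_tumor, normal)]

estimated_tumor = deconvolve_tumor_signal(bulk, normal, purity)
```

In practice, purity and the reference profile are themselves unknown or uncertain, which is what makes reference-free deconvolution of bulk data challenging.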
3. Methods for data integration and survival models for prediction of clinical outcome. Due to high data generation costs, many cancer research projects use only one or two data modalities (genomic, epigenetic, transcriptomic, or imaging). Yet, the multi-layer integration of omics data has the potential to provide a bigger picture of the molecular processes driving cancer progression. Therefore, we are developing a methodology to build interpretable survival and treatment response models based on multi-level omics data (project partially supported by the SNF Sinergia'2022). We have been exploring multi-task learning and group/network regularization [5], and we have proposed a knowledge distillation approach in the context of survival analysis [6]. Importantly, another mission of our team is to provide general standards for single- and multi-omics survival analysis models [7, 8].
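As an illustration of what group regularization means for a survival model, the sketch below evaluates a Cox negative log partial likelihood with a group lasso penalty, where each group could stand for the features of one pathway or one omics layer. It is a generic textbook formulation written in plain Python, not the implementation from [5] or [6]; all function names, the group definitions, and the toy data are assumptions made for this example.

```python
import math


def cox_neg_log_partial_likelihood(beta, features, times, events):
    """Cox negative log partial likelihood (Breslow form for tied times)."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in features]
    loss = 0.0
    for i, (time_i, event_i) in enumerate(zip(times, events)):
        if not event_i:
            continue  # censored samples only contribute through risk sets
        # Risk set: all samples still under observation at time_i.
        risk = sum(math.exp(scores[j]) for j, t in enumerate(times) if t >= time_i)
        loss -= scores[i] - math.log(risk)
    return loss


def group_lasso_penalty(beta, groups):
    """Sum over groups of sqrt(group size) * Euclidean norm of its coefficients."""
    penalty = 0.0
    for indices in groups.values():
        norm = math.sqrt(sum(beta[k] ** 2 for k in indices))
        penalty += math.sqrt(len(indices)) * norm
    return penalty


def penalized_objective(beta, features, times, events, groups, strength):
    """Objective to minimize: data fit plus group-wise sparsity penalty."""
    fit = cox_neg_log_partial_likelihood(beta, features, times, events)
    return fit + strength * group_lasso_penalty(beta, groups)


# Toy data (hypothetical): three patients, three features in two groups.
features = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0]]
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]  # 1 = event observed, 0 = censored
groups = {"pathway_a": [0, 1], "pathway_b": [2]}

objective_at_zero = penalized_objective(
    [0.0, 0.0, 0.0], features, times, events, groups, strength=0.1
)
```

Because the penalty acts on the norm of each group rather than on individual coefficients, minimizing this objective tends to switch whole groups on or off, which is one way to obtain models that are interpretable at the level of pathways or data modalities.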
To sum up, we lead truly interdisciplinary projects at the forefront of computational cancer research. While choosing state-of-the-art computational algorithms and developing new methods, we pay great attention to the underlying biological question and go into the very details of the modeled molecular processes. We tackle the question of what causes the heterogeneity and plasticity of malignant cells, linking this phenomenon to genetic events, the spatial composition of the tumor microenvironment, and downstream consequences for patient survival and treatment response.
The long-term goal of our research group is to provide functional insights into cancer development and progression that can be translated into predictive models, which can then be used to guide the treatment of patients based on their molecular profiles. Currently, we are interested in understanding the oncogenic processes related to several cancer types: esophageal adenocarcinoma, mesothelioma, lung cancer, neuroblastoma, adrenocortical carcinoma, Ewing sarcoma, and lymphoma. We are also open to collaborations with research groups studying other types of cancer.
References
[1] UniversalEPI: harnessing attention mechanisms to decode chromatin interactions in rare and unexplored cell types. A. Grover, L. Zhang, T. Muser, S. Haefliger, M. Wang, F.J. Theis, I.L. Ibarra, E. Krymova, V. Boeva. bioRxiv. doi: https://doi.org/10.1101/2024.11.22.624813
[2] QuantumClone: Clonal assessment of functional mutations in cancer based on a genotype-aware method for clonal reconstruction. P. Deveau, L. Colmet Daage, D. Oldridge, V. Bernard, A. Bellini, M. Chicard, N. Clement, E. Lapouble, V. Combaret, A. Boland, V. Meyer, J.-F. Deleuze, I. Janoueix-Lerosey, E. Barillot, O. Delattre, J. Maris, G. Schleiermacher, and V. Boeva. Bioinformatics. 2018 Jan 12. doi: 10.1093/bioinformatics/bty016. [Epub ahead of print]. PMID: 29342233
[3] SV-Bay: structural variant detection in cancer genomes using a Bayesian approach with correction for GC-content and read mappability. D. Iakovishina, I. Janoueix-Lerosey, E. Barillot, M. Regnier and V. Boeva. Bioinformatics. 2016. 32(7):984-992. PMID: 26740523
[4] Deciphering the etiology and role in oncogenic transformation of the CpG island methylator phenotype (CIMP): a pan-cancer analysis. J. Yates and V. Boeva. Briefings in Bioinformatics. 2022. 23(2):bbab610.
[5] Exploring pathway-based group lasso for cancer survival analysis: a special case of multi-task learning. G. Malenova, D. Rowson and V. Boeva. Frontiers in Genetics. 2021. 12:771301. doi: 10.3389/fgene.2021.771301
[6] Sparsesurv: a Python package for fitting sparse survival models via knowledge distillation. D. Wissel, N. Janakarajan, J. Schulte, D. Rowson, X. Yuan, V. Boeva. Bioinformatics. 2024. 40(9):btae521.
[7] Systematic comparison of multi-omics survival models reveals a widespread lack of noise resistance. D. Wissel, D. Rowson, V. Boeva. Cell Reports Methods. 2023. doi: https://doi.org/10.1016/j.crmeth.2023.100461
[8] SurvBoard: standardised benchmarking for multi-omics cancer survival models. D. Wissel, N. Janakarajan, A. Grover, E. Toniato, M. Rodriguez-Martinez and V. Boeva. bioRxiv.