My research focuses on
i) developing specific workflows for the analysis, integration and visualization of high throughput data for specific biomedical conditions and
ii) developing general tools to empower wet lab scientists in their research.
Workflows for the analysis, integration and visualization of high throughput data
With the increasing accessibility of high throughput technologies, it is becoming more common to combine different NGS applications within a single project. For instance, to better understand the molecular mechanisms that affect the risk of stroke in humans, studying the relationship between the extent of DNA methylation, mRNA levels and the presence of possible genetic variants could identify important regulatory variants/patterns (Collaboration with Prof. Christina Jern, Dept of Clinical Genetic, UGOT). In this project, a separate pipeline has been developed for each type of NGS application, with the long-term goal of reusing the algorithms and methods as a baseline for identifying plausible risk factors in other common diseases. A tool (InVi) for generating circular visualizations has been implemented in preliminary form, including data import, filtering and display functions that cover different use cases.
Another example is the development of a workflow to analyze and characterize gallbladder cancer (GBC). This project is a collaboration with Prof. Justo Bermejo at Heidelberg University. Since this type of cancer has been largely neglected, public databases do not contain enough molecular information to help identify potential therapeutic targets. Here we used RNA sequencing to characterize 23 commercial hepatobiliary cancer cell lines, and provided detailed mutation and gene expression data to the research community. We are currently using the established workflow to analyze patient data, in order to identify driver genes in GBC and gallbladder dysplasia and to compare their mutational profiles. We aim to identify individuals at high risk of GBC and pave the way for personalized GBC treatment.
General tools for applied bioinformaticians and wet lab researchers
One example is the tool we developed to identify processed pseudogenes, in collaboration with Anna Rohlin at the Sahlgrenska University Hospital. These pseudogenes have been linked to a new class of mutations that occur during cancer development. We have therefore implemented an automatic pipeline (Pψfinder) that takes sequencing data as input and generates a list of processed pseudogene candidates as well as a graphical representation. This workflow is freely available to the community and is implemented at the Core Facility. We have recently published the tool, and we aim to apply it to other cohorts in order to improve our understanding of the impact of processed pseudogenes on diagnostic testing.
An ongoing project lies within radiation research, in collaboration with Britta Langen, UT Southwestern Medical Center, USA. Erroneous dose-response estimates can lead to under-treatment of malignancies, or cause severe normal tissue toxicities if safe dose levels are overestimated. Molecular biomarkers are therefore used to reflect dose-response upon exposure or during follow-up. To improve biomarker selection, we have developed an unbiased machine learning pipeline that uses omics data from irradiated (normal) tissues and can identify biomarker panels depending on radiation type, tissue type and other parameters. The discovery pipeline was optimized for processing performance. Our approach has validated two biomarkers previously proposed by (conventional) differential expression analysis, and identified 24 novel biomarker candidates that conventional analysis did not detect. We are now aiming to make the tool available and portable, and to scan other cohorts to add diagnostic value for radiation therapy and risk assessment.