Finding more effective treatments for children’s cancer using advanced graph analytics

Scientific breakthroughs are being made daily, in particular for life-threatening diseases. The advancement of technology has aided these discoveries immeasurably. For example, the Technical University of Denmark (DTU) is pioneering the use of graph analytics with AI, Machine Learning and translational bioinformatics to better predict treatment outcomes for children with cancer.

Researchers at the Technical University of Denmark (DTU) are working in collaboration with a major EU-funded project across Denmark and Sweden called iCOPE, the Interregional Childhood Oncology Precision Medicine Exploration, to map genetic material for all children with cancer.

The project generates genomic and RNA sequence data which, when combined with patient data, helps to identify genetic factors that may affect patients’ disease evolution and response to treatment. At DTU, researchers are using advanced graph analytics from TigerGraph to help streamline this task and discover new relationships between genetic alterations and patient outcomes.

As part of the EU-funded iCOPE research project into childhood cancer, Jesper Vang and Adrian Otamendi – both PhD Students in the Department of Health Technology, Cancer Systems Biology at the Technical University of Denmark (DTU) – are using TigerGraph’s advanced graph analytics and Machine Learning to help improve the treatment of acute lymphoblastic leukaemia by analysing genetic factors which may affect patients’ clinical evolution and responses to treatments.

Research in cancer genomics using Whole Genome Sequencing (WGS) generates vast amounts of data which is currently stored in a MySQL relational database. However, the DTU researchers said the information is simply too much for humans to analyse in what is effectively an enormous spreadsheet, making it nearly impossible to correlate data on treatments and outcomes.

What they needed was a solution that would present the data visually and intuitively while also enabling clinicians to query the data themselves without having to become experts in SQL data queries.

Now TigerGraph’s advanced graph analytics with Machine Learning and AI has enabled the researchers to view the data in a new way. By layering the graph database on top of the MySQL database, they are able to create links between genetic data and patient data and use graph visualisation tools to graphically represent the data surrounding illnesses, diagnoses, treatments and outcomes. Using the graph query language and Machine Learning, they are also able to identify previously hidden correlations in the data.
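As a rough illustration of what layering a graph over relational tables can mean, the sketch below turns rows into vertices and typed edges using plain Python. The table layouts, identifiers and values are invented for illustration; they are not the iCOPE schema, and the code does not use TigerGraph's API or query language.

```python
from collections import defaultdict

# Hypothetical rows, as they might be read from a relational database.
# Every identifier and value here is made up for illustration.
patient_variants = [("P1", "VAR_A"), ("P2", "VAR_A"), ("P2", "VAR_B"), ("P3", "VAR_B")]
patient_treatments = [
    ("P1", "DRUG_X", "toxicity"),
    ("P2", "DRUG_X", "toxicity"),
    ("P3", "DRUG_X", "remission"),
]


def build_graph(variant_rows, treatment_rows):
    """Return an adjacency map: vertex -> set of (edge_type, neighbour)."""
    graph = defaultdict(set)
    for patient, variant in variant_rows:
        graph[patient].add(("carries", variant))
        graph[variant].add(("carried_by", patient))
    for patient, drug, outcome in treatment_rows:
        graph[patient].add(("received", drug))
        graph[drug].add(("given_to", patient))
        graph[patient].add(("had_outcome", outcome))
        graph[outcome].add(("outcome_of", patient))
    return graph


def neighbours(graph, vertex, edge_type):
    """List the vertices reachable from `vertex` over one edge of `edge_type`."""
    return sorted(n for t, n in graph[vertex] if t == edge_type)


graph = build_graph(patient_variants, patient_treatments)
carriers = neighbours(graph, "VAR_A", "carried_by")
print(carriers)
```

Once the links are explicit, a question such as "which patients carry this variant?" becomes a single hop along an edge rather than a join across tables.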

Jesper Vang and Adrian Otamendi tell IntelligentSME.tech how TigerGraph will help them analyse these complex datasets to develop a better understanding of treatment outcomes and toxicities.

Can you tell me more about the work carried out at the Technical University of Denmark?

At iCOPE at DTU, we aim to provide the germline and somatic mutational landscape for childhood cancer through the analysis of next generation sequencing data from Danish children with cancer. Whole genome sequencing (WGS) will be performed to analyse germline mutations and RNA sequencing will be used to characterise the somatic variation in 600 Danish children with cancer. Associations between germline and tumour mutations will be analysed, as well as their association with clinical biomarkers and treatment outcomes.

This study will investigate genetic variation in children with cancer who suffered drug-related side effects from a treatment approved in Denmark, to find correlations between individual genetics and treatment outcomes, side effects and toxicities.

How will using TigerGraph provide earlier diagnosis and more effective treatment?

The study will focus on side effects that are well characterised in medical journals and have registered pharmacogenomic flags or associations to genetic biomarkers. We will use data mining on the current literature and select genetic and clinical biomarkers that show strong evidence and association with the treatment response and toxicity.

Genetic variation and clinical data on these biomarkers for the 600 patients will be extracted and input into TigerGraph. Connections between genetic variants and treatment outcomes and toxicities found by TigerGraph will help us categorise individuals into meaningful risk and outcome‐based pharmacogenetic (PGx) groups.

The aim is to create a tool that allows the exploration of the genetic variation in the PGx biomarkers across our patient groups and the assessment of individual risk and outcome to different drugs to finally predict risk-grouping based on individual genetic and clinical data.
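To make the idea of risk-grouping concrete, here is a minimal sketch that assigns patients to groups from the biomarker variants they carry. The variant names, weights and threshold are hypothetical placeholders, and the simple additive score is an assumption made for illustration; it is not the model the researchers describe.

```python
# Hypothetical biomarker weights. In practice these would come from curated
# pharmacogenomic evidence, not from hard-coded numbers like these.
BIOMARKER_RISK = {"VAR_A": 2, "VAR_B": 1}


def risk_group(variants, high_threshold=2):
    """Map a set of carried variants to a coarse risk label."""
    score = sum(BIOMARKER_RISK.get(variant, 0) for variant in variants)
    if score >= high_threshold:
        return "high"
    if score > 0:
        return "intermediate"
    return "standard"


# Made-up patients and the biomarker variants each one carries.
patients = {
    "P1": {"VAR_A"},
    "P2": {"VAR_B"},
    "P3": set(),
    "P4": {"VAR_A", "VAR_B"},
}

groups = {patient: risk_group(variants) for patient, variants in patients.items()}
print(groups)
```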

Where does the data come from?

STAGING (sequencing of Tumor and Germline DNA – Implications and National Guidelines) is a research project that offers whole genome sequencing to all children and young people diagnosed with cancer under the age of 18 in Denmark. The project started on 1 July 2016 and is the first to carry out extensive sequencing of all patients in a medical specialty. Around 600 children diagnosed with cancer have already been sequenced.

Why is TigerGraph easier to work with than the previous system?

We try to find associations and connections across our data. For example, patients who carry a specific variant or set of variants may respond worse to specific treatments. In the current setup, such a relationship is buried in the dataset as a whole, and only a very specific query will uncover it.

However, TigerGraph could help point to the connections or associations in our data by integrating different data sources and enabling a visual exploration of relationships.
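One way to picture the difference between writing one targeted query and having candidate connections surfaced is to scan every variant and treatment pair and rank them by how much worse carriers respond. The sketch below does this on a handful of invented records; the field layout and values are hypothetical, and a real analysis would need proper statistical testing and correction for multiple comparisons.

```python
from itertools import product

# Hypothetical records: (patient, variants carried, treatment, responded_well).
records = [
    ("P1", {"VAR_A"}, "DRUG_X", False),
    ("P2", {"VAR_A"}, "DRUG_X", False),
    ("P3", {"VAR_B"}, "DRUG_X", True),
    ("P4", set(), "DRUG_X", True),
    ("P5", {"VAR_A"}, "DRUG_Y", True),
    ("P6", set(), "DRUG_X", False),
]


def poor_response_rate(rows):
    """Share of rows with a poor response, or None when there are no rows."""
    if not rows:
        return None
    return sum(1 for row in rows if not row[3]) / len(rows)


def rank_variant_treatment_pairs(records):
    """Rank (variant, treatment) pairs by carrier rate minus non-carrier rate."""
    variants = sorted(set().union(*(row[1] for row in records)))
    treatments = sorted({row[2] for row in records})
    ranked = []
    for variant, treatment in product(variants, treatments):
        treated = [row for row in records if row[2] == treatment]
        carriers = poor_response_rate([row for row in treated if variant in row[1]])
        others = poor_response_rate([row for row in treated if variant not in row[1]])
        # Skip pairs where one side has no patients to compare against.
        if carriers is not None and others is not None:
            ranked.append((carriers - others, variant, treatment))
    return sorted(ranked, reverse=True)


ranked = rank_variant_treatment_pairs(records)
top_gap, top_variant, top_treatment = ranked[0]
print(top_variant, top_treatment)
```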

How does it combine AI, Machine Learning and translational bioinformatics?

We will use clinical treatment-related data along with the individual genetic variation data to create a model that, with Machine Learning and AI, will allow us to extract meaningful connections or associations between genomics and treatment outcomes and predict potential toxicities in different groups of patients.

How have you found the new system since it came online?

It’s early days using TigerGraph, but it will help us extract the relationships between datapoints in large and complex datasets. After all, we are looking for correlations between genetics and treatment outcome by analysing 600 genomes of children with cancer. These can be very small but significant features that general statistical analysis would miss because they might apply to fewer than 2% of patients.

Graph analytics will allow us to find new correlations between patients’ treatments, outcomes and genetic variations. It will enable us to answer questions that were difficult to formulate before but can be answered now thanks to the explicit relationship between datapoints inherent in the structure of the TigerGraph database and the power of graph analytics and Machine Learning.
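A short worked calculation shows why a feature affecting under 2% of patients is easy to miss in cohort-wide statistics. The cohort size of 600 and the 2% figure come from the interview; the event counts are hypothetical numbers chosen only to illustrate the dilution effect.

```python
TOTAL_PATIENTS = 600   # cohort size quoted in the interview
carriers = 12          # 2% of 600, the upper bound of the subgroup size
carrier_events = 9     # hypothetical: 9 of the 12 carriers have a toxicity
other_events = 59      # hypothetical: 59 of the 588 non-carriers have one

non_carriers = TOTAL_PATIENTS - carriers

carrier_rate = carrier_events / carriers
rate_without_carriers = other_events / non_carriers
overall_rate = (carrier_events + other_events) / TOTAL_PATIENTS

print(f"carriers: {carrier_rate:.1%}")
print(f"non-carriers: {rate_without_carriers:.1%}")
print(f"whole cohort: {overall_rate:.1%}")
```

Under these made-up numbers, the toxicity rate among carriers is 75%, yet the cohort-wide rate moves only from about 10.0% to about 11.3%, so the signal is visible only when the subgroup is examined on its own.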
