StabilitySort: pathogenic mutation community detection with graphs

20 Dec 2023

Supervisor: Dan Andrews (Jubilee Joint Fellow, CECC & JCSMR; Dan.Andrews@anu.edu.au)

Scholarships: Both honours and PhD scholarships are available for this project.

Description: Structural biology is one scientific domain that has been revolutionised by AI. First came the AlphaFold2 algorithm developed by DeepMind, which provided the world with accurate, high-resolution predicted structures of just about all human proteins. This year, several protein language models have almost perfected predictions of pathogenic mutations in human genes. Together, these tools provide an opportunity to use such predictions to better understand the functional effects of human genetic mutation. Simply ranking all the genetic variation in a patient genome by protein stability metrics performs well (have a look at our manuscript and our webtool), and this is only the beginning!

The source data for this project are the genomes of thousands of individuals with autoimmune disease (Centre for Personalised Immunology@JCSMR). Using the genetic variation data for these individuals, we will explore ways to represent the variation within a single person as a graph. We will use a number of different sources of information to link mutations in different genes into pathways and other functionally related units. With these graphs, we will apply existing methodology to detect communities of genetic variants that correlate with disease status across cohorts of patients with the same diagnosis.
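To make the graph idea concrete, here is a minimal sketch in plain Python. It is not the project's method: the gene names and pathway memberships are made up, and the community detection step is a simplified label propagation in which ties are broken deterministically (by largest label name) rather than at random, purely so the toy example is reproducible. Two mutated genes are linked when they share at least one pathway.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: genes carrying a variant in one patient, plus
# made-up pathway memberships used only to illustrate the linking step.
patient_variant_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"]
pathways = {
    "pathway_1": ["GENE_A", "GENE_B", "GENE_C"],
    "pathway_2": ["GENE_D", "GENE_E", "GENE_F"],
    "pathway_3": ["GENE_C", "GENE_D"],
}


def build_graph(genes, pathway_members):
    """Link two mutated genes when they share at least one pathway."""
    gene_set = set(genes)
    adjacency = {gene: set() for gene in genes}
    for members in pathway_members.values():
        present = [g for g in members if g in gene_set]
        for a, b in combinations(present, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def label_propagation(adjacency, max_rounds=20):
    """Simplified label propagation: each node repeatedly adopts the most
    common label among its neighbours until no label changes."""
    labels = {node: node for node in adjacency}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):
            if not adjacency[node]:
                continue  # isolated nodes keep their own label
            counts = defaultdict(int)
            for neighbour in adjacency[node]:
                counts[labels[neighbour]] += 1
            best = max(counts.values())
            candidates = [lab for lab, n in counts.items() if n == best]
            # Keep the current label on a tie; otherwise pick the largest
            # label name (an arbitrary but deterministic choice).
            new_label = labels[node] if labels[node] in candidates else max(candidates)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(sorted(group) for group in groups.values())


adjacency = build_graph(patient_variant_genes, pathways)
communities = label_propagation(adjacency)
print(communities)
```

In practice an established graph library and a published community detection method would replace the hand-written function, and edges could carry weights drawn from several information sources instead of a single shared-pathway rule.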

Goals:

  • Explore methods to use genetic variation in a network/graph context, using different metrics to link co-inherited genetic variation.
  • Filter and prioritise input genetic variation by differing functional metrics, obtained from protein language models and predictions of protein stability effects.
  • Identify communities of co-inherited genetic variation that correlate with patient disease.
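The filtering and prioritisation goal could look something like the sketch below. Everything in it is an assumption for illustration: the `Variant` fields, the example values, the thresholds, and the sign convention that a larger predicted stability change means a more destabilising variant (tools differ on this).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str         # made-up gene label
    change: str       # made-up protein-level change
    plm_score: float  # hypothetical language-model pathogenicity score in [0, 1]
    ddg: float        # hypothetical predicted stability change; higher = more destabilising


def prioritise(variants, min_plm=0.5, min_ddg=1.0):
    """Keep variants that pass either (assumed) threshold, then rank them
    with the most destabilising first and the model score as tie-breaker."""
    kept = [v for v in variants if v.plm_score >= min_plm or v.ddg >= min_ddg]
    return sorted(kept, key=lambda v: (-v.ddg, -v.plm_score))


# Toy values only; none of these come from real data.
variants = [
    Variant("GENE_A", "R123W", 0.91, 2.4),
    Variant("GENE_B", "A45T", 0.12, 0.3),
    Variant("GENE_C", "G200D", 0.67, 0.8),
    Variant("GENE_D", "L78P", 0.30, 3.1),
]

ranked = prioritise(variants)
for v in ranked:
    print(v.gene, v.change, v.plm_score, v.ddg)
```

The retained variants would then become the nodes of each patient's graph, so the choice of thresholds directly shapes which communities can be detected downstream.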

Requirements: Python or R programming and familiarity with basic graph theory. Experience with, or interest in, biological datasets and biological questions. In the first instance, please contact Dan.Andrews@anu.edu.au.

Gain:

  • Experience with real scientific data and the challenges involved in deriving metadata and ensuring data consistency.
  • An inter-disciplinary scientific skillset.