Machine learning for precision medicine through cytometry and genomic applications
Scholarships: Both honours and PhD scholarships are available for this project.
Description: The genetic mechanisms that cause similar, heritable diseases are often varied and difficult to determine. Many diseases with the same name can be caused by mutations in different genes, with differing and partially overlapping symptoms. In this project, we will use high-resolution diagnostic assay data to train models that can identify and classify sub-types of autoimmune diseases. Once we can classify disease sub-type, we will turn to the genomic information from these individuals to identify patterns in the genetic data that both predict disease incidence and point to the pathogenic mechanism of the disease.
We will use a large data corpus obtained from blood samples with a Fluorescence-Activated Cell Sorter (FACS). These assays use lasers and fluorescent markers that allow individual cells in blood to be identified and counted. The resulting compositional data, describing the different cell types present in blood, can be used for precision diagnosis, as changes in the proportions of particular cell types have been shown to identify many different immune diseases. With this large, multiparametric dataset, we will train models to differentiate disease samples from healthy controls. We will expand this to also identify specific disease categories and even identify sub-clinical manifestations of disease.
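As a rough illustration of what "compositional data" means here, the sketch below turns per-cell-type counts from one sample into proportions and then applies a centred log-ratio (CLR) transform, a common way to prepare proportions for a classifier. The cell-type names, the counts and the choice of CLR are assumptions made for this example only; the project does not prescribe a particular transform.

```python
import math


def to_proportions(counts):
    """Convert per-cell-type counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no counted cells")
    return {cell_type: n / total for cell_type, n in counts.items()}


def clr(proportions, pseudocount=1e-6):
    """Centred log-ratio: log of each part minus the mean of all logs.

    The pseudocount (an assumed value) avoids log(0) for absent cell types.
    """
    logs = {k: math.log(v + pseudocount) for k, v in proportions.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {k: v - mean_log for k, v in logs.items()}


# Hypothetical counts for a single blood sample (not real data).
sample = {"CD4_T": 4200, "CD8_T": 2100, "B": 900, "NK": 600, "monocyte": 2200}

props = to_proportions(sample)
features = clr(props)
```

The transformed values sum to zero within a sample, so they describe each cell type relative to the others rather than in absolute terms, which is the property a model comparing disease samples with healthy controls would rely on.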
- Programmatically normalise and manipulate large amounts of cytometry data to produce a uniform corpus of experimental data, to allow querying and generation of derived data.
- Pursue deep learning approaches to predict disease/healthy status.
- Use graphs to identify communities of candidate genetic variation that correlate with disease sub-types.
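The graph-based aim could be prototyped along the following lines. This is a minimal sketch under stated assumptions, not the project's method: variants are linked when enough individuals carry both, and connected components stand in for communities (a modularity-based method would likely replace this step in practice). The variant identifiers, individual identifiers and the `min_shared` threshold are all hypothetical.

```python
from itertools import combinations


def build_cooccurrence_graph(carriers, min_shared=2):
    """Link two variants when at least `min_shared` individuals carry both."""
    graph = {variant: set() for variant in carriers}
    for a, b in combinations(sorted(carriers), 2):
        if len(carriers[a] & carriers[b]) >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group variants into components (a simple stand-in for communities)."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(sorted(component))
    return components


# Hypothetical map from variant to the set of individuals carrying it.
carriers = {
    "var_A": {"p1", "p2", "p3"},
    "var_B": {"p1", "p2"},
    "var_C": {"p2", "p3"},
    "var_D": {"p7", "p8"},
    "var_E": {"p7", "p8", "p9"},
    "var_F": {"p4"},
}

communities = connected_components(build_cooccurrence_graph(carriers))
```

Each resulting group could then be compared against the disease sub-type labels of the individuals who carry its variants.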
Requirements: An interest in biological data, some experience with machine learning, and competence with Python or R.