With advances in DNA sequencing, metagenomic studies make use of genetic material obtained directly from environmental samples. One fundamental question in metagenomics is which microorganisms are present in a sample, a task usually referred to as taxonomic profiling. However, different taxonomic profiling tools (whether reference-based or reference-free) can produce divergent results even on the same input, which leaves biologists with the challenge of how to interpret them.
In this project, we will explore how to design machine-learning approaches that address this problem in general, as well as adaptable taxonomic profiling approaches for a user-specified domain (e.g., fungi, viruses, or plasmids).
- Algorithmic and programming skills.