The transformation of data from high to low dimensions is an essential part of Data Analytics and Machine Learning as a whole. Popular examples of such transformations include Matrix Factorisation (e.g. PCA, KPCA), Graph Embeddings (e.g. t-SNE, UMAP), and many more.
This project seeks to draw connections between transformed and untransformed data, so that insights obtained from the transformed data can be carried back to the untransformed version. This is particularly useful for very high dimensional data, where it may be difficult to build the desired models directly; similar insights could instead be obtained from a lower dimensional transformed version of the data. Alternatively, insight into the relative effects of the data transformation on the analysis outcomes would also be desirable. Research will include:
- Develop a principled approach to choosing a dimension transformation and to determining the effects it may have upon a model.
- Develop one or more schemes to transfer effects (e.g. feature importance) from one dimensionality to another, based on the transformation scheme (an illustrative sketch follows this list).
- Assess the quality of these insights on several real-world datasets.
- Implement code to carry out the above tasks.
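
As a rough illustration of the transfer idea, the sketch below maps feature importances computed in a PCA-reduced space back to the original features via the component loadings. All concrete choices here are assumptions for the example (scikit-learn, PCA as the transformation, a random forest for importances, and a loadings-weighted back-mapping rule); it is not the method the project would necessarily develop.

```python
# Minimal sketch, assuming: scikit-learn is available, PCA is the chosen
# transformation, and a loadings-weighted rule is used to carry importances
# back to the original features. All of these are illustrative choices.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

data = load_wine()
X = StandardScaler().fit_transform(data.data)
y = data.target

# Transform the data from high to low dimension.
pca = PCA(n_components=5)
Z = pca.fit_transform(X)

# Fit a model in the reduced space and read off per-component importances.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Z, y)
pc_importance = clf.feature_importances_          # shape: (n_components,)

# Illustrative transfer rule: weight each original feature by how strongly it
# loads on each principal component, scaled by that component's importance.
loadings = np.abs(pca.components_)                # shape: (n_components, n_features)
original_importance = pc_importance @ loadings    # shape: (n_features,)
original_importance /= original_importance.sum()

for name, score in sorted(zip(data.feature_names, original_importance),
                          key=lambda t: -t[1])[:5]:
    print(f"{name}: {score:.3f}")
```

A back-mapping of this kind only makes sense for linear transformations such as PCA; devising analogous transfer schemes for non-linear embeddings such as t-SNE or UMAP is part of what the project would investigate.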
Requirements
Background and experience in basic Machine Learning (e.g. COMP3670/4670/4660/4650, STAT3040/4040) is required, along with the fundamentals of Linear Algebra (e.g. MATH1014/1115/1116/+ or equivalents). Experience with Python/R/Matlab is strongly desirable.