Large-scale deep learning models with millions or even billions of parameters are commonly used to achieve exceptional performance. However, training such models on existing electronic hardware such as GPUs can take anywhere from weeks to months. Photonic computing, which encodes data on light waves for computation, has emerged as an exciting paradigm for accelerating deep neural network (DNN) training with low-power, light-speed processing. In 2020, Lightmatter released a record-setting photonic chip that accelerates multiply-accumulate (MAC) operations, the dominant operations in deep learning, running 10x faster with 90% less energy than a GPU. In 2023, MIT researchers unveiled an open-source photonic computing developer kit that reduces the inference time of representative deep learning models by more than 300x compared to an Nvidia A100 GPU.
Research Questions and Tasks
Most existing photonic accelerators for deep learning target only the inference stage, because training demands high-precision computing and involves numerous non-linear computations. This project aims to design photonic computing solutions that accelerate the training of deep learning models, which is far more computation-, memory-, and energy-intensive than inference. Specifically, the project will focus on the following tasks:
- Implement the training of a convolutional neural network based on MIT's open-source Lightning developer kit (https://lightning.mit.edu/)
- Conduct simulations to evaluate the developed photonic training scheme against a GPU baseline (see the sketch after this list)
- Develop a solution for gradient approximation using photonic computing [only for 24 credits]
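To give a flavour of what the simulation task involves, the sketch below trains a tiny CNN in PyTorch while routing the classifier's matrix multiply through a mock "photonic" MAC unit that quantizes its output to a fixed bit width, standing in for limited analog precision. The `PhotonicMatmul` helper, the 8-bit uniform quantization noise model, the layer sizes, and the random stand-in data are all illustrative assumptions, not part of the Lightning kit's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PhotonicMatmul(torch.autograd.Function):
    """Mock photonic MAC: the forward matrix multiply is quantized
    (a placeholder for limited analog precision), while the backward
    pass uses a straight-through estimator at full precision."""

    @staticmethod
    def forward(ctx, x, w, bits=8):
        ctx.save_for_backward(x, w)
        y = x @ w.t()
        # Uniform quantization of the analog output (assumed noise model).
        scale = y.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
        return torch.round(y / scale) * scale

    @staticmethod
    def backward(ctx, grad_out):
        x, w = ctx.saved_tensors
        # Full-precision gradients; quantization is ignored (straight-through).
        return grad_out @ w, grad_out.t() @ x, None

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.w = nn.Parameter(torch.randn(10, 8 * 28 * 28) * 0.01)

    def forward(self, x):
        x = F.relu(self.conv(x))                 # convolution = many MACs
        return PhotonicMatmul.apply(x.flatten(1), self.w)

model = TinyCNN()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for step in range(100):
    x = torch.randn(32, 1, 28, 28)              # stand-in for real images
    y = torch.randint(0, 10, (32,))
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Swapping `PhotonicMatmul` for a plain `nn.Linear` gives the full-precision GPU baseline for comparison. Note that the backward pass here still runs at full precision, which mirrors the difficulty described above: training needs higher precision than the analog forward path naturally provides, and closing that gap (e.g., via the gradient-approximation task) is where the research lies.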
Supervision
This project will be supervised in collaboration with Visiting Associate Professor Haibo Zhang (https://comp.anu.edu.au/people/haibo-zhang/). Please contact A/Prof Zhang for more information.
References
- Zhizhen Zhong, Mingran Yang, Jay Lang, Christian Williams, Liam Kronman, Alexander Sludds, Homa Esfahanizadeh, Dirk Englund, and Manya Ghobadi. Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference. In Proceedings of ACM SIGCOMM, 2023. https://doi.org/10.1145/3603269.3604821
- Chengpeng Xia, Yawen Chen, Haibo Zhang, and Jigang Wu. STADIA: Photonic Stochastic Gradient Descent for Neural Network Accelerators. ACM Transactions on Embedded Computing Systems, 22(5): 1-23, 2023. https://doi.org/10.1145/3607920
Requirements
Background and experience in basic machine learning (e.g., COMP3670/4670/4660/4650, STAT3040/4040). Experience with Python is strongly desirable, and familiarity with Verilog would be advantageous.