Dr Shafin Rahman has been awarded the J.G. Crawford Prize for his PhD thesis on ways to empower computer vision to interpret unseen objects and new visual information without assistance from humans.
Rahman, who received his PhD in 2020, is now an Assistant Professor at North South University in Bangladesh. His award-winning thesis paves the way for “machines to develop an in-depth understanding of the real world around us, that goes far beyond the raw digital content that is acquired by the visual sensors”.
“I am so happy and honored at the same time,” Rahman said. “I want to convey my heartfelt thanks to my supervisors, colleagues, friends, family and ANU in general for recognizing my work.”
The J.G. Crawford Prize recognises the highest level of academic excellence across all disciplines at ANU. The first prizes were awarded in 1973 for PhD work in the Department of English and the Research School of Pacific Studies. Until now, only three people from the computer science and engineering disciplines had won the prize. Rahman is the fourth.
The promise of “zero-shot learning”
Rahman’s field of research is known in the scientific community as “zero-shot learning.” Giving artificial intelligence a human-like way of reasoning about what its visual sensors capture will have transformative impacts on technologies like driverless cars, flying drones, surveillance, object identification, medical imaging technology, and object tracking.
“Deep learning models usually require a large amount of data during supervised learning. My thesis primarily focuses on learning (recognizing, tagging, and detecting) novel object categories with little or sometimes even no direct supervision,” Rahman said.
Dr Salman Khan, who served as Rahman’s thesis advisor, is an Honorary Lecturer at the ANU School of Computing. “Shafin’s work was among the first efforts to develop inductive and transductive approaches for zero-shot detection, where a computer algorithm can both recognize and localize novel unseen objects,” he said.
“Currently robotic systems are trained on certain concepts, but they inevitably encounter situations that are beyond their training,” Rahman said.
Using the concepts presented in Rahman’s thesis, robots will be able to make decisions about new scenarios much the way a human being does, “by relating inter-relation of prior knowledge and wisdom of the past,” he said.
“Flying drones, for instance, are trained to identify different kinds of vehicles—motorcycles, cars, trucks, buses, etc. But what happens when it encounters a tricycle, a vehicle not included in its training? Traditional systems will not be able to make any decision about this situation, whereas zero-shot learning can help the drone to recognize it as a tricycle without having any training,” Rahman said.
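The drone example can be illustrated with a toy sketch. It is not the method from Rahman’s thesis; it only shows one common idea behind zero-shot recognition, in which every category, including ones never seen in training, is described by a vector of attributes and an observation is assigned to the category whose description it matches best. The attribute list, the numeric values, and the names CLASS_ATTRIBUTES, cosine_similarity and zero_shot_classify are all assumptions made up for this illustration, and the “predicted attributes” stand in for what a trained vision model would output.

```python
import math

# Hypothetical attribute descriptions, invented for illustration only:
# (wheel count / 6, has engine, enclosed cabin, carries many passengers)
CLASS_ATTRIBUTES = {
    "motorcycle": (2 / 6, 1.0, 0.0, 0.0),  # seen during training
    "car":        (4 / 6, 1.0, 1.0, 0.0),  # seen during training
    "truck":      (6 / 6, 1.0, 1.0, 0.0),  # seen during training
    "bus":        (6 / 6, 1.0, 1.0, 1.0),  # seen during training
    "tricycle":   (3 / 6, 0.0, 0.0, 0.0),  # never seen: known only by its description
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_classify(predicted_attributes, class_attributes=CLASS_ATTRIBUTES):
    """Return the class whose attribute description best matches the prediction."""
    return max(
        class_attributes,
        key=lambda name: cosine_similarity(predicted_attributes, class_attributes[name]),
    )


if __name__ == "__main__":
    # Assumed output of an attribute predictor looking at a tricycle:
    # about three wheels, almost certainly no engine, no cabin, no passengers.
    observed = (0.5, 0.1, 0.0, 0.0)
    print(zero_shot_classify(observed))
```

In this sketch the system can name a tricycle without a single tricycle image in its training data, because the category is defined by a description rather than by examples. Real systems typically learn such descriptions from text or word embeddings instead of hand-written attributes.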
Pros and cons of a rapidly emerging field
Rahman said he is drawn to computer science because of its dynamism. “It’s constantly evolving,” he said. “There are always new challenges of interest that make a real impact on human life.”
Rahman’s particular research area saw rapid advancements during the early 2010s. Top innovators were moving from academia into private industry at an accelerated pace. As Rahman was embarking on his PhD journey in 2016, these dynamics led to unexpected roadblocks.
Rahman had chosen ANU in part to study with a distinguished professor who had impressed him at a conference. “His work was very relevant to my interest,” Rahman said. “But unfortunately, he moved to the USA on a sabbatical leave.”
Rahman described the ensuing changes as difficult. “It was a stress for the time being. There were a few unpublished papers, and I was struggling with how to proceed with them,” he said. But Rahman eventually found a new team of supervisors who were “even better at supporting students.”
One of those supervisors was Khan, who had received his PhD the previous year and had not yet taken on a PhD student. Rahman points to Khan’s mentorship and his academic and publishing success as a primary source of inspiration.
“One of my supervisors, Salman Khan, is the Dean’s List winner from the University of Western Australia for his PhD work,” Rahman said. “Moreover, some of his research has been accepted as oral papers in top venues” including the Conference on Computer Vision and Pattern Recognition (CVPR) and the International Conference on Computer Vision (ICCV).
Khan’s achievements spurred Rahman to try to follow in his footsteps. “After many trials, I succeeded in publishing an oral paper in ICCV 2019,” Rahman said.
Rahman said that the Crawford Prize was unlikely to be on the radar for PhD candidates at ANU. “At least, for me, it just came at me out of the blue,” he said.
Khan said that ANU academics are aware of the J.G. Crawford Prize, but it is not an immediate goal when writing a PhD thesis. “The pressure of putting together a well-rounded and extensive research outcome document can be quite overwhelming towards the end of the PhD, when candidates are generally looking for job opportunities too,” he said. “For Shafin’s case, we were fortunate that our initial plan towards individual chapters went quite well and each of his individual chapters was based on his already published work which was extensively reviewed by multiple peers. The review process helped us refine our chapters to a fairly polished form and Shafin could therefore focus more on the global story in the thesis.”
Khan said it has been “very satisfying” to see Rahman progress nicely through his PhD and win such high recognition for his work. “I am also very thankful to my senior colleagues who were the chair of Shafin’s supervisory panel, Prof Nick Barnes and Prof Fatih Porikli, and his co-supervisor Dr Miaomiao Liu.”
“Shafin identified a key gap in the literature, where there were no unified solutions targeting different settings in low-shot learning i.e., learning from just one visual example, a few visual examples, or no example at all. He designed a unified approach for different flavors of low-shot learning which to my knowledge was a unique solution in this space,” Khan said.