Wei Zhuo: 2D+3D Indoor Scene Understanding from a Single Monocular Image

Indoor scene understanding is a broad field that has attracted great interest in recent years. Existing methods in this field typically rely on depth sensors, such as Kinect and LiDAR, which are not always available or cheap to acquire. In our work, we aim to explore indoor scene understanding in the general case where only a monocular color image of the scene is available. To this end, we developed a model that reasons about detailed local depth by leveraging the geometric structure of the scene at multiple scales. With the estimated depth, we were able to generate 3D object box proposals using our novel integrated, differentiable framework, which estimates depth, extracts a volumetric scene representation, and generates 3D proposals. In addition, we tackled the scene parsing problem through its three subtasks: instance segmentation, semantic labeling, and support relationship inference. In summary, by addressing these important subtasks of indoor scene understanding, we introduce novel methodologies for each of these challenging problems and, in the end, provide rich and effective scene representations at various scales and from various perspectives.

Details of the presentation can be found in her publications:

[1] Indoor scene structure analysis for single image depth estimation. CVPR 2015

[2] Indoor Scene Parsing with Instance Segmentation, Semantic Labeling and Support Relationship Inference. CVPR 2017

[3] 3D Box Proposal from a Monocular Image for Indoor Scenes. AAAI 2018


Wei Zhuo is a Ph.D. student at ANU, supervised by Mathieu Salzmann, Xuming He, and Miaomiao Liu. Her current research field is indoor scene understanding from a single monocular image.

Date & time

3pm 27 Nov 2017


Room: R107, Design Studio Teaching



Updated:  10 August 2021/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing