Torr Vision Group, University of Oxford

SemanticPaint: Interactive Segmentation and Learning of 3D Worlds

Stuart Golodetz*, Michael Sapienza*, Julien Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor Adrian Prisacariu, Olaf Kaehler, Carl Yuheng Ren, Stephen Hicks, David Murray, Shahram Izadi and Philip Torr


Abstract

We present an open-source, real-time implementation of the interactive SemanticPaint system for geometric reconstruction, object-class segmentation and learning of 3D scenes described in [Valentin15]. Using our system, a user can walk into a room wearing a depth camera and a virtual reality headset, and both densely reconstruct the 3D scene [Newcombe11,Niessner13,Kaehler15] and interactively segment the environment into object classes such as 'chair', 'floor' and 'table'. The user interacts physically with the real-world scene, touching objects and using voice commands to assign them appropriate labels. These user-generated labels are leveraged by an online random forest-based machine learning algorithm, which is used to predict labels for previously unseen parts of the scene. The entire pipeline runs in real time, and the user stays 'in the loop' throughout the process, receiving immediate feedback about the progress of the labelling and interacting with the scene as necessary to refine the predicted segmentation.
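To make the flow described above concrete, the following is a minimal C++ sketch of one iteration of such an interactive loop. All of the type and function names used here (DepthCamera, VoxelVolume, VoiceInterface, OnlineForest, Renderer, processFrame) are illustrative placeholders rather than the classes in our released code; please consult the repository and [Valentin15] for the real implementation.

#include <vector>

// Hypothetical per-voxel record: a descriptor and a semantic label.
// (These types and names are illustrative; they are not the actual API of the released code.)
struct Voxel {
  std::vector<float> feature;  // appearance/geometry descriptor for this voxel
  int label = -1;              // -1 means "not yet labelled"
};

// Stand-ins for the real components: depth camera, voxel volume, voice input,
// online random forest and renderer.
struct DepthCamera {
  bool grabFrame(std::vector<float>& depth) { depth.assign(640 * 480, 1.0f); return true; }
};

struct VoxelVolume {
  std::vector<Voxel> voxels;
  void integrate(const std::vector<float>& /*depth*/) { /* fuse the frame into the volume */ }
  std::vector<Voxel*> touchedVoxels() { return {}; }   // voxels under the user's hand
  std::vector<Voxel*> allVoxels() {
    std::vector<Voxel*> ptrs;
    for (Voxel& v : voxels) ptrs.push_back(&v);
    return ptrs;
  }
};

struct VoiceInterface {
  bool currentLabel(int& /*label*/) { return false; }  // class label spoken by the user, if any
};

struct OnlineForest {
  void addSample(const std::vector<float>& /*feature*/, int /*label*/) {}
  int predict(const std::vector<float>& /*feature*/) const { return 0; }
};

struct Renderer {
  void showLabelledScene(const VoxelVolume& /*volume*/) {}
};

// One iteration of the interactive loop described above: reconstruct,
// propagate user-supplied labels, update the forest, predict, give feedback.
void processFrame(DepthCamera& camera, VoxelVolume& volume, VoiceInterface& voice,
                  OnlineForest& forest, Renderer& renderer)
{
  std::vector<float> depth;
  if (!camera.grabFrame(depth)) return;

  // 1. Fuse the new depth frame into the dense voxel model.
  volume.integrate(depth);

  // 2. If the user is touching the scene and has spoken a class name, write that
  //    label into the touched voxels and feed them to the forest as training data.
  int label = -1;
  if (voice.currentLabel(label)) {
    for (Voxel* v : volume.touchedVoxels()) {
      v->label = label;
      forest.addSample(v->feature, label);
    }
  }

  // 3. Use the incrementally trained forest to predict labels for the parts of
  //    the scene the user has not yet touched.
  for (Voxel* v : volume.allVoxels()) {
    if (v->label == -1) v->label = forest.predict(v->feature);
  }

  // 4. Render the labelled reconstruction so the user sees the current
  //    segmentation immediately and can refine it by further interaction.
  renderer.showLabelledScene(volume);
}

In this sketch the forest is updated incrementally as new user labels arrive rather than being retrained from scratch, which helps the whole loop run at interactive rates and keeps the user 'in the loop'.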

Code

We are making the SemanticPaint framework open-source in the hope that others will find it a useful foundation for further research. The code can be found on our GitHub page.

Relevant Publications

  • SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes. Technical Report No. TVG-2015-1, Department of Engineering Science, University of Oxford, released as arXiv e-print 1510.03727, October 2015. PDF BibTeX
  • SemanticPaint: Interactive Segmentation and Learning of 3D Worlds. ACM SIGGRAPH 2015 Emerging Technologies, p.22, Los Angeles, USA, August 2015. PDF BibTeX
  • SemanticPaint: Interactive 3D Labeling and Learning at your Fingertips. ACM Transactions on Graphics, 34(5), August 2015. PDF BibTeX

Related Projects

SemanticPaint is built on top of InfiniTAM v2, a highly efficient, open-source 3D reconstruction engine developed by Oxford's Active Vision Group. Please see here for further details.

References

  • [Kaehler15] Olaf Kaehler, Victor Adrian Prisacariu, Carl Yuheng Ren, Xin Sun, Philip Torr and David Murray. Very High Frame Rate Volumetric Integration of Depth Images on Mobile Devices. IEEE Transactions on Visualization and Computer Graphics, 21(11), November 2015.
  • [Newcombe11] Richard Newcombe, Shahram Izadi, Otmar Hilliges, David Molyneaux, David Kim, Andrew Davison, Pushmeet Kohli, Jamie Shotton, Steve Hodges and Andrew Fitzgibbon. KinectFusion: Real-Time Dense Surface Mapping and Tracking. ISMAR, 2011.
  • [Niessner13] Matthias Niessner, Michael Zollhoefer, Shahram Izadi, and Marc Stamminger. Real-time 3D Reconstruction at Scale using Voxel Hashing. ACM Transactions on Graphics, 32(6):169, 2013.
  • [Valentin15] Julien Valentin, Vibhav Vineet, Ming-Ming Cheng, David Kim, Shahram Izadi, Jamie Shotton, Pushmeet Kohli, Matthias Niessner, Antonio Criminisi, and Philip H S Torr. SemanticPaint: Interactive 3D Labeling and Learning at your Fingertips. ACM Transactions on Graphics, 34(5), August 2015.