
Make3D Range Image Data

This dataset contains aligned image and range data:
Make3D Image and Laser Depthmap
Image and Laser and Stereo
Image and 1D Laser
Image and Depth for Objects
Video and Depth (coming soon)

The dataset contains several types of examples: outdoor scenes (about 1000), indoor scenes (about 50), synthetic objects (about 7000), etc.


Make3D Laser+Image data

(Used in "Learning Depth from Single Monocular Images", NIPS 2005.)
Dataset-1 (total 534) (readme), cite [1][3]:
    Training: 400 images, 400 aligned depthmaps
    Test: 134 images, 134 depths
    Features: for the 134 test images [3,4]; for the 400 training images (coming soon)
    Results: State of the art results**

Dataset-2 (total 445-33=425*) (readme2), cite [1][2]:
    Data: 100 images, 100 depths (also available in another format);
          350 images, 350 depths; 8 images, 8 depths
    Results: State of the art results**

(A minimal sketch for loading the depth maps follows the references below.)

** Email Prof. Saxena in order to submit your results on this list.

Required: Any report or publication using this data must cite [1] and [3] for Dataset-1, or [1] and [2] for Dataset-2:

[1] Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. NIPS 2005.
[2] 3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.
[3] Make3D: Learning 3D Scene Structure from a Single Still Image, Ashutosh Saxena, Min Sun, Andrew Y. Ng. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 31, no. 5, pp. 824-840, 2009.
[4] Learning 3-D Scene Structure from a Single Still Image, Ashutosh Saxena, Min Sun, Andrew Y. Ng. In ICCV workshop on 3D Representation for Recognition (3dRR-07), 2007.
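
The depth maps themselves are commonly distributed as MATLAB .mat files. Below is a minimal loading sketch in Python; the file name and the "Position3DGrid" variable (a 55 x 305 x 4 grid whose last channel holds depth in meters) are assumptions based on public mirrors of this dataset, not guaranteed by this page:

    import scipy.io

    # Assumed layout: each depth file is a .mat containing "Position3DGrid",
    # shape (55, 305, 4), channels (x, y, z, depth). File name is hypothetical.
    mat = scipy.io.loadmat("depth_sph_corr-example.mat")
    grid = mat["Position3DGrid"]
    depth_m = grid[:, :, 3]            # per-pixel depth in meters, 55 x 305
    print(depth_m.shape, depth_m.min(), depth_m.max())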

Stereo+Laser+Image Data

Image+LaserDepth+Stereo data

The depths here are raw logs from the laser scanner, in the following ASCII format. Each row represents one vertical scan and contains, in order:
"PTLASER": marks the row as a laser reading (true for every row).
Timestamp, e.g. "1130540406.855020" (not needed).
Panning angle, e.g. "39.874818": needed to construct the 2-D map from the vertical scans.
Tilt angle, e.g. "0.000000" (not used).
Number of readings in the vertical scan, fixed at 180.
The next 180 numbers: the actual depth readings in meters for that vertical column.
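
As a concrete illustration, here is a minimal Python sketch for parsing one row, assuming exactly the field layout described above (the function name is ours):

    def parse_scan_row(line):
        # One row = one vertical scan, as whitespace-separated ASCII fields.
        tokens = line.split()
        assert tokens[0] == "PTLASER"       # laser marker, true for every row
        timestamp = float(tokens[1])        # e.g. 1130540406.855020 (not needed)
        pan_deg = float(tokens[2])          # panning angle, e.g. 39.874818
        tilt_deg = float(tokens[3])         # tilt angle (not used)
        n = int(tokens[4])                  # number of readings, fixed at 180
        depths_m = [float(t) for t in tokens[5:5 + n]]   # depths in meters
        return pan_deg, depths_m

    # Example usage (log file path is hypothetical):
    # rows = [parse_scan_row(l) for l in open("scan.log") if l.strip()]

Each parsed row gives one vertical column of depths; sweeping the panning angle across successive rows assembles the full 2-D range map.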

Use of this data should cite:
Depth Estimation using Monocular and Stereo Cues, Ashutosh Saxena, Jamie Schulte, Andrew Y. Ng. In IJCAI 2007.
3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.

Car Driving 1-d depth data

1-D depth data (useful for robotic applications)

Use of this data should cite:
High Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning, Jeff Michels, Ashutosh Saxena, Andrew Y. Ng. In ICML 2005.
3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.

Depth+Image for synthetic objects

Available here.

Use of this data should cite:
Robotic Grasping of Novel Objects, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Andrew Y. Ng. In NIPS 19, 2006.
Learning to Grasp Novel Objects using Vision, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Chioma Osondu, Andrew Y. Ng. 10th International Symposium on Experimental Robotics (ISER), 2006.

Depth+Image for indoors/objects

External links to:
USF Range Image Database
Middlebury data

Note: This data is free to use, as long as you cite its use in any report, presentation, code, etc. Further, no permissions were obtained from people who may appear in these images; therefore, by downloading these files, you agree not to hold the authors, Stanford University, or Cornell University liable for any damage, lawsuits, or other loss resulting from the possession or use of these files.