Person Detection 3D Datasets

We provide five datasets captured with several sensor types, including a stereo camera, an RGB-D camera and a time-of-flight camera. Each dataset consists of two separate sequences recorded at different locations in and around our research institute. All sequences were captured with all of the named depth sensors at the same time, so every sensor observed the same situations. As a result, the behaviour and capability of person detection algorithms can be benchmarked on the same 3D scenes, with the data quality differing only because of the sensor.

Research Platform and Sensors

To capture the datasets, we utilized a Summit XL from Robotnik and equipped it with a Nerian SP1 stereo camera system, an ASUS Xtion Pro Live RGB-D camera and a Fotonic E70P time-of-flight camera.

Whereas the stereo camera and the RGB-D camera provide images at VGA resolution (640 × 480), the Fotonic E70P is limited to QQVGA (160 × 120).


All 3D person datasets can be used with our framework CS::APEX, which is available via GitHub.

Annotation Format

Annotation files are saved as .yaml files and contain an associative map. The format is as follows:
img_d_w: [width]
img_d_h: [height]
- ts: 0
  img_id: [id]
  x_d: [depth roi pos x]
  y_d: [depth roi pos y]
  w_d: [depth roi width]
  h_d: [depth roi height]
  x_rgb: [visual roi pos x]
  y_rgb: [visual roi pos y]
  w_rgb: [visual roi width]
  h_rgb: [visual roi height]
  vis: [visibility]
- ts: 0
The annotations always contain the time stamp / id of the data quadruple they belong to. Regions of interest are provided for both the depth and the intensity / color images. We define two visibility classes: '1' for fully visible persons and '2' for only partially visible persons. Implicitly, '0' is the code for 'non-human' or 'non-visible'. These are also the classification labels used by CS::APEX.
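The fields above map naturally onto a small typed record. The following is only a minimal sketch: it assumes that one list entry has already been parsed into a Python dict (for example by a YAML library) and that the ROI values are integer pixel coordinates. The names `Visibility`, `Roi`, `PersonAnnotation` and `annotation_from_entry` are hypothetical and not part of CS::APEX.

```python
from dataclasses import dataclass
from enum import IntEnum


class Visibility(IntEnum):
    """Visibility codes as described in the text above."""
    NOT_VISIBLE = 0  # implicit code: non-human or non-visible
    FULL = 1         # fully visible person
    PARTIAL = 2      # only partially visible person


@dataclass(frozen=True)
class Roi:
    """Axis-aligned region of interest: top-left position plus size."""
    x: int
    y: int
    w: int
    h: int


@dataclass(frozen=True)
class PersonAnnotation:
    ts: object       # time stamp, kept exactly as stored in the file
    img_id: object   # id of the data quadruple, kept as stored
    depth_roi: Roi   # ROI in the depth image
    visual_roi: Roi  # ROI in the color / intensity image
    vis: Visibility


def annotation_from_entry(entry):
    """Build a PersonAnnotation from one already-parsed list entry (a dict).

    Assumption: ROI values are integer pixel coordinates.
    """
    depth = Roi(*(int(entry[k]) for k in ("x_d", "y_d", "w_d", "h_d")))
    visual = Roi(*(int(entry[k]) for k in ("x_rgb", "y_rgb", "w_rgb", "h_rgb")))
    return PersonAnnotation(
        ts=entry["ts"],
        img_id=entry["img_id"],
        depth_roi=depth,
        visual_roi=visual,
        vis=Visibility(int(entry["vis"])),
    )
```

Using an enum for `vis` makes an unexpected visibility code fail loudly (`ValueError`) instead of passing silently into an evaluation.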

File System Structure

A dataset consists of two separate sequences, packed into separate folders within the archive. Each of these folders contains four directories:
  • depth
    the depth images
  • pointcloud
    the rgb / intensity pointclouds
  • roi
    the annotation files
  • visual
    the color / intensity images
There are always four files, one from each folder, which belong together and have the same name but different extensions. To order all files into a continuous sequence, the file names are based on Unix time stamps relative to the recording sessions.
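The pairing rule above (same name, different extension, one file per folder) can be sketched as follows. This is an illustrative helper, not part of the dataset tooling: the function name `collect_frames` is hypothetical, file extensions are not assumed, and sorting by name only yields chronological order if all time-stamp names have the same length.

```python
from pathlib import Path

# The four directories of one sequence, as listed above.
FOLDERS = ("depth", "pointcloud", "roi", "visual")


def collect_frames(sequence_dir):
    """Group the files of one sequence by their shared base name.

    Returns a list of (name, {folder: path}) tuples sorted by name.
    Names that are missing in any of the four folders are skipped.
    """
    root = Path(sequence_dir)
    per_folder = {
        folder: {p.stem: p for p in (root / folder).iterdir() if p.is_file()}
        for folder in FOLDERS
    }
    # Keep only base names that occur in all four folders.
    common = set.intersection(*(set(files) for files in per_folder.values()))
    return [
        (name, {folder: per_folder[folder][name] for folder in FOLDERS})
        for name in sorted(common)
    ]
```

Each returned tuple then holds the depth image, point cloud, annotation file and visual image of one frame, for example `frames[0][1]["roi"]` for the first annotation file.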


Indoor Datasets

Dataset                    Frames    Annotations
Fotonic E70P (tof)           5896          16058
Nerian SP1 (stereo)          4804          12118
Xtion Pro Live (RGB-D)       5477          14031

CS::APEX configuration files for 3D person detection on the indoor datasets can be found here.


Outdoor Datasets

Dataset                    Frames    Annotations
Fotonic E70P (tof)           5983          13052
Nerian SP1 (stereo)          6157          11280

CS::APEX configuration files for 3D person detection on the outdoor datasets can be found here.

Excerpts from the different datasets

  • Indoor scene, Fotonic E70P
  • Indoor scene, ASUS Xtion Pro Live
  • Outdoor scene (courtyard), Nerian SP1
  • Outdoor scene (forest), Nerian SP1

Recommended Evaluation Metric

To make person detection approaches generally comparable, we recommend using the Intersection over Union (IoU) metric for evaluation. It is a commonly used metric for benchmarking problems that involve 2D bounding rectangles.

The IoU measures the quality of ground truth / detection association, not only in terms of position but also in terms of shape.
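For two axis-aligned rectangles, the IoU is the area of their overlap divided by the area of their union. The following minimal sketch computes it for rectangles given as (x, y, width, height), matching the ROI fields of the annotation format; the function name `iou` and the tuple layout are illustrative choices, not a prescribed interface.

```python
def iou(a, b):
    """Intersection over Union of two rectangles given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b

    # Width and height of the overlap; non-positive means no overlap.
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0

    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union
```

A detection that matches the ground truth exactly yields 1.0, disjoint rectangles yield 0.0, and a detection with the right position but the wrong size is penalised through the larger union.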

Terms of Use

We provide the datasets, information about the datasets, and the associated material (altogether subsequently denoted by the "Software") because we hope that they are useful to you. We have collected all data thoroughly and described them to the best of our knowledge.

Copyright (c) 2017 Chair of Cognitive Systems

Permission is hereby granted, free of charge, to any person downloading the Software, to use the datasets for research and academic purposes, including the right to publish experimental results obtained by using the data. The Software, however, must not be sold or redistributed.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Richard Hanten