Vision meets robotics: The KITTI dataset
Citations
Cites background from "Vision meets robotics: The KITTI da..."
...Out of the previously discussed datasets, only Cityscapes and KITTI provide instance-level annotations for humans and vehicles....
[...]
...As no official pixel-wise annotations exist for KITTI, several independent groups have annotated approximately 700 frames [22, 29, 32, 33, 58, 64, 77, 80]....
[...]
...Also in this area, research progress can be heavily linked to the existence of datasets such as the KITTI Vision Benchmark Suite [19], CamVid [7], Leuven [35], and Daimler Urban Segmentation [61] datasets....
[...]
...we observe that, in comparison to KITTI, Cityscapes covers a larger distance range....
[...]
...Regarding the first two aspects, we compare Cityscapes to other datasets with semantic pixel-wise annotations, i.e. CamVid [7], DUS [62], and KITTI [19]....
[...]
References
"Vision meets robotics: The KITTI da..." refers background or methods in this paper
...For details about the benchmarks and evaluation metrics we refer the reader to Geiger et al. (2012a)....
[...]
...Next, we optimized an error criterion based on the Euclidean distance of 50 manually selected correspondences and a robust measure on the disparity error with respect to the three top performing stereo methods in the KITTI stereo benchmark Geiger et al. (2012a)....
[...]
...While our introductory paper (Geiger et al., 2012a) mainly focuses on the benchmarks, their creation and use for evaluating state-of-the-art computer vision methods, here we complement this information by providing technical details on the raw data itself....
[...]
...We have registered the Velodyne laser scanner with respect to the reference camera coordinate system (camera 0) by initializing the rigid body transformation using Geiger et al. (2012b)....
[...]
...For a review on related work, we refer the reader to Geiger et al. (2012a)....
[...]
"Vision meets robotics: The KITTI da..." refers methods in this paper
...Given two trajectories this problem corresponds to the well-known hand-eye calibration problem which can be solved using standard tools (Horaud and Dornaika, 1995)....
[...]
"Vision meets robotics: The KITTI da..." refers background in this paper
...The main purpose of this dataset is to push forward the development of computer vision and robotic algorithms targeted at autonomous driving (Paul and Newman, 2010; Pfeiffer and Franke, 2010; Geiger et al., 2011a,b; Wojek et al., 2012; Singh and Kosecka, 2012; Brubaker et al., 2013)....
[...]
Frequently Asked Questions (12)
Q2. What future work is mentioned in the paper "Vision meets robotics: The KITTI dataset"?
In the future the authors plan on expanding the set of available sequences by adding additional 3D object labels for currently unlabeled sequences and recording new sequences, for example in difficult lighting situations such as at night, in tunnels, or in the presence of fog or rain. Furthermore, the authors plan on extending their benchmark suite with novel challenges.
Q3. What hardware is housed in the trunk of the recording vehicle?
The trunk of their vehicle houses a PC with two six-core Intel XEON X5650 processors and a shock-absorbed RAID 5 hard disk storage with a capacity of 4 Terabytes.
Q4. How do the authors use the timestamps of the Velodyne 3D laser scanner?
In order to synchronize the sensors, the authors use the timestamps of the Velodyne 3D laser scanner as a reference and consider each spin as a frame.
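As an illustration of the kind of nearest-timestamp association this synchronization implies, here is a minimal C++ sketch; matchNearest and the pre-parsed timestamp vectors are hypothetical names, not part of the released toolkit:

#include <algorithm>
#include <cmath>
#include <vector>

// For every Velodyne spin timestamp, return the index of the closest
// camera timestamp. Both vectors are assumed sorted ascending, in seconds.
std::vector<size_t> matchNearest(const std::vector<double>& velo,
                                 const std::vector<double>& cam) {
  std::vector<size_t> match;
  if (cam.empty()) return match;
  match.resize(velo.size());
  for (size_t i = 0; i < velo.size(); ++i) {
    // First camera timestamp not earlier than the Velodyne timestamp...
    size_t hi = std::lower_bound(cam.begin(), cam.end(), velo[i]) - cam.begin();
    if (hi == cam.size()) hi = cam.size() - 1;
    // ...and its predecessor; keep whichever is closer in time.
    size_t lo = hi > 0 ? hi - 1 : 0;
    match[i] = std::fabs(cam[lo] - velo[i]) <= std::fabs(cam[hi] - velo[i])
                   ? lo : hi;
  }
  return match;
}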
Q5. In what forms is the data provided?
Besides the raw recordings ('raw data'), the authors also provide post-processed data ('synced data'), i.e., rectified and synchronized video streams, on the dataset website.
Q6. How are the Velodyne laser scans stored?
For efficiency, the Velodyne scans are stored as floating point binaries that are easy to parse using the C++ or MATLAB code provided.
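The provided devkit is the authoritative parser; the following is a small C++ sketch of a reader under the commonly documented layout of four 32-bit floats per point (x, y, z, reflectance). VeloPoint and readVeloScan are illustrative names:

#include <cstdio>
#include <vector>

// One Velodyne return: Cartesian coordinates plus reflectance,
// assuming four consecutive 32-bit floats per point in the .bin file.
struct VeloPoint { float x, y, z, reflectance; };

std::vector<VeloPoint> readVeloScan(const char* path) {
  std::vector<VeloPoint> points;
  if (FILE* fp = std::fopen(path, "rb")) {
    VeloPoint p;
    // Read point records until the end of the binary file.
    while (std::fread(&p, sizeof(VeloPoint), 1, fp) == 1)
      points.push_back(p);
    std::fclose(fp);
  }
  return points;
}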
Q7. How do the authors plan to expand the set of available sequences?
In the future the authors plan on expanding the set of available sequences by adding additional 3D object labels for currently unlabeled sequences and recording new sequences, for example in difficult lighting situations such as at night, in tunnels, or in the presence of fog or rain.
Q8. About which axis does the Velodyne laser scanner rotate?
Note that the Velodyne laser scanner rotates continuously around its vertical axis (counter-clockwise), which can be taken into account using the timestamp files.
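One simple way to take the rotation into account, sketched below and not taken from the paper: assuming a spin's start and end timestamps bracket one full counter-clockwise revolution, a point's capture time can be approximated by linear interpolation over its azimuth. The azimuth convention here is an assumption:

#include <cmath>

// Approximate capture time of a single point within one spin by
// interpolating between the spin's start and end timestamps.
double pointTime(double x, double y, double tStart, double tEnd) {
  const double kPi = 3.14159265358979323846;
  double azimuth = std::atan2(y, x);            // in [-pi, pi]
  double frac = (azimuth + kPi) / (2.0 * kPi);  // fraction of one revolution
  return tStart + frac * (tEnd - tStart);
}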
Q9. In which coordinate systems are accelerations and angular rates specified?
Accelerations and angular rates are both specified using two coordinate systems, one attached to the vehicle body (x, y, z) and one mapped to the tangent plane of the earth's surface at that location (f, l, u).
Q10. What information do the GPS/IMU records contain?
The geographic coordinates including altitude, global orientation, velocities, accelerations, angular rates, accuracies and satellite information.
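As a sketch only: each OXTS record is a line of whitespace-separated numbers, so a minimal C++ parser might look like the following. The exact field order should be verified against the dataset's format description rather than taken from this example; readOxtsLine is a hypothetical name:

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Parse one GPS/IMU record into a flat vector of doubles. The first
// fields are assumed to be latitude, longitude, altitude, roll, pitch,
// yaw, followed by velocities, accelerations, angular rates, accuracies
// and satellite information; check dataformat.txt for the exact order.
std::vector<double> readOxtsLine(const std::string& path) {
  std::ifstream in(path);
  std::string line;
  std::getline(in, line);
  std::istringstream iss(line);
  std::vector<double> fields;
  double v;
  while (iss >> v) fields.push_back(v);
  return fields;
}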
Q11. How are Velodyne points projected into the camera images?
The rigid body transformation from Velodyne coordinates to camera coordinates is given in calib_velo_to_cam.txt:
• $\mathbf{R}^{\text{cam}}_{\text{velo}} \in \mathbb{R}^{3\times3}$: rotation matrix, velodyne → camera
• $\mathbf{t}^{\text{cam}}_{\text{velo}} \in \mathbb{R}^{1\times3}$: translation vector, velodyne → camera
Using
$$\mathbf{T}^{\text{cam}}_{\text{velo}} = \begin{pmatrix} \mathbf{R}^{\text{cam}}_{\text{velo}} & \mathbf{t}^{\text{cam}}_{\text{velo}} \\ 0 & 1 \end{pmatrix} \tag{6}$$
a 3D point $\mathbf{x}$ in Velodyne coordinates gets projected to a point $\mathbf{y}$ in the $i$-th camera image as
$$\mathbf{y} = \mathbf{P}^{(i)}_{\text{rect}} \, \mathbf{R}^{(0)}_{\text{rect}} \, \mathbf{T}^{\text{cam}}_{\text{velo}} \, \mathbf{x} \tag{7}$$
For registering the IMU/GPS with respect to the Velodyne laser scanner, the authors first recorded a sequence with an '∞'-loop and registered the (untwisted) point clouds using the Point-to-Plane ICP algorithm.
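A small C++ sketch of Eq. (7), assuming the calibration matrices have already been read from the calibration files; projectToImage and the matrix typedefs are illustrative, and the matrix contents are placeholders:

#include <array>

using Mat34 = std::array<std::array<double, 4>, 3>;  // P_rect^(i)
using Mat44 = std::array<std::array<double, 4>, 4>;  // R_rect^(0) * T_velo^cam
using Vec4  = std::array<double, 4>;                 // homogeneous 3D point

// y = P * RT * x, then dehomogenize to pixel coordinates (u, v).
std::array<double, 2> projectToImage(const Mat34& P, const Mat44& RT,
                                     const Vec4& x) {
  Vec4 xc{};  // point mapped into the rectified camera-0 frame
  for (int r = 0; r < 4; ++r)
    for (int c = 0; c < 4; ++c) xc[r] += RT[r][c] * x[c];
  double y[3] = {0.0, 0.0, 0.0};
  for (int r = 0; r < 3; ++r)
    for (int c = 0; c < 4; ++c) y[r] += P[r][c] * xc[c];
  return {y[0] / y[2], y[1] / y[2]};
}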
Q12. In what format is each sensor stream stored?
The data format in which each sensor stream is stored is as follows: a) Images: Both color and grayscale images are stored with lossless compression using 8-bit PNG files.
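For illustration, such PNGs can be read with any standard image library; a minimal OpenCV sketch (the function name is hypothetical) might be:

#include <opencv2/imgcodecs.hpp>
#include <string>

// IMREAD_UNCHANGED preserves the file's native channels, so grayscale
// images load single-channel and color images load 3-channel BGR.
cv::Mat loadImage(const std::string& path) {
  return cv::imread(path, cv::IMREAD_UNCHANGED);
}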