This paper introduces a stereoscopic image and depth dataset generated with a deep learning model. It addresses the difficulty of obtaining accurately annotated stereo image pairs with irregular boundaries for training deep learning models. The dataset provides a unique resource for training models to handle stereoscopic images with irregular boundaries, which arise in real-world scenarios involving complex shapes or occlusions. It is built using a state-of-the-art monocular depth estimation model and can support applications such as image rectification, depth estimation, object detection, and autonomous driving. Overall, the paper presents a novel dataset and demonstrates its effectiveness and potential for advancing stereo vision and the development of deep learning models for computer vision.
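The core idea of synthesizing a stereo pair from a single image and an estimated depth map can be sketched as a disparity-based warp: per-pixel disparity is derived from depth via d = f·B/Z, and left-image pixels are shifted horizontally to form a synthetic right view. The function below is a minimal illustration under assumed camera parameters, not the paper's actual pipeline; the `baseline` and `focal` defaults are hypothetical placeholders.

```python
import numpy as np

def synthesize_right_view(left, depth, baseline=0.1, focal=720.0):
    """Warp a left image into a synthetic right view.

    Per-pixel disparity is computed from estimated depth as
    disparity = focal * baseline / depth (pinhole model).
    `baseline` (m) and `focal` (px) are illustrative values only.
    """
    h, w = depth.shape
    # Avoid division by zero for invalid depth pixels.
    disparity = (focal * baseline) / np.clip(depth, 1e-6, None)
    right = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        # Shift each pixel left by its disparity (forward warping).
        x_new = np.round(xs - disparity[y]).astype(int)
        valid = (x_new >= 0) & (x_new < w)
        right[y, x_new[valid]] = left[y, xs[valid]]
    return right
```

A forward warp like this leaves disocclusion holes (pixels in the right view with no source in the left image); a real pipeline would fill these by inpainting or background propagation.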