Tomasz Lehmann
supervisor: Przemysław Rokita
Monocular depth estimation is used in many computer vision applications and tasks. Many architectures based on convolutional neural networks now exist for computing a high-resolution depth map from a single RGB image. On NYUv2, one of the most popular datasets dedicated to depth estimation, the state of the art is held by dense vision transformers and encoder-decoder structures. Inspired by solutions used in image super-resolution, I decided to implement the Deep Recursive Residual Network (DRRN) and the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) to generate predictions with realistic textures in single-image depth estimation. The first of these architectures has almost 150 times fewer parameters than the auto-encoders typically used for depth estimation, while the considerable potential of GANs makes the second an interesting alternative to the most popular solutions. Both architectures have contributed significantly to the image restoration problem. The results achieved so far are very preliminary, and it is too early to tell whether the proposed methods will outperform others on any of the NYUv2 metrics. There is still considerable room for improvement through the choice of a suitable loss function and experiments with network modifications.