Tanu Sharma received her MS in Electronics and Communication Engineering (ECE). Her research work was supervised by Prof. Madhava Krishna. Here’s a summary of her research work on Plane Parameters of building facades using surface normal and depth estimation:
Planar region understanding from a single RGB image comprises two essential components – depth map estimation and surface normal map estimation. In this work, we focus on the latter, proposing an approach that predicts a surface normal for each pixel.
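To make the connection between the two components concrete, here is a minimal, illustrative NumPy sketch of the standard way a normal map can be derived from a depth map via finite differences (n ∝ (-∂z/∂x, -∂z/∂y, 1)); it ignores camera intrinsics and is not the method proposed in this work:

```python
import numpy as np

def normals_from_depth(depth):
    """Estimate per-pixel surface normals from an HxW depth map.

    Uses the common approximation n ∝ (-dz/dx, -dz/dy, 1), normalized
    to unit length. Camera intrinsics are ignored for simplicity, so
    this is only a didactic sketch, not the paper's approach.
    """
    dz_dy, dz_dx = np.gradient(depth)  # finite-difference depth gradients
    n = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n

# A fronto-parallel plane (constant depth) yields normals that point
# straight at the camera: (0, 0, 1) everywhere.
flat = np.full((4, 4), 5.0)
print(np.allclose(normals_from_depth(flat), [0.0, 0.0, 1.0]))  # True
```

Predicting normals directly from RGB, as done here, avoids the noise amplification that differentiating an estimated depth map introduces.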
While the surface normal prediction problem has been studied extensively, most approaches are geared towards indoor scenes and often rely on multiple modalities (depth, multiple views) for accurate estimation of normal maps. Outdoor scenes pose a greater challenge: they exhibit significant lighting variation, often contain occluders, and structures such as building facades are riddled with windows and protrusions. Conventional supervised learning schemes excel in indoor scenes but are not competitive when trained and deployed in outdoor environments. Furthermore, they involve complex network architectures and require far more trainable parameters.
To tackle these challenges, we present an adversarial learning scheme that regularizes the output normal maps from a neural network to appear more realistic, using only a small number of precisely annotated examples. We target surface normal estimation in outdoor scenes – specifically building facades – using only RGB image input. Our method is a lightweight, simple, end-to-end trainable, single-view architecture that uses skip connections, residual learning, and adversarial regularization to generate accurate normal maps of buildings in city scenes, improving performance by at least 1.5x across most metrics. We provide qualitative and quantitative comparisons of our method against state-of-the-art normal map estimation methods on two city scene datasets – one real and one synthetic – and observe significant performance gains. These evaluations demonstrate that, even without additional modalities such as depth, our single-view method outperforms the state of the art.
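A common quantitative metric for such comparisons is the mean angular error between predicted and ground-truth normal maps. A minimal NumPy sketch (the exact metric set used in the evaluation is not specified here):

```python
import numpy as np

def mean_angular_error(pred, gt):
    """Mean angular error in degrees between two HxWx3 unit-normal maps.

    A standard metric for normal-map evaluation; assumes both inputs
    are already normalized to unit length per pixel.
    """
    # Clip the per-pixel dot product to arccos's valid domain to
    # guard against floating-point drift slightly outside [-1, 1].
    cos = np.clip(np.sum(pred * gt, axis=2), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# Identical maps have zero error; orthogonal normals give 90 degrees.
up = np.zeros((2, 2, 3)); up[..., 2] = 1.0
right = np.zeros((2, 2, 3)); right[..., 0] = 1.0
print(mean_angular_error(up, up))     # 0.0
print(mean_angular_error(up, right))  # 90.0
```

Angle-based metrics like this (often reported alongside the fraction of pixels within 11.25, 22.5, and 30 degrees) are insensitive to the scale ambiguity that plagues depth metrics, which makes them a natural fit for single-view normal estimation.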