[month] [year]

Praveen Krishnan – CSE

Praveen Krishnan received his doctorate in Computer Science and Engineering. His research work was supervised by Prof. CV Jawahar. Here’s a summary of Praveen Krishnan’s thesis, Learning Representations for Word Images, as explained by him:

Reading and writing documents are among the primary skills with which we gather and communicate information. With the emergence of Artificial Intelligence (AI), researchers are constantly striving to build intelligent algorithms that can bring our physical and digital worlds closer to each other. One such important domain is document image analysis, where we delve into the problem of understanding content from scanned document image collections. Considering “words” as the basic unit in understanding a document, in this thesis we address the problem of finding the best possible representation for word images.

Representation learning is a key area of investigation in AI. The primary goal of this thesis is to learn efficient representations for word images that encode their content. An ideal representation should be invariant to multiple fonts and handwriting styles, and less sensitive to noise and distortions. In the past, representations have been handcrafted, specific to modalities (printed, handwritten), and sensitive to the complexities of handwriting in multi-writer scenarios. In this work, we choose the paradigm of learning from data using deep neural networks. We take our inspiration from the fact that, given large amounts of annotated data, modern deep neural networks can inherently learn better representations. In this thesis, we also relax the need for large annotated datasets by capitalizing heavily on synthetically generated images. We also introduce a novel problem of learning semantic representations for word images, which encode the semantics of the word and reduce the vocabulary gap that exists between the query and the retrieved results.