Niharika

Niharika, supervised by Dr. Santosh Ravi Kiran, received her Master of Science in Electronics and Communication Engineering (ECE). Here’s a summary of her research work on High Precision Text Line Segmentation in Palm Leaf Manuscripts:

Ancient manuscripts were among the first forms of written communication, offering key insights into our past and covering literature, medicine, culture, philosophy, religion, and more. It is imperative to preserve these writings so that the knowledge they hold can be identified and extracted. Such document collections frequently exhibit overlapping components, irregular patterns, dense layouts, extremely high aspect ratios, physical and chemical degradation (evident in ink-based manuscripts), text misalignment, and more. Compounding these difficulties are issues that may arise during digitization, such as improper document positioning, inadequate illumination, and scanner effects. In this thesis, our main emphasis is on identifying and segmenting text lines in these documents with utmost precision for downstream OCR applications. We devise a two-stage approach, SeamFormer, to identify special text baselines in palm-leaf manuscripts using a multi-task strategy via Vision Transformers (ViTs). In the first stage, we detect text strikethroughs, namely ‘scribbles’, which act as pointers to the location of text line segments within the document. An encoder-decoder architecture analyzes input image patches and produces two separate maps: a scribble map and a binary map. In the second stage, we post-process the first-stage outputs and generate a diacritic-aware global energy map. To generate the final precise text line polygons, we use a modified seam generation algorithm along with customized global maps. This methodology is further enhanced by the proposed LineTR model, a multi-task DETR (Detection Transformer) model that reconceptualizes scribble generation as a geometric problem. This method simply generates line parameters for each text line present in the input image patch. This design decision enables the model to exhibit zero-shot behavior across diverse historical manuscripts.
The resulting state-of-the-art approach has been shown to generate precise text-line segmentation with a single ‘unified’ model and minimal post-processing effort, making it a strong candidate for image-to-OCR integration pipelines.
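
To make the seam generation step more concrete, here is a minimal sketch of the classic dynamic-programming seam search, as used in seam carving, applied to a toy energy map. It is not the modified algorithm or the diacritic-aware energy map from the thesis, whose details are not given in this summary. The function name min_horizontal_seam and the toy values are assumptions made purely for illustration. The idea is that a seam running left to right through low-energy pixels traces a boundary between two adjacent text lines.

```python
def min_horizontal_seam(energy):
    """Find the lowest-energy left-to-right path through a 2D energy map.

    Hypothetical helper for illustration only. `energy` is a list of rows.
    Returns (seam, total), where seam[c] is the row chosen in column c.
    """
    n_rows, n_cols = len(energy), len(energy[0])
    # cost[r][c]: minimum cumulative energy of any seam ending at (r, c)
    cost = [[0] * n_cols for _ in range(n_rows)]
    # back[r][c]: row in column c - 1 from which (r, c) was reached
    back = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        cost[r][0] = energy[r][0]

    for c in range(1, n_cols):
        for r in range(n_rows):
            # The seam may move up, stay level, or move down by one row.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < n_rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best

    # Start from the cheapest cell in the last column and walk backwards.
    r = min(range(n_rows), key=lambda i: cost[i][n_cols - 1])
    total = cost[r][n_cols - 1]
    seam = [0] * n_cols
    for c in range(n_cols - 1, -1, -1):
        seam[c] = r
        r = back[r][c]
    return seam, total


# Toy energy map: high values stand for ink, low values for background.
toy_energy = [
    [9, 9, 9, 9],
    [1, 9, 1, 1],
    [9, 1, 9, 9],
    [9, 9, 9, 9],
]
seam, total = min_horizontal_seam(toy_energy)
# seam == [1, 2, 1, 1], total == 4
```

On this toy map the seam dips to row 2 in the second column to avoid the high-energy cell, which is the kind of detour that lets a separator pass around a stroke instead of cutting through it.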

March 2025