
Bondugula Sriharshitha

Bondugula Sriharshitha, supervised by Prof. Krishna Reddy Polepalli, received her Master of Science – Dual Degree in Computer Science and Engineering (CSE). Here’s a summary of her research work on Data Cube-Driven Exploration of Anomalies in Judicial Decisions: A Case Study on Indian Judgements:

In decision-making settings such as medical diagnosis, underwriting, or sentencing in a court of law, decisions are often influenced by multiple factors, including the decision-maker’s background, experience, and personal biases. Variability in decisions across individuals is thus inevitable, and so are disparities within systems. Each system has its own way of addressing prevailing disparities, be it through noise audits, standardized guidelines, or other approaches. In the medical field, for instance, prior studies have documented significant inter-observer variation in the interpretation of clinical images such as MRIs and X-rays, leading to inconsistencies in diagnoses and treatments. To address this issue, the medical domain is witnessing active development of AI tools intended to support clinical data analysis and reduce disparities in healthcare outcomes. In this thesis, we use a data cube-based methodology to explore the issue of disparities in the legal domain, where decisions related to parole or bail grants, child custody, or sentence imposition are often left to the judge’s discretion. For sentence imposition specifically, guidelines in many countries allow for subjectivity. For example, in India, while sentencing guidelines prescribe minimum and maximum punishments for different offences, the weights assigned to various aggravating and mitigating circumstances are left to the judge’s discretion. This flexibility, though intended to accommodate case-specific nuances, increases variance, leading to inconsistencies in trial outcomes, sentence lengths, and penalties across courts and judges. The literature widely acknowledges that anomalies and disparities exist in sentencing and other legal decisions, often stemming from personal beliefs, biases, and contextual factors.
Over the years, several efforts have been made to analyse and assess sentencing anomalies and disparities in India and other parts of the world, particularly with respect to individual factors such as gender, race, and socioeconomic background, through surveys, case studies, and machine learning techniques. Separately, the online analytical processing (OLAP) methodology has been widely employed in the literature to analyse multidimensional data in different domains. The data cube was proposed to summarise a table and extract all of its sub-cubes, enabling multidimensional analysis and the derivation of insights from diverse perspectives. It is commonly adopted to extract interesting trends and anomalies from multidimensional data in domains such as sales and marketing. However, so far, no effort has been made to extend the data cube-based framework to explore anomalies and disparities in the legal domain. In this thesis, we leverage the OLAP framework and propose a data cube-based approach to explore potential trends and anomalies in judicial decisions, particularly sentences. A major bottleneck in this domain has been the lack of structured datasets. To address this, we employed a large language model (LLM) to curate a structured dataset from unstructured data extracted manually from Indian criminal case judgements. We designed a conceptual schema by identifying relevant attributes and hierarchies and defining appropriate aggregate measures. We used this schema to build a data cube on the curated dataset and facilitate anomaly detection. This approach enabled us to uncover potential anomalies in court sentences, particularly in terms of sentence quantum and monetary penalties for similar offences across Indian states. For instance, we observed that courts in Kerala imposed relatively higher monetary penalties in rape and murder cases than courts in other states. Several additional trends were also identified.
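To make the data cube idea concrete, here is a minimal sketch in plain Python. It is not the thesis implementation: the records, the dimension names (`state`, `offence`, `court`), the `fine` measure, and the `build_cube` helper are all hypothetical and chosen only for illustration. It computes every group-by (sub-cube) over the chosen dimensions and stores a count and a mean per cell.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical toy records; the values are invented for illustration
# and do not come from the thesis dataset.
RECORDS = [
    {"state": "A", "offence": "theft", "court": "trial", "fine": 1000},
    {"state": "A", "offence": "theft", "court": "high", "fine": 3000},
    {"state": "B", "offence": "theft", "court": "trial", "fine": 2000},
    {"state": "B", "offence": "fraud", "court": "trial", "fine": 6000},
]
DIMENSIONS = ("state", "offence", "court")


def build_cube(records, dimensions, measure):
    """Aggregate `measure` over every subset of `dimensions`.

    Returns {group-by dimensions: {cell key: (count, mean)}}.
    The empty tuple () is the apex sub-cube (all records together).
    """
    cube = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            sums = defaultdict(lambda: [0, 0.0])  # cell key -> [count, total]
            for row in records:
                key = tuple(row[d] for d in dims)
                sums[key][0] += 1
                sums[key][1] += row[measure]
            cube[dims] = {key: (n, total / n) for key, (n, total) in sums.items()}
    return cube


cube = build_cube(RECORDS, DIMENSIONS, "fine")
```

With three dimensions the cube holds 2^3 = 8 sub-cubes. Comparing a cell such as `cube[("state",)][("B",)]` against the apex cell `cube[()][()]` is one simple way to surface candidate anomalies for closer inspection; the thesis may use different measures and criteria.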
Our experiments demonstrate that the proposed framework has the potential to identify anomalies whose analysis could help in understanding the causal factors, such as bias, that contribute to them. We also provide a structured dataset annotated by domain experts, which contains sentencing-related information along with the verdict rationale from judgements of around 10,000 criminal cases adjudicated in the trial courts, High Courts, and the Supreme Court of India during 2000–2010. We make this dataset public to encourage further research.

December 2025