Harshita Sharma, supervised by Prof. Dipti M Sharma, received her Master of Science – Dual Degree in Computational Linguistics (CL). Here’s a summary of her research work on Hindi Word Problem Solving:
Word problem solving is a popular NLP task that deals with solving mathematical problems described in natural language. Mathematical word problems span a large mathematical domain, with complexity ranging from arithmetic and algebra to geometry and calculus. While most word problems are entirely textual, some, such as geometric word problems, may also have a visual component. In recent years, much research has been carried out on solving different genres of word problems at various complexity levels. However, most publicly available datasets and work are in English. Recently there has been a surge in word problem solving in Chinese with the creation of large benchmark datasets. Labelled benchmark datasets for low-resource languages remain very scarce.

The first requirement for solving word problems, as for any other problem, is data. To the best of our knowledge, no datasets are available for word problem solving in any Indian language. This lack of data not only encouraged us to create a new dataset for an Indian language but also led us to explore techniques by which data for other Indian languages can be created with ease. In this work, we present a diverse dataset of 2336 arithmetic word problems in Hindi, built by manually crafting word problems and by augmenting them with word problems from benchmark datasets in other languages. For augmentation, we translated word problems from languages in which word problem solving data has already been developed, using translation as a tool to generate diverse problems. In this process, we studied the translated word problems, gathered the patterns of issues seen in these translations, and defined steps to eliminate the errors and improve the quality of the translated output, so that the resulting Hindi word problems look as natural as those studied in Hindi-medium schools across India.
We also developed baseline systems for solving these word problems: a rule-based solver that uses verbs to identify the operations needed to generate answers, and an end-to-end deep learning-based solver that generates equations for word problems. We also propose a new evaluation technique for word problem solvers that takes equation equivalence into account. This work will form the basis for future work on word problem solving in Indian languages, especially Hindi.
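To illustrate why equation equivalence matters in evaluation, here is a minimal sketch of one possible check; it is not the evaluation technique proposed in the thesis, whose details are not given here. It assumes equations are written with a single unknown `x` that appears linearly and with quantity placeholders such as `n0` and `n1`; these conventions, and the function names `solve_for_x` and `equations_equivalent`, are hypothetical. Two equations are treated as equivalent when they yield the same value of `x` for several random assignments of the quantities, so that a predicted `x = n1 + n0` is not penalised against a gold `x = n0 + n1`.

```python
import ast
import random
from fractions import Fraction

# Binary operators allowed in arithmetic word-problem equations.
_OPS = {
    ast.Add: lambda a, b: a + b,
    ast.Sub: lambda a, b: a - b,
    ast.Mult: lambda a, b: a * b,
    ast.Div: lambda a, b: a / b,
}


def _eval(node, env):
    """Evaluate a parsed arithmetic expression exactly, looking names up in env."""
    if isinstance(node, ast.Expression):
        return _eval(node.body, env)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return Fraction(str(node.value))
    if isinstance(node, ast.Name) and node.id in env:
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, env), _eval(node.right, env))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, env)
    raise ValueError("unsupported expression: " + ast.dump(node))


def solve_for_x(equation, quantities):
    """Solve 'lhs = rhs' for x, assuming x appears linearly (an assumption)."""
    lhs, rhs = (ast.parse(side.strip(), mode="eval") for side in equation.split("="))

    def residual(x):
        env = dict(quantities, x=x)
        return _eval(lhs, env) - _eval(rhs, env)

    # For a linear residual a*x + b, two evaluations recover a and b.
    f0, f1 = residual(Fraction(0)), residual(Fraction(1))
    if f1 == f0:
        raise ValueError("equation does not determine x")
    return -f0 / (f1 - f0)


def equations_equivalent(predicted, gold, names, trials=5, seed=0):
    """Return True if both equations give the same x on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        quantities = {name: Fraction(rng.randint(2, 99)) for name in names}
        if solve_for_x(predicted, quantities) != solve_for_x(gold, quantities):
            return False
    return True
```

For example, `equations_equivalent("x - n0 = n1", "x = n0 + n1", ["n0", "n1"])` accepts a rearranged form of the gold equation, whereas comparing equation strings directly would count it as wrong. Testing on random quantities rather than the problem's actual numbers avoids accepting a wrong equation that happens to produce the right answer for one particular set of values.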
October 2023