[month] [year]

Athavale Vinayak Sanjay – Dual Degree CL

Athavale Vinayak Sanjay received his MS in Computational Linguistics (CL). His research work was supervised by Dr. Manish Shrivastava. Here is a summary of his MS thesis, Classifying Mathematical Problems in Natural Language, as explained by him:

In recent years there has been growing interest in understanding natural language in order to answer questions related to science and maths. Automatically generating computer programs to solve problems posed in natural language is one of the holy grail problems of AI. A step-by-step approach would first involve understanding the problem from its text, then categorising it by the algorithm it requires and other criteria, and finally generating the code. In this thesis we tackle the problem of automatically categorising mathematics and programming word problems.

Recent advances in deep learning and machine learning based classifiers, together with the advent of word embeddings, have produced impressive results in text classification. In light of these advances, we model operator prediction in math word problems and algorithm prediction in programming word problems as text classification problems, and solve them using state-of-the-art text classification approaches.
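
To make the text classification framing concrete, here is a minimal sketch in Python. It is not the thesis model: a small bag-of-words Naive Bayes classifier stands in for the deep learning and embedding based classifiers mentioned above, and the class name `NaiveBayes`, the helper `tokenize`, and the four training sentences are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenisation on lowercased text; a real system would do more.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: each word problem is labelled with its operator.
train = [
    ("sam has 3 apples and gets 4 more how many in total", "+"),
    ("a shop sold 5 pens and then 2 more how many in total", "+"),
    ("sam had 9 apples and ate 4 how many are left", "-"),
    ("a tank had 7 litres and 2 leaked how many are left", "-"),
]
clf = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
print(clf.predict("tom had 6 coins and lost 2 how many are left"))
```

The same framing would carry over to algorithm prediction by replacing the operator labels with algorithm categories.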

In the first part of this thesis we deal with arithmetic word problems. Arithmetic word problems can be solved using the numbers mentioned in the text and their relationships through basic mathematical operations (addition, subtraction, multiplication, division). We start by creating a model that detects the mathematical operation (+, -, *, /) needed to solve an arithmetic word problem, using various machine learning and deep learning based algorithms. We then use this model to build an arithmetic word problem solver, which takes an arithmetic word problem as input and outputs the answer. Our solver correctly solves 81 percent of the problems in our dataset.
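
The two-stage solver described above (detect the operation, then apply it to the numbers in the text) can be sketched as follows. This is a sketch under loud assumptions, not the thesis implementation: the keyword table `CUES` is a hypothetical stand-in for the learned operator classifier, the sketch handles only problems with exactly two numbers, and placing the larger operand first for subtraction and division is a simplification introduced here.

```python
import re
from operator import add, sub, mul, truediv

OPERATORS = {"+": add, "-": sub, "*": mul, "/": truediv}

# Hypothetical cue phrases; a placeholder for the trained operator classifier.
CUES = {
    "+": ("altogether", "in total"),
    "-": ("left", "remain", "fewer"),
    "*": ("each", "times"),
    "/": ("split", "equally"),
}


def predict_operator(problem):
    # Count matching cue phrases per operator; ties fall back to the first key.
    text = problem.lower()
    scores = {op: sum(cue in text for cue in cues) for op, cues in CUES.items()}
    return max(scores, key=scores.get)


def extract_numbers(problem):
    # Pull integer and decimal literals out of the problem text.
    return [float(tok) for tok in re.findall(r"\d+(?:\.\d+)?", problem)]


def solve(problem):
    numbers = extract_numbers(problem)
    if len(numbers) != 2:
        raise ValueError("this sketch only handles two-operand problems")
    op = predict_operator(problem)
    a, b = numbers
    if op in ("-", "/"):
        # Assumption: the larger quantity is the left operand.
        a, b = max(a, b), min(a, b)
    return OPERATORS[op](a, b)


print(solve("Tom had 8 apples and ate 3. How many apples are left?"))
```

Separating operator prediction from number extraction means the classifier can be swapped out without touching the rest of the pipeline.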

We then broaden our focus to deal with programming word problems. These are problems in natural language which can be solved by a computer program. They are significantly harder than math problems for several reasons. The solver must first decode the intent of the problem, that is, understand what is being asked. The solver then needs to apply knowledge of algorithms to write a solution program. The solution program must also be efficient with respect to the given time and memory constraints; as a consequence, the algorithm required to solve a particular problem depends not only on the problem statement but also on the constraints. Such problems are routinely used by major tech companies to hire software engineers. Hence, they represent a significant challenge for AI that has largely been untouched in the current literature.

As a first step towards solving these problems, we introduce the task of algorithm prediction for programming word problems. We formulate the problem as both a single-label and a multi-label classification task, and apply state-of-the-art machine learning and deep learning based classification algorithms to it. We create the first publicly available dataset for the task of algorithm prediction in programming problems. From this dataset, we create four classification tasks: two single-label tasks with 5 and 10 classes respectively, and two multi-label tasks with 10 and 20 classes respectively. Our best performing classifier achieves an accuracy of 62.7 percent on the 5-class single-label task, whereas the best multi-label classification model has an accuracy only 9 percent lower than that of a skilled human. To the best of our knowledge, these are the first reported results on such a task. We make our code and datasets publicly available.
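
To illustrate how the multi-label formulation differs from the single-label one, the sketch below encodes each problem's set of algorithm tags as a multi-hot vector and scores predictions in two common ways. The class names in `CLASSES` are hypothetical, and this summary does not say which definition of accuracy the thesis uses for the multi-label tasks, so both metrics here are assumptions shown only for comparison.

```python
# Hypothetical algorithm categories; the thesis tasks use 5, 10, or 20 classes.
CLASSES = ["greedy", "dp", "graphs", "math"]


def multi_hot(labels, classes=CLASSES):
    # One slot per class; several slots may be 1 in the multi-label setting.
    return [1 if c in labels else 0 for c in classes]


def exact_match_accuracy(y_true, y_pred):
    # A prediction counts only if the full label set is reproduced.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_accuracy(y_true, y_pred):
    # Fraction of individual class decisions that are correct.
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return correct / total


y_true = [multi_hot({"dp", "graphs"}), multi_hot({"greedy"})]
y_pred = [multi_hot({"dp"}), multi_hot({"greedy"})]

print(exact_match_accuracy(y_true, y_pred))  # 0.5
print(hamming_accuracy(y_true, y_pred))      # 0.875
```

In the single-label tasks each vector would contain exactly one 1, and the two metrics would no longer diverge in this way: exact match reduces to ordinary accuracy.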