[month] [year]

Bajaj Vaibhav Ganesh – TEASER

Bajaj Vaibhav Ganesh received his MS Dual Degree in Computer Science and Engineering (CSE). His research work was supervised by Dr. Radhika Mamidi. Here’s a summary of his research work on TEASER: Towards Efficient Aspect-based SEntiment analysis and Recognition:

Recent advances in networking sites and the Internet revolution have changed everything, from business to healthcare to education to the way we communicate with our friends. The Internet has opened doors for people to express themselves, write their thoughts on a particular topic, and share their experiences online for others to read without much hassle.

Before going out, ordering something, or watching a show or movie, people tend to check online reviews first. This is practical, because nobody wants to spend time, money, and other resources on something that isn’t worth it. With so many reviews, tweets, and other content available online, it is also important to process them in a way that lets everyone use them productively.

One use case arises because the length limit for online reviews on sites like IMDb, Zomato, and Amazon is quite generous (10,000 characters for IMDb; Amazon allows 5,000 words, about 23,500 characters), so some reviews tend to be long. While the reviewer elaborates on their experience, what really matters from a reader’s point of view is which aspects of the given target entities the reviewer liked or disliked. This is where the need for Aspect-Based Sentiment Analysis (ABSA) arises: ABSA aims to extract the aspects of the given target entities and their respective sentiments.

The main issue is that the amount of data is so huge that processing it manually is impossible. For example, Twitter alone sees an average of 6,000 tweets per second, or roughly 500 million tweets per day. Hence, we look for fast, automated methods that can process this data in near real time.

In this thesis, we build TEASER, a deep-learning model based on an extract-then-classify framework that extracts the aspects and detects the sentiment attached to each. We also conduct extensive experiments to show that TEASER performs better than the existing models. In chapter 4, we present two novel datasets for this task, Movie20 and moviesLarge. Movie20 is a supervised dataset manually annotated by two annotators, whereas moviesLarge is a pseudo-labeled dataset. Finally, with the help of semi-supervised learning, we benchmark TEASER on the Movie20 dataset.
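To make the two-stage idea concrete, here is a minimal Python sketch of an extract-then-classify pipeline. It is not TEASER: the deep-learning model is replaced by toy word lists and a small context window, and every name in it (ASPECT_TERMS, POSITIVE, NEGATIVE, extract_aspects, classify_sentiment, analyze) is a hypothetical choice made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy vocabularies; a learned model would replace these lookups.
ASPECT_TERMS = {"acting", "plot", "music", "direction"}
POSITIVE = {"great", "brilliant", "superb", "good"}
NEGATIVE = {"weak", "dull", "boring", "bad"}


@dataclass
class AspectSentiment:
    aspect: str
    sentiment: str  # "positive", "negative", or "neutral"


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?;:").lower() for word in text.split()]


def extract_aspects(tokens):
    """Stage 1 (extract): return the positions of known aspect terms."""
    return [i for i, token in enumerate(tokens) if token in ASPECT_TERMS]


def classify_sentiment(tokens, index, window=2):
    """Stage 2 (classify): score opinion words near the aspect at `index`."""
    start = max(0, index - window)
    context = tokens[start:index + window + 1]
    score = sum(t in POSITIVE for t in context) - sum(t in NEGATIVE for t in context)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(review):
    """Run both stages and pair each extracted aspect with a sentiment."""
    tokens = tokenize(review)
    return [
        AspectSentiment(tokens[i], classify_sentiment(tokens, i))
        for i in extract_aspects(tokens)
    ]


if __name__ == "__main__":
    for result in analyze("The acting was brilliant, but the plot felt dull."):
        print(result.aspect, result.sentiment)
```

For the sample review, this sketch pairs "acting" with a positive label and "plot" with a negative one. Separating the stages means the aspect extractor and the sentiment classifier can be swapped out or improved independently.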