November 2022
Samyak Agrawal received his Master of Science – Dual Degree in Computer Science and Engineering (CSE). His research work was supervised by Dr. Radhika Mamidi. Here’s a summary of his research work, titled Towards using Deep Learning for Text Classification and Sentiment Prediction and its application in Multimodal Settings:
Text Classification is a technique for assigning a set of categories to a given text. A significant type of text classification is Sentiment Analysis, a technique in natural language processing that systematically identifies and quantifies subjective states and information. With the exponential increase in internet traffic, a large amount of textual content is now available in different settings. This data provides an opportunity for efficient processing and for solving problems across multiple domains. Critical applications include analysing how people behave and interact online, and analysing news content from media houses whose subjective reporting can divide and polarise the public. First, we aim to create a novel dataset for bias detection in Hindi. We also explore classification tasks in a multi-modal setting to analyse and understand how information in multiple modalities affects classification outcomes. We further experiment with deep learning models to classify text in which the language is subtle and often goes unnoticed by humans, which makes its classification a tricky task.
Recently, we have seen a rise in content from highly subjective news media outlets created to polarise and divide people. Such reporting can have far-reaching consequences, from orchestrating riots and public agitations to influencing the results of an election, which makes the analysis and detection of biased reporting crucial. We aim to do this for the Hindi language. As no data was available, we decided to develop our own dataset. It consists of 1388 Hindi news articles and their headlines from various Hindi news media outlets, each annotated as biased towards, biased against, or neutral towards the BJP (Bharatiya Janata Party). Further, we explore a multi-modal classification task of identifying misogynous memes posted online.
In this work, we also explore how the visual modality affects the classification task. We build an ensemble of deep learning text transformers and vision-language transformers like UNITER and OSCAR and compare them to better understand the role of vision in multi-modal classification tasks. We also experiment with task-specific pretraining of our vision-language (VL) transformers. In addition, we experiment with transformers trained on similar tasks and datasets to explore transfer learning. Lastly, we employ deep learning transformers to detect patronising and condescending language targeting vulnerable communities (e.g. refugees, homeless people, and low-income families). We experiment with techniques like focal loss and weighted random sampling to deal with class imbalance and to improve our results. We also take up the task of classifying condescending language into more fine-grained categories.
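To illustrate the two class-imbalance techniques named above, here is a minimal sketch in plain Python. It is not the thesis implementation: the function names, the default values gamma=2.0 and alpha=0.25, and the toy label distribution are all assumptions chosen for illustration. Focal loss scales the cross-entropy of each example by (1 - p_t)^gamma so that easy, well-classified examples contribute less, while weighted random sampling draws each example with probability inversely proportional to the size of its class.

```python
import math
import random
from collections import Counter


def binary_focal_loss(prob_positive, label, gamma=2.0, alpha=0.25):
    """Focal loss for one example.

    prob_positive is the model's predicted probability that label == 1.
    gamma and alpha are illustrative defaults, not values from the thesis.
    """
    # p_t is the probability the model assigned to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    # alpha_t balances the contribution of the positive and negative class.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, 1e-12), 1.0)
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def inverse_frequency_weights(labels):
    """Give every example a weight of 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample_indices(labels, num_samples, seed=0):
    """Draw example indices with replacement so classes appear roughly equally often."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


if __name__ == "__main__":
    # Hypothetical imbalanced dataset: 90 negative examples, 10 positive ones.
    toy_labels = [0] * 90 + [1] * 10
    picked = weighted_sample_indices(toy_labels, num_samples=1000)
    share_positive = sum(toy_labels[i] for i in picked) / len(picked)
    print("share of positive examples after resampling:", share_positive)
    print("loss on an easy positive:", binary_focal_loss(0.9, 1))
    print("loss on a hard positive:", binary_focal_loss(0.1, 1))
```

In a deep learning framework the same ideas would typically be applied per batch: the sampler decides which examples enter each batch, and the focal term reweights the loss of the examples once they are there.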