Arjit Srivastava received his MS Dual Degree in Exact Humanities. His research work was supervised by Dr. Manish Srivastava. Here’s a summary of his thesis, Nuances of Aggression in Social Media Text, as explained by him:
The advent of social media has greatly increased the number of opinions and arguments voiced on the internet, and social media platforms now make up a significant part of an individual’s social interaction. These interactions generate many opinions on issues where there is significant division, and the resulting debates often manifest aggression. Online platforms such as forums and blogs let users post comments and reply to other users’ comments. Some of these comments are aggressive, hateful, or offensive, while others are affectionate. As the population on social media has grown, interactions over the web have increased and become more aggressive, and related activities such as cyberbullying, trolling, and hate speech have increased manifold across the globe. Incidents of aggressive online behaviour have thus become a significant source of social conflict, potentially resulting in activity of a criminal nature. A fundamental challenge in identifying aggression on social media is therefore to distinguish it from language that is merely offensive or vitriolic. For the task of aggression detection, we used a Hindi-English code-mixed dataset provided for the shared task at the 1st Workshop on Trolling, Aggression and Cyberbullying (TRAC-1), and developed a system to discriminate between Overtly Aggressive, Covertly Aggressive and Non-aggressive content in text. While research has mostly focused on analyzing aggression, stance, and other dimensions of speech in isolation from each other, this work also attempts to gain an extensive and fine-grained understanding of aggression and figurative language use patterns when voicing an opinion. However, this task is daunting, since natural language is fraught with ambiguities and language on social media is noisy.
Specialized techniques are therefore required to handle the unstructured and dynamic nature of these data streams, and they can further be used in various contexts to analyze and gain insights from social behaviours. Since users on social media platforms tend to write informally and in real time, it is natural for them to mix languages to ease communication. This could be attributed to users being informal, multilingual, or non-native speakers of a language. However, code-mixing adds another layer of complexity on top of the dynamic nature of social media data. This thesis explores and develops techniques that can help us gain in-depth insights from such data. We also present an English-Hindi code-mixed dataset of opinions on a politico-social issue, annotated across multiple dimensions: aggression, hate speech, emotion arousal, and figurative language usage (such as sarcasm/irony, metaphors/similes, puns/word-play) across varied modalities. In-depth datasets like the one presented are required to analyze the less apparent forms of verbal aggression displayed on social media and the social dynamics of opinion. The thesis also aims to better understand linguistic patterns in voicing an opinion and showing aggression. Furthermore, such datasets facilitate classification models that leverage corpora annotated for auxiliary tasks through transfer learning, joint modelling, and semi-supervised label propagation methods.