September 2022
Dowlagar Suman, working under the supervision of Dr. Radhika Mamidi, published a paper titled “A code-mixed task-oriented dialog dataset for medical domain” in Computer Speech & Language, one of the top-ranked journals. The journal is ranked “A” on the CORE journal ranking portal. The research work, as explained by Dr. Radhika Mamidi and Dowlagar Suman:
In the healthcare domain, doctor-patient interactions form a crucial part of diagnosis. Initially, the AI models developed for healthcare centered only on monolingual data. However, such models do not cater to multilingual regions, where most conversations are Code-Mixed. We present the Code-Mixed Medical Task-Oriented Dialog Dataset to facilitate the research and development of Code-Mixed medical dialog systems. We analyzed the dataset using medical, conversational, and linguistic theories. The dataset contains 3005 Telugu–English Code-Mixed dialogs between patients and doctors, with 29k utterances covering ten specializations and an average code-mixing index (CMI) of 33.3%. We manually annotated the conversational dataset with intent and slot labels. We also present baselines to establish benchmarks on the dataset using existing state-of-the-art Natural Language Understanding (NLU) models. We improved the existing baselines by using contextual ground-truth intent labels and by processing the slots as chunks. The data is made publicly available.
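For readers unfamiliar with the code-mixing index mentioned above, the following is a minimal sketch of one common utterance-level formulation: the share of language-tagged tokens that do not belong to the dominant language, expressed as a percentage. The announcement does not say which CMI variant or tag set the authors used, so the function name `cmi`, the tag names (`te`, `en`, `univ`), and the sample utterance are illustrative assumptions, not the paper's method or data.

```python
from collections import Counter


def cmi(lang_tags, neutral="univ"):
    """Utterance-level code-mixing index, in percent.

    lang_tags: one language tag per token (hypothetical tag set).
    neutral:   tag for language-independent tokens (numbers, punctuation,
               named entities); these are excluded from the computation.
    """
    counts = Counter(tag for tag in lang_tags if tag != neutral)
    tagged = sum(counts.values())
    if tagged == 0:
        # No language-specific tokens, so there is nothing to mix.
        return 0.0
    dominant = max(counts.values())
    # Fraction of language-tagged tokens outside the dominant language.
    return 100.0 * (tagged - dominant) / tagged


# Hypothetical token tags for a single Telugu-English utterance.
tags = ["te", "te", "en", "te", "en", "univ"]
score = cmi(tags)  # 3 Telugu and 2 English tokens out of 5 tagged tokens
```

Under this assumption, a monolingual utterance scores 0, and a dataset-level figure such as the reported average would be obtained by averaging the per-utterance scores.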
Read full paper: https://www.sciencedirect.com/science/article/abs/pii/S0885230822000729
CORE journal ranking portal: http://portal.core.edu.au/jnl-ranks/?search=Computer+speech+and+language&by=all&source=CORE2020&sort=atitle&page=1