Dr. Vinoo Alluri and her students Kunal Vaswani and Yudhik Agrawal presented a paper titled "Multimodal Fusion Based Attentive Networks for Sequential Music Recommendation" at the IEEE International Conference on Multimedia Big Data (BigMM 2021), held in Taichung, Taiwan, from 15 to 17 November. The conference was conducted in hybrid mode. Dr. Vinoo Alluri, Kunal Vaswani and Yudhik Agrawal explain their research work:
Music has the power to evoke intense emotional experiences and regulate the mood of an individual. With the advent of online streaming services, research in music recommendation services has seen tremendous progress. Modern methods that leverage users' listening histories for session-based song recommendation have overlooked the significance of features extracted from lyrics and acoustic content. We address the task of song prediction through multiple modalities, including tags, lyrics, and acoustic content. In this paper, we propose a novel deep learning approach that refines Attentive Neural Networks using representations derived via a Transformer model for lyrics and a Variational Autoencoder for acoustic features. Our model achieves significant improvement in performance over existing state-of-the-art models using lyrical and acoustic features alone. Furthermore, we conduct a study to investigate the impact of users’ psychological health on our model’s performance.
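To make the idea of session-based recommendation over several modalities concrete, here is a minimal sketch in plain Python. It is not the architecture from the paper: the concatenation-based fusion, the use of the most recent song as the attention query, the function names, and the toy vectors are all assumptions chosen for illustration. In the paper, the lyric and acoustic representations come from a Transformer model and a Variational Autoencoder respectively; here they are simply hard-coded numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(tag_vec, lyric_vec, audio_vec):
    """Fuse the three modalities by concatenation (an illustrative assumption)."""
    return tag_vec + lyric_vec + audio_vec


def session_vector(song_vecs):
    """Attention-pool a session, using the most recent song as the query."""
    query = song_vecs[-1]
    weights = softmax([dot(query, v) for v in song_vecs])
    return [
        sum(w * v[i] for w, v in zip(weights, song_vecs))
        for i in range(len(query))
    ]


def recommend(session_ids, catalog):
    """Return the unheard song whose fused vector best matches the session."""
    context = session_vector([catalog[s] for s in session_ids])
    candidates = [s for s in catalog if s not in session_ids]
    return max(candidates, key=lambda s: dot(context, catalog[s]))


# Hypothetical toy catalog: each song has 2-d tag, lyric and acoustic vectors.
catalog = {
    "song_a": fuse([1.0, 0.0], [0.9, 0.1], [0.8, 0.2]),
    "song_b": fuse([0.9, 0.1], [1.0, 0.0], [0.7, 0.3]),
    "song_c": fuse([0.0, 1.0], [0.1, 0.9], [0.2, 0.8]),
    "song_d": fuse([0.8, 0.2], [0.8, 0.2], [0.9, 0.1]),
}

next_song = recommend(["song_a", "song_b"], catalog)
print(next_song)
```

With this toy data the session of song_a and song_b leads to song_d, whose tag, lyric and acoustic vectors all resemble the songs already played. A trained model would learn the fusion and attention weights from listening histories instead of relying on raw dot products.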
The IEEE International Conference on Multimedia Big Data (BigMM) was inaugurated in 2015 in Beijing, China. Subsequent conferences have been held in Taipei, Taiwan (2016), Laguna Hills, California, USA (2017), Xi’an, China (2018), Singapore (2019), and Delhi, India (virtual, 2020). IEEE BigMM has fully established itself and is on a path to further broaden its audience. Jointly sponsored by the IEEE-TCMC (Technical Committee on Multimedia Computing) and the IEEE-TCSEM (Technical Committee on Semantic Computing), BigMM aims to establish a community of researchers from academia and industry focusing on the synergetic interactions between multimedia content and big data analytics. It is a premier international forum for leading researchers in the highly active field of multimedia big data research, development and applications.
Multimedia, as one of the most important and valuable sources of insights and information, is increasingly being considered as big data. Multimedia big data includes, but is not limited to, text, image, graphics, audio, video, social, and sensor data that is highly valuable in decision making. It covers everything from individual experiences to events happening around the world. As such, multimedia big data is spurring a tremendous amount of research and development of related technologies and applications.
The focus of the conference series has been established with the belief that it is important to have a full, top-tier conference that focuses specifically on the combination of multimedia and big data research. While research about specific aspects of big data systems is regularly published in the various proceedings and transactions of the information retrieval, operating system, real-time system, and database communities, BigMM aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and interplay of the various approaches and solutions developed across these domains to deal with multimedia big data types. Furthermore, BigMM provides an avenue for communicating research that addresses both multimedia and big data holistically. The mission of the conference is to share research results and solutions and to identify new issues and directions for future research and development work.