ICME-2023

Sai Shashank Kalakonda, a dual-degree student working with Dr. Ravi Kiran Sarvadevabhatla, gave an oral presentation on Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation at the IEEE International Conference on Multimedia and Expo (ICME), held in Brisbane, Australia, from 10 to 14 July 2023.

Research work as explained by the authors – Sai Shashank Kalakonda, Shubh Maheshwari, TCS Innovation Labs and Ravi Kiran Sarvadevabhatla:

We introduce Action-GPT, a plug-and-play framework for incorporating Large Language Models (LLMs) into text-based action generation models.
By carefully crafting prompts for LLMs, we generate richer and fine-grained descriptions of the action.
We show that utilizing these detailed descriptions instead of the original action phrases leads to better alignment of text and motion spaces.
Our experiments show qualitative and quantitative improvement in the quality of synthesized motions produced by recent text-to-motion models.
Code, pretrained models and sample videos will be made available.
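
As a rough illustration of the idea described above (and not the authors' released code), the sketch below wraps a short action phrase in a prompt, asks an LLM for a detailed description, and returns that description for use in place of the original phrase. The prompt wording, the function names build_prompt and enrich_action, and the fake_llm stand-in are all assumptions made for this example; a real setup would call an actual LLM and pass the result to the text encoder of a text-to-motion model.

```python
from typing import Callable, List

# Hypothetical prompt template; the exact wording used in the paper may differ.
PROMPT_TEMPLATE = (
    "Describe a person's body movements who is performing "
    "the action {action} in detail."
)


def build_prompt(action_phrase: str) -> str:
    """Wrap a short action phrase in an instruction asking for body-level detail."""
    return PROMPT_TEMPLATE.format(action=action_phrase.strip())


def enrich_action(
    action_phrase: str,
    llm: Callable[[str], str],
    num_descriptions: int = 1,
) -> List[str]:
    """Query the LLM num_descriptions times and keep the non-empty answers.

    Falls back to the original phrase if the LLM returns nothing usable.
    """
    prompt = build_prompt(action_phrase)
    descriptions = [llm(prompt).strip() for _ in range(num_descriptions)]
    descriptions = [d for d in descriptions if d]
    return descriptions or [action_phrase]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call so that the sketch runs offline (assumption)."""
    return "The person bends both knees, swings the arms back, and pushes off the ground."


if __name__ == "__main__":
    texts = enrich_action("jump", fake_llm, num_descriptions=2)
    # These strings would be fed to the text encoder of a text-to-motion model
    # in place of the bare phrase "jump".
    for text in texts:
        print(text)
```

Because the LLM is passed in as a plain callable, the same helper could in principle sit in front of different text-to-motion models without changing them, which is one way to read the "plug-and-play" description.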

Full paper: https://actiongpt.github.io/

July 2023