WACV 2025

Prajneya Kumar, a Dual Degree student, and Eshika Khandelwal, a B.Tech (Hons.) student, working with Dr. Vishnu Sreekumar and Dr. Makarand Tapaswi, presented the paper Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability at the Winter Conference on Applications of Computer Vision (WACV-2025), held in Tucson, Arizona, from 28 February to 4 March. Prajneya Kumar and Eshika Khandelwal, from CVIT, are co-first authors of the paper. Dr. Makarand Tapaswi and Dr. Vishnu Sreekumar co-guided the project.

Here is the summary of the paper as explained by the authors: Understanding what makes a video memorable has important applications in advertising and education technology. Towards this goal, we investigate the spatio-temporal attention mechanisms underlying video memorability. Unlike previous works that fuse multiple features, we adopt a simple CNN+Transformer architecture that enables analysis of spatio-temporal attention while matching state-of-the-art (SoTA) performance on video memorability prediction. We compare model attention against human gaze fixations collected through a small-scale eye-tracking study in which humans perform the video memory task. We uncover the following insights: (i) Quantitative saliency metrics show that our model, trained only to predict a memorability score, exhibits spatial attention patterns similar to human gaze, especially for more memorable videos. (ii) The model assigns greater importance to the initial frames of a video, mimicking human attention patterns. (iii) Panoptic segmentation reveals that both the model and humans assign a greater share of attention to things and a smaller share to stuff, compared to their occurrence probability.

Citation: Kumar, P.*, Khandelwal, E.*, Tapaswi, M.+, & Sreekumar, V.+ (2025). Eye vs. AI: Human Gaze and Model Attention in Video Memorability. Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 2082-2091. (* and + denote equal contribution.)

View Full paper: Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability

Conference page: WACV2025

March 2025