Dr. Ravi Kiran Sarvadevabhatla and his students presented the following papers at the ACM International Conference on Multimedia (ACMMM-2021). The hybrid conference (onsite and virtual) was held in Chengdu, China from 20 to 24 October 2021.
- Wisdom of (Binned) Crowds: A Bayesian Stratification Paradigm for Crowd Counting – Sravya Vardhani Shivapuja, Mansi Pradeep Khamkar, Divij Bajaj, Ganesh Ramakrishnan (Department of CSE, IIT Bombay) and Dr. Ravi Kiran Sarvadevabhatla
Research work as explained by the authors:
Datasets for training crowd counting deep networks are typically heavy-tailed in count distribution and exhibit discontinuities across the count range. As a result, the de facto statistical measures (MSE, MAE) exhibit large variance and tend to be unreliable indicators of performance across the count range. To address these concerns in a holistic manner, we revise processes at various stages of the standard crowd counting pipeline. To enable principled and balanced minibatch sampling, we propose a novel smoothed Bayesian sample stratification approach. We propose a novel cost function which can be readily incorporated into existing crowd counting deep networks to encourage strata-aware optimization. We analyze the performance of representative crowd counting approaches across standard datasets, both at the per-stratum level and in aggregate, and demonstrate that our proposed modifications noticeably reduce error standard deviation. Our contributions represent a nuanced, statistically balanced and fine-grained characterization of performance for crowd counting approaches. Code, pretrained models and interactive visualizations can be viewed at our project page deepcount.iiit.ac.in.
Paper pdf : https://arxiv.org/pdf/2108.08784
Project page : https://deepcount.iiit.ac.in/
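For readers curious about what strata-aware sampling and evaluation might look like in practice, here is a minimal Python sketch. It is not the authors' implementation: the hand-picked bin edges are an assumed stand-in for the paper's smoothed Bayesian stratification, and the function names (assign_strata, balanced_minibatch, per_stratum_mae) and toy numbers are hypothetical, chosen only for illustration.

```python
import bisect
import random
from collections import defaultdict


def assign_strata(counts, edges):
    """Map each ground-truth count to a stratum index using sorted bin edges."""
    return [bisect.bisect_right(edges, c) for c in counts]


def balanced_minibatch(strata, batch_size, rng):
    """Pick sample indices so each non-empty stratum is drawn from about equally."""
    by_stratum = defaultdict(list)
    for idx, s in enumerate(strata):
        by_stratum[s].append(idx)
    keys = sorted(by_stratum)
    # Cycle through the strata; sample with replacement inside each one.
    return [rng.choice(by_stratum[keys[i % len(keys)]]) for i in range(batch_size)]


def per_stratum_mae(counts, preds, strata):
    """Mean absolute error reported separately for every stratum."""
    errors = defaultdict(list)
    for c, p, s in zip(counts, preds, strata):
        errors[s].append(abs(c - p))
    return {s: sum(e) / len(e) for s, e in sorted(errors.items())}


# Toy, heavy-tailed crowd counts (made up for this example).
counts = [3, 5, 8, 12, 40, 55, 300, 2500]
# Assumed, hand-picked bin edges; the paper derives its strata differently.
edges = [10, 100, 1000]
strata = assign_strata(counts, edges)

batch = balanced_minibatch(strata, 8, random.Random(0))

# Hypothetical model predictions for the same eight images.
preds = [4, 5, 6, 10, 50, 50, 330, 2000]
mae = per_stratum_mae(counts, preds, strata)
print(strata, batch, mae)
```

In this toy setup a single aggregate MAE would be dominated by the one very large crowd, while the per-stratum view keeps the small-count images visible, which is the kind of imbalance the abstract describes.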
- MeronymNet: A Hierarchical Model for Unified and Controllable Multi-Category Object Generation [ORAL] – Rishabh Baghel, Abhishek Trivedi, Tejas Ravichandran and Dr. Ravi Kiran Sarvadevabhatla
Research work as explained by the authors:
We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimately, the object depictions themselves. We use Graph Convolutional Networks, Deep Recurrent Networks along with custom-designed Conditional Variational Autoencoders to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. The performance scores for generated objects reflect MeronymNet’s superior performance compared to multiple strong baselines and ablative variants. We also showcase MeronymNet’s suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity.
Paper pdf : https://arxiv.org/pdf/2110.08818.pdf
Project page : http://meronymnet.github.io/
Since the founding of ACM SIGMM in 1993, ACM Multimedia has been the worldwide premier conference and a key world event to display scientific achievements and innovative industrial products in the multimedia field. In 2021, for the first time in its history, ACM Multimedia was held in Chengdu, the capital city of Sichuan Province in China. ACM Multimedia 2021 had an extensive program consisting of technical sessions covering all aspects of the multimedia field via oral, video and poster presentations, tutorials, panels, exhibits, demonstrations, workshops, a doctoral symposium, a multimedia grand challenge, brave new ideas on shaping the research landscape, an open source software competition, and an interactive arts program encouraging artists and computer scientists to meet and discover together the frontiers of artistic communication. The conference also added an industrial track to recognize research work with significant industrial value.
Link to the conference page: https://2021.acmmm.org/
MeronymNet: A Hierarchical Model for Unified and Controllable Multi-Category Object Generation was also presented at two ICCV 2021 workshops, the workshop on Structural and Compositional Learning on 3D Data and the workshop on Learning 3D Representations for Shape and Appearance, held as part of the International Conference on Computer Vision (ICCV) 2021.
ICCV is the premier international computer vision event, comprising the main conference and several co-located workshops and tutorials. With its high quality and low cost, it provides exceptional value for students, academics and industry researchers.
Links to the two workshops: https://geometry.stanford.edu/struco3d/papers.html and