Dr. Praneeth Netrapalli, Research Scientist at Google Research India, Bengaluru, gave a talk on Mitigating Simplicity Bias in Deep Learning on 29 August. Here is a summary of the talk in Dr. Netrapalli's own words:
While deep neural networks have achieved large gains in performance on benchmark datasets, their performance often degrades drastically with changes in data distribution encountered during real-world deployment. In this work, through systematic experiments and theoretical analysis, we attempt to understand the key reasons behind such brittleness of neural networks in real-world settings and propose algorithms that can train models that are more robust to distribution shifts.
We first hypothesise, and through empirical and theoretical studies demonstrate, that (i) neural network training exhibits “simplicity bias” (SB), where the models learn only the simplest discriminative features, and (ii) SB is one of the key reasons behind the non-robustness of neural networks.
We then delve deeper into the nature of SB and find that while the network’s backbone learns both simple and complex features, it is the final classifier layer that fails to use the complex features in the eventual prediction. We posit two reasons for this:
- Dominance of non-robust features, and
- Replication of simple features, leading to over-dependence of the final-layer linear classifier on them,
and empirically validate these hypotheses on semi-synthetic and real-world datasets. We then propose two methods to deal with both of these phenomena, and show gains of up to 1.5% over the state-of-the-art on DomainBed – a standard and large-scale benchmark for domain generalisation.
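To make the shortcut behaviour concrete, here is a minimal, self-contained sketch (not from the talk; the toy dataset, feature names, and training loop are illustrative assumptions). A linear classifier is trained on two features: a "simple" one that perfectly predicts the label in training, and a "complex" one that is noisy but robust. The classifier comes to rely on the simple feature, so when a distribution shift makes that feature uninformative, accuracy degrades:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Binary labels
y = rng.integers(0, 2, n)

# "Simple" feature: perfectly predictive in training (a spurious shortcut)
x_simple = 2.0 * y - 1.0
# "Complex" feature: noisy but robust signal
x_complex = (2.0 * y - 1.0) + rng.normal(0.0, 1.0, n)

X = np.stack([x_simple, x_complex], axis=1)

def train_logreg(X, y, steps=2000, lr=0.5):
    """Plain gradient descent on logistic loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                      # gradient of the logistic loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def accuracy(X, y, w, b):
    return float(((X @ w + b > 0).astype(int) == y).mean())

w, b = train_logreg(X, y)

# The learned weights lean on the simple (shortcut) feature
print("weights:", w)

# Distribution shift: the shortcut feature becomes pure noise at test time
X_shift = X.copy()
X_shift[:, 0] = rng.normal(0.0, 1.0, n)

print("in-distribution accuracy:", accuracy(X, y, w, b))
print("shifted accuracy:", accuracy(X_shift, y, w, b))
```

In this toy setting the in-distribution accuracy is near-perfect while the shifted accuracy drops sharply, even though the robust (complex) feature alone would have generalised, mirroring the over-dependence of the final linear layer described above.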
We will end with some thoughts on what SB and robustness mean in the new world of large language models (LLMs).
The talk was based on several joint works with Anshul Nasery, Sravanti Addepalli, Depen Morwani, Harshay Shah, Jatin Batra, Kaustav Tamuly, Aditi Raghunathan, R Venkatesh Babu and Prateek Jain.
Dr. Praneeth Netrapalli is currently a research scientist at Google Research India, Bengaluru. He is also an adjunct professor at TIFR, Mumbai and a faculty associate of ICTS, Bengaluru. Prior to this, he was a researcher at Microsoft Research India for 4.5 years and did his postdoc at Microsoft Research New England in Cambridge, Massachusetts. Dr. Netrapalli obtained his M.S. and Ph.D. in ECE from UT Austin and his B.Tech in Electrical Engineering from IIT Bombay.
Dr. Praneeth Netrapalli has received several honours and awards, including the INSA Medal for Young Scientists (2021) and the IEEE Signal Processing Society Best Paper Award (2019), and was an Associate of the Indian Academy of Sciences (IASc) from 2019 to 2022.
Dr. Netrapalli's research interests are broadly in designing reliable and robust machine learning (ML) algorithms, representation learning, black-box optimization, time-series modeling, quantum optimization and minimax/game-theoretic optimization. He is also extremely interested in applying ML techniques to help solve problems from the sciences as well as to enable positive social outcomes.
August 2023