Researchers introduce a Bayesian approach to understanding the success of overparameterized deep neural networks (DNNs), taking into account network architecture, training algorithm, and data structure.
They show that DNNs exhibit an Occam's razor-like inductive bias towards simple functions, strong enough to counteract the exponential growth in the number of possible functions with complexity, and argue that this bias underlies the networks' remarkable performance.
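One way to make the counteracting argument concrete is the rough counting sketch below, borrowed from the broader simplicity-bias literature rather than stated in this summary; here $\tilde{K}(f)$ is a computable complexity proxy (for example, the Lempel-Ziv complexity of the function's truth table) and $a$, $b$ are fitted constants:

\[
P(f) \;\lesssim\; 2^{-a\tilde{K}(f) + b},
\qquad
\#\{\, f : \tilde{K}(f) \approx K \,\} \;\sim\; 2^{K},
\]

so the total prior mass on functions of complexity around $K$ is at most of order $2^{(1-a)K + b}$: the exponential decay of the prior on any individual complex function offsets the exponential growth in how many such functions there are, leaving simple functions with a non-negligible share of the probability.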
By analyzing Boolean function classification and using a prior over functions determined by the network, the researchers accurately predict the posterior distribution over functions for DNNs trained with stochastic gradient descent (SGD).
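The prior-over-functions idea can be illustrated with a small sampling experiment. The sketch below is an assumption made for illustration, not the paper's procedure: the architecture, the constants N_INPUTS, HIDDEN, and N_SAMPLES, and the zero/one likelihood are all chosen just for the demo. It estimates the prior a randomly initialised one-hidden-layer network places on Boolean functions of five inputs, then forms the Bayesian posterior over the sampled functions that agree with a small training set.

# A minimal, illustrative sketch (not the paper's exact procedure): estimate the
# prior P(f) that a randomly initialised one-hidden-layer network places on
# Boolean functions of N_INPUTS bits, then form the Bayesian posterior
# P(f | D) proportional to P(f) * 1[f is consistent with the training data D].
# N_INPUTS, HIDDEN, N_SAMPLES and the architecture are assumptions for the demo.
import itertools
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)

N_INPUTS = 5         # Boolean functions on {0, 1}^5
HIDDEN = 32          # hidden-layer width (assumed)
N_SAMPLES = 100_000  # number of random parameter draws

# All 2^n Boolean inputs, mapped to {-1, +1} before feeding the network.
X = np.array(list(itertools.product([0, 1], repeat=N_INPUTS)), dtype=float)
X_pm = 2.0 * X - 1.0

def sample_function():
    """Draw random weights and return the Boolean function the network
    implements, encoded as a tuple of 0/1 outputs over all 2^n inputs."""
    w1 = rng.normal(0.0, 1.0 / np.sqrt(N_INPUTS), size=(N_INPUTS, HIDDEN))
    b1 = rng.normal(0.0, 1.0, size=HIDDEN)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(HIDDEN), size=HIDDEN)
    b2 = rng.normal(0.0, 1.0)
    h = np.maximum(X_pm @ w1 + b1, 0.0)      # ReLU hidden layer
    logits = h @ w2 + b2
    return tuple((logits > 0).astype(int))   # threshold output -> Boolean function

# Empirical prior: how often each function appears under random initialisation.
counts = Counter(sample_function() for _ in range(N_SAMPLES))
prior = {f: c / N_SAMPLES for f, c in counts.items()}

# Training data D: values of a target function on a random subset of inputs.
# For the demo the target is simply the most probable sampled function.
target = max(prior, key=prior.get)
train_idx = rng.choice(2 ** N_INPUTS, size=16, replace=False)

# Zero/one likelihood: posterior mass goes only to functions that fit D exactly.
consistent = {f: p for f, p in prior.items()
              if all(f[i] == target[i] for i in train_idx)}
Z = sum(consistent.values())
posterior = {f: p / Z for f, p in consistent.items()}

print(f"distinct functions sampled:         {len(prior)}")
print(f"functions consistent with D:        {len(posterior)}")
print(f"max posterior mass on one function: {max(posterior.values()):.3f}")

Comparing such a sampled posterior with the distribution of functions actually found by SGD-trained networks is, in spirit, the kind of check the summary describes, although the study works with larger systems and more careful estimators.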
The study demonstrates that structured data, combined with this intrinsic Occam's razor-like inductive bias, plays a significant role in the success of deep neural networks.