Practitioners often treat gradients of neural networks as inputs to task-specific algorithms for optimization, editing, and analysis.
A new paper introduces GradMetaNet, an architecture designed specifically for processing gradients, guided by principles such as equivariance to the permutation symmetries of neurons and an efficient representation of the gradients themselves.
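To make the "efficient gradient representation" principle concrete: for a linear layer, the per-example weight gradient is the outer product of the backpropagated output gradient and the input activation, so it can be stored via two vectors instead of a full matrix. Here is a minimal PyTorch sketch of that rank-1 structure (illustrative only, not the paper's code):

```python
import torch

torch.manual_seed(0)
W = torch.randn(4, 3, requires_grad=True)
x = torch.randn(3)

y = W @ x                      # linear layer output
loss = y.pow(2).sum()
loss.backward()

g_out = (2 * y).detach()       # dL/dy for this squared loss
rank1 = torch.outer(g_out, x)  # dL/dW = (dL/dy) x^T, a rank-1 matrix
print(torch.allclose(W.grad, rank1))  # True
```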
GradMetaNet is shown to approximate natural-gradient-based functions that previous approaches cannot, and it outperforms prior methods on tasks such as learned optimization, INR editing, and loss-landscape curvature estimation.
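For intuition on why such functions require sets of gradients across data points: curvature-related quantities like the natural gradient depend on second-order information that can be estimated from per-example gradients, for instance via the empirical Fisher matrix. A hedged sketch, assuming a matrix `G` whose rows are per-example gradients (the setup and names are ours, not the paper's):

```python
import torch

torch.manual_seed(0)
n, p = 32, 10
G = torch.randn(n, p)        # rows: per-example gradients (assumed given)

fisher = G.T @ G / n         # empirical Fisher, a standard curvature proxy
eigvals = torch.linalg.eigvalsh(fisher)
print(eigvals[-3:])          # magnitudes of the sharpest directions

# A (damped) natural-gradient step preconditions the mean gradient:
mean_grad = G.mean(dim=0)
nat_grad = torch.linalg.solve(fisher + 1e-3 * torch.eye(p), mean_grad)
```

No single averaged gradient would suffice here: the Fisher estimate is built from the spread of gradients over examples, which is why processing gradient sets matters.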
The architecture is built from simple equivariant blocks, comes with universality guarantees, and proves effective across a variety of gradient-based tasks on MLPs and Transformers.
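To illustrate what a simple equivariant block can look like, here is a DeepSets-style layer that is equivariant to permutations of its input rows. This is a generic stand-in to convey the idea, not the paper's block, which targets the specific neuron-permutation symmetries of gradients:

```python
import torch
import torch.nn as nn

class EquivariantBlock(nn.Module):
    """Row-permutation-equivariant layer: a per-row map plus a pooled term."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.local = nn.Linear(d_in, d_out)    # applied to each row
        self.pooled = nn.Linear(d_in, d_out)   # applied to the mean row

    def forward(self, x):                      # x: (n, d_in)
        return self.local(x) + self.pooled(x.mean(dim=0, keepdim=True))

block = EquivariantBlock(8, 16)
x = torch.randn(5, 8)
perm = torch.randperm(5)
# Permuting inputs then applying the block equals applying then permuting.
print(torch.allclose(block(x)[perm], block(x[perm]), atol=1e-6))  # True
```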