Oversmoothing is a major limitation of Graph Neural Networks (GNNs): as depth increases, node representations converge to a non-informative limit whenever the weights remain bounded.
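As a point of reference (the notation below is an assumption, since the abstract does not fix one), forward oversmoothing is commonly stated for the layer-wise update of a graph convolutional network with propagation matrix $P$ (e.g. the symmetrically normalized adjacency), weights $W^{(\ell)}$, and activation $\sigma$:
\[
H^{(\ell+1)} = \sigma\!\left(P\, H^{(\ell)} W^{(\ell)}\right),
\]
where, as the depth grows and the $\|W^{(\ell)}\|$ stay bounded, the rows of $H^{(\ell)}$ collapse toward a common, uninformative representation.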
This paper analyzes backward oversmoothing: the backpropagated errors used to compute gradients are themselves subject to oversmoothing, this time from the output layer toward the input.
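Under the same assumed notation, one way to see why the errors smooth in the reverse direction is the backward recursion of the chain rule,
\[
\frac{\partial \mathcal{L}}{\partial H^{(\ell)}}
= P^{\top}\!\left(\frac{\partial \mathcal{L}}{\partial H^{(\ell+1)}} \odot \sigma'\!\left(P H^{(\ell)} W^{(\ell)}\right)\right)\!\left(W^{(\ell)}\right)^{\top},
\]
so the same smoothing operator $P$ (or its transpose) is applied repeatedly to the backpropagated errors, layer after layer, from output to input.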
The interaction between forward and backward smoothing, in particular in the presence of nonlinear activation functions, plays a key role in this phenomenon and endows deep GNNs with many spurious stationary points.
These results shed light on the optimization landscape specific to deep GNNs and highlight how this issue sets them apart from other architectures such as Multi-Layer Perceptrons.