This work proposes a mathematically grounded mixed precision accumulation strategy for neural network inference.
The strategy is based on a componentwise forward error analysis that explains how errors propagate through the forward pass of a neural network.
The analysis shows that the error in each component of a layer's output is proportional to the product of the condition numbers of the weights and the input, times the condition number of the activation function.
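To make the shape of such a bound concrete, consider a layer $y = \phi(Wx)$; a schematic componentwise bound (the notation below is assumed for illustration, and the exact constants and definitions are those of the paper's analysis) reads
\[
\frac{|\widehat{y}_i - y_i|}{|y_i|} \;\lesssim\; \kappa_\phi\!\big(w_i^{\top}x\big)\,\kappa\big(w_i, x\big)\, u_i,
\qquad
\kappa(w_i, x) = \frac{\sum_j |w_{ij}||x_j|}{\big|\sum_j w_{ij}x_j\big|},
\qquad
\kappa_\phi(t) = \left|\frac{t\,\phi'(t)}{\phi(t)}\right|,
\]
where $u_i$ is the unit roundoff of the precision used to accumulate component $i$: the larger the condition numbers, the more the accumulation error is amplified.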
The proposed algorithm uses this analysis to select accumulation precisions inversely proportional to these condition numbers, leading to an improved cost-accuracy tradeoff compared with uniform precision accumulation baselines.
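A minimal sketch of this idea (not the paper's reference implementation) is shown below: each output component is assigned the cheapest accumulation format whose unit roundoff, amplified by the estimated condition numbers, stays below a target error. The format set, thresholds, and helper names are illustrative assumptions.

import numpy as np

# Unit roundoffs of the candidate accumulation formats (assumed set).
UNIT_ROUNDOFF = {"bf16": 2.0**-8, "fp16": 2.0**-11, "fp32": 2.0**-24}

def inner_product_condition(W, x, eps=1e-30):
    """Componentwise condition numbers of the inner products W @ x."""
    num = np.abs(W) @ np.abs(x)     # sum_j |w_ij| |x_j|
    den = np.abs(W @ x) + eps       # |sum_j w_ij x_j|
    return num / den

def choose_precisions(W, x, target_error, kappa_act=1.0):
    """For each output component, pick the lowest precision whose roundoff
    times the condition numbers is below the target error."""
    kappa = inner_product_condition(W, x) * kappa_act
    # Order formats from largest to smallest unit roundoff (cheapest first).
    formats = sorted(UNIT_ROUNDOFF, key=UNIT_ROUNDOFF.get, reverse=True)
    choices = []
    for k in kappa:
        for fmt in formats:
            if k * UNIT_ROUNDOFF[fmt] <= target_error:
                choices.append(fmt)
                break
        else:
            choices.append(formats[-1])  # fall back to the most accurate format
    return choices

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, x = rng.standard_normal((4, 8)), rng.standard_normal(8)
    print(choose_precisions(W, x, target_error=1e-3))

In this sketch, well-conditioned components are accumulated in low precision at lower cost, while ill-conditioned components keep a higher precision, which is the mechanism behind the claimed cost-accuracy improvement.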