Eli Bendersky: Reverse mode Automatic Differentiation

  • Automatic Differentiation (AD) is an important technique for computing exact derivatives of computer programs. Reverse mode AD works for functions with any number of outputs and is the variant most commonly used in machine learning.
  • AD treats a computation as a nested sequence of function compositions, then calculates the derivative of the outputs with respect to the inputs by repeated application of the chain rule. There are two main modes of AD: forward mode and reverse mode.
  • Linear chain graphs, a sequence of function compositions with a single input and a single output, are the simplest case: the derivative is just the product of the local derivatives along the chain (see the first sketch after this list). Reverse mode AD is a generalization of the backpropagation technique used to train neural networks.
  • General DAGs are more complex, with nodes that fan out and fan in, but they can still be handled with the multivariate chain rule: gradients arriving along different paths to a node are summed. Expensive full Jacobians are not required in reverse mode AD; all it needs is a function that takes a row vector v and returns the vector-Jacobian product (VJP) v·J, as the second sketch below shows.
  • This is why reverse mode AD and VJPs are so valued in practice: real-world Jacobians are often huge (and frequently sparse), and computing the VJP directly avoids ever materializing them, which keeps the computation efficient.
  • The Var class uses operator overloading and custom functions to construct the computation graph behind the scenes; the grad method then runs reverse mode AD over that graph (a minimal sketch of this idea appears below).
  • Reverse mode AD is the common choice in machine learning because the typical training objective maps many parameters to a single scalar loss, and reverse mode computes the gradient with respect to all of them in one backward pass.
  • Reverse mode AD and VJPs are therefore central to implementing machine learning algorithms and training neural networks.
  • For larger projects, professional AD implementations such as Autograd and JAX offer better ergonomics and performance than a toy implementation (see the JAX example at the end).
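
To make the chain-rule view concrete, here is a small illustrative sketch (plain Python, not code from the article) that differentiates the linear chain f(x) = sin(x**2) by multiplying the local derivatives of its two stages. On a chain like this, forward and reverse mode compute the same product, just in opposite orders.

```python
import math

# f(x) = sin(x**2) as a linear chain: x -> u = x**2 -> y = sin(u).
# The chain rule gives dy/dx = dy/du * du/dx = cos(u) * 2*x.
def f_and_grad(x):
    u = x * x                 # forward pass: compute intermediate values
    y = math.sin(u)

    dy_du = math.cos(u)       # local derivative of sin at u
    du_dx = 2.0 * x           # local derivative of x**2 at x

    # Reverse mode walks the chain from output back to input,
    # accumulating the product of the local derivatives.
    dy_dx = dy_du * du_dx
    return y, dy_dx

y, g = f_and_grad(3.0)
print(y, g)                   # sin(9) and cos(9) * 6
```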
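
The VJP point can be seen in a hand-written NumPy example (the function f below is invented for illustration): for a small map from R^3 to R^2 the full 2x3 Jacobian is easy to write down, but the product v·J can be computed directly without ever forming it, and that product is all reverse mode needs.

```python
import numpy as np

def f(x):
    # A toy map f: R^3 -> R^2.
    return np.array([x[0] * x[1], np.sin(x[2])])

def full_jacobian(x):
    # The 2x3 Jacobian of f, written out by hand.
    return np.array([
        [x[1], x[0], 0.0],
        [0.0,  0.0,  np.cos(x[2])],
    ])

def vjp(x, v):
    # v @ J computed directly: each input's entry sums the contributions
    # of the outputs it feeds into. No 2x3 matrix is ever materialized.
    return np.array([v[0] * x[1],
                     v[0] * x[0],
                     v[1] * np.cos(x[2])])

x = np.array([2.0, 3.0, 0.5])
v = np.array([1.0, -1.0])
assert np.allclose(v @ full_jacobian(x), vjp(x, v))
```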
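
The article builds this machinery around a Var class; the sketch below reimplements the same idea from scratch rather than reproducing the article's code, so the exact names (backward as the method, grad as the accumulated-gradient attribute) and the graph representation are assumptions.

```python
import math

class Var:
    """One node of the computation graph: a value, the parent nodes it was
    computed from, and the local derivative with respect to each parent."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # sequence of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self):
        # Reverse mode AD: visit nodes in reverse topological order, pushing
        # each node's gradient to its parents weighted by the local derivative.
        order, seen = [], set()
        def topo(v):
            if v not in seen:
                seen.add(v)
                for parent, _ in v.parents:
                    topo(parent)
                order.append(v)
        topo(self)
        self.grad = 1.0          # seed: d(output)/d(output) = 1
        for v in reversed(order):
            for parent, local in v.parents:
                parent.grad += local * v.grad

def sin(x):
    return Var(math.sin(x.value), [(x, math.cos(x.value))])

# f(x, y) = x*y + sin(x);  df/dx = y + cos(x),  df/dy = x
x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)
f.backward()
print(x.grad, y.grad)   # 3 + cos(2) ≈ 2.584, and 2.0
```

Note that x feeds into two nodes (the product and the sine), so its gradient accumulates contributions from both paths; that is exactly the multivariate chain rule for DAGs mentioned above.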
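
For comparison with a production library, the snippet below applies jax.vjp and jax.grad to the same kind of toy function; the function and inputs are made up for illustration, but the two JAX entry points are real APIs.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Same kind of toy map R^3 -> R^2 as in the NumPy sketch above.
    return jnp.stack([x[0] * x[1], jnp.sin(x[2])])

x = jnp.array([2.0, 3.0, 0.5])

# jax.vjp returns f(x) plus a closure that computes v^T J for any
# cotangent vector v, without ever building the full Jacobian.
y, vjp_fn = jax.vjp(f, x)
(cotangent,) = vjp_fn(jnp.array([1.0, -1.0]))
print(cotangent)          # ≈ [3.0, 2.0, -0.8776]

# For a scalar loss, jax.grad is the usual entry point: reverse mode AD
# with the seed cotangent fixed to 1.0.
loss = lambda v: jnp.sum(f(v) ** 2)
print(jax.grad(loss)(x))
```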
