Re-posted from: https://www.stochasticlifestyle.com/the-numerical-analysis-of-differentiable-simulation-automatic-differentiation-can-be-incorrect/
The Numerical Analysis of Differentiable Simulation: How Automatic Differentiation of Physics Can Give Incorrect Derivatives
Scientific machine learning (SciML) relies heavily on automatic differentiation (AD): computing gradients of models that embed machine learning within mechanistic simulations, so that the combined system can be trained with gradient-based optimization. While differentiable programming is often pitched as “simply put the simulator into a loss function and use AD”, in practice there are many more subtle details to consider. In this talk we will dive into the numerical analysis of differentiable simulation and ask the question: how numerically stable and robust is AD? We will use examples from the Python libraries JAX (diffrax) and PyTorch (torchdiffeq) to demonstrate how canonical formulations of AD and adjoint methods can give inaccurate gradients for ODEs and PDEs. We demonstrate cases where the methodologies are “mathematically correct” but, due to the intricacies of numerical error propagation, give errors of 60% or more even in simple cases such as linear ODEs. We will then describe some of the non-standard modifications to AD made in the Julia SciML libraries to overcome these numerical instabilities and achieve accurate results, along with the engineering trade-offs required in the process. The audience should leave with a greater appreciation of the numerical challenges which still need to be addressed in the field of AD for SciML.
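As a concrete illustration of the kind of check the talk discusses, here is a minimal sketch using JAX and diffrax. It assumes diffrax's documented `diffeqsolve`, `ODETerm`, `Tsit5`, `PIDController`, `BacksolveAdjoint`, and `RecursiveCheckpointAdjoint` APIs; the specific ODE, tolerances, and parameter values are illustrative choices, not the exact benchmark from the talk. It compares the continuous “backsolve” adjoint and checkpointed reverse-mode AD through the solver against the hand-derived sensitivity of a linear ODE.

```python
import jax
import jax.numpy as jnp
import diffrax

# Linear test problem: dy/dt = -a*y, y(0) = 1, so y(T) = exp(-a*T)
# and the exact parameter sensitivity is d y(T) / d a = -T * exp(-a*T).

def vector_field(t, y, args):
    a = args
    return -a * y

term = diffrax.ODETerm(vector_field)
solver = diffrax.Tsit5()
controller = diffrax.PIDController(rtol=1e-6, atol=1e-8)
T = 10.0

def terminal_value(a, adjoint):
    sol = diffrax.diffeqsolve(
        term, solver, t0=0.0, t1=T, dt0=0.01,
        y0=jnp.array(1.0), args=a,
        stepsize_controller=controller, adjoint=adjoint,
    )
    return sol.ys[-1]

a = 2.0
exact = -T * jnp.exp(-a * T)

# Continuous "backsolve" adjoint: reconstructs the state by re-integrating the
# ODE backwards in time. For forward-decaying dynamics like this one, the
# reversed ODE is expanding, so reconstruction errors can be amplified.
grad_backsolve = jax.grad(terminal_value)(a, diffrax.BacksolveAdjoint())

# Discrete adjoint via checkpointed reverse-mode AD through the solver steps:
# the forward trajectory is stored/recomputed rather than re-integrated.
grad_checkpoint = jax.grad(terminal_value)(a, diffrax.RecursiveCheckpointAdjoint())

print("analytic  :", float(exact))
print("backsolve :", float(grad_backsolve))
print("checkpoint:", float(grad_checkpoint))
```

How far the two computed gradients drift from the analytic value (and from each other) depends on tolerances, precision, and the dynamics being differentiated; probing exactly that gap, and how the Julia SciML libraries close it, is what the talk is about.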