Exact Linear-System Solver Enables Linear-Time Hessian Inversion for Deep Networks
Algorithm Overview
A paper submitted to arXiv on Jan 1, 2026 presents an exact algorithm for solving linear systems involving the Hessian of deep neural networks. The method computes Hessian‑inverse‑vector products without storing the full Hessian or its inverse, with time and storage that scale linearly with the number of layers.
Scalability Advantages
The algorithm builds on Pearlmutter’s technique for Hessian‑vector products, extending it to compute inverse‑vector products efficiently. By avoiding quadratic storage and cubic computational costs typical of naive approaches, the new method promises practical applicability to tall‑skinny network architectures. Tall‑skinny networks, characterized by a large number of layers relative to parameter width, often arise in modern deep learning models that prioritize depth for expressive power. In such settings, the Hessian matrix can be extremely large, making conventional inversion infeasible.
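Pearlmutter's technique rests on the identity that a Hessian‑vector product is the directional derivative of the gradient, Hv = ∂/∂ε ∇f(w + εv) at ε = 0, which autodiff frameworks evaluate exactly by forward‑mode differentiation through a reverse‑mode gradient. As a dependency‑free numerical illustration of that identity (not the paper's code), the sketch below checks the finite‑difference form on a small quadratic whose Hessian is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Test function f(w) = 0.5 * w^T A w, so grad f(w) = A w and the Hessian is A.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)  # symmetric positive definite

def grad_f(w):
    return A @ w

w = rng.standard_normal(n)
v = rng.standard_normal(n)

# Pearlmutter's identity: H v = d/d(eps) grad_f(w + eps * v) at eps = 0.
# Approximated here with a central difference; autodiff computes it exactly.
eps = 1e-6
hvp = (grad_f(w + eps * v) - grad_f(w - eps * v)) / (2 * eps)

print(np.allclose(hvp, A @ v, atol=1e-6))  # prints True
```

The key point is that `hvp` is obtained from two gradient evaluations, never from the n-by-n Hessian itself, which is what makes the approach viable when the Hessian is too large to materialize.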
Implications for Deep Learning
According to the paper’s author, Ali Rahimi, the proposed algorithm processes each layer sequentially, maintaining only a constant‑size summary of intermediate results. This design enables the overall complexity to scale roughly linearly with the total number of parameters. Experimental results reported in the submission suggest that the method matches the runtime of standard Hessian‑vector multiplication while delivering exact inverse products, and the paper compares its performance against baseline approaches that first construct the full Hessian. Potential applications include second‑order optimization, uncertainty quantification, and sensitivity analysis in deep learning, where access to Hessian‑inverse information can improve convergence and model interpretability. The paper is classified under Machine Learning (cs.LG) on arXiv and carries the identifier arXiv:2601.06096. It is available under an open‑access license, allowing unrestricted redistribution with attribution. This report is based on the paper’s abstract; the full text is available via arXiv.
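Linear‑in‑depth scaling of this kind is characteristic of sequential solves on chain‑structured systems. As a hedged illustration only (the paper’s actual per‑layer recurrence is not reproduced here), the sketch below solves a block‑tridiagonal system with one forward‑elimination and one back‑substitution sweep: each step touches a single diagonal block, so the cost grows linearly with the number of blocks (“layers”) for a fixed block size, rather than cubically with the full dimension:

```python
import numpy as np

def block_tridiag_solve(D, L, U, b):
    """Solve a block-tridiagonal system with one forward/backward sweep.

    D: list of (k, k) diagonal blocks; L/U: sub-/super-diagonal blocks;
    b: list of length-k right-hand-side vectors. Cost is O(layers * k^3),
    i.e. linear in depth for fixed block size k.
    """
    n = len(D)
    Dp = [None] * n            # modified diagonal blocks
    bp = [None] * n            # modified right-hand sides
    Dp[0], bp[0] = D[0].copy(), b[0].copy()
    for i in range(1, n):      # forward elimination, one layer at a time
        W = L[i - 1] @ np.linalg.inv(Dp[i - 1])
        Dp[i] = D[i] - W @ U[i - 1]
        bp[i] = b[i] - W @ bp[i - 1]
    x = [None] * n
    x[-1] = np.linalg.solve(Dp[-1], bp[-1])
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = np.linalg.solve(Dp[i], bp[i] - U[i] @ x[i + 1])
    return np.concatenate(x)

# Small check against a dense solve (hypothetical test data).
rng = np.random.default_rng(1)
k, layers = 3, 4
D = [np.eye(k) * 5 + rng.standard_normal((k, k)) for _ in range(layers)]
L = [rng.standard_normal((k, k)) * 0.1 for _ in range(layers - 1)]
U = [rng.standard_normal((k, k)) * 0.1 for _ in range(layers - 1)]
b = [rng.standard_normal(k) for _ in range(layers)]

# Assemble the dense matrix only to verify correctness.
H = np.zeros((k * layers, k * layers))
for i in range(layers):
    H[i*k:(i+1)*k, i*k:(i+1)*k] = D[i]
for i in range(layers - 1):
    H[(i+1)*k:(i+2)*k, i*k:(i+1)*k] = L[i]
    H[i*k:(i+1)*k, (i+1)*k:(i+2)*k] = U[i]

x = block_tridiag_solve(D, L, U, b)
print(np.allclose(x, np.linalg.solve(H, np.concatenate(b))))  # prints True
```

Note how the sweep keeps only the current modified block `Dp[i]` and right‑hand side `bp[i]` live at each step, a constant‑size summary in the spirit of the layer‑sequential design described above.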