Researchers Propose New Activations to Mitigate Learning Freeze in Evidential Deep Learning
A study posted to arXiv in December 2025 outlines an approach to a persistent training issue in evidential deep learning models. The paper, authored by a team of machine-learning researchers, introduces a family of activation functions and accompanying regularizers designed to keep gradient magnitudes stable across low-evidence regions. By doing so, the work aims to improve the reliability of the uncertainty estimates produced by deterministic neural networks.
Background on Evidential Deep Learning
Evidential deep learning (EDL) extends conventional neural networks with a probabilistic layer based on Subjective Logic, enabling the model to express both predictions and associated uncertainty. This framework has attracted interest for applications that require calibrated confidence, such as medical diagnosis and autonomous systems.
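For concreteness, the standard Subjective Logic head used in EDL can be sketched in a few lines of Python. The formulas (non-negative evidence, Dirichlet parameters alpha = evidence + 1, vacuity u = K/S) are the textbook formulation; the ReLU evidence function and the variable names are illustrative choices, not details taken from this paper.

```python
import numpy as np

def edl_head(logits, activation=lambda z: np.maximum(z, 0.0)):
    """Minimal Subjective Logic head: map raw outputs to belief and vacuity.

    evidence e_k >= 0, Dirichlet parameters alpha_k = e_k + 1,
    belief b_k = e_k / S, vacuity u = K / S, where S = sum_k alpha_k.
    """
    evidence = activation(logits)           # non-negative evidence per class
    alpha = evidence + 1.0                  # Dirichlet concentration parameters
    strength = alpha.sum()                  # S = sum_k alpha_k
    belief = evidence / strength            # per-class belief mass
    uncertainty = len(alpha) / strength     # vacuity: high when evidence is low
    prob = alpha / strength                 # expected categorical probabilities
    return belief, uncertainty, prob

# Example: three classes, one strong logit and two suppressed ones.
belief, u, p = edl_head(np.array([4.0, -1.0, -2.0]))
print(belief, u, p)   # high belief for class 0, modest residual vacuity
```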
Activation‑Dependent Learning Freeze
Within the Subjective‑Logic formulation, evidence values must remain non‑negative, which forces the use of specific activation functions. Prior observations indicate that certain activations cause gradients to vanish when inputs map to low‑evidence zones, a phenomenon the authors refer to as “learning freeze.” This effect hampers model convergence, particularly on challenging or under‑represented samples.
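To see why the freeze occurs, consider the following toy example (ours, not the paper's), which pairs a ReLU evidence head with the expected cross-entropy term commonly used as an EDL objective. A sample whose pre-activations all land in the flat region receives exactly zero gradient:

```python
import torch

# Illustrative only: with a ReLU evidence head, a sample whose logits are
# all negative (a "low-evidence zone") yields exactly zero gradient, so the
# network can never accumulate evidence for it -- the "learning freeze".
logits = torch.tensor([-2.0, -0.5, -1.5], requires_grad=True)
evidence = torch.relu(logits)              # e = max(z, 0): all zeros here
alpha = evidence + 1.0                     # uniform Dirichlet, maximal vacuity
target = torch.tensor([1.0, 0.0, 0.0])

# Expected cross-entropy under the Dirichlet, a standard EDL loss term:
# E[-log p_k] = digamma(S) - digamma(alpha_k) for the true class k.
strength = alpha.sum()
loss = (target * (torch.digamma(strength) - torch.digamma(alpha))).sum()
loss.backward()
print(logits.grad)   # tensor([0., 0., 0.]) -- no learning signal at all
```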
Theoretical Analysis of Gradient Behavior
The authors provide a mathematical characterization of the freeze regime, demonstrating how the geometry of common activations—such as ReLU‑based and softplus variants—produces near‑zero gradients in low‑evidence regions. Their analysis quantifies the relationship between activation curvature and the magnitude of the evidential loss gradient.
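In chain-rule terms (our notation, consistent with the sketches above rather than lifted from the paper), the gradient reaching each logit is gated by the slope of the evidence activation:

```latex
% The evidential loss gradient is gated by the activation slope f'(z_k):
\[
\frac{\partial \mathcal{L}}{\partial z_k}
  = \frac{\partial \mathcal{L}}{\partial \alpha_k}
    \cdot \frac{\partial \alpha_k}{\partial e_k}
    \cdot \frac{\partial e_k}{\partial z_k}
  = \frac{\partial \mathcal{L}}{\partial \alpha_k}\, f'(z_k),
\qquad e_k = f(z_k), \quad \alpha_k = e_k + 1 .
\]
% For the usual choices, the slope collapses exactly where evidence is low:
\[
f_{\mathrm{ReLU}}'(z) = \mathbf{1}[z > 0], \qquad
f_{\mathrm{softplus}}'(z) = \sigma(z) = \frac{1}{1 + e^{-z}}
  \to 0 \ \text{as}\ z \to -\infty .
\]
```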
Generalized Activation Functions and Regularizers
Building on the theoretical insights, the paper proposes a parametric family of activations that maintain smooth, non‑negative evidence while avoiding extreme flattening. Complementary regularization terms are introduced to enforce consistent evidence updates regardless of the activation regime, ensuring that gradient flow remains adequate throughout training.
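The abstract does not give the exact parameterization, so the sketch below is a hypothetical stand-in that only illustrates the general shape such a design could take. The names tempered_softplus and evidence_flow_regularizer, and all constants, are our own, not the paper's.

```python
import torch
import torch.nn.functional as F

def tempered_softplus(z, beta=0.5):
    """Hypothetical parametric evidence activation (NOT the paper's family).

    softplus(beta * z) / beta is smooth and non-negative; a smaller beta
    keeps the slope sigma(beta * z) away from zero over a wider input
    range, shrinking the flat, gradient-starved region."""
    return F.softplus(beta * z) / beta

def evidence_flow_regularizer(z, beta=0.5, floor=0.05):
    """Hypothetical penalty, likewise an assumption: charge pre-activations
    whose local slope falls below `floor`, nudging samples out of the
    frozen low-evidence regime instead of letting them stall there."""
    slope = torch.sigmoid(beta * z)        # d/dz of softplus(beta*z)/beta
    return F.relu(floor - slope).mean()

# Usage sketch: combine with any EDL objective `edl_loss` during training.
z = torch.randn(8, 10, requires_grad=True)  # pre-activation outputs
alpha = tempered_softplus(z) + 1.0          # Dirichlet parameters
reg = evidence_flow_regularizer(z)
# total_loss = edl_loss(alpha, targets) + 0.1 * reg
```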
Empirical Evaluation Across Benchmarks
Extensive experiments were conducted on four standard image‑classification datasets—MNIST, CIFAR‑10, CIFAR‑100, and Tiny‑ImageNet—as well as two few‑shot learning tasks and a blind face‑restoration problem. Results show that models employing the proposed activations achieve higher accuracy and more reliable uncertainty metrics compared with baseline EDL implementations.
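The abstract does not spell out the evaluation protocol. One common way uncertainty reliability is scored in the EDL literature is misclassification detection, where the Dirichlet vacuity should rank wrong predictions above correct ones; the snippet below is a minimal sketch under that assumption, not the paper's stated procedure.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def misclassification_auroc(alpha, labels):
    """alpha: (N, K) Dirichlet parameters; labels: (N,) true classes.
    Scores how well vacuity u = K / S separates errors from correct hits."""
    strength = alpha.sum(axis=1)
    vacuity = alpha.shape[1] / strength          # u = K / S per sample
    preds = alpha.argmax(axis=1)                 # predicted class
    errors = (preds != labels).astype(int)       # 1 where the model is wrong
    return roc_auc_score(errors, vacuity)        # 1.0 = vacuity flags all errors

# Toy usage: one confident-correct sample, one vacuous-wrong sample.
alpha = np.array([[9.0, 1.0, 1.0], [1.1, 1.0, 1.0]])
labels = np.array([0, 2])
print(misclassification_auroc(alpha, labels))    # 1.0 for this toy case
```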
Conclusions and Outlook
The findings suggest that careful design of evidential activations can mitigate learning‑freeze effects without sacrificing the theoretical benefits of the Subjective‑Logic framework. The authors anticipate that their generalized approach will facilitate broader adoption of uncertainty‑aware neural networks in safety‑critical domains.
This report is based on the abstract of the research paper, an open-access preprint; the full text is available via arXiv.