New Activation Functions Help Preserve Plasticity in Continual Learning Models
In a paper posted to arXiv in September 2025, researchers identified activation functions as a key factor in mitigating the loss of plasticity that plagues continual learning systems. The study, which spans both supervised class-incremental tasks and reinforcement-learning environments with shifting dynamics, demonstrates that carefully designed non-linearities can sustain model adaptability without adding extra capacity or task-specific tuning.
Background
Continual learning aims to enable models to acquire new knowledge over time, yet it often suffers from catastrophic forgetting and a gradual decline in the ability to adapt—referred to as loss of plasticity. While extensive benchmarking has compared activation functions in static, i.i.d. settings, their impact on the evolving demands of continual learning has received limited attention.
Methodology
The authors performed a property‑level analysis focusing on the negative‑branch shape and saturation behavior of activation functions. Building on these insights, they introduced two drop‑in replacements—Smooth‑Leaky and Randomized Smooth‑Leaky—that modify the curvature and stochasticity of the activation’s negative region.
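The abstract does not give the exact formulas, so the sketch below is only one plausible reading of the idea: a "smooth leaky" non-linearity that behaves like the identity for large positive inputs and like a small linear slope for large negative inputs, with a differentiable transition instead of a kink, plus a stochastic variant that samples the negative-branch slope during training (in the spirit of RReLU). All function names, parameter values, and the softplus-based form are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def smooth_leaky(x, alpha=0.1, beta=5.0):
    """Hypothetical smooth leaky activation (illustrative, not the paper's form).

    A softplus-based blend: for large positive x it approaches x, for large
    negative x it approaches alpha * x, and the transition near zero is
    smooth (infinitely differentiable) rather than piecewise-linear.
    """
    # np.logaddexp(0, z) computes log(1 + exp(z)) without overflow.
    return alpha * x + (1.0 - alpha) * np.logaddexp(0.0, beta * x) / beta

def randomized_smooth_leaky(x, alpha_low=0.05, alpha_high=0.3,
                            rng=None, training=True):
    """Hypothetical stochastic variant: randomize the negative-branch slope.

    During training, the slope is sampled uniformly per call; at evaluation
    time the mean slope is used, analogous to randomized leaky ReLU.
    """
    if training:
        rng = np.random.default_rng() if rng is None else rng
        alpha = rng.uniform(alpha_low, alpha_high)
    else:
        alpha = 0.5 * (alpha_low + alpha_high)
    return smooth_leaky(x, alpha=alpha)
```

The key property this sketch tries to capture is that the negative branch keeps a nonzero, smoothly varying slope, so gradients never vanish entirely for negative pre-activations, which is one intuitive route to preserving plasticity.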
Evaluation
Both novel activations were tested in two complementary scenarios. First, they were applied to supervised class‑incremental benchmarks that simulate sequential task acquisition. Second, they were integrated into reinforcement‑learning agents operating in non‑stationary MuJoCo environments designed to produce controlled distribution and dynamics shifts. The authors also proposed a stress protocol to diagnose how activation shape influences adaptation under change.
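The paper's actual stress protocol is not described in the abstract, but one standard proxy for lost plasticity illustrates why activation shape matters under distribution shift: counting "dead" units whose activation gradient is (near) zero across an entire batch. The helper below is a hypothetical diagnostic of this kind, not the authors' protocol; the demo shows that a shift pushing pre-activations strongly negative kills ReLU units while leaky-style units retain gradient signal.

```python
import numpy as np

def dead_unit_fraction(activation_fn, pre, eps=1e-6):
    """Fraction of units whose activation gradient is ~0 over the whole batch.

    `pre` has shape (batch, units); the gradient is estimated by central
    finite differences. A unit counts as "dead" (a common plasticity-loss
    proxy) only if its gradient magnitude is negligible for every sample.
    """
    grad = (activation_fn(pre + eps) - activation_fn(pre - eps)) / (2.0 * eps)
    return float(np.mean(np.all(np.abs(grad) < 1e-3, axis=0)))

# Demo: simulate a distribution shift that drives pre-activations negative.
rng = np.random.default_rng(0)
pre = rng.normal(loc=-5.0, scale=1.0, size=(256, 16))

relu = lambda z: np.maximum(z, 0.0)
leaky = lambda z: np.where(z >= 0.0, z, 0.1 * z)

print(dead_unit_fraction(relu, pre))   # near 1.0: ReLU units go dead
print(dead_unit_fraction(leaky, pre))  # 0.0: leaky slope keeps gradients alive
```

Activations with a nonzero negative branch keep every unit trainable after such a shift, which is the mechanism the property-level analysis in the paper appears to target.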
Findings
Results indicate that the choice of activation function serves as an architecture‑agnostic lever for reducing plasticity loss. Models equipped with Smooth‑Leaky or Randomized Smooth‑Leaky consistently outperformed baseline activations across the evaluated tasks, achieving higher retention of previously learned knowledge while maintaining responsiveness to new data.
Implications
The study suggests that thoughtful activation design offers a lightweight, domain‑general strategy for sustaining plasticity in continual learning systems. Because the proposed functions require no additional parameters or specialized training regimes, they can be readily adopted in existing pipelines.
Future Directions
According to the paper, further research will explore the interaction of these activations with other continual‑learning techniques and assess their effectiveness across a broader spectrum of real‑world applications.
This report is based on the abstract of the research paper, an open-access preprint distributed via arXiv, where the full text is available.