HE‑Optimized Neural Network Architecture Delivers Up to 9.78× Faster Inference
In January 2026, researchers announced StriaNet, a neural‑network design specifically engineered for homomorphic encryption (HE) workloads, achieving speedups of 9.78× on ImageNet, 6.01× on Tiny ImageNet, and 9.24× on CIFAR‑10 while maintaining comparable accuracy. The work targets Machine Learning as a Service (MLaaS) platforms that must protect client data during inference.
Background
Privacy‑preserving deep learning typically relies on HE to perform linear operations on encrypted data, but the associated computational overhead has limited practical deployment. Prior efforts have focused on adapting existing plaintext models, which often inherit architectural inefficiencies that exacerbate HE costs.
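To see why HE overhead is dominated by data movement rather than arithmetic, consider the standard rotate-and-sum pattern: in SIMD-packed schemes such as CKKS or BFV, a vector is packed into the "slots" of one ciphertext, and summing across slots requires a sequence of ciphertext rotations. The sketch below is illustrative only (not from the paper); plaintext `np.roll` stands in for the far more expensive homomorphic rotation.

```python
import numpy as np

def rotate_and_sum(slots: np.ndarray) -> np.ndarray:
    """Sum all slot values using log2(n) rotate-and-add steps.

    Each np.roll models one homomorphic rotation; after log2(n)
    steps every slot holds the total.
    """
    n = len(slots)
    assert n & (n - 1) == 0, "slot count must be a power of two"
    acc = slots.copy()
    shift = n // 2
    while shift >= 1:
        acc = acc + np.roll(acc, shift)  # one rotation + one addition
        shift //= 2
    return acc

v = np.arange(8, dtype=float)  # 8 packed slots: 0..7
print(rotate_and_sum(v))       # every slot now holds 28.0
```

Each rotation in a real HE library requires a key-switching operation, which is orders of magnitude costlier than the accompanying addition, so designs that cut rotation counts cut wall-clock time almost proportionally.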
Innovative Building Block
The authors introduce StriaBlock, a component designed to minimize the most expensive HE operation—ciphertext rotation. By integrating an ExRot‑Free Convolution technique and a novel Cross Kernel mechanism, StriaBlock eliminates external rotations and reduces internal rotations to roughly 19% of those required by conventional plaintext architectures.
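A rough rotation-accounting sketch makes the savings concrete. The model below is an assumption for illustration, not the paper's exact cost model: under single-channel SIMD packing, a k×k convolution needs roughly one rotation per kernel offset (k²−1, since the center tap needs none), plus extra rotations whenever channel sums must be formed inside a ciphertext. A block that avoids those extra sums cuts the per-output rotation count.

```python
def conv_rotations(k: int, internal_channel_sums: int = 0) -> int:
    """Rough rotation count for one output channel of a k x k convolution.

    Assumed cost model (illustrative): one rotation per non-center kernel
    tap, plus one per channel sum performed inside a packed ciphertext.
    """
    kernel_offsets = k * k - 1  # shift input to each tap position
    return kernel_offsets + internal_channel_sums

# A conventional packed 3x3 convolution that also folds 8 channel sums
# inside the ciphertext, versus one that avoids those internal sums:
baseline = conv_rotations(3, internal_channel_sums=8)
reduced = conv_rotations(3, internal_channel_sums=0)
print(baseline, reduced)  # the reduced variant halves rotations here
```

The actual ExRot-Free Convolution and Cross Kernel mechanisms are more involved; this sketch only shows the kind of accounting that motivates them.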
Architectural Principles
Two guiding principles shape the overall network. The Focused Constraint Principle restricts cost‑sensitive factors such as rotation while preserving flexibility elsewhere. The Channel Packing‑Aware Scaling Principle adjusts bottleneck ratios to align with the varying ciphertext channel capacity that occurs at different network depths.
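The second principle can be sketched with simple arithmetic (the numbers below are assumptions for illustration, not the paper's configuration): an HE ciphertext has a fixed slot count, so as feature maps shrink with depth, more channels fit into a single ciphertext, and bottleneck ratios can be widened at deeper stages without consuming additional ciphertexts.

```python
SLOTS = 2 ** 14  # a typical CKKS slot count; assumed for illustration

def channels_per_ciphertext(height: int, width: int, slots: int = SLOTS) -> int:
    """How many feature-map channels one ciphertext can pack
    when each channel occupies height * width slots."""
    return slots // (height * width)

# ResNet-style stage resolutions (illustrative): capacity grows with depth.
for hw in (56, 28, 14, 7):
    print(hw, channels_per_ciphertext(hw, hw))
```

Because per-ciphertext channel capacity grows sharply at lower resolutions, a depth-uniform bottleneck ratio wastes capacity early and starves it late; scaling the ratio to match packing capacity is the stated motivation for the principle.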
Performance Evaluation
StriaNet was benchmarked on three widely used image classification datasets. Across all tests, the HE‑tailored design delivered substantial runtime reductions without sacrificing predictive performance, demonstrating the practicality of designing networks around encryption constraints rather than retrofitting existing models.
Implications
These results suggest that future MLaaS offerings could provide stronger privacy guarantees with acceptable latency, potentially expanding the adoption of encrypted inference in sensitive domains. The authors indicate that further research will explore extending the approach to other model families and more complex tasks.
This report is based on the abstract of an open-access preprint hosted on arXiv; the full text is available via arXiv.