🧠 AI | 🟢 Bullish | Importance 6/10

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

arXiv – CS AI | Yuhan Ma, Yong Li, Stefan Schmid | June 9, 2026 at 04:00 AM

🤖 AI Summary

FuseFSS is a new compiler that streamlines secure LLM inference by consolidating fragmented protocol designs into a unified pipeline. It achieves a 1.24x-1.50x end-to-end speedup and reduces online communication by 9-16% compared to existing function secret sharing (FSS) approaches. The technology lets users query large language models without revealing their prompts, and it targets a known bottleneck in cryptographic inference systems: the overhead of custom, per-operator protocols.

Analysis

FuseFSS represents a meaningful advancement in the infrastructure supporting private AI inference, a domain gaining traction as enterprises and users increasingly demand computational confidentiality. The research tackles a genuine inefficiency in current two-server secure inference systems: while linear operations became efficient through function secret sharing (FSS), nonlinear operations required custom protocols with substantial overhead. By unifying these disparate implementations into a single compilation framework, the authors eliminate redundant engineering and preprocessing work.
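To make the two-server setting concrete, here is a minimal Python sketch of the masked-input evaluation pattern commonly used in FSS-based protocols, applied to a single nonlinear function (ReLU). It is an illustration under loud assumptions, not FuseFSS's protocol: the 8-bit ring `N`, the helper names (`gen_keys`, `eval_share`, `secure_apply`), and the use of a fully shared lookup table as the "key" are all hypothetical. Real FSS schemes compress these keys far below the table size; the naive table only shows why each server learns nothing on its own while the two outputs still sum to the right answer.

```python
import secrets

N = 256  # toy ring Z_N for inputs and outputs (assumption: 8-bit domain)

def relu_signed(x: int) -> int:
    """ReLU on Z_N, reading values >= N/2 as negative (two's complement)."""
    return x if x < N // 2 else 0

def gen_keys(f):
    """Dealer (offline): pick an input mask, then additively share f's shifted table."""
    r_in = secrets.randbelow(N)
    # Table of g(y) = f(y - r_in), so that g(x + r_in) = f(x).
    table = [f((y - r_in) % N) for y in range(N)]
    key0 = [secrets.randbelow(N) for _ in range(N)]       # uniformly random
    key1 = [(t - k) % N for t, k in zip(table, key0)]     # table minus key0
    return r_in, key0, key1

def eval_share(key, masked_x: int) -> int:
    """Server (online): a purely local lookup at the public masked input."""
    return key[masked_x]

def secure_apply(f, x: int) -> int:
    """Run the whole toy protocol in one process and reconstruct f(x)."""
    r_in, key0, key1 = gen_keys(f)
    masked_x = (x + r_in) % N          # the only input-dependent value the servers see
    y0 = eval_share(key0, masked_x)    # server 0's output share
    y1 = eval_share(key1, masked_x)    # server 1's output share
    return (y0 + y1) % N

if __name__ == "__main__":
    print(secure_apply(relu_signed, 5), secure_apply(relu_signed, 200))
```

In this toy version the key is as large as the function's domain, and every nonlinear operator needs its own key generation step. That is the kind of per-operator preprocessing cost the article says a unified compiler aims to reduce.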

The reported gains are concrete. The 1.24x-1.50x end-to-end speedup on models like BERT and GPT translates directly into lower latency for secure inference. Equally important, the 9-16% reduction in online communication and the 20-24% smaller cryptographic keys lower both bandwidth and storage requirements, making privacy-preserving LLM access more viable at scale.
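As a quick sanity check on what these figures mean in practice, the short Python sketch below converts a speedup factor into a relative latency reduction. It assumes the usual definition of speedup (baseline time divided by new time); the function name `latency_reduction` is illustrative, not taken from the paper.

```python
def latency_reduction(speedup: float) -> float:
    """Fraction of baseline latency saved, assuming speedup = t_baseline / t_new."""
    return 1.0 - 1.0 / speedup

if __name__ == "__main__":
    for s in (1.24, 1.50):
        print(f"{s:.2f}x speedup -> {latency_reduction(s):.1%} lower latency")
    # 1.24x speedup -> 19.4% lower latency
    # 1.50x speedup -> 33.3% lower latency
```

Under that assumption, the reported range corresponds to roughly one fifth to one third less end-to-end time per inference.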

This advancement matters for the emerging market of confidential computing services. Organizations building privacy-focused AI platforms (for healthcare data analysis, financial modeling, or sensitive enterprise applications) gain a more efficient technical foundation. The compiler's modular design, where operators are specified compactly rather than built individually, also enables faster iteration and deployment of new secure operations.

Looking forward, such efficiency gains become prerequisites for mainstream adoption of private AI inference. As LLMs grow larger and query volumes increase, the overhead of cryptographic security must approach imperceptibility. FuseFSS moves in that direction, though production deployment across cloud infrastructure will reveal whether these laboratory improvements translate to real-world adoption at commercial scale.

Key Takeaways

→ FuseFSS consolidates fragmented secure inference protocols into a single compiler, eliminating per-operator custom implementations.
→ Achieves a 1.24x-1.50x end-to-end speedup and reduces online communication by 9-16% on BERT and GPT-style models.
→ Cuts key generation time by 14-23% and produces 20-24% smaller cryptographic keys.
→ Enables privacy-preserving LLM queries through two-server secret sharing, without revealing prompts or embeddings.
→ Improves practical viability of confidential computing infrastructure for enterprise AI applications.
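The two-server secret sharing mentioned in the takeaways can be sketched in a few lines of Python. This is a simplified illustration, not FuseFSS's implementation: it assumes additive sharing over a 32-bit ring, integer inputs with no fixed-point encoding, and a public weight matrix, and the names `share`, `linear`, and `reconstruct` are hypothetical. It shows a client splitting an embedding into two random-looking shares, each server applying a linear layer to its own share, and the results recombining to the true output.

```python
import secrets

MOD = 2**32  # assumption: arithmetic over the ring Z_{2^32}

def share(vec):
    """Client: split each value into two additive shares that sum to it mod MOD."""
    s0 = [secrets.randbelow(MOD) for _ in vec]
    s1 = [(v - a) % MOD for v, a in zip(vec, s0)]
    return s0, s1

def linear(weights, vec_share):
    """Server: apply the (public, by assumption) weight matrix to its own share."""
    return [sum(w * v for w, v in zip(row, vec_share)) % MOD for row in weights]

def reconstruct(out0, out1):
    """Client: add the two output shares to recover the result."""
    return [(a + b) % MOD for a, b in zip(out0, out1)]

if __name__ == "__main__":
    embedding = [3, 1, 4]               # toy stand-in for a prompt embedding
    weights = [[2, 0, 1], [1, 1, 1]]    # toy 2x3 linear layer
    e0, e1 = share(embedding)           # each server receives only one of these
    print(reconstruct(linear(weights, e0), linear(weights, e1)))  # [10, 8]
```

Because the layer is linear, no interaction between the servers is needed for this step. Nonlinear operators do not have this property, which is why they need dedicated protocols such as FSS.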