y0news

#distributed-inference News & Analysis

4 articles tagged with #distributed-inference. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

StreamSplit: Continuous Audio Representation Learning via Uncertainty-Guided Adaptive Splitting

StreamSplit introduces a novel framework enabling continuous contrastive learning on edge devices by dynamically partitioning computation between local and cloud resources. Using reinforcement learning and uncertainty guidance, the system reduces latency by up to 4.7x and bandwidth by 77.1% while maintaining near-server accuracy, making distributed AI inference practical for resource-constrained hardware.

AI · Neutral · arXiv – CS AI · Mar 4 · 7/10 · 5

Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

Researchers introduce Federated Inference (FI), a new collaborative paradigm where independently trained AI models can work together at inference time without sharing data or model parameters. The study identifies key requirements including privacy preservation and performance gains, while highlighting system-level challenges that differ from traditional federated learning approaches.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Adaptive DNN Partitioning and Offloading in Heterogeneous Edge-Cloud Continuum

Researchers propose an adaptive framework for dynamically partitioning deep neural networks across edge-cloud infrastructure, addressing limitations of static approaches. Testing on real hardware demonstrates 27-35% energy reductions and 6-23% latency improvements compared to static baselines, validating the effectiveness of runtime-adaptive strategies for heterogeneous computing environments.

AI · Bullish · arXiv – CS AI · May 4 · 6/10

Space Network of Experts: Architecture and Expert Placement

Researchers present Space-XNet, a framework for efficiently deploying mixture-of-experts language models across satellite constellations using optimized expert placement strategies. The approach achieves a threefold latency reduction compared to conventional methods, addressing key challenges in executing energy-intensive AI workloads in space where computing and communication resources are severely constrained.