y0news
#distributed-computing · 5 articles
AI · Neutral · arXiv – CS AI · 6h ago

SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud

Researchers tested distributed AI inference across device, edge, and cloud tiers in a 5G network, finding that sub-second AI response times required for embodied AI are challenging to achieve. On-device execution took multiple seconds, while RAN-edge deployment with quantized models could meet 0.5-second deadlines, and cloud deployment achieved 100% success for 1-second deadlines.

AI · Bullish · arXiv – CS AI · 6h ago

Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

Researchers developed a data-driven pipeline to optimize GPU efficiency for distributed LLM adapter serving, achieving sub-5% throughput estimation error while running 90x faster than full benchmarking. The system uses a Digital Twin, machine learning models, and greedy placement algorithms to minimize GPU requirements while serving hundreds of adapters concurrently.
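The greedy placement idea can be sketched as a first-fit-decreasing bin-packing pass: place memory-hungry adapters first, reusing a GPU whenever it still has room, and only provisioning a new one when nothing fits. This is an illustrative sketch under assumed names and a simple per-GPU memory budget, not the paper's actual pipeline or API.

```python
def greedy_place(adapter_sizes, gpu_memory):
    """Assign each adapter to the first GPU with room (first-fit decreasing).

    adapter_sizes: dict mapping adapter id -> memory footprint (GB)
    gpu_memory: memory budget per GPU (GB)
    Returns a list of GPUs, each a list of adapter ids.
    """
    gpus = []  # each entry: [remaining_memory, [adapter ids]]
    # Place large adapters first; this usually tightens the packing.
    for name, size in sorted(adapter_sizes.items(), key=lambda kv: -kv[1]):
        for gpu in gpus:
            if gpu[0] >= size:
                gpu[0] -= size
                gpu[1].append(name)
                break
        else:  # no existing GPU had room: provision a new one
            gpus.append([gpu_memory - size, [name]])
    return [ids for _, ids in gpus]

# Five adapters totalling 20 GB fit exactly on two 10 GB GPUs:
placement = greedy_place({"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}, gpu_memory=10)
print(len(placement))  # → 2
```

In the paper's setting, the learned throughput model (rather than a raw memory figure) would supply the capacity check, but the greedy loop structure is the same.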

AI · Bullish · arXiv – CS AI · 6h ago

Semantic Parallelism: Redefining Efficient MoE Inference via Model-Data Co-Scheduling

Researchers propose Semantic Parallelism, a new framework called Sem-MoE that significantly improves efficiency of large language model inference by optimizing how AI models distribute computational tasks across multiple devices. The system reduces communication overhead between devices by 'collocating' frequently-used model components with their corresponding data, achieving superior throughput compared to existing solutions.
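The collocation idea can be illustrated with a toy router: send each request to the device that already hosts the expert it uses most, so those activations never cross the interconnect. All names and the counting scheme here are assumptions for illustration; this is not the Sem-MoE implementation.

```python
from collections import Counter

def collocate(request_expert_counts, expert_to_device):
    """Route each request to the device hosting its most-used expert.

    request_expert_counts: dict of request id -> {expert id: usage count}
    expert_to_device: dict of expert id -> device id
    Returns a dict of request id -> device id.
    """
    placement = {}
    for req, counts in request_expert_counts.items():
        top_expert = Counter(counts).most_common(1)[0][0]
        placement[req] = expert_to_device[top_expert]
    return placement

# r1 mostly hits expert e0 (on device 0), r2 only hits e1 (on device 1):
routing = collocate(
    {"r1": {"e0": 5, "e1": 1}, "r2": {"e1": 3}},
    {"e0": 0, "e1": 1},
)
print(routing)  # → {'r1': 0, 'r2': 1}
```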

AI · Bullish · arXiv – CS AI · 6h ago

Permutation-Invariant Representation Learning for Robust and Privacy-Preserving Feature Selection

Researchers have developed a new framework for privacy-preserving feature selection that uses permutation-invariant representation learning and federated learning techniques. The approach addresses data imbalance and privacy constraints in distributed scenarios while improving computational efficiency and downstream task performance.

AI · Neutral · arXiv – CS AI · 6h ago

FedVG: Gradient-Guided Aggregation for Enhanced Federated Learning

Researchers introduce FedVG, a new federated learning framework that uses gradient-guided aggregation and global validation sets to improve model performance in distributed training environments. The approach addresses client drift issues in heterogeneous data settings and can be integrated with existing federated learning algorithms.
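A minimal sketch of the gradient-guided aggregation idea, assuming a simple cosine-alignment weighting (the paper's exact FedVG rule may differ): weight each client's update by how well it aligns with a gradient computed on the global validation set, so drifted clients contribute less.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0

def aggregate(client_updates, validation_grad):
    """Average client updates, weighted by alignment with validation_grad.

    Clients whose updates point away from the validation gradient
    (negative cosine similarity) get zero weight.
    """
    weights = [max(0.0, cosine(u, validation_grad)) for u in client_updates]
    total = sum(weights) or 1.0
    dim = len(validation_grad)
    return [
        sum(w * u[i] for w, u in zip(weights, client_updates)) / total
        for i in range(dim)
    ]

# A fully drifted client (update opposite the validation gradient) is ignored:
agg = aggregate([[1.0, 0.0], [-1.0, 0.0]], validation_grad=[1.0, 0.0])
print(agg)  # → [1.0, 0.0]
```

Because the weighting only touches the server-side averaging step, a scheme like this composes with existing federated algorithms such as FedAvg, consistent with the summary's claim of easy integration.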