y0news

#parallel-computing News & Analysis

8 articles tagged with #parallel-computing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Why Inference in Large Models Becomes Decomposable After Training

Researchers have discovered that large AI models develop decomposable internal structures during training, with many parameter dependencies remaining statistically unchanged from initialization. They propose a post-training method to identify and remove unsupported dependencies, enabling parallel inference without modifying model functionality.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

veScale-FSDP: Flexible and High-Performance FSDP at Scale

Researchers introduce veScale-FSDP, a redesigned Fully Sharded Data Parallel system that overcomes limitations of current FSDP implementations used for training large-scale AI models. The new system features a flexible RaggedShard format and structure-aware planning, achieving 5-66% higher throughput and 16-30% lower memory usage while supporting advanced training methods and scaling to tens of thousands of GPUs.

AI · Neutral · arXiv – CS AI · 4d ago · 5/10

Multi-Column RBF Neural Network Using Adaptive and Non-Adaptive Particle Swarm Optimization

Researchers propose MC-PSO and MC-APSO, novel parallel neural network architectures that combine multi-column radial basis function networks with particle swarm optimization algorithms. These methods outperform existing approaches in accuracy, recall, and computational efficiency on benchmark datasets by distributing training across spatial subsets.

AI · Neutral · arXiv – CS AI · May 12 · 5/10

parHSOM: A novel parallel Hierarchical Self-Organizing Map implementation

Researchers have developed parHSOM, a parallel implementation of Hierarchical Self-Organizing Maps designed to accelerate training for cybersecurity intrusion detection systems. Testing across multiple datasets and configurations demonstrates faster training times without performance degradation compared to sequential HSOM approaches.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL

Researchers have developed NCCL EP, a new communication library for Mixture-of-Experts (MoE) AI model architectures that improves GPU-initiated communication performance. The library provides unified APIs supporting both low-latency inference and high-throughput training modes, built entirely on NVIDIA's NCCL Device API.

🏢 Nvidia
AI · Neutral · arXiv – CS AI · May 4 · 5/10

Adaptation of AI-accelerated CFD Simulations to the IPU platform

Researchers demonstrate successful adaptation of AI-accelerated computational fluid dynamics (CFD) simulations to Graphcore's IPU platform, achieving up to 34% speedup through optimized data pipeline management. The study shows strong scalability from 2 to 16 IPUs, increasing throughput from 560.8 to 2805.8 samples per second, validating IPUs as viable accelerators for AI-enhanced scientific computing workloads.
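As a quick check on the scaling figures quoted above, the reported throughputs can be turned into a speedup and a scaling efficiency. This is a minimal sketch that uses only the four numbers in the summary (2 and 16 IPUs, 560.8 and 2805.8 samples per second). The helper name `scaling_summary` is hypothetical, and defining efficiency as measured speedup divided by ideal linear speedup is an assumption made here, not necessarily the metric the study reports.

```python
def scaling_summary(n_small, thr_small, n_large, thr_large):
    """Return (speedup, ideal_speedup, efficiency) between two device counts.

    Assumption: efficiency is measured speedup divided by the ideal
    linear speedup (ratio of device counts).
    """
    speedup = thr_large / thr_small   # measured throughput ratio
    ideal = n_large / n_small         # ratio expected under perfect linear scaling
    efficiency = speedup / ideal
    return speedup, ideal, efficiency


# Figures taken from the summary: 2 -> 16 IPUs, 560.8 -> 2805.8 samples/s.
speedup, ideal, efficiency = scaling_summary(2, 560.8, 16, 2805.8)
print(f"speedup: {speedup:.2f}x of an ideal {ideal:.0f}x")
print(f"scaling efficiency: {efficiency:.1%}")
```

Under these assumptions, throughput grows about 5.0x for 8x as many IPUs, which is a scaling efficiency of roughly 62.5%.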

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 6

From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation

Researchers evaluated the ability of Large Language Models to generate task-based parallel code in three programming frameworks (OpenMP, C++ standard parallelism, and HPX) from different kinds of input prompts. The study found that LLM performance varies with problem complexity and framework, revealing both capabilities and limitations in high-performance computing applications.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning

Researchers propose Coupled Policy Optimization (CPO), a new reinforcement learning method that regulates policy diversity through KL constraints to improve exploration efficiency in large-scale parallel environments. The method outperforms existing baselines such as PPO and SAPG across multiple tasks, demonstrating that controlled, diverse exploration is key to stable and sample-efficient learning.