AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers propose a basis rotation framework to address gradient staleness in asynchronous pipeline parallelism, a technique used for distributed AI training. By aligning the optimizer's coordinate system with the Hessian eigenbasis, the method reduces training iterations by 81.7% compared to existing asynchronous baselines, enabling more efficient large-scale model training.
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠Researchers propose a gradient-based bilevel optimization method that automatically learns composite loss weights during pretraining by aligning gradients with downstream objectives. The approach reduces hyperparameter tuning overhead to ~30% above baseline training cost while matching or exceeding manually tuned baselines across event-sequence and computer vision tasks.
AI · Bearish · arXiv – CS AI · May 7 · 7/10
🧠Researchers demonstrate that audio language models can be jailbroken using sparse token optimization rather than dense waveform updates, with Token-Aware Gradient Optimization (TAGO) achieving comparable attack success rates while modifying only 25% of audio tokens. The findings reveal that gradient energy concentrates in specific audio regions, suggesting future AI safety research should account for this heterogeneous token-level structure.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers developed HAP (Heterogeneity-Aware Adaptive Pre-ranking), a new framework for recommender systems that addresses gradient conflicts in training by separating easy and hard samples. The system has been deployed in Toutiao's production environment for 9 months, achieving a 0.4% improvement in user engagement at no additional computational cost.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers developed new activation functions for deep neural networks based on polynomial and trigonometric orthonormal bases that can successfully train models like GPT-2 and ConvNeXt. The work addresses gradient problems common with polynomial activations and shows these networks can be interpreted as multivariate polynomial mappings.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers propose GradientStabilizer, a new technique to address training instability in deep learning by replacing gradient magnitude with statistically stabilized estimates while preserving direction. The method outperforms gradient clipping across multiple AI training scenarios including LLM pre-training, reinforcement learning, and computer vision tasks.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers introduce Uni-X, a novel architecture for unified multimodal AI models that addresses gradient conflicts between vision and text processing. The X-shaped design uses modality-specific processing at the input and output layers while sharing the middle layers, matching 7B-parameter models with only 3B parameters.
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers have developed a new 'untargeted jailbreak attack' (UJA) that can compromise AI safety systems in large language models with a success rate above 80% using only 100 optimization iterations. This gradient-based attack expands the search space by maximizing unsafety probability without fixed target responses, outperforming existing attacks by over 30%.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a Conflict-aware Penalty and Statistical Loss framework to address gradient norm conflicts in multimodal sentiment analysis, where dominant text modalities suppress weaker acoustic and visual streams. The approach achieves state-of-the-art results on CMU-MOSI benchmarks by balancing modality contributions and stabilizing training dynamics.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce SPAR (Support-Preserving Action Rectification), a new offline reinforcement learning method that addresses the fundamental tension between maximizing value and staying true to training data. By anchoring policy improvements to frozen behavior cloning and operating in residual space, SPAR achieves state-of-the-art results on D4RL benchmarks while maintaining data distribution fidelity.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers propose a simple technique for stabilizing PPO reinforcement learning training by randomly dropping 25% of transitions during rollouts. The method removes gradient redundancy caused by causally dependent state sequences, improving training consistency across multiple environments without algorithmic modifications.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose Pair-GRPO, a unified theoretical framework for LLM alignment that addresses instability and interpretability issues in reinforcement learning from human preferences. The method introduces Soft-Pair-GRPO and Hard-Pair-GRPO variants with proven gradient equivalence, monotonic policy improvement, and superior performance on standard benchmarks.
AI · Bullish · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose VIGOR, a verifier-free reinforcement learning method for large language models that eliminates dependency on gold labels or domain-specific verifiers by using gradient-norm measurements as intrinsic reward signals. The approach demonstrates measurable improvements over existing baselines on mathematical reasoning and exhibits cross-domain transfer to code tasks, addressing a major scalability constraint in current RL-based LLM training.
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10
🧠Researchers propose JPmHC (Jacobian-spectrum Preserving manifold-constrained Hyper-Connections), a new deep learning framework that improves upon existing Hyper-Connections by replacing identity skips with trainable linear mixers while controlling gradient conditioning. The framework addresses training instability and memory overhead issues in current deep learning architectures through constrained optimization on specific mathematical manifolds.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
🧠Researchers introduced GOME, an AI agent that uses gradient-based optimization instead of tree search for machine learning engineering tasks, achieving a 35.1% success rate on MLE-Bench. The study shows gradient-based approaches outperform tree search as AI reasoning capabilities improve, suggesting this method will become more effective as LLMs advance.
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10
🧠Researchers have developed a new quantum machine learning optimization technique using ternary encodings that significantly improves frequency tuning efficiency. The method achieves 22.8% better performance than existing approaches while requiring exponentially fewer encoding gates than traditional fixed-frequency methods.
AI · Bullish · arXiv – CS AI · Mar 2 · 5/10
🧠Researchers introduce FedDAG, a new clustered federated learning framework that improves AI model training across heterogeneous client environments. The system combines data and gradient similarity metrics for better client clustering and uses a dual-encoder architecture to enable knowledge sharing across clusters while maintaining specialization.