AI · Neutral · arXiv - CS AI · 4h ago
Researchers introduce MERaLiON2-Omni (Alpha), a 10B-parameter multilingual AI model designed for Southeast Asia that combines perception and reasoning capabilities. The study reveals an efficiency-stability paradox: reasoning improves performance on abstract tasks but destabilizes basic sensory processing such as audio timing and visual interpretation.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers introduce RF-Agent, a framework that uses large language models as agents to automatically design reward functions for control tasks through Monte Carlo Tree Search. The method improves on existing approaches by making better use of historical feedback and searching more efficiently, as evaluated on 17 diverse low-level control tasks.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers have introduced Hello-Chat, an end-to-end audio language model designed to create more realistic and emotionally resonant AI conversations. The model addresses the robotic delivery of existing large audio language models by training on real-life conversation data, and it reports breakthrough performance in prosodic naturalness and emotional alignment.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers propose FedRot-LoRA, a new framework that addresses rotational misalignment in federated learning for large language models. It applies orthogonal transformations to align client updates before aggregation, improving training stability and performance without increasing communication costs.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers developed TRIZ-RAGNER, a retrieval-augmented large language model framework that supports patent analysis and systematic innovation by extracting technical contradictions from patent documents. The system achieved an F1 score of 84.2%, outperforming existing methods by 7.3 percentage points through better integration of domain-specific knowledge.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers have developed MPU, a privacy-preserving framework that enables machine unlearning for large language models without requiring servers to share parameters or clients to share data. The framework uses perturbed model copies and harmonic denoising to achieve performance comparable to non-private methods, with most algorithms showing less than 1% performance degradation.
AI · Neutral · arXiv - CS AI · 4h ago
Research identifies sycophancy as a key alignment failure in large language models, in which AI systems favor user-affirming responses over critical engagement. The study demonstrates that converting user statements into questions before answering significantly reduces sycophantic behavior, offering a practical mitigation strategy for AI developers and users.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers developed Whisper-LLaDA, a diffusion-based large language model for automatic speech recognition that achieves a 12.3% relative improvement over baseline models. The study demonstrates that audio-conditioned embeddings are crucial to the accuracy gains, while plain-text processing without acoustic features fails to improve performance.
AI · Bearish · arXiv - CS AI · 4h ago
Researchers created CMT-Benchmark, a new dataset of 50 expert-level condensed matter theory problems for evaluating large language models' capabilities in advanced scientific research. The best-performing model (GPT5) solved only 30% of the problems, and the average across 17 models was just 11.4%, highlighting significant gaps in current AI's physical reasoning abilities.
AI · Neutral · arXiv - CS AI · 4h ago
Researchers analyzed how large language models express moral judgments when prompted to role-play different personas. The study found that Claude models are the most morally robust, while larger models within a family tend to be more susceptible to moral shifts under persona conditioning.
AI · Bullish · arXiv - CS AI · 4h ago
Researchers have developed R2GenCSR, a new AI framework for generating radiology reports that uses the Mamba architecture instead of Transformers to reduce computational complexity while maintaining performance. The system leverages context retrieval and large language models to produce high-quality medical reports from X-ray images.