y0news

#reasoning News & Analysis

169 articles tagged with #reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠

Constructing Synthetic Instruction Datasets for Improving Reasoning in Domain-Specific LLMs: A Case Study in the Japanese Financial Domain

Researchers developed a method for creating synthetic instruction datasets to improve domain-specific LLMs, demonstrated with a 9.5-billion-token Japanese financial dataset. The approach enhances both domain expertise and reasoning capabilities, and the models and datasets are being open-sourced for broader use.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 9
🧠

Surgical Post-Training: Cutting Errors, Keeping Knowledge

Researchers introduce Surgical Post-Training (SPoT), a new method to improve Large Language Model reasoning while preventing catastrophic forgetting. SPoT achieved 6.2% accuracy improvement on Qwen3-8B using only 4k data pairs and 28 minutes of training, offering a more efficient alternative to traditional post-training approaches.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠

Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

Researchers introduced GOME, an AI agent that uses gradient-based optimization instead of tree search for machine learning engineering tasks, achieving 35.1% success rate on MLE-Bench. The study shows gradient-based approaches outperform tree search as AI reasoning capabilities improve, suggesting this method will become more effective as LLMs advance.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 5
🧠

GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics

Researchers introduced GateLens, an LLM-based system that uses Relational Algebra as an intermediate layer to analyze complex tabular data more reliably than traditional approaches. The system demonstrated over 80% reduction in analysis time in automotive software analytics while maintaining high accuracy, outperforming existing Chain-of-Thought methods.
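The core idea is translating a natural-language question into relational-algebra operators before touching the table. A minimal sketch of that intermediate layer (the toy table, column names, and helper functions are illustrative assumptions, not the paper's implementation):

```python
# Toy "test results" table; a GateLens-style system would first map a question
# like "which modules have failing tests?" to relational-algebra operators.
rows = [
    {"test": "braking", "status": "FAIL", "module": "ABS"},
    {"test": "lane_keep", "status": "PASS", "module": "ADAS"},
    {"test": "parking", "status": "FAIL", "module": "ADAS"},
]

def select(rows, pred):
    """Relational selection (sigma): keep rows satisfying a predicate."""
    return [r for r in rows if pred(r)]

def project(rows, cols):
    """Relational projection (pi): keep only the named columns."""
    return [{c: r[c] for c in cols} for r in rows]

# "Which modules have failing tests?" -> pi_module(sigma_{status=FAIL}(rows))
failed_modules = project(select(rows, lambda r: r["status"] == "FAIL"), ["module"])
```

Because the operators are executed deterministically, the LLM only has to emit the query plan rather than reason over every cell, which is where the reliability gain over free-form Chain-of-Thought comes from.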

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠

RL for Reasoning by Adaptively Revealing Rationales

Researchers introduce AdaBack, a new reinforcement learning algorithm that uses partial supervision to help AI models learn complex reasoning tasks. The method dynamically adjusts the amount of guidance provided to each training sample, enabling models to solve mathematical reasoning problems that traditional supervised learning and reinforcement learning methods cannot handle.
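In spirit, the partial-supervision idea amounts to revealing a prefix of the gold rationale as a hint and adjusting how much is revealed per sample based on success. A toy sketch (the helper names, fixed step size, and update rule here are invented for illustration and are not AdaBack's actual algorithm):

```python
def reveal_prefix(rationale_tokens, ratio):
    """Reveal the first `ratio` fraction of the gold rationale as a hint."""
    k = int(len(rationale_tokens) * ratio)
    return rationale_tokens[:k]

def adapt_ratio(ratio, solved, step=0.1):
    """Shrink the hint when the sample is solved, grow it when it is not,
    clamped to [0, 1]."""
    ratio = ratio - step if solved else ratio + step
    return min(1.0, max(0.0, ratio))

# Toy loop over stubbed rollout outcomes for one training sample: the hint
# shrinks toward zero as the model starts solving the problem unaided.
rationale = "isolate x | square both sides | check roots".split(" | ")
ratio = 0.5
for solved in [True, True, False]:
    hint = reveal_prefix(rationale, ratio)
    ratio = adapt_ratio(ratio, solved)
```

The point of the per-sample schedule is to keep each problem at the edge of the model's ability, so RL always receives some reward signal even on tasks it could never solve from scratch.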

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

Quantile Advantage Estimation: Stabilizing RLVR for LLM Reasoning

Researchers propose Quantile Advantage Estimation (QAE) to stabilize Reinforcement Learning with Verifiable Rewards (RLVR) for large language model reasoning. The method replaces mean baselines with group-wise K-quantile baselines to prevent entropy collapse and explosion, showing sustained improvements on mathematical reasoning tasks.
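As a rough illustration of swapping the group-mean baseline for a quantile baseline (a toy NumPy sketch under assumed binary rewards, not the paper's implementation):

```python
import numpy as np

def quantile_advantages(rewards, q=0.5):
    """Advantage = reward minus a group-wise quantile baseline (q=0.5 is the
    median), in place of the usual group-mean baseline."""
    rewards = np.asarray(rewards, dtype=float)
    return rewards - np.quantile(rewards, q)

# A group of verifiable 0/1 rewards where most rollouts fail:
group = [1.0, 0.0, 0.0, 1.0, 0.0]
adv_mean = np.asarray(group) - np.mean(group)    # every failure gets -0.4
adv_median = quantile_advantages(group, q=0.5)   # failures get exactly 0
```

With a median baseline, the many identical failures contribute zero advantage instead of a uniform negative push, which is the kind of gradient-noise reduction that helps keep policy entropy from collapsing or exploding.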

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

Training Large Language Models To Reason In Parallel With Global Forking Tokens

Researchers developed Set Supervised Fine-Tuning (SSFT) and Global Forking Policy Optimization (GFPO) methods to improve large language model reasoning by enabling parallel processing through 'global forking tokens.' The techniques preserve diverse reasoning modes and demonstrate superior performance on math and code generation benchmarks compared to traditional fine-tuning approaches.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠

ChainMPQ: Interleaved Text-Image Reasoning Chains for Mitigating Relation Hallucinations

Researchers propose ChainMPQ, a training-free method to reduce relation hallucinations in Large Vision-Language Models (LVLMs) by using interleaved text-image reasoning chains. The approach addresses the most common but least studied type of AI hallucination by sequentially analyzing subjects, objects, and their relationships through multi-perspective questioning.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
🧠

WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning

Researchers introduce WavefrontDiffusion, a new dynamic decoding approach for Diffusion Language Models that improves text generation quality by expanding from finalized positions rather than using fixed blocks. The method achieves state-of-the-art performance on reasoning and code generation benchmarks while maintaining computational efficiency equivalent to existing block-based methods.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18
🧠

Reasoning-Driven Multimodal LLM for Domain Generalization

Researchers developed RD-MLDG, a new framework that uses multimodal large language models with reasoning chains to improve domain generalization in deep learning. The approach addresses challenges in cross-domain visual recognition by leveraging reasoning capabilities rather than just visual feature invariance, achieving state-of-the-art performance on standard benchmarks.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 18
🧠

Reason to Contrast: A Cascaded Multimodal Retrieval Framework

Researchers introduce TTE-v2, a new multimodal retrieval framework that achieves state-of-the-art performance by incorporating reasoning steps during retrieval and reranking. The approach demonstrates that scaling based on reasoning tokens rather than model size can significantly improve performance, with TTE-v2-7B reaching 75.7% accuracy on MMEB-V2 benchmark.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 13
🧠

LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning

Researchers propose an LLM-driven framework for generating multi-turn task-oriented dialogues to create more realistic reasoning benchmarks. The framework addresses limitations in current AI evaluation methods by producing synthetic datasets that better reflect real-world complexity and contextual coherence.

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 14
🧠

Task Complexity Matters: An Empirical Study of Reasoning in LLMs for Sentiment Analysis

A comprehensive study of 504 AI model configurations reveals that the benefit of reasoning in large language models is highly task-dependent: reasoning degrades performance on simple tasks like binary sentiment classification by up to 19.9 percentage points, while improving complex 27-class emotion recognition by up to 16.0 points. The research challenges the assumption that reasoning universally improves AI performance across language tasks.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 17
🧠

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Researchers introduce MITS (Mutual Information Tree Search), a new framework that improves reasoning capabilities in large language models using information-theoretic principles. The method uses pointwise mutual information for step-wise evaluation and achieves better performance while being more computationally efficient than existing tree search methods like Tree-of-Thought.
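The step-wise PMI score boils down to comparing how likely a candidate step is given the partial reasoning context versus unconditionally. A toy sketch with made-up probabilities (MITS derives these quantities from language-model likelihoods; the candidate steps and numbers below are invented):

```python
import math

def step_pmi(p_step_given_context, p_step):
    """Pointwise mutual information: log p(step | context) - log p(step).
    High PMI means the context specifically supports this step, not that the
    step is merely common in general."""
    return math.log(p_step_given_context) - math.log(p_step)

# Score two candidate next steps during tree search (probabilities invented):
candidates = {
    "factor the quadratic": step_pmi(0.30, 0.05),  # strongly cued by context
    "restate the problem": step_pmi(0.10, 0.09),   # barely above its prior
}
best_step = max(candidates, key=candidates.get)
```

Ranking by PMI rather than raw likelihood penalizes generically high-probability filler steps, which is how the framework gets a cheap, reward-model-free value signal for expanding the search tree.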

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 21
🧠

Reallocating Attention Across Layers to Reduce Multimodal Hallucination

Researchers propose a training-free solution to reduce hallucinations in multimodal AI models by rebalancing attention between perception and reasoning layers. The method achieves 4.2% improvement in reasoning accuracy with minimal computational overhead.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠

Latent Self-Consistency for Reliable Majority-Set Selection in Short- and Long-Answer Reasoning

Researchers introduce Latent Self-Consistency (LSC), a new method for improving Large Language Model output reliability across both short and long-form reasoning tasks. LSC uses learnable token embeddings to select semantically consistent responses with only 0.9% computational overhead, outperforming existing consistency methods like Self-Consistency and Universal Self-Consistency.
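Conceptually, the selection step resembles picking the sampled response whose summary embedding sits closest to the group centroid. A crude cosine-similarity sketch with toy vectors (LSC learns its token embeddings during training; nothing below is the paper's code):

```python
import numpy as np

def semantic_majority(embeddings):
    """Return the index of the response most similar (cosine) to the mean of
    all response embeddings -- a semantic analogue of majority voting that
    works even when long answers never match token-for-token."""
    E = np.asarray(embeddings, dtype=float)
    E /= np.linalg.norm(E, axis=1, keepdims=True)
    centroid = E.mean(axis=0)
    return int(np.argmax(E @ centroid))

# Three toy response embeddings: two agree semantically, one is an outlier.
responses = ["answer: 12", "thus the result is 12", "answer: 7"]
emb = [[1.0, 0.1], [0.9, 0.2], [-0.8, 0.9]]
chosen = responses[semantic_majority(emb)]
```

This is why the approach generalizes to long-form answers, where exact-match voting schemes like standard Self-Consistency break down.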

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠

MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

Researchers introduce MMKG-RDS, a framework that uses multimodal knowledge graphs to synthesize high-quality training data for improving AI model reasoning abilities. Testing on Qwen3 models showed 9.2% improvement in reasoning accuracy, with applications for complex benchmark construction involving tables and formulas.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠

PATRA: Pattern-Aware Alignment and Balanced Reasoning for Time Series Question Answering

Researchers have developed PATRA, a new AI model that improves time series question answering by better understanding patterns like trends and seasonality. The model addresses limitations in existing LLM approaches that treat time series data as simple text or images, introducing pattern-aware mechanisms and balanced learning across tasks of varying difficulty.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Researchers propose RL-aware distillation (RLAD), a new method to efficiently transfer knowledge from large language models to smaller ones during reinforcement learning training. The approach uses Trust Region Ratio Distillation (TRRD) to selectively guide student models only when it improves policy updates, outperforming existing distillation methods across reasoning benchmarks.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠

Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching

Researchers developed a new framework called 'Stitching Noisy Diffusion Thoughts' that improves AI reasoning by combining the best parts of multiple solution attempts rather than just selecting complete answers. The method achieves up to 23.8% accuracy improvement on math and coding tasks while reducing computation time by 1.8x compared to existing approaches.
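The stitching idea can be caricatured as picking, at each segment position, the candidate segment a reward model scores highest (the segments, scores, and greedy slot-wise rule below are invented for illustration; the actual method stitches noisy intermediate diffusion states, not finished text):

```python
def stitch(attempts, score):
    """Greedy slot-wise stitching: attempts are equal-length lists of
    reasoning segments; keep the highest-scoring segment at each slot."""
    n = min(len(a) for a in attempts)
    return [max((a[i] for a in attempts), key=score) for i in range(n)]

attempts = [
    ["setup A", "algebra A", "answer A"],
    ["setup B", "algebra B", "answer B"],
]
scores = {"setup A": 0.2, "setup B": 0.9,
          "algebra A": 0.8, "algebra B": 0.1,
          "answer A": 0.7, "answer B": 0.6}
stitched = stitch(attempts, scores.get)   # mixes the best parts of both runs
```

Recombining partial attempts like this can beat best-of-N selection whenever no single attempt is strong everywhere, which is the intuition behind the reported accuracy and compute gains.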

AI · Bullish · OpenAI News · Aug 5 · 6/10 · 6
🧠

Introducing gpt-oss

OpenAI has released gpt-oss-120b and gpt-oss-20b, two open-weight language models under the Apache 2.0 license that deliver strong performance at low cost. The models excel at reasoning tasks and tool use while being optimized for efficient deployment on consumer hardware.

AI · Bullish · OpenAI News · Aug 5 · 6/10 · 4
🧠

gpt-oss-120b & gpt-oss-20b Model Card

Two new open-weight reasoning models, gpt-oss-120b and gpt-oss-20b, have been released under the Apache 2.0 license. These models are available for use under a specific gpt-oss usage policy.

AI · Bullish · Hugging Face Blog · Jul 8 · 6/10 · 5
🧠

SmolLM3: smol, multilingual, long-context reasoner

SmolLM3 is a compact language model that combines multilingual capabilities with long-context reasoning. The model is designed for efficiency while maintaining strong performance across multiple languages and complex reasoning tasks.

AI · Bullish · Google DeepMind Blog · May 20 · 6/10 · 2
🧠

Gemini 2.5: Our most intelligent models are getting even better

Google announces updates to its Gemini AI models, with Gemini 2.5 Pro maintaining its position as the preferred coding model for developers and 2.5 Flash receiving improvements. The company introduces Deep Think, an experimental enhanced reasoning mode for the 2.5 Pro model.

AI · Bullish · OpenAI News · Feb 2 · 6/10 · 5
🧠

Introducing deep research

OpenAI has launched deep research, an agent that can synthesize large amounts of online information and complete complex multi-step research tasks through advanced reasoning. The tool is currently available to Pro users, with rollout planned for Plus and Team subscribers.