y0news

#arxiv News & Analysis

Content tagged #arxiv covers preprint research from the arXiv repository, primarily in computer science and artificial intelligence. Over the past 30 days, six articles have been indexed, with recent discussions centering on large language models including GPT-4 and Llama. Sentiment across these articles is entirely neutral, and the bullish share is down 58.6 percentage points compared with the prior 90 days. The tag frequently overlaps with #machine-learning, #research, and #ai-research. Blockchain and cryptocurrency tickers such as NEAR, LINK, and COMP have also appeared alongside #arxiv content in recent coverage. Browse the articles below to see what is currently being discussed in academic AI research.

sentiment · last 30d (6 articles) · -58.6pp bullish vs prior 90d
Top sources: arXiv – CS AI · 406
Most-discussed entities: GPT-4 · 6, Llama · 4, Hugging Face · 1, Claude · 1, Nvidia · 1
427 articles
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Conversational Control with Ontologies for Large Language Models: A Lightweight Framework for Constrained Generation

Researchers developed a lightweight framework that uses ontological definitions to provide modular and explainable control over Large Language Model outputs in conversational systems. The method fine-tunes LLMs to generate content according to specific constraints like English proficiency level and content polarity, consistently outperforming pre-trained baselines across seven state-of-the-art models.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Neuromorphic Computing for Low-Power Artificial Intelligence

Researchers outline how neuromorphic computing could overcome energy efficiency limits in classical CMOS technology for AI applications. The approach requires co-design across materials, circuits, and algorithms to achieve brain-inspired compute-in-memory architectures.

AI · Bearish · arXiv – CS AI · Apr 7 · 6/10

Plausibility as Commonsense Reasoning: Humans Succeed, Large Language Models Do not

A new study reveals that large language models fail to integrate world knowledge with syntactic structure for ambiguity resolution in the same way humans do. Researchers tested Turkish language models on relative-clause attachment ambiguities and found that while humans reliably use plausibility to guide interpretation, LLMs show weak, unstable, or reversed responses to the same plausibility cues.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Selective Forgetting for Large Reasoning Models

Researchers propose a new framework for 'selective forgetting' in Large Reasoning Models (LRMs) that can remove sensitive information from AI training data while preserving general reasoning capabilities. The method uses retrieval-augmented generation to identify and replace problematic reasoning segments with benign placeholders, addressing privacy and copyright concerns in AI systems.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

Researchers introduce Profile-Then-Reason (PTR), a new framework for AI language agents that use external tools, which reduces computational overhead by pre-planning workflows rather than recomputing after each step. The approach caps language model calls at two or three per task and outperforms reactive execution methods in 16 of 24 test configurations.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

Researchers introduce InferenceEvolve, an AI framework using large language models to automatically discover and refine causal inference methods. The system outperformed 58 human submissions in a recent competition and demonstrates how AI can optimize complex scientific programs through evolutionary approaches.

AI · Neutral · arXiv – CS AI · Apr 7 · 6/10

Implementing surrogate goals for safer bargaining in LLM-based agents

Researchers developed methods to implement 'surrogate goals' in LLM-based agents to reduce bargaining risks by deflecting threats away from what principals care about. The study tested four approaches spanning prompting, fine-tuning, and scaffolding, and found that the scaffolding and fine-tuning methods outperformed simple prompting at producing the desired threat-response behaviors.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

Researchers developed a method to identify valence-arousal subspaces in large language models, enabling controlled emotional steering of AI outputs. The technique demonstrates cross-architecture effectiveness on multiple models and reveals that emotional control can bidirectionally influence AI behaviors like refusal and sycophancy.

Entities: Llama

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

Researchers developed new compression techniques for LLM-generated text, built on domain-adapted LoRA adapters and an interactive 'Question-Asking' (QA) protocol. The QA method uses binary questions to transfer knowledge between small and large models, reaching compression ratios of 0.0006-0.004 while recovering 23-72% of capability gaps.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

Hierarchical, Interpretable, Label-Free Concept Bottleneck Model

Researchers have developed HIL-CBM, a new hierarchical interpretable AI model that enhances explainability by mimicking human cognitive processes across multiple semantic levels. The model outperforms existing Concept Bottleneck Models in classification accuracy while providing more interpretable explanations without requiring manual concept annotations.

AI · Neutral · arXiv – CS AI · Apr 6 · 6/10

Random Is Hard to Beat: Active Selection in online DPO with Modern LLMs

A preprint posted to arXiv shows that Active Preference Learning (APL) provides minimal improvements over random sampling when training modern LLMs with Direct Preference Optimization. The study found that random sampling performs nearly as well as sophisticated active selection methods while being computationally cheaper and avoiding capability degradation.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in Videos

Researchers propose a fully end-to-end training paradigm for temporal sentence grounding in videos, introducing the Sentence Conditioned Adapter (SCADA) to better align video understanding with natural language queries. The method outperforms existing approaches by jointly optimizing video backbones and localization components rather than using frozen pre-trained encoders.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

Unified Thinker: A General Reasoning Modular Core for Image Generation

Researchers introduce Unified Thinker, a new AI architecture that improves image generation by separating reasoning from visual generation. The modular system addresses the gap between closed-source models like Nano Banana and open-source alternatives by enabling better instruction following through executable reasoning and reinforcement learning.

AI · Bullish · arXiv – CS AI · Mar 27 · 6/10

CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers

CodeRefine is a new AI framework that automatically converts research paper methodologies into functional code using Large Language Models. The system creates knowledge graphs from papers and uses retrieval-augmented generation to produce more accurate code implementations than traditional zero-shot prompting methods.

AI · Bullish · arXiv – CS AI · Mar 27 · 6/10

R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoning

Researchers introduce R-C2, a reinforcement learning framework that improves multimodal AI reasoning by enforcing consistency between visual and textual representations. The system uses cycle-consistent training to resolve internal conflicts between modalities, achieving improvements of up to 7.6 points in reasoning accuracy without requiring additional labeled data.

AI · Bullish · arXiv – CS AI · Mar 27 · 6/10

X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs

Researchers propose X-OPD, a Cross-Modal On-Policy Distillation framework to improve Speech Large Language Models by aligning them with text-based counterparts. The method uses token-level feedback from teacher models to bridge performance gaps in end-to-end speech systems while preserving inherent capabilities.

AI · Bearish · arXiv – CS AI · Mar 27 · 6/10

Probing the Lack of Stable Internal Beliefs in LLMs

Research reveals that large language models (LLMs) struggle to maintain consistent internal beliefs or goals across multi-turn conversations, and fail to stay consistent when the relevant context is implicit rather than explicitly provided. This limitation poses significant challenges for developing persona-driven AI systems that require stable personality traits and behavioral patterns.

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

Qworld: Question-Specific Evaluation Criteria for LLMs

Researchers introduce Qworld, a new method for evaluating large language models that generates question-specific criteria using recursive expansion trees instead of static rubrics. The approach covers 89% of expert-authored criteria and reveals capability differences across 11 frontier LLMs that traditional evaluation methods miss.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

Researchers developed a scalable multi-turn synthetic data generation pipeline using reinforcement learning to improve large language models' code generation capabilities. The approach uses teacher models to create structured difficulty progressions and curriculum-based training, showing consistent improvements in code generation across Llama3.1-8B and Qwen models.

Entities: Llama

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

A research study on retrieval-augmented generation (RAG) systems for AI policy analysis found that improving retrieval quality doesn't necessarily lead to better question-answering performance. The research used 947 AI policy documents and discovered that stronger retrieval can paradoxically cause more confident hallucinations when relevant information is missing.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries

Researchers propose Future Summary Prediction (FSP), a new pretraining method for large language models that predicts compact representations of long-term future text sequences. FSP outperforms traditional next-token prediction and multi-token prediction methods in math, reasoning, and coding benchmarks when tested on 3B and 8B parameter models.

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

From Sycophancy to Sensemaking: Premise Governance for Human-AI Decision Making

Researchers propose a new framework for human-AI decision making that shifts from AI systems providing fluent but potentially sycophantic answers to collaborative premise governance. The approach uses discrepancy-driven control loops to detect conflicts and ensure commitment to decision-critical premises before taking action.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

DOVA: Deliberation-First Multi-Agent Orchestration for Autonomous Research Automation

Researchers introduce DOVA (Deep Orchestrated Versatile Agent), a multi-agent AI platform that improves research automation through deliberation-first orchestration and hybrid collaborative reasoning. The system reduces inference costs by 40-60% on simple tasks while maintaining deep reasoning capabilities for complex research requiring multi-source synthesis.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

The AI Fiction Paradox

A new research paper identifies the 'AI Fiction Paradox': AI models depend heavily on fiction as training data but struggle to generate quality fiction themselves. The paper outlines three core challenges: narrative causation requiring temporal paradoxes, informational revaluation that conflicts with current attention mechanisms, and multi-scale emotional architecture that current AI cannot orchestrate effectively.
