y0news

#natural-language-processing News & Analysis

88 articles tagged with #natural-language-processing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

88 articles
AI · Neutral · arXiv – CS AI · 2d ago · 7/10

Can Large Language Models Infer Causal Relationships from Real-World Text?

Researchers developed the first real-world benchmark for evaluating whether large language models can infer causal relationships from complex academic texts. The study reveals that LLMs struggle significantly with this task, with the best model achieving an F1 score of only 0.535, highlighting a critical gap in the reasoning capabilities needed for AGI advancement.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary

Researchers introduce Humanoid-LLA, a Large Language Action Model enabling humanoid robots to execute complex physical tasks from natural language commands. The system combines a unified motion vocabulary, physics-aware controller, and reinforcement learning to achieve both language understanding and real-world robot control, demonstrating improved performance on Unitree G1 and Booster T1 humanoids.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Aligning Language Models from User Interactions

Researchers developed a new method for training AI language models using multi-turn user conversations through self-distillation, leveraging follow-up messages to improve model alignment. Testing on real-world WildChat conversations showed improvements in alignment and instruction-following benchmarks while enabling personalization without explicit feedback.

AI · Bullish · arXiv – CS AI · Mar 12 · 7/10

ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping

Researchers developed ES-dLLM, a training-free inference acceleration framework that speeds up diffusion large language models by selectively skipping tokens in early layers based on importance scoring. The method achieves 5.6x to 16.8x speedup over vanilla implementations while maintaining generation quality, offering a promising alternative to autoregressive models.

๐Ÿข Nvidia
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10

SpecFuse: Ensembling Large Language Models via Next-Segment Prediction

Researchers introduce SpecFuse, a training-free framework for ensembling large language models that dynamically adjusts each model's contribution based on real-time performance. The system uses speculative decoding principles and online feedback mechanisms to improve collaboration between different LLMs, showing consistent performance improvements across multiple benchmark datasets.
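The core ensembling idea — blending each model's proposal with weights an online feedback loop would adjust — can be sketched in a few lines. Everything here is a hypothetical toy (distributions, weights, function name), not the paper's method:

```python
import numpy as np

def ensemble_next_token(model_probs, weights):
    """Toy sketch: weighted mixture of next-token distributions.

    Each LLM proposes a distribution over the vocabulary; the ensemble
    blends them with per-model weights. In an online scheme, the weights
    would be raised or lowered as each model's segments are accepted
    or rejected downstream.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalize contributions
    return sum(wi * np.asarray(p) for wi, p in zip(w, model_probs))

probs_a = [0.7, 0.2, 0.1]                        # model A's next-token distribution
probs_b = [0.1, 0.8, 0.1]                        # model B disagrees
mixed = ensemble_next_token([probs_a, probs_b], weights=[3.0, 1.0])
print(mixed.round(3).tolist())                   # [0.55, 0.35, 0.1]
```

With a 3:1 weighting, model A's preferred token still wins, but model B's disagreement visibly shifts the mixture — the kind of contribution adjustment the framework automates.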

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

Researchers introduce TATRA, a training-free prompting method for Large Language Models that creates instance-specific few-shot prompts without requiring labeled training data. The method achieves state-of-the-art performance on mathematical reasoning benchmarks like GSM8K and DeepMath, matching or outperforming existing prompt optimization methods that rely on expensive training processes.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Researchers have developed LeanTutor, a proof-of-concept AI system that combines Large Language Models with theorem provers to create a mathematically verified proof tutor. The system features three modules for autoformalization, proof-checking, and natural language feedback, evaluated using PeanoBench, a new dataset of 371 Peano Arithmetic proofs.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference
Researchers developed NRR-Phi, a framework that prevents large language models from prematurely committing to single interpretations of ambiguous text. The system maintains multiple valid interpretations in a non-collapsing state space, achieving 1.087 bits of mean entropy compared to zero for traditional collapse-based models.
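The entropy figure quoted above is just Shannon entropy over candidate interpretations. A minimal illustration (the interpretation weights below are made up, not from the paper):

```python
import math

def interpretation_entropy(probs):
    """Shannon entropy in bits of a distribution over candidate readings.

    A collapse-based decoder commits all probability mass to one
    interpretation (entropy 0); a non-collapsing state keeps several
    readings alive, yielding positive entropy.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

collapsed = [1.0, 0.0, 0.0]       # single committed interpretation
preserved = [0.5, 0.3, 0.2]       # hypothetical ambiguous parse weights

print(interpretation_entropy(collapsed))              # 0.0
print(round(interpretation_entropy(preserved), 3))    # 1.485
```

The reported 1.087 bits of mean entropy versus zero is exactly this contrast, averaged over ambiguous inputs: the model keeps roughly two-plus interpretations live instead of collapsing to one.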

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

Researchers propose SemKey, a novel framework that addresses key limitations in EEG-to-text decoding by preventing hallucinations and improving semantic fidelity through decoupled guidance objectives. The system redesigns neural encoder-LLM interaction and introduces new evaluation metrics beyond BLEU scores to achieve state-of-the-art performance in brain-computer interfaces.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Researchers have developed AriadneMem, a new memory system for long-horizon LLM agents that addresses challenges in maintaining accurate memory under fixed context budgets. The system uses a two-phase pipeline with entropy-aware gating and conflict-aware coarsening to improve multi-hop reasoning while reducing runtime by 77.8% and using only 497 context tokens.

🧠 GPT-4
AI · Neutral · arXiv – CS AI · Mar 4 · 6/10

Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

Research analyzing 8,618 expert annotations reveals that n-gram novelty, commonly used to evaluate AI text generation, is insufficient for measuring textual creativity. While positively correlated with creativity, 91% of high n-gram novel expressions were not judged as creative by experts, and higher novelty in open-source LLMs correlates with lower pragmatic quality.
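The critiqued metric itself is simple to state: the fraction of a text's word n-grams absent from a reference corpus. A toy version (corpus and sentences are invented for illustration) makes the paper's point concrete — unseen n-grams are cheap to produce and say nothing about creativity:

```python
def ngram_novelty(text, corpus, n=3):
    """Toy n-gram novelty: fraction of the text's word n-grams
    that never appear in the reference corpus."""
    grams = lambda words: {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    text_grams = grams(text.split())
    if not text_grams:
        return 0.0
    return len(text_grams - grams(corpus.split())) / len(text_grams)

corpus = "the cat sat on the mat"
print(ngram_novelty("the cat sat on the mat", corpus))      # 0.0 (fully copied)
print(ngram_novelty("the purple cat sat quietly", corpus))  # 1.0 (fully "novel")
```

The second sentence scores maximal novelty merely by being unseen, which is the gap the study documents: 91% of such high-novelty expressions were not judged creative by experts.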

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

Talking with Verifiers: Automatic Specification Generation for Neural Network Verification

Researchers have developed a framework that allows neural network verification tools to accept natural language specifications instead of low-level technical constraints. The system automatically translates human-readable requirements into formal verification queries, significantly expanding the practical applicability of neural network verification across diverse domains.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

Researchers developed a new training-free decoding strategy for Large Vision-Language Models that reduces hallucinations by using query-adaptive visual augmentation and entropy-based token selection. The method showed significant improvements in factual consistency across four LVLMs and seven benchmarks compared to existing approaches.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think

Researchers propose CoDAR, a new continuous diffusion language model framework that addresses key bottlenecks in token rounding through a two-stage approach combining continuous diffusion with an autoregressive decoder. The model demonstrates substantial improvements in generation quality over existing latent diffusion methods and becomes competitive with discrete diffusion language models.

AI · Bullish · Crypto Briefing · Mar 3 · 7/10

OpenAI releases GPT-5.3 Instant with fewer refusals and improved web answers

OpenAI has released GPT-5.3 Instant for ChatGPT, featuring reduced refusals, enhanced web-based answers, and fewer hallucinations across major performance benchmarks. This update represents a significant improvement in AI model reliability and user experience.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models

Researchers introduce Uni-X, a novel architecture for unified multimodal AI models that addresses gradient conflicts between vision and text processing. The X-shaped design uses modality-specific processing at input/output layers while sharing middle layers, achieving superior efficiency and matching 7B parameter models with only 3B parameters.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs

Researchers introduce SwiReasoning, a training-free framework that improves large language model reasoning by dynamically switching between explicit chain-of-thought and latent reasoning modes. The method achieves 1.8%-3.1% accuracy improvements and 57%-79% better token efficiency across mathematics, STEM, coding, and general benchmarks.

AI · Bullish · OpenAI News · Sep 4 · 7/10

Learning to summarize with human feedback

Researchers have successfully applied reinforcement learning from human feedback (RLHF) to improve language model summarization capabilities. This approach uses human preferences to guide the training process, resulting in models that produce higher quality summaries aligned with human expectations.
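At the heart of RLHF is a reward model trained on human preference pairs with a Bradley–Terry style loss. A minimal sketch with made-up reward values (the policy step that follows, e.g. PPO against the learned reward, is not shown):

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Toy reward-model objective: -log sigmoid(r_chosen - r_rejected).

    r_chosen / r_rejected are scalar reward-model scores for the summary
    a human preferred and the one they rejected; the loss is small when
    the model ranks the pair the same way the human did.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(round(preference_loss(2.0, 0.0), 4))  # 0.1269 — ranking agrees with the human
print(round(preference_loss(0.0, 2.0), 4))  # 2.1269 — ranking contradicts the human
```

Minimizing this loss over many labeled pairs yields a reward signal that, unlike a fixed metric, reflects which summaries humans actually prefer.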

AI · Neutral · arXiv – CS AI · 1d ago · 6/10

LLM as Attention-Informed NTM and Topic Modeling as long-input Generation: Interpretability and long-Context Capability

Researchers propose a novel framework treating Large Language Models as attention-informed Neural Topic Models, enabling interpretable topic extraction from documents. The approach combines white-box interpretability analysis with black-box long-context LLM capabilities, demonstrating competitive performance on topic modeling tasks while maintaining semantic clarity.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

VeriTrans: Fine-Tuned LLM-Assisted NL-to-PL Translation via a Deterministic Neuro-Symbolic Pipeline

VeriTrans is a machine learning system that converts natural language requirements into formal logic suitable for automated solvers, using a validator-gated pipeline to ensure reliability. Achieving 94.46% correctness on 2,100 specifications, the system combines fine-tuned language models with round-trip verification and deterministic execution, enabling auditable translation for critical applications.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Model Space Reasoning as Search in Feedback Space for Planning Domain Generation

Researchers present a novel approach using agentic language model feedback frameworks to generate planning domains from natural language descriptions augmented with symbolic information. The method employs heuristic search over model space optimized by various feedback mechanisms, including landmarks and plan validator outputs, to improve domain quality for practical deployment.

AI · Bullish · arXiv – CS AI · 6d ago · 6/10

Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge

Researchers demonstrate that Large Language Models used as judges suffer from score range bias, where evaluation outputs are highly sensitive to predefined scoring scales. Using contrastive decoding techniques, they achieve up to 11.7% improvement in alignment with human judgments across different score ranges.
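Contrastive decoding for this setting amounts to subtracting the judge's context-free logits over the score tokens from its logits given the answer to evaluate, cancelling the model's baseline preference for certain score ranges. A toy sketch — the logit values, weighting, and function name are all hypothetical, not from the paper:

```python
import numpy as np

def contrastive_scores(with_context, context_free, alpha=1.0):
    """Toy contrastive decoding over candidate score tokens ("1".."5").

    Subtracting the context-free logits removes the judge's built-in
    score-range bias before normalizing with a softmax.
    """
    contrast = with_context - alpha * context_free
    exp = np.exp(contrast - contrast.max())      # numerically stable softmax
    return exp / exp.sum()

with_ctx = np.array([1.0, 2.0, 4.0, 3.0, 0.5])  # logits given the answer to judge
no_ctx   = np.array([0.5, 0.5, 3.0, 1.0, 0.5])  # logits with no answer: range bias

probs = contrastive_scores(with_ctx, no_ctx)
print(int(probs.argmax()) + 1)                   # 4
```

Here the raw judge would output score 3 (its biased favorite range), while the debiased distribution shifts to score 4 — the kind of correction behind the reported 11.7% alignment gain.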

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Scaling DPPs for RAG: Density Meets Diversity

Researchers propose ScalDPP, a new retrieval mechanism for RAG systems that uses Determinantal Point Processes to optimize both density and diversity in context selection. The approach addresses limitations in current RAG pipelines that ignore interactions between retrieved information chunks, leading to redundant contexts that reduce effectiveness.
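A standard way to combine density (relevance) and diversity in a DPP is the kernel L = diag(q) S diag(q), where q holds relevance scores and S pairwise similarities; a subset scores det(L_subset), which penalizes redundant picks. A toy greedy MAP selection under invented scores (not the paper's algorithm or data):

```python
import numpy as np

def greedy_dpp(relevance, similarity, k):
    """Toy greedy MAP selection under a quality-diversity DPP kernel.

    L = diag(q) @ S @ diag(q); det(L_subset) rewards relevant chunks
    that are mutually dissimilar. Each step picks the chunk giving the
    largest log-determinant.
    """
    q = np.asarray(relevance, dtype=float)
    L = np.outer(q, q) * np.asarray(similarity)
    selected = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(len(q)):
            if i in selected:
                continue
            idx = selected + [i]
            gain = np.linalg.slogdet(L[np.ix_(idx, idx)])[1]
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
    return selected

# three near-duplicate relevant chunks plus one distinct, less relevant one
rel = [1.0, 0.95, 0.9, 0.6]
sim = np.array([[1.0, 0.98, 0.97, 0.1],
                [0.98, 1.0, 0.96, 0.1],
                [0.97, 0.96, 1.0, 0.1],
                [0.1, 0.1, 0.1, 1.0]])
print(greedy_dpp(rel, sim, k=2))   # [0, 3]
```

A pure relevance ranker would return the two near-duplicates (chunks 0 and 1); the DPP instead pairs the top chunk with the distinct one, which is exactly the redundancy problem the summary describes.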

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in Videos

Researchers propose a fully end-to-end training paradigm for temporal sentence grounding in videos, introducing the Sentence Conditioned Adapter (SCADA) to better align video understanding with natural language queries. The method outperforms existing approaches by jointly optimizing video backbones and localization components rather than using frozen pre-trained encoders.

Page 1 of 4