y0news

#natural-language News & Analysis

34 articles tagged with #natural-language. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Apr 14 · 7/10

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Researchers introduce HAERAE-Vision, a benchmark of 653 real-world underspecified visual questions from Korean online communities, revealing that state-of-the-art vision-language models achieve under 50% accuracy on natural queries despite performing well on structured benchmarks. The study demonstrates that query clarification alone improves performance by 8-22 points, highlighting a critical gap between current evaluation standards and real-world deployment requirements.

GPT-5 · Gemini
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10

WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

Researchers introduced WebTestBench, a new benchmark for evaluating automated web testing using AI agents and large language models. The study reveals significant gaps between current AI capabilities and industrial deployment needs, with LLMs struggling with test completeness, defect detection, and long-term interaction reliability.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

IROSA: Interactive Robot Skill Adaptation using Natural Language

Researchers present IROSA, a framework combining foundation models with imitation learning for robot skill adaptation using natural language commands. The system uses a tool-based architecture that maintains safety by creating an abstraction layer between language models and robot hardware, demonstrated on industrial bearing ring insertion tasks.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

Researchers introduce LMUnit, a new evaluation framework for language models that uses natural language unit tests to assess AI behavior more precisely than current methods. The system breaks down response quality into explicit, testable criteria and achieves state-of-the-art performance on evaluation benchmarks while improving inter-annotator agreement.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4

GLEE: A Unified Framework and Benchmark for Language-based Economic Environments

Researchers introduce GLEE, a new framework for studying how Large Language Models behave in economic games and strategic interactions. The study reveals that LLM performance in economic scenarios depends heavily on market parameters and model selection, with complex interdependent effects on outcomes.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

Beyond Single-Modal Analytics: A Framework for Integrating Heterogeneous LLM-Based Query Systems for Multi-Modal Data

Researchers introduce Meta Engine, a unified semantic query system that integrates multiple specialized LLM-based query systems to handle multi-modal data analysis. The system addresses fragmentation in current semantic query tools by combining specialized systems through five key components, achieving 3-24x better performance than existing baselines.

AI · Bullish · OpenAI News · Oct 23 · 7/10 · 6

OpenAI acquires Software Applications Incorporated, maker of Sky

OpenAI has acquired Software Applications Incorporated, the company behind Sky, a natural language AI interface for Mac desktop environments. The acquisition aims to integrate Sky's macOS capabilities into ChatGPT to enhance AI user experience with more intuitive and contextual interactions.

$MKR
AI · Bullish · OpenAI News · Aug 10 · 7/10 · 5

OpenAI Codex

OpenAI has released an improved version of Codex, its AI system that converts natural language into code. The enhanced system is now available through the OpenAI API in private beta, marking a significant advancement in AI-powered programming tools.

AI · Bullish · OpenAI News · Jan 5 · 7/10 · 7

DALL·E: Creating images from text

OpenAI has developed DALL·E, a neural network that generates images from text descriptions. This AI system can create visual content for a wide range of concepts that can be expressed in natural language.

AI · Bullish · OpenAI News · Jan 5 · 7/10 · 5

CLIP: Connecting text and images

OpenAI introduces CLIP, a neural network that learns visual concepts from natural language supervision and can perform visual classification tasks without task-specific training. CLIP demonstrates zero-shot capabilities similar to those of GPT-2 and GPT-3: it can recognize visual categories when given only their names.

AI · Bullish · Google AI Blog · May 19 · 6/10

How AI Mode is changing the way people search in the U.S.

One year after launch, AI Mode has shifted user behavior from keyword-based searches to natural language queries, representing a fundamental change in how Americans interact with search technology. This transition demonstrates growing adoption of conversational AI interfaces and user comfort with more human-like search interactions.

AI · Bullish · AI News · May 12 · 6/10

Laserfiche unveils AI agents for natural language workflows

Laserfiche has released AI agents capable of executing tasks through natural language prompts while maintaining integrated security protocols and compliance requirements. The announcement reflects a broader shift toward autonomous AI assistants in enterprise content management systems that can operate within predefined security boundaries.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Effective Explanations Support Planning Under Uncertainty

Researchers propose a computational model that evaluates explanations by converting them into executable action plans through large language models and planning agents. Across four experiments with 1,200 explanations, higher-scored explanations correlate with improved navigation performance and user helpfulness judgments, demonstrating that explanation quality can be measured by practical outcomes under uncertainty.

AI × Crypto · Bullish · Decrypt · May 11 · 6/10

MoonPay Acquires Dawn Labs, Launches AI Trading Copilot for Prediction Markets

MoonPay has acquired Dawn Labs and launched an AI trading copilot that converts natural language prompts into automated cryptocurrency trading strategies for prediction markets. This integration combines MoonPay's payment infrastructure with AI-driven trading automation, representing a convergence of crypto onboarding, artificial intelligence, and algorithmic trading.

🏢 Microsoft
AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

PMAx: An Agentic Framework for AI-Driven Process Mining

Researchers have developed PMAx, an autonomous AI framework that democratizes process mining by allowing business users to analyze organizational workflows through natural language queries. The system uses a multi-agent architecture with local execution to ensure data privacy and mathematical accuracy while eliminating the need for specialized technical expertise.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

Should LLMs, like, Generate How Users Talk? Building Dialect-Accurate Dialog[ue]s Beyond the American Default with MDial

Researchers introduced MDial, the first large-scale framework for generating multi-dialectal conversational data across nine English dialects, revealing that over 80% of English speakers don't use Standard American English. Evaluation of 17 LLMs showed even frontier models achieve under 70% accuracy in dialect identification, with particularly poor performance on non-American dialects.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation

Researchers developed a protocol to evaluate speaker verification capabilities in speech-aware large language models, finding weak performance with error rates above 20%. They introduced ECAPA-LLM, a lightweight augmentation that achieves a 1.03% error rate by integrating speaker embeddings while maintaining a natural language interface.

AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations

Researchers introduce PONTE, a human-in-the-loop framework that creates personalized, trustworthy AI explanations by combining user preference modeling with verification modules. The system addresses the challenge of one-size-fits-all AI explanations by adapting to individual user expertise and cognitive needs while maintaining faithfulness and reducing hallucinations.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7

LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning

Researchers have developed a new framework that combines Large Language Models (LLMs) with Deep Reinforcement Learning to improve data efficiency, interpretability, and cross-environment transferability. The approach uses LLMs to map natural language instructions into executable rules and create semantically annotated options for better skill reuse and constraint monitoring.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 12

Hyperdimensional Cross-Modal Alignment of Frozen Language and Image Models for Efficient Image Captioning

Researchers introduce HDFLIM, a new framework that aligns vision and language AI models without requiring computationally expensive fine-tuning by using hyperdimensional computing to create cross-modal mappings while keeping foundation models frozen. The approach achieves comparable performance to traditional training methods while being significantly more resource-efficient.

AI · Bullish · The Verge – AI · Feb 26 · 6/10 · 4

Microsoft’s Copilot Tasks AI uses its own computer to get things done

Microsoft announced Copilot Tasks, a new AI system that handles background tasks using cloud-based computers and browsers. The feature can schedule appointments, generate study plans, and complete various jobs on a recurring, scheduled, or one-time basis using natural language commands.

AI · Bullish · OpenAI News · Jan 7 · 6/10 · 5

How Tolan builds voice-first AI with GPT-5.1

Tolan has developed a voice-first AI companion using GPT-5.1 technology, featuring low-latency responses and real-time context reconstruction. The system incorporates memory-driven personalities to enable more natural conversational experiences.

AI · Bullish · Google DeepMind Blog · Dec 12 · 6/10 · 5

Improved Gemini audio models for powerful voice experiences

Google has announced improvements to its Gemini audio models, enhancing voice interaction capabilities for more powerful and natural voice experiences. The upgrades focus on better audio processing and response quality in conversational AI applications.
