#conversational-ai News & Analysis

168 articles tagged with #conversational-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Apr 13 · 7/10

Artificial intelligence can persuade people to take political actions

A large-scale study demonstrates that conversational AI models can persuade people to take real-world actions such as signing petitions and donating money, with effects reaching +19.7 percentage points on petition signing. Surprisingly, the research finds no correlation between AI's persuasive effects on attitudes and its effects on behaviors, challenging the assumption that attitude change predicts behavioral outcomes.

AI · Bullish · Crypto Briefing · Apr 10 · 7/10

Brad Lightcap: Scaling laws show larger AI models outperform smaller ones, the evolution of language models to conversational interfaces, and the emergence of AI agency | Uncapped with Jack Altman

Brad Lightcap discusses how scaling laws demonstrate that larger AI models consistently outperform smaller ones, while highlighting the evolution from language models to conversational AI interfaces and the emerging phenomenon of AI agency. This shift toward autonomous AI systems signals significant economic and societal implications.

AI · Bearish · arXiv – CS AI · Apr 7 · 7/10

Commercial Persuasion in AI-Mediated Conversations

A research study reveals that AI-powered conversational interfaces can nearly triple the rate of sponsored product selection compared to traditional search engines (61.2% vs 22.4%). Users largely fail to detect this commercial steering, even with explicit sponsor labels, indicating that current transparency measures are insufficient.

AI · Bearish · arXiv – CS AI · Mar 27 · 7/10

Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

Researchers conducted a study with 502 participants demonstrating that malicious LLM-based conversational AI systems can be deliberately designed to extract personal information from users through manipulative conversation strategies. The study found that these malicious chatbots significantly outperformed benign versions at collecting personal data, with social psychology-based approaches being most effective while appearing less threatening to users.

🧠 ChatGPT

AI · Neutral · Ars Technica – AI · Mar 26 · 7/10

The debut of Gemini 3.1 Flash Live could make it harder to know if you're talking to a robot

Google is launching Gemini 3.1 Flash Live, a new conversational audio AI system being integrated into Search, the Gemini platform, and developer tools. The advance in conversational capability could make it increasingly difficult for users to tell whether they are talking to a human or an AI.

🧠 Gemini

AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

τ-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains

Researchers introduce τ-voice, a new benchmark for evaluating full-duplex voice AI agents on complex real-world tasks. The study reveals significant performance gaps, with voice agents achieving only 30-45% of text-based AI capability under realistic conditions with noise and diverse accents.

🧠 GPT-5

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic

Google's AMIE conversational AI completed a clinical feasibility study with 100 patients at an academic medical center, including the correct diagnosis in 90% of cases and achieving high patient satisfaction. The AI showed diagnostic quality comparable to that of primary care physicians and required no safety interventions during real-world clinical interactions.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

Researchers introduce History-Echoes, a framework revealing how large language models become trapped by their conversational history, with past interactions creating geometric constraints in latent space that bias future responses. The study demonstrates that behavioral persistence in LLMs manifests as mathematical traps where previous hallucinations and responses influence subsequent model behavior across multiple model families and datasets.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Certainty robustness: Evaluating LLM stability under self-challenging prompts

Researchers introduce the Certainty Robustness Benchmark, a new evaluation framework that tests how large language models handle challenges to their responses in interactive settings. The study reveals significant differences in how AI models balance confidence and adaptability when faced with prompts like "Are you sure?" or "You are wrong!", identifying a critical new dimension for AI evaluation.

AI · Neutral · arXiv – CS AI · Mar 5 · 6/10

SafeCRS: Personalized Safety Alignment for LLM-Based Conversational Recommender Systems

Researchers introduce SafeCRS, a safety-aware training framework for LLM-based conversational recommender systems that addresses personalized safety vulnerabilities. The system reduces safety violation rates by up to 96.5% while maintaining recommendation quality by respecting individual user constraints like trauma triggers and phobias.

AI · Bearish · arXiv – CS AI · Mar 5 · 6/10

τ-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Researchers introduce τ-Knowledge, a new benchmark for evaluating conversational AI agents in knowledge-intensive environments, specifically testing their ability to retrieve and apply unstructured domain knowledge. Even frontier AI models achieved a success rate of only 25.5% when navigating complex fintech customer support scenarios with 700 interconnected knowledge documents.

AI · Bullish · OpenAI News · Oct 6 · 7/10 · 5

Introducing apps in ChatGPT and the new Apps SDK

OpenAI is launching a new generation of interactive apps within ChatGPT that users can chat with directly. The company has released a new Apps SDK in preview, allowing developers to start building these conversational applications immediately.

AI · Bullish · OpenAI News · Sep 29 · 7/10 · 7

Buy it in ChatGPT: Instant Checkout and the Agentic Commerce Protocol

OpenAI is introducing agentic commerce capabilities to ChatGPT, enabling AI agents, users, and businesses to collaborate in shopping experiences. This represents an early step toward AI-powered autonomous commerce systems integrated into conversational AI platforms.

AI · Bullish · OpenAI News · Oct 1 · 7/10 · 5

Introducing the Realtime API

OpenAI has launched a new Realtime API that enables developers to integrate fast speech-to-speech capabilities directly into their applications. This API allows for real-time voice interactions without the traditional delays of converting speech to text and back to speech.

AI · Bullish · OpenAI News · Sep 25 · 7/10 · 4

ChatGPT can now see, hear, and speak

ChatGPT is rolling out new multimodal capabilities that enable voice conversations and image recognition. These features represent a significant advancement in AI interface design, making interactions more intuitive and natural.

AI · Bullish · OpenAI News · Nov 30 · 7/10 · 7

Introducing ChatGPT

OpenAI has introduced ChatGPT, a conversational AI model designed to interact through dialogue. The model can answer follow-up questions, admit mistakes, challenge incorrect premises, and reject inappropriate requests.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Tinker Tales: A Tangible Dialogue System for Child-AI Co-Creative Storytelling

Tinker Tales is a tangible dialogue system combining physical storytelling boards, NFC-embedded toys, and mobile apps to enable child-AI collaborative storytelling. A user study with 8-year-olds demonstrates that conversation design and prompt framing significantly influence how children engage in co-creative dialogue with AI agents, with educational scaffolding affecting narrative consistency and contribution patterns.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Memory Makes the Difference: Evaluating How Different Memory Roles Shape Conversational Agents

Researchers present a taxonomy of memory roles in RAG-based conversational AI systems, demonstrating that different memory types—such as clarifying versus irrelevant memories—substantially shape response quality, factual accuracy, and personalization. Using a user-centric evaluation framework, the study reveals that memory function matters more than just storage mechanisms, with implications for developing more effective conversational agents.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

Streaming T5-based Text-to-Speech Synthesis with Limited Lookahead

Researchers introduce S5-TTS, a streaming variant of T5-based text-to-speech that generates speech word-by-word with minimal latency by processing limited lookahead context. The system uses novel masking mechanisms and distillation techniques to maintain speech quality and speaker similarity while enabling real-time conversational AI applications.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

PulseCX: Breaking the Closed-World Assumption in Real-Time CX

PulseCX is a new framework that addresses a critical limitation in conversational AI for customer service: the inability to respond to real-time external events like viral trends or system outages. By using an asynchronous knowledge graph system instead of synchronous web search, PulseCX reduces latency to under 10ms while improving intent resolution and customer satisfaction in dynamic environments.

AI · Bearish · arXiv – CS AI · Jun 23 · 6/10

AI-Mediated Negotiation: Design Reflections and Lessons

Researchers built Trucey, an AI coaching system for workplace negotiations, but found that a static handbook outperformed the conversational AI on user empowerment and usability. The study reveals that conversational AI imposes linear execution models on tasks requiring recursive, non-sequential preparation, challenging core assumptions about AI-mediated coaching design.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Nous: A Predictive World Model for Long-Term Agent Memory

Nous is a novel agent memory architecture that uses predictive world models based on probability distributions rather than traditional storage methods. Evaluated on the LoCoMo benchmark, it achieves competitive F1 scores across multiple memory tasks and outperforms comparable systems like A-MEM and BeliefMem, though the authors acknowledge reproducibility challenges in cross-system comparisons.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Post-Training Recipe, More Than Model Family, Shapes Multi-Agent LLM Conversational Behavior

Researchers found that post-training procedures significantly influence how large language models behave in multi-agent systems, often more than model family membership. Testing across 1.6M interaction chains reveals that identical base models fine-tuned differently produce more behavioral diversity than models from different families, challenging conventional wisdom about composing effective multi-LLM systems.

🧠 Llama

AI · Neutral · OpenAI News · Jun 23 · 6/10

How Omio is building the future of conversational travel

Omio, a travel booking platform, is integrating OpenAI's technology to build conversational AI features that enhance user experience and accelerate product development. The company is transitioning toward an AI-native architecture, leveraging large language models to streamline travel planning and booking processes.

🏢 OpenAI

AI · Neutral · Wired – AI · Jun 20 · 6/10

Siri AI Hands On: A Smart, Helpful Assistant

Apple's new Siri AI represents a significant upgrade in conversational capability and contextual awareness, positioning the assistant as more integrated and useful across Apple's ecosystem. The enhancement demonstrates the broader industry shift toward more sophisticated, omnipresent AI assistants that can handle complex user interactions.
