y0news

#voice-ai News & Analysis

20 articles tagged with #voice-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · Decrypt – AI · 4d ago · 7/10

StepFun's Voice AI Topped Every Benchmark. It Also Hears Your Sighs

StepFun, a Shanghai-based AI lab known for developing efficient large language models, has achieved top benchmark results in voice AI technology with notable sensitivity to acoustic nuances like sighs. The breakthrough demonstrates the lab's capability to extend its LLM expertise into multimodal AI, potentially reshaping voice recognition and AI assistant markets.

AI × Crypto · Bullish · Crypto Briefing · May 9 · 7/10

OpenAI unveils GPT-5-class voice models for real-time orchestration

OpenAI has released GPT-5-class voice models designed for real-time orchestration, which could significantly impact cryptocurrency markets and decentralized computing infrastructure. The modular voice AI tools are positioned to drive innovation and investment in AI infrastructure sectors, with potential implications for how decentralized systems handle computational tasks.

🏢 OpenAI · 🧠 GPT-5
AI · Bullish · OpenAI News · May 7 · 7/10

Advancing voice intelligence with new models in the API

OpenAI has introduced new realtime voice models in its API that enable advanced capabilities including reasoning, translation, and speech transcription. These models represent a significant step toward more natural and intelligent voice-based interactions, expanding the practical applications available to developers building voice-enabled applications.

🏢 OpenAI
AI · Bullish · OpenAI News · May 4 · 7/10

How OpenAI delivers low-latency voice AI at scale

OpenAI has rebuilt its WebRTC infrastructure to enable real-time voice AI conversations with minimal latency and global scalability. The technical achievement demonstrates a significant advancement in conversational AI systems that can maintain natural turn-taking dynamics while serving users worldwide.

🏢 OpenAI
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

τ-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains

Researchers introduce τ-voice, a new benchmark for evaluating full-duplex voice AI agents on complex real-world tasks. The study reveals significant performance gaps, with voice agents achieving only 30-45% of text-based AI capability under realistic conditions with noise and diverse accents.

🧠 GPT-5
AI · Bullish · OpenAI News · Oct 1 · 7/10 · 5

Introducing the Realtime API

OpenAI has launched a new Realtime API that enables developers to integrate fast speech-to-speech capabilities directly into their applications. This API allows for real-time voice interactions without the traditional delays of converting speech to text and back to speech.

AI · Bullish · Blockonomi · 1d ago · 6/10

Alibaba Voice AI Model Beats OpenAI and xAI on Global Benchmark

Alibaba's Fun-Realtime-TTS-Preview voice AI model ranked fifth on the Artificial Analysis Speech Arena leaderboard, outperforming systems from OpenAI and xAI. This achievement marks Alibaba as the only Chinese-engineered voice system in the global top five, supporting 30+ languages and multiple Chinese dialects.

🏢 OpenAI · 🏢 xAI
AI · Neutral · TechCrunch – AI · May 10 · 6/10

Voice AI in India is hard. Wispr Flow is betting on it anyway.

Wispr Flow has accelerated growth in India following its Hinglish language rollout, demonstrating market demand for voice AI solutions in regional languages. However, the company operates within a challenging landscape where voice AI products face significant technical and adoption hurdles across the Indian market.

AI · Bullish · Blockonomi · May 2 · 6/10

SoundHound AI (SOUN) Stock Rallies 20% Following Twilio’s Bullish Voice AI Results

SoundHound AI (SOUN) stock surged 20.1% after competitor Twilio reported positive voice AI results, riding market enthusiasm for the voice AI sector. The rally comes ahead of SOUN's own Q1 earnings announcement, scheduled for Thursday, which could provide an additional catalyst for the stock's momentum.

AI · Bullish · Blockonomi · Apr 21 · 6/10

SoundHound AI (SOUN) Stock Climbs 3% Despite Broader Tech Sector Weakness

SoundHound AI (SOUN) gained 3% on Monday despite a broader technology sector decline triggered by U.S.-Iran geopolitical tensions. The stock's resilience reflects strong fundamental performance, with the company reporting 59.4% year-over-year revenue growth and analyst price targets exceeding $14.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Efficient Training for Cross-lingual Speech Language Models

Researchers introduce Cross-lingual Speech Language Models (CSLM), an efficient training method for building multilingual speech AI systems using discrete speech tokens. The approach achieves cross-modal and cross-lingual alignment through continual pre-training and instruction fine-tuning, enabling effective speech LLMs without requiring massive datasets.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation

Researchers developed a protocol to evaluate speaker verification in speech-aware large language models and found weak performance, with error rates above 20%. They introduced ECAPA-LLM, a lightweight augmentation that achieves a 1.03% error rate by integrating speaker embeddings while maintaining a natural language interface.

AI · Bullish · Wired – AI · Mar 3 · 7/10 · 6

This AI Agent Is Ready to Serve, Mid-Phone Call

Deutsche Telekom is partnering with ElevenLabs to integrate AI assistant functionality directly into phone calls across its German network without requiring any app installation. This represents a significant step toward mainstream AI integration in telecommunications infrastructure.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 13

Human or Machine? A Preliminary Turing Test for Speech-to-Speech Interaction

Researchers conducted the first Turing test for speech-to-speech AI systems, analyzing 2,968 human judgments across 9 state-of-the-art systems. No current S2S system passed the test, with failures primarily stemming from paralinguistic features and emotional expressivity rather than semantic understanding.

AI · Bullish · OpenAI News · Jan 20 · 6/10 · 4

ServiceNow powers actionable enterprise AI with OpenAI

ServiceNow is expanding its integration with OpenAI to bring advanced AI capabilities to enterprise workflows. The partnership will enable AI-driven summarization, search, and voice features across ServiceNow's platform to enhance business operations.

AI · Bullish · OpenAI News · Jan 7 · 6/10 · 5

How Tolan builds voice-first AI with GPT-5.1

Tolan has built a voice-first AI companion on GPT-5.1, featuring low-latency responses and real-time context reconstruction. The system uses memory-driven personalities to make conversations feel more natural.

AI · Bullish · Google DeepMind Blog · Dec 12 · 6/10 · 5

Improved Gemini audio models for powerful voice experiences

Google has announced improvements to its Gemini audio models, enhancing voice interaction capabilities for more powerful and natural voice experiences. The upgrades focus on better audio processing and response quality in conversational AI applications.

AI · Neutral · OpenAI News · Jun 7 · 5/10 · 7

Expanding on how Voice Engine works and our safety research

OpenAI provides technical insight into Voice Engine, its text-to-speech model, along with details of its safety research approach. The article covers the underlying technology and the safety considerations for voice synthesis.

AI · Neutral · TechCrunch – AI · May 10 · 5/10

Get ready for the whisper-filled office of the future

The article explores how increasing reliance on voice-based AI interactions will transform office design and work environments. As workers spend more time speaking to computers rather than typing, physical office spaces will need to adapt to accommodate whisper-based communication and new acoustic challenges.