#human-ai-collaboration News & Analysis

133 articles tagged with #human-ai-collaboration. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · Crypto Briefing · Jun 26 · 7/10

🧠

OpenAI’s Mark Chen says AI models are approaching the point of generating their own innovations

OpenAI's Mark Chen has stated that AI models are approaching a capability threshold where they can autonomously generate novel innovations without human direction. This development signals a fundamental shift in AI autonomy that could reshape how industries evaluate AI performance and redefine collaboration between humans and AI systems.

🏢 OpenAI

AI · Bearish · arXiv – CS AI · Jun 23 · 7/10

🧠

Arguments that Alter Minds: LLM Rationales Sway Human (and LLM) Notions of Plausibility

Researchers found that LLM-generated arguments significantly influence both human and AI plausibility judgments on commonsense reasoning tasks, with supportive rationales increasing confidence and opposing ones decreasing it. This reveals both a novel tool for studying human cognition and a concerning vulnerability: AI systems can persuade people to doubt their own common sense reasoning.

AI · Bullish · arXiv – CS AI · Jun 19 · 7/10

🧠

Uncertainty Decomposition for Clarification Seeking in LLM Agents

Researchers introduce a prompt-based uncertainty decomposition method that enables LLM agents to proactively seek clarification when task specifications are ambiguous. The approach separates action confidence from request uncertainty and demonstrates 36-73% improvements in clarification performance across multiple LLM backbones compared to existing uncertainty frameworks.

🧠 GPT-5

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

🧠

Complement or substitute? How AI increases the demand for human skills

A comprehensive empirical study analyzing 30 million US, UK, and Australian job postings finds that AI adoption increases demand for complementary human skills like analytical thinking and resilience rather than simply replacing workers. The research reveals significant wage premiums for these soft skills in AI-adjacent roles and spillover effects where AI diffusion reduces demand for substitutable tasks across entire industries and regions.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

🧠

Distilling LLM Reasoning into an Interpretable Policy Tree for Human-AI Collaboration

Researchers introduce Collaboration Policy Tree (Co-pi-tree), a method that distills large language model reasoning into interpretable, executable policy trees for human-AI collaboration. The approach achieves a 35% performance improvement while reducing LLM queries by 78% and latency by 97%, addressing key limitations of black-box reinforcement learning and costly real-time LLM querying.

AI · Bullish · Crypto Briefing · Jun 6 · 7/10

🧠

Mira Murati resurfaces at Bloomberg Tech to unveil Thinking Machines Lab’s ambitious AI vision

Mira Murati, former OpenAI CTO, has publicly unveiled her new venture, Thinking Machines Lab, at Bloomberg Tech, positioning it as a disruptive force in AI that prioritizes human-AI collaboration over the current dominance of major tech players. Her vision challenges the existing AI landscape by emphasizing partnership models rather than AI-centric approaches.

AI · Bearish · arXiv – CS AI · May 29 · 7/10

🧠

How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessions

A large-scale observational study of 20,574 real-world AI coding agent sessions reveals systematic misalignment patterns between developer intent and agent behavior. The research identifies seven recurring failure modes, with 91.49% of visible issues requiring explicit user correction, though most impose effort costs rather than irreversible damage.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

🧠

Localizing Input Uncertainty Quantification for Large Language Models via Shapley Values

Researchers introduce ShaQ, a Shapley-value-based framework that identifies which specific parts of user input cause uncertainty in large language models, rather than just flagging overall uncertainty. The method achieves state-of-the-art ambiguity detection on multiple benchmarks and demonstrates practical value in high-stakes domains like clinical settings by enabling targeted input clarification.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

🧠

FundaPod: A Multi-Persona Agent Pod Platform with Knowledge Graph Memory for AI-Assisted Fundamental Investment Research

FundaPod introduces a multi-persona AI agent platform designed to assist institutional investors in fundamental research by enabling independent agents with different investment perspectives to conduct analysis and surface disagreements for human portfolio manager review. The system uses knowledge graphs and grounded evidence models to create transparent, verifiable investment memos that prioritize human-centric decision-making over automated trading signals.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

🧠

AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

Researchers have introduced the AI co-mathematician, an interactive workbench that leverages agentic AI to assist mathematicians in solving open-ended research problems. The system achieves state-of-the-art results on hard benchmarks, scoring 48% on FrontierMath Tier 4, and demonstrates practical value by helping researchers solve open problems and identify new research directions.

AI · Neutral · arXiv – CS AI · May 7 · 7/10

🧠

Toward Human-AI Complementarity Across Diverse Tasks

A research study evaluates whether combining human and AI judgments can improve decision-making across diverse tasks, finding only modest complementarity gains of 0.4 percentage points. The primary bottleneck identified is not human accuracy but the difficulty of routing decisions to humans when needed and of designing assistance methods that help humans catch AI mistakes.

AI · Bullish · arXiv – CS AI · Apr 15 · 7/10

🧠

IDEA: An Interpretable and Editable Decision-Making Framework for LLMs via Verbal-to-Numeric Calibration

Researchers introduce IDEA, a framework that converts Large Language Model decision-making into interpretable, editable parametric models with calibrated probabilities. The approach outperforms major LLMs like GPT-5.2 and DeepSeek R1 on benchmarks while enabling direct expert knowledge integration and precise human-AI collaboration.

🧠 GPT-5

AI · Neutral · arXiv – CS AI · Apr 7 · 7/10

🧠

The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance

Research reveals a 'Persuasion Paradox': LLM explanations increase user confidence but do not reliably improve human-AI team performance, and can even undermine task accuracy. The study found that explanation effectiveness varies significantly by task type: error recovery decreased on visual reasoning tasks, while logical reasoning tasks benefited from explanations.

AI · Bullish · arXiv – CS AI · Mar 27 · 7/10

🧠

The Future of AI-Driven Software Engineering

A paradigm shift is occurring in software engineering as AI systems like LLMs increasingly boost development productivity. The paper presents a vision for growing symbiotic partnerships between human developers and AI, identifying key research challenges the software engineering community must address.

AI · Neutral · arXiv – CS AI · Mar 26 · 7/10

🧠

Collaborative Causal Sensemaking: Closing the Complementarity Gap in Human-AI Decision Support

Researchers propose Collaborative Causal Sensemaking (CCS) as a new framework to improve human-AI collaboration in high-stakes decision making. The study identifies a 'complementarity gap' where current AI agents function as answer engines rather than true collaborative partners, limiting the effectiveness of human-AI teams.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

🧠

Human-AI Governance (HAIG): A Trust-Utility Approach

Researchers introduce the Human-AI Governance (HAIG) framework that treats AI systems as collaborative partners rather than mere tools, proposing a trust-utility approach to governance across three dimensions: Decision Authority, Process Autonomy, and Accountability Configuration. The framework aims to enable adaptive regulatory design for evolving AI capabilities, particularly as foundation models and multi-agent systems demonstrate increasing autonomy.

AI · Bullish · arXiv – CS AI · Mar 9 · 7/10

🧠

Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

Google's Gemini-based AI models, particularly Gemini Deep Think, have demonstrated the ability to collaborate with researchers to solve open problems and generate new proofs across theoretical computer science, economics, optimization, and physics. The research identifies effective techniques for human-AI collaboration including iterative refinement, problem decomposition, and deploying AI as adversarial reviewers to detect flaws in existing proofs.

🧠 Gemini

AI × Crypto · Bullish · CryptoPotato · Mar 6 · 7/10

🤖

Vitalik Buterin Proposes Human-Verified AI Wallets for Crypto Transactions

Ethereum founder Vitalik Buterin has proposed a new wallet design that combines AI assistance with human verification for cryptocurrency transactions. The system would allow AI algorithms to suggest transaction plans while requiring users to manually confirm large transfers, aiming to balance automation with security.

AI · Bullish · TechCrunch – AI · Mar 5 · 7/10

🧠

Netflix buys Ben Affleck’s AI filmmaking company InterPositive

Netflix has acquired Ben Affleck's AI filmmaking company InterPositive, marking a significant move by the streaming giant into AI-powered content creation. Affleck emphasized his goal to preserve human judgment and storytelling elements while leveraging artificial intelligence in the filmmaking process.

AI · Bearish · arXiv – CS AI · Mar 4 · 7/10 · 2

🧠

The Geometry of Learning Under AI Delegation

Researchers developed a mathematical model showing how AI delegation can create stable low-skill equilibria where humans become persistently reliant on AI systems. The study reveals that while AI assistance improves short-term performance, it can lead to long-term skill degradation through reduced practice and negative feedback loops.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2

🧠

PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR

PlayWrite is a new mixed-reality AI system that allows users to create stories by directly manipulating virtual characters and props in XR, rather than through traditional text prompts. The system uses multi-agent AI to interpret user actions into structured narrative elements and generates final stories via large language models, demonstrating a novel approach to AI-human creative collaboration.

AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 5

🧠

Architecting Trust in Artificial Epistemic Agents

Researchers propose a framework for developing trustworthy AI agents that function as epistemic entities, capable of pursuing knowledge goals and shaping information environments. The paper argues that as AI models increasingly replace traditional search methods and provide specialized advice, their calibration to human epistemic norms becomes critical to prevent cognitive deskilling and epistemic drift.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

🧠

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Researchers introduce Skywork-Reward-V2, a suite of AI reward models trained on SynPref-40M, a massive 40-million preference pair dataset created through human-AI collaboration. The models achieve state-of-the-art performance across seven major benchmarks by combining human annotation quality with AI scalability for better preference learning.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

🧠

From Meta Idea to Advanced Mathematical Discovery -- Human-AI Co-Discovery of Sign-Embedding Quantum Algorithms

Researchers demonstrate a human-AI co-discovery workflow that transformed a vague mathematical intuition into sign-embedding quantum algorithms for matrix equations. Rather than AI autonomously solving predefined problems, the collaborative approach proved most valuable for problem formation, exploratory route-mapping, and proof development, with humans retaining critical judgment on scientific direction.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

🧠

Agentic Software Engineering: Foundational Pillars and a Research Roadmap

Researchers propose Structured Agentic Software Engineering (SASE), a framework reimagining software development where AI agents autonomously pursue complex goals rather than simply generating code. The approach introduces two complementary environments, one for human oversight and one for agent execution, establishing a human-AI partnership model that demands fundamental changes to traditional software engineering processes, tools, and artifacts.
