AI · Bearish · arXiv – CS AI · 2d ago · 7/10
🧠A large-scale observational study of 20,574 real-world AI coding agent sessions reveals systematic misalignment patterns between developer intent and agent behavior. The research identifies seven recurring failure modes, with 91.49% of visible issues requiring explicit user correction, though most impose effort costs rather than irreversible damage.
AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠FundaPod introduces a multi-persona AI agent platform designed to assist institutional investors in fundamental research by enabling independent agents with different investment perspectives to conduct analysis and surface disagreements for human portfolio manager review. The system uses knowledge graphs and grounded evidence models to create transparent, verifiable investment memos that prioritize human-centric decision-making over automated trading signals.
AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers introduce ShaQ, a Shapley-value-based framework that identifies which specific parts of user input cause uncertainty in large language models, rather than just flagging overall uncertainty. The method achieves state-of-the-art ambiguity detection on multiple benchmarks and demonstrates practical value in high-stakes domains like clinical settings by enabling targeted input clarification.
AI · Bullish · arXiv – CS AI · May 9 · 7/10
🧠Researchers have introduced the AI co-mathematician, an interactive workbench that leverages agentic AI to assist mathematicians in solving open-ended research problems. The system achieves state-of-the-art results on hard benchmarks, scoring 48% on FrontierMath Tier 4, and demonstrates practical value by helping researchers solve open problems and identify new research directions.
AI · Neutral · arXiv – CS AI · May 7 · 7/10
🧠A research study evaluates whether combining human and AI judgments can improve decision-making across diverse tasks, finding only modest complementarity gains of 0.4 percentage points. The primary bottleneck identified is not human accuracy but rather the inability to effectively route decisions to humans when needed and design assistance methods that help humans catch AI mistakes.
AI · Bullish · arXiv – CS AI · Apr 15 · 7/10
🧠Researchers introduce IDEA, a framework that converts Large Language Model decision-making into interpretable, editable parametric models with calibrated probabilities. The approach outperforms major LLMs like GPT-5.2 and DeepSeek R1 on benchmarks while enabling direct expert knowledge integration and precise human-AI collaboration.
🧠 GPT-5
AI · Neutral · arXiv – CS AI · Apr 7 · 7/10
🧠Research reveals a 'Persuasion Paradox' where LLM explanations increase user confidence but don't reliably improve human-AI team performance, and can actually undermine task accuracy. The study found that explanation effectiveness varies significantly by task type, with visual reasoning tasks seeing decreased error recovery while logical reasoning tasks benefited from explanations.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠A paradigm shift is occurring in software engineering as AI systems like LLMs increasingly boost development productivity. The paper presents a vision for growing symbiotic partnerships between human developers and AI, identifying key research challenges the software engineering community must address.
AI · Neutral · arXiv – CS AI · Mar 26 · 7/10
🧠Researchers propose Collaborative Causal Sensemaking (CCS) as a new framework to improve human-AI collaboration in high-stakes decision making. The study identifies a 'complementarity gap' where current AI agents function as answer engines rather than true collaborative partners, limiting the effectiveness of human-AI teams.
AI · Bullish · arXiv – CS AI · Mar 16 · 7/10
🧠Researchers introduce the Human-AI Governance (HAIG) framework that treats AI systems as collaborative partners rather than mere tools, proposing a trust-utility approach to governance across three dimensions: Decision Authority, Process Autonomy, and Accountability Configuration. The framework aims to enable adaptive regulatory design for evolving AI capabilities, particularly as foundation models and multi-agent systems demonstrate increasing autonomy.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Google's Gemini-based AI models, particularly Gemini Deep Think, have demonstrated the ability to collaborate with researchers to solve open problems and generate new proofs across theoretical computer science, economics, optimization, and physics. The research identifies effective techniques for human-AI collaboration including iterative refinement, problem decomposition, and deploying AI as adversarial reviewers to detect flaws in existing proofs.
🧠 Gemini
AI × Crypto · Bullish · CryptoPotato · Mar 6 · 7/10
🤖Ethereum founder Vitalik Buterin has proposed a new wallet design that combines AI assistance with human verification for cryptocurrency transactions. The system would allow AI algorithms to suggest transaction plans while requiring users to manually confirm large transfers, aiming to balance automation with security.
AI · Bullish · TechCrunch – AI · Mar 5 · 7/10
🧠Netflix has acquired Ben Affleck's AI filmmaking company InterPositive, marking a significant move by the streaming giant into AI-powered content creation. Affleck emphasized his goal to preserve human judgment and storytelling elements while leveraging artificial intelligence in the filmmaking process.
AI · Bearish · arXiv – CS AI · Mar 4 · 7/10 · 2
🧠Researchers developed a mathematical model showing how AI delegation can create stable low-skill equilibria where humans become persistently reliant on AI systems. The study reveals that while AI assistance improves short-term performance, it can lead to long-term skill degradation through reduced practice and negative feedback loops.
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3
🧠Researchers introduce Skywork-Reward-V2, a suite of AI reward models trained on SynPref-40M, a massive 40-million preference pair dataset created through human-AI collaboration. The models achieve state-of-the-art performance across seven major benchmarks by combining human annotation quality with AI scalability for better preference learning.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2
🧠PlayWrite is a new mixed-reality AI system that allows users to create stories by directly manipulating virtual characters and props in XR, rather than through traditional text prompts. The system uses multi-agent AI to translate user actions into structured narrative elements and generates final stories via large language models, demonstrating a novel approach to AI-human creative collaboration.
AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 5
🧠Researchers propose a framework for developing trustworthy AI agents that function as epistemic entities, capable of pursuing knowledge goals and shaping information environments. The paper argues that as AI models increasingly replace traditional search methods and provide specialized advice, their calibration to human epistemic norms becomes critical to prevent cognitive deskilling and epistemic drift.
AI · Neutral · Fortune Crypto · 2d ago · 6/10
🧠Asana, a project management platform that struggled during the AI boom, is betting on a $75 million acquisition of Stack AI to reposition itself as a human-AI collaboration tool. CEO Dan Rogers believes this move will enable the company to compete in an era where AI agents work alongside human teams.
AI · Neutral · arXiv – CS AI · 2d ago · 5/10
🧠Researchers developed an AI-powered decision layer that identifies struggling students and prioritizes course topics without relying on grades, combining student self-reports, observed learning difficulties, and teacher concerns. Testing in a graduate CS course showed the multi-signal approach achieved 96% accuracy in surfacing at-risk learners and aligned with instructor priorities, demonstrating transparent human-AI collaboration in educational settings.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠A research study comparing human and LLM reasoning capabilities found that humans are significantly more biased by source labels when evaluating logical fallacies, while LLMs maintain more consistent performance regardless of whether content is attributed to humans or AI. This finding suggests LLMs could enhance human decision-making in AI-mediated environments by providing source-agnostic analysis.
🧠 GPT-5 · 🧠 Claude · 🧠 Sonnet
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers used AI-assisted methods to prove that Poincaré polynomials of moduli spaces of rational curves have only real roots, resolving a longstanding conjecture in algebraic geometry. The breakthrough employs a novel bivariate deformation technique that reveals hidden mathematical structures, with implications for understanding the topological properties of geometric spaces.
🏢 Google
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠MOOSE-Copilot introduces a unified framework for scientific hypothesis discovery that combines exploratory ideation with fine-grained refinement through structured human-AI interaction. The web-based system enables scientists to guide LLM-powered discovery processes via initial blueprints, routing decisions, and feedback mechanisms, outperforming autonomous baselines while lowering accessibility barriers through an intuitive visual interface.
🏢 Microsoft
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose an ontology-driven framework called CCAI (Contextual Collaboration AI Ontology) to document and trace human-AI interactions, converting ephemeral prompt-response exchanges into structured, queryable collaboration records. The framework addresses transparency and accountability gaps in AI-assisted workflows by explicitly modeling tasks, agent roles, resources, and constraints within a machine-interpretable vocabulary.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers developed a triadic collaboration system integrating Large Language Models, teachers, and students for K-12 writing education, evaluated across 57,954 essays from 10,195 students over two years. The study demonstrates that LLMs effectively reduce teacher workload while teachers serve as quality gatekeepers, though excessive AI suggestions produce diminishing returns, indicating the need for adaptive collaboration strategies.
AI · Bearish · Fortune Crypto · 3d ago · 6/10
🧠Boston Consulting Group research reveals that integrating AI 'employees' into workplaces is producing counterintuitive negative effects: human workers become less accountable and more prone to errors by shifting blame onto their AI colleagues. This phenomenon suggests that despite AI's intended productivity benefits, organizational behavior deteriorates when humans can externalize responsibility to automated systems.