y0news

#software-development News & Analysis

46 articles tagged with #software-development. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

EvoClaw: Evaluating AI Agents on Continuous Software Evolution

Researchers introduce EvoClaw, a new benchmark that evaluates AI agents on continuous software evolution rather than isolated coding tasks. The study reveals a critical performance drop from >80% on isolated tasks to at most 38% in continuous settings across 12 frontier models, highlighting AI agents' struggle with long-term software maintenance.

AI · Bearish · MIT Technology Review · Mar 5 · 6/10

The Download: an AI agent’s hit piece, and preventing lightning

The article discusses how online harassment is evolving with AI technology, specifically an incident in which Scott Shambaugh denied an AI agent's request to contribute to the matplotlib library. The piece appears to be part of a technology newsletter covering AI-related developments and their societal implications.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10

Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability

A controlled study of 151 professional developers found that AI coding assistants like GitHub Copilot provide significant productivity gains (30.7% faster completion) but don't impact code maintainability when other developers later modify the code. The research suggests AI-assisted code is neither easier nor harder for subsequent developers to work with.

AI · Bullish · OpenAI News · Feb 5 · 7/10

GPT-5.3-Codex System Card

OpenAI has released GPT-5.3-Codex, described as the most capable agentic coding model to date. The system combines the advanced coding performance of GPT-5.2-Codex with enhanced reasoning and professional knowledge capabilities from GPT-5.2.

AI · Neutral · IEEE Spectrum – AI · Jan 29 · 7/10

Was 2025 Really the Year of AI Agents?

AI agents saw mixed adoption in 2025, with significant breakthroughs in programming and software development through tools like Cursor and Claude Code, but limited deployment in other industries due to accountability concerns and regulatory challenges. While programmers embraced AI agents for tasks like automated testing, many organizations remain in evaluation phases rather than production deployment.

AI · Bullish · OpenAI News · Jan 20 · 7/10

Cisco and OpenAI redefine enterprise engineering with AI agents

Cisco and OpenAI have partnered to launch Codex, an AI software agent that integrates into enterprise workflows to accelerate development builds, automate defect resolution, and enable AI-native development practices. This collaboration aims to redefine how enterprises approach software engineering through embedded AI capabilities.

AI · Bullish · VentureBeat – AI · Jan 5 · 7/10

The creator of Claude Code just revealed his workflow, and developers are losing their minds

Boris Cherny, creator of Claude Code at Anthropic, revealed a development workflow that uses 5 parallel AI agents and exclusively runs the slowest but smartest model, Opus 4.5. His approach transforms coding from linear programming into fleet management, achieving the output capacity of a small engineering team while maintaining a shared knowledge file that turns AI mistakes into permanent lessons.

AI · Bullish · OpenAI News · Nov 25 · 7/10

Inside JetBrains—the company reshaping how the world writes code

JetBrains is integrating GPT-5 across its development tools to help millions of developers design, reason, and build software more efficiently. This integration represents a significant advancement in AI-powered coding assistance for the global developer community.

AI · Bullish · The Verge – AI · 2d ago · 6/10

The AI code wars are heating up

The article explores the intensifying competition among tech companies to develop superior AI coding tools, with Microsoft's GitHub Copilot marking an early breakthrough in AI-assisted development before ChatGPT's mainstream emergence. Multiple players are now racing to dominate the AI coding space, signaling a shift in how software development fundamentally works.

🏢 OpenAI · 🏢 Anthropic · 🏢 Microsoft
AI · Bearish · arXiv – CS AI · 5d ago · 6/10

Evaluating LLM-Based 0-to-1 Software Generation in End-to-End CLI Tool Scenarios

Researchers introduce CLI-Tool-Bench, a new benchmark for evaluating large language models' ability to generate complete software from scratch. Testing seven state-of-the-art LLMs reveals that top models achieve under 43% success rates, exposing significant limitations in current AI-driven 0-to-1 software generation despite increased computational investment.

AI · Bullish · Fortune Crypto · Apr 6 · 6/10

The real impact of AI on SaaS isn’t what investors think

The article argues that AI's impact on SaaS will be to enable a surge of new software creation rather than eliminating existing software companies. Lower development costs and simplified coding through AI tools could democratize software development and expand the market.

AI · Bullish · The Register – AI · Mar 26 · 7/10

AI bug reports went from junk to legit overnight, says Linux kernel czar

Linux kernel czar Linus Torvalds reports that AI-generated bug reports have dramatically improved in quality, transforming from mostly useless submissions to legitimate and valuable contributions overnight. This represents a significant milestone in AI's ability to assist with complex software development and code analysis tasks.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

HalluJudge: A Reference-Free Hallucination Detection for Context Misalignment in Code Review Automation

Researchers developed HalluJudge, a reference-free system to detect hallucinations in AI-generated code review comments, addressing a key challenge in LLM adoption for software development. The system achieves 85% F1 score with 67% alignment to developer preferences at just $0.009 average cost, making it a practical safeguard for AI-assisted code reviews.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Lore: Repurposing Git Commit Messages as a Structured Knowledge Protocol for AI Coding Agents

Researchers propose 'Lore', a lightweight protocol that restructures Git commit messages to preserve decision-making context for AI coding agents. The system uses native Git trailers to capture reasoning, constraints, and alternatives behind code changes, addressing the growing loss of institutional knowledge as AI agents become primary code producers.
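The article doesn't spell out which trailer keys Lore defines, but native Git trailers are simply `Key: value` lines in the final paragraph of a commit message, which makes them machine-readable by design. A minimal sketch of extracting such trailers, using hypothetical key names (`Reasoning`, `Constraint`, `Alternative`) rather than Lore's actual protocol:

```python
# Sketch: extract Key: value trailers from the last paragraph of a commit
# message. The trailer names shown are hypothetical illustrations, not
# necessarily the ones the Lore protocol specifies.

def parse_trailers(message: str) -> dict[str, list[str]]:
    """Return trailers found in the final paragraph of a commit message."""
    paragraphs = message.strip().split("\n\n")
    trailers: dict[str, list[str]] = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        # A trailer key is a single token followed by ": ".
        if sep and key and " " not in key:
            trailers.setdefault(key, []).append(value.strip())
    return trailers

msg = """Fix cache invalidation on config reload

Reasoning: reload raced with in-flight lookups
Constraint: must not add a lock on the hot path
Alternative: considered copy-on-write; rejected for memory cost
"""
print(parse_trailers(msg)["Reasoning"])
```

Git itself ships `git interpret-trailers` for reading and writing these lines, which is presumably why a trailer-based protocol needs no new tooling on the producer side.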

AI · Bullish · MarkTechPost · Mar 14 · 6/10

Garry Tan Releases gstack: An Open-Source Claude Code System for Planning, Code Review, QA, and Shipping

Garry Tan has released gstack, an open-source toolkit that enhances AI-assisted coding by organizing Claude Code into 8 distinct workflow skills for product planning, engineering review, QA, and shipping. The system aims to improve coding reliability by separating different development phases into specialized operating modes with persistent browser runtime support.

🧠 Claude
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10

XAI for Coding Agent Failures: Transforming Raw Execution Traces into Actionable Insights

Researchers developed an explainable AI (XAI) system that transforms raw execution traces from LLM-based coding agents into structured, human-interpretable explanations. The system enables users to identify failure root causes 2.8 times faster and propose fixes with 73% higher accuracy through domain-specific failure taxonomy, automatic annotation, and hybrid explanation generation.

AI · Bullish · TechCrunch – AI · Mar 5 · 6/10

Cursor is rolling out a new kind of agentic coding tool

Cursor is launching Automations, a new agentic coding tool that automatically deploys AI agents within development environments. The system can be triggered by codebase changes, Slack messages, or timers to enhance automated development workflows.

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

FeedAIde: Guiding App Users to Submit Rich Feedback Reports by Asking Context-Aware Follow-Up Questions

FeedAIde is a new AI-powered mobile app feedback system that uses Multimodal Large Language Models to guide users through submitting detailed bug reports and feature requests. The iOS framework captures contextual information like screenshots and asks follow-up questions to improve feedback quality, with testing showing enhanced completeness compared to traditional feedback forms.

AI · Bearish · arXiv – CS AI · Mar 3 · 7/10

Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement

Research reveals that Large Language Models (LLMs) systematically fail at code review tasks, frequently misclassifying correct code as defective when matching implementations to natural language requirements. The study found that more detailed prompts actually increase misjudgment rates, raising concerns about LLM reliability in automated development workflows.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair

RepoRepair is a new AI-powered automated program repair system that uses hierarchical code documentation to fix bugs across entire software repositories. The system achieves a 45.7% repair rate on SWE-bench Lite at $0.44 per fix by leveraging LLMs like DeepSeek-V3 and Claude-4 for fault localization and code repair.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision

Researchers developed a new inference-time safety mechanism for code-generating AI models that uses retrieval-augmented generation to identify and fix security vulnerabilities in real-time. The approach leverages Stack Overflow discussions to guide AI code revision without requiring model retraining, improving security while maintaining interpretability.

Page 1 of 2