AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers propose MAAD (Multi-Agent Architecture Design), a framework using orchestrated AI agents with external knowledge and hierarchical memory to automate software architecture design from requirements. The system outperforms existing approaches and demonstrates that advanced LLMs significantly improve architectural quality and validation efficiency.
🧠 GPT-5
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠Researchers introduce CADTestBench, the first test-based evaluation framework for Text-to-CAD systems that uses executable software tests to verify whether AI-generated CAD models meet geometric and topological requirements. The framework enables both comprehensive benchmarking of existing methods and improved model generation through test-guided approaches, addressing a significant gap in CAD model evaluation methodology.
🏢 Hugging Face
AI · Bearish · arXiv – CS AI · 6d ago · 6/10
🧠Researchers introduce BIM-Edit, a benchmark that evaluates how well large language models edit existing Building Information Models in IFC format from natural language instructions. It reveals significant capability gaps: the best-performing LLM reaches only 49.5% accuracy, and no model solves more than 3.4% of tasks, which suggests that current AI systems struggle with the semantic preservation and relational understanding required for professional engineering workflows.
AI · Neutral · arXiv – CS AI · Jun 10 · 6/10
🧠Researchers introduce Architect-Ant, an AI system that automatically furnishes architectural floor plans using a fine-tuned vision-language model and a new dataset of 270 professionally designed floor plans. The framework generates furniture layouts as editable symbolic code that can be rendered into realistic images while maintaining spatial validity and functional plausibility.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce RTL-BenchLS, a large-scale benchmark containing over 10,000 formally verified Verilog designs for evaluating large language models on hardware design tasks. The benchmark addresses limitations of existing datasets through three novel self-supervised tasks beyond specification-to-RTL generation, with top models achieving only 12-28% accuracy, demonstrating substantial room for improvement in LLM-based hardware automation.
AI · Bullish · arXiv – CS AI · Jun 8 · 6/10
🧠Researchers introduce DxPTA, a design space exploration methodology for optimizing photonic transformer accelerators (PTAs) through hardware/software co-design. The approach automatically identifies optimal PTA architectures for AI models like DeiT and BERT while meeting area, power, energy, and latency constraints, achieving 15.2x faster design exploration than exhaustive methods.
AI · Neutral · arXiv – CS AI · Jun 5 · 6/10
🧠PerceptUI is a new AI framework that uses persona-conditioned large language models to evaluate user interfaces by simulating how specific users would respond to UX questions. The system achieves human-level accuracy through contrastive learning and prompt evolution, potentially accelerating product development by reducing reliance on costly human testing and A/B tests.
AI · Neutral · arXiv – CS AI · Jun 5 · 6/10
🧠Researchers introduce NIV (Neural Axis Variations), an AI method that automatically converts static fonts into variable fonts by predicting per-point glyph displacements across design axes like weight and width. Trained on over one million font variations from Google Fonts, the model generalizes across unseen fonts, scripts, and even handwriting, with outputs compatible with standard rendering engines.
AI · Bullish · arXiv – CS AI · Jun 4 · 6/10
🧠HighTide is an open-source AI-assisted VLSI benchmark suite designed to standardize hardware design testing across multiple languages and technology nodes. The platform combines automated compilation infrastructure with AI agent curation to streamline chip design workflows and maintain long-term optimization records.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose using multi-embodiment value functions trained across diverse robot designs as reusable models for optimizing future robot morphologies without retraining. By leveraging value gradients from frozen neural networks, this approach enables efficient design optimization across hundreds of continuous parameters and can identify performance-critical design choices.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce MUSE, a new benchmark for evaluating text-to-CAD generation that moves beyond simple geometry matching to assess the manufacturability, functionality, and assemblability of complex 3D assemblies. Current LLM-based CAD generation systems fall well short when evaluated against practical engineering requirements, revealing a critical gap between geometric generation and production-ready design.
AI · Bullish · arXiv – CS AI · May 27 · 6/10
🧠Researchers propose RulePlanner, a deep reinforcement learning framework that unifies the handling of complex hardware design rules in 3D integrated circuit floorplanning. The approach addresses a critical bottleneck in chip design by automating compliance with multiple design rules simultaneously, reducing manual post-processing and accelerating the path from design to manufacturing.
AI · Neutral · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers investigate how multimodal large language models (MLLMs) can assist with usability evaluation of user interfaces by analyzing text and visual context together. The study compares MLLM-generated assessments against expert evaluations, finding that these models can effectively prioritize usability issues by severity and offer complementary insights to traditional resource-intensive evaluation methods.
AI · Neutral · arXiv – CS AI · Apr 7 · 6/10
🧠Researchers introduce GraphicDesignBench (GDB), the first comprehensive benchmark suite for evaluating AI models on professional graphic design tasks including layout, typography, and animation. Testing reveals current AI models struggle with spatial reasoning, vector code generation, and typographic precision despite showing promise in high-level semantic understanding.
AI · Neutral · Apple Machine Learning · Feb 27 · 4/10 · 2
🧠Researchers introduce 'distinguishing variations' to help front-end developers better instantiate UI components by combining symbolic inference with design-space sampling. This approach aims to generate variations that are both realistic and visually distinct, addressing the complexity of configuring multiple component properties.