AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠MatMind is a generative foundation model designed for crystal materials science that unifies structure prediction, property forecasting, and material design within a single LLM-based framework. The model surpasses specialized graph neural networks on benchmark tasks while achieving 65.3% success on crystal generation, demonstrating that unified AI architectures can compete with purpose-built narrow specialists.
AI · Bullish · arXiv – CS AI · 5d ago · 7/10
🧠Researchers introduce MicroSkill Architecture, a modular framework that organizes AI coding knowledge into atomic skill capsules rather than feeding entire codebases to language models. The approach reduces token consumption by 90%, doubles compilation success rates, and eliminates architectural violations in enterprise systems.
AI · Neutral · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers establish fundamental information-theoretic limits on decoder-only transformer attention for state-tracking tasks, proving extended reasoning degrades performance beyond a 'Deterministic Horizon' of 19-31 steps. Tool delegation consistently outperforms neural chain-of-thought across 12 models (86-94% vs 24-42% accuracy), suggesting hybrid agentic systems require external tools rather than pure neural reasoning for complex deterministic tasks.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers introduce MemPro, an evolution framework that treats autonomous agent memory systems as adaptable programs rather than static pipelines. By iteratively diagnosing failures and refining the entire memory-construction-retrieval pipeline, MemPro outperforms fixed baselines on multiple benchmarks while maintaining computational efficiency.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers propose the Intelligent Computing Architecture Model (ICAM), a six-layer framework that applies classical computer architecture principles to large language models and agentic AI systems. The paper maps recurring engineering challenges—cache reuse, context management, agent scheduling, and permission control—to traditional systems problems, introducing three design laws to optimize model-native computing efficiency and coordination.
🧠 Claude
AI · Bullish · arXiv – CS AI · May 28 · 7/10
🧠Researchers propose Periodic RoPE (P-RoPE), a novel positional encoding mechanism that combines sliding window attention for local dependencies with global attention layers lacking positional constraints, enabling language models to theoretically support infinite context windows without performance degradation. The approach addresses a fundamental limitation in current LLMs where model performance degrades when sequence length exceeds the pre-trained range of positional encodings like RoPE.
AI · Bearish · arXiv – CS AI · May 12 · 7/10
🧠A new research position argues that enterprises should stop treating large language models as monolithic solutions for all tasks and instead use them primarily for structured data extraction within modular architectures. The paper contends that LLMs have inherent capacity limits for enterprise knowledge needs and proposes delegating computation and storage to specialized components like knowledge bases and symbolic systems for better reliability and cost efficiency.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers propose Agent Cybernetics, a theoretical framework applying mid-20th century control systems theory to modern LLM-based AI agents. The framework addresses critical gaps in how foundation agents are designed, offering scientific principles for reliability, continuous operation, and safe self-improvement across long-horizon tasks.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers demonstrate that transformer models equipped with continuous latent context tokens can efficiently implement online learning algorithms without parameter updates. A small GPT-2-style model trained with this approach outperforms much larger language models on synthetic online prediction tasks, suggesting a promising architectural direction for adaptive AI systems.
AI · Bullish · arXiv – CS AI · May 7 · 7/10
🧠Researchers introduce Lossless Context Management (LCM), a deterministic architecture for LLM memory that outperforms Claude Code on long-context tasks up to 1M tokens. LCM combines recursive context compression with engine-managed task partitioning, representing an evolution of recursive language models that prioritizes reliability and state retrievability over flexibility.
🧠 Claude · 🧠 Opus
AI · Bearish · arXiv – CS AI · May 4 · 7/10
🧠A new research study reveals that large language models struggle to effectively use representations they learn from in-context information, even though they can encode this information internally. The findings suggest current LLMs have fundamental limitations in adapting to novel contexts, affecting their ability to generalize learned patterns to downstream tasks.
AI · Bullish · arXiv – CS AI · Apr 20 · 7/10
🧠Researchers introduce CoMeT (Collaborative Memory Transformer), a novel architecture that enables large language models to process arbitrarily long sequences with constant memory usage and linear time complexity. The system uses a dual-memory approach with FIFO queues and gated updates, demonstrating remarkable performance on long-context tasks including 1M token sequences and real-world applications.
AI · Neutral · arXiv – CS AI · 11h ago · 6/10
🧠Researchers introduce Whisper-GPT, a hybrid language model that combines continuous audio representations (spectrograms) with discrete acoustic tokens to improve speech and music generation. This approach addresses context length limitations in traditional token-based models while maintaining high-fidelity audio synthesis capabilities.
🏢 Perplexity
AI · Neutral · arXiv – CS AI · 11h ago · 6/10
🧠Researchers introduce UniTok, a universal tokenizer that converts continuous time series data into discrete tokens, and UniTok-FM, a foundation model pretrained on those tokens via next-token prediction. This unified approach supports forecasting, generation, and classification without task-specific modifications, achieving performance competitive with specialized models and supporting zero-shot and few-shot inference.
AI · Neutral · arXiv – CS AI · 1d ago · 6/10
🧠Researchers have formulated Transformer data propagation as a nonlinear control system and proven that Gaussian distributions remain Gaussian through the network's layers. This reduces infinite-dimensional dynamics to finite-dimensional equations governing mean and covariance evolution, connecting Transformer expressiveness to classical control theory and revealing conditions for stability or divergence.
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠Researchers present SearchSwarm, a framework that trains large language models to intelligently delegate complex tasks to subagents while managing finite context windows. The resulting 30B-parameter model achieves state-of-the-art performance on research benchmarks by learning when and what to delegate, addressing a critical bottleneck in agentic AI systems.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers compare three orchestration approaches for AI agents handling customer-service workflows: declarative agents using natural-language skill files, imperative agents with programmatic state machines, and unscaffolded baseline agents. The study finds that retrieval quality is the dominant bottleneck, and declarative skills improve performance on procedural tasks only when evidence quality is high.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers analyzing large language models find that loss scales inversely with network depth, suggesting most layers function similarly and reduce error through ensemble averaging rather than compositional learning. This inefficient scaling pattern may stem from architectural constraints in residual networks, indicating that improving LLM efficiency requires fundamental architectural innovations rather than simply adding more layers.
AI · Bullish · arXiv – CS AI · Jun 1 · 6/10
🧠PhyDrawGen is a neuro-symbolic AI system that generates physics diagrams from natural language text while maintaining strict physical accuracy. By combining large language models, deterministic solvers, and vision-language models in a pipeline, it overcomes the hallucination problems of current generative models and outperforms GPT-4, Gemini 2.5, and Gemini 3 Pro on physics problems spanning mechanics, optics, and electromagnetism.
🧠 GPT-5 · 🧠 Gemini
AI · Bullish · arXiv – CS AI · Jun 1 · 6/10
🧠Researchers propose an alternative to deep neural networks for large language models, based on Radial Basis Function (RBF) networks, that they claim finds optimal solutions in closed form without iterative training. The approach promises improved explainability and accuracy while eliminating the computationally expensive training process required by traditional DNNs.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠Researchers introduce Think Fast, Talk Smart, a hybrid system that combines deterministic computation with bounded LLM calls for generating health text from structured data. The approach achieves lower errors and costs than pure LLM-based alternatives by reserving neural computation for expression tasks while delegating analysis, comparison, and ranking to deterministic code.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠Researchers propose a modular architecture for educational AI chatbots designed to enforce pedagogical principles and prevent negative learning outcomes. The approach addresses structural limitations in current monolithic LLM solutions by incorporating targeted modules at different exercise-solving stages, enabling more transparent and controlled student guidance.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers present a modular LLM-based architecture for detecting and quantifying human values in text, addressing the need for ethical decision-making in autonomous AI systems. The approach separates value conceptualization from detection, enabling scalable application across different ethical frameworks and demonstrating strong performance on the ValueEval dataset.
AI · Bullish · arXiv – CS AI · May 28 · 6/10
🧠Researchers introduce HGMem, a hypergraph-based working memory system that enhances multi-step retrieval-augmented generation (RAG) for large language models by modeling complex relational dependencies among facts. Unlike traditional RAG systems that treat memory as passive storage, HGMem dynamically structures information as interconnected high-order relationships, demonstrating improved performance on global sense-making benchmarks requiring complex reasoning across extended contexts.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce DARE, a technique that reduces computational redundancy in Diffusion Language Models by reusing cached attention activations across tokens. The method achieves up to 1.20x per-layer latency improvements while maintaining generation quality, addressing efficiency gaps between diffusion-based and auto-regressive language models.