#llm-architecture News & Analysis

38 articles tagged with #llm-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Identifying and Understanding Human Values in Text: A Tailorable LLM-based Architecture

Researchers present a modular LLM-based architecture for detecting and quantifying human values in text, addressing the need for ethical decision-making in autonomous AI systems. The approach separates value conceptualization from detection, enabling scalable application across different ethical frameworks and demonstrating strong performance on the ValueEval dataset.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

HGMem: Hypergraph-based Working Memory to Improve Multi-step RAG for Long-Context Complex Relational Modeling

Researchers introduce HGMem, a hypergraph-based working memory system that enhances multi-step retrieval-augmented generation (RAG) for large language models by modeling complex relational dependencies among facts. Unlike traditional RAG systems that treat memory as passive storage, HGMem dynamically structures information as interconnected high-order relationships, demonstrating improved performance on global sense-making benchmarks requiring complex reasoning across extended contexts.
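
To make the idea of high-order relationships concrete, here is a minimal sketch of a hypergraph used as a lookup structure, where one hyperedge can link any number of facts at once. It is not HGMem's implementation: the class `HypergraphMemory`, its methods, and the example facts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class HypergraphMemory:
    """Toy working memory: each hyperedge links any number of facts."""

    def __init__(self):
        self.hyperedges = {}                   # edge name -> frozenset of facts
        self.fact_to_edges = defaultdict(set)  # fact -> names of edges containing it

    def add_relation(self, name, facts):
        """Store one high-order relation joining all the given facts."""
        members = frozenset(facts)
        self.hyperedges[name] = members
        for fact in members:
            self.fact_to_edges[fact].add(name)

    def related_facts(self, fact):
        """Return every fact that shares at least one hyperedge with `fact`."""
        related = set()
        for name in self.fact_to_edges.get(fact, ()):
            related |= self.hyperedges[name]
        related.discard(fact)
        return related


# Made-up facts, for illustration only.
memory = HypergraphMemory()
memory.add_relation("merger", ["Company A", "Company B", "2021 deal"])
memory.add_relation("lawsuit", ["Company B", "Regulator C"])
neighbors = memory.related_facts("Company B")
print(sorted(neighbors))
```

A pairwise graph would need three separate edges to record the three-way "merger" relation; a hyperedge keeps it as a single unit, which is the property the summary refers to as high-order relationships.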

AI · Neutral · arXiv – CS AI · May 12 · 6/10

DARE: Diffusion Language Model Activation Reuse for Efficient Inference

Researchers introduce DARE, a technique that reduces computational redundancy in Diffusion Language Models by reusing cached attention activations across tokens. The method achieves up to a 1.20x per-layer latency improvement while maintaining generation quality, narrowing the efficiency gap between diffusion-based and auto-regressive language models.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

LLM Translation of Compiler Intermediate Representation

Researchers introduce IRIS-14B, a 14-billion-parameter LLM fine-tuned to translate compiler intermediate representations between GCC's GIMPLE and LLVM IR, achieving up to 44 percentage points higher accuracy than existing state-of-the-art models. The approach demonstrates how LLMs can function as interoperability layers in hybrid compiler architectures, enabling cross-toolchain workflows without modifying existing compiler infrastructure.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Hierarchical Mixture-of-Experts with Two-Stage Optimization

Researchers introduce Hi-MoE, a hierarchical Mixture-of-Experts framework that addresses a fundamental routing trade-off in sparse MoE models through two-stage optimization: inter-group load balancing and intra-group expert specialization. Tested on large-scale NLP and vision tasks, Hi-MoE achieves a 5.6% perplexity improvement and better expert balance than existing methods.

🏢 Meta · 🏢 Perplexity
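
As a rough illustration of what two-level routing can look like, the sketch below picks a group first and then an expert inside that group, and measures how evenly tokens spread across groups. It is a generic top-1 example, not Hi-MoE's actual routing or training objective; the function names (`route_hierarchical`, `group_load`) and the router scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_hierarchical(group_scores, expert_scores):
    """Pick a group first, then an expert inside it (top-1 at both levels).

    group_scores: one router score per group.
    expert_scores: expert_scores[g] holds the scores for the experts in group g.
    Returns (group index, expert index, combined gate weight).
    """
    group_probs = softmax(group_scores)
    g = max(range(len(group_probs)), key=group_probs.__getitem__)
    expert_probs = softmax(expert_scores[g])
    e = max(range(len(expert_probs)), key=expert_probs.__getitem__)
    return g, e, group_probs[g] * expert_probs[e]


def group_load(assignments, num_groups):
    """Fraction of tokens sent to each group; a uniform result means balanced."""
    counts = [0] * num_groups
    for g, _, _ in assignments:
        counts[g] += 1
    return [c / len(assignments) for c in counts]


# Hypothetical router scores for three tokens, two groups of two experts each.
tokens = [
    ([2.0, 0.5], [[0.1, 1.2], [0.3, 0.3]]),
    ([0.2, 1.5], [[0.0, 0.0], [2.0, 0.1]]),
    ([1.0, 0.9], [[3.0, 0.5], [0.4, 0.6]]),
]
assignments = [route_hierarchical(gs, es) for gs, es in tokens]
load = group_load(assignments, num_groups=2)
print([(g, e) for g, e, _ in assignments], load)
```

In a setup like this, a balancing objective would act on the per-group load, while specialization would shape the expert scores within each group. How Hi-MoE defines and optimizes either stage is not stated in the summary.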

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Response-G1: Explicit Scene Graph Modeling for Proactive Streaming Video Understanding

Response-G1 introduces a novel framework for real-time video understanding that uses explicit scene graphs to align video evidence with query-specific response conditions, enabling Video-LLMs to make more accurate timing decisions during streaming video analysis without requiring fine-tuning.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Cross-Attention and Encoder-Decoder Transformers: A Logical Characterization

Researchers present a novel logical framework for understanding encoder-decoder transformers using temporal logic extended with counting and past modalities. The work provides theoretical foundations for how these architectures process information across attention mechanisms, with implications for LLM interpretability and design.

AI · Neutral · arXiv – CS AI · May 7 · 6/10

Emergent Hierarchical Structure in Large Language Models: An Information-Theoretic Framework for Multi-Scale Representation

Researchers report that large language models develop distinct hierarchical processing stages (Local, Intermediate, Global) determined by architecture family rather than model size. Using information theory, they show that Llama and Qwen models exhibit markedly different brittleness patterns across layers, pointing to architectural design, rather than scaling, as the primary driver of model behavior.

🧠 Llama

AI · Neutral · arXiv – CS AI · May 1 · 6/10

When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks

Researchers demonstrate that Large Language Models perform significantly better on 2D structured tasks when given visual representations rather than serialized text inputs. The study reveals that converting 2D data into 1D token sequences creates representational friction that degrades model performance, with gaps widening as task complexity increases.
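
A toy example shows one way such friction can arise. The sketch below is not taken from the paper: it assumes a plain row-major flattening, and the function names (`serialize_row_major`, `serialized_distance`) are hypothetical. It shows that two cells that are vertical neighbors in a grid end up a full row width apart in the 1D sequence.

```python
def serialize_row_major(grid):
    """Flatten a 2D grid into a 1D list, row by row."""
    return [cell for row in grid for cell in row]


def serialized_distance(width, a, b):
    """Distance between two (row, col) cells after row-major flattening."""
    (r1, c1), (r2, c2) = a, b
    return abs((r1 * width + c1) - (r2 * width + c2))


grid = [
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
    ["i", "j", "k", "l"],
]
width = len(grid[0])
flat = serialize_row_major(grid)

horizontal = serialized_distance(width, (0, 0), (0, 1))  # "a" and "b"
vertical = serialized_distance(width, (0, 0), (1, 0))    # "a" and "e"
print(flat)
print(horizontal, vertical)
```

Both pairs are adjacent in 2D, but only the horizontal pair stays adjacent after flattening. The vertical gap grows with the grid width, which is consistent with the summary's observation that the gap widens as tasks get more complex.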

AI · Neutral · arXiv – CS AI · May 1 · 6/10

TiMem: Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents

Researchers introduce TiMem, a temporal-hierarchical memory framework that helps conversational AI agents manage long conversation histories beyond LLM context limits. The system organizes interactions through a Temporal Memory Tree, achieving state-of-the-art performance on memory recall benchmarks while reducing memory overhead by over 50%.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture

Researchers present EMBER, a hybrid architecture combining spiking neural networks with large language models where the SNN acts as a persistent, biologically-inspired memory substrate that autonomously triggers LLM reasoning. The system demonstrates emergent autonomous behavior, initiating unprompted user contact after learning associations during idle periods, suggesting a fundamental shift in how AI systems could coordinate cognition and action.

AI · Neutral · arXiv – CS AI · Apr 10 · 6/10

SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems

SymptomWise introduces a deterministic reasoning framework that separates language understanding from diagnostic inference in AI-driven medical systems, combining expert-curated knowledge with constrained LLM use to improve reliability and reduce hallucinations. The system achieved 88% accuracy in placing correct diagnoses in top-five differentials on challenging pediatric neurology cases, demonstrating how structured approaches can enhance AI safety in critical domains.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 15

Understanding In-Context Learning Beyond Transformers: An Investigation of State Space and Hybrid Architectures

Researchers conducted an in-depth analysis of in-context learning capabilities across different AI architectures, including transformers, state-space models, and hybrid systems. The study finds that while these models perform similarly on in-context learning tasks, their internal mechanisms differ significantly, with function vectors playing key roles in self-attention and Mamba layers.
