y0news

#encoder-decoder News & Analysis

9 articles tagged with #encoder-decoder. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · Google DeepMind Blog · Oct 25 · 7/10 · 6

T5Gemma: A new collection of encoder-decoder Gemma models

Google introduces T5Gemma, a new collection of encoder-decoder large language models (LLMs) based on the Gemma architecture. This represents an expansion of Google's Gemma model family to include encoder-decoder capabilities alongside the existing decoder-only models.

AI · Neutral · arXiv – CS AI · 23h ago · 6/10

Block-Based Double Decoders

Researchers propose block-based double decoders, a transformer architecture that combines the training efficiency of decoder-only models with the inference speed advantages of encoder-decoder models. The design uses doubly-causal block-based attention masks to enable full loss supervision and static sequence packing, achieving a two-thirds reduction in KV-cache memory and per-token compute at inference time.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

An End-to-End Learning Approach for Solving Capacitated Location-Routing Problems

Researchers propose DRLHQ, a deep reinforcement learning approach with heterogeneous query attention mechanisms for solving capacitated location-routing problems (CLRPs) and their open variants. The authors describe it as the first end-to-end learning framework for CLRPs and report that it outperforms traditional and DRL-based baselines on benchmark datasets.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Rethinking Constraint Awareness for Efficient State Embedding of Neural Routing Solver

Researchers propose Constraint-Aware Residual Modulation (CARM), a neural module that improves how AI solvers handle complex vehicle routing problems by maintaining global observation during constraint-aware decision-making. The module delivers significant performance improvements across multiple routing problem variants and scales.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Cross-Attention and Encoder-Decoder Transformers: A Logical Characterization

Researchers present a novel logical framework for understanding encoder-decoder transformers using temporal logic extended with counting and past modalities. The work provides theoretical foundations for how these architectures process information across attention mechanisms, with implications for LLM interpretability and design.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

Why Self-Supervised Encoders Want to Be Normal

Researchers develop a theoretical framework connecting Information Bottleneck principles to encoder-decoder learning through rate-distortion analysis, showing optimal representations form soft clusters on probability manifolds. The work introduces Sketched Isotropic Gaussian Regularization (SIGReg) as a principled regularizer for self-supervised, semi-supervised, and supervised learning without requiring variational bounds.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8

Mamba-CAD: State Space Model For 3D Computer-Aided Design Generative Modeling

Researchers introduce Mamba-CAD, a state space model built on the Mamba architecture for generating complex 3D CAD models from parametric sequences. The model addresses limitations in handling longer, fine-grained industrial CAD sequences through an encoder-decoder framework paired with GANs, and is trained on a new dataset of 77,078 CAD models.

AI · Neutral · Hugging Face Blog · Oct 10 · 1/10 · 6

Transformer-based Encoder-Decoder Models

The title refers to Transformer-based encoder-decoder models, a fundamental AI architecture used in natural language processing and machine learning. No article body was available, so no summary of specific details, applications, or implications is provided.