AI · Neutral · Google DeepMind Blog · Oct 25 · 7/10 · 6
🧠Google introduces T5Gemma, a new collection of encoder-decoder large language models (LLMs) based on the Gemma architecture. This represents an expansion of Google's Gemma model family to include encoder-decoder capabilities alongside the existing decoder-only models.
AI · Neutral · arXiv – CS AI · Jun 2 · 5/10
🧠Researchers propose an auxiliary reconstruction module to improve encoder representations in neural algorithmic reasoning systems. By forcing encoders to reconstruct input states and capture feature dependencies, the method enhances the performance of existing neural architectures on algorithmic reasoning benchmarks.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠A comprehensive survey introduces graph neural networks (GNNs) through an encoder-decoder framework, demonstrating their effectiveness across various graph analytics tasks. The paper emphasizes critical challenges like oversmoothing and oversquashing in GNN training, providing experimental insights on how network performance scales with training data and graph complexity.
AI · Neutral · arXiv – CS AI · Jun 1 · 6/10
🧠Researchers propose block-based double decoders, a transformer architecture that combines the training efficiency of decoder-only models with the inference speed advantages of encoder-decoder models. The architecture uses doubly-causal block-based attention masks to enable full loss supervision and static sequence packing, achieving a two-thirds reduction in KV-cache memory and per-token compute at inference time.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠Researchers propose DRLHQ, a deep reinforcement learning approach with heterogeneous query attention mechanisms to solve capacitated location-routing problems (CLRPs) and their open variants. This marks the first end-to-end learning framework for CLRPs, demonstrating superior performance over traditional and DRL-based baselines on benchmark datasets.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose Constraint-Aware Residual Modulation (CARM), a neural module that improves how AI solvers handle complex vehicle routing problems by maintaining global observation during constraint-aware decision-making. The authors report significant performance improvements across multiple routing problem variants, along with better scaling.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers present a novel logical framework for understanding encoder-decoder transformers using temporal logic extended with counting and past modalities. The work provides theoretical foundations for how these architectures process information across attention mechanisms, with implications for LLM interpretability and design.
AI · Neutral · arXiv – CS AI · May 1 · 6/10
🧠Researchers develop a theoretical framework connecting Information Bottleneck principles to encoder-decoder learning through rate-distortion analysis, showing optimal representations form soft clusters on probability manifolds. The work introduces Sketched Isotropic Gaussian Regularization (SIGReg) as a principled regularizer for self-supervised, semi-supervised, and supervised learning without requiring variational bounds.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠Researchers introduce Mamba-CAD, a state space model built on the Mamba architecture for generating complex 3D CAD models from parametric sequences. The model addresses limitations in handling longer, fine-grained industrial CAD sequences through an encoder-decoder framework paired with GANs, and is trained on a new dataset of 77,078 CAD models.
AI · Neutral · Hugging Face Blog · Nov 9 · 1/10 · 7
🧠The article title suggests content about leveraging pre-trained language model checkpoints for encoder-decoder models, but no article body was provided for analysis.
AI · Neutral · Hugging Face Blog · Oct 10 · 1/10 · 6
🧠The article title references Transformer-based Encoder-Decoder Models, a fundamental AI architecture used in natural language processing and machine learning. However, no article body content was provided to analyze specific details, applications, or implications.