AI · Neutral · arXiv – CS AI · 7h ago · 6/10
Augmenting Molecular Language Models with Local $n$-gram Memory
Researchers introduce MolGram, a neural architecture that enhances transformer-based language models for molecular SMILES strings by integrating a conditional n-gram memory module. This approach addresses the locality gap in character-level tokenization, enabling models to better capture chemical motifs while improving performance across molecule generation, reaction prediction, and retrosynthesis tasks with significantly fewer parameters than baseline models.
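To make the idea of local n-grams over character-level SMILES concrete, here is a minimal sketch of how such n-grams might be enumerated. It is an illustration only, not MolGram's memory module: the function names `char_ngrams` and `local_context_keys` and the choice of n = 2..3 are assumptions, and the summary above does not specify how the actual model builds or conditions on its memory.

```python
from collections import Counter


def char_ngrams(smiles: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of a SMILES string."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def local_context_keys(smiles: str, max_n: int = 3) -> list[list[str]]:
    """For each position, list the n-grams (n = 2..max_n) that end there.

    Hypothetical helper: keys like these could index a lookup table
    whose entries are combined with the transformer's hidden states.
    """
    keys = []
    for i in range(len(smiles)):
        keys.append(
            [smiles[i - n + 1:i + 1] for n in range(2, max_n + 1) if i - n + 1 >= 0]
        )
    return keys


smiles = "CC(=O)O"  # acetic acid
trigram_counts = Counter(char_ngrams(smiles, 3))
per_position = local_context_keys(smiles, max_n=3)
```

Short fragments such as `(=O` or `=O)` recur across many molecules, which is the kind of local chemical motif that a character-level tokenizer splits into separate tokens.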