Learning Compositional Latent Structure with Vector Networks
Researchers introduce Vector Networks (VN), a neural architecture that replaces dense weight matrices with libraries of reusable rank-1 weight atoms, enabling selective composition of network components for novel tasks. The approach demonstrates significant out-of-distribution generalization improvements—up to an order of magnitude better than baselines—when familiar elements must be recombined in new ways, addressing a fundamental limitation in deep learning's ability to handle compositional reasoning.
Vector Networks change how deep neural networks organize and reuse learned computations. Traditional architectures entangle many behavioral patterns within shared dense weight matrices, which makes it hard to extract and recombine individual components when inputs combine in novel ways. VN addresses this with a hierarchical recurrent architecture in which each layer maintains a library of rank-1 weight atoms; for each input, the network selectively activates atoms by minimizing an energy function. Compositional structure is therefore enforced at the architectural level rather than left to emerge from training.
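To make the idea of a weight matrix built from rank-1 atoms concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the library arrays `U` and `V`, the helper names `compose_weight` and `apply_atoms`, and all sizes are hypothetical. It shows only that a sparse coefficient vector over rank-1 atoms defines an effective weight matrix, and that the matrix never has to be formed explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_atoms = 8, 6, 16  # hypothetical sizes, chosen for illustration

# Hypothetical atom library: atom k is the rank-1 matrix outer(U[k], V[k]).
U = rng.standard_normal((n_atoms, d_out))
V = rng.standard_normal((n_atoms, d_in))

def compose_weight(coeffs):
    """Sum the rank-1 atoms, weighted by coeffs, into one (d_out, d_in) matrix."""
    return np.einsum("k,ko,ki->oi", coeffs, U, V)

def apply_atoms(coeffs, x):
    """Same linear map as compose_weight(coeffs) @ x, using only active atoms."""
    active = np.flatnonzero(coeffs)
    # Project x onto each active input vector, scale, then expand on the output side.
    return (coeffs[active] * (V[active] @ x)) @ U[active]

# Only three of the sixteen atoms are active for this (hypothetical) input.
coeffs = np.zeros(n_atoms)
coeffs[[2, 5, 11]] = [0.7, -1.2, 0.4]

x = rng.standard_normal(d_in)
W = compose_weight(coeffs)
y_dense = W @ x
y_sparse = apply_atoms(coeffs, x)
```

With three active atoms the composed matrix has rank at most three, and the sparse path costs work proportional to the number of active atoms rather than to the size of the dense matrix.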
The significance lies in addressing compositional generalization, a persistent weakness in current deep learning systems. When neural networks encounter familiar concepts in unfamiliar arrangements, performance typically degrades sharply. VN's design—where weight coefficients are jointly constrained by bottom-up input reconstruction and top-down feedback consistency—forces the network to learn modular, reusable components. The sparse inference mechanism ensures only relevant atoms activate per sample, maintaining computational efficiency while enabling compositional flexibility.
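The summary does not spell out the energy function, so the sketch below substitutes a standard technique: ISTA (iterative soft-thresholding) on a sparse-coding objective. The energy combines a bottom-up reconstruction term, a top-down consistency term pulling the coefficients toward a prediction `c_top`, and an L1 penalty that drives most coefficients to exactly zero. The dictionary `D`, the weights `lam` and `beta`, and every function name here are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrinks toward zero, giving exact zeros."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def energy(c, x, D, c_top, lam, beta):
    """Assumed energy: reconstruction + top-down consistency + sparsity penalty."""
    recon = 0.5 * np.sum((x - D @ c) ** 2)
    feedback = 0.5 * beta * np.sum((c - c_top) ** 2)
    return recon + feedback + lam * np.sum(np.abs(c))

def infer_coeffs(x, D, c_top, lam=0.1, beta=0.5, n_steps=200):
    """Run ISTA from c = 0 to approximately minimize the assumed energy."""
    # Step size 1/L, where L bounds the curvature of the smooth terms.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + beta)
    c = np.zeros(D.shape[1])
    for _ in range(n_steps):
        grad = D.T @ (D @ c - x) + beta * (c - c_top)
        c = soft_threshold(c - step * grad, step * lam)
    return c

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms (columns)

c_true = np.zeros(50)
c_true[3], c_true[17] = 1.5, -2.0
x = D @ c_true                           # input built from two atoms

c_top = np.zeros(50)
c_top[3], c_top[17] = 1.0, -1.0          # rough top-down guess (hypothetical)

c = infer_coeffs(x, D, c_top)
n_active = np.count_nonzero(c)           # atoms left active for this sample
```

Because the soft-threshold step sets small coefficients to exactly zero, only a subset of atoms remains active for a given input, which is the property the paragraph above attributes to VN's sparse inference.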
Benchmark results across 1D signals, 2D spatial decoding, N-body dynamics, and compositional MNIST show VN matching strong baselines on in-distribution tasks while generalizing far better out of distribution, by up to an order of magnitude. This gap suggests the architecture learns compositional structure rather than memorizing patterns. Learning updates only the selected atoms, using local residual signals, which keeps training computationally cheap and is more biologically plausible than updating every weight.
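One way to picture "updating only selected atoms through local residual signals" is a dictionary-learning style step, sketched below. This is an assumption about the general shape of such a rule rather than the paper's exact update: the function `update_selected_atoms`, the learning rate `lr`, and the column-atom layout of `D` are all hypothetical.

```python
import numpy as np

def update_selected_atoms(D, c, x, lr=0.1):
    """Gradient step on 0.5 * ||x - D c||^2 that touches only atoms with c != 0."""
    residual = x - D @ c                 # local reconstruction error for this sample
    active = np.flatnonzero(c)           # atoms selected for this sample
    D_new = D.copy()
    # Each active atom moves along the residual, scaled by its own coefficient.
    D_new[:, active] += lr * np.outer(residual, c[active])
    return D_new, active

rng = np.random.default_rng(0)
D = rng.standard_normal((10, 6))         # six atoms stored as columns
x = rng.standard_normal(10)
c = np.array([0.0, 1.0, 0.0, 0.0, -0.5, 0.0])   # two atoms selected

D_new, active = update_selected_atoms(D, c, x)
err_before = np.linalg.norm(x - D @ c)
err_after = np.linalg.norm(x - D_new @ c)
```

For this rule the residual shrinks by the factor `1 - lr * (c @ c)` on the sample being learned, so the step lowers the error whenever `lr * (c @ c)` lies between 0 and 2, while unselected atoms are left exactly as they were.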
For the broader AI community, VN suggests that compositional generalization need not be left to general-purpose architectures: specialized designs can build it in as a fundamental property. Future work may explore scaling VN to larger domains and testing whether the approach generalizes beyond the problem classes evaluated so far.
- Vector Networks replace fixed weight matrices with libraries of reusable rank-1 atoms, enabling modular network components.
- The architecture achieves up to roughly 10x better out-of-distribution performance than baselines when recombining familiar factors in novel ways.
- Sparse, input-dependent weight inference provides both compositional flexibility and computational efficiency.
- VN makes compositional generalization a structural property rather than an emergent behavior from training.
- Evaluation spans multiple domains, including 1D signals, 2D spatial decoding, N-body dynamics, and compositional MNIST.