Document Classification Pattern Recognition via Information Fusion: A Systematic Review of Multimodal and Multiview Representation Approaches
A systematic review of 139 studies finds that multimodal information fusion improves document classification accuracy by 5.28 percentage points, while multiview approaches provide more modest accuracy gains of 4.67%. The review also identifies critical gaps in methodological rigor: fewer than 24% of studies employ statistical validation, highlighting the need for more robust research standards in the field.
This systematic review addresses a fundamental gap in machine learning research by providing the first quantitative evidence base for information fusion in document classification. The meta-analysis of 139 primary studies reveals that multimodal fusion—integrating multiple data sources—consistently outperforms single-modality approaches with a mean accuracy improvement of 5.28 percentage points, a statistically significant result. Multiview fusion, which leverages different representations of the same data, delivers more modest but consistent gains across metrics.
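To make the idea of fusion concrete, here is a minimal sketch of two common strategies: feature-level (early) fusion, which concatenates feature vectors, and decision-level (late) fusion, which averages class probabilities from separate classifiers. This is an illustration only, not the review's framework or any specific method it evaluated. The function names, feature values, and probabilities are hypothetical. The same code applies to multimodal inputs (for example, text and image features) and to multiview inputs (two representations of the same text).

```python
def early_fusion(features_a, features_b):
    """Feature-level fusion: concatenate two feature vectors into one."""
    return list(features_a) + list(features_b)


def late_fusion(prob_lists, weights=None):
    """Decision-level fusion: weighted average of per-model class probabilities."""
    if weights is None:
        # Assumption: equal weight for every source when none is given.
        weights = [1.0 / len(prob_lists)] * len(prob_lists)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, prob_lists))
        for k in range(n_classes)
    ]


# Hypothetical class probabilities from two single-source classifiers.
text_probs = [0.6, 0.4]
image_probs = [0.2, 0.8]

fused_probs = late_fusion([text_probs, image_probs])
predicted_class = max(range(len(fused_probs)), key=fused_probs.__getitem__)

# Hypothetical feature vectors joined for a single downstream classifier.
fused_features = early_fusion([0.1, 0.2], [0.3])
```

In this toy case the text-only classifier favors class 0, while the fused prediction favors class 1 because the second source is more confident.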
The review was motivated by the recognition that, while information fusion techniques are widely implemented, the field lacks unified frameworks and standardized evaluation methodologies. This fragmentation has prevented practitioners from understanding which approaches work best in specific contexts. The review's framework structures existing knowledge and identifies recurring patterns across diverse applications.
The most concerning finding relates to reproducibility and methodological rigor. Only 11.8% of multimodal studies and 23.3% of multiview studies employed statistical tests to validate their improvements, suggesting that many reported performance gains lack rigorous statistical backing. This lack of validation undermines confidence in published results and hampers informed decision-making for practitioners selecting fusion approaches.
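As an illustration of what such statistical validation can look like, the sketch below applies an exact McNemar test to paired predictions from a baseline and a fusion classifier on the same test documents. The choice of test is an assumption made for this example; the review is not quoted here as prescribing any particular test. The labels and predictions are invented toy data.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test for two classifiers on the same items.

    Returns (b, c, p_value), where b counts items only classifier A gets
    right and c counts items only classifier B gets right.
    """
    b = sum(1 for t, a, f in zip(y_true, pred_a, pred_b) if a == t and f != t)
    c = sum(1 for t, a, f in zip(y_true, pred_a, pred_b) if a != t and f == t)
    n = b + c
    if n == 0:
        # The classifiers never disagree in correctness: no evidence either way.
        return b, c, 1.0
    k = min(b, c)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Invented toy data: 10 documents, binary labels.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
baseline_pred = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]  # 5 of 10 correct
fusion_pred = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]    # 8 of 10 correct

b, c, p_value = mcnemar_exact(y_true, baseline_pred, fusion_pred)
print(f"b={b}, c={c}, p={p_value:.3f}")  # prints "b=1, c=4, p=0.375"
```

Here the fusion model is 30 percentage points more accurate on the toy set, yet the p-value of 0.375 shows that so few documents cannot support a claim of significance. A raw accuracy difference reported without a test leaves exactly this ambiguity.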
For practitioners and researchers, the review's primary insight challenges assumptions about complexity: successful information fusion depends not on sophisticated algorithms but on strategic alignment between fusion methods and task requirements. This data-driven conclusion suggests that resources devoted to algorithmic innovation might be better invested in careful problem analysis and validation rigor. Future work must establish higher standards for statistical validation and reproducibility to build a more trustworthy evidence base.
- Multimodal fusion improves document classification accuracy by 5.28 percentage points with statistical significance.
- Multiview fusion provides consistent but modest gains (4.67% accuracy, 3.08% F1-score) across tasks.
- Only 11.8% of multimodal and 23.3% of multiview studies use statistical validation, indicating reproducibility concerns.
- Fusion effectiveness depends on strategic task alignment rather than algorithmic complexity.
- The field lacks unified frameworks and standardized evaluation methodologies for comparing information fusion approaches.