AI · Neutral · arXiv – CS AI · 14h ago · 6/10
Beyond Statistical Co-occurrence: Unlocking Intrinsic Semantics for Tabular Data Clustering
Researchers introduce TagCC, a deep clustering framework that combines large language models with contrastive learning, drawing on semantic knowledge from feature names and values to improve tabular data clustering. The approach bridges the gap between statistical co-occurrence patterns and intrinsic semantics, and is reported to outperform existing methods by a significant margin in finance and healthcare applications.