#clustering News & Analysis

21 articles tagged with #clustering. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

CRISP -- Clustering-Based Redundancy-Reduced Instance Sampling for Pathology Case Representation and Retrieval

CRISP is an unsupervised machine learning framework that automates the analysis of multiple whole-slide images (WSIs) in digital pathology by selectively sampling informative patches across all slides in a case rather than relying on a single pathologist-selected slide. The approach matches or exceeds current clinical practice for breast cancer diagnosis and retrieval while eliminating subjective slide selection and reducing computational burden.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3

Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Researchers propose a new IMPRINT framework for transfer learning that improves foundation model adaptation to new tasks without parameter optimization. The framework identifies three key components and introduces a clustering-based variant that outperforms existing methods by 4%.

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

Cohort Organized Learning: Clustering Through Agreement

Researchers introduce Cohort Organized Learning (CoOL), a neural network-based clustering method that eliminates the need for explicit distance or similarity calculations. The approach uses expectation maximization to train networks capable of clustering diverse data types including vectors and images, offering a flexible alternative to traditional clustering algorithms.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering

Researchers propose a Bayesian hierarchical model with embedding-space clustering to correct fundamental flaws in LLM benchmarking methodology. The approach addresses two critical issues, insufficient evaluation samples and non-independent test prompts, and reduces the mean absolute error of performance estimates by 4-73%, which is particularly relevant for adversarial robustness evaluation.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation

Researchers introduce ClustRecNet, a deep learning framework that automatically recommends optimal clustering algorithms for datasets by learning from 34,000 synthetic examples. The system outperforms traditional validity indices and AutoML approaches, achieving a 44% improvement over leading competitors on real-world benchmarks.

AI · Neutral · arXiv – CS AI · Jun 2 · 5/10

NILC: Discovering New Intents with LLM-assisted Clustering

Researchers introduce NILC, a novel clustering framework that combines large language models with iterative refinement to improve new intent discovery in dialogue systems. Unlike traditional cascaded approaches relying solely on embedding-based K-Means clustering, NILC leverages LLMs to enhance cluster semantics and augment ambiguous utterances, demonstrating consistent performance gains across multiple benchmark datasets.

AI · Neutral · arXiv – CS AI · May 28 · 5/10

Eliot: Interactively Exploring Fast-Changing Scientific Literature Trends with Online Data and Learning

Researchers present Eliot, an interactive system for exploring evolving scientific literature trends across rapidly changing fields like Large Language Models and Automated Planning. The tool retrieves arXiv papers at query time, clusters them into thematic groups, and visualizes publication patterns over time, with evaluations showing 85% accuracy in meaningful cluster labeling across eight research domains.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

SmartIterator: Visual Analytics Workflows for Supervising Unsupervised Data Grouping

SmartIterator is a visual analytics framework that helps data scientists systematically evaluate and choose between multiple unsupervised learning results across parameter sweeps. The approach operationalizes structured six-phase workflows for three clustering and topic-modeling method families, enabling informed decision-making by visualizing data grouping quality, stability, membership confidence, and domain context simultaneously.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

Enhancing Clustering: An Explainable Approach via Filtered Patterns

Researchers propose a pattern reduction framework for explainable clustering that eliminates redundant k-relaxed frequent patterns (k-RFPs) while maintaining cluster quality. The approach uses formal characterization and optimization strategies to reduce computational complexity in knowledge-driven unsupervised learning systems.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

Silhouette Loss: Differentiable Global Structure Learning for Deep Representations

Researchers introduce Soft Silhouette Loss, a lightweight differentiable objective that improves deep neural network representations by enforcing intra-class compactness and inter-class separation. Combined with cross-entropy and supervised contrastive learning, it reaches 39.08% top-1 accuracy compared to 37.85% for existing methods while reducing computational overhead.
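
For readers unfamiliar with the term, the sketch below computes the classic (non-differentiable) silhouette coefficient, which scores each point by comparing its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b). It is not the paper's soft, differentiable formulation; the function name and toy points are assumptions made up for illustration.

```python
import math

def silhouette_scores(points, labels):
    """Classic per-point silhouette s = (b - a) / max(a, b).

    Assumes at least two clusters and no cluster made only of duplicates.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return scores

# Hypothetical toy data: two tight pairs that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_scores(points, [0, 0, 1, 1])  # labels follow the geometry
bad = silhouette_scores(points, [0, 1, 0, 1])   # labels cut across it
mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
print(f"good labeling: {mean_good:.3f}, bad labeling: {mean_bad:.3f}")
```

A labeling that matches the geometry scores close to 1, while one that cuts across it scores below 0. That is the compactness-versus-separation signal the summary describes.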

AI · Bullish · arXiv – CS AI · Mar 27 · 6/10

UniAI-GraphRAG: Synergizing Ontology-Guided Extraction, Multi-Dimensional Clustering, and Dual-Channel Fusion for Robust Multi-Hop Reasoning

Researchers have developed UniAI-GraphRAG, an enhanced framework that improves upon existing GraphRAG systems for complex reasoning and multi-hop queries. The framework introduces three key innovations: ontology-guided extraction, multi-dimensional clustering, and dual-channel fusion. It shows superior performance over mainstream solutions like LightRAG on benchmark tests.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents

Researchers introduce CLAG, a clustering-based memory framework that helps small language model agents organize and retrieve information more effectively. The system addresses memory dilution issues by creating semantic clusters with automated profiles, showing improved performance across multiple QA datasets.

AI · Bullish · arXiv – CS AI · Mar 5 · 5/10

Learning Order Forest for Qualitative-Attribute Data Clustering

Researchers developed a new machine learning method called Learning Order Forest that improves clustering of qualitative data by using tree-like structures to represent relationships between categorical attributes. The joint learning mechanism iteratively optimizes both tree structures and clusters, outperforming 10 competing methods across 12 benchmark datasets.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 9

Universal NP-Hardness of Clustering under General Utilities

Researchers prove that clustering problems in machine learning are universally NP-hard, providing a theoretical explanation for why clustering algorithms often produce unstable results. The study demonstrates that major clustering methods like k-means and spectral clustering inherit fundamental computational intractability, explaining common failure modes like local optima.
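
The local-optima failure mode mentioned above can be seen in a few lines. The sketch below is a minimal illustration, not the paper's proof or method: it runs standard Lloyd's k-means iterations from two different starting points on a made-up four-point dataset. The function name, points, and starting centers are all assumptions.

```python
import math

def lloyd(points, centers, max_iter=100):
    """Run Lloyd's k-means iterations from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # converged: nothing moves any more
        centers = new_centers
    # Inertia: sum of squared distances to the nearest center.
    inertia = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return centers, inertia

# Hypothetical data: corners of a wide 4 x 1 rectangle, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start near the left/right split (the better solution).
_, good_inertia = lloyd(points, [(0.0, 0.5), (4.0, 0.5)])
# Start on the bottom/top split, which is a stable but worse fixed point.
_, bad_inertia = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_inertia, bad_inertia)
```

Both runs converge immediately, but the second settles on an inertia of 16 rather than 1. This is why practical k-means implementations typically restart from several initializations and keep the best result.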

AI · Bullish · arXiv – CS AI · Feb 27 · 5/10 · 6

Quality-Aware Robust Multi-View Clustering for Heterogeneous Observation Noise

Researchers propose QARMVC, a new AI framework for multi-view clustering that addresses heterogeneous noise in real-world data. The system uses quality scores to identify contamination levels and employs hierarchical learning to improve clustering performance, showing superior results across benchmark datasets.

AI · Neutral · arXiv – CS AI · Mar 26 · 5/10

Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents

Researchers have developed Cluster-R1, a new approach that trains large reasoning models (LRMs) as autonomous clustering agents capable of following instructions and inferring optimal cluster structures. The method reframes instruction-following clustering as a generative task and demonstrates superior performance over traditional embedding-based methods across 28 diverse tasks in the ReasonCluster benchmark.

AI · Neutral · arXiv – CS AI · Mar 17 · 4/10

Unsupervised Point Cloud Pre-Training via Contrasting and Clustering

Researchers propose ConClu, an unsupervised pre-training framework for point clouds that combines contrasting and clustering techniques to learn discriminative representations without labeled data. The method outperforms state-of-the-art approaches on multiple downstream tasks, addressing the challenge of expensive point cloud annotation.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 4

Characterizing and Predicting Wildfire Evacuation Behavior: A Dual-Stage ML Approach

Researchers used machine learning techniques to analyze wildfire evacuation behavior patterns from survey data across California, Colorado, and Oregon. The study found that transportation mode during evacuations can be reliably predicted from household characteristics, while evacuation timing remains difficult to predict due to dynamic fire conditions.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 2

How to Model AI Agents as Personas?: Applying the Persona Ecosystem Playground to 41,300 Posts on Moltbook for Behavioral Insights

Researchers developed a method to model AI agents as distinct personas by analyzing 41,300 posts from Moltbook, an AI agent social platform. Using k-means clustering and validation techniques, they successfully identified and validated different behavioral patterns among AI agents, demonstrating that persona-based modeling can effectively represent diversity in AI agent populations.

AI · Bullish · arXiv – CS AI · Mar 2 · 5/10 · 7

FedDAG: Clustered Federated Learning via Global Data and Gradient Integration for Heterogeneous Environments

Researchers introduce FedDAG, a new clustered federated learning framework that improves AI model training across heterogeneous client environments. The system combines data and gradient similarity metrics for better client clustering and uses a dual-encoder architecture to enable knowledge sharing across clusters while maintaining specialization.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 7

CA-AFP: Cluster-Aware Adaptive Federated Pruning

Researchers propose CA-AFP, a new federated learning framework that combines client clustering with adaptive model pruning to address both statistical and system heterogeneity challenges. The approach achieves better accuracy and fairness while reducing communication costs compared to existing methods, as demonstrated on human activity recognition benchmarks.