#kernel-methods News & Analysis

14 articles tagged with #kernel-methods. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 28 · 7/10

🧠

Why LLMs Fail at Causal Discovery and How Interventional Agents Escape

Researchers prove that large language models fundamentally cannot perform causal discovery through standard training methods, establishing this limitation as intrinsic to supervised learning rather than a model-specific flaw. They propose Agentic Causal Bayesian Optimization (A-CBO), which bypasses this constraint by using frozen language models as query oracles within an external optimization loop, achieving superior performance on causal inference benchmarks.

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10

🧠

Data-Aware Random Feature Kernel for Transformers

Researchers introduce DARKFormer, a new transformer architecture that reduces computational complexity from quadratic to linear while maintaining performance. The model uses data-aware random feature kernels to address efficiency issues in pretrained transformer models with anisotropic query-key distributions.
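The random-feature idea behind this kind of linear-complexity attention can be illustrated with a minimal sketch. It uses generic, data-independent random Fourier features for a Gaussian kernel, not DARKFormer's data-aware variant, and every name, size, and bandwidth below is an assumption chosen for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Exact Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def random_fourier_features(X, W, b):
    """Feature map whose inner products approximate the Gaussian kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
n, d, num_features, bandwidth = 50, 4, 4000, 1.5  # illustrative sizes only

queries = rng.normal(size=(n, d))
keys = rng.normal(size=(n, d))
values = rng.normal(size=(n, 3))

# Frequencies drawn from the Gaussian kernel's spectral density, shared by queries and keys.
W = rng.normal(scale=1.0 / bandwidth, size=(d, num_features))
b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
phi_q = random_fourier_features(queries, W, b)
phi_k = random_fourier_features(keys, W, b)

# Exact kernel aggregation builds an n x n matrix (quadratic in sequence length).
K_exact = gaussian_kernel(queries, keys, bandwidth)
out_exact = K_exact @ values

# Feature-space aggregation never forms the n x n matrix; its cost grows
# linearly in n for a fixed number of random features.
out_approx = phi_q @ (phi_k.T @ values)

kernel_err = np.max(np.abs(K_exact - phi_q @ phi_k.T))
```

At this toy size the feature count exceeds the sequence length, so nothing is actually saved; the linear scaling only pays off once the sequence is much longer than the number of features.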

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

🧠

How Deep Are Deep GPs, Really? A Sharp Threshold and a Non-Gaussian Limit for Compositional GPs

Researchers establish a sharp bandwidth threshold for deep Gaussian processes, proving that below this threshold compositional GPs converge to non-Gaussian, non-degenerate limit distributions rather than degenerating to constant functions. This advances theoretical understanding of deep Bayesian models and their limiting behavior as network depth increases.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

🧠

Kernel Affine Hull Machines as Compute-Efficient Encoders for Frozen Semantic Spaces

Researchers propose Kernel Affine Hull Machines (KAHM) as a lightweight alternative to transformer-based neural encoders for semantic search in frozen representation spaces. The method achieves 8.53x faster query encoding while maintaining competitive retrieval performance, offering practical efficiency gains for production deployment scenarios.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

🧠

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

Researchers introduce KITE, a novel example selection method for in-context learning in large language models that uses information theory and kernel methods to choose task-specific examples from a prompt bank. The approach addresses limitations of existing nearest-neighbor methods by improving diversity and generalization, demonstrating measurable improvements across classification tasks in label-scarce scenarios.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

🧠

Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods

Researchers extend null-space projection techniques for fairness in machine learning to kernel methods, enabling fair regression with continuous protected attributes. The method transforms kernel matrices directly and demonstrates competitive performance with Support Vector Regression across multiple datasets, adding to the still-sparse work on fairness for continuous attributes in ML systems.

🏢 Meta

AI · Neutral · arXiv – CS AI · May 28 · 5/10

🧠

Supervised Distributional Reduction via Optimal Transport and Dependence Maximization

Researchers propose Supervised Distributional Reduction (SDR), a machine learning algorithm combining optimal transport theory with dependence maximization to create compact data representations that preserve both geometric structure and predictive information. The method extends the Fused Gromov-Wasserstein framework and offers applications in representation learning and adaptive kernel design for Gaussian Process modeling.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

🧠

Transformers Can Implement Preconditioned Richardson Iteration for In-Context Gaussian Kernel Regression

Researchers demonstrate that standard transformer models with softmax attention can implement preconditioned Richardson iteration to solve Gaussian kernel ridge regression tasks during in-context learning. The theoretical construction and empirical validation reveal how transformers decompose nonlinear prediction into interpretable algorithmic steps, advancing mechanistic understanding of transformer capabilities.
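For readers unfamiliar with the numerical method named here, the sketch below runs preconditioned Richardson iteration on a Gaussian kernel ridge regression system and compares it with a direct solve. It is a plain NumPy illustration, not the paper's transformer construction; the row-sum diagonal preconditioner, the synthetic data, and all parameter values are assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Exact Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def richardson_krr(K, y, ridge, num_steps):
    """Solve (K + ridge * I) alpha = y by preconditioned Richardson iteration.

    The preconditioner is diagonal: each residual entry is scaled by the
    inverse absolute row sum of the system matrix (an assumed, simple choice
    that keeps the iteration convergent for symmetric positive definite A).
    """
    A = K + ridge * np.eye(len(y))
    precond = 1.0 / np.sum(np.abs(A), axis=1)
    alpha = np.zeros_like(y)
    for _ in range(num_steps):
        residual = y - A @ alpha            # how far the current iterate is from solving the system
        alpha = alpha + precond * residual  # preconditioned correction step
    return alpha

rng = np.random.default_rng(0)
n, ridge = 40, 0.5
X = rng.uniform(-3.0, 3.0, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

K = gaussian_kernel(X, X, bandwidth=1.0)
alpha_iter = richardson_krr(K, y, ridge, num_steps=2000)
alpha_exact = np.linalg.solve(K + ridge * np.eye(n), y)
gap = np.max(np.abs(alpha_iter - alpha_exact))

# Predictions at new inputs use the usual kernel ridge regression form k(x, X) @ alpha.
X_new = np.linspace(-3.0, 3.0, 5)[:, None]
predictions = gaussian_kernel(X_new, X, bandwidth=1.0) @ alpha_iter
```

Each step needs only matrix-vector products with the kernel matrix, which is what makes the iteration a plausible target for an attention-style implementation.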

AI · Neutral · arXiv – CS AI · May 12 · 6/10

🧠

One for All: A Non-Linear Transformer can Enable Cross-Domain Generalization for In-Context Reinforcement Learning

Researchers propose a non-linear transformer architecture that enables reinforcement learning agents to generalize across different domains through in-context learning, establishing a theoretical connection between transformers and kernel-based temporal difference learning. By interpreting transformers as operators in Reproducing Kernel Hilbert Space, the work demonstrates that value functions from diverse domains can share a unified weight set, with MetaWorld experiments validating the approach.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

🧠

Closed-Form Linear-Probe Dataset Distillation for Pre-trained Vision Models

Researchers introduce CLP-DD, a novel dataset distillation method optimized for frozen pre-trained vision models using closed-form linear probing. The technique achieves comparable or superior performance to existing methods while running 14x faster and using 87.5% less GPU memory on ImageNet-1K.
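"Closed-form linear probing" generally means fitting a linear classifier on frozen features by solving a ridge regression system directly rather than by gradient descent. The sketch below shows only that building block on synthetic stand-in features; it is not the CLP-DD distillation procedure, and the function name, data, and ridge value are hypothetical.

```python
import numpy as np

def fit_linear_probe(features, labels_onehot, ridge):
    """Closed-form ridge solution W = (F^T F + ridge * I)^{-1} F^T Y."""
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ labels_onehot)

rng = np.random.default_rng(0)
num_classes, dim, num_samples, ridge = 3, 8, 300, 1e-2

# Synthetic stand-in for embeddings produced by a frozen pre-trained model.
centers = rng.normal(scale=3.0, size=(num_classes, dim))
labels = rng.integers(0, num_classes, size=num_samples)
features = centers[labels] + rng.normal(size=(num_samples, dim))
targets = np.eye(num_classes)[labels]  # one-hot regression targets

W = fit_linear_probe(features, targets, ridge)
predicted = np.argmax(features @ W, axis=1)
accuracy = np.mean(predicted == labels)

# At the closed-form solution the ridge objective's gradient vanishes.
gradient = features.T @ (features @ W - targets) + ridge * W
```

Because the probe has an explicit solution, it can be recomputed cheaply whenever the training set changes, which is the property a closed-form distillation method can build on.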

AI · Neutral · arXiv – CS AI · May 9 · 6/10

🧠

Amortized Linear-time Exact Shapley Value for Product-Kernel Methods

Researchers introduce PKeX-Shapley, an algorithm that computes exact Shapley values for product-kernel machine learning models in quadratic time overall, eliminating the need for approximations. The method exploits the multiplicative structure of product kernels, which amounts to amortized linear time per feature attribution without sampling or density estimation, and it extends beyond predictive models to statistical discrepancy measures like MMD and HSIC.

AI · Bullish · arXiv – CS AI · Apr 20 · 6/10

🧠

Transformer Neural Processes - Kernel Regression

Researchers introduce Transformer Neural Process - Kernel Regression (TNP-KR), a scalable neural process architecture that reduces computational complexity from O(n²) to O(n_c), where n_c is the number of context points, while maintaining or exceeding accuracy. The method can process 100K context points with 1M+ test points on a single GPU, making neural processes more practical for large-scale applications.

AI · Neutral · arXiv – CS AI · Jun 23 · 4/10

🧠

QBioFusion-QSAR: Morgan-Anchored Quantum Multiple Kernel Learning for Small-Data Ligand Classification

QBioFusion-QSAR introduces a quantum multiple kernel learning framework combining quantum fidelity kernels with traditional Morgan fingerprints for drug discovery classification tasks. On a 54-molecule benchmark, the hybrid approach modestly improved accuracy and correlation metrics, though statistical validation across multiple random partitions showed gains were not consistently significant beyond classical methods.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10

🧠

Adaptive Personalized Federated Learning via Multi-task Averaging of Kernel Mean Embeddings

Researchers propose a new Personalized Federated Learning approach that automatically learns optimal collaboration weights between agents without prior knowledge of data heterogeneity. The method uses kernel mean embedding estimation to capture statistical relationships between agents and includes a practical implementation for communication-constrained federated settings.
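A kernel mean embedding represents a data distribution by the average of kernel features over its samples, and the distance between two embeddings is the maximum mean discrepancy (MMD). The sketch below estimates squared MMD between hypothetical agents' samples to show how similar and dissimilar agents separate; it does not reproduce the paper's collaboration-weight estimator, and the agents, sample sizes, and bandwidth are assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Exact Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(X, Y, bandwidth):
    """Biased estimate of squared MMD: the squared RKHS distance between
    the empirical kernel mean embeddings of the two samples."""
    k_xx = gaussian_kernel(X, X, bandwidth).mean()
    k_yy = gaussian_kernel(Y, Y, bandwidth).mean()
    k_xy = gaussian_kernel(X, Y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

rng = np.random.default_rng(0)
# Three hypothetical agents: A and B share a distribution, C is shifted.
agent_a = rng.normal(loc=0.0, size=(200, 1))
agent_b = rng.normal(loc=0.0, size=(200, 1))
agent_c = rng.normal(loc=2.0, size=(200, 1))

mmd_same = mmd_squared(agent_a, agent_b, bandwidth=1.0)
mmd_diff = mmd_squared(agent_a, agent_c, bandwidth=1.0)
```

A small discrepancy between two agents suggests their data can be pooled more aggressively, while a large one suggests keeping their models more separate.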