#zero-shot-learning News & Analysis

85 articles tagged with #zero-shot-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer

Researchers studying cross-lingual transfer in large language models found that fine-tuning on Arabic does not produce language-family-specific improvements. Models with weak initial performance improved across all languages tested, while strong models showed minimal gains regardless of linguistic relatedness, suggesting task-format alignment matters more than linguistic proximity.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

VERITAS: Verifier-Guided Proof Search for Zero-Shot Formal Theorem Proving

VERITAS introduces a zero-shot framework for formal theorem proving that leverages rich verifier feedback signals rather than binary pass/fail outcomes. Using a two-phase approach combining Best-of-N sampling with critic-guided Monte Carlo Tree Search, the system achieves 40.6% accuracy on miniF2F benchmarks and demonstrates particular strength in combinatorial problems where iterative lemma recovery is critical.
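
The Best-of-N phase mentioned in this summary can be illustrated with a minimal sketch. It is not the VERITAS implementation: the names `best_of_n`, `toy_propose`, `toy_score`, and the tactic table are hypothetical stand-ins, and the only idea carried over from the summary is that candidates are ranked by a graded verifier signal rather than a binary pass/fail.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    propose: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Sample n candidates and return the highest-scoring one with its score."""
    rng = random.Random(seed)
    candidates: List[str] = [propose(rng) for _ in range(n)]
    scored = [(score(c), c) for c in candidates]
    # max() keeps the first candidate among ties.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Hypothetical toy setup: a "proof attempt" is just a tactic name, and the
# "verifier" reports the fraction of goals closed instead of pass/fail.
TACTICS = ["simp", "linarith", "induction", "ring"]
GOALS_CLOSED = {"simp": 0.25, "linarith": 0.5, "induction": 1.0, "ring": 0.75}


def toy_propose(rng: random.Random) -> str:
    return rng.choice(TACTICS)


def toy_score(candidate: str) -> float:
    return GOALS_CLOSED[candidate]


best, best_score = best_of_n(toy_propose, toy_score, n=16)
```

A graded score lets the selection step prefer a partially successful attempt over a completely failed one, which a pass/fail signal cannot distinguish.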

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

Metadata-Aware Multi-Prompt Reasoning for Zero-Shot Accident Understanding

Researchers present a three-stage pipeline for zero-shot accident detection in surveillance videos that combines temporal localization, semantic classification, and spatial grounding using vision-language models. The method decomposes accident understanding into when, what, and where components, achieving significant improvements over baseline approaches on the ACCIDENT benchmark.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Time Series as Language: A Universal Tokenizer for General-Purpose Time Series Foundation Models

Researchers introduce UniTok, a universal tokenizer that converts continuous time series data into discrete tokens, and UniTok-FM, a foundation model built on it and pretrained via next-token prediction. This unified approach supports forecasting, generation, and classification without task-specific modifications, performing competitively with specialized models while also supporting zero-shot and few-shot inference.
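
To make the idea of turning continuous values into discrete tokens concrete, here is a minimal sketch using plain uniform binning. This is a swapped-in, textbook quantization scheme, not UniTok's tokenizer, whose design the summary does not describe; the function names and the `vocab_size` parameter are assumptions for illustration.

```python
from typing import List


def tokenize(series: List[float], lo: float, hi: float, vocab_size: int) -> List[int]:
    """Map each value to an integer token in [0, vocab_size - 1] via uniform bins."""
    width = (hi - lo) / vocab_size
    tokens = []
    for x in series:
        if width == 0:
            # Degenerate range: every value falls in the single bin 0.
            tokens.append(0)
            continue
        idx = int((x - lo) / width)
        # Clamp so out-of-range values and x == hi land in a valid bin.
        tokens.append(min(max(idx, 0), vocab_size - 1))
    return tokens


def detokenize(tokens: List[int], lo: float, hi: float, vocab_size: int) -> List[float]:
    """Map each token back to the center of its bin (lossy reconstruction)."""
    width = (hi - lo) / vocab_size
    return [lo + (t + 0.5) * width for t in tokens]


series = [0.0, 1.0, 2.0, 3.0, 4.0]
tokens = tokenize(series, lo=0.0, hi=4.0, vocab_size=4)
restored = detokenize(tokens, lo=0.0, hi=4.0, vocab_size=4)
```

Once a series is a token sequence, a standard next-token-prediction model can be trained on it; the reconstruction error for in-range values is at most half a bin width.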

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Unsupervised Style Representation Learning for AI-Text Detection via Paraphrase Inversion

Researchers have developed an unsupervised method for detecting AI-generated text by learning style representations through paraphrase inversion, without requiring authorship labels. The approach demonstrates competitive performance in both few-shot and zero-shot detection scenarios while generalizing better to unseen language models than existing supervised methods.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Drawing with Strangers: Population Scaling Drives Zero-Shot Mutual Intelligibility in Emergent Sketching

Researchers demonstrate that scaling training populations in emergent communication systems enables zero-shot mutual intelligibility (ZMI): successful communication between independently trained agent groups with no prior exposure to each other. The study uses emergent sketching as the communication modality and shows that larger populations develop universal visual-grounding strategies rather than closed-group dialects, with potential applications for building interoperable AI systems.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

From USD Scenes to Knowledge Graphs: Zero-Shot Ontology Grounding with LLMs

Researchers demonstrate that large language models can automate the grounding of 3D scene objects to formal ontology classes without training, achieving 90-96% accuracy on kitchen scenes. This zero-shot approach eliminates reliance on brittle, manually curated dictionaries and represents a significant advance in knowledge graph construction for robotic task reasoning.

AI · Neutral · arXiv – CS AI · Jun 8 · 5/10

Hierarchical Semantic-Constrained Heterogeneous Graph for Audio-Visual Event Localization

Researchers propose HSCHG, a novel framework for open-vocabulary audio-visual event localization that addresses temporal consistency and hierarchical semantic constraints by combining heterogeneous graphs in Euclidean space with hyperbolic space representations. The method uses hierarchical entailment regularization to improve recognition of unseen event categories while maintaining cross-modal alignment and semantic consistency across video and segment levels.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Never Seen Before: Benchmarking Genuine Zero-Shot Composed Image Retrieval with Consistent Video-Sourced Datasets

Researchers introduce ZeroSight, a new benchmark for Zero-Shot Composed Image Retrieval (ZS-CIR) that addresses critical flaws in existing datasets by using video-sourced data published after CLIP's training cutoff. They also propose SC4CIR, a training-free method that reveals that current ZS-CIR performance metrics significantly overestimate actual model capabilities.

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

MatterDoor: Sampling Zero-shot Spatio-semantic Priors using Generative Models

Researchers introduce MatterDoor, a method enabling autonomous robots to infer hidden room structure and semantics from doorway-occluded views using pretrained generative vision models without task-specific training. The approach combines VLM-guided outpainting, depth estimation, and semantic segmentation to generate 3D hypotheses of unobserved spaces, evaluated on a new Matterport3D-derived benchmark for robot navigation and object-reaching tasks.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

I Know What You Meme, Even If it Emerged Today: Understanding Evolving Memes through Open-World Knowledge Acquisition

Researchers introduce Query Retrieve Conclude, a zero-shot framework that improves meme understanding by identifying knowledge gaps, retrieving current web evidence, and synthesizing grounded background knowledge. The approach addresses limitations of existing methods that rely on outdated or incomplete parametric knowledge, demonstrating improvements across meme understanding and detection tasks using a new benchmark dataset of 2024-2026 memes.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

DAST: A VLM-LLM Framework for Cross-Interface Anomaly Detection in O-RAN

Researchers present DAST, a zero-shot AI framework combining Vision Language Models and Large Language Models to detect anomalies and denial-of-service attacks in O-RAN (Open Radio Access Network) infrastructure. The system achieved an F1 score of 0.910 by converting network telemetry into visual representations and cross-referencing them against domain knowledge, addressing critical security gaps in disaggregated 5G/6G networks.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

A Goal-Set Characterization of Task Composition in the Boolean Task Algebra

Researchers demonstrate that the Boolean Task Algebra (BTA) framework for reinforcement learning can be substantially simplified by eliminating redundant base tasks. Their goal-set-based composition method achieves comparable performance while reducing computational costs for both learning and composition across diverse environments, with experiments showing that additional base tasks provide no performance benefits.
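
As a rough illustration of what composing tasks through goal sets can look like, the sketch below identifies each task with the set of goals it rewards and maps Boolean operators onto set operations. This identification is an assumption made for the example, not a statement of the paper's construction, and the goal names and helper functions are hypothetical.

```python
from typing import FrozenSet

GoalSet = FrozenSet[str]


def task_or(a: GoalSet, b: GoalSet) -> GoalSet:
    """Disjunction: any goal that satisfies either task."""
    return a | b


def task_and(a: GoalSet, b: GoalSet) -> GoalSet:
    """Conjunction: only goals that satisfy both tasks."""
    return a & b


def task_not(a: GoalSet, universe: GoalSet) -> GoalSet:
    """Negation: every goal in the environment that the task does not reward."""
    return universe - a


# Hypothetical environment with four goals and two base tasks.
ALL_GOALS: GoalSet = frozenset(
    {"blue_square", "blue_circle", "red_square", "red_circle"}
)
BLUE: GoalSet = frozenset({"blue_square", "blue_circle"})
SQUARE: GoalSet = frozenset({"blue_square", "red_square"})

blue_and_square = task_and(BLUE, SQUARE)
blue_xor_square = task_or(
    task_and(BLUE, task_not(SQUARE, ALL_GOALS)),
    task_and(task_not(BLUE, ALL_GOALS), SQUARE),
)
```

Because the composed task is just another goal set, two different Boolean expressions that yield the same set describe the same task, which is one intuition for why some base tasks can be redundant.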