27 articles tagged with #in-context-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · Apr 6 · 7/10
Researchers published a comprehensive technical survey on Large Language Model augmentation strategies, examining methods from in-context learning to advanced Retrieval-Augmented Generation techniques. The study provides a unified framework for understanding how structured context at inference time can overcome LLMs' limitations of static knowledge and finite context windows.
AI · Neutral · arXiv – CS AI · Mar 26 · 7/10
Researchers have developed techniques to mitigate many-shot jailbreaking (MSJ) attacks on large language models, where attackers use numerous examples to override safety training. Combined fine-tuning and input sanitization approaches significantly reduce MSJ effectiveness while maintaining normal model performance.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
Researchers developed Localized In-Context Learning (L-ICL), a technique that significantly improves large language model performance on symbolic planning tasks by targeting specific constraint violations with minimal corrections. The method achieves 89% valid plan generation, compared to 59% for the best baselines, representing a major advancement in LLM reasoning capabilities.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
Researchers discovered that Large Language Models become increasingly sparse in their internal representations when handling more difficult or out-of-distribution tasks. This sparsity mechanism appears to be an adaptive response that helps stabilize reasoning under challenging conditions, leading to the development of a new learning strategy called Sparsity-Guided Curriculum In-Context Learning (SG-ICL).
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
Researchers introduce RDB-PFN, the first relational foundation model for databases trained entirely on synthetic data to overcome privacy and scarcity issues with real relational databases. The model uses a Relational Prior Generator to create over 2 million synthetic tasks and demonstrates strong few-shot performance on 19 real-world relational prediction tasks through in-context learning.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
Researchers propose Supervised Calibration (SC), a new framework to improve In-Context Learning performance in Large Language Models by addressing systematic biases through optimal affine transformations in logit space. The method achieves state-of-the-art results across multiple LLMs including Mistral-7B, Llama-2-7B, and Qwen2-7B in few-shot learning scenarios.
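To make the idea of an affine correction in logit space concrete, here is a minimal sketch. It is not the Supervised Calibration method from the paper, whose details the summary does not give. It assumes a small labeled calibration set of per-class logits and fits a matrix `W` and offset `b` by plain gradient descent on cross-entropy. The function names, learning rate, and step count are illustrative assumptions.

```python
import numpy as np


def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_affine_calibration(logits, labels, lr=0.05, steps=1000):
    """Fit W, b so that softmax(logits @ W.T + b) matches the labels.

    Hypothetical helper: `logits` is an (n, k) array of per-class scores
    from a few-shot prompt, `labels` an (n,) array of integer class ids.
    """
    n, k = logits.shape
    W = np.eye(k)      # start from the identity, i.e. "no correction"
    b = np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        probs = softmax(logits @ W.T + b)
        grad_z = (probs - onehot) / n      # d(cross-entropy)/d(calibrated logits)
        W -= lr * (grad_z.T @ logits)
        b -= lr * grad_z.sum(axis=0)
    return W, b


def calibrate(logits, W, b):
    # Apply the learned affine map to raw logits.
    return logits @ W.T + b
```

At prediction time, the calibrated class would be `calibrate(new_logits, W, b).argmax(axis=1)`. A systematic bias toward one label, such as a constant offset on one logit, is the kind of error this map can absorb.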
AI · Neutral · arXiv – CS AI · Mar 5 · 7/10
Researchers studied how large language models generalize to new tasks through "off-by-one addition" experiments, discovering a "function induction" mechanism that operates at higher abstraction levels than previously known induction heads. The study reveals that multiple attention heads work in parallel to enable task-level generalization, with this mechanism being reusable across various synthetic and algorithmic tasks.
AI · Neutral · arXiv – CS AI · Mar 4 · 7/10
Research compares Transformers, State Space Models (SSMs), and hybrid architectures for in-context retrieval tasks, finding hybrid models excel at information-dense retrieval while Transformers remain superior for position-based tasks. SSM-based models develop unique locality-aware embeddings that create interpretable positional structures, explaining their specific strengths and limitations.
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10
Researchers propose Many-Shot In-Context Fine-tuning (ManyICL), a novel approach that significantly improves large language model performance by treating multiple in-context examples as supervised training targets rather than just prompts. The method narrows the performance gap between in-context learning and dedicated fine-tuning while reducing catastrophic forgetting issues.
AI · Neutral · arXiv – CS AI · Mar 4 · 7/10
Researchers introduce Spectrum Tuning, a new post-training method that improves AI language models' ability to generate diverse outputs and follow in-context steering instructions. The technique addresses limitations in current post-training approaches that reduce models' distributional coverage and flexibility when tasks require multiple valid answers rather than single correct responses.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
Researchers have developed a method to make transformer neural networks interpretable by studying how they perform in-context classification from few examples. By enforcing permutation equivariance constraints, they extracted an explicit algorithmic update rule that reveals how transformers dynamically adjust to new data, offering the first identifiable recursion of this kind.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
Researchers propose Noise-Aware In-Context Learning (NAICL), a plug-and-play method to reduce hallucinations in auditory large language models without expensive fine-tuning. The approach uses a noise prior library to guide models toward more conservative outputs, achieving a 37% reduction in hallucination rates while establishing a new benchmark for evaluating audio understanding systems.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
Researchers introduce RecaLLM, a post-trained language model that addresses the 'lost-in-thought' phenomenon where retrieval performance degrades during extended reasoning chains. The model interleaves explicit in-context retrieval with reasoning steps and achieves strong performance on long-context benchmarks using training data significantly shorter than existing approaches.
AI · Neutral · Crypto Briefing · 5d ago · 7/10
Vishal Misra discusses how transformers learn correlations rather than causal relationships, highlighting the importance of in-context learning and Bayesian updating for advancing AI capabilities beyond pattern matching toward genuine reasoning.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
Researchers conducted a comparative analysis of demonstration selection strategies for using large language models to predict users' next point-of-interest (POI) based on historical location data. The study found that simple heuristic methods like geographical proximity and temporal ordering outperform complex embedding-based approaches in both computational efficiency and prediction accuracy, with LLMs using these heuristics sometimes matching fine-tuned model performance without additional training.
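As a rough illustration of heuristic demonstration selection, the sketch below picks the `k` past check-ins closest to the user's current position and then orders them chronologically before building a prompt. It is not the study's procedure: the `CheckIn` layout, the function names, the prompt wording, and the choice to combine proximity with temporal ordering are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class CheckIn:
    # Hypothetical record layout for one historical visit.
    poi: str
    lat: float
    lon: float
    timestamp: int  # Unix seconds


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance in kilometres on a sphere of radius 6371 km.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def select_demonstrations(history, lat, lon, k=3):
    # Geographical proximity: keep the k check-ins nearest to (lat, lon).
    nearest = sorted(history, key=lambda c: haversine_km(lat, lon, c.lat, c.lon))[:k]
    # Temporal ordering: present the kept check-ins oldest first.
    return sorted(nearest, key=lambda c: c.timestamp)


def build_prompt(demos):
    # Turn the selected demonstrations into a plain-text few-shot prompt.
    lines = [f"t={c.timestamp}: visited {c.poi}" for c in demos]
    lines.append("Next POI:")
    return "\n".join(lines)
```

Neither step needs an embedding model, which is where the efficiency advantage described in the summary would come from.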
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
Researchers investigate in-context learning (ICL) in speech language models, revealing that speaking rate significantly affects model performance and acoustic mimicry, while induction heads play a causal role identical to the one they play in text-based ICL. The study bridges the gap between text and speech domains by analyzing how models learn from demonstrations in text-to-speech tasks.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
Researchers evaluated how well large language models can perform formal grammar-based translation tasks using in-context learning, finding that LLM translation accuracy degrades significantly with grammar complexity and sentence length. The study identifies specific failure modes including vocabulary hallucination and untranslated source words, revealing fundamental limitations in LLMs' ability to apply formal grammatical rules to translation tasks.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
Researchers propose a theoretical framework based on category theory to formalize meta-prompting in large language models. The study demonstrates that meta-prompting (using prompts to generate other prompts) is more effective than basic prompting for generating desirable outputs from LLMs.
AI · Neutral · arXiv – CS AI · Mar 17 · 6/10
Research shows that synthetic data designed to enhance in-context learning capabilities in AI models doesn't necessarily improve performance. The study found that while targeted training can increase specific neural mechanisms, it doesn't make them more functionally important compared to natural training approaches.
AI · Bullish · arXiv – CS AI · Mar 6 · 6/10
Researchers introduce DP-MTV, the first framework enabling privacy-preserving multimodal in-context learning for vision-language models using differential privacy. The system allows processing hundreds of demonstrations while maintaining formal privacy guarantees, achieving competitive performance on benchmarks like VizWiz with only minimal accuracy loss.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
Researchers propose combining In-Weight Learning (IWL) and In-Context Learning (ICL) through modular memory architectures to solve continual learning challenges in AI. The framework aims to enable AI agents to continuously adapt and accumulate knowledge without catastrophic forgetting, addressing key limitations of current foundation models.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
Researchers introduce VINCIE, a novel approach that learns in-context image editing directly from videos without requiring specialized models or curated training data. The method uses a block-causal diffusion transformer trained on video sequences and achieves state-of-the-art results on multi-turn image editing benchmarks.
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10
Researchers investigate in-context learning (ICL) in world models, identifying two core mechanisms (environment recognition and environment learning) that enable AI systems to adapt to new configurations. The study provides theoretical error bounds and empirical evidence showing that diverse environments and long context windows are crucial for developing self-adapting world models.
AI · Neutral · arXiv – CS AI · Mar 2 · 6/10
Researchers conducted an in-depth analysis of in-context learning capabilities across different AI architectures including transformers, state-space models, and hybrid systems. The study reveals that while these models perform similarly on tasks, their internal mechanisms differ significantly, with function vectors playing key roles in self-attention and Mamba layers.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
Researchers developed LEREDD, an LLM-based system that automates the detection of dependencies between software requirements using Retrieval-Augmented Generation and In-Context Learning. The system achieved 93% accuracy in classifying requirement dependencies, significantly outperforming existing baselines with relative gains of over 94% in F1 scores for specific dependency types.