13 articles tagged with #small-language-models. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 2d ago · 7/10
🧠 Researchers introduce Pioneer Agent, an automated system that continuously improves small language models in production by diagnosing failures, curating training data, and retraining under regression constraints. The system demonstrates significant performance gains across benchmarks, with real-world deployments achieving improvements from 84.9% to 99.3% in intent classification.
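The diagnose → curate → retrain → gate cycle described above can be sketched in miniature. Here the "model" is just a set of intents it classifies correctly; all names and data are illustrative assumptions, not from the paper.

```python
# Toy sketch of a regression-gated continuous-improvement loop.
def accuracy(model, cases):
    return sum(1 for c in cases if c in model) / len(cases)

def improve_loop(model, production_cases, holdout, rounds=5):
    for _ in range(rounds):
        # 1. Diagnose: collect production failures.
        failures = [c for c in production_cases if c not in model]
        if not failures:
            break
        # 2. Curate: pick a batch of failures as new training data.
        batch = failures[:2]
        # 3. "Retrain": teach the toy model those cases.
        candidate = model | set(batch)
        # 4. Regression gate: deploy only if holdout accuracy does not drop.
        if accuracy(candidate, holdout) >= accuracy(model, holdout):
            model = candidate
    return model

baseline = {"greet", "cancel"}
production = ["greet", "refund", "cancel", "upgrade", "refund"]
holdout = ["greet", "cancel", "refund"]
improved = improve_loop(baseline, production, holdout)
```

The regression gate is the key design point: a candidate that fixes production failures is still rejected if it loses ground on the held-out set.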
AI · Bullish · arXiv – CS AI · 2d ago · 7/10
🧠 Researchers demonstrate a cost-effective approach to training specialized small language models by using LLMs as one-time teachers to generate synthetic training data. By converting 3.2 billion maritime vessel tracking records into 21,543 QA pairs, they fine-tuned Qwen2.5-7B to achieve 75% accuracy on maritime tasks at a fraction of the cost of deploying larger models, establishing a reproducible framework for domain-specific AI applications.
🧠 GPT-4
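The record-to-QA conversion at the heart of this pipeline can be illustrated with templates standing in for the teacher LLM. The record fields and question templates below are hypothetical, not taken from the paper.

```python
# Sketch: turn structured tracking records into QA pairs for SLM fine-tuning.
records = [
    {"vessel": "MV Aurora", "speed_kn": 12.4, "heading": 270},
    {"vessel": "SS Meridian", "speed_kn": 8.1, "heading": 95},
]

def to_qa_pairs(record):
    """Generate question-answer pairs from one tracking record."""
    v = record["vessel"]
    return [
        {"question": f"What is the speed of {v}?",
         "answer": f"{record['speed_kn']} knots"},
        {"question": f"What heading is {v} on?",
         "answer": f"{record['heading']} degrees"},
    ]

dataset = [qa for r in records for qa in to_qa_pairs(r)]
```

In the paper's setting, a teacher LLM would generate the pairs once; the resulting dataset is then reused for every fine-tuning run, which is where the cost savings come from.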
AI · Bullish · arXiv – CS AI · Mar 11 · 7/10
🧠 Researchers demonstrated that a fine-tuned small language model (SLM) with 350M parameters can significantly outperform large language models like ChatGPT in tool-calling tasks, achieving a 77.55% pass rate versus ChatGPT's 26%. This breakthrough suggests organizations can reduce AI operational costs while maintaining or improving performance through targeted fine-tuning of smaller models.
🏢 Meta · 🏢 Hugging Face · 🧠 ChatGPT
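A tool-calling "pass" typically means the model emitted a well-formed call matching a tool schema. A minimal sketch of such a check, with an illustrative schema and model outputs (not the paper's benchmark):

```python
import json

TOOL_SCHEMA = {"name": "get_weather", "required": ["city", "unit"]}

def passes(model_output):
    """A call passes if it is valid JSON, names the tool, and has all required args."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return False
    return (call.get("name") == TOOL_SCHEMA["name"]
            and all(k in call.get("arguments", {}) for k in TOOL_SCHEMA["required"]))

good = passes('{"name": "get_weather", "arguments": {"city": "Oslo", "unit": "C"}}')
bad = passes('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

Pass rates like 77.55% vs 26% come from running checks of this kind over a large suite of tool specifications.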
AI · Bullish · arXiv – CS AI · 1d ago · 6/10
🧠 Researchers introduce HintMR, a hint-assisted reasoning framework that improves mathematical problem-solving in small language models by using a separate hint-generating model to provide contextual guidance through multi-step problems. This collaborative two-model system demonstrates significant accuracy improvements over standard prompting while maintaining computational efficiency.
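The two-model division of labor can be sketched with stubs: a "hinter" proposes per-step guidance and a "solver" conditions on it. Both models are replaced by simple rules here; the function names and hint strings are illustrative assumptions, not HintMR's actual interface.

```python
def hinter(problem, step):
    # A real hint model would be a separate SLM; here, canned hints.
    hints = {0: "isolate the variable", 1: "divide both sides"}
    return hints.get(step, "check your result")

def solver(problem, hint, state):
    # Solve 3x + 6 = 21 step by step, conditioned on the hint.
    if "isolate" in hint:          # 3x = 21 - 6
        return {"lhs": state["lhs"], "rhs": state["rhs"] - 6}
    if "divide" in hint:           # x = 15 / 3
        return {"lhs": 1, "rhs": state["rhs"] // state["lhs"]}
    return state

state = {"lhs": 3, "rhs": 21}
for step in range(2):
    hint = hinter("3x + 6 = 21", step)
    state = solver("3x + 6 = 21", hint, state)
# state now encodes x = 5
```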
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers demonstrate that five mature small language model architectures (1.5B-8B parameters) share nearly identical emotion vector representations despite exhibiting opposite behavioral profiles, suggesting emotion geometry is a universal feature organized early in model development. The study also deconstructs prior emotion-vector research methodology into four distinct layers of confounding factors, revealing that single correlations between studies cannot safely establish comparability.
🧠 Llama
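"Nearly identical representations" is usually quantified by comparing direction vectors across models, e.g. with cosine similarity. A sketch with made-up 4-d "emotion vectors" for three hypothetical model families:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical "anger" direction vectors extracted from three families.
vectors = {
    "family_a": [0.8, 0.1, -0.5, 0.2],
    "family_b": [0.79, 0.12, -0.48, 0.22],
    "family_c": [0.82, 0.09, -0.51, 0.19],
}

ref = vectors["family_a"]
sims = {name: cosine(ref, v) for name, v in vectors.items()}
```

High pairwise similarity despite divergent behavior is what motivates the paper's claim that emotion geometry is organized early and shared across architectures.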
AI · Bullish · arXiv – CS AI · 6d ago · 6/10
🧠 Researchers introduce EmoMAS, a Bayesian multi-agent framework that enables small language models to perform sophisticated negotiation by treating emotional intelligence as a strategic variable. The system coordinates game-theoretic, reinforcement learning, and psychological agents to optimize negotiation outcomes while maintaining privacy through edge deployment, demonstrating performance comparable to larger models across high-stakes domains.
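One simple way to coordinate specialist agents into a single negotiation move is a weighted combination of their proposals. The agents, weights, and averaging rule below are illustrative assumptions, not EmoMAS's actual Bayesian machinery:

```python
def combine_offers(offers, weights):
    """Weighted average of the agents' proposed price offers."""
    total_w = sum(weights.values())
    return sum(offers[a] * weights[a] for a in offers) / total_w

# Hypothetical proposals from the three specialist agents.
offers = {"game_theory": 100.0, "rl": 90.0, "psych": 95.0}
weights = {"game_theory": 0.5, "rl": 0.3, "psych": 0.2}
move = combine_offers(offers, weights)
```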
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠 Researchers developed a new training approach that makes small language models more effective search agents by teaching them to consistently use search tools rather than relying on internal knowledge. The method achieved significant performance improvements of 17.3 points on Bamboogle and 15.3 points on HotpotQA, reaching large language model-level results while maintaining lower computational costs.
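Teaching a model to prefer retrieval over parametric recall is often done through reward shaping over trajectories. The shaping below is an illustrative guess at the idea, not the paper's exact objective:

```python
def reward(trajectory, correct):
    """Reward answers, with a bonus for grounding them in search."""
    used_search = any(step["action"] == "search" for step in trajectory)
    r = 1.0 if correct else 0.0
    if correct and used_search:
        r += 0.2          # bonus for grounded answers
    if not used_search:
        r -= 0.1          # discourage answering from memory alone
    return r

with_search = [{"action": "search"}, {"action": "answer"}]
no_search = [{"action": "answer"}]
```

Under this signal, a correct answer backed by a search call strictly dominates the same answer from memory, pushing the policy toward consistent tool use.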
AI · Neutral · arXiv – CS AI · Apr 7 · 6/10
🧠 Researchers conducted the first comprehensive analysis of emotion representations in small language models (100M-10B parameters), finding that these models do possess internal emotion vectors similar to larger frontier models. The study evaluated 9 models across 5 architectural families and discovered that emotion representations localize at middle transformer layers, with generation-based extraction methods proving superior to comprehension-based approaches.
🏢 Perplexity · 🧠 Llama
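Localization claims like "middle transformer layers" come from probing each layer's activations and seeing where the signal peaks. The per-layer probe scores below are made up to mirror a middle-layer peak, not the paper's data:

```python
# Hypothetical probe accuracy per transformer layer (8-layer model).
probe_scores = dict(enumerate(
    [0.12, 0.20, 0.35, 0.61, 0.78, 0.74, 0.55, 0.30]))

best_layer = max(probe_scores, key=probe_scores.get)
depth_fraction = best_layer / (len(probe_scores) - 1)
```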
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10
🧠 Researchers propose TASC (Task-Adaptive Sequence Compression), a framework for accelerating small language models through two methods: TASC-ft for fine-tuning with expanded vocabularies and TASC-spec for training-free speculative decoding. The methods demonstrate improved inference efficiency while maintaining task performance across low output-variability generation tasks.
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10
🧠 Researchers introduce RooflineBench, a framework for measuring performance capabilities of Small Language Models on edge devices using operational intensity analysis. The study reveals that sequence length significantly impacts performance, model depth causes efficiency regression, and structural improvements like Multi-head Latent Attention can unlock better hardware utilization.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠 Researchers developed a three-stage framework using Small Language Models (SLMs) to automatically translate natural language queries into Kusto Query Language (KQL) for cybersecurity operations. The approach achieves high accuracy (98.7% syntax, 90.6% semantic) while reducing costs by up to 10x compared to GPT-4, potentially solving bottlenecks in Security Operations Centers.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠 Researchers propose Struct-SQL, a knowledge distillation framework that improves Small Language Models for Text-to-SQL tasks by using structured Chain-of-Thought reasoning instead of unstructured approaches. The method achieves an 8.1% improvement over baseline distillation, primarily by reducing syntactic errors through formal query execution plan blueprints.
AI · Neutral · arXiv – CS AI · Feb 27 · 4/10
🧠 Researchers benchmarked small language models (SLMs) for leader-follower role classification in human-robot interaction, finding that fine-tuned Qwen2.5-0.5B achieves 86.66% accuracy with 22.2ms latency. The study demonstrates SLMs can effectively handle real-time role assignment for resource-constrained robots, though performance degrades with increased dialogue complexity.