y0news

#llm News & Analysis

956 articles tagged with #llm. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4
🧠

Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

Researchers propose Collab-REC, a multi-agent LLM framework for tourism recommendations that uses three specialized agents (Personalization, Popularity, and Sustainability) with a moderator to reduce popularity bias and increase diversity. The system successfully surfaces lesser-visited destinations and addresses over-tourism concerns through balanced, multi-perspective recommendations.
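The moderated multi-agent pattern described above can be sketched in a few lines. This is an illustrative toy, not the authors' code: the three scoring functions, the weighted-consensus moderator, and the example destinations are all assumptions standing in for the paper's LLM agents.

```python
from collections import defaultdict

# Each "agent" scores candidate destinations from one perspective.
def personalization_scores(user_interests, candidates):
    return {c["name"]: user_interests.get(c["theme"], 0.0) for c in candidates}

def popularity_scores(candidates):
    top = max(c["visits"] for c in candidates)
    return {c["name"]: c["visits"] / top for c in candidates}

def sustainability_scores(candidates):
    # Less-visited places score higher, countering over-tourism.
    top = max(c["visits"] for c in candidates)
    return {c["name"]: 1.0 - c["visits"] / top for c in candidates}

def moderate(score_maps, weights):
    # Moderator: weighted consensus across the three perspectives.
    combined = defaultdict(float)
    for scores, w in zip(score_maps, weights):
        for name, s in scores.items():
            combined[name] += w * s
    return sorted(combined, key=combined.get, reverse=True)

candidates = [
    {"name": "Venice",    "theme": "culture", "visits": 30_000_000},
    {"name": "Matera",    "theme": "culture", "visits": 700_000},
    {"name": "Dolomites", "theme": "nature",  "visits": 5_000_000},
]
user = {"culture": 0.9, "nature": 0.4}
ranking = moderate(
    [personalization_scores(user, candidates),
     popularity_scores(candidates),
     sustainability_scores(candidates)],
    weights=[0.5, 0.2, 0.3],
)
print(ranking)  # the lesser-visited Matera surfaces above Venice
```

With these weights, the sustainability perspective lifts the lesser-visited destination past the popular one, which is the balancing effect the framework targets.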

AI · Bullish · arXiv – CS AI · Mar 3 · 4/10 · 3
🧠

Token-Efficient Item Representation via Images for LLM Recommender Systems

Researchers propose I-LLMRec, a new method for AI recommender systems that uses images instead of lengthy text descriptions to represent items, reducing computational token usage while maintaining recommendation quality. The approach leverages the information overlap between images and descriptions to create more efficient and robust LLM-based recommendation systems.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4
🧠

Knowledge-Based Design Requirements for Generative Social Robots in Higher Education

Researchers identify 12 knowledge-based design requirements for generative social robots in higher education, categorized into self-knowledge, user-knowledge, and context-knowledge. The study addresses risks like hallucinations and overreliance in AI tutoring systems through interviews with university students and lecturers.

AI · Neutral · Apple Machine Learning · Mar 3 · 5/10 · 3
🧠

Learning to Reason for Hallucination Span Detection

Researchers are developing new methods to detect hallucinations in large language models by identifying specific spans of unsupported content rather than making binary decisions. The study evaluates Chain-of-Thought reasoning approaches to improve the complex multi-step process of hallucination span detection in LLMs.
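Span-level detection differs from the usual binary "hallucinated or not" call: the system must localize the unsupported content. A minimal sketch of the task, assuming a toy word-overlap heuristic in place of the LLM judge the paper studies (the 0.5 threshold is an arbitrary assumption):

```python
import re

def unsupported_spans(source, answer, threshold=0.5):
    # Flag answer sentences whose content shares little vocabulary
    # with the source document. A real detector would use an LLM judge,
    # possibly with chain-of-thought reasoning per span.
    src_words = set(re.findall(r"\w+", source.lower()))
    spans = []
    for sent in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = set(re.findall(r"\w+", sent.lower()))
        overlap = len(words & src_words) / max(len(words), 1)
        if overlap < threshold:
            spans.append(sent)
    return spans

source = "The Eiffel Tower is in Paris. It was completed in 1889."
answer = "The Eiffel Tower is in Paris. It was designed by aliens."
print(unsupported_spans(source, answer))  # → ['It was designed by aliens.']
```

The point of the toy is the output shape: a list of offending spans rather than a single yes/no label, which is what makes the task a multi-step localization problem.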

AI · Bullish · arXiv – CS AI · Mar 2 · 5/10 · 6
🧠

ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation

Researchers developed ProductResearch, a multi-agent AI framework that creates synthetic training data to improve e-commerce shopping agents. The system uses multiple AI agents to generate comprehensive product research trajectories, with experiments showing a compact model fine-tuned on this synthetic data significantly outperforming base models in shopping assistance tasks.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 7
🧠

HotelQuEST: Balancing Quality and Efficiency in Agentic Search

Researchers introduce HotelQuEST, a new benchmark for evaluating agentic search systems that balances quality and efficiency metrics. The study reveals that while LLM-based agents achieve higher accuracy than traditional retrievers, they incur substantially higher costs due to redundant operations and poor optimization.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 4
🧠

Terminology Rarity Predicts Catastrophic Failure in LLM Translation of Low-Resource Ancient Languages: Evidence from Ancient Greek

A study evaluated large language models (Claude, Gemini, ChatGPT) translating Ancient Greek texts, finding high performance on previously translated works (95.2/100) but declining quality on untranslated technical texts (79.9/100). Terminology rarity was identified as a strong predictor of translation failure, with rare terms causing catastrophic performance drops.
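A rarity signal of the kind the study describes can be approximated with corpus frequencies. This is a hedged sketch, not the paper's metric: the smoothing scheme and the toy frequency table are assumptions, but the idea — rare technical vocabulary predicts elevated failure risk — follows the finding above.

```python
import math

def rarity_score(terms, corpus_counts, corpus_total):
    # Mean negative log relative frequency with add-one smoothing;
    # higher values mean rarer terminology, i.e. higher failure risk.
    return sum(-math.log((corpus_counts.get(t, 0) + 1) / (corpus_total + 1))
               for t in terms) / len(terms)

counts = {"logos": 120, "polis": 80, "kosmos": 60}  # toy frequency table
common = rarity_score(["logos", "polis"], counts, 10_000)
rare = rarity_score(["organon", "entelecheia"], counts, 10_000)
print(common < rare)  # unseen technical terms score as riskier
```

Such a score could be computed over a passage before translation to flag texts likely to fall into the low-quality regime the study observed.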

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 7
🧠

Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges

A research position paper examines the integration of Large Language Models (LLMs) in agent-based social simulations, highlighting both opportunities and limitations. The study proposes Hybrid Constitutional Architectures that combine classical agent-based models with small language models and LLMs to balance expressive flexibility with analytical transparency.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 9
🧠

From Moderation to Mediation: Can LLMs Serve as Mediators in Online Flame Wars?

Researchers explore using large language models (LLMs) as mediators rather than just moderators in online conflicts, developing a framework that combines judgment evaluation and empathetic intervention. Their study using Reddit data shows API-based models outperform open-source alternatives in de-escalating flame wars and fostering constructive dialogue.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 7
🧠

User Misconceptions of LLM-Based Conversational Programming Assistants

Researchers analyzed user misconceptions about LLM-based programming assistants like ChatGPT, finding users often have misplaced expectations about web access, code execution, and debugging capabilities. The study examined Python programming conversations from the WildChat dataset and identified the need for clearer communication of tool capabilities to prevent over-reliance and unproductive practices.

AI · Neutral · arXiv – CS AI · Mar 2 · 5/10 · 5
🧠

LEC-KG: An LLM-Embedding Collaborative Framework for Domain-Specific Knowledge Graph Construction -- A Case Study on SDGs

Researchers developed LEC-KG, a new framework that combines Large Language Models with Knowledge Graph Embeddings to better extract and structure information from unstructured text. The system was tested on Chinese Sustainable Development Goal reports and showed significant improvements over traditional LLM approaches, particularly for identifying rare relationships in domain-specific content.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 6
🧠

From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation

Researchers evaluated Large Language Models' ability to generate parallel code across three programming frameworks (OpenMP, C++, HPX) using different input prompts. The study found LLMs show varying performance depending on problem complexity and framework, revealing both capabilities and limitations in high-performance computing applications.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3
🧠

TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion

Researchers introduce TabDLM, a new AI framework that generates synthetic tabular data containing both numerical values and free-form text using joint numerical-language diffusion models. The approach addresses limitations of existing diffusion and LLM-based methods by combining masked diffusion for text with continuous diffusion for numbers, enabling better synthetic data generation for privacy and data augmentation applications.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 4
🧠

Instruction-based Image Editing with Planning, Reasoning, and Generation

Researchers propose a new multi-modality approach for instruction-based image editing that combines Chain-of-Thought planning, region reasoning, and generation capabilities. The method uses large language models and diffusion models to improve complex image editing tasks compared to existing single-modality approaches.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 6
🧠

LLM4AD: A Platform for Algorithm Design with Large Language Model

Researchers have introduced LLM4AD, a unified Python platform that leverages large language models for algorithm design across optimization, machine learning, and scientific discovery domains. The platform features modular components, comprehensive evaluation tools, and extensive support resources including tutorials and a graphical user interface to facilitate LLM-assisted algorithm development.

AI · Bullish · Apple Machine Learning · Feb 27 · 4/10 · 3
🧠

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Researchers developed a method to improve app store search relevance by using large language models to generate textual relevance judgments, addressing the scarcity of expert-labeled data. A specialized fine-tuned model significantly outperformed general-purpose LLMs in evaluating semantic fit between queries and results.

AI · Neutral · Apple Machine Learning · Feb 24 · 5/10 · 3
🧠

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining

Researchers investigate whether using a single HTML-to-text extractor for web-scale LLM pretraining datasets leads to suboptimal data utilization. The study reveals that different extractors can result in substantially different pages surviving filtering pipelines, despite similar model performance on standard language tasks.
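The extractor-divergence effect is easy to reproduce in miniature. The sketch below uses two deliberately simple stdlib extractors (these are illustrative stand-ins, not the extractors the study compares): a DOM-aware one that drops script/style content, and a naive tag-stripper that keeps it. The same page can then pass a length filter under one extractor and fail under the other.

```python
from html.parser import HTMLParser
import re

class TextExtractor(HTMLParser):
    # DOM-aware extractor: keeps text nodes, skips script/style content.
    def __init__(self):
        super().__init__()
        self.skip = 0
        self.parts = []
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
    def handle_data(self, data):
        if not self.skip:
            self.parts.append(data)

def extract_dom(html):
    p = TextExtractor()
    p.feed(html)
    return " ".join(" ".join(p.parts).split())

def extract_naive(html):
    # Naive extractor: strips tags but keeps script/style payloads as "text".
    return " ".join(re.sub(r"<[^>]+>", " ", html).split())

page = "<html><script>var x=1;</script><p>Short note.</p></html>"
keep = lambda text: len(text.split()) >= 3  # toy min-length quality filter
print(keep(extract_naive(page)), keep(extract_dom(page)))  # True False
```

Scaled to web-scale corpora, this is exactly the phenomenon the study reports: extractor choice silently changes which pages survive the filtering pipeline.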

AI · Neutral · Apple Machine Learning · Feb 24 · 4/10 · 3
🧠

The Potential of CoT for Reasoning: A Closer Look at Trace Dynamics

Researchers conducted an in-depth analysis of Chain-of-thought (CoT) prompting traces from competition-level mathematics questions to understand how different parts of CoT contribute to final answers. The study aims to clarify the driving forces behind CoT reasoning success in large language models, examining trace dynamics to better understand this widely-used AI reasoning technique.

AI · Neutral · Hugging Face Blog · Dec 11 · 4/10 · 6
🧠

New in llama.cpp: Model Management

The article title indicates new model management features have been added to llama.cpp, but the article body appears to be empty or unavailable. Without the actual content, specific details about the new functionality cannot be determined.

AI · Neutral · Hugging Face Blog · Sep 22 · 4/10 · 7
🧠

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

The article title mentions SyGra as a one-stop framework for building data for Large Language Models (LLMs) and Small Language Models (SLMs). However, no article body content was provided to analyze the specific details, features, or implications of this framework.

AI · Neutral · Hugging Face Blog · Sep 10 · 5/10 · 6
🧠

Jupyter Agents: training LLMs to reason with notebooks

The article appears to discuss Jupyter Agents, a system for training large language models to perform reasoning tasks using computational notebooks. However, the article body was not provided in the input, limiting the ability to provide detailed analysis.

AI · Neutral · Synced Review · Aug 14 · 4/10 · 8
🧠

Which Agent Causes Task Failures and When? Researchers from PSU and Duke Explore Automated Failure Attribution in LLM Multi-Agent Systems

Researchers from Penn State University and Duke University are exploring automated failure attribution in LLM Multi-Agent Systems to identify which agents cause task failures and when. The study addresses a common issue where multi-agent systems fail to complete tasks despite high activity levels, aiming to improve system reliability and debugging.

Page 36 of 39