y0news

#llm News & Analysis

956 articles tagged with #llm. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 9 · 4/10

Transforming Agency. On the mode of existence of Large Language Models

A new academic paper analyzes the ontological nature of Large Language Models like ChatGPT, concluding they are not autonomous agents but rather 'linguistic automatons' or 'libraries-that-talk' that lack true agency. The research argues that LLMs fail to meet key conditions for autonomous agency including individuality, normativity, and interactional asymmetry, while still enabling new forms of human-machine interaction.

ChatGPT
AI · Neutral · arXiv – CS AI · Mar 9 · 4/10

Conditioning LLMs to Generate Code-Switched Text

Researchers developed a methodology to fine-tune large language models (LLMs) for generating code-switched text between English and Spanish by back-translating natural code-switched sentences into monolingual English. The study found that fine-tuning significantly improves LLMs' ability to generate fluent code-switched text, and that LLM-based evaluation methods align better with human preferences than traditional metrics.

AI · Neutral · arXiv – CS AI · Mar 6 · 4/10

Towards automated data analysis: A guided framework for LLM-based risk estimation

Researchers propose a new framework that combines Large Language Models with human supervision for automated dataset risk estimation. The approach aims to address limitations of manual auditing and AI hallucinations by having LLMs identify database properties and generate analysis code under human guidance.

AI · Neutral · arXiv – CS AI · Mar 6 · 4/10

Legal interpretation and AI: from expert systems to argumentation and LLMs

This research paper examines how AI and Law research has evolved in approaching legal interpretation through three main methodologies: expert systems for knowledge engineering, argumentation frameworks for assessing interpretive claims, and machine learning models including LLMs for automated legal argument generation.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Researchers trained a compact 1.5B-parameter language model to solve beam physics problems using reinforcement learning with verifiable rewards, achieving a 66.7% improvement in accuracy. However, the model learned pattern-matching templates rather than true physics reasoning: it failed to generalize to topological changes even though the underlying equations were the same ones it had mastered.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

A benchmark study compares Token-Oriented Object Notation (TOON) with JSON for structured data serialization in LLMs, finding that while TOON reduces token usage, plain JSON shows better accuracy overall. The research reveals that TOON's efficiency benefits may only emerge at scale where syntax savings offset the initial prompt overhead.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

How does fine-tuning improve sensorimotor representations in large language models?

A research study reveals that fine-tuning Large Language Models can bridge the 'embodiment gap' by aligning their representations with human sensorimotor experiences. The improvements generalize across languages and related sensory dimensions but are highly dependent on the specific learning objective used.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Bridging Pedagogy and Play: Introducing a Language Mapping Interface for Human-AI Co-Creation in Educational Game Design

Researchers developed a web tool that uses natural language as the primary interface for LLM-assisted educational game design, allowing instructors to collaborate with AI to create games with specific learning outcomes. The tool maps pedagogy to gameplay through four linked components while maintaining human agency in critical design decisions.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints

SpotIt+ is a new open-source tool that evaluates Text-to-SQL systems through verification-based testing, actively searching for database instances that reveal differences between generated and ground truth SQL queries. The tool incorporates constraint-mining that combines rule-based specification mining with LLM validation to generate more realistic test scenarios.
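The idea of searching for a database instance that separates two queries can be illustrated with a small brute-force sketch. This is not SpotIt+'s algorithm or API: the `emp` table, the value domain, the row bound, and the `constraint` callback (standing in for mined database constraints) are all assumptions made for illustration.

```python
import itertools
import sqlite3
from collections import Counter


def run(query, rows):
    """Run a query against a fresh in-memory emp(id, salary) table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE emp (id INTEGER, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
        # Compare results as multisets so that row order does not matter.
        return Counter(conn.execute(query).fetchall())
    finally:
        conn.close()


def find_counterexample(gold_sql, generated_sql, max_rows=2,
                        salaries=(0, 1, 2), constraint=lambda rows: True):
    """Enumerate small table instances; return the first one on which the queries disagree."""
    for n in range(max_rows + 1):
        for combo in itertools.product(salaries, repeat=n):
            rows = [(i + 1, s) for i, s in enumerate(combo)]
            if not constraint(rows):
                continue  # skip instances that violate the assumed constraint
            if run(gold_sql, rows) != run(generated_sql, rows):
                return rows
    return None


gold = "SELECT id FROM emp WHERE salary > 1"
generated = "SELECT id FROM emp WHERE salary >= 1"
witness = find_counterexample(gold, generated)
```

Passing a `constraint` such as "every salary is at least 2" removes the distinguishing instance here, which shows in miniature why constraints change which differences count as realistic.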

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living

Researchers have released MuRAL, a new dataset containing over 21 hours of multi-resident smart home sensor data with natural language annotations for training AI models. The dataset aims to improve Large Language Models' ability to understand human activities in complex smart home environments, though current LLMs still struggle with key tasks like resident identification and activity prediction.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

Researchers propose RLJP, a new framework for Legal Judgment Prediction that combines first-order logic rules with large language models to improve AI-based legal decision making. The system uses a three-stage approach including Confusion-aware Contrastive Learning to dynamically optimize judgment rules and showed superior performance on public datasets.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

When Relevance Meets Novelty: Dual-Stable Periodic Optimization for Serendipitous Recommendation

Researchers propose Co-Evolutionary Alignment (CoEA), a new recommendation system method that uses dual large language models to balance relevant and novel content suggestions. The system addresses traditional recommendation bias through dynamic optimization that considers both long-term group identity and short-term individual preferences.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

CareMedEval dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field

Researchers introduce CareMedEval, a new dataset of 534 questions based on 37 scientific articles, designed to evaluate large language models' ability to perform critical appraisal in biomedical contexts. Testing reveals that current AI models struggle with this specialized reasoning task, reaching an exact match rate of only 0.5 even with advanced prompting techniques.
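For readers unfamiliar with the metric, an exact match rate can be sketched as follows. The summary does not describe the question format, so this assumes each question has a set of correct options and that a prediction counts only when it matches that set exactly; the function name and the sample data are hypothetical.

```python
def exact_match_rate(predictions, gold):
    """Fraction of questions whose predicted option set equals the gold set exactly."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("at least one question is required")
    hits = sum(1 for p, g in zip(predictions, gold) if set(p) == set(g))
    return hits / len(gold)


# Made-up answers for four questions: a partially correct set earns no credit.
gold_answers = [{"A", "C"}, {"B"}, {"A", "B", "D"}, {"E"}]
model_answers = [{"A", "C"}, {"B"}, {"A", "B"}, {"D"}]
rate = exact_match_rate(model_answers, gold_answers)
```

Under this assumed definition, the third question scores zero despite two of three options being right, which is what makes exact match a strict measure.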

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

Researchers developed a novel approach using instruction-tuned Large Language Models to improve argumentative component detection in text analysis. The method reframes the task as language generation rather than traditional sequence labeling, achieving superior performance on standard benchmarks compared to existing state-of-the-art systems.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation

Researchers have developed AnchorDrive, a two-stage AI framework that combines large language models with diffusion models to generate realistic safety-critical scenarios for autonomous driving systems. The system uses LLMs for controllable scenario generation based on natural language instructions, then employs diffusion models to create realistic driving trajectories.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

A Directed Graph Model and Experimental Framework for Design and Study of Time-Dependent Text Visualisation

Researchers developed a framework to study how people interpret time-dependent text visualizations using directed graph models and synthetic data generated by LLMs. The study found that users struggle to identify predefined patterns in text relationships, suggesting visualization tools may need personalized approaches rather than one-size-fits-all solutions.

AI · Bullish · arXiv – CS AI · Mar 4 · 4/10 · 3

Sensory-Aware Sequential Recommendation via Review-Distilled Representations

Researchers propose ASEGR, a novel AI framework that enhances product recommendation systems by extracting sensory attributes from user reviews using large language models. The system uses a two-stage pipeline where an LLM extracts structured sensory data which is then distilled into compact embeddings for sequential recommendation models.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 6

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking

Researchers propose WKGFC, a new AI system that uses knowledge graphs and multi-agent retrieval to improve fact-checking accuracy. The system addresses limitations of current methods that rely on textual similarity by implementing an automated Markov Decision Process with LLM agents to retrieve and verify evidence from multiple sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 5/10 · 8

Beyond Static Instruction: A Multi-agent AI Framework for Adaptive Augmented Reality Robot Training

Researchers developed a multi-agent AI framework for adaptive Augmented Reality robot training that uses Large Language Models to dynamically adjust learning environments based on individual cognitive profiles. The system processes multimodal inputs, including voice, physiology, and robot data, to personalize industrial robot training in real time.

AI · Bullish · arXiv – CS AI · Mar 3 · 5/10 · 5

Iterative LLM-based improvement for French Clinical Interview Transcription and Speaker Diarization

Researchers developed a multi-pass LLM post-processing system that improves French clinical speech transcription by alternating between speaker recognition and word recognition passes. The system significantly reduced word error rate on suicide prevention conversations while remaining stable on neurosurgery consultations, at a computational cost feasible for clinical deployment.
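Word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below uses that standard definition; it is not taken from the paper, and the function name, whitespace tokenization, and sample sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("feels" -> "feel") and one deletion ("today") over five reference words.
wer = word_error_rate("the patient feels better today", "the patient feel better")
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.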