956 articles tagged with #llm. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · Hugging Face Blog · May 31 · 6/10 · 6
🧠Hugging Face has launched an LLM Inference Container for Amazon SageMaker, enabling easier deployment and scaling of large language models on AWS infrastructure. This integration streamlines the process for developers to host and serve AI models in production environments.
AI · Bullish · Hugging Face Blog · Apr 26 · 6/10 · 4
🧠Databricks announces partnership with Hugging Face to accelerate Large Language Model training and tuning by up to 40%. This collaboration aims to optimize AI model development workflows and reduce computational costs for enterprises working with LLMs.
AI · Bullish · Hugging Face Blog · Mar 9 · 6/10 · 7
🧠Judging from its title, the article covers fine-tuning a 20-billion-parameter language model with Reinforcement Learning from Human Feedback (RLHF) on consumer-grade hardware with just 24GB of GPU memory; the article body was not available for analysis.
AI · Bullish · Hugging Face Blog · Sep 16 · 6/10 · 6
🧠The article discusses optimizations for running BLOOM inference using DeepSpeed and Accelerate frameworks to achieve significantly faster performance. This represents technical advances in making large language model inference more efficient and accessible.
AI · Neutral · OpenAI News · Jul 25 · 6/10 · 6
🧠The article presents a framework for analyzing potential hazards and risks associated with large language models that generate code. This research addresses growing concerns about AI-generated code safety and reliability as LLMs become more widely adopted for software development tasks.
AI · Neutral · OpenAI News · Mar 3 · 5/10 · 4
🧠OpenAI is seeking researchers to study the economic impacts of large language models through an expression of interest call. This research initiative aims to better understand how AI technologies affect economic systems and markets.
AI · Neutral · arXiv – CS AI · Apr 7 · 4/10
🧠Researchers developed a privacy-preserving AI system that analyzes classroom videos to understand student engagement using pose detection and gaze tracking, with data processed by the QwQ-32B-Reasoning LLM. The system deletes original video frames and retains only geometric coordinates to comply with FERPA privacy regulations.
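The privacy design described above — extract geometric coordinates, then discard the raw frames — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the keypoint extractor is a dummy stand-in for a real pose model (e.g. OpenPose or MediaPipe), and all names are invented.

```python
# Sketch of a FERPA-minded pipeline: each frame is reduced to pose
# keypoint coordinates, and the pixel data never leaves the function.

def extract_keypoints(frame):
    """Stand-in for a real pose detector: returns (x, y) joint coordinates.
    A real system would call a pose-estimation model here."""
    height = len(frame)
    width = len(frame[0]) if frame else 0
    # Dummy geometry derived only from frame dimensions, for illustration.
    return [(width // 2, height // 3), (width // 2, 2 * height // 3)]

def process_frames(frames):
    records = []
    for frame in frames:
        # Only geometric records are retained; the frame itself is dropped
        # as soon as coordinates are extracted.
        records.append({"keypoints": extract_keypoints(frame)})
    return records

fake_frames = [[[0] * 64 for _ in range(48)] for _ in range(3)]
records = process_frames(fake_frames)
print(len(records), records[0]["keypoints"])
```

The key property is that downstream storage only ever sees coordinate tuples, never imagery.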
AI · Neutral · arXiv – CS AI · Apr 7 · 5/10
🧠Researchers developed TRACE, a framework to evaluate how LLMs allocate trust between conflicting software artifacts like code, documentation, and tests. The study found that current LLMs are better at identifying natural-language specification issues than detecting subtle code-level problems, with models showing systematic blind spots when implementations drift while documentation remains plausible.
AI · Neutral · arXiv – CS AI · Apr 7 · 5/10
🧠Researchers developed an automated framework using large language models to compare AI safety policy documents across a shared taxonomy of activities. The study found that model choice significantly affects comparison outcomes, with some document pairs showing high disagreement across different LLMs, though human expert evaluation showed high inter-annotator agreement.
AI · Neutral · arXiv – CS AI · Apr 7 · 5/10
🧠Researchers found that large language models (LLMs) have an asymmetry between their internal knowledge and prompted responses when detecting analogies. While probing reveals models understand rhetorical analogies better than their prompted responses suggest, both methods perform poorly on narrative analogies requiring deeper abstraction.
AI · Neutral · arXiv – CS AI · Apr 7 · 4/10
🧠Researchers have developed QualAnalyzer, an open-source Chrome extension that makes AI-assisted qualitative research more transparent by preserving detailed audit trails of LLM analysis processes. The tool processes data segments independently and maintains records of prompts, inputs, and outputs to enable systematic comparison between AI and human judgments.
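The audit-trail idea above — record every prompt, input segment, and output so AI and human judgments can be compared later — can be sketched as a thin logging wrapper. The model function and field names here are hypothetical stand-ins, not QualAnalyzer's actual code.

```python
import json

# Every "LLM" call appends a full record to the audit log, in the spirit
# of preserving transparent audit trails for qualitative analysis.
AUDIT_LOG = []

def fake_llm(prompt, segment):
    """Deterministic stand-in for a real LLM call."""
    return f"theme: {segment.split()[0].lower()}"

def analyze_segment(prompt, segment, model=fake_llm):
    output = model(prompt, segment)
    AUDIT_LOG.append({"prompt": prompt, "input": segment, "output": output})
    return output

analyze_segment("Identify the main theme.", "Trust in automation varies widely.")
print(json.dumps(AUDIT_LOG, indent=2))
```

Processing each segment independently, as the summary describes, also keeps every log entry self-contained and auditable on its own.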
AI · Neutral · arXiv – CS AI · Apr 7 · 4/10
🧠Researchers have developed discourse_simulator, an open-source Python framework that combines large language models with agent-based modeling to simulate how public attitudes change over time in response to real-world events. The framework models social media interactions and opinion dynamics through AI agents in social networks, offering a new tool for social science research on attitude polarization and belief evolution.
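Opinion dynamics of the kind discourse_simulator models can be illustrated with a toy agent-based loop: each agent nudges its attitude toward the mean of its neighbors. The network, update rule, and values below are invented for illustration, and no LLM is involved — in the real framework, LLM agents would generate the interactions.

```python
# Toy agent-based opinion dynamics on a tiny social network.
NETWORK = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
opinions = {"a": 0.9, "b": 0.5, "c": 0.1}

def step(opinions, network, rate=0.5):
    """Each agent moves a fraction of the way toward its neighbors' mean."""
    new = {}
    for agent, neighbors in network.items():
        neighbor_mean = sum(opinions[n] for n in neighbors) / len(neighbors)
        new[agent] = opinions[agent] + rate * (neighbor_mean - opinions[agent])
    return new

for _ in range(10):
    opinions = step(opinions, NETWORK)
print(round(opinions["a"], 3), round(opinions["c"], 3))
```

With this symmetric setup, the two extreme agents converge toward the moderate middle — a simple consensus dynamic; polarization studies typically add mechanisms such as bounded confidence or homophily on top.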
AI · Neutral · arXiv – CS AI · Apr 7 · 5/10
🧠Paper Espresso is an open-source platform that uses large language models to automatically discover, summarize, and analyze trending arXiv papers to help researchers manage information overload. Over 35 months, it has processed over 13,300 papers and revealed key trends in AI research, including a surge in reinforcement learning for LLM reasoning and strong correlation between topic novelty and community engagement.
🏢 Hugging Face
AI · Neutral · arXiv – CS AI · Apr 7 · 4/10
🧠Researchers at Trinity College Dublin implemented an AI Teaching Assistant using Retrieval Augmented Generation for a Motion Picture Engineering course, testing it with 43 students over 7 weeks. The study found students rated the AI-TA as beneficial (4.22/5) but preferred human tutoring, while exam performance remained unchanged when AI-TA access was allowed.
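The Retrieval Augmented Generation pattern behind such a teaching assistant can be sketched briefly: retrieve the most relevant course note, then build a prompt grounded in it. The notes and the token-overlap scoring below are invented for illustration; a production system would use embedding search and an actual LLM call.

```python
# Minimal RAG sketch: retrieve by token overlap, then ground the prompt.
COURSE_NOTES = [
    "Frame rate determines how many images are shown per second.",
    "Aspect ratio is the proportional relationship between width and height.",
    "Color grading adjusts the look of footage in post-production.",
]

def retrieve(question, notes):
    """Pick the note sharing the most tokens with the question."""
    q_tokens = set(question.lower().split())
    return max(notes, key=lambda n: len(q_tokens & set(n.lower().split())))

def build_prompt(question, notes):
    context = retrieve(question, notes)
    return f"Answer using only this context:\n{context}\nQuestion: {question}"

prompt = build_prompt("what is aspect ratio", COURSE_NOTES)
print(prompt)
```

Grounding answers in retrieved course material is what lets such an assistant stay on-syllabus rather than free-associating.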
AI · Neutral · arXiv – CS AI · Apr 6 · 4/10
🧠Researchers explored using Contrastive Prompt Tuning (CPT) to improve Large Language Models' ability to generate energy-efficient code, combining contrastive learning with parameter-efficient fine-tuning. The study tested CPT across Python, Java, and C++ on three different models, finding consistent accuracy improvements for two models but variable efficiency gains depending on model, language, and task complexity.
AI · Neutral · arXiv – CS AI · Apr 6 · 4/10
🧠Research reveals that large language models can reproduce the qualitative structure of human social reasoning but struggle to calibrate quantitative magnitudes. Pragmatic prompting strategies that account for speaker knowledge and motives improve this calibration, though fine-grained quantitative accuracy remains an open problem.
AI · Neutral · arXiv – CS AI · Apr 6 · 4/10
🧠The 2nd LLM+Graph Workshop at VLDB 2025 in London focused on integrating large language models with graph-structured data for practical applications. The workshop highlighted key research directions and innovative solutions bridging LLMs, graph data management, and graph machine learning.
AI · Bullish · arXiv – CS AI · Apr 6 · 5/10
🧠Researchers propose a new framework using Large Language Models for causal graph discovery that requires only linear queries instead of quadratic, making it more efficient for larger datasets. The method uses breadth-first search and can incorporate observational data, achieving state-of-the-art results on real-world causal graphs.
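The linear-query idea in the summary above can be illustrated with a BFS that asks one "which variables does X directly cause?" question per discovered node. The oracle below is a hard-coded stand-in for the LLM query, and the example graph is invented.

```python
from collections import deque

# Hypothetical oracle standing in for one LLM query per node: given a
# variable, return the variables it directly causes.
TRUE_GRAPH = {
    "rain": ["wet_ground"],
    "sprinkler": ["wet_ground"],
    "wet_ground": ["slippery"],
    "slippery": [],
}

def children_of(var):
    return TRUE_GRAPH[var]

def bfs_discover(roots):
    """Breadth-first causal discovery: each node is expanded exactly once,
    so the number of oracle queries is linear in the number of nodes."""
    edges, seen, queue = [], set(roots), deque(roots)
    while queue:
        node = queue.popleft()
        for child in children_of(node):  # one "LLM query" per node
            edges.append((node, child))
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return edges

edges = bfs_discover(["rain", "sprinkler"])
print(edges)
```

Expanding each node once is what replaces the quadratic all-pairs "does X cause Y?" querying that pairwise approaches require.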
AI · Neutral · arXiv – CS AI · Apr 6 · 4/10
🧠Researchers developed a two-stage prompt selection strategy for zero-shot text-to-speech synthesis that improves emotional intensity and speaker consistency. The method evaluates prompts using prosodic features, audio quality, and text-emotion coherence in a static stage, then uses textual similarity for dynamic prompt selection during synthesis.
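The two-stage structure described above — an offline static ranking followed by an online similarity-based pick — can be sketched as follows. The prompt pool, scores, and token-overlap similarity are illustrative stand-ins, not the paper's actual features or metric.

```python
# Hedged sketch of two-stage prompt selection for zero-shot TTS.
PROMPTS = [
    {"text": "a calm bedtime story", "prosody": 0.9, "quality": 0.8, "coherence": 0.7},
    {"text": "an excited sports update", "prosody": 0.6, "quality": 0.9, "coherence": 0.8},
    {"text": "a sad farewell speech", "prosody": 0.4, "quality": 0.5, "coherence": 0.6},
]

def static_stage(prompts, top_k=2):
    """Stage 1 (offline): rank prompts by averaged static scores
    (prosodic features, audio quality, text-emotion coherence)."""
    ranked = sorted(
        prompts,
        key=lambda p: (p["prosody"] + p["quality"] + p["coherence"]) / 3,
        reverse=True,
    )
    return ranked[:top_k]

def dynamic_stage(candidates, target_text):
    """Stage 2 (at synthesis time): pick the shortlisted prompt most
    similar to the input text (token overlap as a toy similarity)."""
    target = set(target_text.lower().split())
    return max(candidates,
               key=lambda p: len(target & set(p["text"].lower().split())))

shortlist = static_stage(PROMPTS)
best = dynamic_stage(shortlist, "tell me a bedtime story")
print(best["text"])
```

Splitting the work this way keeps the expensive scoring offline while the per-utterance decision stays cheap.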
AI · Neutral · arXiv – CS AI · Mar 27 · 5/10
🧠Research reveals that Large Language Models (GPT-4 and GPT-5) demonstrate better assessment performance on math problems they can solve correctly versus those they cannot. While math problem-solving expertise supports assessment capabilities, step-level error diagnosis remains more challenging than direct problem solving.
🧠 GPT-4 · 🧠 GPT-5
AI · Neutral · arXiv – CS AI · Mar 27 · 5/10
🧠A research paper introduces metamorphic testing as a solution for testing AI and LLM-integrated software systems. The approach addresses the challenge of unreliable LLM outputs and limited labeled ground truth by using relationships between multiple test executions as test oracles.
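The core move in metamorphic testing — checking a relation between multiple executions instead of comparing one output to labeled ground truth — is easy to sketch. The "model" below is a deterministic keyword stub standing in for an LLM, and the synonym relation is one example of many possible metamorphic relations.

```python
# Metamorphic test sketch: no ground-truth label needed; we only check
# that a semantics-preserving input change does not change the output.
POSITIVE = {"great", "excellent", "good"}
NEGATIVE = {"bad", "awful", "poor"}

def classify_sentiment(text):
    """Deterministic stand-in for an LLM-based classifier."""
    words = set(text.lower().split())
    if words & POSITIVE:
        return "positive"
    if words & NEGATIVE:
        return "negative"
    return "neutral"

def metamorphic_synonym_test(text, word, synonym):
    """Relation: swapping a word for a synonym must not flip the label."""
    original = classify_sentiment(text)
    mutated = classify_sentiment(text.replace(word, synonym))
    return original == mutated

print(metamorphic_synonym_test("the service was great", "great", "excellent"))
```

Because the oracle is a relation between two runs rather than a fixed expected answer, the same test applies even when LLM outputs are open-ended.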
AI · Bullish · arXiv – CS AI · Mar 27 · 4/10
🧠Researchers tested a dual-architecture LLM-based automated scoring system for educational assessments and found it generally robust to construct-irrelevant factors such as meaningless text padding and spelling errors. The study suggests that properly designed LLM-based scoring systems can be reliable, though off-topic responses were heavily penalized.
AI · Neutral · arXiv – CS AI · Mar 27 · 5/10
🧠Research comparing AI models for COVID-19 X-ray diagnosis found that smaller discriminative models like Covid-Net achieve 95.5% accuracy with 99.9% lower carbon footprint than large language models. The study reveals that while LLMs like GPT-4 are versatile, they create disproportionate environmental impact for classification tasks compared to specialized smaller models.
🧠 GPT-4 · 🧠 GPT-4.5 · 🧠 ChatGPT
AI · Neutral · arXiv – CS AI · Mar 26 · 4/10
🧠Researchers developed Konkani LLM, a specialized language model for the low-resource Indian language Konkani, using a synthetic 100k instruction dataset. The model addresses training data scarcity across multiple scripts (Devanagari, Romi, Kannada) and demonstrates competitive performance against proprietary models in machine translation tasks.
🧠 Gemini · 🧠 Llama
AI · Neutral · arXiv – CS AI · Mar 17 · 4/10
🧠Researchers propose a new constraint-based approach to LLM routing that formulates the problem as weighted MaxSAT/MaxSMT optimization, using natural language feedback to create constraints over model attributes. Testing on a 25-model benchmark shows this method can effectively route queries to appropriate LLMs based on user preferences expressed in natural language.
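The routing formulation above can be illustrated without a real solver: natural-language preferences become weighted soft constraints over model attributes, and the router picks the model maximizing total satisfied weight. The brute-force maximization below is a stand-in for a MaxSAT/MaxSMT solver, and all model names, attributes, and weights are invented.

```python
# Toy constraint-based LLM routing: weighted soft constraints over
# model attributes, solved by exhaustive scoring (stand-in for MaxSAT).
MODELS = {
    "small-fast": {"cost": 1, "latency_ms": 50, "reasoning": 2},
    "mid-balanced": {"cost": 3, "latency_ms": 120, "reasoning": 5},
    "large-slow": {"cost": 9, "latency_ms": 400, "reasoning": 9},
}

# (weight, predicate) pairs, e.g. derived from "cheap but capable".
CONSTRAINTS = [
    (5, lambda m: m["cost"] <= 3),          # soft: keep it cheap
    (3, lambda m: m["reasoning"] >= 5),     # soft: decent reasoning
    (1, lambda m: m["latency_ms"] <= 100),  # soft: low latency
]

def route(models, constraints):
    """Return the model name maximizing total satisfied constraint weight."""
    def score(attrs):
        return sum(w for w, pred in constraints if pred(attrs))
    return max(models, key=lambda name: score(models[name]))

choice = route(MODELS, CONSTRAINTS)
print(choice)
```

Encoding preferences as weighted constraints rather than a single score is what lets conflicting wishes ("cheap", "capable", "fast") trade off explicitly.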