y0news

#steering News & Analysis

2 articles tagged with #steering. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 4 · 6/10 · 2

How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

Researchers introduce SteerEval, a new benchmark for evaluating how controllable Large Language Models are across language features, sentiment, and personality domains. The study reveals that current steering methods often fail at finer-grained control levels, highlighting significant risks when deploying LLMs in socially sensitive applications.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

Researchers present a unified framework for understanding how different methods control large language models (including fine-tuning, LoRA, and activation interventions), revealing a fundamental trade-off between steering strength and output quality. The analysis explains this through an activation manifold perspective and introduces SPLIT, a new steering method that improves control while better preserving model coherence.