y0news
#nlp · 6 articles
AI Bullish · arXiv – CS AI · 6h ago

Toward General Semantic Chunking: A Discriminative Framework for Ultra-Long Documents

Researchers developed a discriminative model based on Qwen3-0.6B that efficiently segments ultra-long documents of up to 13k tokens for better information retrieval. It outperforms generative alternatives while delivering roughly two orders of magnitude faster inference on the Wikipedia-derived WIKI-727K dataset.
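The model itself isn't reproduced here. As a minimal sketch of the discriminative framing, the idea is to score each candidate boundary and split where the score indicates a topic shift; the stand-in below uses bag-of-words cosine similarity in place of the learned Qwen3-0.6B scorer (the threshold and scoring heuristic are illustrative assumptions, not the paper's method):

```python
from collections import Counter
import math

def vec(sentence):
    """Bag-of-words term counts (stand-in for a learned embedding)."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def chunk(sentences, threshold=0.1):
    """Discriminative-style chunking: classify each sentence boundary
    as split/no-split; here the 'classifier' is a similarity threshold."""
    chunks, current = [], [sentences[0]]
    for sent in sentences[1:]:
        if cosine(vec(current[-1]), vec(sent)) < threshold:
            chunks.append(current)   # low similarity: start a new chunk
            current = [sent]
        else:
            current.append(sent)
    chunks.append(current)
    return chunks
```

A trained discriminative model would replace the cosine test with a per-boundary split probability, which is what makes inference a single forward pass per boundary instead of autoregressive generation.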

AI Bullish · arXiv – CS AI · 6h ago

Pseudo Contrastive Learning for Diagram Comprehension in Multimodal Models

Researchers propose a new training method called pseudo contrastive learning to improve diagram comprehension in multimodal AI models like CLIP. The approach uses synthetic diagram samples to help models better understand fine-grained structural differences in diagrams, showing significant improvements in flowchart understanding tasks.
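The summary doesn't give the exact objective, so the sketch below shows the standard symmetric InfoNCE loss that CLIP-style models train with; in pseudo contrastive learning, synthetic diagram variants would supply the hard in-batch negatives (the embeddings and temperature here are illustrative):

```python
import numpy as np

def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric CLIP-style contrastive loss: matched (image, text) rows
    are positives; every other row in the batch is a negative."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)        # numerical stability
        p = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        idx = np.arange(len(l))
        return -np.log(p[idx, idx]).mean()          # diagonal = positives

    return (xent(logits) + xent(logits.T)) / 2
```

With synthetic near-duplicate diagrams in the batch, this loss penalizes the model unless it separates fine-grained structural differences, which is the effect the paper targets.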

AI Neutral · arXiv – CS AI · 6h ago

LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering

Researchers released LFQA-HP-1M, a dataset with 1.3 million human preference annotations for evaluating long-form question answering systems. The study introduces nine quality rubrics and shows that simple linear models can match advanced LLM evaluators while exposing vulnerabilities in current evaluation methods.
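To make the "simple linear models" finding concrete, here is a toy sketch: logistic regression on the difference of nine rubric scores between two candidate answers. All data and the hidden rubric weighting are simulated for illustration, not drawn from LFQA-HP-1M:

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_rubrics = 200, 9

# simulated rubric scores in [0, 1] for answers A and B of each pair
a = rng.random((n_pairs, n_rubrics))
b = rng.random((n_pairs, n_rubrics))

# a hidden ground-truth weighting of the rubrics drives the preference label
hidden_w = rng.normal(size=n_rubrics)
y = ((a - b) @ hidden_w > 0).astype(float)   # 1 means answer A preferred

# logistic regression on rubric-score differences, plain gradient descent
x = a - b
w = np.zeros(n_rubrics)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(x @ w)))
    w -= 0.5 * x.T @ (p - y) / n_pairs

accuracy = float((((x @ w) > 0) == (y == 1)).mean())
```

Because the label here is a linear function of the rubric gaps, a linear model recovers it; the paper's claim is that real preference data behaves similarly once informative rubric scores are available.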

AI Neutral · arXiv – CS AI · 6h ago

Task Complexity Matters: An Empirical Study of Reasoning in LLMs for Sentiment Analysis

A comprehensive study of 504 model configurations finds that the benefit of reasoning in large language models is highly task-dependent: reasoning degrades performance on simple tasks like binary sentiment classification by up to 19.9 percentage points, while improving complex 27-class emotion recognition by up to 16.0 points. The results challenge the assumption that reasoning universally improves performance across language tasks.

AI Bullish · arXiv – CS AI · 6h ago

Task-Centric Acceleration of Small-Language Models

Researchers propose TASC (Task-Adaptive Sequence Compression), a framework for accelerating small language models via two methods: TASC-ft, which fine-tunes with an expanded vocabulary, and TASC-spec, a training-free speculative-decoding variant. Both improve inference efficiency while maintaining task performance on generation tasks with low output variability.
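TASC-spec's specific draft construction isn't detailed above, so the sketch shows the general training-free mechanism such methods build on: greedy speculative decoding, where a cheap draft model proposes a block of tokens and the target model verifies them. The toy next-token functions are illustrative stand-ins for real models:

```python
def speculative_decode(draft, target, prompt, k=4, max_len=12):
    """Greedy speculative decoding: the draft proposes k tokens per round;
    the target verifies them in order. The agreed prefix is kept, and the
    first disagreement is replaced by the target's own choice, so every
    round commits at least one target-approved token."""
    seq = list(prompt)
    while len(seq) < max_len:
        # draft proposes k tokens autoregressively
        proposal, ctx = [], list(seq)
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # target verifies the proposal token by token
        accepted, ctx = [], list(seq)
        for t in proposal:
            want = target(ctx)
            if want == t:
                accepted.append(t)
                ctx.append(t)
            else:
                accepted.append(want)   # target overrides, round ends
                break
        else:
            accepted = proposal         # every draft token was approved
        seq.extend(accepted)
    return seq[:max_len]

# toy models: target increments mod 10; draft errs after token 5
target = lambda ctx: (ctx[-1] + 1) % 10
draft = lambda ctx: 0 if ctx[-1] == 5 else (ctx[-1] + 1) % 10

out = speculative_decode(draft, target, [3])  # identical to decoding with target alone
```

For tasks with low output variability the draft agrees with the target often, so most rounds accept all k tokens and the target model runs far fewer sequential steps.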

AI Neutral · arXiv – CS AI · 6h ago

ARGUS: Seeing the Influence of Narrative Features on Persuasion in Argumentative Texts

Researchers introduce ARGUS, a framework for studying how narrative features influence persuasion in online arguments. The study analyzes a ChangeMyView corpus using both traditional classifiers and large language models to identify which storytelling elements make arguments more convincing.