#quality-assessment News & Analysis

6 articles tagged with #quality-assessment. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · May 11 · 7/10

Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation

Research reveals that AI models, particularly few-shot large language models, struggle significantly with mid-range quality responses in automated short answer scoring, while fine-tuned models and human experts maintain consistent performance across all quality levels. This degradation raises fairness concerns for students with developing understanding, emphasizing the need for quality-conditioned evaluation metrics.

🧠 GPT-4 · 🧠 GPT-5 · 🧠 Claude
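As a rough illustration of what a quality-conditioned agreement metric could look like, the sketch below computes exact agreement between gold and predicted scores separately for each gold score level. The function name, the integer score scale, and the toy data are assumptions made for this example; the paper's actual metric may differ.

```python
from collections import defaultdict


def agreement_by_quality(gold, predicted):
    """Return the exact-agreement rate for each gold score level.

    Hypothetical helper: `gold` and `predicted` are equal-length
    sequences of integer scores (an assumed scale, not the paper's).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += int(g == p)
    return {level: hits[level] / totals[level] for level in sorted(totals)}


# Toy data: 0 = low, 1 = mid-range, 2 = high quality (illustrative only).
gold = [0, 0, 1, 1, 1, 1, 2, 2]
pred = [0, 0, 1, 0, 2, 1, 2, 2]

rates = agreement_by_quality(gold, pred)
overall = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(rates, overall)
```

On this toy data the overall agreement of 0.75 hides the fact that agreement on mid-range responses is only 0.5, which is the kind of pattern a single aggregate score can mask.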

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Explanations for Automatic Speech Recognition

Researchers have developed explainable AI techniques to improve trust in and understanding of automatic speech recognition (ASR) systems by identifying minimal subsets of audio frames that cause specific transcriptions. The study adapts established XAI methods from image classification and evaluates them on multiple ASR systems, including the Google API and DeepSpeech, using 100 audio samples.

AI · Neutral · arXiv – CS AI · Jun 11 · 5/10

Skill-Augmented AI Agents for Medical Research Analysis: An Exploratory Multi-Model Human Evaluation in an NSCLC Transcriptomic Biomarker Task

Researchers evaluated whether AI agents equipped with specialized medical research skills produce higher-quality outputs than native language models on transcriptomic biomarker analysis tasks. While skill-augmented AI showed directional improvements in expert-rated quality, the gains were modest and within the margin of expert-rating noise, suggesting larger, more rigorous studies are needed.

AI · Neutral · arXiv – CS AI · Jun 1 · 5/10

Fine-grained Verification via Diagnostic Reasoning Supervision for Aspect Sentiment Triplet Extraction

Researchers propose FiVeD, a fine-grained verification framework for Aspect Sentiment Triplet Extraction (ASTE) that improves extraction accuracy by up to 3.53 F1 points through multi-task learning with validity classification, quality scoring, error detection, and rationale generation. The framework addresses a critical gap in ASTE systems by verifying extracted triplets post hoc, enabling adjustable precision-recall tradeoffs for downstream NLP applications.
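To make the idea of an adjustable precision-recall tradeoff concrete, here is a minimal sketch that filters extracted triplets by a verifier score and reports precision and recall at a chosen threshold. The thresholding scheme, the function name, and the scores are assumptions for illustration; the summary does not say how FiVeD exposes this tradeoff.

```python
def precision_recall_at(threshold, scored):
    """Precision and recall after keeping items scored >= threshold.

    Hypothetical helper: `scored` is a list of
    (verifier_score, is_correct) pairs for extracted triplets.
    """
    kept = [ok for score, ok in scored if score >= threshold]
    total_correct = sum(ok for _, ok in scored)
    true_positives = sum(kept)
    # Convention assumed here: precision is 1.0 when nothing is kept.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / total_correct if total_correct else 0.0
    return precision, recall


# Made-up verifier scores and correctness labels (illustrative only).
scored = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]

strict = precision_recall_at(0.7, scored)   # fewer triplets kept
lenient = precision_recall_at(0.3, scored)  # more triplets kept
print(strict, lenient)
```

Raising the threshold keeps fewer triplets and favors precision; lowering it keeps more and favors recall.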

AI · Bullish · arXiv – CS AI · Feb 27 · 5/10 · 6

Quality-Aware Robust Multi-View Clustering for Heterogeneous Observation Noise

Researchers propose QARMVC, a new AI framework for multi-view clustering that addresses heterogeneous noise in real-world data. The system uses quality scores to identify contamination levels and employs hierarchical learning to improve clustering performance, showing superior results across benchmark datasets.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos

Researchers introduce Beyond8Bits, a large-scale dataset of 44K HDR user-generated videos with 1.5M crowd ratings, and HDR-Q, the first multimodal large language model designed for HDR video quality assessment. The work addresses limitations of current video quality systems that are optimized for standard dynamic range content.
