y0news

#dataset-benchmark News & Analysis

5 articles tagged with #dataset-benchmark. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

DragOn: A Benchmark and Dataset for Drag-Based GUI Interactions

Researchers introduce DragOn, a large-scale benchmark dataset with 286K training screenshots and 3.5M tasks designed to improve GUI agents' ability to perform drag-based interactions like highlighting, resizing, and swiping. The dataset addresses a critical gap where drag-grounding capabilities lag significantly behind click-grounding in AI models controlling desktops and mobile devices.

AI · Bullish · arXiv – CS AI · May 27 · 7/10

FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies

Researchers introduce FineVLA, a framework that enhances Vision-Language-Action models for robotics by incorporating fine-grained instruction supervision beyond simple goal-level commands. The system combines 972,247 trajectories into a curated dataset of 47,159 fine-grained trajectories and demonstrates that mixing fine-grained and coarse instructions raises the real-world robot manipulation success rate to 62.7%, compared with 49.9% for goal-level instructions alone.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

ArtiFact: A Large-Scale Multi-Modal Cultural Heritage Dataset

Researchers introduce ArtiFact, a large-scale multi-modal dataset containing 651,045 museum records from three major art institutions combined with images, text, and structured data. The dataset benchmarks AI systems on cross-modal error detection and semantic query processing tasks, revealing significant challenges in detecting domain-specific errors and handling culturally nuanced information retrieval.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

MAviS: A Multimodal Conversational Assistant For Avian Species

Researchers introduce MAviS, a specialized multimodal AI system combining image, audio, and text data for avian species identification and ecological monitoring. The system includes a large dataset covering 1,000+ bird species, a fine-tuned language model, and a comprehensive benchmark, demonstrating state-of-the-art performance in domain-specific biodiversity conservation applications.

AI · Neutral · arXiv – CS AI · May 4 · 6/10

ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

Researchers introduce ViLegalNLI, the first large-scale Vietnamese Natural Language Inference dataset for legal texts, containing 42,012 premise-hypothesis pairs from statutory documents. The dataset enables AI systems to understand legal reasoning patterns and supports development of reliable AI tools for Vietnamese legal analysis and decision-making.