y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#document-analysis News & Analysis

5 articles tagged with #document-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

5 articles
AIBullisharXiv โ€“ CS AI ยท Apr 157/10
๐Ÿง 

CropVLM: Learning to Zoom for Fine-Grained Vision-Language Perception

Researchers introduce CropVLM, a reinforcement learning-based method that enables Vision-Language Models to dynamically focus on relevant image regions for improved fine-grained understanding tasks. The approach works with existing VLMs without modification and demonstrates significant performance gains on text recognition and document analysis without requiring human-labeled training data.

AIBullisharXiv โ€“ CS AI ยท Mar 46/102
๐Ÿง 

ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

ScaleDoc is a new system that enables efficient semantic analysis of large document collections using LLMs by combining offline document representation with lightweight online filtering. The system achieves 2x speedup and reduces expensive LLM calls by up to 85% through contrastive learning and adaptive cascade mechanisms.

AINeutralarXiv โ€“ CS AI ยท Apr 106/10
๐Ÿง 

Improved Evidence Extraction and Metrics for Document Inconsistency Detection with LLMs

Researchers introduce improved methods for detecting inconsistencies in documents using large language models, including new evaluation metrics and a redact-and-retry framework. The work addresses a research gap in LLM-based document analysis and includes a new semi-synthetic dataset for benchmarking evidence extraction capabilities.

AIBullisharXiv โ€“ CS AI ยท Mar 126/10
๐Ÿง 

A Two-Stage Architecture for NDA Analysis: LLM-based Segmentation and Transformer-based Clause Classification

Researchers developed a two-stage AI architecture using LLaMA-3.1-8B-Instruct and Legal-Roberta-Large models to automate the analysis of Non-Disclosure Agreements (NDAs). The system achieved high accuracy with ROUGE F1 of 0.95 for document segmentation and weighted F1 of 0.85 for clause classification, demonstrating potential for automating legal document analysis.

AIBullisharXiv โ€“ CS AI ยท Feb 276/105
๐Ÿง 

MoDora: Tree-Based Semi-Structured Document Analysis System

Researchers introduce MoDora, an AI-powered system that uses tree-based analysis to understand and answer questions about semi-structured documents containing mixed data elements like tables, charts, and text. The system addresses challenges in processing fragmented OCR data and hierarchical document structures, achieving 5.97%-61.07% accuracy improvements over existing baselines.