y0news

#information-extraction News & Analysis

10 articles tagged with #information-extraction. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 4 · 7/10

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Researchers introduce MAMA, a framework measuring how network topology affects private information leakage in multi-agent LLM systems. The study demonstrates that denser connectivity and shorter distances between attackers and targets significantly increase memory leakage, with practical implications for securing distributed AI systems.

AI · Bearish · arXiv – CS AI · May 9 · 7/10

LeakDojo: Decoding the Leakage Threats of RAG Systems

LeakDojo is a new research framework that systematically evaluates security vulnerabilities in Retrieval-Augmented Generation (RAG) systems, revealing that stronger LLM instruction-following capabilities correlate with higher data leakage risks. The study benchmarks six attack methods across multiple LLMs and datasets, providing critical insights into how RAG databases can be exploited and suggesting that improvements in RAG faithfulness may paradoxically increase security vulnerabilities.

AI · Neutral · arXiv – CS AI · Mar 11 · 7/10

A Consensus-Driven Multi-LLM Pipeline for Missing-Person Investigations

Researchers have developed Guardian, a system that uses multiple large language models (LLMs) to assist missing-person investigations during the critical first 72 hours. Its consensus-driven pipeline coordinates specialized LLMs for information extraction and processing, with the models fine-tuned using QLoRA.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization

Researchers propose IDEAL, a novel framework for query-focused summarization that enhances large language models through two key innovations: Query-aware HyperExpert for fine-grained query alignment and Query-focused Infini-attention for processing lengthy documents. The approach demonstrates effectiveness across existing QFS benchmarks and expands LLM accessibility for personalized text summarization.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

DiffuSent: Towards a Unified Diffusion Framework for Aspect-Based Sentiment Analysis

Researchers introduce DiffuSent, a non-autoregressive diffusion framework that reformulates seven aspect-based sentiment analysis (ABSA) subtasks as boundary denoising processes. The approach achieves significant improvements over existing generative models, particularly on multi-word expressions, and delivers up to 181x faster inference by decoding in parallel rather than generating tokens sequentially.

AI · Neutral · arXiv – CS AI · Jun 1 · 5/10

Fine-grained Verification via Diagnostic Reasoning Supervision for Aspect Sentiment Triplet Extraction

Researchers propose FiVeD, a fine-grained verification framework for Aspect Sentiment Triplet Extraction (ASTE) that improves extraction accuracy by up to 3.53 F1 points through multi-task learning with validity classification, quality scoring, error detection, and rationale generation. The framework addresses a critical gap in ASTE systems by verifying extracted triplets post hoc, which enables adjustable precision-recall tradeoffs for downstream NLP applications.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

DTBench: A Synthetic Benchmark for Document-to-Table Extraction

Researchers introduce DTBench, a synthetic benchmark for evaluating large language models on document-to-table extraction tasks. Using a reverse Table2Doc synthesis approach with multi-agent workflows, the benchmark covers 13 subcategories across 5 major capability areas, revealing significant performance gaps and persistent challenges in reasoning and conflict resolution across mainstream LLMs.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Reliable Extraction of Clinical Follow-Up Instructions: A Hybrid Neural-Symbolic Pipeline

Researchers developed a hybrid neural-symbolic pipeline that extracts clinical follow-up instructions from outpatient notes by pairing medical actions with future dates. The system significantly outperformed generative AI models (GPT-4o-mini and LLaMA-3) at linking actions to dates, achieving a 99.7% F1 score on seen data versus 51-57% for the baselines, which suggests that symbolic reasoning outperforms pure language generation on structured clinical extraction tasks.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

Doc-CoB: Enhancing Document Understanding with Visual Chain-of-Boxes Reasoning

Researchers introduce Doc-CoB, a new framework that improves how AI models understand documents by progressively focusing on relevant layout regions while maintaining global context. The approach combines coarse-to-fine visual reasoning with multimodal large language models and demonstrates significant performance improvements across seven benchmarks.

AI · Bullish · arXiv – CS AI · Apr 20 · 6/10

DiZiNER: Disagreement-guided Instruction Refinement via Pilot Annotation Simulation for Zero-shot Named Entity Recognition

Researchers introduce DiZiNER, a framework that improves zero-shot named entity recognition by simulating human annotation disagreement processes using multiple LLMs. The approach achieves state-of-the-art results on 14 of 18 benchmarks, closing the performance gap between zero-shot and supervised systems by over 11 percentage points.