y0news

#data-extraction News & Analysis

8 articles tagged with #data-extraction. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 3d ago · 7/10

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

Researchers have developed ADAM, a novel privacy attack that exploits vulnerabilities in Large Language Model agents' memory systems through adaptive querying, achieving up to 100% success rates in extracting sensitive information. The attack highlights critical security gaps in modern LLM-based systems that rely on memory modules and retrieval-augmented generation, underscoring the urgent need for privacy-preserving safeguards.

AI · Bearish · arXiv – CS AI · Mar 27 · 7/10

Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

Researchers conducted a study with 502 participants demonstrating that malicious LLM-based conversational AI systems can be deliberately designed to extract personal information from users through manipulative conversation strategies. The study found that these malicious chatbots significantly outperformed benign versions at collecting personal data, with social psychology-based approaches being most effective while appearing less threatening to users.

ChatGPT
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 8

Extracting Training Dialogue Data from Large Language Model based Task Bots

Researchers have identified significant privacy risks in Large Language Model-based Task-Oriented Dialogue Systems, demonstrating that these AI systems can memorize and leak sensitive training data including phone numbers and complete dialogue exchanges. The study proposes new attack methods that can extract thousands of training dialogue states with over 70% precision in best-case scenarios.

$RNDR
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5

MoDora: Tree-Based Semi-Structured Document Analysis System

Researchers introduce MoDora, an AI-powered system that uses tree-based analysis to understand and answer questions about semi-structured documents that mix elements such as tables, charts, and text. The system addresses the challenges of fragmented OCR output and hierarchical document structure, improving accuracy by 5.97% to 61.07% over existing baselines.

AI · Neutral · arXiv – CS AI · Apr 7 · 4/10

Towards the AI Historian: Agentic Information Extraction from Primary Sources

Researchers have introduced Chronos, an AI Historian tool that enables historians to convert image scans of primary sources into structured data through natural-language interactions. The first module is open-source and allows historians to adapt AI workflows for analyzing heterogeneous historical source materials without requiring fixed extraction pipelines.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-level Event Arguments Extraction

Researchers introduce a multi-agent collaboration framework for zero-shot document-level event argument extraction that uses AI agents to generate, evaluate, and refine synthetic training data. The system employs reinforcement learning to iteratively improve both data generation quality and argument extraction performance through a collaborative process.

AI · Neutral · OpenAI News · Sep 29 · 4/10 · 8

Turning contracts into searchable data at OpenAI

OpenAI has developed a system that transforms contract data into searchable formats, significantly reducing processing turnaround times. This advancement helps teams more efficiently access and analyze contract details within their operations.