8 articles tagged with #data-extraction. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bearish · arXiv – CS AI · 3d ago · 7/10
Researchers have developed ADAM, a novel privacy attack that exploits vulnerabilities in Large Language Model agents' memory systems through adaptive querying, achieving up to 100% success rates in extracting sensitive information. The attack highlights critical security gaps in modern LLM-based systems that rely on memory modules and retrieval-augmented generation, underscoring the urgent need for privacy-preserving safeguards.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
In a study with 502 participants, researchers demonstrated that LLM-based conversational AI systems can be deliberately designed to extract personal information from users through manipulative conversation strategies. The malicious chatbots significantly outperformed benign versions at collecting personal data; approaches based on social psychology were the most effective while appearing less threatening to users.
ChatGPT
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
Researchers introduce GraphMERT, an 80M-parameter AI model that efficiently extracts reliable knowledge graphs from unstructured text data. The system outperforms much larger language models like Qwen3-32B in generating factually accurate and semantically valid knowledge graphs, achieving 69.8% FActScore versus 40.2% for the baseline.
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10
Researchers have identified significant privacy risks in Large Language Model-based Task-Oriented Dialogue Systems, demonstrating that these AI systems can memorize and leak sensitive training data including phone numbers and complete dialogue exchanges. The study proposes new attack methods that can extract thousands of training dialogue states with over 70% precision in best-case scenarios.
$RNDR
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
Researchers introduce MoDora, an AI-powered system that uses tree-based analysis to understand and answer questions about semi-structured documents containing mixed data elements like tables, charts, and text. The system addresses challenges in processing fragmented OCR data and hierarchical document structures, achieving accuracy improvements of 5.97% to 61.07% over existing baselines.
AI · Neutral · arXiv – CS AI · Apr 7 · 4/10
Researchers have introduced Chronos, an AI Historian tool that enables historians to convert image scans of primary sources into structured data through natural-language interactions. The first module is open-source and allows historians to adapt AI workflows for analyzing heterogeneous historical source materials without requiring fixed extraction pipelines.
AI · Neutral · arXiv – CS AI · Mar 4 · 4/10
Researchers introduce a multi-agent collaboration framework for zero-shot document-level event argument extraction that uses AI agents to generate, evaluate, and refine synthetic training data. The system employs reinforcement learning to iteratively improve both data generation quality and argument extraction performance through a collaborative process.
AI · Neutral · OpenAI News · Sep 29 · 4/10
OpenAI has developed a system that transforms contract data into searchable formats, significantly reducing processing turnaround times. This advancement helps teams more efficiently access and analyze contract details within their operations.