AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers demonstrate that knowledge graphs extracted from a single neuroscience textbook can be converted into high-quality training data to fine-tune language models, enabling expert-level reasoning that outperforms larger LLMs while using far fewer parameters. This approach challenges the prevailing assumption that domain expertise requires massive, diverse datasets, showing instead that structured, curated knowledge can produce superior specialized AI systems.
AI · Bullish · arXiv – CS AI · 4d ago · 7/10
🧠GraphDancer is a new post-training framework that enables large language models to reason over heterogeneous graph-structured data by combining natural-language reasoning with graph function execution. The two-stage curriculum approach uses structural complexity ordering to teach models to explore and reason over graphs, achieving strong cross-domain generalization with only a 3B parameter backbone.
AI · Bullish · crypto.news · May 4 · 7/10
🧠SAP is acquiring AI startup Prior Labs for over €1 billion to expand its tabular AI capabilities for structured business data processing. The acquisition strengthens SAP's position in enterprise AI by adding specialized models designed to work with the types of data most common in business applications.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers benchmark token-optimized data formats (TRON and TOON) against JSON in agentic AI systems, finding TRON reduces token consumption by up to 27% with acceptable accuracy trade-offs. The study reveals that while these alternatives show promise in isolated tasks, their real-world performance in multi-turn agent loops exposes limitations, particularly with TOON's parsing cascades and parallel tool-call handling.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose replacing LLM-based triggers in proactive agent systems with a lightweight temporal graph learning (TGL) model that processes structured event streams directly. The approach achieves a 16.7% mean F1 improvement while running 4-7x faster on GPUs and 12-83x faster on consumer hardware, with a 220 MiB footprint suitable for on-device deployment.
AI · Bullish · arXiv – CS AI · 4d ago · 6/10
🧠Researchers demonstrate that knowledge graphs significantly outperform traditional document stores for LLM-based industrial asset operations, achieving 100% accuracy on 467 maintenance scenarios compared to 65% with flat data structures. The study reveals that data architecture, not LLM orchestration design, is the primary performance bottleneck in structured operational domains.
🏢 Hugging Face · 🧠 GPT-4
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce an anonymous gradient-boosted decision tree (GBDT) protocol enabling secure training on vertically partitioned data between two parties while hiding record identifiers. The approach uses dual circuit-PSI and oblivious pseudorandom functions to eliminate ID exposure risks inherent in standard private set intersection methods, while achieving computational efficiency comparable to non-private approaches.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers developed a semi-structured extraction method for digitizing fragmented clinical reports using OCR and question-answering models, introducing 'key coverage' as a metric to measure data completeness. The approach achieves F1 scores above 0.83 on real-world hospital data from 20+ institutions using a lightweight BERT model, demonstrating that canonical key inventory completeness drives extraction performance.
AI · Neutral · arXiv – CS AI · May 1 · 6/10
🧠Researchers demonstrate that large language models perform significantly better on 2D structured tasks when given visual representations rather than serialized text inputs. The study finds that converting 2D data into 1D token sequences creates representational friction that degrades model performance, with the gap widening as task complexity increases.
AI · Neutral · arXiv – CS AI · Apr 20 · 6/10
🧠A comprehensive survey examines how Large Language Models can be effectively integrated with graph-based data structures to improve reasoning, retrieval, and decision-making across domains. The research categorizes integration approaches by purpose, graph type, and strategy, providing practitioners with guidance on selecting appropriate techniques for specific applications in healthcare, finance, robotics, and other fields.
AI · Neutral · arXiv – CS AI · Apr 13 · 6/10
🧠Researchers introduce ASTRA, a new architecture designed to improve how large language models process and reason about complex tables through adaptive semantic tree structures. The method combines tree-based navigation with symbolic code execution to achieve state-of-the-art performance on table question-answering benchmarks, addressing fundamental limitations in how tables are currently serialized for LLMs.
AI · Neutral · arXiv – CS AI · Mar 9 · 6/10
🧠Researchers introduce NGDBench, a comprehensive benchmark for evaluating neural networks' ability to work with graph databases across five domains including finance and medicine. The benchmark supports full Cypher query language capabilities and reveals significant limitations in current AI models when handling structured graph data, noise, and complex analytical tasks.
AI · Neutral · arXiv – CS AI · Mar 5 · 5/10
🧠Researchers present a new transformer architecture that jointly trains on natural language and structured data by maintaining separate knowledge and language representations. The model uses a key-value repository system with journey-based role transport to enable cross-attention between linguistic context and structured knowledge graphs.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠A benchmark study compares Token-Oriented Object Notation (TOON) with JSON for structured data serialization in LLMs, finding that while TOON reduces token usage, plain JSON shows better accuracy overall. The research reveals that TOON's efficiency benefits may only emerge at scale where syntax savings offset the initial prompt overhead.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 6
🧠Researchers developed LexChronos, an AI framework that extracts structured event timelines from Indian Supreme Court judgments using a dual-agent architecture. The system achieved an F1 score of 0.8751 on synthetic data and was preferred over unstructured approaches 75% of the time in legal text summarization tasks.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
🧠Researchers propose a new multi-agent reinforcement learning framework that uses three cooperative agents with attention mechanisms to automate feature transformation for machine learning models. The approach addresses key limitations in existing automated feature engineering methods, including dynamic feature expansion instability and insufficient agent cooperation.