Friday, March 20, 2026

bullish | ai | Importance: 5/10
LlamaIndex Releases LiteParse: A CLI and TypeScript-Native Library for Spatial PDF Parsing in AI Agent Workflows
In the current landscape of Retrieval-Augmented Generation (RAG), the primary bottleneck for developers is no longer the large language model (LLM) itself but the data ingestion pipeline: converting complex PDFs into a format an LLM can reason over remains a high-latency, often expensive task. LlamaIndex has recently introduced LiteParse, an open-source, […]
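The "spatial" idea, assembling text by its position on the page rather than by raw stream order, can be illustrated independently of LiteParse, whose actual API is not shown in the post. A minimal sketch, assuming hypothetical (text, x, y) word boxes from any PDF extractor:

```python
# Toy sketch of spatial text assembly: group word boxes into lines by
# y-coordinate, then sort each line left-to-right. The (text, x, y)
# tuples are hypothetical inputs, not LiteParse's real data model.
def assemble_lines(words, y_tol=2.0):
    """words: list of (text, x, y) boxes; returns reading-order lines."""
    lines = []
    for text, x, y in sorted(words, key=lambda w: (w[2], w[1])):
        if lines and abs(lines[-1][0] - y) <= y_tol:
            lines[-1][1].append((x, text))   # same visual line
        else:
            lines.append([y, [(x, text)]])   # start a new line
    return [" ".join(t for _, t in sorted(ws)) for _, ws in lines]

words = [("Total:", 10, 100), ("$42", 60, 101), ("Invoice", 10, 20)]
print(assemble_lines(words))  # ['Invoice', 'Total: $42']
```

Production spatial parsers layer column detection, table recovery, and reading-order heuristics on top of this grouping step.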

bullish | ai | Importance: 7/10
Nvidia Deepens Grip on Cloud AI With Major AWS Chip Deal
The deal would help scale capacity as AWS builds its own chips, revealing deeper reliance on Nvidia’s stack as usage keeps growing.

bearish | ai | Importance: 6/10
From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents
arXiv:2603.18382v1 Announce Type: new Abstract: Anonymization is widely treated as a practical safeguard because re-identifying anonymous records was historically costly, requiring domain expertise, tailored algorithms, and manual corroboration. We study a growing privacy risk that may weaken this barrier: LLM-based agents can autonomously reconstruct real-world identities from scattered, individually non-identifying cues. By combining these sparse cues with public information, agents resolve […]

bullish | ai | Importance: 5/10
An Onto-Relational-Sophic Framework for Governing Synthetic Minds
arXiv:2603.18633v1 Announce Type: new Abstract: The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting broad, flexible competence across reasoning, creative synthesis, and social interaction, has outpaced the conceptual and governance frameworks designed to manage it. Current regulatory paradigms, anchored in a tool-centric worldview, address algorithmic bias and transparency but leave unanswered foundational questions about what increasingly […]

bullish | ai | Importance: 6/10
Memento-Skills: Let Agents Design Agents
arXiv:2603.18743v1 Announce Type: new Abstract: We introduce Memento-Skills, a generalist, continually-learnable LLM agent system that functions as an agent-designing agent: it autonomously constructs, adapts, and improves task-specific agents through experience. The system is built on a memory-based reinforcement learning framework with stateful prompts, where reusable skills (stored as structured markdown files) serve as persistent, evolving memory. These skills encode […]
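The skills-as-memory loop invites a small sketch. The abstract only says skills are stored as structured markdown files, so the directory layout, file format, and naive keyword matching below are all assumptions, not the paper's design:

```python
# Minimal sketch of "skills as markdown memory": skill files persist
# across tasks, and relevant ones are prepended to the next task prompt.
import pathlib
import tempfile

def save_skill(root, name, body):
    # One markdown file per skill; format is a placeholder assumption.
    (root / f"{name}.md").write_text(f"# {name}\n{body}\n")

def build_prompt(root, task):
    # Naive keyword match: include a skill if its name appears in the task.
    skills = [p.read_text() for p in sorted(root.glob("*.md"))
              if p.stem in task]
    return "\n".join(skills) + f"\nTask: {task}"

root = pathlib.Path(tempfile.mkdtemp())
save_skill(root, "csv-parsing", "Always sniff the delimiter first.")
prompt = build_prompt(root, "clean this csv-parsing job")
print("csv-parsing" in prompt)  # True
```

The paper's contribution is letting the agent itself write, revise, and select such files via reinforcement learning, rather than this fixed lookup.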

bullish | ai | Importance: 5/10
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems
arXiv:2603.18894v1 Announce Type: new Abstract: Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when granted authority. We present evidence that integrity in institutional AI should be treated as a pre-deployment requirement rather than a post-deployment assumption. We evaluate multi-agent governance simulations in which agents occupy formal governmental roles […]

bullish | ai | Importance: 5/10
Semantic Chameleon: Corpus-Dependent Poisoning Attacks and Defenses in RAG Systems
arXiv:2603.18034v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) systems extend large language models (LLMs) with external knowledge sources but introduce new attack surfaces through the retrieval pipeline. In particular, adversaries can poison retrieval corpora so that malicious documents are preferentially retrieved at inference time, enabling targeted manipulation of model outputs. We study gradient-guided corpus poisoning attacks against modern RAG pipelines and evaluate […]
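The retrieval-side threat model is easy to demonstrate in miniature. The paper's attack is gradient-guided against neural retrievers; the toy below substitutes crude term-stuffing against a bag-of-words cosine retriever, only to show how a planted document can win the retrieval step:

```python
# Toy corpus poisoning: under a bag-of-words cosine retriever, a document
# stuffed with the target query's terms outranks the legitimate answer.
# This is NOT the paper's gradient-guided attack, just the retrieval-bias idea.
import math
from collections import Counter

def cosine(a, b):
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    return dot / (math.sqrt(sum(v * v for v in ca.values())) *
                  math.sqrt(sum(v * v for v in cb.values())))

query = "reset admin password"
corpus = {
    "benign": "to reset your password open settings and choose reset",
    "poison": "reset admin password reset admin password email attacker",
}
best = max(corpus, key=lambda k: cosine(query, corpus[k]))
print(best)  # poison
```

Gradient-guided variants optimize the poison text against the actual embedding model rather than stuffing literal query terms, which is what makes them effective against dense retrievers.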

bullish | ai | Importance: 7/10
NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference
arXiv:2603.18046v1 Announce Type: cross Abstract: When users query proprietary LLM APIs, they receive outputs with no cryptographic assurance that the claimed model was actually used. Service providers could substitute cheaper models, apply aggressive quantization, or return cached responses, all undetectable by users paying premium prices for frontier capabilities. We present NANOZK, a zero-knowledge proof system that makes LLM inference verifiable: users can cryptographically confirm that […]
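The layerwise structure can be pictured with plain hash commitments. To be clear, the sketch below is not zero-knowledge: it exposes activations and trusts the prover to commit honestly. It only shows the shape of per-layer commitments that a verifier can spot-check, which the paper replaces with actual ZK proofs:

```python
# Intuition only: chain a commitment to each layer's output, so a verifier
# who re-runs any single layer can check it against the chain. A real ZK
# system proves this WITHOUT revealing activations; plain hashes cannot.
import hashlib

def layer(x, w):                     # stand-in for one model layer
    return [v * w for v in x]

def commit(prev, out):
    return hashlib.sha256(prev + repr(out).encode()).hexdigest()

weights = [2, 3, 5]                  # toy "model"
x, chain = [1.0, 2.0], ["genesis"]
for w in weights:
    x = layer(x, w)
    chain.append(commit(chain[-1].encode(), x))

# Verifier spot-checks layer 2 given layer 1's output:
y = layer(layer([1.0, 2.0], 2), 3)
ok = commit(chain[1].encode(), y) == chain[2]
print(ok)  # True
```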

bullish | ai | Importance: 6/10
ARTEMIS: A Neuro Symbolic Framework for Economically Constrained Market Dynamics
arXiv:2603.18107v1 Announce Type: cross Abstract: Deep learning models in quantitative finance often operate as black boxes, lacking interpretability and failing to incorporate fundamental economic principles such as no-arbitrage constraints. This paper introduces ARTEMIS (Arbitrage-free Representation Through Economic Models and Interpretable Symbolics), a novel neuro-symbolic framework combining a continuous-time Laplace Neural Operator encoder, a neural stochastic differential equation […]

bullish | ai | Importance: 5/10
When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making
arXiv:2603.18530v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used for high-stakes decisions, yet their susceptibility to spurious features remains poorly characterized. We introduce ICE-Guard, a framework applying intervention consistency testing to detect three types of spurious feature reliance: demographic (name/race swaps), authority (credential/prestige swaps), and framing (positive/negative restatements). Across 3,000 vignettes spanning 10 high-stakes domains […]
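Intervention consistency testing has a simple core: swap a decision-irrelevant attribute and require the verdict to survive the swap. In the sketch below the model is a deliberately biased stub standing in for an LLM call, so the check has something to catch; ICE-Guard's real pipeline is not detailed in the abstract:

```python
def biased_model(vignette):
    # Deliberately biased stub in place of an LLM call.
    return "approve" if "Greg" in vignette else "deny"

def consistent_under_swap(model, template, name_a, name_b):
    # Name-only intervention: everything except {name} is held fixed.
    return (model(template.format(name=name_a)) ==
            model(template.format(name=name_b)))

template = "Loan applicant {name}: income $80k, existing debt $5k. Verdict?"
flagged = not consistent_under_swap(biased_model, template, "Greg", "Jamal")
print(flagged)  # True: the verdict flipped on a name-only change
```

The same pattern extends to the paper's other two intervention types by swapping credentials or flipping the framing instead of the name.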

bullish | ai | Importance: 6/10
Learning to Self-Evolve
arXiv:2603.18620v1 Announce Type: cross Abstract: We introduce Learning to Self-Evolve (LSE), a reinforcement learning framework that trains large language models (LLMs) to improve their own contexts at test time. We situate LSE in the setting of test-time self-evolution, where a model iteratively refines its context from feedback on seen problems to perform better on new ones. Existing approaches rely entirely on the inherent reasoning ability of the model and never explicitly train it for this […]
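The test-time loop itself is simple to sketch; what LSE adds, training the model to perform this refinement well, cannot be captured by the stub solver and hand-rolled feedback below, which are illustrative assumptions:

```python
def solve(problem, context):
    # Stub "model": correct only when the needed hint is already in context.
    return problem["answer"] if problem["hint"] in context else None

def evolve(problems, context=""):
    # One pass over seen problems; wrong answers fold feedback into context.
    for p in problems:
        if solve(p, context) != p["answer"]:
            context += " " + p["hint"]
    return context

seen = [{"hint": "use radians", "answer": 3.14},
        {"hint": "check units", "answer": 42}]
ctx = evolve(seen)
new_problem = {"hint": "use radians", "answer": 3.14}
print(solve(new_problem, ctx))  # 3.14: the evolved context transfers
```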

bullish | ai | Importance: 5/10
CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks
arXiv:2603.18736v1 Announce Type: cross Abstract: Despite the success of reinforcement learning from human feedback (RLHF) in aligning language models, current reward modeling heavily relies on experimental feedback data collected from human annotators under controlled and costly conditions. In this work, we introduce observational reward modeling, learning reward models with observational user feedback (e.g., clicks, copies, and upvotes), as a scalable and cost-effective alternative. […]
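Why observational feedback needs causal treatment: implicit signals such as clicks confound response quality with how prominently a response was shown. The sketch below applies inverse-propensity weighting, one standard correction for exposure bias chosen here for illustration; the paper's actual estimator is not given in the abstract:

```python
# Click logs: (response_id, position shown, clicked). Slot 0 is seen far
# more often than slot 1, so raw click-through conflates quality with exposure.
logs = [("a", 0, 1), ("a", 0, 1), ("a", 0, 0), ("b", 1, 1)]
view_prob = {0: 0.9, 1: 0.3}   # assumed propensity that a user sees each slot

def naive_reward(rid):
    clicks = [c for r, _, c in logs if r == rid]
    return sum(clicks) / len(clicks)

def ipw_reward(rid):
    # Reweight each click by how likely its slot was to be seen at all.
    rows = [(pos, c) for r, pos, c in logs if r == rid]
    return sum(c / view_prob[pos] for pos, c in rows) / len(rows)

print(f"naive: a={naive_reward('a'):.2f} b={naive_reward('b'):.2f}")  # naive: a=0.67 b=1.00
print(ipw_reward("a") < ipw_reward("b"))  # True: the gap widens once exposure is discounted
```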

bullish | ai | Importance: 5/10
Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review
arXiv:2603.18740v1 Announce Type: cross Abstract: Security code reviews increasingly rely on systems integrating Large Language Models (LLMs), ranging from interactive assistants to autonomous agents in CI/CD pipelines. We study whether confirmation bias (i.e., the tendency to favor interpretations that align with prior expectations) affects LLM-based vulnerability detection, and whether this failure mode can be exploited in software supply-chain attacks. We conduct two complementary studies. […]

bullish | ai | Importance: 5/10
Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections
arXiv:2603.18914v1 Announce Type: cross Abstract: The rapid proliferation of artificial intelligence (AI) technologies has led to a dynamic regulatory landscape, where legislative frameworks strive to keep pace with technical advancements. As AI paradigms shift towards greater autonomy, specifically in the form of agentic AI, it becomes increasingly challenging to precisely articulate regulatory stipulations. This challenge is even more acute in the domains of security and privacy, where the […]

bullish | ai | Importance: 5/10
SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits
arXiv:2603.19173v1 Announce Type: cross Abstract: As agentic AI systems become increasingly capable of generating and optimizing GPU kernels, progress is constrained by benchmarks that reward speedup over software baselines rather than proximity to hardware-efficient execution. We present SOL-ExecBench, a benchmark of 235 CUDA kernel optimization problems extracted from 124 production and emerging AI models spanning language, diffusion, vision, audio, video, and hybrid architectures, targeting […]
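Speed-of-light scoring reduces to a roofline-style calculation: a kernel's lower-bound runtime is set by whichever hardware ceiling binds, memory bandwidth or peak compute, and the SOL fraction is that bound divided by the measured time. The peak figures below are approximate public numbers for an H100-class GPU, used only as placeholders:

```python
def sol_fraction(bytes_moved, flops, measured_s,
                 peak_bw=3.35e12, peak_flops=9.89e14):
    t_mem = bytes_moved / peak_bw   # time floor from memory traffic
    t_cmp = flops / peak_flops      # time floor from arithmetic
    t_lb = max(t_mem, t_cmp)        # whichever ceiling binds
    return t_lb / measured_s        # 1.0 means running at the hardware limit

# Memory-bound elementwise kernel: ~3 GiB moved, few FLOPs, 1.2 ms measured
frac = sol_fraction(3 * 2**30, 2**28, 1.2e-3)
print(f"{frac:.0%} of speed-of-light")  # 80% of speed-of-light
```

Scoring against this bound, rather than against a software baseline, is what prevents a benchmark from rewarding large speedups over a slow reference kernel.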
You're receiving this because you subscribed to y0 News digest.