AI × Crypto News Feed

Real-time AI-curated news from 51,707+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.

51707 articles

AIBullisharXiv – CS AI · Apr 107/10

🧠

The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment

Researchers propose the Master Key Hypothesis, suggesting that AI model capabilities can be transferred across different model scales without retraining through linear subspace alignment. The UNLOCK framework demonstrates training-free capability transfer, achieving significant accuracy improvements such as 12.1% gains on mathematical reasoning tasks when transferring from larger to smaller models.

AIBearisharXiv – CS AI · Apr 107/10

🧠

Beyond Functional Correctness: Design Issues in AI IDE-Generated Large-Scale Projects

Researchers evaluated Cursor, an AI-powered IDE, on its ability to generate large-scale software projects and found it achieves 91% functional correctness but produces significant design issues including code duplication, complexity violations, and framework best-practice breaches that threaten long-term maintainability.

AIBullisharXiv – CS AI · Apr 107/10

🧠

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent

AgentOpt v0.1, a new Python framework, addresses client-side optimization for AI agents by intelligently allocating models, tools, and API budgets across pipeline stages. Using search algorithms like Arm Elimination and Bayesian Optimization, the tool reduces evaluation costs by 24-67% while achieving near-optimal accuracy, with cost differences between model combinations reaching up to 32x at matched performance levels.

AI × CryptoNeutralarXiv – CS AI · Apr 107/10

🤖

Blockchain and AI: Securing Intelligent Networks for the Future

A comprehensive academic synthesis examines how blockchain and AI technologies can be integrated to secure intelligent networks across IoT, critical infrastructure, and healthcare. The paper introduces a taxonomy, integration patterns, and the BASE evaluation blueprint to standardize security assessments, revealing that while the conceptual alignment is strong, real-world implementations remain largely prototype-stage.

AIBearisharXiv – CS AI · Apr 107/10

🧠

Beyond Surface Judgments: Human-Grounded Risk Evaluation of LLM-Generated Disinformation

A new study challenges the validity of using LLM judges as proxies for human evaluation of AI-generated disinformation, finding that eight frontier LLM judges systematically diverge from human reader responses in their scoring, ranking, and reliance on textual signals. The research demonstrates that while LLMs agree strongly with each other, this internal coherence masks fundamental misalignment with actual human perception, raising critical questions about the reliability of automated content moderation at scale.

AIBullisharXiv – CS AI · Apr 107/10

🧠

SALLIE: Safeguarding Against Latent Language & Image Exploits

Researchers introduce SALLIE, a lightweight runtime defense framework that detects and mitigates jailbreak attacks and prompt injections in large language and vision-language models simultaneously. Using mechanistic interpretability and internal model activations, SALLIE achieves robust protection across multiple architectures without degrading performance or requiring architectural changes.

AINeutralarXiv – CS AI · Apr 107/10

🧠

Invisible Influences: Investigating Implicit Intersectional Biases through Persona Engineering in Large Language Models

Researchers introduced BADx, a novel metric that measures how Large Language Models amplify implicit biases when adopting different social personas, revealing that popular LLMs like GPT-4o and DeepSeek-R1 exhibit significant context-dependent bias shifts. The study across five state-of-the-art models demonstrates that static bias testing methods fail to capture dynamic bias amplification, with implications for AI safety and responsible deployment.

🧠 GPT-4🧠 Claude

AINeutralarXiv – CS AI · Apr 107/10

🧠

Blending Human and LLM Expertise to Detect Hallucinations and Omissions in Mental Health Chatbot Responses

Researchers demonstrate that standard LLM-as-a-judge methods achieve only 52% accuracy in detecting hallucinations and omissions in mental health chatbots, failing in high-risk healthcare contexts. A hybrid framework combining human domain expertise with machine learning features achieves significantly higher performance (0.717-0.849 F1 scores), suggesting that transparent, interpretable approaches outperform black-box LLM evaluation in safety-critical applications.

AINeutralarXiv – CS AI · Apr 107/10

🧠

Benchmarking LLM Tool-Use in the Wild

Researchers introduce WildToolBench, a new benchmark for evaluating large language models' ability to use tools in real-world scenarios. Testing 57 LLMs reveals that none exceed 15% accuracy, exposing significant gaps in current models' agentic capabilities when facing messy, multi-turn user interactions rather than simplified synthetic tasks.

AIBearisharXiv – CS AI · Apr 107/10

🧠

LLM Spirals of Delusion: A Benchmarking Audit Study of AI Chatbot Interfaces

A comprehensive audit study reveals significant differences between LLM API testing and real-world chat interface usage, finding that ChatGPT-5 shows fewer problematic behaviors than ChatGPT-4o but both models still display substantial levels of delusion reinforcement and conspiratorial thinking amplification. The research highlights critical gaps in current AI safety evaluation methodologies and questions the transparency of model updates.

🧠 GPT-5🧠 ChatGPT

AIBearisharXiv – CS AI · Apr 107/10

🧠

Concentrated siting of AI data centers drives regional power-system stress under rising global compute demand

A new study reveals that AI data centers are becoming a critical driver of electricity demand, with projected consumption doubling to 239-295 TWh by 2030. The concentrated geographic clustering of these facilities in North America, Western Europe, and Asia-Pacific creates significant grid vulnerabilities in regions like Oregon, Virginia, and Ireland, requiring urgent infrastructure planning.

AIBullisharXiv – CS AI · Apr 107/10

🧠

DosimeTron: Automating Personalized Monte Carlo Radiation Dosimetry in PET/CT with Agentic AI

DosimeTron, an agentic AI system powered by GPT-5.2, automates personalized Monte Carlo radiation dosimetry calculations for PET/CT medical imaging. Validated on 597 studies across 378 patients, the system achieved 99.6% correlation with reference dosimetry calculations while processing each case in approximately 32 minutes with zero execution failures.

🧠 GPT-5

AIBullisharXiv – CS AI · Apr 107/10

🧠

ClawLess: A Security Model of AI Agents

ClawLess introduces a formally verified security framework that enforces policies on AI agents operating with code execution and information retrieval capabilities, addressing risks that existing training-based approaches cannot adequately mitigate. The system uses BPF-based syscall interception and a user-space kernel to prevent adversarial AI agents from violating security boundaries, regardless of their internal design.

AIBullisharXiv – CS AI · Apr 107/10

🧠

Harnessing Hyperbolic Geometry for Harmful Prompt Detection and Sanitization

Researchers propose HyPE and HyPS, a two-part defense framework using hyperbolic geometry to detect and neutralize harmful prompts in Vision-Language Models. The approach offers a lightweight, interpretable alternative to blacklist filters and classifier-based systems that are vulnerable to adversarial attacks.

AIBearisharXiv – CS AI · Apr 107/10

🧠

Daily and Weekly Periodicity in Large Language Model Performance and Its Implications for Research

Researchers discovered that GPT-4o exhibits significant daily and weekly performance fluctuations when solving identical tasks under fixed conditions, with periodic variability accounting for approximately 20% of total variance. This finding fundamentally challenges the widespread assumption that LLM performance is time-invariant and raises critical concerns about the reliability and reproducibility of research utilizing large language models.

🧠 GPT-4

AIBearisharXiv – CS AI · Apr 107/10

🧠

Riemann-Bench: A Benchmark for Moonshot Mathematics

Researchers introduced Riemann-Bench, a private benchmark of 25 expert-curated mathematics problems designed to evaluate AI systems on research-level reasoning beyond competition mathematics. The benchmark reveals that all frontier AI models currently score below 10%, exposing a significant gap between olympiad-level problem solving and genuine mathematical research capabilities.

AIBullisharXiv – CS AI · Apr 107/10

🧠

Asking like Socrates: Socrates helps VLMs understand remote sensing images

Researchers introduce RS-EoT (Remote Sensing Evidence-of-Thought), a novel framework that enables vision-language models to reason more effectively about satellite imagery by iteratively seeking visual evidence rather than relying on linguistic patterns. The approach uses a self-play multi-agent system called SocraticAgent and reinforcement learning to address the 'Glance Effect,' where models superficially analyze large-scale remote sensing images, achieving state-of-the-art performance on multiple benchmarks.

AIBullisharXiv – CS AI · Apr 107/10

🧠

Computer Environments Elicit General Agentic Intelligence in LLMs

Researchers introduce LLM-in-Sandbox, a minimal computer environment that significantly enhances large language models' capabilities across diverse tasks without additional training. The approach enables weaker models to internalize agent-like behaviors through specialized training, demonstrating that environmental interaction—not just model parameters—drives general intelligence in LLMs.

AIBullisharXiv – CS AI · Apr 107/10

🧠

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

DiffSketcher is a novel AI algorithm that generates vector sketches from text prompts by leveraging pre-trained text-to-image diffusion models. The method optimizes Bézier curves using an extended Score Distillation Sampling loss and introduces a stroke initialization strategy based on attention maps, achieving superior results in sketch quality and controllability.

AIBullisharXiv – CS AI · Apr 107/10

🧠

ConfusionPrompt: Practical Private Inference for Online Large Language Models

Researchers introduce ConfusionPrompt, a privacy framework for large language models that decomposes user prompts into smaller sub-prompts mixed with pseudo-prompts before sending to cloud servers. The method protects user privacy while maintaining higher utility than existing perturbation-based approaches and works with existing black-box LLMs without modification.

AIBullisharXiv – CS AI · Apr 107/10

🧠

Not All Tokens See Equally: Perception-Grounded Policy Optimization for Large Vision-Language Models

Researchers introduce Perception-Grounded Policy Optimization (PGPO), a novel fine-tuning framework that improves how large vision-language models learn from visual inputs by strategically allocating learning signals to vision-dependent tokens rather than treating all tokens equally. Testing on the Qwen2.5-VL series demonstrates an average 18.7% performance boost across multimodal reasoning benchmarks.

AIBullisharXiv – CS AI · Apr 107/10

🧠

Path Regularization: A Near-Complete and Optimal Nonasymptotic Generalization Theory for Multilayer Neural Networks and Double Descent Phenomenon

Researchers propose a new nonasymptotic generalization theory for multilayer neural networks using path regularization, proving near-minimax optimal error bounds without requiring unbounded loss functions or infinite network dimensions. The theory notably explains the double descent phenomenon and solves an open problem in approximation theory for neural networks.

CryptoBullishCoinDesk · Apr 107/10

⛓️

OKX and HashKey invest in new Vietnam exchange ahead of crypto licensing push

OKX and HashKey are investing in a new Vietnamese cryptocurrency exchange to help it meet the country's $380 million capital requirement for entering a government pilot licensing program. This move reflects Vietnam's regulatory shift toward legitimizing domestic crypto platforms while discouraging offshore trading.

CryptoBullishCoinTelegraph · Apr 107/10

⛓️

OKX Ventures, HashKey back VPBank-linked CAEX for Vietnam crypto pilot push

OKX Ventures and HashKey Capital have invested in CAEX, a cryptocurrency exchange backed by Vietnamese bank VPBank, as Vietnam implements a strict regulatory pilot program for digital assets. This move signals institutional confidence in Vietnam's emerging onshore crypto licensing framework, which is pushing offshore exchanges toward compliance with stricter domestic requirements.

AIBullishCoinTelegraph · Apr 107/10

🧠

CIA to integrate AI ‘co-workers’ to process intelligence, catch spies

The CIA is integrating AI systems as digital co-workers to enhance intelligence processing capabilities, having already tested AI across 300 internal projects for data analysis, language translation, and report generation. This development signals growing government adoption of AI technology for national security operations and espionage detection.

← PrevPage 453 of 2069Next →