AI · Bullish · arXiv – CS AI · May 29 · 7/10
🧠 Researchers introduce PokerSkill, a framework that enables large language models to play expert-level poker without training or computational solvers by combining rule-based poker skills with LLM reasoning. The approach achieves competitive performance against state-of-the-art GTO benchmarks, reducing losses by 49-61% compared to standard LLM prompting and outperforming established poker bots.
🧠 GPT-5 · 🧠 Claude · 🧠 Opus
AI · Bearish · arXiv – CS AI · May 12 · 7/10
🧠 A new position paper argues that enterprises should stop treating large language models as monolithic solutions for all tasks and instead use them primarily for structured data extraction within modular architectures. The paper contends that LLMs have inherent capacity limits for enterprise knowledge needs and proposes delegating computation and storage to specialized components like knowledge bases and symbolic systems for better reliability and cost efficiency.
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠 ATHENA is an autonomous AI framework for scientific computing and machine learning research that selects mathematical approaches, generates code, and iteratively improves solutions through a contextual bandit learning process. The system achieves validation errors as low as 10^-14 and outperforms traditional foundation models on complex multiphysics problems.
AI × Crypto · Bullish · arXiv – CS AI · May 9 · 7/10
🤖 Researchers demonstrated quantum-enhanced large language models by integrating Cayley-parameterised unitary adapters into pre-trained LLMs and executing them on IBM's 156-qubit quantum processor. The approach improved Llama 3.1 8B's perplexity by 1.4% using only 6,000 additional parameters, marking the first practical validation of quantum-classical hybrid AI on real quantum hardware at scale.
🏢 Perplexity · 🧠 Llama
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠 BiNSGPS introduces a bidirectional neuro-symbolic framework that enables dynamic feedback loops between machine learning models and symbolic solvers for geometry problem-solving. Unlike traditional unidirectional approaches, this system allows the neural component to actively incorporate feedback and correct errors, addressing fundamental limitations in AI's ability to solve complex geometric reasoning tasks.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠 Researchers analyzed ClinicalTrials.gov data to track AI adoption in clinical research, finding exponential growth in AI-related trials globally with machine learning, deep learning, and large language models increasingly prevalent. Using a hybrid human-AI screening approach, the study revealed that while AI and humans agreed on identifying non-AI studies, they diverged significantly on classifying human-AI interactions, highlighting the need for clearer trial reporting standards.
🧠 GPT-5
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠 Researchers introduce RACE-Sched, an asynchronous AI framework that combines real-time symbolic heuristics with LLM-powered reasoning to solve dynamic job shop scheduling problems in industrial systems. The approach decouples fast reactive execution from slower deliberative optimization, enabling superior performance over deep reinforcement learning baselines while maintaining interpretability and millisecond-level response times.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠 Researchers developed an LLM-based pipeline that automatically tags learning resources with competencies from structured frameworks, combining language models with graph constraints and evidence extraction. The system achieved 0.57 micro-F1 and 0.82 MRR while providing transparent, auditable evidence spans, outperforming traditional baselines and addressing the labor-intensive challenge of manual resource tagging in educational systems.
AI · Neutral · arXiv – CS AI · May 27 · 6/10
🧠 Researchers developed a hybrid neural-symbolic pipeline for extracting clinical follow-up instructions from outpatient notes, pairing medical actions with future dates. The system significantly outperformed generative AI models (GPT-4o-mini and LLaMA-3) at linking actions to dates, achieving 99.7% F1 score on seen data versus 51-57% for baselines, demonstrating that symbolic reasoning outperforms pure language generation for structured clinical extraction tasks.
🧠 GPT-4
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
🧠 Researchers propose Hybrid Hierarchical RL (H²RL), a new framework that combines symbolic logic with deep reinforcement learning to address misalignment issues in AI agents. The method uses logical option-based pretraining to improve long-horizon decision-making and prevent agents from over-exploiting short-term rewards.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
🧠 A study comparing AI-generated advice to human Reddit responses found that large language models like GPT-4o significantly outperformed crowd-sourced advice on effectiveness, warmth, and user satisfaction metrics. The study suggests human advice can be enhanced through AI polishing, pointing toward hybrid systems combining AI, crowd input, and expert oversight.