
#open-book-evaluation News & Analysis

1 article tagged with #open-book-evaluation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Neutral · arXiv – CS AI · Mar 47/102


MedCalc-Bench Doesn't Measure What You Think: A Benchmark Audit and the Case for Open-Book Evaluation

Researchers audited MedCalc-Bench, a benchmark for evaluating AI models on clinical calculator tasks. They found more than 20 errors in the dataset and showed that simple 'open-book' prompting reaches 81-85% accuracy, compared with a previous best of 74%. The study argues that the benchmark measures formula memorization rather than clinical reasoning, which calls into question how AI medical capabilities are evaluated.
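
To make the closed-book versus open-book distinction concrete, here is a minimal sketch that builds both kinds of prompt for the same question. It assumes 'open-book' means the calculator formula is supplied in the prompt, which the summary implies but does not spell out. It is not the audit's code: `build_prompt`, the patient note, and the choice of the BMI formula are all hypothetical, and the paper's actual prompts may differ.

```python
from typing import Optional


def build_prompt(note: str, question: str, formula: Optional[str] = None) -> str:
    """Assemble an evaluation prompt (hypothetical helper, for illustration only).

    With formula=None the prompt is closed-book: the model must recall the
    calculator formula itself. With a formula supplied it is open-book: the
    model only has to extract the values from the note and apply the formula.
    """
    parts = []
    if formula is not None:
        parts.append("Reference formula:\n" + formula)
    parts.append("Patient note:\n" + note)
    parts.append("Question:\n" + question)
    return "\n\n".join(parts)


# Illustrative inputs, not taken from MedCalc-Bench.
BMI_FORMULA = "BMI = weight_kg / (height_m ** 2)"
note = "Weight 70 kg, height 1.75 m."
question = "What is the patient's BMI?"

closed_book = build_prompt(note, question)
open_book = build_prompt(note, question, BMI_FORMULA)

print(open_book)
```

The two prompts differ only in the reference block, so under this assumption any accuracy gap between them reflects whether the model had memorized the formula, not how well it reads the note or does the arithmetic.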
