Introducing GeneBench-Pro

OpenAI Jun 17, 2026

GeneBench-Pro is a new benchmark designed to evaluate AI models on their ability to perform complex, judgment-heavy computational biology research tasks.

Summary

GeneBench-Pro is a research-level benchmark created to test AI models on higher-order scientific reasoning and judgment in computational biology. Unlike standard benchmarks that test factual recall, GeneBench-Pro focuses on "research taste," requiring models to handle ambiguity, revise hypotheses, and navigate complex datasets toward decision-ready outcomes. Built with synthetic, causal data to ensure objective grading, the benchmark includes 129 problems across various domains. Results indicate that while frontier models like GPT-5.6 Sol show rapid improvement in scientific reasoning, they still struggle with the iterative, inferential processes that characterize expert human research.

(Source: OpenAI)

TechCrunch Jun 30, 2026

Nvidia competitor Etched hits $5B valuation, $1B in sales for AI chip

TechCrunch Jun 30, 2026

Anthropic launches Claude Sonnet 5 as a cheaper way to run agents

Anthropic Jun 30, 2026

Introducing Claude Sonnet 5

TechCrunch Jun 30, 2026

Acti puts AI agents directly into your smartphone keyboard

The Verge Jun 30, 2026