Acing this new AI exam — which its creators say is the toughest in the world — might point to the first signs of AGI

中文日本語 Español

Live Science Feb 27, 2026

Researchers created "Humanity’s Last Exam," a difficult 2,500-question test to measure AI models against human-level knowledge.

Read Full Article

Summary

Researchers from the Center for AI Safety and Scale AI introduced "Humanity’s Last Exam" (HLE), a rigorous test designed to gauge how close current AI models are to achieving human-level knowledge across over 100 subjects. The exam consists of 2,500 non-searchable, precise, and unambiguous questions, vetted by over 1,000 subject-matter experts, aiming to surpass the limitations of existing benchmarks like MMLU. Initial testing showed poor performance, with OpenAI’s o1 scoring only 8.3%, though researchers predicted models could reach 50% accuracy by the end of 2025. As of February 2026, the highest score achieved was 48.4% by Google’s Gemini 3 Deep Think, significantly lower than the 90% achieved by human experts. The creators emphasize that while high accuracy on HLE demonstrates expert-level performance on verifiable questions, it is a necessary but not sufficient condition for concluding that Artificial General Intelligence (AGI) has been reached.

(Source：Live Science)

中文日本語 Español

Read Full Article

Happy Mag Apr 15, 2026

Lumen’s CEO warns that AI bots now rule the internet

TIME Apr 14, 2026

A New AI Tool Could Transform How We Diagnose Genetic Diseases

TechCrunch Apr 14, 2026

Anthropic co-founder confirms the company briefed the Trump administration on Mythos

TechCrunch Apr 14, 2026

Max Hodak’s Science Corp. is preparing to place its first sensor in a human brain

TechCrunch Apr 14, 2026

How vibe coding app Anything is rebuilding after getting booted from the App Store twice

Anthropic Apr 14, 2026

Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors

The Verge Apr 14, 2026

Has Google’s AI watermarking system been reverse-engineered?

PC Guide Apr 14, 2026

"A serious threat to privacy" Meta issued warning by 75 orgs over planned facial recognition in smart glasses

The Verge Apr 14, 2026

Daniel Moreno-Gama is facing federal charges for attacking Sam Altman’s home and OpenAI’s HQ

The Verge Apr 13, 2026

AI influencers are ‘everywhere’ at Coachella