We asked four AI coding agents to rebuild Minesweeper—the results were explosive

Ars Technica
Four leading AI coding agents were tested on their ability to recreate the classic game Minesweeper with a surprise feature, yielding varied results.

Summary

Ars Technica tested four AI coding agents (OpenAI's Codex, Anthropic's Claude Code running Opus 4.5, Google's Gemini CLI, and Mistral Vibe) by giving each a single prompt: build a full-featured, mobile-compatible web version of Minesweeper that includes a fun new gameplay feature. The agents operated directly on local files in a "single-shot" test, with no human debugging allowed.

- OpenAI Codex (9/10): Scored highest for correctly implementing the crucial "chording" feature and adding useful mobile controls, despite a less exciting bonus feature.
- Anthropic Claude Code (7/10): Fastest to generate code, with the most polished presentation and "Power Mode" extras, but it critically omitted chording.
- Mistral Vibe (4/10): Scored poorly for lacking both chording and sound effects, though its rainbow background was a minor fun addition.
- Google Gemini CLI (0/10): Failed completely, producing non-working code because of problems with its sound effects and dependency requirements.

The conclusion: while AI coding agents show real capability, they currently function best as tools that augment, rather than replace, human developers.
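For readers unfamiliar with the mechanic that separated the scores: "chording" means clicking an already-revealed numbered cell whose neighbors carry exactly that many flags, which then reveals all remaining unflagged neighbors at once. The article does not show any of the agents' code; the sketch below is a minimal, illustrative Python version of the rule, with a hypothetical `Cell` structure invented for the example:

```python
# Illustrative sketch of Minesweeper "chording" (not from the tested agents).
# Rule: if a revealed numbered cell has exactly as many flagged neighbors as
# its number, reveal every remaining unflagged, unrevealed neighbor at once.

from dataclasses import dataclass

@dataclass
class Cell:
    is_mine: bool = False
    is_flagged: bool = False
    is_revealed: bool = False
    adjacent_mines: int = 0

def neighbors(grid, r, c):
    """Yield coordinates of the up-to-eight cells surrounding (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]):
                yield nr, nc

def chord(grid, r, c):
    """Apply chording at (r, c); return the newly revealed coordinates.

    Returns an empty list when chording does not apply (cell not yet
    revealed, no number, or flag count does not match the number).
    """
    cell = grid[r][c]
    if not cell.is_revealed or cell.adjacent_mines == 0:
        return []
    flags = sum(grid[nr][nc].is_flagged for nr, nc in neighbors(grid, r, c))
    if flags != cell.adjacent_mines:
        return []
    revealed = []
    for nr, nc in neighbors(grid, r, c):
        n = grid[nr][nc]
        if not n.is_flagged and not n.is_revealed:
            n.is_revealed = True
            revealed.append((nr, nc))
    return revealed
```

Note that chording on a mis-flagged board would reveal a mine, which is exactly why the reviewers treated a correct implementation as a meaningful differentiator.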

(Source: Ars Technica)