Anthropic Redesigns Engineering Tests After Claude Models Match Human Candidate Performance

Anthropic has shifted to abstract logic puzzles for technical hiring after discovering its newest AI models can solve traditional realistic engineering tests.

Decoded

Published Jan 23, 2026

3 min read

Image by Anthropic

The AI Evaluation Arms Race

Anthropic recently revealed that it has overhauled its technical hiring process after its own AI models began outperforming top-tier human performance engineers. Tristan Hume, a lead on the performance optimization team, documented how successive versions of Claude rendered traditional take-home tests nearly useless. By the time Claude Opus 4.5 arrived, it could match the output of the strongest human candidates within standard testing time limits.

The company's evaluation journey began with a realistic simulation of a hardware accelerator where candidates optimized parallel code. This format worked well for over a year, helping the firm hire dozens of engineers who built its current model clusters. However, the rapid advancement of Claude forced a shift from realistic work toward increasingly unconventional challenges to maintain a clear hiring signal.
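The article does not describe the internals of that original simulator. Purely as an illustration of the format, the sketch below assumes a made-up accelerator cost model in which a candidate's kernel is scored by simulated cycles and wide vector operations amortize cost across lanes; the simulator name, cost table, and kernels are all hypothetical.

```python
# Illustrative only: a made-up "accelerator" cost model, not Anthropic's actual simulator.
# The take-home format scores candidates on how few simulated cycles their kernel needs;
# this toy reproduces that scoring loop.

from dataclasses import dataclass


@dataclass
class ToyAccelerator:
    """Counts cycles for a stream of (op, lanes) instructions."""

    # Hypothetical cost table: each op has a fixed cost per issue.
    COST = {"load": 4, "store": 4, "add": 1, "mul": 2}
    cycles: int = 0

    def issue(self, op: str, lanes: int = 1) -> None:
        # An 8-lane vector unit: an op over N elements takes ceil(N / 8) issues.
        issues = -(-lanes // 8)
        self.cycles += self.COST[op] * issues


def naive_kernel(sim: ToyAccelerator, n: int) -> None:
    """Element-at-a-time: every element pays full load/mul/store cost."""
    for _ in range(n):
        sim.issue("load")
        sim.issue("mul")
        sim.issue("store")


def vectorized_kernel(sim: ToyAccelerator, n: int) -> None:
    """Same work expressed as 8-wide vector ops, so costs amortize across lanes."""
    sim.issue("load", lanes=n)
    sim.issue("mul", lanes=n)
    sim.issue("store", lanes=n)


if __name__ == "__main__":
    for kernel in (naive_kernel, vectorized_kernel):
        sim = ToyAccelerator()
        kernel(sim, n=1024)
        print(f"{kernel.__name__}: {sim.cycles} cycles")
```

A test in this style rewards exactly the kind of optimization knowledge that now saturates model training data, which is the problem the next paragraphs describe.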

Hume discovered that as long as a problem resembled real-world engineering, Claude could solve it by drawing on its vast training data. Even when the team designed complex data transposition tasks, the model identified clever architectural tricks that mirrored human reasoning. This prompted a move to out-of-distribution tests, where a model's training data offers little to no advantage over raw human adaptability.


Solving for Novelty

The current iteration of the test is modeled after Zachtronics-style logic puzzles, featuring a tiny, constrained instruction set and no debugging tools to start with. Candidates must build their own tooling and solve abstract problems that rely on raw logic rather than knowledge of existing systems. This ensures the evaluation captures the ability to navigate novel environments, a skill the company finds increasingly vital.

Anthropic has now released the original version of its take-home as an open challenge on GitHub. While elite humans still hold an edge over AI when given unlimited time, the two-hour benchmark is now a dead heat. The company is inviting anyone who can beat the 1487-cycle score set by Claude Opus 4.5 to reach out to its recruiting team directly.
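The released challenge's actual instruction set and the details behind the 1487-cycle target are not reproduced in the article. As a rough illustration of the puzzle genre, the sketch below defines a hypothetical four-instruction machine and a homemade single-step trace of the kind a candidate might write for themselves, since the environment ships with no debugger; every opcode and register name here is invented for the example.

```python
# Illustrative only: a hypothetical four-instruction machine in the spirit of
# Zachtronics-style puzzles, not the instruction set in Anthropic's released repo.

def run(program, registers=None, trace=False, max_cycles=10_000):
    """Execute a list of (op, *args) tuples; return (registers, cycles used)."""
    regs = dict(registers or {})
    pc = cycles = 0
    while pc < len(program) and cycles < max_cycles:
        op, *args = program[pc]
        if trace:  # the "build your own tooling" part: a homemade single-step trace
            print(f"cycle={cycles:4d} pc={pc:2d} {op} {args} regs={regs}")
        if op == "set":      # set rX <imm>
            regs[args[0]] = args[1]
        elif op == "add":    # add rX rY  ->  rX += rY
            regs[args[0]] = regs.get(args[0], 0) + regs.get(args[1], 0)
        elif op == "dec":    # dec rX     ->  rX -= 1
            regs[args[0]] = regs.get(args[0], 0) - 1
        elif op == "jnz":    # jnz rX <target>: jump to <target> if rX != 0
            if regs.get(args[0], 0) != 0:
                pc = args[1]
                cycles += 1
                continue
        else:
            raise ValueError(f"unknown op: {op}")
        pc += 1
        cycles += 1
    return regs, cycles


# Example puzzle: compute 5 * 7 using only set/add/dec/jnz, in as few cycles as possible.
MULTIPLY = [
    ("set", "acc", 0),
    ("set", "a", 5),
    ("set", "b", 7),
    ("add", "acc", "b"),   # loop body: acc += b
    ("dec", "a"),
    ("jnz", "a", 3),       # repeat until a == 0
]

if __name__ == "__main__":
    regs, cycles = run(MULTIPLY, trace=False)
    print(f"acc={regs['acc']} in {cycles} cycles")  # lower cycle counts score better
```

The appeal of this format for hiring is that nothing in it maps onto a documented real-world system, so the candidate's only resource is their own reasoning and whatever instrumentation they bother to build.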

Decoded Take

The transition from realistic simulations to abstract puzzles signals a fundamental shift in how the tech industry must define human expertise. As large language models master standard workflows and common architectural patterns, the competitive advantage for human workers is shifting away from experience toward the ability to navigate novel, low-context environments. This suggests that the traditional technical interview is nearing an end, replaced by evaluations that measure how quickly a human can adapt to systems that have never existed before.
