#1 NIST Finds DeepSeek V4 Pro Trails US Frontier Models by 8 Months, Flags Benchmark Inflation
NIST's Center for AI Standards and Innovation published its independent evaluation of DeepSeek V4 Pro across cyber, software engineering, math, reasoning, and natural sciences. While DeepSeek's own benchmarks claim parity with Opus 4.6 and GPT-5.4, CAISI's non-public benchmarks place it closer to GPT-5 — roughly eight months behind the frontier. Notably, the model is still more cost-efficient than GPT-5.4 mini on five of seven benchmarks tested, reinforcing China's open-weight cost advantage even as capability claims are deflated.