These tests mean nothing; I have yet to see a model that is better than Sonnet 4 for ...

nnevatie · 2025-10-02T19:10:51 1759432251

Well, Codex with GPT5 High beats Claude Sonnet 4.5 - this is anecdotal, but I've used both extensively.

solarkraft · 2025-10-02T20:17:34 1759436254

At what speed? At some point you’ll have to compare to Opus.

adastra22 · 2025-10-02T23:33:32 1759448012

And Sonnet 4.5 is better than Opus.

Bolwin · 2025-10-02T19:45:30 1759434330

Well yeah, no surprise. You should try GLM 4.6.

Oras · 2025-10-03T08:24:27 1759479867

I tried it, and it was shockingly bad compared to their benchmarks and to Claude Sonnet 4.

I tried it with the Claude Code CLI. It didn't follow instructions correctly (I had a Claude.md file with clear instructions), stopped after a few implementations (less than 3 minutes), and produced code that does not work.

To give it the benefit of the doubt, I changed the instructions to target NextJS, since it's a well-known framework and I thought the model might do better, but I still saw the same quality issues.

adastra22 · 2025-10-02T23:33:10 1759447990

Well, Sonnet 4.5 is better.