Hacker News

These tests mean nothing; I have yet to see a model that is better than Sonnet 4 for coding. I tried many, and all of them are subpar, even with a small code base.


Well, Codex with GPT5 High beats Claude Sonnet 4.5. This is anecdotal, but I've used both extensively.


At what speed? At some point you’ll have to compare to Opus.


And Sonnet 4.5 is better than Opus.


Well, yeah, no surprise. You should try GLM 4.6.


I tried it, and it was shockingly bad compared to their benchmarks and to Claude Sonnet 4.

I tried it with the Claude Code CLI. It didn't follow instructions correctly (I had a Claude.md file with clear instructions), stopped after a few implementations (less than 3 minutes), and produced code that does not work.

To give it the benefit of the doubt, I changed the instructions to target Next.js, since it's a well-known framework and I thought it might do better, but the quality issues were the same.


Well, Sonnet 4.5 is better.



