jatora · 2026-06-10T02:28:46 1781058526

Other models aren't even close except for GPT 5.5. You're dead wrong on that. You read too many benchmarks and/or Chinese propaganda. There hasn't been a serious contender in agentic SWE besides OAI and Anthropic for a long time, and no Chinese model has even reached Opus 4.5 performance yet. The moat isn't insurmountable but it is very solid for at least a 12 month lead time. Which is such an insane amount of time in this landscape and industry. The moat is stretching, not shrinking, on agentic SWE. And that is literally the only moat that matters for RSI.

gck1 · 2026-06-10T02:52:07 1781059927

DeepSeek 4 Pro is performing agentic SWE tasks for me quite well. It can't do everything Opus can do, but if OpenAI and Anthropic disappeared tomorrow, I'd figure out ways to make it work with harness improvements and other optimizations.

Anthropic can stretch the moat all they want, but in the department of trust, they put a final nail in their coffin today. Anthropic is pure evil at this point.

jatora · 2026-06-10T03:39:08 1781062748

'evil' lol. Every single corporation you deal with is evil then. it's greed. and almost every large model provider is guilty of it. China is all open source right now. cool! gee i wonder what would happen if they ever actually achieved SOTA? They would clamp down on that so fast Dadio's dradel would spin

AuthAuth · 2026-06-10T04:49:06 1781066946

China isn't "all open source"; they still keep their top models out of the public view. It's easy to "open source" models when they're so far behind that very few will pay for them.

Open source in quotes because they are not open source and not even close to open source.

jatora · 2026-06-10T18:56:16 1781117776

And what models do they keep out of public view? What ridiculous propaganda is this?!

prmoustache · 2026-06-10T05:12:04 1781068324

Can't we stop using "open-source" when it is just freeware?

SXX · 2026-06-10T05:28:03 1781069283

Open-weight is both a meaningful and a unique term.

gck1 · 2026-06-10T03:56:23 1781063783

> Every single corporation you deal with is evil then.

I don't know. If my ISP started MITMing my traffic so that they could silently rewrite packets, and/or deleting files on my computer because they thought me sharing a wireless AP with my SO was me trying to compete with them, I'd call them evil.

I believe they tried something similar to the first one a few years ago in the US, and I remember people called that evil to the point where tech giants shut down their websites in protest.

> gee i wonder what would happen if they ever actually achieved SOTA? They would clamp down on that so fast Dadio's dradel would spin

Cool. Let them "achieve SOTA" and close down the models. Let the pendulum swing the other way.

You seem to not understand what China's goal is here. They want the AI bubble to burst and take your 401ks with it. And OAI's and Anthropic's decisions are driving you towards that cliff.

ggoo · 2026-06-10T02:43:14 1781059394

I use gpt 5.5 at work (because they pay for it) and DeepSeek at home (because I pay for it) and while I do agree one is better than the other, I think you’re really overstating how far apart they are. Just my take.

mirsadm · 2026-06-10T06:13:44 1781072024

What's 12 months lead time worth? Not much from what I can tell. Contrary to what these AI companies might tell you, if an AI model can't do it, a human can still do the work.

gitanovic · 2026-06-10T08:48:45 1781081325

Honest question: is it possible that, since they might be using the latest/best model to analyze and improve the existing ones, the moat will expand exponentially, making the models better and more efficient at each iteration until there is no point in competing?

jpfromlondon · 2026-06-10T10:41:08 1781088068

All models from the past two years are close in the general case.

This is just another incremental improvement, rushed out to boost the IPO. AI has the capacity to aid an engineer, but this minor bump in performance will have essentially zero impact on the productivity of an engineer working on real-world solutions when compared with any other major model.

We are trending towards asymptotic and it can't happen fast enough; that's when the true cost of this will become evident.

solenoid0937 · 2026-06-10T02:36:00 1781058960

Most of HN is stuck in this fantasyland where they insist their local LLM setup is comparable to Opus 4.8 or GPT 5.5. It's like a collective delusion, I've never seen anything like it.

written-beyond · 2026-06-10T02:41:09 1781059269

You can get really good results with Chinese models. You're putting Opus and GPT on too high of a pedestal.

solenoid0937 · 2026-06-10T02:56:36 1781060196

I use Chinese models (for simple personal projects), they just don't compare to GPT or Opus for any serious work.

I do not know why every Chinese model fan thinks that people that aren't impressed by them simply don't use them.

SXX · 2026-06-10T05:36:55 1781069815

The vast majority of software engineers do very little except moving JSONs around and building CRUDs.

It's quite obvious that when you don't try to do something particularly complex there will be literally no difference between GPT, Claude, Gemini and DeepSeek.

For many things I'm doing in gamedev, Gemini 2.5 Pro was already good enough even though it was released more than a year ago.

Once you pass a certain threshold it's just enough.

Vetch · 2026-06-10T11:15:47 1781090147

What constitutes serious work and how seriously have you tried to do serious work with them? While those trying to claim a 30B dense model can match Opus 4.6 are engaging in either beyond over-excessive over-exaggeration or performing rather routine tasks, it's disingenuous in the other direction to claim the latest open 1T models are not useful for serious work. I find those making such claims have rarely spent more than a few minutes on halfhearted attempts and often on recently obsoleted models.

Openweight models turned a corner around kimi 2.6, deepseek v4 pro/flash, hy3 and mimo 2.5 pro. Similar to how closed LLMs turned a corner around gpt 5.2 and opus 4.5.

While they remain a step behind closed frontier models, for real world tasks ranging across functional reactive programming, distributed systems, mathematical modeling, to-the-millisecond highly optimized spatial data-structures, complex compute shaders and shader effects and non-trivial systems involving parser combinators and algebraic effect systems, I can say that open models have very recently gone from useless to productive. For my work, mimo v2.5 pro is hands down better than sonnet 4.6.

bigbadfeline · 2026-06-10T02:51:49 1781059909

Some of the new and open models are very capable now. The truth is, the value of the model is in the mind of the user: the big names are impressive to those who know little and are dazed by little, but they are bound to end up wrong regardless of how good the model is.

jatora · 2026-06-10T03:34:42 1781062482

This is ridiculous. How about the rational users who use the best current model regardless of brand? The value of the model is in the quality of the output over time. I give every major model a chance. Coding and scripts in the chat are nothing compared to the power of agentic SWEEEEEEEEE. And nothing is remotely close to claude and gpt. If you're comfortable with being well behind SOTA intelligence, then good for you, but some of us prefer to be efficient with our time and resources. With your mindset, you will never truly SWEEEEEEEEEEEEEEEEEEEE

jpfromlondon · 2026-06-10T12:08:57 1781093337

That isn't rational. Rational is using the model that can best solve your current problem in the timeliest, most cost-conscious manner.

I'm not working on the frontier problems, I don't need god-in-a-box for $600 per month.

jatora · 2026-06-10T16:45:56 1781109956

it's not god in a box and it's not $600 per month

and almost nobody is working on frontier problems. they just want frontier intelligence to solve their given problems in a superior manner.

you're minimizing and exaggerating all of the wrong things. cope more i guess - more compute for us!

jpfromlondon · 2026-06-10T21:39:37 1781127577

Your comment makes it pretty clear that mine went over your head and that's fine, these tools are for people like you, godspeed.

jatora · 2026-06-07T12:47:09 1780836429

I don't see the point of your comment besides sidestepping a clearly revolutionary mind and an interesting scenario.

probably_wrong · 2026-06-07T14:29:48 1780842588

The point of my comment is to call attention to the SV tendency of hyper-focusing on the newest shiny toy as a solution to all problems while ignoring the real solutions to the real problems we have right now.

If we assume roughly 1.2k people were as smart as Einstein when he was born then, thanks to birth rates, we could have our "10000 Albert Einsteins" today. Statistically speaking ~3k of them alone were born in either India or China and are probably working a regular, badly-to-okay paid job [1]. We could be recruiting them today.

But no one cares about that because the premise is flawed and it's not about solving "medical, scientific, and societal issues". It's about making money and chasing "interesting scenarios" instead of actual solutions. As the meme format goes, men will literally clone Albert Einstein's brain instead of giving proper funding to schools.

And sure, chasing SF scenarios is fun, but let's not pretend that any of it is about making society better. As the sibling comment points out, we are more likely to get a clone of Rupert Murdoch than one of Stephen Hawking.

[1] For extra irony we can imagine a non-zero number of them work for patent offices.

SauntSolaire · 2026-06-07T20:41:02 1780864862

Well it would certainly help to know before they're born which children are going to be Einstein. Maybe with ten thousand of them around we could ask some to help sort the education system out.

Ma8ee · 2026-06-07T15:01:12 1780844472

I think part of the point is that Einstein’s genius was only partially the brain. It was also a unique upbringing at a specific point in time that made it possible. We would have many more geniuses if we gave more people the opportunities.

jatora · 2026-06-05T13:12:06 1780665126

> Also: I liked a song and it was sonos. I unliked it after discovering. I feel so stupid, so often.

This is asinine. Keep depriving yourself of things you enjoy I guess?

Anamon · 2026-06-10T11:40:00 1781091600

That's like saying it's irrational to disregard a news report after finding out it was fake. You believed it and found it interesting at the time. Why would knowledge about it being untrue change that?

Because we don't read reports just to have something to read, we do because we want to learn something about the world. And actual music fans don't listen to music just to drown out the silence, they want to listen to what another human being has to say, someone with their own history, experiences, views on life and the world, something to say and the creativity to express it in an interesting way. It makes absolute sense to be pissed off at something artificially generated, devoid of any intent or meaning, statistically generated based on a mass-pirated library of real musicians' work, skinsuiting as music with content.

I admit there are situations where music is used just to have sound, and some people just want to have something on in the background without having to pay attention to it. And one could lament or not the fact that humans are no longer really necessary to produce vapid muzak. But going by the GP's words, I don't think that that's what they were doing or looking for.

nunez · 2026-06-05T14:27:37 1780669657

Every like to an AI-generated work is (literally!) one more data point in support of record labels dropping human artists for AI artists that will do what they want, perform where they want, and give all of their profits back to the label.

Movie studios are "signing" AI artists from AI studios for massive dollars; this is happening.

Maybe you don't care, but music is beautiful and difficult, and I really enjoy hearing works from people that have a passion for it.

You don't have to worry, though; most people are in your school of thought. "Who cares? It's good." Short-term thinking is best-term thinking.

jatora · 2026-06-06T14:50:16 1780757416

I don't think AI music proliferation is in favor of music studios at all, hence why they want to crush it by any means necessary. You should question your stance when you find yourself on the same side as the MPAA/RIAA/etc.

I am in favor of being able to find music I like, with the least friction possible, without fueling the legacy music industry that is inflated far beyond reasonability.

nunez · 2026-06-08T15:18:46 1780931926

AI music is EXCELLENT for the labels.

They (theoretically) have the physical infrastructure and capital to produce entirely AI-generated records, master and press them, promote and market them, retain an in-house band to play the record and book venues for them to play in (see also: Live Nation/Ticketmaster).

The members of the band are replaceable, thus capping the compensation ceiling.

The label retains almost all of the revenue end-to-end in this model. No messy contracts (except to AI providers).

Why would they invest in human talent that has needs, like, idk, sleep, when they can have AI-generated artists that can be available any time for any reason? (Lots of people don't care where their music comes from, unfortunately, and Spotify, the biggest streaming app, is already working hard to desensitize people to AI-generated tracks.)

This is already happening:

- https://www.ohio.edu/news/2025/07/your-new-favorite-band-may...

- Even Timbaland (very successful producer in the 2000s) started an AI-only label: https://www.thefader.com/2025/07/25/imoliver-human-ai-music-...

account42 · 2026-06-08T10:13:18 1780913598

> You should question your stance when you find yourself on the same side as the MPAA/RIAA/etc.

This is tribal thinking.

wussboy · 2026-06-05T13:18:57 1780665537

Perhaps knowing a human with talent worked on it, putting some small part of themselves and their lived experience into the music has value to them? If so, then their actions make complete sense.

tejohnso · 2026-06-05T13:34:41 1780666481

Human created music might have value to them, but it doesn't mean that the AI song was valueless. They admit they enjoyed it. So it doesn't make sense in terms of it not having value.

I wouldn't say it's asinine though. People reject creative output out of personal protest against the creator. Someone might love a movie only to refuse to ever watch it again because they found out the director was accused of something horrible.

Some people just don't want to support anything to do with AI. Although in this case the OP admits to also using AI directly so there's some inconsistency there, which is consistent with the state of confusion and uncertainty OP is expressing.

jatora · 2026-06-06T14:52:07 1780757527

True, asinine was too strong a critique. I am just not in favor of emotional decision making where it can be helped. I can despise Tom Cruise as a person while still enjoying many of his movies. I know that some can't do that. I have always detached the creators from the art and that makes AI slot right in for me.

sodapopcan · 2026-06-06T17:00:43 1780765243

It's not emotional when there is still always a person behind these "works" who stands to benefit from you listening to them.

sodapopcan · 2026-06-05T13:33:49 1780666429

It's not asinine at all. Context matters in art. Otherwise, more songs exist that I would probably really like than I will ever hear, so I'm going to focus on the human-made ones. Besides, part of the joy of music extends beyond listening. For many people, myself included, if we feel really connected to a song we like to learn about the people who created it.

jatora · 2026-06-06T14:54:54 1780757694

Context doesn't matter in all art. Sunset, aurora borealis, waterfall, clear night skies, etc. These are devoid of a creator and have beauty nonetheless. While they aren't "art", they evoke the exact same brain centers, which means the creator can be factored out of our enjoyment of external stimuli. And you can follow that chain of reasoning to a more rational stance on AI and art in general.

sodapopcan · 2026-06-06T16:58:44 1780765124

> Context doesn't matter in all art. Sunset, aurora borealis, waterfall, clear night skies, etc. These are devoid of a creator and have beauty nonetheless.

I'm sure the religious fundamentalists would have something to say about that XD I'm not one, though, but I don't much appreciate the framing of needing a more "rational" stance here as it kind of signals bad faith to me. There is nothing emotional about worrying about artists' livelihood, which has already been under attack for decades before LLMs existed. I'm not saying that one HAS to derive (additional) enjoyment from the presence of a creator. I listen to plenty of human-produced music where I don't care about the artist, but these are always songs that I listen to for a short period of time then never think about again. And of course this goes way beyond just who the creator is as a person, it's also about what they inflict into the art itself: the unique vocal inflections, odd ways someone plays an instrument, how they perform it... but that's getting more into this than I meant to when I started typing what I thought would be two sentences, lol.

Anyway, I'm also not saying that people can't enjoy generated output, I'm just not going to support it. Especially since there is still always a person behind it who is maybe going to benefit from people listening to it. No thank you.

jatora · 2026-06-05T02:22:35 1780626155

This is weird to me because I am using Claude Code 10+ hours/day, 7 days a week, usually multiple sessions, and run into API errors in maybe 1 or 2 sessions per week. And about 2 major outages of 10-20 min in the last month. Not terrible and nowhere near what you are reporting. Therefore I don't believe you, because you don't even couch this in terms of it being something that seems particular to you or your region. Obvious dishonesty is fairly bad of you.

jatora · 2026-06-03T23:12:25 1780528345

I also randomly wrote some code in a bind yesterday, while I was on the toilet, and it felt so strange. That was the first I'd written in probably 6 months.

tcoff91 · 2026-06-03T23:16:43 1780528603

You don't even make small tweaks by hand? There's so many things that are honestly faster to do by hand than wait for agents to do.

jatora · 2026-06-04T03:13:47 1780542827

Nope I'm a couple levels too far removed from the code at this point for that. Closest I get is during meta-management (modularizing, complexity reduction, etc) with agents

jatora · 2026-06-03T02:47:19 1780454839

AI is far better at security than the majority of security professionals. It is a net positive.

People constantly compare AI to this very rare expert human rather than the reality of who is already employed. Experts like you are a major culprit of this. And it puts you at odds with yourself to both admit the industry is full of subpar workers and then lament that they will be replaced with workers that are better, but still worse than you.

What is wrong with someone to make them think in this manner? Is it just a kneejerk response with little thought? Is it ego? Is it a coping mechanism? I find it very strange and interesting and annoying.

void-star · 2026-06-06T02:38:50 1780713530

You are leaping to the assumption that I don’t actually believe in the tech. This is incorrect. I am griping about the way it is being recklessly and stupidly deployed by people who really don’t know what they’re doing.

nullpoint420 · 2026-06-03T06:35:08 1780468508

I also don’t like your framing, here.

We need experts to know when AI is wrong, which it is all the time.

Earlier this week someone commented here that we shouldn’t expect a language model to know that you need to drive a car to a car wash, to wash a car.

So then, what do we expect it to know? Who’s responsible for when it’s wrong?

Also, why can’t Mythos just fix all these issues itself if it’s so smart. And test them to make sure they work?

void-star · 2026-06-06T02:37:04 1780713424

I actually agree somewhat with jatora. However a large segment of the top ~20% of security folks are being forced to become reverse centaurs, as opposed to centaurs (disempowered vs empowered) due to the factors I mentioned. I genuinely see value in the tech, but it is currently being deployed recklessly and stupidly.

scrollaway · 2026-06-03T08:06:37 1780473997

> why can’t Mythos just fix all these issues itself if it’s so smart. And test them to make sure they work?

“Why”: because you didn’t ask it. It’s not its job in this case.

You don’t hire an accountant and tell them “why can’t you fix my cash-flow problems and make me money if you’re so smart”

nullpoint420 · 2026-06-03T15:21:30 1780500090

Ah ok, sure. The difference being the model should know how to do both based on what I’ve been told.

So why didn’t Anthropic ask it for me?

jatora · 2026-06-03T02:39:59 1780454399

for a cost, and so not handled perfectly at all. at least google doesn't charge us to harvest us like kagi does lol

jatora · 2026-06-03T02:39:01 1780454341

1970 called

parrellel · 2026-06-03T02:45:10 1780454710

Hey my encyclopedia had yearly updates through 2008!! :D

But seriously, it's actually nicer... and the equations on the back covers don't randomly change to crap every fourth time you open the book.

jatora · 2026-06-03T02:38:34 1780454314

Why don't you want AI?

jatora · 2026-06-03T02:37:45 1780454265

I'll take AI search over blog and SEO-infused search any day