
I was a bit confused by your definitions, but here's how Mozilla broke out [1] the 271, um, things:

> As additional context, we apply security severity ratings from critical to low to indicate the urgency of a bug:

> * sec-critical and sec-high are assigned to vulnerabilities that can be triggered with normal user behavior, like browsing to a web page. We make no technical difference between these, but sec-critical bugs are reserved for issues that are publicly disclosed or known to be exploited in the wild.

> * sec-moderate is assigned to vulnerabilities that would otherwise be rated sec-high but require unusual and complex steps from the victim.

> * sec-low is assigned to bugs that are annoying but far from causing user harm (e.g, a safe crash).

> Of the 271 bugs we announced for Firefox 150: 180 were sec-high, 80 were sec-moderate, and 11 were sec-low.

Mozilla uses the term "vulnerability" even for sec-high, even though they say right below that it doesn't mean the same thing as a practical exploit. And on their definitional page, they classify even sec-low as "vulnerabilities" [2].

Words are tools that get their utility from collective meaning. I'd be interested in where you received your semantics and whether they match up with or diverge from Mozilla's.

[1] https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...

[2] https://wiki.mozilla.org/Security_Severity_Ratings/Client


I work at Mozilla; I fixed a bunch of these bugs.

In general, I would say that our use of "vulnerability" lines up with what jerrythegerbil calls "potential vulnerability". (In cases with a POC, we would likely use the word "exploit".) Our goal is to keep Firefox secure. Once it's clear that a particular bug might be exploitable, it's usually not worth a lot of engineering effort to investigate further; we just fix it. We spend a little while eyeballing things for the purpose of sorting into sec-high, sec-moderate, etc, and to help triage incoming bugs, but if there's any real question, we assume the worst and move on.

So were all 271 bugs exploitable? Absolutely not. But they were all security bugs according to the normal standards that we've been applying for years.

(Partial exception: there were some bugs that might normally have been opened up, but were kept hidden because Mythos wasn't public information yet. But those bugs would have been marked sec-other, and not included in the count.)

So if you think we're guilty of inflating the number of "real" vulnerabilities found by Mythos, bear in mind that we've also been consistently inflating the baseline. The spike in the Firefox Security Fixes by Month graph is very, very real: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...


What types of vulnerabilities was it finding? Cross-site scripting, privilege escalation, etc.? Mostly memory corruption, or any JavaScript logic bugs?

I work on SpiderMonkey, so I mostly looked at the JS bugs. It was a smorgasbord of various things. Broadly speaking I'd say the most impressive bugs were TOCTOU issues, where we checked something and later acted on it, and the testcase found a clever way to invalidate the result of the check in between.
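To make the shape concrete, here's a minimal made-up sketch in Python (not SpiderMonkey code, and not the actual bug): the engine-side code checks a property of an object, then something attacker-controlled gets a chance to run and invalidate that check before the result is used.

    # Hypothetical check-then-act (TOCTOU) sketch; names and structure are
    # invented purely for illustration.
    class Buffer:
        def __init__(self, data):
            self.data = data          # backing storage
            self.detached = False

        def detach(self):             # callable from "user code"
            self.data = None
            self.detached = True

    def copy_into(dst, src, length_getter):
        # CHECK: the buffer looks valid right now.
        if dst.detached:
            raise ValueError("detached buffer")
        # An attacker-controlled callback (think valueOf()/getter re-entry)
        # runs here and can detach dst, invalidating the check above.
        n = length_getter()
        # USE: acts on a stale check; dst.data may now be gone.
        for i in range(n):
            dst.data[i] = src[i]      # blows up here (memory corruption in C++)

    buf = Buffer(bytearray(8))
    evil = lambda: (buf.detach(), 4)[1]   # the "clever testcase": detach mid-call
    copy_into(buf, b"ABCD", evil)

The cleverness in the real testcases is finding a path where user-controlled code can run between the check and the use at all.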

If you look closely at, say, this patch, you might get a sense of what I mean (although the real cleverness is in the testcase, which we have not made public): https://hg-edge.mozilla.org/integration/autoland/rev/c29515d...


> although the real cleverness is in the testcase, which we have not made public

What is the point of keeping it private? I'd bet that feeding this patch to Opus and asking it to look for the specific TOCTOU issue fixed by the patch will make it come up with a testcase sooner or later.


The same is also true of a good security researcher, and has been for a long time. The question is mostly whether it takes long enough to come up with a testcase that we've managed to ship the fix to all affected releases, and given people some time to update. (And maybe LLMs do change the calculus there! We'll have to wait and see.)

Possibly! One of the many areas that might need rethinking in the age of AI (that started in February of this year) is how long security bugs should be hidden. We live in interesting times.

Given the commit is 4 weeks old, will it eventually get comments?

The code before the patch does not look obviously wrong. Now, some more lines were added, but would you say it now looks less obviously wrong, or more obviously correct?

It seems that the invariants needed here are either in some person's head, or in some document that is not referenced.

Reading the code for the first time, the immediate question is: "What other lines might be missing? How can I tell?"

If the "obviously correct" level of the code does not increase for a human reviewer, how is it ensured that a similar problem will not arise in the future? Or do we need more LLM to tell us which other lines need to be added?


Yeah, the test with the patch also adds comments. The human reviewer had extra context available.

I did get Opus to do an audit for similar problems elsewhere, to supplement the investigations that we were already doing by hand. It initially thought it found something, but when asked to produce a testcase, it thought for 20 minutes and admitted defeat. I suspect that the difference between Opus and Mythos is in small edges like this: if Mythos is a little bit faster at spotting why Opus's discovery didn't work, and it wastes less time chasing down red herrings, then it's more likely to find a real bug within the limits of a context window. It's not that Opus completely lacks some capability, it's that it has trouble chaining all the pieces together consistently.


Very cool, thank you.

I'd say it leans towards memory corruption kinds of issues, as those are easiest to pass the validator, thanks to AddressSanitizer. I think there's a lot of potential for making the validator more sophisticated. Like maybe you add a JS function that will only crash when run in the parent process and have a validator that checks for that specific crash, as a way for the LLM to "prove" that it managed to run arbitrary JS in the parent. Would that turn up subtler issues? Maybe.
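A rough sketch of what such a validator might look like (the sentinel string, the log format, and the per-process crash marker are all made up; this is not our real harness):

    import re

    SENTINEL = "PROOF_OF_PARENT_PROCESS_EXECUTION"  # hypothetical marker that
                                                    # the planted JS function embeds

    def validate(crash_log: str) -> bool:
        """Accept a testcase only if it produced the specific crash that the
        planted parent-process-only function triggers, not just any crash."""
        asan_hit = "AddressSanitizer" in crash_log
        parent = re.search(r"crashed process:\s*parent", crash_log, re.I)
        return asan_hit and parent is not None and SENTINEL in crash_log

A plain content-process crash would fail this check, so the model only gets credit once it has actually run its JS in the parent. That kind of "prove it" signal is what might surface the subtler issues.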

You may not be able to comment, but do you feel like Mythos is accomplishing anything that couldn't have already been done with Opus and the right prompting?

I've assumed I could send an agent using a publicly available model bug hunting in a codebase like this and get tons of results, assuming I wanted to burn the tokens, so it's really unclear to me whether the Mythos hype is justified or if it's just an easy button (and subsidized tokens?) to do what is already possible.


I never got direct access to Mythos, so all I know is what I've seen from the quality of the bugs being produced. I also haven't been involved at the prompting end.

So the best answer I can give is: I dunno, maybe it's possible to find bugs like this using Opus, but if so, where are they? Did nobody think to try "please find the bug in this code" pre-Mythos? I've done enough auditing with Opus to be convinced that it can be a good assistant to somebody who already knows what they're doing, but in practice the big wave of AI-discovered bugs started with Mythos.

I'm sure lots of people have assumed they could send a publicly available model bug hunting and find things. I have not noticed a huge amount of success. We've had some very nice correctness bugs reported, but skimming through the list of security bugs I've fixed recently, the AI-related ones all seem to be Mythos.

My best guess is that Mythos is just enough better along just enough axes that its hit rate on finding potential bugs and separating the real ones from the hallucinations is good enough to matter. Like, there's no obvious qualitative difference between 3.6 kg of uranium-235 and 3.8 kg of uranium-235, just a small quantitative increase. But if you form both of them into spheres, only one of them has reached critical mass. Can you do something clever to reach critical mass with 3.6 kg of uranium? Maybe! But needing to do something clever is a non-trivial barrier in itself.


I did some experiments, and Opus seemed pretty capable of wiring up a harness to find bugs and write a PoC + patch for each. It's still a lot of work to get fixes upstreamed from outside, so I think even if outsiders have better tools (Mythos etc.), it won't change the report rate much; people may find more bugs, but they won't report them. I suspect that's part of the calculation behind the phased rollout for Mythos: finding bugs is already not the bottleneck.

> It's still a lot of work to get fixes upstreamed from outside

I'm going to disagree in the specific case of Firefox. First, although it has diverged a long way from its roots, Mozilla still has the community project ideal in its DNA. Enough, at least, that I stumbled while reading the clause "from outside" -- if you're finding and reporting actual relevant security bugs, you're already on the inside. SpiderMonkey in particular still has a good amount of code being written and even maintained by non-employees. (Examples: Temporal and LoongArch64 JIT support).

Second, the bug bounty program still exists[0] and is being used. If someone were sitting on a pile of AI-discovered exploits, then it has monetary value which is rapidly draining away the longer they aren't reported.[1] That's incentive to put in the work to report them properly.

Third, I agree that finding bugs is likely not the bottleneck. Validating them is. With previous models, the false positive rate was too high so they required too much work to whittle down to the valid ones. A PoC is a very strong signal that a bug is valid, and that's where I just don't believe you: without a really good harness, I don't think Opus was good enough to find very many bugs with PoCs. It could find some, just not very many.[2]

[0] For now. It remains to be seen how it will adapt to the AI age. For the moment, it hasn't been severely nerfed like Google's.

[1] One could make the argument that people who are inexpert enough to only be able to poke an AI to find bugs are also the people more likely to sell them on the black market rather than disclosing them. It seems plausible. Even so, some people would still be disclosing, and not many were filing quality bugs pre-Mythos. Some were, but it was a trickle compared to post-Mythos.

[2] Also note that I personally, as a SpiderMonkey developer, don't find a huge amount of value in the AI-generated patches that accompany these bug reports. Sometimes they're useful to better illustrate the problem, especially since the AI's problem analysis is usually subtly wrong in important ways. They can be a decent starting point for a real patch. But I'll still need to go through my own process of figuring out what the right fix is, even in the handful of cases where I end up with the same thing the AI did.


We have a bounty program. If you can find security bugs in Firefox, please let us pay you for them. You don't need to provide a fix; a testcase that crashes in an interesting way is often enough to qualify.

https://www.mozilla.org/en-US/security/client-bug-bounty/


> I suspect that's part of the calculation of the phased rollout for Mythos, finding bugs is already not the bottleneck.

I was wondering this too. By working directly with tech companies and (one assumes) subsidizing tokens, they're empowering the people on the inside who absolutely want to have the bugs fixed.

Who outside of Mozilla is going to pay and spend the effort to find Firefox bugs? Sure, some hobbyists and contributors might, but they don't have the institutional knowledge of the codebase which can help guide an agent's prompts, nor do they have strong incentives to try and report them, nor do they necessarily have the time to craft good bug reports that stand out from the slop reports.

My assumption would be that most people working to discover bugs this way in Firefox are interested in using them rather than getting them fixed, so maintainers wouldn't necessarily even know the degree to which it was already happening.


The incentive is that Mozilla will pay you thousands of dollars if you find a security bug: https://www.mozilla.org/en-US/security/client-bug-bounty/

We have many outside contributors who have successfully submitted security bugs and received payments.


I'm not a security dev or researcher or anything, but as an outsider my understanding matches how Mozilla uses the terms. Though words used by specialists and the general public can often differ...

How about this: a "vulnerability" is a "vulnerability", but after it has been identified and verified to cause a problem, that's when it should be called a "bug", because it could make the software do unwanted things.

At Mozilla, everything is called a bug. It's what other systems call an "issue". So it's too late for your terminology at Mozilla. (Example: I have a bug to improve the HTML output of my static analysis tool. There is nothing incorrect or flawed about the current output.)

At Mozilla, but not everywhere: exploits are a subset of vulnerabilities are a subset of bugs.



Fwiw I think this is right. A bug is anything that doesn't do what you want it to do, and nobody should want a vulnerability in their software.

When I worked at Mozilla, _everything_ was called a bug, whether it was a software issue, a problem in the office or some paperwork missing.

Much as GitHub calls everything an "issue" and GitLab a "work item".


Can you elaborate why those bugs weren't found by e.g. fuzzing in the past?

I'm genuinely curious what "types" of implementation mistakes these were, like whether e.g. it was library usage bugs, state management bugs, control flow bugs etc.

Would love to see a writeup about these findings; maybe Mythos hinted that better fuzzing tools are needed?


If I had to guess, I'd say that AI is better at finding TOCTOU bugs than fuzzing because it starts by looking at the code and trying to find problems with it, which naturally leads it to experiment with questions like "is there any way to make this assumption false?", whereas fuzzing is more brute force. Fuzzing can explore way more possible states, but AI is better at picking good ones.

In this particular sense, AI tends to find bugs that are closer to what we'd see from a human researcher reading the code. Fuzz bugs are often more "here's a seemingly innocuous sequence of statements that randomly happen to collide three corner cases in an unexpected way".

Outside of SpiderMonkey, my understanding is that many of the best vulnerabilities were in code that is difficult to fuzz effectively for whatever reason.


Fuzzing isn't good at things like dealing with code behind a CRC check, whereas the audit-based approach using an LLM can see the sketchy code, then calculate the CRC itself to come up with a test case. I think you end up having to write custom fuzzing harnesses to get at the vulnerable parts of the code. (This is an example from a talk by somebody at Anthropic.)
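A minimal sketch of that dynamic (my own toy example, not the one from the talk):

    import os, struct, zlib

    def parse(packet: bytes) -> None:
        body, crc = packet[:-4], struct.unpack("<I", packet[-4:])[0]
        if zlib.crc32(body) != crc:
            return                    # nearly every blind-fuzz input dies here
        length = body[0]
        buf = bytearray(16)
        for i in range(length):
            buf[i] = body[1 + i]      # bug: length is never checked against 16

    def with_crc(body: bytes) -> bytes:
        return body + struct.pack("<I", zlib.crc32(body))

    # Blind fuzzing: roughly a 1-in-2^32 chance of even reaching the buggy loop.
    for _ in range(100_000):
        parse(os.urandom(24))

    # Code-reading approach: see the check, compute the CRC, hit the bug directly.
    parse(with_crc(bytes([200]) + bytes(200)))   # IndexError past the 16-byte buffer

Without a custom harness that strips or satisfies the checksum, coverage-guided fuzzing mostly just explores the early-return branch.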

That being said, I think there's a lot of potential for synergy here: if LLMs make writing code easier, that includes fuzzers, so maybe fuzzers will also end up finding a lot more bugs. I saw somebody on Twitter say they used an LLM to write a fuzzer for Chrome and found a number of security bugs that they reported.


Presumably there are (implicit?) "sec-none" things, like [a] from the recently released 150.0.2 [b], which makes absolutely no mention of "Security Impact" or "Severity" in the bug report, unlike [c], which is listed in the Mozilla weblog post [2].

Security things are mentioned in the Release Notes [b] pointing to a completely different document [d].

Perhaps sometimes a bug is 'just' a bug, and not a vulnerability.

[a] https://bugzilla.mozilla.org/show_bug.cgi?id=2034980 ; "Can't highlight image scans in Firefox 150+"

[b] https://www.firefox.com/en-CA/firefox/150.0.2/releasenotes/

[c] https://bugzilla.mozilla.org/show_bug.cgi?id=2024918

[d] https://www.mozilla.org/en-US/security/advisories/mfsa2026-4...


> Mozilla uses the term "vulnerability" for even sec-high, even though they say right below that it doesn't mean the same thing as a practical exploit.

That's not evident in what you pasted at all.

What you pasted says

> sec-critical and sec-high are assigned to vulnerabilities that can be triggered with normal user behavior […] We make no technical difference between these […] sec-critical bugs are reserved for issues that are publicly disclosed or known to be exploited in the wild.

> sec-low is assigned to bugs that are annoying but far from causing user harm (e.g, a safe crash).

From this one infers that the "180 were sec-high" bugs found are actually exploitable (even if not known to be exploited in the wild), and are NOT mere annoying bugs.

The difference between 180 and 271 does nothing to deflate the significance, or lack thereof, of the implication re: Mythos.


Yes, it is not in what I pasted, as I said, "even though they say right below". If you don't believe me then click on either of the links.

Unless there's a single buyer, or at least a coordinated decision of "above this price, we all buy nothing, and below this price we all buy our normal amounts", it truly does matter what the price is when there's not enough for sale.

The price is determined by who needs it the most and is willing to give up the most cash. Instead of rationing by lines, or fixed quantities, it's allocation to those who can either make the most out of the jet fuel, or those for whom money is the least valuable.


I think the best interpretation of the overall stock market valuation is as a barometer of the wealthy's feelings, i.e. expectations of future returns rather than true future returns... but doesn't it seem like feelings have become completely unmoored from the baked-in damage to the global economy from the current shortage?

Or has GDP growth become so decoupled from energy use that I'm wrong and stock market valuations are completely OK, even as airlines brace for fewer flights due to energy shocks?


I wonder if some of this is because of the "prices are set on the margins" effect of markets. The price of anything is set by the folks who are actively transacting at any given time; if you're not buying and selling, your opinion doesn't matter.

Oftentimes, near a market top, the people who are value investors and actually care about price end up selling off all their holdings. But because they have already sold, and are not buying, they drop out of the market entirely. Prices get set by the people who are price insensitive, because they're the only ones willing to participate. As a result, you often get the "blow off top" right before a market crash, where the stock market moves sharply upwards even though fundamentals say it should crash. All the folks who believe it will crash have already left and no longer participate in price-setting.


> But because they have already sold, and are not buying, they drop out of the market entirely. Prices get set by the people who are price insensitive, because they're the only ones willing to participate.

The buy & hold value investor is also not participating in price discovery since they are just passively holding.


That makes sense, but something I've been wondering as of late... what if markets are just being... cornered? Surely it can be difficult to distinguish genuine market activity from what isn't, no?

I mean, I know it sounds absurd for something the size of the stock market, but all you really need to do is have enough capital to soak up the available float, and you can control prices. Even easier with options, as they give you leverage to do this. And arguably this essentially explains the behavior behind stocks like GME, AMC, AVIS, BIRD... heck, even Tesla. If you look at Tesla, most of its stock price appreciation happened in 2020-2021, when SoftBank was allegedly controlling their options chain. And since its peak in 2021, when SoftBank stopped, Tesla has underperformed the S&P 500, and would probably have fallen by now if it wasn't for all the passive inflows from being in the Mag 7. I mean, it could just be coincidence or genuine market activity, but how can we be certain that markets aren't just being cornered by coordinated groups?

Also, the oil market right now doesn't make a whole lot of sense if you compare futures and spot and what analyst estimates are giving even in the best case scenario. But Japan mentioned considering shorting over $1.4 trillion in oil futures, and if they actually are, then suddenly things make a bit more sense...


The stock market has stopped making sense since the 2008 financial meltdown. Many reasons for this. In part, ZIRP is to blame -- a ton of money flowed into stocks simply because they were the only way to tread water and get a return. In part, you can blame Bitcoin (which demonstrated at scale that you can have an "asset" with zero underlying value and net-negative social utility that still functions as an appreciating value token,) and the meme-stock.

The stock market is basically detached from the industrial manufacturing/production economy -- and even to some extent the services/insurance economy -- and is now vibes/feels based.


Regardless of individual stock performance, present-day total stock market liquidity is a proxy for expectations of future total stock market liquidity.

If people are keeping their money in the market (regardless of allocation inside the market), they are expecting that any other asset class will perform worse in the near future. If they expect that commodities are too volatile, spending won't pay off, monies and bonds will inflate away, and land may face legal risks from populist, technocratic, or extrajudicial changes to the legal system, then their least-worst option is to go all in on stocks. Furthermore, the energy sector is going to have a windfall from filling up the VLCCs of the world and will look for anywhere to dump the cash that helps escape taxes, driving future liquidity expectations even higher.


Right, and "monies and bonds will inflate away" is related to what I said re ZIRPs -- you can't expect a decent return by parking your money with the bank, either, as interest rates are low to nonexistent. The stock market's the only good option. This was no accident, it was an intentional policy move... And now, "the DOW's over 50,000!"

What's even more troubling is that there was once the pretense that valuations had something to do with fundamentals, but this has gone entirely out the window since about 2013.

So basically none of it makes any sense and you've just got to ride the tiger.



Thanks, the second chart is more concerning than the first. The first has a variety of potential distortions at play: first of all, the portion of industrial activity that's done by public vs. private companies shifts over time, and second of all, a lot of these companies do 40 to 50% of their business overseas.

Whereas a historically low ratio of earnings to index value is a deeper concern to me.


I’ve never looked at the numbers to see exactly how large of a pool it is but millions of Americans are effectively forced to buy into the market to fund our 401ks and future retirements. That’s billions of dollars of inflow every month that has to go somewhere, usually index funds.

It is exactly this. I am self-employed and have been managing my solo 401k for 15+ years now. While I am fully aware of the market being overvalued, I will still max out my 401k this year ($80k, I am over 50 :) ) and that money needs to go somewhere. I could theoretically sell and sit on the cash waiting for a "crash", but I could have done that a year ago (things were similarly not very rosy) or even earlier, and that obviously would have been a terrible financial decision. No one can time the market, and therein lies the core issue: money will always be flowing in...

Another explanation is that their expectation of the future value of money is very low. If money turns out to devalue 10X, then owning stocks (a call on the future income stream of companies selling presumably useful things/services in the future) that seem to be 10X overvalued makes sense.

One of the explanations I've heard is that a lot of traders were caught out by how quickly the economy rebounded from COVID-19, so they're overcompensating by underreacting to the current situation.

> One of the explanations I've heard is that a lot of traders were caught out by how quickly the economy rebounded from COVID-19, so they're overcompensating by underreacting to the current situation.

Yikes.

The reason Covid wasn't as bad as it could have been was WFH.

There’s no equivalent for oil. You can’t grow food at scale without fertilizer.


WFH and a big dollop of "socialism" in the US. Supposedly the reason many voters felt that their lives were better under Trump 1.0 was due to the various subsidy and rent freeze measures taken by the US government during the pandemic.

Probably markets expect the situation to be resolved soon.

You can check the term structure of oil to confirm: https://www.cmegroup.com/markets/energy/crude-oil/light-swee...

Equities are (in theory) priced on net-present-value of future cash flows, so a temporary <1 year disruption is important but not massively so.
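A back-of-the-envelope way to see that (made-up numbers, just illustrating the NPV point):

    # If one year of cash flows drops 30% and then everything recovers, the hit
    # to the discounted total is small relative to the headline disruption.
    r = 0.08                      # assumed discount rate
    years = 30
    base = sum(100 / (1 + r) ** t for t in range(1, years + 1))
    shocked = sum((70 if t == 1 else 100) / (1 + r) ** t for t in range(1, years + 1))
    print(1 - shocked / base)     # ~0.025, i.e. ~2.5% of total value

Of course that math only holds if the market believes the disruption really is temporary.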


Is it possible that a lot of people are hoping for a dot-com bubble 2.0 so they can sell at the market peak before it pops?

That would explain why they're ignoring fundamentals.

They could think that OpenAI and Anthropic IPOs will drive prices higher, and it still isn't time to sell.


It's a multi-layered question. On the one hand, if energy is more expensive and you still consume roughly the same amount, that actually raises GDP. You're assuming that there will be demand destruction with the rise in energy prices, which is a reasonable assumption... but that's uneven, and there's going to be an increase in demand for certain things that actually benefit from higher energy prices, like anything in electric tech or green energy.

Probably most importantly, the economy at this point is largely a digital economy rather than one centered on goods... in other words, GDP growth is no longer coupled to energy consumption. The fact that we were able to transition to a largely remote workforce around COVID is a testament to this.

The implication is that, in the event these high prices are sustained and there is some demand destruction, a lot of fundamental parts of the economy will continue to function in an evolved way, for example online.

And then of course there's AI which could be considered sort of an extension of this digital economy which is driving so much of the underlying growth.

That doesn't mean there won't be hot spots; as this article points out, perhaps the UK is particularly exposed. On the other hand, the fact that so much of the UK's economy is financial services, and hence in a way benefits from all this volatility, means it is not all so clear.

It would be easier to say that the real impact will be on the manufacturing powerhouses, but even they will benefit from the transition to a solar- and battery-based energy system.

Now, if you believe it's inevitable that this bubble will have a slowdown, and you speculate that the bubble might be partially punctured by these high energy prices, that seems like a reasonable hypothesis... but it could also be the opposite. In other words, the demand destruction for energy could actually mean capital goes looking for somewhere more fruitful to invest.

Counterintuitive as it all may seem, this system is simply not one anyone can reasonably expect to make predictions about, at least not any more reliably than walking into a casino and expecting to beat the house.

Similarly, the experts and talking heads telling you the implications of this war or the Ukraine war are making one-dimensional predictions that are simply not honest enough about how chaotic and reflexive these systems are.


Probably people are betting that the situation resolves itself before shit hits the fan.

And there are a bunch of plausible reasons why this belief is not crazy (of course nobody really knows):

- Trump is literally called TACO.

- The war is really unpopular in the USA and the midterms are getting closer. There is domestic US pressure to wrap this up.

- Iran's economy was a mess before all this and is now a disaster. The blockade goes both ways, and it seems unlikely Iran can keep it up long term.

- As shortages approach, international pressure from uninvolved parties to resolve the situation one way or another will mount.


I know you're probably talking about the compute infrastructure, but I think the electricity infrastructure side is interesting too: data centers are doing things in dumb ways because the need for speed of operational expansion outweighs the need to save dollars:

> It’s regulation with the utilities. There are ramp rates, there are all of these things that you’re supposed to do to not screw up the grid. Data centers have been in gross violation of that. When you think about what’s wrong with data centers, they have load volatility, which we just talked about, then they decide to power it with behind-the-meter natural gas generators. These natural gas generators, their shaft is supposed to last for seven years. It’s lasting 10 months because of all the cycling.

https://www.volts.wtf/p/doing-data-centers-the-not-dumb-way

On the compute infrastructure, there are standard NVIDIA reference designs like this:

https://www.nvidia.com/en-us/technologies/enterprise-referen...

I haven't bothered to look, but I'd guess Mellanox GPU-to-GPU networks, plus massive custom code for splitting tensors across GPUs and for shuttling activations across GPU nodes.
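For anyone who hasn't seen what "splitting tensors across GPUs" looks like, here's a toy column-parallel linear layer (forward pass only, simplified and made up for illustration; the real stacks layer NCCL, fused kernels, and a lot more on top):

    import torch
    import torch.distributed as dist

    class ColumnParallelLinear(torch.nn.Module):
        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            world = dist.get_world_size()
            assert out_features % world == 0
            # Each rank holds only its slice of the full weight matrix.
            self.weight = torch.nn.Parameter(
                torch.empty(out_features // world, in_features))
            torch.nn.init.xavier_uniform_(self.weight)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            local_out = x @ self.weight.t()              # each GPU computes its shard
            shards = [torch.empty_like(local_out)
                      for _ in range(dist.get_world_size())]
            dist.all_gather(shards, local_out)           # activations cross the fabric
            return torch.cat(shards, dim=-1)

That all_gather is exactly the kind of traffic that wants a fast GPU-to-GPU fabric.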


Agreed on the insignificance of the degrowth movement as far as seriousness goes. However, there is a serious NIMBY movement, and the degrowth movement provides the moral cover for a lot of very selfish people who are not serious about degrowth but wish to accomplish the same goals for individual projects near them.

Oh, very glad to see this, ML applications that were mentioned in it are exactly why I was thinking this was such a disastrous change.

However, the tedium of the reply chain reminds me why I tend to focus most energy on internal projects rather than external open source...

Docker may have been built for a specific type of use case that most developers are familiar with (e.g. web apps backed by a DB container), but containerization is useful across so many parts of computing that are very different. Something that seems trivial in the Python/DB space, having one or two small duplicates of OS layers, is very different once you have 30 containers for different models+code, and then ~100 more dev containers lying around as build artifacts from building, pushing, and pulling, each at ~10GB; at that point the inefficient new system is just painful.

The smallest PyTorch container I ever built was 1.8GB, and that was just for some CPU-only inference endpoints, and that took several hours of yak shaving to achieve, and after a month or two of development it had ballooned back to 8GB. Containers with CUDA, or using significant other AI/ML libraries, get really big. YAGNI is a great principle for your own code when writing from scratch, but YAGNI is a bit dangerous when there's been an entire ecosystem built on your product and things are getting rewritten from scratch, because the "you" is far larger than the developer making the change. Docker's core feature has always been reusable and composable layers, so seeing it abandoned seems that somebody took YAGNI far too extreme on their own corner of the computing world.


Docker(Hub) just isn't built for this use case. I've built https://clipper.dev to better handle ML/large images. It consists of a registry+pull client that breaks apart layers and does content addressing of individual chunks by _uncompressed_ hash, so that content can be better shared. My pull client has better parallelization and wastes much less bandwidth. It annoys the heck out of me when I change one file in a layer and have to redownload bytes my device already has. By sharing across layers I've seen 80-90% improvements in pull times for "patches".
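For anyone curious what chunk-level content addressing buys you, here's the rough idea in a few lines (a simplified illustration, not my actual wire format or code):

    import hashlib

    CHUNK = 4 * 1024 * 1024  # arbitrary 4 MiB chunks for the sketch

    def chunk_ids(uncompressed_layer: bytes) -> list[str]:
        """Address a layer as a list of hashes of its uncompressed chunks."""
        return [hashlib.sha256(uncompressed_layer[i:i + CHUNK]).hexdigest()
                for i in range(0, len(uncompressed_layer), CHUNK)]

    def bytes_to_pull(new_layers: list[bytes], local_chunks: set[str]) -> int:
        """Only chunks the client has never seen cross the network, even if
        they now live in a 'different' layer than before."""
        needed = 0
        for layer in new_layers:
            for i in range(0, len(layer), CHUNK):
                chunk = layer[i:i + CHUNK]
                digest = hashlib.sha256(chunk).hexdigest()
                if digest not in local_chunks:
                    needed += len(chunk)
                    local_chunks.add(digest)
        return needed

(A production system would presumably use content-defined chunk boundaries so a one-byte change doesn't shift every later chunk, but the dedup accounting works the same way.)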

I'm also in the process of building a BuildKit builder, I'm seeing large improvements on the speed of exporting images. The same image that takes Docker >3 minutes to export and push takes me under a minute. https://github.com/clipper-registry/benchmarks/actions/runs/...


> ...containerization is useful across so much of computing that are very different....

So much so that containerization in general predates Linux, and UNIX, going all the way back to System/360.

Also, it was introduced into Tru64, HP-UX, BSD and Solaris before landing in Linux.


This is hell for a lot of ML containers, which have gigabytes of CUDA and PyTorch. Before, at least you could keep your code contained to a layer. But if I understand this correctly, every code revision duplicates gigabytes of the same damn bloated crap.

It's even worse when you end up installing PyTorch as a separate package in some other layer. It's not shared between layers at all with regular Docker.

If you have problems with 13 GB (I believe) of Docker layers... how do you deal with terabytes or petabytes of AI training data?

Petabytes of training data is only one application of PyTorch, which is going to use tens of thousands of containers, but...

Inference, development cycles, any of the application domains of PyTorch that don't involve training frontier models... all of those are complicated by excessive container layers.

But mostly dev really sucks with writing out an extra 10GB for a small code change.


Going to self promote one last time here - I've built a fix for this, at least for the registry/image export side, at https://clipper.dev. Docker(Hub) can't share large files between layers, but I can.

You don't even need MB of training data for some ML applications. AI is the sexy thing nowadays, but neural networks (Torch is an NN library) are generally useful even for small regression and classification problems.

For some problems you might even be able to get away with single digit numbers of training points (classic example of this regime being Physics-Informed Neural Networks)


Yeah, our handful of models we just commit to the git repo--usually only a few MB.

The image still ends up being like 6-8 GiB though. IIRC PyTorch had a hard dependency on CUDA libs, which pulled in a bunch of different hardware-specific kernel binaries. The models ran on CPU and didn't even need CUDA, but it was incredibly hard to remove them--there was some PyTorch init code that expected the CUDA crap to exist even on CPU-only setups.


the training data is on a separate drive; or the training data isn't that large for this use case; or they aren't training.

You don’t train petabytes on your laptop.

They are in the "Year unknown (12)" category but that's a weird thing to include.

Plus they aren't even AI products, but either have vigorous pre-AI use (W&B) or are completely non-AI but just used by lots of prototypes (Streamlit).


Weights and Biases is a tool for monitoring AI training runs.

I don't see how you can say it has "pre-AI use" unless you are narrowly defining AI to be LLMs.


Weights & Biases goes way back to 2017, before deep learning was called "AI". It's used for all sorts of non-AI ML too. It may be used for lots of AI but it sees a ton of non-AI use too. It's definitely not my thing (I prefer MLFlow) but lots of people love it.

As a writer who has loved using em-dashes and bulleted lists in email for decades, it really pains me to see this. It's like Michael Bolton in Office Space for me: https://www.youtube.com/watch?v=fhxRAsnizbk

I get it, but this one's a double-whammy with "it's not X, but Y".

People call me a bot all the time just because I write in paragraphs without basic spelling or grammatical mistakes.


Very nice! I use something client side like this on my Mac, and if my terminal was Linux I'd want this too.

The host key section makes me wonder about doing this with servers, but what are the security guarantees in places like the cloud with that? Are you relegated to a software TPM, and if so, what guarantees does a software TPM have?

My question after reading the README.md: what are the requirements from the OS? Can it be Windows, Linux, etc?


> The host key section makes me wonder about doing this with servers, but what are the security guarantees in places like the cloud with that? Are you relegated to a software TPM, and if so, what guarantees does a software TPM have?

For the cloud? You would probably have a software TPM so not super secure, but you would still prevent the keys from being extracted away from the server. And if you don't trust your hypervisor/cloud provider you probably have other issues?

In my head the security guarantees are more straightforward for physical servers where you have a fTPM or a dTPM.

> My question after reading the README.md: what are the requirements from the OS? Can it be Windows, Linux, etc?

This only supports Linux.


Thanks for the quick answers! I don't know much about TPMs, so I'll have to read about software ones to find out about that model.

One of my clients has security audits where even certs result in fights about "secrets on the machine" and having this one level of indirection for host keys may help out, even if a SWTPM doesn't provide much. At least, depending on how the SWTPM presents on the filesystem.

