
Same with BuildCache, except you also get a fast local cache so you effectively have an L1 and an L2 cache.

In fact, since you also have super fast "direct mode" caching that bypasses the preprocessor (like ccache but unlike sccache), BuildCache really has three logical levels of cache: direct, preprocessor and remote (S3, redis, ...).
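
Roughly, a lookup tries the cheapest level first. Here's a much-simplified C++ sketch of the idea (not the actual implementation; all names in it are made up, and maps stand in for the on-disk and remote stores):

    #include <functional>
    #include <optional>
    #include <string>
    #include <unordered_map>

    // Stand-ins for the real caches (in reality: a directory on disk and
    // an S3/redis backend). All names in this sketch are made up.
    std::unordered_map<std::string, std::string> local_cache;
    std::unordered_map<std::string, std::string> remote_cache;

    std::optional<std::string> get(
        const std::unordered_map<std::string, std::string>& cache,
        const std::string& key) {
      auto it = cache.find(key);
      if (it == cache.end()) return std::nullopt;
      return it->second;
    }

    // The three logical levels, tried cheapest first. run_preprocessor
    // returns the hash of the preprocessed output.
    std::optional<std::string> lookup(
        const std::string& direct_hash,
        const std::function<std::string()>& run_preprocessor) {
      // 1) Direct mode: key on a hash of the source file plus fingerprints
      //    of its includes, so a hit never touches the preprocessor.
      if (auto hit = get(local_cache, direct_hash)) return hit;
      // 2) Preprocessor mode: run the (slower) preprocessor and key on the
      //    hash of its output.
      const std::string pp_hash = run_preprocessor();
      if (auto hit = get(local_cache, pp_hash)) return hit;
      // 3) Remote: ask the shared cache, and populate the local cache on a
      //    hit so that the next lookup is a fast local one.
      if (auto hit = get(remote_cache, pp_hash)) {
        local_cache.emplace(pp_hash, *hit);
        return hit;
      }
      return std::nullopt;  // Full miss: compile for real, store everywhere.
    }

    int main() {}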


Though I'm not actively working with Firefox and can't speak for their use cases, one important use case for clobber builds is CI.

I'm the author of BuildCache, and where I work we do thousands of clobber builds every day in our CI. Caching helps tremendously in keeping build times short.

There are a few use cases for local development too. For instance, if you switch between git branches you may have to do near-full rebuilds (e.g. in C++, when a header file that is included by many files gets touched).
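
A contrived sketch (file names made up) of why that happens:

    // common.h -- a header included by nearly every translation unit
    #pragma once
    constexpr int kVersion = 42;  // touch this and every includer rebuilds

    // a.cpp
    #include "common.h"
    int a() { return kVersion; }

    // b.cpp
    #include "common.h"
    int b() { return kVersion + 1; }

The upside is that the cache is keyed on content, so when you switch back to the previous branch the old object files come straight from the cache instead of being recompiled.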

Another advantage for a local dev is that you can tap into the central CI cache: when you pull the latest trunk and build it, chances are that the CI system has already built that version (e.g. as part of a merge gate), so you will get cache hits.


I see; I might be confused by the terminology. "Clobber" to me suggests intentionally throwing away cached results (clobbering what you have), but it sounds like you might just use it to mean builds with no existing build state present.


For what it's worth, browser uptake is largely dictated by the browser being shipped as the default with some major OS. Very few users make an active choice (statistically speaking).

Safari is popular because it ships with iOS and macOS.

Edge (previously IE) is popular because it ships with Windows.

Chrome, however, is popular for several reasons. One is that it ships with Android and ChromeOS, but before that Google ran a very aggressive multi-channel campaign: they pushed it with large banners on Google search (everyone used Google), and they made deals with Windows AV vendors so that when a user installed anti-virus software, Chrome was automatically installed and made the default browser. Another reason is that Google has consistently developed Chrome together with their web services, so things like search, maps, Gmail, Docs etc. tend to work best in Chrome.

The only default channels that Firefox has, that I'm aware of, are various Linux distros, and those have a pretty thin market slice.


And a local cache (kind of like level 1 and level 2 caches)


Just a note for posterity: the continuation of the project is called Bitfrost CC and lives here: https://codeberg.org/mbitsnbites/bitfrostcc


Thanks for the references! After writing the blog post I was looking for exactly such references.


Thanks for the feedback, and the interesting ideas. It's good to know that I was on to something and not completely off :-)

I'm mostly doing this for learning purposes, but a hidden agenda is to create a low-latency codec that can be used in conjunction with other codecs that deal primarily with luma information. AV1 and friends are usually too heavy in those settings, so I try to keep things simple.


I truly get that. That's also one of the reasons why I started from scratch once I got the idea, rather than researching all the available papers and implementations etc. (the latter is quite overwhelming, while the former took me about a week of spare-time hacking).

My scope is also a bit unusual, I think, because one of the applications I'm thinking about is to "augment" luma-only codecs with chroma. One such codec is https://gitlab.com/llic/llic

But most of all, I wanted to learn.


The model seems to be based on Qwen2.5-Coder-7b. I currently run a quantized variant of Qwen2.5-Coder-7b locally with llama.cpp, and it fits nicely in the 8 GB of VRAM of my Radeon 7600 (with excellent performance, BTW), so it looks like it should be perfectly possible.
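
Back-of-the-envelope (ballpark numbers, my own rough assumptions about the quantization):

    #include <cstdio>

    int main() {
      // Rough VRAM estimate for a quantized 7B model; all numbers here are
      // ballpark assumptions, not measured figures.
      const double params = 7.6e9;         // Qwen2.5-Coder-7b has ~7.6B params
      const double bits_per_weight = 4.5;  // typical for a Q4_K-style quant
      const double weights_gb = params * bits_per_weight / 8.0 / 1e9;
      const double kv_cache_gb = 1.0;      // depends on context length
      std::printf("~%.1f GB weights + ~%.1f GB KV cache = ~%.1f GB total\n",
                  weights_gb, kv_cache_gb, weights_gb + kv_cache_gb);
      // ~4.3 GB + ~1.0 GB = ~5.3 GB, which leaves headroom in 8 GB of VRAM.
    }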

I would also only use Zeta locally.


Are you happy with the speed on your 8 GB GPU?


The big cores do. They essentially pump division through something like an FMA (fused multiply-add) unit, possibly the same unit that is used for multiplication and addition. That's for the Newton-Raphson steps, or Goldschmidt steps.

In hardware it's much easier to do a LUT-based approximation for the initial estimate rather than the subtraction trick, though.

It's common for CPUs to give 6-8 accurate bits in the approximation. x86 gives 13 accurate bits. Back in 1975, the Cray 1 gave 30 (!) accurate bits in the first approximation, and it didn't even have a division instruction (everything about that machine was big and fast).
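
To make the doubling concrete, here's a small C++ sketch (not how any particular CPU wires it up; I'm using the classic linear seed on [1, 2) instead of a LUT):

    #include <cmath>
    #include <cstdio>

    // Newton-Raphson reciprocal: each step roughly doubles the number of
    // accurate bits, and each step is just two FMAs.
    double recip(double d) {
      // Assume d has been scaled into [1, 2) (handling the exponent is
      // straightforward and omitted here). The classic linear seed
      // 48/17 - 32/17*d is good for ~4 accurate bits.
      double x = 48.0 / 17.0 - (32.0 / 17.0) * d;
      // Four steps: ~4 -> ~8 -> ~16 -> ~32 -> ~64 bits (capped by double
      // precision).
      for (int i = 0; i < 4; ++i) {
        double e = std::fma(-d, x, 1.0);  // e = 1 - d*x (residual error)
        x = std::fma(x, e, x);            // x = x + x*e = x*(2 - d*x)
      }
      return x;
    }

    int main() {
      for (double d : {1.0, 1.25, 1.5, 1.999}) {
        std::printf("1/%g: %.17g (libm: %.17g)\n", d, recip(d), 1.0 / d);
      }
    }

Starting from ~4 accurate bits, you need three or four of those two-FMA steps to reach full double precision, which is why a better initial estimate (like the Cray's 30 bits) saves real latency.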

