Can anyone expand on this point? I read an article saying that the big AI co's datacentre spend was a bunch of lies because they can't build datacentres at anywhere near the rate they want to.
> they can't build datacenters at anywhere near the rate they want to
That was because the components the datacentres needed were in short supply: the build-out is supply-constrained, not constrained by end-user demand, so it agrees with the GP comment (and the article I read didn't imply anything about lying).
What quantisation do the creators intend this to be run at? They talk about 16GB of RAM, so should it be run at 8-bit? People here are talking about using q4, but I would have thought a smaller model like this wouldn't perform well at such low bits per parameter. Edit: it looks like their benchmarks would have been done at 16-bit float, as the Hugging Face release is that size: https://huggingface.co/google/gemma-4-12B . Which is a little deceptive: they're advertising that an 8-bit size will fit on 16GB laptops, while releasing a 16-bit size.
I guess we have to wait for someone to produce perplexity curves at different Q's.
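For what it's worth, the size arithmetic is easy to sketch. This is only a back-of-envelope estimate: it assumes 12 billion parameters (my reading of the model name in the URL, not a confirmed count), counts weights only, and ignores KV cache, activations, and the per-block scale overhead that real quantised formats carry. The function names are made up for illustration.

```python
# Rough weight-only memory estimate at different bit widths.
# Assumption: 12 billion parameters, inferred from the model name.
PARAMS = 12_000_000_000


def weight_gb(params: int, bits_per_param: int) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


def weights_fit(params: int, bits_per_param: int, ram_gb: float) -> bool:
    """True if the weights alone are smaller than the given RAM.

    Ignores KV cache, activations, and whatever else the machine is running,
    so a True here is a necessary condition, not a guarantee.
    """
    return weight_gb(params, bits_per_param) < ram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_gb(PARAMS, bits)
        verdict = "fits" if weights_fit(PARAMS, bits, 16) else "does not fit"
        print(f"{bits:>2}-bit: {size:.1f} GB of weights, {verdict} in 16 GB")
```

Under those assumptions the 16-bit weights come to about 24 GB, 8-bit to about 12 GB, and 4-bit to about 6 GB, so only the quantised versions leave any room on a 16GB machine. It says nothing about quality at each bit width; that still needs the perplexity curves.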
The alternative is a strict zero trust network design with internet access only via RDP or similar protocols. Not many companies are willing to do this.
This reminds me that I once had a computer magazine from the 80s that came with a green vinyl record "single" that had a game on it for a popular computer, perhaps the Commodore 64. It was useless to me as I had a Z80 machine, but a curiosity.
As far as I remember it should be the other way around: the Sinclair had analog audio input/output, so one could hook up a turntable instead of the tape. The Commodore 64, on the other hand, had a proprietary tape recorder called the Datasette, so there was no way to hook up a turntable to it. Of course, one could always just copy the signal from a vinyl record to a cassette tape and then play it back to the computer.
These records that were stuck to magazines were annoying as hell. They were made from the thinnest sliver of plastic, the thickness of a candy wrapper, and would invariably have suffered some sort of kink on their way to the store. I can't remember if I ever got one to work properly with my Speccy.
Back in the 80s most of the (monthly) magazines had a cassette tape glued to them, with demos or full games.
But there was also a brief period of time when you'd get a vinyl instead. I remember loading games from those a couple of times, though the tape deck was the standard approach and much more common.
From what I'm reading it's probably the same chip that's used in the DGX Spark. The memory bandwidth, at roughly 300GB/s, is equivalent to an M5 Pro; however, you can't get an M5 Pro with 128GB of RAM. Apple pushes you to the biggest M5 Max chip, which in the 14 inch form factor costs you $5099. You can get an ASUS GB10 machine with 2TB storage for $4000, so I guess the RTX Spark laptops will be more than that due to battery, screen, etc.
Perhaps the next generation of the spark will improve on the bandwidth and RAM size numbers. Yes it's a lot like a Strix Halo, but this has CUDA, which will be of interest to developers who want that.
I was looking for AMD AI Max+ 395 laptops recently, and the only ones I've found were 13 inch models, which seems odd from a heat dumping standpoint. I'm looking for 16 inches, I guess the 13 inch form factor would make it easy for commutes where you're taking it to dock to a large monitor at work or home, but no 14 inch screens?
I've tried the Z13 Flow and I actually like the form factor except for the folio keyboard. I especially like that, since it's a tablet, it vents hot air out the top instead of into your lap/table. But the whole driver situation was very weird and things would randomly stop working. That may have improved since I tried one ~1 year ago.
The value of openrouter isn't as a middleman for users of claude, gemini or chatgpt, it's for those looking to find a model that fills the use case at a lower price than the top 3.
Except the latency is significant and not suitable for clients with advanced agent features. The experience between using a frontier model via first party API and the best open weight models via OpenRouter is night and day. Can't get any real work done with it.
Or just leave the machine plugged in and turned on for like 5 minutes while you grab a coffee or have a conversation. It doesn't really take that long to warm up to room temperature. Unless this guy is like biking 15 miles to work in the winter in which case, he is doing Wisconsin wrong, you're supposed to drive to work with a beer to warm you up.