Intel knows it's no longer inside (theverge.com)
130 points by jonbaer on June 1, 2016 | hide | past | favorite | 124 comments


Have there been studies on the relative software utilization of client hardware over the years?

If clients are reduced to consumption terminals where special purpose IP blocks (e.g. video decoding) neutralize the difference between x86 and ARM, there is less justification for the price premium of a powerful general purpose CPU.

If powerful clients have large local storage and compute-oriented apps that are optimized for low-latency human interaction on large desktop monitors or VR workspaces, then there would be good reason to pay for a general purpose CPU.

A powerful desktop can also serve as a cloud to mobile devices, but a cloud server cannot offer the low-latency response of a local desktop. Why discard the one form factor that can serve both roles?

Is the problem Moore's Law or the lack of desktop software innovation and business models? If Hollywood can promote business models via PC hardware DRM (e.g. SGX), can the software industry support hardware for client-oriented business models?


The i7 from 2010 still runs fine in my gaming PC. Unlike with a Pentium 3 from 2000, there have been no major speed advances in the past six years. True, today's i7 can push 16 threads instead of 8, but even with a full-screen game, YouTube, Twitch, Chrome, and 2 game servers, it barely uses 6 threads.

The devices aren't necessarily disappearing; we just don't need new ones anymore.


I have been looking around for a replacement laptop. Historically I have primarily looked at three numbers: RAM, hard drive size, and CPU speed. It is just for running one or more IDEs, so I don't have any particularly special requirements.

I feel like CPU speeds have plateaued. Not so long ago, getting a new machine after a few years meant getting a vastly quicker CPU, whereas all of the laptops I am looking at are only marginally faster than my ~3-year-old laptop.

My hard drive requirements are not huge so it looks like I am essentially buying a new laptop just to get more RAM.


The most important speed-up you can currently get is switching to an NVMe SSD, which runs at roughly three times the speed of a SATA SSD.

However, for many devices it is difficult to wade through the specs to work out whether you are actually getting NVMe (or just M.2 SATA at the slower speed instead).
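On Linux, one rough way to cut through the spec sheets is to look at the device name the kernel assigns. This is only a heuristic sketch based on Linux naming conventions, not any vendor API:

```python
import os

def transport_of(dev):
    """Best-effort guess at a Linux block device's transport from its name.
    NVMe drives appear as /dev/nvme<n>n<m>; /dev/sd<x> covers SATA/SCSI,
    including M.2 SATA SSDs, which still run at the slower SATA link speed."""
    name = os.path.basename(dev)
    if name.startswith("nvme"):
        return "nvme"
    if name.startswith("sd"):
        return "sata/scsi"
    return "unknown"

print(transport_of("/dev/nvme0n1"))  # nvme
print(transport_of("/dev/sda"))      # sata/scsi
```

An M.2 SSD that shows up as `/dev/sdX` is almost certainly M.2 SATA, not NVMe, regardless of what the marketing copy implies.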


That is an important speedup if you access a lot of data compared to what your DRAM can cache, or often do a full reboot (which means cold re-reading a lot of executable images from storage). Max out the RAM first.


We're dumping too much energy into a tiny chip for the heat to dissipate. At GHz speeds we can handle inbound photons in real time at their native frequencies. What needs to happen: slower, lower-power, massively parallel neuromorphic chips.


The 3 numbers for me are battery life, linux compatibility, cost (in that order)


In the mobile space, unless you are targeting C or C++ code on Android and WP, the CPU doesn't matter.

So with Android's dex, UWP's .NET Native, and iOS's bitcode all compiled on the store's servers, the client systems can have multiple CPU offerings.

But that brings some headaches when profiling the applications across devices.


LLVM bitcode already has some architecture-specific things baked in (pointer size, supported data sizes, C macros matching particular architectures already evaluated, ...). You can target a neutral bitcode ABI that can be further compiled down to different architectures. That's what PNaCl still does, but I think Apple doesn't do that.


I also don't know, but since it is done at the store level, they can probably replace those parts if needed.

Way easier than converting between assembly languages.


> Why discard the one form factor that can serve both roles?

Halfway between cloud and mobile means you are not needed for the most part. I think that VR rendering will also be eaten by mobile with high-end GPUs.

The only reason I need a desktop is for compiles, but compiles could easily be done in the cloud.


It is almost as if you're telling us that you can code on a phone. "Mobile with high end GPUs" - what sort of battery do you have in mind for that?

If you're calling a laptop "mobile" then ah - okay, but that's not the general use of the term.


I need a large high-resolution screen (4K 30" or better), a wireless keyboard, and a wireless mouse. I need responsiveness to my inputs, so I need a local GPU and some local processing (pretty much where the web browser is headed), but most storage and computation could happen in the cloud and I wouldn't notice if the apps were written properly.

My home setup is already this for the most part: laptop + 4K monitor + wireless keyboard + wireless mouse. And I write applications and do CAD and 3D computer graphics. If I can live this way now, most everyone else can too. We just need to continue to shrink that laptop down until it is the size of a smartphone.


Plenty of people code on tablets with keyboards, especially in environments where the computation is offloaded (eg ssh, Cloud-based IDEs, Jupyter, etc).


Are these "tablet" tablets, or something more like a laptop that masquerades as a tablet?


A 10" Android tablet + bluetooth keyboard isn't bad from a hardware perspective. The OS and software stink.


Now this is just getting silly. Comparing a lightweight laptop to a modern tablet with a peripheral keyboard, the primary difference is a mechanical hinge.


The real difference is more about the underlying software ecosystem. And granted - this is slowly converging.


A few years ago, I set up my Android device to allow chrooting into an Ubuntu-for-ARM rootfs directory. With that, it is almost no different from a Linux PC. (Well, except maybe for 16+ GB of RAM and a few TB of SSD/HDD.)


Mostly iPads, but I use my Nexus 7.

Surface Pros too of course, but I agree this is getting close to a laptop.


Anyone else think this VR stuff is going to crash and burn hard? I'm really not sure why there is so much optimism around this area, it seems way too niche to drive general trends in computing.


I'm on the fence because I'm biased against something that requires changing user behavior, like wearing headgear.

With the rehashing of 3D TV in the last few years, it was obvious that it was more of a manufacturer gimmick than real value.

With VR, you're seeing big companies pushing into the space from a spectrum of angles, ranging from re-imagining the workspace to 3D gaming. The bigger question is whether or not VR will become commonplace before augmented reality advances enough to replace it.


Other than for gaming and watching certain kinds of videos, I can't see VR (or AR) being a huge deal. I can think of all kinds of neat things that could be done with it, but the production costs are just too high and the equipment is too clunky.

Even in gaming, I expect VR games will be only moderately successful.


I can see AR taking over the workspace, where you put on an ergo headset and see a digital overlay of multiple screens.


I'd be surprised if that became a thing outside of specialized cases, like doctors seeing a patient's vital signs and AI advice while they operate.

One thing I've expected to see more of is Second Life. I thought for sure all the VR hype would reinvigorate Second Life for things like online classes and virtual meetings.


No way! I can totally see replacing monitors with AR glasses. Six monitors cost $1,000+; a headset would probably be cheaper than that.


While I understand the scepticism, if you've used a VR headset then you likely know that it's a very compelling experience in a way a 3D TV isn't. I don't know if it'll get big enough to drive trends in computing, but I think its niche is likely to be large enough to be sustainable.


I'm very skeptical. I think something like Voxiebox has a better chance of creating an immersive 3D experience, because it doesn't require you to hijack your senses.

http://www.voxiebox.com/


I agree with your assessment.

I saw a Voxiebox demonstrated at the NIAC symposium last year, and I was quite impressed. Being able to display a real-time 3D picture of my face as I made expressions was pretty cool. However, with the price point and mechanical manufacturing required, I don't think it will ever become mass-market in its current form. The state of the technology reminds me of the PDP-11 before PCs became a common thing; the basics are in place but it's just not quite there yet.

I don't want to invest in a Voxiebox, but I think I would bet strongly on whatever replaces it.


I met them and saw it at Barcade in Brooklyn about two years ago. I was blown away, but it definitely did have the Apple ][ feel. Very bulky prototype. I bet they continue to refine it though. Those guys seemed really sharp and motivated.


There is one important additional fact one ought to keep in mind when reading something like this:

Intel derives much of its competitive advantage from its ability to manufacture.

This means that a lot of Intel's advantage will only express itself when the market grows really large. There have been a number of times when other companies beat Intel to a new technology (introducing, say, the 64 Bit chip several months ahead), but when the market took off and the cycle began, only Intel was able to deliver in sufficient quantity.

I don't know what the total market for a particular chip is likely to be, or how fast it will get there.


This just isn't correct.

Intel's big advantage has been their ability to reliably and continually improve their manufacturing process, and for this to reliably and continually deliver performance improvements (the tick/tock strategy).

> There have been a number of times when other companies beat Intel to a new technology (introducing, say, the 64 Bit chip several months ahead), but when the market took off and the cycle began, only Intel was able to deliver in sufficient quantity.

This is entirely untrue. AMD delivered a better consumer-focused 64-bit architecture, and for an entire generation the Athlon outperformed the Pentium 4. Yes, Intel sold plenty, but AMD did too.

That was the last time Intel made a misstep in their x64 product line.

Nowadays, desktop chips are facing some challenges. TSMC, Samsung, and others are able to deliver plenty of ARM chips to market for phones and tablets. Nvidia is increasingly eating the compute market.


You've said this isn't correct a few times but nothing you've said in between contradicts anything you seem to disagree with.


I read your point as being that they can manufacture large volumes ("only Intel was able to deliver in sufficient quantity").

This wasn't the case during the x64 transition, which was the example you used. To quote Wikipedia:

In commercial terms, the Athlon "Classic" was an enormous success not just because of its own merits, but also because Intel endured a series of major production, design, and quality control issues at this time. In particular, Intel's transition to the 180 nm production process, starting in late 1999 and running through to mid-2000, suffered delays. There was a shortage of Pentium III parts.[citation needed] In contrast, AMD enjoyed a remarkably smooth process transition and had ample supplies available, causing Athlon sales to become quite strong[1]

However, that was during the early 2000s. Since 2007, Intel has continually improved their processes and architectures on a reliable timeline[2], and haven't had any significant problems delivering those continual improvements.

[1] https://en.wikipedia.org/wiki/Athlon#Athlon_.22Classic.22

[2] https://en.wikipedia.org/wiki/Tick-Tock_model


Yes, that is what my point was. And yes, your most recent response clarifies your counterargument.

I assume you meant to say "Since 2007, AMD has continually improved..."

So I stand corrected in that I didn't realize AMD had closed the gap.

However, my point was about the relative manufacturing ability of Intel and its competitors. Is your claim that AMD's manufacturing is (approximately) as good as Intel's at the moment?

I don't have any new information, but it's a very, very hard problem. And a brief Google search seems to indicate that AMD has faced more recent challenges: http://www.wired.com/2012/03/amd-global-foundries/


> I assume you meant to say "Since 2007, AMD has continually improved..."

No, Intel is the one with the reliable continual improvement process. That is their advantage, not ability to manufacture large volumes.


IMO, Intel's greatest weakness is that they got so large on such a relatively narrow set of product lines. Their fabs are all optimized for massive runs of dies, unlike their competitors in the fab space. It makes them far less agile.


ARM is a very competitive field, with a number of manufacturers building chips for billions of devices. The advantage is no longer Intel's. What's more, the Great Old Ones - IBM with their zSeries and Power8, and now Oracle with SPARC of all things - are making a play for the server racks again. Intel is trapped between the massive economies of scale ARM is enjoying at the low end, and improvements in lowering cost and increasing yield for small batches of fancy, performant chips by the big R&D houses at the fat-margined high end.

The middle of the road is shrinking fast.


Sun didn't fab SPARC themselves, and therefore it's unlikely that Oracle will. AFAIK they were fabbed by companies like TI. Source: TI's annual report from 2007 http://investor.ti.com/secfiling.cfm?filingID=1193125-07-424...

Therefore, Intel's competitive advantage of running their own fabs could keep them ahead in the server space as well.


TSMC fabs the SPARC chips. Source: I worked on the SPARC systems, but this info is a quick google search away.


Well, the phone market's pretty large already, by any definition...


I wonder if there will soon be room for a non-x86 PC due to the rise of mobile devices. Don't think there's been one available since the demise of the Power Mac.


Well, the Raspberry Pi (and others like it) could be argued to be that, to an extent...


Apple have smoothly changed processor architecture twice before, they could do it again. Maybe they'll go ARM everywhere at some point.


I was there for both. It wasn't particularly smooth. Backwards compatibility was... what's the word I'm looking for? Laughable. That's it.


These days "good enough" CPUs are cheap enough that they could probably field a generation of switchover machines which had both x86 and ARM processors and just build a virtual machine style environment into the OS to more or less seamlessly run the "legacy" apps on "legacy" hardware without the need for all of that software backward compatibility. Running old apps on a cheap i3 would be far better than trying to do x86-64 software emulation on ARM.

Of course, the business relationship side is not quite as clear as the technology side if you want to ensure consistent delivery of a lot of parts -- "sell us a bunch of CPUs for this machine designed as a stopgap until we don't need your CPUs anymore", though maybe that is solvable by dealing with AMD.

All that said... I'm perfectly happy with x86 on the desktop and have yet to see any indication that ARM can compete on the types of jobs I still do with "desktop"-class (whether they be actual desktops or laptops) machines, like code compiling, RAW photo editing, video editing, etc.


I believe Apple tried to do just that in the early days of the 68K to PowerPC switchover (I may be confusing them with another company), but it ended up not working very well.


Is ARM objectively better than x86?


ARM is better for <5W applications when performance is secondary to power consumption, but Intel is beating them at performance/watt ratio everywhere else.


So why not use both? Just like iPhones use an M9 coprocessor for some dedicated computations, why can't MacBooks offload some less intensive computations to an energy-efficient ARM processor?

This in no way eats the x86 market but in fact grows the ARM market.


If they bring Touch ID to the Mac, maybe that is what they'll do, as it uses the secure enclave inside the A-series processors.


You can't say that, because there aren't 45W-and-up ARM chips to compare with Intel's...


It is objectively cheaper.


The ARM ecosystem is objectively healthier than x86, due to competition from multiple vendors.


Only if both Google and Microsoft focus on making that happen. Microsoft seems to have given up on it, so now we're only left with the new Play Store-enabled Chromebooks, that should actually give a slight advantage to ARM chips in terms of efficiency of running those apps. But so far Google hasn't been too enthusiastic about promoting ARM-based Chromebooks either, and has preferred to bring over Intel's monopoly to Chromebooks as well. But we'll see. At the very least we should be seeing some AMD Zen APUs in Chromebooks next year.

Apple could help a great deal here, too, if it decides to turn the iPad Pro into something more like an iOS-based, Remix OS-style notebook. In other words, make the iPad Pro more useful by giving it an actual notebook keyboard, and redesign the OS interface a little more to accommodate a desktop environment. Keep the price at least $300 less than a MacBook Air.


>> so far Google hasn't been too enthusiastic about promoting ARM-based Chromebooks either, and has preferred to bring over Intel's monopoly to Chromebooks as well.

Is it Google's doing? Or just Intel competing to the max, and desperately?


I could see such a thing, but in the server market. OpenPOWER (e.g. Power8, and Power9 when it comes out) could be damaging to Intel.


>Brian Krzanich said the company's focus was on moving to the cloud, with data centers and the Internet of Things considered primary growth drivers

Cloud, servers... I get it, Intel is well established there, but IoT? Do they have a foothold in this area? When I look at their page I don't see anything promising... and when I think of IoT, I imagine low-power ARM SoCs like Raspberry Pis combined with Arduinos, rather than anything Intel sells today.

http://www.intel.com/content/www/us/en/internet-of-things/pr...


Wind River (Subsidiary of Intel) has a large foothold in the IoT market with their VxWorks and WR Linux OSs that play in the embedded/MCU market. Also see their free new open source OS, Rocket, which is being developed as the "Zephyr Project" in collaboration with the Linux Foundation.


Which are mostly running on ARM and PPC. Intel screwed up by dumping XScale and their legacy embedded lineup. Wind River isn't going to save them.


WR is aggressively building out its Helix Cloud platform. Silicon is useless without the services/app-dev/data-management backend. If WR can migrate OS users up to Helix Cloud, they can (somewhat) offset losses on Intel hardware and from giving the newer OSs away for free.


With regard to OSes for microcontrollers, ARM's mbed has a far better low-power, security, and ecosystem story.

And as far as Linux OSes go - nobody will buy an OS that can't reasonably be ported to ARM, so I'm not sure Wind River helps Intel that much in selling chips.


Well, there's the new Quark, Curie, Edison offerings.

I don't see how that can save a company though.

They used to sell $200 processors like popcorn.


IoT = more clients = more servers, no?


Not really. A typical IoT client sends a few bytes every few minutes or hours. That's not much.

But the "internet of watching things", a world full of connected cameras is a different story. But it ain't a great slogan :)


> A typical IOT client sends a few bytes every few minutes or hours

That's the SigFox definition of IoT, but not necessarily typical of all cases. There are several use cases out there which require somewhat higher bandwidth and data rates, but not quite broadband.

One quick example is over-the-air firmware updates. Take the MSP430: if we were to build a smart meter using an MSP430, our firmware would probably be a couple of kilobytes.

Likewise, the size of the data can vary upstream as well.


I agree that there are some other use cases, and they might impact wireless standards and chips at the client, for example.

But do you think those use cases have a big impact when it comes to evaluating how big a cloud the IoT will require? If so, please share a bit, so we can grasp the scale of things.


To some extent. But most IoT applications aren't about heavy back-end processing. They are about capturing things that were too small to bother with before. The Amazon Dash buttons actually save them a few page loads when you order Tide, because they never have to load the GUI. If your fridge is emailing your shopping list to some server once a day (assume it makes 5 or 6 calls while you cook dinner or whatever), again you're not making a huge impact on a server. It may be some increase, but much of that is probably offset as we become better at oversubscription of huge servers.


http://openconnectivity.org/ is the software side of the Intel IoT play


Since the transistors in chips are becoming smaller and smaller, heat is an increasing issue. This is why ARM chips for smartphones and IoT are booming: their requirement for small chips is compatible with the shrinking size of the transistor. Now Intel is trying to get a cut of the market, but they are too late.

Anyway, I think they should keep focusing on making computer chips and on the temperature problem.


Indeed. Intel's thinking right now (or rather in the past couple of years) seems to be that "hey, we missed the smartphone market, but at least we're early for the IoT market, right?"

Except there are two things wrong with that theory:

1) "IoT" is mainly a new name for something that has already existed - a market that has already been dominated by ARM: the embedded/microcontroller market. ARM has much better expertise in this market, better relationships, and better ecosystem, as well as a full range of products.

2) If Intel's chips weren't competitive in the mobile market (especially when you compare all three metrics of performance, power, and price), then why would they be in the embedded IoT market, which plays even more to ARM's strengths?


> 1) "IoT" is mainly a new name for something that has already existed - a market that has already been dominated by ARM: the embedded/microcontroller market.

The IoT is much more than just the embedded hardware. It's the potential that lies within connecting embedded hardware. Extracting and refining data from embedded machines/sensors/nodes. The whole ecosystem of apps/services that can potentially be created on top of these new data streams.


You're not wrong, but from a hardware vendor's perspective, grandparent is right: an IoT device is simply an embedded/microcontroller device. Maybe with a bit more emphasis on connectivity, but this is not new in any way.


Agree on the definition of an "IoT device" but scoping that way misses most of the market opportunity.

While the embedded/microcontroller tech hasn't changed drastically, the cloud/edge processing & analytics parts of the tech stack are new. The "services" business models are also new.


Again, I do not disagree with you. However, the hardware vendor (the one who sells the parts that will be used on these devices) still keeps selling about the same devices.

And these are the kinds of devices Intel decided to overlook 15-20 years ago, and now ARM is eating its lunch.


Insisting on this vision would be missing a big part, or even the entire core, of the new Intel strategy -- they plan to sell more Xeons to the additional data centers that will be required to process new IoT data.


> Maybe with a bit more emphasis on connectivity

What makes it an even bigger ARM turf than otherwise.


>> The whole ecosystem of apps/services that can potentially be created on top of these new data streams.

I haven't seen any of that yet, just hype. Of course every company wants to collect more data on consumers and is hoping IoT will enable them. None has provided a compelling reason for a "consumer" to buy an IoT device. Look at how Nest was all hype and then had a thermostat fail at its primary function when it lost the net connection. That's a huge step backward, and it's not going to get any better when the data collection remains the priority.


Consumer-facing smarthome/wearable devices like Nest are a very small vertical sliver of a much larger market. Think industrial, automotive, medical, aerospace and defense. Check out platforms like GE Predix, PTC ThingWorx, AWS IoT, Azure IoT, etc. Industrial analytics/efficiency/optimization - that's where the big money is.


I agree that consumer-facing IoT devices are a single vertical in a long list of verticals (many of which are more mature) but I really wanted to mention that IBM also has a platform in this space that was omitted from your list: Watson IoT.

(Disclaimer: I work on the MessageSight messaging engine used in Watson IoT)


>Now Intel tries to get a cut of the market, but they are too late.

It's really more complex than that.

x86_64 carries a lot of backwards compatibility. Even low-power chips made by Intel typically consume 2-5x the wattage of their ARM counterparts. Intel's very-low-power line (which matches ARM) doesn't actually have the 64-bit extension and is functionally an i586 chip from circa 1999-2003. Modern x86_64 chips have a whole section of die space dedicated to emulation, re-ordering, renaming, and caching for us to pretend x86 is fast.

Then you have the monopoly. Intel is the only company making x86_64 chips (yes, VIA and AMD exist, but collectively they have <10% of the market). They are the only show in town, so it's their prices. ARM, meanwhile, simply licenses its IP to other companies, who then compete with one another and drive prices even lower.


I don't know why the argument that x86_64 has some effect on efficiency keeps coming up. In all modern OoO chips the microarchitecture is decoupled from the instruction decoders. All the "strange" x86 instructions are effectively microcoded, which doesn't affect the execution performance of the instructions actually found in high-performance code. In many ways the ability to easily spill to and operate against memory and the stack provides instruction-density advantages, and simplifies certain dependency calculations.

So whether ARM64 has a power-efficiency advantage isn't clear at all when one tries to actually compare them fairly. I frequently see people linearly scaling power/performance curves, or comparing cores that include ECC, significantly more I/O, etc. against cores that don't, and claiming that the core with 10x the I/O bandwidth is somehow less efficient computationally because it consumes 5 watts more to power a PCIe bus.

So, let's get this out of the way: performance is not linear in power. Simply having a clock rate 50% faster has a large impact on power, given the rough P = C·V²·f equation, because the voltage often goes up to support the higher frequency. Worse, designing for a target top-end frequency of 4 GHz may entail extra pipeline stages etc. compared with targeting 2 GHz in the same process. The result is significantly higher power draw at the same frequency, simply because the higher frequencies are supported. Then there is leakage current etc., which is proportional to the number of transistors, so a design with 15 MB of cache is going to waste more on leakage than one with 1 MB. The extra 14 MB of cache may only contribute another percent or two to the bottom line in many workloads, but frequently the difference between a processor at X performance and one at 1.5X is not due to a single factor but dozens of design tradeoffs that individually only net small percentage gains.
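The frequency/voltage point can be made concrete with that rough dynamic-power equation. The capacitance and voltage figures below are made up purely for illustration:

```python
def dynamic_power(c_eff, v, f):
    """Dynamic switching power: P = C * V^2 * f (watts).

    c_eff: effective switched capacitance (farads)
    v:     supply voltage (volts)
    f:     clock frequency (hertz)
    """
    return c_eff * v**2 * f

# Hypothetical numbers, for illustration only.
slow = dynamic_power(1e-9, 1.0, 2e9)  # 2 GHz at 1.0 V -> 2.0 W
fast = dynamic_power(1e-9, 1.2, 3e9)  # 3 GHz at 1.2 V -> 4.32 W
print(fast / slow)                    # 50% more frequency, ~2.2x the power
```

Because voltage enters squared, a modest voltage bump to reach a higher clock more than doubles the power for a 50% frequency gain, which is exactly why comparing chips at different clock/voltage points says little about the instruction set.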

Bottom line: it's better to compare implementations, and when ARM and x86 chips are compared on similar grounds, they are a lot closer than you'd gather from glib remarks on forums like this. Intel's handicap in mobile (the lack of native x86 Android apps) and ARM's handicap in desktop/server really come down to software (and maybe, in the case of some ARM products, what one might consider alpha/immature products).


None of Intel's or ARM's very-low-power chips for IoT applications are OoO - it uses too much power. Intel has actually resorted to using subsets of x86 on some of them, and even then they're mysteriously unwilling to release proper power-usage information.


>for us to pretend x86 is fast.

"Instruction sets" are not fast; implementations are. Also, AMD has a nice chunk of market share which will only get bigger once Zen is out.


Instruction sets can affect the speed of the decoder hardware, and how many instructions fit in the L1 cache, to speed up jumping around reading them from memory.

Getting more register names in x86-64 was a huge boost too.


>Modern x86_64 chips have a whole section of die space dedicated to emulation, re-ordering, re-naming, and caching for us to pretend x86 is fast.

"To pretend x86 is fast" is a terrible choice of words. The fundamental problem is the memory wall. There are only two ways to solve it: either put the processor directly onto the memory, or add complex circuitry to mitigate its effects. The former is difficult because DRAM and CPUs use different manufacturing processes. Thus neither x86 nor ARM can avoid the latter if they want to maximise performance, and in turn they become less power efficient.

What you should have said is "to pretend DRAM is fast". It's not.


Actually, AMD has around 20-30% market share...


I think you're right that it's more than 10, but I don't think it's much more than 20%. Q4 last year it was almost exactly 20% as part of a downward trend. There's a good chance it's somewhat below 20% right now.


Transistors are not becoming much smaller. Moore's law is stagnating, due to a host of reasons. Multiple patterning is very expensive, EUV is not available - and even if it comes, it will be really expensive with a host of technical challenges for everyone. And then it will very quickly hit a wall as well. The future of computer chips is dim. I think Intel is smart to diversify into other markets.


> if Intel's vision comes to pass then that device — the smartphone — will constantly be communicating with Intel-powered data centers

So Intel and Microsoft want to go where IBM and Oracle are withering?


@Intel: how about thinking about 10 GHz single-core CPUs? And where is Atom? Why are we stuck at 2006-era CPU single-core performance?

@nVidia: how about thinking about affordable high-end GPUs? GPUs have gotten 400% more expensive since 2011.

It's a pity that AMD bought ATI, and now they can't compete with Intel and nVidia. And we all have to suffer, paying higher prices and getting less performance - because no real competition exists any more.


> @Intel: how about thinking about 10GHz single core CPUs?

They tried that -- the Pentium 4. The key design goal was to push clock speed as high as possible, and they used some crazy tricks, like 30-some-stage pipelines, a really long instruction scheduling loop with a pretty long lookahead, a double-pumped ALU with staggered 16-bit half-adds, etc. Fascinating from a microarchitecture perspective, but the big lesson was that high clock speeds sacrifice efficiency. The tricks needed to get there cause a lot of performance outliers/bad cases too -- e.g. the instruction scheduling replay system on lookahead conflicts was notorious for "tornados" which would kill IPC. And the branch prediction of the day was somewhat suboptimal for the pipeline lengths involved.

> Why are we stuck at 2006-era CPU single-core performance?

We aren't! The core microarchitecture teams at Intel and their competitors have made a lot of incremental progress -- Intel gets about 15% single-thread performance per generation, for example. No one is evilly scheming and holding back from turning a knob higher. There's just a lot of really hard engineering. Clock speed has topped out due to power limitations so we're at the point of looking for better branch prediction algorithms, cache replacement algorithms, and lots of little tricks everywhere to optimize bad cases. It's hard work (this was my job for a bit).

I'd recommend looking at, e.g., the proceedings of ISCA and MICRO conferences in the 2000-2006 timeframe -- this was when the industry and associated academia figured out that chasing clock speed was a losing battle after a certain point.


Right, even single CPU performance of mainstream Intel chips has more than doubled in 7 years (Nehalem to Skylake).

Many important things which affect performance have been stagnant for a very long time, especially cache size and RAM latency. But there's been steady progress in raw CPU power, coupled with a significant reduction in TDP.
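As a rough sanity check, the parent's ~15%-per-generation figure compounds to about the doubling described here. The generation count and per-generation gain below are ballpark assumptions, not measured values:

```python
# Illustrative numbers only: ~15% single-thread gain per generation,
# compounded over the five generations from Nehalem to Skylake
# (Sandy Bridge, Ivy Bridge, Haswell, Broadwell, Skylake).
per_gen_gain = 1.15
generations = 5
speedup = per_gen_gain ** generations
print(speedup)  # ≈ 2.01, i.e. roughly "more than doubled"
```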


Do you have a reference for the doubling? How much does the boost from AVX affect the score?

I'm asking because I recently "upgraded" my desktop to a cheap Westmere 6-core Xeon [1], as in most benchmarks it is competitive with recent i7s.

[1] X5670 @ 3GHz, but it should easily overclock well into the 4GHz range.

Edit: autocorrect


Primate Labs provides benchmarks for: 32-bit 1-CPU, 32-bit threaded, 64-bit 1-CPU, 64-bit threaded.

https://browser.primatelabs.com

Useful information if your workload is single-threaded.

Note that their use of the word "browser" has nothing to do with web browsers.


Oh, Geekbench, which can be 'won' by implementing a single encryption instruction in hardware.

Still, even looking at the scores for less broken sub-benchmarks, like (possibly) Lua and Dijkstra, shows a 50% increase in performance from Westmere to Skylake at the same frequency, which is more than I was expecting. And Skylake should potentially clock much higher, although I believe the top is currently still 4GHz.

Of course, for anything that can take advantage of AVX2 (or AVX3 for Xeons), Skylake would smoke the old Xeon.

edit: reword


What does it mean to have a "double pumped" ALU? Google is failing me.


The ALU performs an operation on half of the machine word (16 of the 32 bits) each half-cycle, i.e., one on the rising edge and one on the falling edge of the clock. Basically they split the carry chain across two (half-)pipe stages and then run it twice as fast.

See Hinton et al., "The microarchitecture of the Pentium 4 processor" [1] for all the nifty details -- pp 8-9, and Fig 7 in particular.

[1] http://www.ecs.umass.edu/ece/koren/ece568/papers/Pentium4.pd...


That also makes for some very interesting timings, where the instruction runs faster in the case that there's no carry between the two halves.

From section 2 of this:

https://gmplib.org/~tege/x86-timing.pdf

"Pentium F0-F2 can sustain 3 add r, i per cycle for -32768 <= i <= 32767, but for larger immediate operands it can sustain only about 3/2 per cycle."
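A toy model of that staggered scheme: the low 16 bits are added in the first half-cycle, and the high 16 bits (plus any carry-out from the low half) in the second. Names and the half-cycle cost model below are illustrative, not taken from Intel's documentation:

```python
# Toy model of the Pentium 4's double-pumped ALU: each 32-bit add is
# split into two 16-bit half-adds run on successive half-cycles.
MASK16 = 0xFFFF

def staggered_add(a, b):
    """Add two 32-bit values 16 bits at a time; return (sum, half_cycles
    until the full result is available)."""
    # First half-cycle: low 16 bits, producing a carry into the high half.
    lo = (a & MASK16) + (b & MASK16)
    carry = lo >> 16
    # Second half-cycle: high 16 bits plus the carry from the low half.
    hi = (a >> 16) + (b >> 16) + carry
    result = ((hi & MASK16) << 16) | (lo & MASK16)
    # Crude cost model: if the operand's high half is zero and no carry
    # crosses the 16-bit boundary, a dependent op can consume the low
    # half after one half-cycle; otherwise it must wait for both.
    half_cycles = 1 if carry == 0 and (b >> 16) == 0 else 2
    return result, half_cycles

print(staggered_add(0x00001234, 0x00000001))  # small immediate: fast path
print(staggered_add(0x0000FFFF, 0x00000001))  # carry into high half: slow path
```

This is the shape of the effect in the GMP timing quote above: adds with small immediates sustain full throughput, while larger immediates (which involve the high half) cost more.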


Makes sense, thanks!


It runs at 2x the clock rate of the main chip. So if the chip is running at 2GHz, the ALU is running at 4GHz.

While this might not make sense on the surface (the next stage of the pipeline won't be ready for the data a half clock cycle earlier), you can minimize logic area by doing 2 quick 16-bit operations. Additionally, it helps some operations with dependent instructions: for example, in a superscalar processor executing two integer instructions at once, if you have an ADD that depends on another ADD, you can forward the result from the first cycle on the first ALU to the second cycle on the second, although I'm not sure if this is actually done.

Another case where this is useful: if the result of an ADD needs to be used by a LOAD instruction, you can forward the result a half clock cycle sooner.


> @Intel: how about thinking about 10GHz single core CPUs? And where is Atom? Why are we stuck in 2006 era CPU single core performance?

Because that's physically impractical (if not impossible) with today's technology. Intel themselves actually have a good article[0] on the topic. Besides, there's way more to CPU performance than clock speed—today's Intel CPUs are several times faster than the Pentium 4, for instance, even at lower frequencies.
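The back-of-the-envelope physics: dynamic CPU power scales roughly as P ~ C·V²·f, and supply voltage has to rise with frequency to keep transistors switching fast enough, so power grows much faster than the clock. The voltage and frequency numbers below are illustrative, not measurements of any real chip:

```python
# Dynamic power model: P ~ C * V^2 * f (capacitance, voltage, frequency).
def dynamic_power(c, v, f):
    return c * v**2 * f

base = dynamic_power(1.0, 1.0, 3e9)   # a 3 GHz baseline at nominal voltage
# Pushing to 10 GHz at a (hypothetical) 1.4 V to keep transistors switching:
fast = dynamic_power(1.0, 1.4, 10e9)
ratio = fast / base
print(ratio)  # ≈ 6.5x the power for ~3.3x the clock
```

That superlinear blow-up in power (and thus heat) is why clock speed topped out and effort shifted to IPC instead.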

> @nVidia: how about thinking about affordable high end GPUs? The GPUs got 400% more expensive since 2011.

What are you talking about? GPU performance is plateauing a little bit, but the cards are only getting cheaper and more energy efficient, especially the high-end ones. Both Nvidia and AMD have announced new high-end cards coming soon that will give you the performance of the current high-end cards (Titan X and Fury X) at a little over half the price. Edit: Apparently they won't be that much cheaper than previous generations, but certainly within the grasp of any enthusiast or professional who really wants/needs them.

Plus, isn't "affordable high end" kind of an oxymoron? "High end" is almost by definition not affordable for most people.

[0]: https://software.intel.com/en-us/blogs/2014/02/19/why-has-cp...


I would guess frik probably means this: in 2010/2011 you had to pay ~350 euros for AMD's 5870 when it launched, or ~450 euros for Nvidia's GTX 580. Now you have to pay ~700 euros for AMD's Fury X or ~800 euros for Nvidia's GTX 980 Ti at launch. That's not 400% more expensive, but they are not cheaper either.


In the US, the launch price of the GTX 580 was $500 and the launch price of the 1080 is $600. Converting the 580's price to 2016 dollars gives about a 10% increase in price.

I'm not sure the 980 Ti is the best comparison to a GTX 580. It was intentionally a very expensive ultra-high-end card. The GTX 590 debuted at $700.

There does seem to be a problem with prices being ridiculously high in most non-US regions, though.
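The arithmetic behind that, using an approximate ~10% cumulative US CPI rise from 2010 to 2016 (the inflation figure is a rough assumption, not an official series):

```python
# GTX 580 launched at $500 in late 2010; GTX 1080 at $600 in 2016.
inflation_2010_to_2016 = 1.10  # approximate cumulative US CPI, assumed
gtx_580_launch = 500
gtx_1080_launch = 600

adjusted_580 = gtx_580_launch * inflation_2010_to_2016
increase = gtx_1080_launch / adjusted_580 - 1
print(round(adjusted_580), f"{increase:.0%}")  # ≈ $550 in 2016 dollars, ~9% more
```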


Atom is no longer needed since their non-Atom CPUs have a low enough power profile now. We're stuck in single-core performance hell because physics.

I don't know where you've been, but GPU competition between Nvidia and AMD has been pretty healthy for the last few years, and AMD's cards specifically have been quite affordable. (That's not even taking into account that they easily paid for themselves with Litecoin mining at one point.) It's only CPU competition that has been dead, but that's been the case ever since the Core line came out. (I guess you could count ARM too for low-end CPU competition, but both Intel and ARM can actually afford to lose that market entirely, as their businesses are propped up by other things.)


Competition really hasn't been healthy, beginning with the Kepler generation. Ever since then, AMD has been unable to get a solid win, and most of their cards have been rebrands. They're still running some first-gen GCN chips (dating back to 2012) in their current lineup. Heck, some of their prices have actually increased in the last few years (e.g. the 390 rebrand raised the price by $100).

NVIDIA held the upper hand with GK110 (780/Titan), then AMD dropped the 290X. This forced NVIDIA to readjust their prices and release the 780 Ti as an interim measure, but they held onto their lead position. Then NVIDIA started with the Maxwell series, which drastically improved both performance and efficiency. AMD tried to respond to the Titan X with the Fury X, but NVIDIA cut the legs out from under them by releasing the 980 Ti. Now once again AMD is behind the performance curve - NVIDIA has staked out the high-end space with the 1080 and the midrange with the 1070, and AMD is stuck in the low-end market with the RX480 until NVIDIA can get GP106 out.

AMD's response to the 1080 is going to be the Vega 10 chip, and unfortunately it's unlikely to be released before Q4 2016 at the earliest (more likely Q1 or Q2 2017). NVIDIA already has GP102 coming down the pipe, which is going to be a Titan/1080 Ti, and they'll drop that when they feel it's necessary to spoil AMD's sales.

That's the story of AMD and NVIDIA recently. AMD has put out some cards that were good value for the money - both the 7950/7970 and 290/290X have been very long-lived performers (and will likely continue to perform well thanks to DX12). But they haven't taken the top since 2012, all they can do at this point is force some price reductions. Like it or not they aren't a serious competitor in the GPU space any more than they are in the CPU space. They are a budget choice, not a serious contender.


To add to that, since Kepler AMD has fallen from ~45% market share to 20%. It seems they want to regain market share starting with the $199 AMD RX 480, which should deliver performance between the GTX 970 and GTX 980 (maybe even like the small Fury).

There is also a rumor about AMD pushing Vega forward to October to release it around the new Battlefield (I doubt it, but it would certainly be nice).


> We're stuck in single core performance hell because physics.

Wouldn't other semiconductor materials allow higher clock speeds and it's just that the switch would be difficult to do incrementally?


> Wouldn't other semiconductor materials allow higher clock speeds and it's just that the switch would be difficult to do incrementally?

Short answer is "no", but other materials may allow smaller manufacturing technologies. "Incrementally" isn't really a factor - every die shrink is basically a new manufacturing process.

http://arstechnica.com/gadgets/2015/02/intel-forges-ahead-to...

http://arstechnica.com/gadgets/2015/07/ibm-unveils-industrys...


MHz/GHz are overrated. Not to mention, today's 3GHz Intel and AMD offerings are much faster (per core) than, say, the 3GHz offerings of 10 (?) years ago.

Nvidia got their bets right by investing in uses other than gaming for their GPUs; now they're eating AMD/ATI's lunch in the deep learning field.

"Affordable" and "high end" are rarely in the same phrase, so there's your answer to when Nvidia is going to do that.


MHz/GHz is basically RPM in a car, not the greatest determinant of speed.


What a perfect analogy! Next time I'm having this discussion with people, I'll have to remember this.



Except that engines at the absolute pinnacle of motorsport rev as high as they can, because that is what produces the most power.

Not that this matters for normal cars.


Sure, some sports car engines turn at over 9,000 RPM, but a car stuck in 1st gear on 9 inch wide street tires is not going to go faster than the same car in 5th with 11.5 inch wide racing tires. Even at the pinnacle of motorsport RPM is a poor determinant of speed, it takes a whole system.


That is true, but at least for engines it's a fairly safe bet that if an engine revs beyond 12k RPM, it's built for speed.

That said, I suppose this differs depending on whether or not you think of the gearbox as a part of the engine (CPU).


My analogy was comparing the CPU to a car, so I do consider the gearbox. An engine, no matter how many RPMs it can turn, is pretty useless without the vehicle.


The redline is almost never the point of peak horsepower. This is primarily driven by the fact that the valves can't open and close fast enough at those speeds. It's much worse on domestic vehicles that have to care about fuel efficiency (and so can't use stiffer springs), but the effect is still visible even in race cars.


In F1, the previous generation of engines was limited to 18k RPM by the regulations. They could go higher, and since they always went up to 18k, it's safe to say it provided the most power; otherwise they would have shifted before.


A humorous take on answering these questions: http://scholar.harvard.edu/files/mickens/files/theslowwinter...


Everything by James Mickens seems to humorously puncture research. Take his essay The Saddest Moment, https://www.usenix.org/system/files/login-logout_1305_micken..., which had me laughing out loud because I've attended just that kind of conference talk, seen that set of diagrams, and I swear I've had to coordinate lunch via IM with those kinds of people ;)


I have found Hacker News to be a very concentrated source of useful material, with a high ratio of "want to read / glad I read" articles.

... but nothing compares, nothing has prepared me, for the sheer awesomeness, of these two links. I now must store the PDFs in a safe repository with GoogleCalendar note to re-read them frequently :->


The man is beyond brilliant. Check out his take on computer security: https://www.youtube.com/watch?v=tF24WHumvIc


I'm curious, what would you do that 10GHz would allow?

IMO the only issue is commercial pressure killing integration. Until Vulkan, GPUs were underutilized. Audio drivers lying about hardware capabilities. Software layers ever shifting. Absurd use cases: ads invaded HTML5, half-accelerated HTML video...



