
x86's legacy really only affects the decode logic, whereas the back end that actually executes the decoded micro-ops could be almost any superscalar design (simplifying a bit).

For example, amd64 has 16 general-purpose registers, but the silicon that actually executes it could have something like an order of magnitude more physical registers to play with (i.e. for register renaming).
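
To make the renaming point concrete, here is a toy sketch of how a few architectural register names can be mapped onto a larger physical pool. Everything in it is an assumption for illustration: the `Renamer` class, the three-register file, and the pool size of 8 are made up, and it never frees registers (a real core recycles them as instructions retire).

```python
# Toy register renamer: maps a small set of architectural registers onto a
# larger pool of physical registers. Names and sizes are illustrative only.

class Renamer:
    def __init__(self, arch_regs, num_physical):
        if num_physical < len(arch_regs):
            raise ValueError("need at least one physical register per architectural one")
        # Initially each architectural register maps to its own physical register.
        self.table = {name: i for i, name in enumerate(arch_regs)}
        # Everything else starts out on the free list.
        self.free = list(range(len(arch_regs), num_physical))

    def rename(self, dst, srcs):
        """Rename one instruction `dst = op(srcs)`; return (phys_dst, phys_srcs)."""
        # Sources are looked up first, so they see the mapping from before this write.
        phys_srcs = [self.table[s] for s in srcs]
        if not self.free:
            raise RuntimeError("out of physical registers (a real core would stall here)")
        # Every write gets a fresh physical register.
        phys_dst = self.free.pop(0)
        self.table[dst] = phys_dst
        return phys_dst, phys_srcs


r = Renamer(["rax", "rbx", "rcx"], num_physical=8)
a = r.rename("rax", ["rbx"])  # rax = f(rbx)
b = r.rename("rcx", ["rax"])  # rcx = g(rax), a true dependency on the line above
c = r.rename("rax", ["rbx"])  # rax = h(rbx), reuses the name rax but not the storage
print(a, b, c)
```

The first and third instructions both write `rax`, yet they land in different physical registers, so the third does not have to wait for the second to finish reading the old value. That is the false dependency the extra registers remove.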



From the article:

> If we have more decoders we can chop up more instructions in parallel and thus fill up the ROB faster.

> And this is where we see the huge differences. The biggest baddest Intel and AMD microprocessor cores have 4 decoders, which means they can decode 4 instructions in parallel spitting out micro-ops.

> But Apple has a crazy 8 decoders. Not only that but the ROB is something like 3x larger. You can basically hold 3x as many instructions. No other mainstream chip maker has that many decoders in their CPUs.

> You see, for x86 an instruction can be anywhere from 1–15 bytes long. On a RISC chip instructions are fixed size. Why is that relevant in this case?

> Because splitting up a stream of bytes into instructions to feed into 8 different decoders in parallel becomes trivial if every instruction has the same length.
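
A rough sketch of why the quoted point holds, in software terms. The byte encoding here is a made-up toy, not x86 or any real ISA, and `fixed_boundaries`, `variable_boundaries`, and `toy_length` are hypothetical names. With a fixed width, every instruction start is plain arithmetic, so all of them are known up front; with a variable width, each start depends on having sized the instruction before it.

```python
def fixed_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets when every instruction is `width` bytes long.

    Each offset is just i * width, so all of them can be computed independently.
    """
    return list(range(0, len(code), width))


def variable_boundaries(code: bytes, length_of) -> list[int]:
    """Start offsets when an instruction's length depends on its own bytes.

    The next start is only known after the current instruction has been sized,
    which makes this scan inherently sequential.
    """
    starts = []
    pos = 0
    while pos < len(code):
        starts.append(pos)
        pos += length_of(code, pos)
    return starts


def toy_length(code: bytes, pos: int) -> int:
    # Hypothetical toy encoding (NOT x86): low 2 bits of the first byte = length - 1.
    return (code[pos] & 0b11) + 1


fixed = fixed_boundaries(bytes(16), width=4)
stream = bytes([0x00, 0x01, 0xAA, 0x03, 0x01, 0x02, 0x03, 0x02, 0x09, 0x09])
variable = variable_boundaries(stream, toy_length)
print(fixed)     # [0, 4, 8, 12]
print(variable)  # [0, 1, 3, 7]
```

In the fixed-width case eight decoders could each be handed offset `i * width` straight away. In the variable-width case the decoder for the fourth instruction cannot know it starts at offset 7 until the first three have been measured.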



