The last few percentage points of performance take an insane amount of power. If you gave up 10% perf you'd probably halve power consumption.
I don't think there's any reason x86 has to use more power than ARM - it's simply not the focus of most implementations. As I understand it, most processors at this point are an interpreter on top of a bespoke core. Intel used to get quite a lot of praise for low power consumption back in 2012-2015 with Ivy Bridge and so on - rather coincidentally, that was also when they had a process advantage (rather like the one AMD and Apple enjoy today).
Yes and no. After the CISC vs RISC war was over, I also thought ISAs were implementation details.
But from what I’ve read, having different length instructions makes extracting parallelism way harder. That’s why Apple can make such crazy wide machines.
Oh yeah, doesn't ARM use fixed-size instructions while x86_64 is variable-size? So decoding x86_64 requires clever pipelining, whereas with ARM "every X bytes is an instruction" and you can parallelize easily.
I wonder if we'll see Intel or AMD try to make another somewhat-backwards-compatible ISA jump to keep up with ARM.
If I’m not mistaken, based on similar threads on HN, decoding is never the bottleneck, so I would be hesitant to write x86 off for mobile devices. It probably does make the transition to smaller scales harder, and that is where most efficiency wins happen.
We should never write x86 off when there are billions behind it, and variable-length instructions have their advantages as well, such as code density, which may come to play an important role again in the future.
But it is much easier to simply chop a stream of instructions up at every X bytes than to evaluate a portion and decide what to do later, and that difference gets larger the wider you go.
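To make that concrete, here is a toy C sketch (none of this is a real ISA's encoding; insn_length() is a made-up stand-in for x86's prefix/opcode/ModRM length determination): with fixed-size instructions every boundary can be computed independently, while with variable-size instructions each boundary depends on having decoded the previous instruction.

    #include <stddef.h>
    #include <stdint.h>

    /* Fixed-size encoding (hypothetical 4-byte instructions, ARM64-style):
       instruction i starts at offset i * 4, so every boundary is known up
       front and the buffer can be split across parallel decoders. */
    static size_t fixed_boundaries(const uint8_t *code, size_t len, size_t *starts)
    {
        (void)code;  /* boundaries don't depend on the bytes at all */
        size_t n = 0;
        for (size_t off = 0; off + 4 <= len; off += 4)
            starts[n++] = off;
        return n;
    }

    /* Made-up length function standing in for variable-length decode: pretend
       the low two bits of the first byte encode a length of 1..4 bytes. Real
       x86 length determination has to look at prefixes, opcode, ModRM, SIB... */
    static size_t insn_length(const uint8_t *p)
    {
        return (p[0] & 0x3u) + 1;
    }

    /* Variable-size encoding: the start of instruction i+1 is only known after
       instruction i's length has been worked out, so this walk is inherently
       serial (hardware gets around it by guessing boundaries and throwing away
       the wrong guesses, which costs decoder width and power). */
    static size_t variable_boundaries(const uint8_t *code, size_t len, size_t *starts)
    {
        size_t n = 0, off = 0;
        while (off < len) {
            starts[n++] = off;
            off += insn_length(&code[off]);
        }
        return n;
    }

The wider you try to make the decoder, the more boundary guesses per cycle a variable-length design has to make and verify, which is roughly the point above.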
> variable length instructions have their advantages as well, such as code density
Variable length instructions in general do have a code density advantage, but x86 is a particularly poor example. For historical reasons, it wastes short encodings with rarely used things like BCD adjustment instructions, and on 64 bits often requires an extra prefix byte. The RISC-V developers did a size comparison when designing their own compressed ISA, and the variable-length x86-64 used more space than the fixed-length 64-bit ARM; for 32 bits, ARM's variable-length Thumb2 was the winner (see page 14 of https://riscv.org/wp-content/uploads/2015/06/riscv-compresse...).
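As a small illustration of that extra prefix byte (a sketch, not a benchmark: the exact bytes depend on which registers the compiler happens to pick, and it uses GCC/Clang-style inline asm on x86-64):

    #include <stdio.h>

    int main(void)
    {
        unsigned int       a32 = 1, b32 = 2;
        unsigned long long a64 = 1, b64 = 2;

        /* 32-bit register add: typically 2 bytes, e.g. "01 d8" for add eax, ebx. */
        __asm__("add %1, %0" : "+r"(a32) : "r"(b32));

        /* The same add on 64-bit registers needs a REX.W prefix byte,
           e.g. "48 01 d8" for add rax, rbx - 3 bytes for the same operation. */
        __asm__("add %1, %0" : "+r"(a64) : "r"(b64));

        printf("%u %llu\n", a32, a64);
        return 0;
    }

Disassembling the result with objdump -d shows the extra REX prefix on the 64-bit add; given how often 64-bit registers show up in real code, the density result in that RISC-V comparison isn't surprising.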
For many years, Intel had quite a process advantage over the competition. That of course helped them a lot in making low power processors compared to what AMD could achieve. And the non-x86 competition had basically stopped making processors in this domain. However, there was a reason that RISC designs were used in most low power applications like embedded and, of course, smartphones.
Yes, with today's complexity and transistor budgets, the disadvantages of x86 can be somewhat glossed over - otherwise it would have vanished from the market long ago - but they add a certain overhead which cannot be ignored when looking at low power applications. The effort the CPU has to spend before it can execute the instructions is higher, and x86 requires more optimization work done by the CPU than RISC designs, which today also contain a translation layer, but a much simpler one than x86's, as the assembly instructions map better onto modern CPU structures.
It is probably no coincidence that Intel, which had to work around the issues of executing CISC code on a modern CPU, chose the EPIC design for the Itanium, which goes beyond RISC in pushing complexity towards code generation instead of on-CPU optimizations. Too bad it didn't work out - it might have, if AMD had not added 64-bit extensions to x86. While there were certainly a lot of technical challenges which were never completely solved, the processors seemed to perform quite well when run with well optimized code. Perhaps they were just one or two process generations too early. While considered large for their time, their transistor count was small compared to a modern iPhone processor. I wonder how they would perform if just ported to 7nm (the latest CPUs were 32nm).
Even though Intel is known for putting tremendous work and effort into their compilers, and therefore has compilers that put out excellent results (even on AMD), those compilers never delivered on the promises made for Itanic.
If you'd like to see some first-hand observations about modern-ish compilers on Itanic, check out this person on Twitter who does lots of development on Itanic: