Oh yeah, doesn't ARM use fixed-size instructions while x86_64 is variable-size? So decoding x86_64 requires clever pipelining, whereas ARM is just "every X bytes is an instruction" and you can parallelize easily.
I wonder if we'll see Intel or AMD try to make another somewhat-backwards-compatible ISA jump to keep up with ARM.
If I’m not mistaken, based on similar threads on HN, decoding is never the bottleneck, so I would be hesitant to write x86 off for mobile devices. It probably does make the transition to smaller scale harder, and that is where most efficiency wins happen.
We should never write x86 off when there are billions behind it, and variable-length instructions have their advantages as well, such as code density, which may come to play an important role again in the future.
But it is much easier to simply chop a stream of instructions at every X bits than to evaluate a portion and decide what to do later, and that difference gets larger the wider you go.
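To make the "chop at every X bits" point concrete, here is a rough Python sketch of finding instruction boundaries in both styles. The variable-length encoding is a made-up toy (the low two bits of the first byte give the length), not real x86, and `fixed_boundaries`, `toy_length`, and `variable_boundaries` are hypothetical names for illustration only.

```python
def fixed_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets for a fixed-width ISA.

    Each offset depends only on its index, so a wide decoder could
    compute all of them independently (in parallel).
    """
    return list(range(0, len(code), width))


def toy_length(first_byte: int) -> int:
    """Hypothetical toy encoding: low 2 bits of the first byte -> length 1..4."""
    return (first_byte & 0b11) + 1


def variable_boundaries(code: bytes) -> list[int]:
    """Start offsets for the toy variable-length ISA.

    The start of instruction N is only known after the length of
    instruction N-1 has been determined, so this loop is inherently serial.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        offsets.append(pos)
        pos += toy_length(code[pos])
    return offsets


if __name__ == "__main__":
    print(fixed_boundaries(bytes(16)))
    print(variable_boundaries(bytes([0x00, 0x01, 0xFF, 0x03, 0, 0, 0, 0x02, 0, 0])))
```

As far as I understand, real x86 decoders work around the serial dependency by speculatively decoding at many byte offsets and discarding the wrong ones, which is presumably where the extra cost at wider widths comes from.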
> variable length instructions have their advantages as well, such as code density
Variable length instructions in general do have a code density advantage, but x86 is a particularly poor example. For historical reasons, it wastes short encodings with rarely used things like BCD adjustment instructions, and on 64 bits often requires an extra prefix byte. The RISC-V developers did a size comparison when designing their own compressed ISA, and the variable-length x86-64 used more space than the fixed-length 64-bit ARM; for 32 bits, ARM's variable-length Thumb2 was the winner (see page 14 of https://riscv.org/wp-content/uploads/2015/06/riscv-compresse...).
> I wonder if we'll see Intel or AMD try to make another somewhat-backwards-compatible ISA jump to keep up with ARM.
x86_32 --> x86_64 --> x86_512?