Magic Pixie Dust or Why x86 Cannot Beat ARM

John Stokes wrote: “It’s also the case that as ARM moves up the performance ladder, it will necessarily start to drop in terms of power efficiency. Again, there is no magic pixie dust here, and the impact of the ISA alone on power consumption in processors that draw many tens of watts is negligible.”

Some people just don’t get it. x86 is taking tens of watts to make a box. I have three in my home running Atom. I know what I have and how much heat they dissipate. ARM can provide the user wonderful performance with less silicon, fewer transistors. “The Cortex-A9 power-optimized hard macro implementation delivers its peak performance of 4000 DMIPS while consuming less than 250mW per CPU when selected from typical silicon.” That was for 40nm. They have 20nm parts scheduled for this year and 14nm is in the pipes.
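For a sense of scale, the quoted figures can be turned into a rough performance-per-watt number. This is only a back-of-the-envelope sketch: it assumes the 4000 DMIPS peak is the total for a dual-core part and that each CPU draws the full 250mW, neither of which the quote spells out. The function name is just for illustration.

```python
def dmips_per_watt(dmips, milliwatts_per_cpu, cpus=1):
    """Return DMIPS per watt for a chip with `cpus` cores."""
    total_watts = milliwatts_per_cpu * cpus / 1000.0
    return dmips / total_watts

# Quoted figures: 4000 DMIPS peak, under 250 mW per CPU.
# ASSUMPTION: the 4000 DMIPS is the combined total for two cores.
print(dmips_per_watt(4000, 250, cpus=2))  # 8000.0
```

Since 250mW is stated as an upper bound, the result is a lower bound under these assumptions.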

That’s not magic pixie dust. With a smaller and simpler instruction set, ARM takes fewer transistors to get the job done. ARM cores at the same process node as Intel Atoms are 4 times smaller in area. x86 has bloated its instruction set over the years and Intel cannot escape it because of the huge installed base. ARM does not have that burden. Intel can reduce its power/MIPS ratio with Moore’s Law but ARM is no slouch. It has partners at TSMC and IBM that are headed for 20nm and 14nm production, so ARM will keep its lead in power/MIPS. When both x86 and ARM reach the point of diminishing returns, where power consumption is no longer an issue, ARM will have a large share of personal computing markets. Further, even when power/MIPS is not an issue, size, price and total power consumption will still matter in data centres, which are growing in importance.

Now, this is mostly about mobile computing but that is more than half of all personal computing so ARM has plenty of room to grow. Notebooks are about 60% of unit shipments of non-ARM PCs at the moment and ARMed smartphones and tablets should catch up to the unit shipments of notebooks in a year or so. So, performance per watt does matter and ARM will take a huge share. Some people are used to keeping their notebooks plugged in so x86 won’t suddenly be gone but even the smartphone now meets all computing needs of much of the market for personal computing devices. That will only grow and docks and connectors will take care of more of the need for desktop computing. There will just be no more need for big box/hair-drying PCs in our homes and offices except for content generators and number crunchers. All the heavy work will be done on servers and servers will be far fewer in number than PCs. ARM will take over the computing space just as they have the embedded space.

Did we mention price? With volume production, the price of ARM CPUs will always be less than x86. Less silicon and lower licensing fees mean lower prices.

About Robert Pogson

I am a retired teacher in Canada. I taught in the subject areas where I have worked for almost forty years: maths, physics, chemistry and computers. I love hunting, fishing, picking berries and mushrooms, too.

16 Responses to Magic Pixie Dust or Why x86 Cannot Beat ARM

  1. No, x86 has eight general purpose registers, and x86_64 has sixteen. On x86 these are EAX, EBX, ECX, EDX, ESI, EBP, ESP, EDI. x86_64 adds eight more, named R8 through R15. But these are just the logical registers. There are many more physical registers in an out-of-order CPU.

    I don’t have an explanation for Vista’s poor performance. You may be right that caching is involved, but the instruction cache seems most unlikely. From experience, cache-related bottlenecks are usually due to the data cache, because code is highly suited to caching in a way that data is not. Code is normally accessed sequentially in read-only mode with high temporal locality, whereas data can be accessed randomly.
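The locality argument can be illustrated with a toy simulation. The sketch below models a small direct-mapped cache and feeds it two synthetic address traces. The 32 KiB cache size, 64-byte lines, the 4 KiB loop and the 64 MiB data region are all assumptions chosen for illustration; this is not a measurement of any real CPU or of Vista.

```python
import random

def hit_rate(addresses, cache_lines=512, line_bytes=64):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    tags = [None] * cache_lines          # one stored line number per slot
    hits = 0
    total = 0
    for addr in addresses:
        line = addr // line_bytes        # which memory line is touched
        index = line % cache_lines       # which cache slot it maps to
        if tags[index] == line:
            hits += 1
        else:
            tags[index] = line           # miss: fill the slot
        total += 1
    return hits / total if total else 0.0

# Code-like pattern: a 4 KiB loop body fetched 4 bytes at a time, 100 times.
code_trace = [pc for _ in range(100) for pc in range(0, 4096, 4)]

# Data-like pattern: random 4-byte reads scattered over 64 MiB (fixed seed).
rng = random.Random(0)
data_trace = [rng.randrange(0, 64 * 1024 * 1024, 4)
              for _ in range(len(code_trace))]

print(f"code-like hit rate: {hit_rate(code_trace):.3f}")
print(f"data-like hit rate: {hit_rate(data_trace):.3f}")
```

In this model the looping, sequential trace hits almost every time, while the scattered trace almost never does, which is the asymmetry described above.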

  2. x86 has four general-purpose registers.

    Don’t forget that a typical loop in an OS deals with one process. You can have a hundred processes running. If any of them overflows the cache, performance suffers. If it was not the cache, what is your explanation for Vista running like a champ on a Celeron with 2MB cache but being several times slower than GNU/Linux on an AMD64 with 512MB cache? The Celeron was in a notebook and it ran GNU/Linux or Vista quite easily, but the AMD64 was clearly slower with Vista, to the point of unusability. The user, who had used GNU/Linux on an ancient machine, asked for GNU/Linux to be installed.

  3. @Robert Pogson

    Predication only shows up in the 32-bit instructions, not Thumb or Thumb-2. Arguably it is actually a misfeature. While good for in-order designs, predication isn’t a good idea for superscalar out-of-order designs because of its interaction with register renaming.

    The impact of instruction size is really not as much as you think. I doubt there is even a single hot loop in any of your software which would overflow a typical ARM or x86 L1 I-cache.

    Lastly, remember that x86_64 actually offers 16 registers against ARM’s 15 (or 8 in Thumb mode). But these are logical registers; the actual number of physical registers may be much larger. In any case you can’t argue that ARM has more registers.

  4. @Joe.

    “Oh so if I have the ARM netlist will I be able to manufacture ARM processors as easily as I can copy software? I think not!”

    Actually, yes. Perhaps you have heard of FPGAs?

    ARM has been slow to move into the FPGA market because FPGA designs can be copied quite easily, and as an IP-only company, nothing terrifies ARM more than the possibility that people might use their designs without paying. So ARM waited for the appropriate level of DRM technology before releasing (say) the Cortex-M0, which is only available as an “encrypted netlist” (i.e. DRM-encumbered).

    “ARM lawyers aren’t threatening individual people and interfering with people’s personal dealings the same way the current copyright regime surrounding software and other media does. Nor will they until copying ARM processors are as easy as copying a file on a computer.”

    Really? Because I know of a few ARM-compatible projects on Opencores which were “cease & desisted”. They were all clean-room reverse-engineered by hobbyists, based on the optimistic ideal that “you can’t copyright/patent an ISA”. But ARM stomped them anyway. By the way, the 8086 clone on Opencores is still going, as are the various MIPS and Microblaze clones.

  5. ARM has many more registers than x86. You load data into those registers and compute on those registers so instructions need be only 32bits long. ARM instructions are all conditional so you don’t need a compute instruction and a decision instruction. It’s just one instruction. This makes it much easier to fit a large loop in the small cache.

    A review of the Cortex-A8 shows ARM binaries 0-10% larger than x86 binaries; see http://vanshardware.com/2010/08/mirror-the-coming-war-arm-versus-x86/. New software will use the Thumb-2 instruction set, which makes code 20-30% smaller.
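Combining the two figures above gives a rough idea of where Thumb-2 code lands relative to x86. This is a sketch of the arithmetic only, and it assumes the “20-30% smaller” figure is measured against 32-bit ARM code, which the figures above do not state explicitly. The helper name is made up for the example.

```python
def combine(range_a_pct, range_b_pct):
    """Multiply two (low, high) percentage ranges into one percentage range."""
    low = range_a_pct[0] * range_b_pct[0] / 100
    high = range_a_pct[1] * range_b_pct[1] / 100
    return (low, high)

arm_vs_x86 = (100, 110)    # ARM binaries 0-10% larger than x86
thumb2_vs_arm = (70, 80)   # Thumb-2 20-30% smaller (ASSUMED: relative to 32-bit ARM)

low, high = combine(arm_vs_x86, thumb2_vs_arm)
print(f"Thumb-2 code is roughly {low:.0f}%-{high:.0f}% the size of x86 code")
```

Under that assumption, Thumb-2 code comes out somewhere between 12% and 30% smaller than the equivalent x86 code.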

    You are right that this is an old argument but the bottom line is that ARM is good enough for many tasks including desktops/notebooks and it will help make them cheaper and smaller. Don’t discount the effect of smaller. Shipping all the way from China to the end-user is less costly for a smaller unit so there is a savings at every stage from production to usage. It amounts to a lot.

  6. Joe says:

    @ Linux Apostate

    Oh so if I have the ARM netlist will I be able to manufacture ARM processors as easily as I can copy software? I think not!

    ARM lawyers aren’t threatening individual people and interfering with people’s personal dealings the same way the current copyright regime surrounding software and other media does. Nor will they until copying ARM processors are as easy as copying a file on a computer.

    Once someone invents a physical replicator we can talk about stuff like this, but as of today ARM’s “intellectual property” maneuvering is threatening no one’s freedom but industrial players.

  7. Ray says:

    “ARM’s instructions are tiny. Many more will fit in a small cache than x86 instructions.”

    Wait, individually, yes, it’s true. But an average program simply takes more cache on ARM, because more individual instructions are needed. It’s another RISC vs CISC debate. Each one has its own advantages and disadvantages.

  8. M$ has a monopoly that has allowed them to milk the world. That is obscene and illegal in the way it was accumulated.

  9. The linked presentation doesn’t actually contain a comparison with x86.

    But my contention here is *not* that x86 is intrinsically better than ARM, but rather that ARM is not really any different. Choosing the ARM ISA doesn’t get you anything that you can’t also get with a sufficiently well-designed x86 core. No “magic pixie dust” here.

    Did you know that ARM was originally intended for high-performance desktop workstations? For the first few years of the company’s lifetime, that is exactly where it was used. Then came set-top boxes, then PDAs and cellphones, and it is only during the PDA/cellphone era that low power consumption started to be a selling point. Even then the low power consumption is mostly achieved by choice of implementation technology rather than actual circuit design. The only magic here is good PR.

    I’m intrigued by the suggestion that ARM’s intellectual property activities are OK because they are not too greedy. That seems like a rather arbitrary criterion: is your quarrel with Microsoft only that Windows is too expensive?

  10. ARM seems to accept “normal” profits, unlike M$. I suppose they could be overtaken by greed and increase prices ten-fold but I don’t see that happening. They are in it for the long haul and have been faithfully working at it since the days of the 6502 (of which I still have working examples). I see mobile as an end-around play in competition with Wintel. The promised port of “8” is still vapourware and ARM does not depend on that for success. They are doing quite well with Linux.

    The Thumb-2 instruction set permits compiled code about 25% smaller than x86. A lot of that comes from using register operations and every instruction being conditional. That really suits a lot of cases.

    see http://elinux.org/images/8/8a/Experiment_with_Linux_and_ARM_Thumb-2_ISA.pdf

  11. I have seen lots of evidence. What have you not seen? Smartphones running browsers and media players and keeping track of stuff people care about? I have seen 24-inch monitors and good keyboards attached to smartphones, making a quite usable desktop environment. I expect docks for smartphones could eventually be a standard thing. It is so cool to have your computing environment on-line in your pocket with a session running 24×7 if necessary. The market seems not to have accepted the “all in one” but the smartphone is a monster. Billions of ARM processors are shipped per quarter and they are very reliable because they have a lower parts-count and run cooler. It all makes sense to me.

  12. “ARM’s instructions are tiny. Many more will fit in a small cache than x86 instructions.”

    Are you sure about that? The average size of an x86 instruction is less than 32 bits. It is true that they can be up to 104 bits but very long instructions of that sort are rare in real code. Why not try an “objdump” on some x86 program and see for yourself?
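For anyone who wants to try the objdump experiment, here is a minimal sketch that averages instruction lengths from disassembly text. It assumes the tab-separated layout printed by GNU `objdump -d` (address, raw bytes, mnemonic), with long instructions wrapping onto a bytes-only continuation line. `SAMPLE` is hand-written to mimic that layout, not captured from a real binary, and the function names are made up for this example.

```python
import re
import subprocess

ADDR = re.compile(r"^\s*[0-9a-f]+:$")   # e.g. "    113a:"
BYTE = re.compile(r"^[0-9a-f]{2}$")     # one hex byte

def instruction_lengths(disassembly):
    """Return the length in bytes of each instruction in objdump -d text."""
    lengths = []
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or not ADDR.match(fields[0]):
            continue                      # headers, labels, blank lines
        raw = fields[1].split()
        if not raw or not all(BYTE.match(b) for b in raw):
            continue
        has_mnemonic = len(fields) >= 3 and fields[2].strip() != ""
        if has_mnemonic or not lengths:
            lengths.append(len(raw))      # start of a new instruction
        else:
            lengths[-1] += len(raw)       # continuation of the previous one
    return lengths

def average_bits(lengths):
    """Average instruction size in bits."""
    return 8 * sum(lengths) / len(lengths)

def disassemble(path):
    """Run objdump on a binary (requires GNU binutils on the PATH)."""
    result = subprocess.run(["objdump", "-d", path],
                            capture_output=True, text=True, check=True)
    return result.stdout

# ASSUMPTION: hand-written sample in objdump's layout, for illustration only.
SAMPLE = (
    "0000000000001139 <main>:\n"
    "    1139:\t55                   \tpush   %rbp\n"
    "    113a:\t48 89 e5             \tmov    %rsp,%rbp\n"
    "    113d:\t48 b8 88 77 66 55 44 \tmovabs $0x1122334455667788,%rax\n"
    "    1144:\t33 22 11 \n"
    "    1147:\tc3                   \tret\n"
)

print(average_bits(instruction_lengths(SAMPLE)))  # 30.0
```

To measure a real program, pass its path to `disassemble()` and feed the result to `instruction_lengths()`. A four-instruction sample says nothing about real code, so the number above only shows that the parsing works.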

    “We should not own CPUs to hold bits we don’t need to flip.”

    Maybe, but this is actually an argument against RISC, because with RISC you have many equally good possibilities for encoding the same program. This is potentially a bad thing, because the code is not as dense as it could be.

    I wonder how you feel about ARM’s status as the Microsoft of CPU design. I’m an embedded systems engineer and I know of a number of 80386/486 clones not manufactured by Intel/AMD or even IBM or Cyrix. They are not used in desktop PCs, but they are common in low-power embedded devices running Windows, DOS or Linux. They can exist because x86 is an open standard, easily reimplemented by third parties. Only a few parts (MMX, SSE and so on) are actually exclusive to AMD/Intel.

    Whereas there are no ARM clones at all. All ARM CPU designs come directly from Cambridge, no matter where the chips were actually manufactured. Just like how every version of Windows comes from Redmond. ARM’s lawyers descend on anyone attempting a third-party reimplementation of their architecture. Personally I don’t care about any of that stuff, because I think ARM (and indeed Microsoft) have an absolute right to the trade secrets, copyrights and patented designs that they have developed over many years at the cost of billions of pounds (or dollars). R&D isn’t free and the cost of designing these things is now huge. But given your previous statements about intellectual property, copyright, patents and so on, are you really sure you are backing the good guys here?

  13. oldman says:

    “x86 is not making much sense for smartphones and it will not for notebooks soon.”

    Only when ARM has proven that it is up to the task of running full-blown programs on top of the operating systems that support them.

  14. The cache size issue is directly related to the instruction set. ARM’s instructions are tiny. Many more will fit in a small cache than x86 instructions. Hence, ARM chips did not need large caches to do well. x86 now has caches about the size of the main memory of PCs of 15 years ago. I seem to recall browsing, word-processing and playing games in those days. We don’t own PCs to hold software we are not using. We should not own CPUs to hold bits we don’t need to flip. It’s like cylinders in cars. Many folks find 3 to 4 cylinders quite acceptable in the cities where most live and yet you can buy a 12 cylinder car. x86 is not making much sense for smartphones and it will not for notebooks soon.

    The ARM instruction set has only 32bit instructions. x86 has far more instructions that are totally useless still rattling around in the decode structure. x86 needs up to 16 bits, just to specify the opcode. ARM has way more registers so often an ARM instruction is very short just specifying the operation on some register. see http://ref.x86asm.net/geek64.html

    A benchmark of Atom N450 v ARM A8 shows they were comparable but the Atom used 3X as much power.

  15. Yesterday you were telling us about the importance of caches, but today it’s all about the instruction set architecture. Color me confused. I notice that most of the area on recent Intel dies is taken up by cache, so maybe the instruction decoder isn’t actually that important?

    Stokes’ point about the relative unimportance of the ISA is one that I have made on many occasions myself. ARM’s ISA is not actually that much simpler than x86 when you consider the variants (Thumb, etc.) and the extensions (e.g. Neon, the two different sorts of FPU). The ISA also has anti-features that make implementation quite difficult, like having the PC as one of the GPRs. And predication, which makes effective out-of-order designs very difficult. (This sort of bad design is common to all ISAs; there are no truly future-proof ISAs, although something like Alpha comes very close.)

    That said, there *is* a reason why Intel will be playing catchup to ARM in power terms for some years. That reason is ARM’s substantial patent portfolio which grants them monopoly rights to manufacture certain sorts of design. Don’t you agree: intellectual property is a great thing.

  16. oldman says:

    “Did we mention price? With volume production, the price of ARM CPUs will always be less than x86. Less silicon and lower licensing fees means lower prices.”

    So what, Pog. People buy computers based on their ability to run specific applications, not on power consumption. As long as we are not talking about systems that consume power the way servers do (even my brand-new Dell Precision M6500 runs quiet and cool), having even a 130W CPU in a home system is a non-issue for most.

    x86 beats ARM so far because of the huge pool of applications available both commercial AND FOSS, and when you get out of the mobile and embedded device market that ARM is king of, it has yet to be proven that ARM is up to the task of supporting full function desktop applications.

    That said, ARM may well give x86 a run for its money as the world discovers how much it can get done with iOS or Android applications on mobile devices. However, having worked with both smartphones and the iPad tablet, with and without keyboards, I can assure you that these devices will at best replace traditional computers (desktop or mobile) for only a small number of people.
