Major OOPS With ARMed Devices

“On the ARM architecture, due to the split instruction and data caches, JIT engines have to clear the processor’s instruction cache to ensure that any freshly generated instructions are loaded and run.
 
Mono’s engineers noticed that, when flushing 128-byte blocks from the I-cache, only 64 bytes were being cleared, allowing the processor core to run stale and mismatched code and crash the running application.”
 
See Bug of the month: Cache flow problem crashes Samsung phone apps
Yes, ARMed devices are wonderful, but being different also causes different problems for programmers. The big.LITTLE architecture, intended to extend battery life, has the unintended consequence of making caching of data and instructions a bit more difficult when processes move from one processor to another of a different size. I’ve seen this many times. The more complex the system, the longer it takes to learn it and the more mistakes are made.
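
A rough way to picture the failure described in the quote is a flush loop that steps through memory by the line size it believes the cache has. The Python sketch below is only a model, not Mono's code: the function names and the constants `BIG_CORE_LINE` and `LITTLE_CORE_LINE` are made up for illustration, and the 128-byte and 64-byte sizes are taken from the quote. It assumes each flush step invalidates exactly one real cache line.

```python
def flushed_lines(start, length, assumed_line, actual_line):
    """Model a flush loop that steps by `assumed_line` bytes.

    Each step invalidates the one real cache line (of `actual_line` bytes)
    that contains the current address. Returns the base addresses of the
    real lines that were invalidated.
    """
    hit = set()
    addr = start - (start % assumed_line)  # align down to the assumed line size
    end = start + length
    while addr < end:
        hit.add(addr - (addr % actual_line))
        addr += assumed_line
    return hit


def stale_lines(start, length, assumed_line, actual_line):
    """Base addresses of real lines in [start, start+length) never invalidated."""
    first = start - (start % actual_line)
    needed = set(range(first, start + length, actual_line))
    return sorted(needed - flushed_lines(start, length, assumed_line, actual_line))


# Line sizes reported by two hypothetical core types (values from the quote).
BIG_CORE_LINE = 128
LITTLE_CORE_LINE = 64

# Stride learned on the big core, flush actually executed on the little core:
# every second 64-byte line is skipped and keeps its old instructions.
print(stale_lines(0, 256, BIG_CORE_LINE, LITTLE_CORE_LINE))  # [64, 192]

# One conservative choice (an assumption here, not taken from the article):
# step by the smallest line size of any core, which is safe on both.
safe = min(BIG_CORE_LINE, LITTLE_CORE_LINE)
print(stale_lines(0, 256, safe, LITTLE_CORE_LINE))  # []
print(stale_lines(0, 256, safe, BIG_CORE_LINE))     # []
```

Stepping by the smaller size costs a few redundant invalidations on the core with larger lines, but it never leaves a line untouched.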

Now, if only folks could fix the problem with the LeMaker Cello so that I could buy one sooner rather than later…

About Robert Pogson

I am a retired teacher in Canada. I taught in the subject areas where I have worked for almost forty years: maths, physics, chemistry and computers. I love hunting, fishing, picking berries and mushrooms, too.

7 Responses to Major OOPS With ARMed Devices

  1. oiaohm says:

    dougman, it's not a Skylake, right? There have been a few bugs this year for those.

    The ARM issue is a little more horrible.

    Anyone who bothers reading the arm64 manual will notice that instructions in the I-cache are meant to be read-only. Arm64 chips were not really designed to work with programs rewriting themselves on the fly.
    https://events.linuxfoundation.org/sites/events/files/slides/slides_10.pdf
    This is a 2015 conference slide set; there is a matching video.

    Let's just say Arm64 cache design can get evil. The D-cache on ARM is designed on the presumption that things change; the I-cache is designed on the idea that things stay constant.

    https://community.arm.com/groups/processors/blog/2010/02/17/caches-and-self-modifying-code
    This is ARM in 2010 talking about self-modifying code, making it clear that the method is really a hack that now has a bug.

    ARM CPUs are not really designed for JIT. ARM CPUs are designed more for AOT. Mono attempting to use JIT on a CPU designed for AOT is kind of asking for trouble.

    Emulators like Dolphin and PPSSPP are in the horrible position where they have no choice but to JIT. Mono? "Why" should be the first question.

    Self-modifying code and JIT:
    http://blog.onlinedisassembler.com/blog/?p=133
    On x86 it turns out not to be healthy for performance either. So JIT does not in fact run nicely on i3 & i5 either. Samsung's ARM just managed to make it two levels worse with a handling bug.

    So you have to ask yourself what in the heck the Mono developers think they are up to.

  2. luvr wrote, “They get other bugs. Just remember the FDIV bug.”

    That’s a hardware bug. TFA is about a software bug, not treating the hardware properly, assuming all caches are the same shape/size.

  3. luvr says:

    implying such a bug will never hit Intel iSomething CPUs is moronic

    They get other bugs. Just remember the FDIV bug.

  4. kurkosdr wrote, “Lots of ARM devices run smoothly too.”

    TFA’s ARMed device runs smoothly, too. It’s the software that’s not written correctly for it. The random nature of the failures made it difficult to find the fault, but it will be solved properly sooner or later now that the anthill has been kicked wide open. Adding a bit of complexity to the kernel’s data structures representing processes should be sufficient to deal with the matter. The kernel is very nice but it’s not “write once, run everywhere”. It shouldn’t be an issue for the upcoming “Cello” since it uses 4 identical cores. big.LITTLE makes sense for mobile devices but it’s not as relevant for server/desktop usage, where folks usually don’t run from a battery and load averages may be non-zero fairly often. With huge numbers of cores being contemplated, even in mobility, big.LITTLE may become obsolete. If you have sixteen identical cores, just run one for the lightest loads. Then you have a similar benefit and this “bug” disappears. So there are at least two solutions to this problem. Perhaps several will be used in different situations.

  5. dougman says:

    “Yet another dogbrain moronic comment.”

    You mad bro?

  6. kurkosdr says:

    “My Intel i3 & i5 runs nicely.”

    Yet another dogbrain moronic comment.

    Lots of ARM devices run smoothly too. But with hardware design schedules running so tight, implying such a bug will never hit Intel iSomething CPUs is moronic.

    But don’t worry, there are always people who didn’t receive a brain transplant from Canis Lupus Familiaris to correct you.

  7. dougman says:

    My Intel i3 & i5 runs nicely.
