Loading Up Linux

Bugs Found In Linux

Name | Description | Kernel version | Impacted applications | Maximum measured performance impact
---- | ----------- | -------------- | --------------------- | ------------------------------------
Group Imbalance | When launching multiple applications with different thread counts, some CPUs are idle while others are overloaded. | 2.6.38+ | All | 13X
Scheduling Group Construction | No load balancing between nodes that are 2 hops apart. | 3.9+ | All | 27X
Overload-on-Wakeup | Threads wake up on overloaded cores while some other cores are idle. | 2.6.32+ | Applications that sleep or wait | 22%
Missing Scheduling Domains | The load is not balanced between NUMA nodes. | 3.19+ | All | 138X
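
To make the Group Imbalance row a little more concrete, here is a toy sketch in Python. It is not kernel code: the per-core load numbers, the node names and the should_steal helper are all made up for illustration. It only shows why comparing the average load of two groups of cores can hide idle cores, and why comparing the minimum load (the kind of fix the paper proposes for this bug) does not.

```python
from statistics import mean


def should_steal(local_loads, remote_loads, metric):
    """Return True if the local group would pull work from the remote group.

    `metric` reduces a list of per-core loads to one number
    (hypothetical helper, not a kernel function).
    """
    return metric(local_loads) < metric(remote_loads)


# Hypothetical per-core loads on two NUMA nodes.
# Node A: one core runs a single heavy thread, the other three are idle.
# Node B: every core is busy with lighter threads.
node_a = [1000, 0, 0, 0]
node_b = [200, 200, 200, 200]

# Comparing averages: node A looks busier (250 vs 200), so its idle
# cores never take work from node B.
avg_decision = should_steal(node_a, node_b, mean)

# Comparing minimums: node A has an idle core (0 vs 200), so it steals.
min_decision = should_steal(node_a, node_b, min)

print("steal when comparing averages:", avg_decision)
print("steal when comparing minimums:", min_decision)
```

With these made-up numbers the average-based check leaves three cores of node A idle while node B stays loaded, which is the pattern the table describes.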

See The Linux Scheduler: a Decade of Wasted Cores

See also, Scheduler patches: 6x performance increase when system is under heavy load

We’ve known forever that Linux is not perfect, but it works very well. TFA quoted above points out several bugs in the scheduler which let us down on heavily loaded systems with many threads. Let’s see if these fixes can be introduced into the mainline kernel.

UPDATE – This could actually go somewhere. Peter Zijlstra has nibbled

About Robert Pogson

I am a retired teacher in Canada. I taught in the subject areas where I have worked for almost forty years: maths, physics, chemistry and computers. I love hunting, fishing, picking berries and mushrooms, too.

17 Responses to Loading Up Linux

  1. oiaohm says:

    Deaf Spy, the tools for tracking SQL Server locking issues are not just about SQL Server's own scheduler. They trace down into driver locks, because I/O locks badly damage SQL Server performance.

    SQLServerSpinlockContention.pdf

    Really, go read the PDF instead of guessing; all the information about the issue is described there. Sorry, Deaf Spy, you are an idiot who is attacking without having read the PDF, so you do not know that I have in fact referenced one of the debugging cases where you dig into the Windows kernel. Funny that in the past you claimed to be an experienced SQL Server user, Deaf Spy, yet you have never done lock diagnostics. One SQL Server instance can be running at half the speed of another, and the difference is that one of the I/O drivers has a locking bug. The spinlock diagnostic path will find it. It's an important thing to know how to check when running anything on Windows Server.

    SQL Server is not its own operating system; it depends on drivers and other parts provided by the OS, all of which can contain locks. Spinlock diagnostics is a very important process.

    Funny, you are just showing that your skill is bogus as well, Deaf Spy.

  2. Deaf Spy says:

    Microsoft provides different bits of documentation all over the place to attempt to provide the information to diagnose when Windows server or desktop has run into lock hell.

    Fifi, Fifi, you never fail to show up as the village, pardon, Robert’s idiot. Your own reference is about the internal scheduler of SQL Server, which has absolutely nothing to do with the Windows kernel, server or desktop. SQL Server, dearest Fifi, has basically its own operating system, scheduler and all.

    Fifi, you are totally irrelevant and clueless. Shut up.

  3. dougman says:

    “If you cannot get Certificate II level work right Dougman why should I go into Cert 4 and higher work with you. ”

    Perhaps you could ELI5? Not that I honestly care about it, but still, neither you nor Robert has explained this notion of devices sharing the same IP.

  4. oiaohm says:

    If you cannot get Certificate II level work right Dougman why should I go into Cert 4 and higher work with you. Basically you have proven yourself too big an idiot to get answers here.

  5. oiaohm says:

    dougman, so another two pointless spams by an idiot.

  6. dougman says:

    “So another attempt at being insulting ”

    Attempt? Just a few entries ago you got all butt-hurt about it and demanded I apologize! Which is it?

  7. dougman says:

    Fifi, I wonder that you will still be talking. Nobody marks you.

  8. oiaohm says:

    So another attempt at being insulting; when are you going to bother being on topic, dougman, or are you just a moron?

  9. dougman says:

    So resorting to misquoting because you cannot accept being wrong. No, it's possible to mock a person without misquoting. Of course he cannot stand being pulled up on the fact that misquoting is invalid, so now he is going to quote everything to try to get me to concede that misquoting is allowed.

    Sorry it attack my IQ again because you were pulled up for being completely wrong again.

    This just shows the level of idiot you are, someone who does not know the difference between mocking and misquoting. It's the old idiot excuse that mocking allows them to do misquoting and other major mistakes.

    So let's be more of a fraud. I have not raised the fact that you don’t have a clue about farm equipment or anything else. It was me who pointed out that the breakages Robert was suffering from were avoidable by ploughing.

    You know, this is a super big maybe. I have mentioned it before, but the process is a pain in the ass. The fact that you are referring to misquoting as mocking really means you are an idiot. You said that misquoting is fine for mocking, right?

    Please use reference sources that have no conflict of interest in future; then you may not be basing your ideas on fiction.

    Badly packages applications exist. Depending how built Yes it would be better at long last has agreed to sit down and work Ok it would be nicer

    Interesting point. There is something interesting here. Since you were completely wrong again please do this to yourself.

    There are about 12 cases that has done this on different topics including religion. I typed that line because I am sick of the same repeated mistake. So much the 100 percent normal idiot game play.

    Since I suffer from dyslexia I have to understand the diagnostic methods used for it and the statistics about it.

    Dougman, you have painted yourself into a corner: you accused LTSP of not being active, and that was not the case.

    opps typo.

  10. oiaohm says:

    http://mrpogson.com/2016/12/12/dhcp-trashed-by-toos-really/#comment-365376
    dougman, provide the answer for a Layer-2 load balancing and a Layer-4 load balancing example over there first, to prove you are not a moron, before I bother answering this.

  11. dougman says:

    “Windows 2012 has more locks than 2003. Linux kernel historically had 1 big kernel lock now it has thousands of individual locks.”

    What locks are those? Perhaps someone should put a combination lock on your computer, so as to limit your use.

  12. oiaohm says:

    Deaf Spy
    Back in the dark ages of Windows XP and 2003…
    The reality here is that the more cores you support, the more locks you have.

    Windows 2012 has more locks than 2003. Linux kernel historically had 1 big kernel lock now it has thousands of individual locks.

    So this is basically you being totally clueless Deaf Spy.

    Sorry, no point attacking me when you are absolutely clueless. The Windows NT kernel that is core to Windows 2012 and Windows 10 has an increasing number of locks with each new version, as the Linux kernel does, each time attempting to make locks cover a smaller and smaller area to reduce conflicts. Linux is winning the race to the smaller area of lock coverage. One of the biggest barriers to Windows making its locks' area of coverage smaller is Windows binary driver support.

    https://www.microsoft.com/en-us/download/details.aspx?id=26666&e6b34bbe-475b-1abd-2c51-b5034bcdd6d2=True

    Microsoft provides different bits of documentation all over the place to attempt to provide the information to diagnose when Windows server or desktop has run into lock hell.

    Of course, here is dougman insulting me because he does not have a clue whether I am right or wrong, or whether Deaf Spy was right or wrong, so he has to put a response here because I had. So proving dougman is an absolute idiot.

  13. dougman says:

    “Last chance to save your face and back away from the Fifiest Fifi Evar!”

    I told Fifi to partake in some meatspin for a few days, maybe it will reset his chemical imbalance.

  14. Deaf Spy says:

    Of course, Fifi again links to something totally irrelevant.

    Robert, are you sure your high esteem of Fifi’s technical thoughts was not a result of high temperature and flu? Last chance to save your face and back away from the Fifiest Fifi Evar!

  15. Deaf Spy says:

    the NT kernel has lots of locks

    Back in the dark ages of Windows XP and 2003…

  16. oiaohm says:

    https://support.microsoft.com/en-au/kb/2207548

    kurkosdr, there is a long list of Windows bugs around the scheduler as well. Some are insane, like this one where the only solution is to disable power management, making your servers consume more power in idle times.

    For companies paying for Red Hat's top support tier, there is no cost saving on the OS between Windows and Linux. Yet on average Red Hat has been performing better even with that long list of defects, because Windows has an equally long list of scheduler defects.

    The Linux kernel has lots of locks these days, which is why it includes lock debugging tools like lockdep.

    Some Linux builds support over 4096 cores (which can be over 1024 cores per socket); Windows Server tops out at 640 cores. That is quite a difference, and larger systems do help to test stuff more completely. There is another interesting one: the work to make the Linux kernel support full hard real-time has found huge numbers of insanely hard-to-find performance bugs. Missing Scheduling Domains comes out of the real-time kernel work on a quad-core system.

    Broader usage and more complete testing create, over time, a better scheduler and better locking.

  17. kurkosdr says:

    But Linux is good for servers. It’s not like all those companies choose Linux for their webserver because it is cheaper or something…

    Also: “the Linux kernel might waste cores but the NT kernel has lots of locks and thread syncho. TheRightChoice between implementation simplicity and solving 100% of the problem (aka best utilization of cores) has been made by Linux *strokes graybeard* “
