One big difference that I’ve noticed between Windows and Linux is that Windows does a much better job ensuring that the system stays responsive even under heavy load.
For instance, I often need to compile Rust code. Anyone who writes Rust knows that the Rust compiler is very good at using all your cores and all the CPU time it can get its hands on (which is good, you want it to compile as fast as possible after all). But that means that for a time while my Rust code is compiling, I will be maxing out all my CPU cores at 100% usage.
When this happens on Windows, I’ve never really noticed. I can use my web browser or my code editor just fine while the code compiles, so I’ve never really thought about it.
However, on Linux when all my cores reach 100%, I start to notice it. It seems like every window I have open starts to lag and I get stuttering as the programs struggle to get a little bit of CPU that’s left. My web browser starts lagging with whole seconds of no response and my editor behaves the same. Even my KDE Plasma desktop environment starts lagging.
I suppose Windows must be doing something clever to somehow prioritize user-facing GUI applications even in the face of extreme CPU starvation, while Linux doesn’t seem to do a similar thing (or doesn’t do it as well).
Is this an inherent problem of Linux at the moment or can I do something to improve this? I’m on Kubuntu 24.04 if it matters. Also, I don’t believe it is a memory or I/O problem as my memory is sitting at around 60% usage when it happens with 0% swap usage, while my CPU sits at basically 100% on all cores. I’ve also tried disabling swap and it doesn’t seem to make a difference.
EDIT: Tried nice -n +19, still lags my other programs.
EDIT 2: Tried installing the Liquorix kernel, which is supposedly better for this kinda thing. I dunno if it’s placebo but stuff feels a bit snappier now? My mouse feels more responsive. Again, dunno if it’s placebo. But anyways, I tried compiling again and it still lags my other stuff.
The Linux kernel’s default CPU scheduler, CFS, tries to be fair to all processes at the same time, both foreground and background, in order to get high throughput. Abstractly, think “they never know what you intend to do”, so the default is sort of middle of the road: every process gets a fair share of CPU time unless it has been intentionally nice’d or whatnot. People who need realtime work (the classic use is audio engineers who need near-zero latency in their hardware inputs like a MIDI sequencer, but embedded hardware uses realtime a lot too) reconfigure their system(s) to that need; for desktop-priority users there are ways to alter the CFS scheduler to help maintain desktop responsiveness.
Have a look at GitHub projects such as this one to learn how and what to tweak. Not that you necessarily need to use it, but it’s a good starting point for understanding how the mojo works and what you can do on your own with a few sysctl tweaks to get a better desktop experience while your Rust code is compiling in the background: https://github.com/igo95862/cfs-zen-tweaks (in this project you’re looking at the set-cfs-zen-tweaks.sh file and what it’s tweaking in /proc so you can get hints on where your research should lead; most of these can be set with a sysctl).
There’s a lot to learn about this, so I hope this gets you started down the right path on searches for more information to get the exact solution/recipe which works for you.
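Before changing anything, it can help to see which scheduler knobs your kernel actually exposes. Here is a minimal Python sketch, assuming the tunables show up as files matching /proc/sys/kernel/sched_* (as far as I know, newer kernels moved some of them to /sys/kernel/debug/sched/, which needs root to read). The function name is made up for illustration and is not part of the project linked above.

```python
import glob
import os


def read_sched_tunables(pattern="/proc/sys/kernel/sched_*"):
    """Return {tunable name: current value} for every readable file matching pattern."""
    values = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as handle:
                values[os.path.basename(path)] = handle.read().strip()
        except OSError:
            # Root-only entries and directories cannot be read this way; skip them.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_sched_tunables().items():
        print(f"{name} = {value}")
```

Passing a different glob, for example "/sys/kernel/debug/sched/*" while running as root, lists that directory instead. The set of files you get back depends on the kernel version and build options.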
I’d say nice alone is a good place to start, without delving into the scheduler rabbit hole…
I would agree, and would bring awareness of ionice into the conversation for the readers - it can help control I/O priority to your block devices in the case of write-heavy workloads, possibly compiler artifacts etc.
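For anyone who wants to combine the two, here is a rough Python sketch that wraps a command in both nice and ionice. The helper names are hypothetical; the only flags assumed are nice -n and ionice -c, where class 3 is ionice’s “idle” class. Whether the I/O class has any effect depends on the I/O scheduler in use, so treat that part as best effort.

```python
import subprocess


def low_priority_argv(command, niceness=19, io_class=3):
    """Prefix command so it runs with low CPU and low I/O priority.

    niceness=19 is the lowest CPU priority; io_class=3 is ionice's "idle" class.
    """
    return ["nice", "-n", str(niceness), "ionice", "-c", str(io_class), *command]


def run_low_priority(command):
    """Run command through nice and ionice and return its exit code."""
    return subprocess.run(low_priority_argv(command)).returncode
```

Usage would be something like run_low_priority(["cargo", "build", "--release"]). As the original post shows, this alone may not be enough to keep the desktop smooth.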
“they never know what you intend to do”
I feel like if Linux wants to be a serious desktop OS contender, this needs to “just work” without having to look into all these custom solutions. If there is a desktop environment with windows and such, that obviously is intended to always stay responsive. Assuming no intentions makes more sense for a server environment.
Even for a server, the UI should always get priority, because when you gotta remote in, most likely shit’s already going wrong.
Totally agree, I’ve been in the situation where a remote host is 100%-ing and when I want to ssh into it to figure out why and possibly fix it, I can’t because ssh is unresponsive! That leaves only one way out: a hard reboot, and hoping I didn’t lose data.
This is a fundamental issue in Linux, it needs a scheduler from this century.
You should look into IPMI console access, that’s usually the real ‘only way out of this’
SSH has a lot of complexity but it’s still the happy path, with a lot of dependencies that can get in your way: is it waiting to do a reverse DNS lookup on your IP? Trying to read files like your auth key from a saturated or failing disk? Syncing logs?
With that said, I am surprised people are having responsiveness issues under full load. Are you sure you weren’t running out of memory and relying heavily on swapping?
Sounds like Kubuntu’s fault to me. If they provide the desktop environment, shouldn’t they be the ones making it play nice with the Linux scheduler? Linux is configurable enough to support real-time scheduling.
FWIW I run NixOS and I’ve never experienced lag while compiling Rust code.
Linux defaults are optimized for performance and not for desktop usability.
If that is the case, Linux will never be a viable desktop OS alternative.
Either that needs to change or distributions targeting the desktop need to do it. Maybe we need desktop and server variants of Linux. It kinda makes sense as these use cases are quite different.
EDIT: I’m curious about the down votes. Do people really believe that it benefits Linux to deprioritise user experience in this way? Do you really think Linux will become an actual commonplace OS if it keeps focusing on “performance” instead of UX?
“The kernel runs out of time to solve the NP-complete scheduling problem in time.”
More responsiveness requires more context-switching, which then subtracts from the available total CPU bandwidth. There is a point where the task scheduler and CPUs get so overloaded that a non-RT kernel can no longer guarantee timed events.
So, web browsing is basically poison for the task scheduler under high load. Unless you reserve some CPU bandwidth (with cgroups, etc.) beforehand for the foreground task.
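One way to approximate that reservation from the other side is to put the heavy job into its own cgroup with a smaller CPU share, instead of boosting the foreground. A hedged Python sketch: it assumes a systemd user session on cgroup v2 with the cpu controller available to your user, and it uses systemd-run with the CPUWeight property (default 100, allowed range 1 to 10000). The function name and the value 20 are arbitrary choices for illustration.

```python
import subprocess


def scoped_argv(command, cpu_weight=20):
    """Wrap command in a transient systemd scope with a reduced CPU weight.

    CPUWeight defaults to 100, so 20 gives the job a smaller share than
    default-weight units whenever the CPUs are contended.
    """
    if not 1 <= cpu_weight <= 10000:
        raise ValueError("CPUWeight must be between 1 and 10000")
    return ["systemd-run", "--user", "--scope", "-p", f"CPUWeight={cpu_weight}", *command]


def run_scoped(command, cpu_weight=20):
    """Run command inside the scope and return its exit code."""
    return subprocess.run(scoped_argv(command, cpu_weight)).returncode
```

The weight only matters relative to sibling cgroups, so how much this helps depends on how your desktop environment groups its applications.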
Since SMT threads also aren’t real cores (about ~0.4 - 0.7 of an actual core), putting 16 tasks on a 16/8 machine is only going to slow down the execution of all other tasks on the shared cores. I usually leave one CPU thread for “housekeeping” if I need to do something else. If I don’t, some random task is going to be very pleased by not having to share a core. That “spare” CPU thread will be running literally everything else, so it may get saturated by the kernel tasks alone.
nice +5 is more of a suggestion to “please run this task with a worse latency on a contended CPU.”
(I think I should benchmark make -j15 vs. make -j16 to see what the difference is)
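The “leave one thread for housekeeping” idea is easy to script. A small Python sketch, assuming Linux (os.sched_getaffinity is not available everywhere) and using cargo’s -j option; the function names are made up:

```python
import os


def build_jobs(available_cpus, spare=1):
    """Number of parallel jobs that leaves `spare` CPU threads free (never below 1)."""
    return max(1, available_cpus - spare)


def cargo_build_argv(spare=1):
    """Build a cargo command line that leaves `spare` CPU threads unused."""
    # sched_getaffinity counts the CPUs this process is allowed to run on.
    cpus = len(os.sched_getaffinity(0))
    return ["cargo", "build", "-j", str(build_jobs(cpus, spare))]
```

On a 16-thread machine this yields -j 15, the same split as the make -j15 vs. make -j16 comparison above.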
That’s all fine, but as I said, Windows seems to handle this situation without a hitch. Why can Windows do it when Linux can’t?
Also, it sounds like you suggest there is a tradeoff between bandwidth and responsiveness. That sounds reasonable. But shouldn’t Linux then allow me to easily decide where I want that tradeoff to lie? Currently I only have workarounds. Why isn’t there some setting somewhere to say “Yes, please prioritise responsiveness even if it reduces bandwidth a little bit”. And that probably ought to be the default setting. I don’t think a responsive UI should be questioned - that should just be a given.
You’re right of course. I think the issue is that Linux doesn’t care about the UI. As far as it is concerned GUI is just another program. That’s the same reason you don’t have things like ctrl-alt-del on Linux.
To be fair, there should be some heuristics to boost priority of anything that has received input from the hardware. (a button click e.g.) The no-care-latency jobs can be delayed indefinitely.
Why can Windows do it when Linux can’t?
Windows lies to you. The only way they don’t get this problem is that they are reserving some CPU bandwidth for the UI beforehand, which would explain the 1-2% worse y-cruncher results on Windows.
If that’s the solution to the problem, it’s a good solution. Linux ought to do the same thing, cause none of the suggestions in this thread have worked for me.
nice -n 5 cargo build
nice is a program that sets priorities for the CPU scheduler. Default is 0. Goes from -20, which is max prio, to +19 which is min prio.
This way other programs will get CPU time before cargo/rustc.
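If the build is already running, you don’t have to restart it under nice. A minimal Python sketch using os.setpriority, which does roughly what the renice command does; the helper name is hypothetical:

```python
import os


def renice(pid, niceness):
    """Set the niceness of a running process and return the value now in effect."""
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

For example, renice(12345, 10), where 12345 stands in for the PID of your build. Without root you can usually only raise the niceness, not lower it again, and processes the target has already spawned keep their own value.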
It’s more of a workaround than a solution. I don’t want to have to do this for every intensive program I run. The desktop should just be responsive without any configuration.
You could give your compiler a lower priority instead of upping everything else.
I’d still need to lower the priority of my C++ compiler or whatever else intensive stuff I’d be running. I would like a general solution, not a patch just for running my Rust compiler.
How do you expect the system to know what program is important to you and which isn’t?
The Windows solution is to switch tasks very often and to do a lot of accounting to ensure fair distribution. This results in a small but significant performance degradation. If you want your system to perform worse overall you can achieve this by setting the default process time slice value very low - don’t come back complaining if your builds suddenly take 10-20% longer though.
The correct solution is for you to tell the system what’s important and what is not so it can do what you want properly.
You might like to configure and use the auto nice daemon: https://and.sourceforge.net/
How do you expect the system to know what program is important to you and which isn’t?
Hmm
The windows solution is to switch tasks very often and to do a lot of accounting to ensure fair distribution.
Sounds like you have a good idea already!
So the better approach would be to spawn all desktop and base GUI things with nice -18 or something?
No. This will wreak havoc. At most -1, but I’d advise against that. Just spawn the lesser-prioritised programs with a positive value.
Could you elaborate?
Critical operating system tasks run at -19. If they don’t get priority it will create all kinds of problems. Audio often runs below 0 as well, at perhaps -2, so music doesn’t stutter under load. Stuff like that.
Ok, nice. Do you know what other undefined processes are spawned with?
Default is 0. Also, processes inherit the priority of their parent.
This is another reason why starting the desktop environment as a whole with a different prio won’t work: the compiler is started as a child of the editor or shell which is a child of the DE so it will also have the changed prio.
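You can watch the inheritance happen with a short experiment. This Python sketch starts a child with its niceness raised, lets that child start a grandchild normally, and reports the grandchild’s niceness. The function name is invented for the example.

```python
import os
import subprocess
import sys


def niceness_of_grandchild(increment):
    """Start a child with niceness raised by `increment`, have it spawn a
    grandchild without any special handling, and return the grandchild's niceness."""
    # os.nice(0) changes nothing and returns the caller's current niceness.
    inner = "import os; print(os.nice(0))"
    outer = (
        "import subprocess, sys; "
        f"out = subprocess.run([sys.executable, '-c', {inner!r}], "
        "capture_output=True, text=True, check=True).stdout; "
        "print(out.strip())"
    )
    result = subprocess.run(
        [sys.executable, "-c", outer],
        preexec_fn=lambda: os.nice(increment),  # applied in the child only
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Calling niceness_of_grandchild(5) should give your shell’s niceness plus 5 (capped at 19), even though the grandchild never called nice itself.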
Damn… thanks, that’s complicated