Linux load average increased since v2.2.0 #15453
I have also seen this on an idle RL8.8 system:

```
[root@repo2 ~]# uptime
 12:54:01 up 11 min,  1 user,  load average: 1.00, 0.93, 0.58
[root@repo2 ~]# zpool get autotrim repo2
NAME   PROPERTY  VALUE  SOURCE
repo2  autotrim  on     local
[root@repo2 ~]# ps aux | grep -w D\<
root  1264  0.0  0.0  0  0  ?  D<  12:42  0:00  [vdev_autotrim]
[root@repo2 ~]# cat /etc/redhat-release
Rocky Linux release 8.8 (Green Obsidian)
[root@repo2 ~]# uname -a
Linux repo2 4.18.0-477.27.1.el8_8.x86_64 #1 SMP Wed Sep 20 15:55:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@repo2 ~]# cat /sys/module/zfs/version
2.2.0-1
```

In addition, disabling autotrim does not make the `vdev_autotrim` thread go away:

```
[root@repo2 ~]# zpool set autotrim=off repo2
[root@repo2 ~]# zpool get autotrim repo2
NAME   PROPERTY  VALUE  SOURCE
repo2  autotrim  off    default
[root@repo2 ~]# ps aux | grep -w D\<
root  1264  0.0  0.0  0  0  ?  D<  12:42  0:00  [vdev_autotrim]
```

However, rebooting does get rid of that kernel thread and restores the system to a load of less than 1:

```
[root@repo2 ~]# uptime
 13:03:02 up 1 min,  2 users,  load average: 0.15, 0.12, 0.05
[root@repo2 ~]# ps aux | grep -w D\<
root  3276  0.0  0.0  12144  1100  pts/1  S+  13:03  0:00  grep --color=auto -w D<
[root@repo2 ~]# zpool get autotrim repo2
NAME   PROPERTY  VALUE  SOURCE
repo2  autotrim  off    default
```

This then begs the question whether the `vdev_autotrim` thread should keep running at all once autotrim has been turned off.
Note, there are no extra lines in `/proc/spl/kstat/zfs/dbgmsg` when re-enabling autotrim spawns the thread:

```
[root@repo2 ~]# cat /proc/spl/kstat/zfs/dbgmsg
...
1698523656   spa_history.c:294:spa_history_log_sync(): command: zpool set autotrim=on repo2
[root@repo2 ~]# ps aux | grep -w D\<
root  7193  0.0  0.0  0  0  ?  D<  13:07  0:00  [vdev_autotrim]
```
It seems to me that
All three have the same stack:
Getting the same behaviour after a 2.1 -> 2.2 upgrade. OL9, default UEK kernel.
Same here, NixOS 23.11, brand new.
@behlendorf Is there any new insight into this problem? Unfortunately, it doesn't look like jxdking has been active lately. In my own tests I was able to reproduce this behavior on Ubuntu 22.04 (with zfs-2.2.2 installed manually via DKMS) as well as on Ubuntu 23.10 and the pre-release version of 24.04, which both ship with zfs-2.2.x built in. It looks like the load is permanently increased by one for each pool that has autotrim set to on, even if there is no other activity on the drives.
My interpretation of the data is different. The number of threads in uninterruptible sleep is actually the number of TRIM-able underlying storage devices across all zpools. I have two zpools built on three NVMe vdevs; my load is +3, and I can see one autotrim thread in D state for each vdev. In other words, the load will scale with the number of underlying storage devices capable of TRIM, so this is a bigger problem, not a smaller one.
Also just noticed this on my system. I have two pools that are on SSDs with autotrim enabled, and my load will only ever go down to 2.0, matching the others' descriptions of one `vdev_autotrim` thread per vdev. Fully updated Arch Linux with ZFS compiled from the AUR (no archzfs). I might try building with that commit reverted and see if it still does it.
No. The correct description is that you have two TRIM-able vdevs in total assigned to zpools. The number of zpools is not a factor. Each vdev participating in autotrim gets its own uninterruptible thread, and that is the deciding factor driving the load increase. This is why this is a bigger problem than it appears: the bigger and more complex your system is, the worse it gets, because the load scales with the number of vdevs participating in autotrim. E.g.:
QED
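As a quick sanity check of the claim above, the `vdev_autotrim` kernel threads can be counted directly; on an affected system, the count should match both the number of TRIM-capable vdevs and the persistent floor of the load average. A minimal sketch, assuming only that the thread is named `vdev_autotrim` as shown in the `ps` output earlier in this thread:

```shell
# Count vdev_autotrim kernel threads; each one sits in uninterruptible
# sleep (state D) and therefore adds 1.0 to the load-average floor.
# "|| true" keeps the pipeline succeeding when the count is zero,
# since grep -c exits nonzero on no matches.
n=$(ps -e -o comm= | grep -cx vdev_autotrim || true)
echo "vdev_autotrim threads (expected load floor): $n"
```

On a system without the bug (or without ZFS), this should report 0.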
Here is an attempt to fix the issue: #15781. My test environment was torn down months ago in order to work on other projects, so I have no way to test it myself. Please give feedback on whether it fixes the issue. Thanks.
Thank you very much. I did a quick test on my testing environment running Ubuntu 22.04 LTS and can indeed confirm that the high load average is fixed and the uninterruptible threads are gone.
Fixed by #15781 |
System information
Describe the problem you're observing
I recently upgraded my workstation to v2.2.0. Since this upgrade, I noticed my load average has increased by +3 from what it was before. Since Linux includes uninterruptible tasks in the load calculation, I am pretty sure this is the cause, as I have three (new) threads in uninterruptible sleep:
Any ideas as to what changed on the autotrim code path? I have been using this feature for quite a long time.
I am not seeing any abnormal latency issues nor excessive autotrim ops on my two zpools, and performance remains good despite the increased load average.
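To confirm the connection between the load figure and uninterruptible tasks on any given box, the D-state threads can be listed directly. A minimal sketch using plain procps `ps`, nothing ZFS-specific assumed:

```shell
# List tasks in uninterruptible sleep (state starting with D);
# Linux counts these toward the load average together with
# runnable tasks, so each one raises the load floor by 1.0.
ps -eo state=,pid=,comm= | awk '$1 ~ /^D/'
```

On an affected system, this should print one `vdev_autotrim` line per TRIM-capable vdev.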
Describe how to reproduce the problem
Look for uninterruptible ZFS threads contributing to the load average increase.
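To put a number on it, the reported load can be compared against the D-state thread count. A minimal sketch that reads `/proc/loadavg`, so Linux only:

```shell
# Compare the 1-minute load average with the current number of
# D-state threads; on an otherwise idle system with autotrim on,
# the load floor converges toward the D-state count.
read -r load1 rest < /proc/loadavg
dstate=$(ps -eo state= | grep -c '^D' || true)
echo "1-min load: $load1, D-state threads: $dstate"
```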
Include any warning/errors/backtraces from the system logs
N/A