Add slow disk diagnosis to ZED #15469
Conversation
Just some quick questions as I was looking over the code -
...but that makes it sound like a top level vdev state. Does something special happen to leaf vdevs marked DEGRADED? Like we stop allocating data on them or something? Or is it functionally the same as being FAULTED?
An individual vdev is degraded when it trips the checksum SERD diagnosis. For this PR, we are extending that to include a "too many slow I/Os" diagnosis. A degraded vdev is mentioned in pool-concepts:
I will update the man page to reflect that slow I/Os can also cause a degraded state. A hot spare will replace a vdev that is degraded or faulted. We are leveraging this behavior so that a slow disk can be replaced with a spare.
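For illustration, a minimal sketch of the hot-spare side of this (pool and device names are placeholders, not taken from the PR): once a spare is attached to the pool, ZED can use it to replace a leaf vdev that becomes DEGRADED, whether from checksum errors or, with this change, from excessive slow I/Os.

```sh
# Attach a hot spare so ZED's retire agent has something to swap in
# when a leaf vdev is diagnosed as DEGRADED (names are examples only).
zpool add tank spare /dev/sdf

# After a slow-I/O (or checksum) diagnosis degrades a leaf vdev,
# the pool status should show the spare in use for that vdev.
zpool status tank
```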
Yes, with the default SERD slow I/O thresholds a generic pool would see a degrade if one, and only one, vdev is slow (delay events for I/Os taking 30+ seconds). The current behavior is that the per-vdev slow I/O counters increase and zevents are generated, but no diagnosis occurs.
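The 30-second figure is the zio_slow_io_ms threshold referenced in the PR description. A rough sketch of how to inspect or lower it for experimentation (the sysfs path assumes a Linux build of OpenZFS; see the zfs(4) module-parameters man page):

```sh
# Current slow-I/O threshold in milliseconds (default 30000).
cat /sys/module/zfs/parameters/zio_slow_io_ms

# Lowering it (e.g. while testing) makes I/Os slower than 5 seconds
# count as "slow" and emit ereport.fs.zfs.delay events.
echo 5000 > /sys/module/zfs/parameters/zio_slow_io_ms
```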
I think you may only want to enable the behavior if the user has set the new slow I/O properties.
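That opt-in model is what the final commit message below describes. A sketch of what opting in might look like, assuming the user-visible names for VDEV_PROP_SLOW_IO_N and VDEV_PROP_SLOW_IO_T are slow_io_n and slow_io_t (the names here are an assumption, not quoted from this PR):

```sh
# Opt a leaf vdev in to slow-I/O diagnosis: degrade it if 10 slow I/Os
# are seen within a 30-second window (property names assumed).
zpool set slow_io_n=10 tank sdb
zpool set slow_io_t=30 tank sdb

# Verify the per-vdev settings.
zpool get slow_io_n,slow_io_t tank sdb
```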
Force-pushed from 37c9f86 to 18dcd6c.
Rebased to resolve merge conflicts. Addressed review feedback.
Just for reference, I noticed some test failures for CentOS 7.
Is this related to the following messages I'm finding in kmsg on my QNAP?
2023-08-09 10:18:13 -07:00 <7> [ 606.733785] ----- [DISK SLOW] Pool zpool1 vd 8851305046343565969 -----
Thank you.
Force-pushed from 18dcd6c to b777a27.
Fixed the commit message to be …
I believe it is related, in that the above message is likely a summary of a zevent for a slow I/O. If you have CLI (shell) access you can confirm with …
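The exact command was elided in the comment above, but for reference, two ways a reasonably current OpenZFS exposes this information (as noted below, the QNAP port turned out to be too old for the -s option):

```sh
# Per-vdev slow I/O counts (the SLOW column from `zpool status -s`).
zpool status -s zpool1

# Raw delay events as they are delivered to ZED.
zpool events -v | grep -A 4 'ereport.fs.zfs.delay'
```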
@don-brady when you get a chance can you rebase this? I also noticed the new test case is failing on the CentOS 7 builder.
The CentOS 7 test failure was very interesting. On this OS, and only on this OS, a spa txg sync was coming in during/after the test file reads and generating a lot of slow I/Os from the txg sync writes, which in turn was causing a degrade where one was not expected for the test. The current … Also updated …
Force-pushed from f1fe8ce to 258c87e.
Another ZTS fix, for the recent Ubuntu failures.
Force-pushed from 9a447eb to a9d10d4.
Force-pushed from a9d10d4 to a46f3d3.
The ZFS on QNAP is a branch from the original Solaris release and is too old to have the '-s' option. Thank you for trying.
Force-pushed from a46f3d3 to 458977c.
Force-pushed from 458977c to 9ee6d77.
Force-pushed from 19e552d to dded2f8.
Looks good. @don-brady could you just rebase this on the latest master branch and force-update the PR. We should get a cleaner CI run.
Force-pushed from 98a1410 to ea98ab7.
Reviewed-by: Allan Jude <[email protected]>
@don-brady it looks like this change is causing at least the following ZTS failures. https://github.com/openzfs/zfs/actions/runs/7429220144?pr=15469
Investigating the ZTS failures.
Slow disk response times can be indicative of a failing drive. ZFS currently tracks slow I/Os (slower than zio_slow_io_ms) and generates events (ereport.fs.zfs.delay). However, no action is taken by ZED, as is done for checksum or I/O errors. This change adds slow disk diagnosis to ZED which is opt-in using new VDEV properties:
VDEV_PROP_SLOW_IO_N
VDEV_PROP_SLOW_IO_T
If multiple VDEVs in a pool are undergoing slow I/Os, then it skips the zpool_vdev_degrade().
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Co-authored-by: Rob Wing <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Force-pushed from 84dfec5 to fa957c7.
Update
In the ZTS, there are ZED-related tests in … The mystery I encountered was that in one of the …
Root Cause
Normally ZED keeps persistent state (…). To fix this failure, I added a call to …
Slow disk response times can be indicative of a failing drive. ZFS currently tracks slow I/Os (slower than zio_slow_io_ms) and generates events (ereport.fs.zfs.delay). However, no action is taken by ZED, as is done for checksum or I/O errors. This change adds slow disk diagnosis to ZED which is opt-in using new VDEV properties:
VDEV_PROP_SLOW_IO_N
VDEV_PROP_SLOW_IO_T
If multiple VDEVs in a pool are undergoing slow I/Os, then it skips the zpool_vdev_degrade().
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Rob Wing <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #15469
Motivation and Context
Slow disk response times can be indicative of a failing drive. ZFS currently tracks slow I/Os (slower than zio_slow_io_ms) and generates events (ereport.fs.zfs.delay). But no action is taken, as is done for checksum or I/O errors. It would be nice to have the option to degrade a slow disk that is slowing down throughput and allow for it to be replaced with a spare.
Description
Add new VDEV properties, VDEV_PROP_SLOW_IO_N and VDEV_PROP_SLOW_IO_T, that ZED can consume for the SERD engine parameters. They default to 10 slow I/Os in 30 seconds.
Add a new SERD engine type, zc_serd_slow_io, that can track these events and initiate a zpool_vdev_degrade() when the limit has been reached.
If multiple VDEVs in a pool are undergoing slow I/Os, then skip the zpool_vdev_degrade().
Also did some minor code cleanup in the zfs diagnosis module.
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
How Has This Been Tested?
zed_slow_io: verify that override properties function as expected and a VDEV gets degraded.
zed_slow_io_many_vdevs: confirm that when multiple VDEVs are involved, no degrade occurs.
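For readers unfamiliar with how such tests provoke slow I/Os without failing hardware, here is a rough sketch using zinject's I/O delay injection; the pool and device names are placeholders and the real ZTS scripts differ in detail.

```sh
# Add an artificial 1-second delay (one lane) to every I/O on one leaf
# vdev, well above a lowered zio_slow_io_ms threshold, so that
# ereport.fs.zfs.delay events are generated for that vdev only.
zinject -d sdb -D 1000:1 tank

# Push some I/O through the pool and give ZED time to run a diagnosis.
dd if=/dev/urandom of=/tank/testfile bs=1M count=64 conv=fsync
sleep 30

# Remove all injection handlers when done.
zinject -c all
```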