Hierarchical bandwidth and operations rate limits. #16205
Open
pjd wants to merge 2 commits into openzfs:master from pjd:ratelimits
@@ -0,0 +1,72 @@
/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright (c) 2024 The FreeBSD Foundation
 *
 * This software was developed by Pawel Dawidek <[email protected]>
 * under sponsorship from the FreeBSD Foundation.
 */

#ifndef _SYS_VFS_RATELIMIT_H
#define _SYS_VFS_RATELIMIT_H

#include <sys/dmu_objset.h>

#ifdef __cplusplus
extern "C" {
#endif

struct vfs_ratelimit;

#define ZFS_RATELIMIT_BW_READ 0
#define ZFS_RATELIMIT_BW_WRITE 1
#define ZFS_RATELIMIT_BW_TOTAL 2
#define ZFS_RATELIMIT_OP_READ 3
#define ZFS_RATELIMIT_OP_WRITE 4
#define ZFS_RATELIMIT_OP_TOTAL 5
#define ZFS_RATELIMIT_FIRST ZFS_RATELIMIT_BW_READ
#define ZFS_RATELIMIT_LAST ZFS_RATELIMIT_OP_TOTAL
#define ZFS_RATELIMIT_NTYPES (ZFS_RATELIMIT_LAST + 1)

int vfs_ratelimit_prop_to_type(zfs_prop_t prop);
zfs_prop_t vfs_ratelimit_type_to_prop(int type);

struct vfs_ratelimit *vfs_ratelimit_alloc(const uint64_t *limits);
void vfs_ratelimit_free(struct vfs_ratelimit *rl);
struct vfs_ratelimit *vfs_ratelimit_set(struct vfs_ratelimit *rl,
    zfs_prop_t prop, uint64_t limit);

int vfs_ratelimit_data_read(objset_t *os, size_t blocksize, size_t bytes);
int vfs_ratelimit_data_write(objset_t *os, size_t blocksize, size_t bytes);
int vfs_ratelimit_data_copy(objset_t *srcos, objset_t *dstos, size_t blocksize,
    size_t bytes);
int vfs_ratelimit_metadata_read(objset_t *os);
int vfs_ratelimit_metadata_write(objset_t *os);

void vfs_ratelimit_data_read_spin(objset_t *os, size_t blocksize, size_t bytes);
void vfs_ratelimit_data_write_spin(objset_t *os, size_t blocksize, size_t bytes);

#ifdef __cplusplus
}
#endif

#endif /* _SYS_VFS_RATELIMIT_H */
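The data-path functions in this header take a `blocksize` and a byte count, and the man page text in this PR says that read and write operations are charged per block, never fewer than one. The sketch below illustrates one way that accounting could work. It assumes round-up division, which the excerpt does not confirm, and `ops_for_io` is a hypothetical helper that does not appear in the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the PR: number of operations to charge
 * for an I/O of `bytes` bytes against a dataset with the given block size.
 * Assumption: partial blocks are rounded up, and every request costs at
 * least one operation ("never less than 1" in the man page text).
 */
static uint64_t
ops_for_io(size_t blocksize, size_t bytes)
{
	uint64_t ops;

	assert(blocksize > 0);
	ops = ((uint64_t)bytes + blocksize - 1) / blocksize;
	return (ops == 0 ? 1 : ops);
}
```

Under these assumptions, a 1-byte read from a dataset with 128 KiB blocks costs one operation, and a read that crosses into a second block costs two.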
@@ -1184,6 +1184,117 @@ and the minimum is
.Sy 100000 .
This property may be changed with
.Nm zfs Cm change-key .
.It Sy limit_bw_read Ns = Ns Ar size Ns | Ns Sy none
.It Sy limit_bw_write Ns = Ns Ar size Ns | Ns Sy none
.It Sy limit_bw_total Ns = Ns Ar size Ns | Ns Sy none
Limits the read, write, or combined bandwidth, respectively, that a dataset and
its descendants can consume.
Limits are applied to file systems, volumes and their snapshots.
Bandwidth limits are in bytes per second.
.Pp
The configured limits are hierarchical, just like quotas; i.e., even if a
higher limit is configured on the child dataset, the parent's lower limit will
be enforced.
.Pp
The limits are applied at the VFS level, not at the disk level.
The dataset is charged for each operation even if no disk access is required
(e.g., due to caching, compression, deduplication, or NOP writes) or if the
operation will cause more traffic (due to the copies property, mirroring,
or RAIDZ).
.Pp
Read bandwidth consumption is based on:
.Bl -bullet
.It
read-like syscalls, e.g.,
.Xr aio_read 2 ,
.Xr copy_file_range 2 ,
.Xr pread 2 ,
.Xr preadv 2 ,
.Xr read 2 ,
.Xr readv 2 ,
.Xr sendfile 2
.It
syscalls like
.Xr getdents 2
and
.Xr getdirentries 2
.It
reading via mmapped files
.It
.Nm zfs Cm send
Review comment: Similar to above, maybe mention snapshot mounts?
.El
.Pp
Write bandwidth consumption is based on:
.Bl -bullet
.It
write-like syscalls, e.g.,
.Xr aio_write 2 ,
.Xr copy_file_range 2 ,
.Xr pwrite 2 ,
.Xr pwritev 2 ,
.Xr write 2 ,
.Xr writev 2
.It
writing via mmapped files
.It
.Nm zfs Cm receive
.El
.It Sy limit_op_read Ns = Ns Ar count Ns | Ns Sy none
.It Sy limit_op_write Ns = Ns Ar count Ns | Ns Sy none
.It Sy limit_op_total Ns = Ns Ar count Ns | Ns Sy none
Limits the read, write, or combined metadata operations, respectively, that a
dataset and its descendants can generate.
Limits are expressed in operations per second.
.Pp
Read operations consumption is based on:
.Bl -bullet
.It
read-like syscalls where the number of operations is equal to the number of
blocks being read (never less than 1)
.It
reading via mmapped files, where the number of operations is equal to the
number of pages being read (never less than 1)
.It
syscalls accessing metadata:
.Xr readlink 2 ,
.Xr stat 2
.El
.Pp
Write operations consumption is based on:
.Bl -bullet
.It
write-like syscalls where the number of operations is equal to the number of
blocks being written (never less than 1)
.It
writing via mmapped files, where the number of operations is equal to the
number of pages being written (never less than 1)
.It
syscalls modifying a directory's content:
.Xr bind 2 (UNIX-domain sockets) ,
.Xr link 2 ,
.Xr mkdir 2 ,
.Xr mkfifo 2 ,
.Xr mknod 2 ,
.Xr open 2 (file creation) ,
.Xr rename 2 ,
.Xr rmdir 2 ,
.Xr symlink 2 ,
.Xr unlink 2
.It
syscalls modifying metadata:
.Xr chflags 2 ,
.Xr chmod 2 ,
.Xr chown 2 ,
.Xr utimes 2
.It
updating the access time of a file when reading it
.El
.Pp
Just like
.Sy limit_bw
limits, the
.Sy limit_op
limits are also hierarchical and applied at the VFS level.
.It Sy exec Ns = Ns Sy on Ns | Ns Sy off
Controls whether processes can be executed from within this file system.
The default value is
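The man page text says the limits are hierarchical: a parent's lower limit still applies when a child is configured with a higher one. The sketch below illustrates only that observable effect, by taking the smallest limit set anywhere on the path from a dataset up to the top of the pool. It is not the PR's implementation, which may well charge each ancestor separately. `struct ds_node`, `effective_limit`, and the use of 0 for `none` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define	LIMIT_NONE	0	/* assumption: 0 stands for "none" */

/* Hypothetical, simplified dataset node; not the PR's data structure. */
struct ds_node {
	const struct ds_node *parent;	/* NULL for the top-level dataset */
	uint64_t limit;			/* bytes/s or ops/s; LIMIT_NONE if unset */
};

/*
 * Walk from the dataset to the root and return the smallest configured
 * limit, or LIMIT_NONE if no ancestor (including the dataset) sets one.
 */
static uint64_t
effective_limit(const struct ds_node *ds)
{
	uint64_t eff = LIMIT_NONE;

	for (; ds != NULL; ds = ds->parent) {
		if (ds->limit == LIMIT_NONE)
			continue;
		if (eff == LIMIT_NONE || ds->limit < eff)
			eff = ds->limit;
	}
	return (eff);
}
```

With a parent limited to 100 and a child set to 500, the child's effective limit in this model is 100, matching the quota-like behaviour the man page describes.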
This barely matters, but we don't have anything else really called `vfs_`. Should this be `zfs_ratelimit`, or `zfs_vfs_ratelimit`, or something like that? (and change the function names etc. to match, of course)
FOLLOWUP: Having just read the zvol changes, I'm even more convinced this should definitely just be `zfs_ratelimit`, not `vfs` or anything else. `vfs` is sort of a strange term to see in the zvol code.
Of course I wanted to use zfs_ratelimit, but it is already taken:) See module/zfs/zfs_ratelimit.c. I'm open to changing the name.
How about `zfs_ioratelimit` or `zfs_reqlimit` or `zfs_iolimit` or something else? The naming conflicts, with VFS on one side, ZVOL on another, and `zfs_ratelimit` on a third, are indeed annoying.