-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NAS-132166 / Address memory consumption related to AIO reads #423
Conversation
Add backpressure mechanism for AIO reads in io_uring module. In some edge cases where storage backend can't keep up with client reads, io_uring / samba will allow queueing up reads beyond what is reasonable leading to OOM errors in some extreme edge cases. This commit keeps a counter of outstanding read requests per-TCON and switches to pread(2) once the queue depth limit is reached. This effectively applies backpressure to client and prevents excessive memory consumption. A total count of synchronous reads performed is logged when TCON is disconnected if log level for the VFS module is sufficiently high (3 or higher). Fix memory leak related to AIO memory pool for some failure scenarios for AIO reads. The memory pool for AIO buffers is allocated under a memory context that persists for the duration of the SMB session. When chunks of pool are assigned out for a particular AIO request, a separate small allocation (io_link) is performed with a pointer to the data buffer and a talloc destructor set such that when the io_link is freed the buffer is returned to the pool. Initially these were performed as two separate steps (assigning out buffer from pool and setting up io_link) which allowed for situations to arise in which the request could error out before the io_link was set resulting in the buffer never being returned to the memory pool. This commit revises the workflow and API for the memory pool such that the chunk is always assinged with an associated io_link allocated under the memory context of the initial SMB2 read request. Once the request is successful and we're getting ready to respond to client then we reparent the io_link to the context of the response. This ensures that the buffer is always returned when a limited-life talloc chunk is freed.
41ec055
to
e0a3e15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
// free initial data blob | ||
TALLOC_FREE(state->io_lnk); | ||
state->out_data = data_blob_null; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So upon schedule_smb2_aio_read
failure, we free the buffer. However, on line 597 below ("Ok, read into memory. Allocate the out buffer.") we reallocate one.
Is there any need? / Would it be more efficient to just free on all the return paths between here and there?
If this is being done to force schedule_smb2_sendfile_read
down a particular path (when it sees data_blob_null) then maybe add a comment to that effect.
Add backpressure mechanism for AIO reads in io_uring module. In some edge cases where storage backend can't keep up with client reads, io_uring / samba will allow queueing up reads beyond what is reasonable leading to OOM errors in some extreme edge cases. This commit keeps a counter of outstanding read requests per-TCON and switches to pread(2) once the queue depth limit is reached. This effectively applies backpressure to client and prevents excessive memory consumption. A total count of synchronous reads performed is logged when TCON is disconnected if log level for the VFS module is sufficiently high (3 or higher). Fix memory leak related to AIO memory pool for some failure scenarios for AIO reads. The memory pool for AIO buffers is allocated under a memory context that persists for the duration of the SMB session. When chunks of pool are assigned out for a particular AIO request, a separate small allocation (io_link) is performed with a pointer to the data buffer and a talloc destructor set such that when the io_link is freed the buffer is returned to the pool. Initially these were performed as two separate steps (assigning out buffer from pool and setting up io_link) which allowed for situations to arise in which the request could error out before the io_link was set resulting in the buffer never being returned to the memory pool. This commit revises the workflow and API for the memory pool such that the chunk is always assinged with an associated io_link allocated under the memory context of the initial SMB2 read request. Once the request is successful and we're getting ready to respond to client then we reparent the io_link to the context of the response. This ensures that the buffer is always returned when a limited-life talloc chunk is freed.
Add backpressure mechanism for AIO reads in io_uring module.
In some edge cases where storage backend can't keep up with client reads, io_uring / samba will allow queueing up reads beyond what is reasonable leading to OOM errors in some extreme edge cases.
This commit keeps a counter of outstanding read requests per-TCON and switches to pread(2) once the queue depth limit is reached. This effectively applies backpressure to client and prevents excessive memory consumption. A total count of synchronous reads performed is logged when TCON is disconnected if log level for the VFS module is sufficiently high (3 or higher).
Fix memory leak related to AIO memory pool for some failure scenarios for AIO reads.
The memory pool for AIO buffers is allocated under a memory context that persists for the duration of the SMB session. When chunks of pool are assigned out for a particular AIO request, a separate small allocation (io_link) is performed with a pointer to the data buffer and a talloc destructor set such that when the io_link is freed the buffer is returned to the pool.
Initially these were performed as two separate steps (assigning out buffer from pool and setting up io_link) which allowed for situations to arise in which the request could error out before the io_link was set resulting in the buffer never being returned to the memory pool.
This commit revises the workflow and API for the memory pool such that the chunk is always assinged with an associated io_link allocated under the memory context of the initial SMB2 read request. Once the request is successful and we're getting ready to respond to client then we reparent the io_link to the context of the response. This ensures that the buffer is always returned when a limited-life talloc chunk is freed.