Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large ioctl response resize logic is incorrect. #624

Closed
FelixMcFelix opened this issue Nov 29, 2024 · 0 comments · Fixed by #621
Closed

Large ioctl response resize logic is incorrect. #624

FelixMcFelix opened this issue Nov 29, 2024 · 0 comments · Fixed by #621
Assignees
Labels

Comments

@FelixMcFelix
Copy link
Collaborator

FelixMcFelix commented Nov 29, 2024

While stress-testing as part of the quest to reproduce #618 / #621, I have a few ports which hold vast numbers of firewall allows, UFT flows, and TCP flows. Accordingly I'm often getting messages like:

kyle@farme:~/gits/omicron$ pfexec opteadm dump-tcp-flows -p opte6
Error: failed to get response for command DumpTcpFlows in 3 attempts
kyle@farme:~/gits/omicron$ pfexec opteadm dump-layer firewall -p opte6
Error: failed to get response for command DumpLayer in 3 attempts

The code to handle this case is incorrectly performing the resize.

// The command ran successfully, but there is not enough space to
// copyout(9F) the response. In this case bump up the size of the
// response buffer and retry.
if raw_err == libc::ENOBUFS {
assert!(rioctl.resp_len_actual != 0);
assert!(rioctl.resp_len_actual > resp_buf.capacity());
// Make room for the size the kernel claims to need
resp_buf.reserve(rioctl.resp_len_actual - resp_buf.capacity());
rioctl.resp_bytes = resp_buf.as_mut_ptr();
rioctl.resp_len = resp_buf.capacity();
rioctl.resp_len_actual = 0;
continue;
}

If we throw in some debug logging, eprintln!("OS wants to return {}, I have {}.", rioctl.resp_len_actual, resp_buf.capacity());, we see that

OS wants to return 257396, I have 16384.
OS wants to return 257396, I have 241012.
OS wants to return 257396, I have 241012.

resp_buf.len() is zero, so the math is permanently off. Another piece is that on queries such as TCP flow state, response size can be a bit of a moving target as we rerun the query on a failure:

kyle@farme:~/gits/omicron$ pfexec opteadm dump-tcp-flows -p opte6
OS wants to return 58542, I have 16384.
OS wants to return 58546, I have 58542.

We'll want some extra headroom on top of the xde-given response length.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant