-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: set THP_DISABLE=true in shim, and restore it before starting runc #195
Conversation
I think this problem is related to specific application scenarios.
|
Because some processes in the environment do not show the use of huge pages, but need to use huge pages to improve performance. |
30351c5
to
0e4dae5
Compare
crates/runc/src/lib.rs
Outdated
@@ -366,6 +366,7 @@ pub trait Spawner: Debug { | |||
/// and some other utilities. | |||
#[cfg(feature = "async")] | |||
impl Runc { | |||
#[cfg(not(target_os = "linux"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have runc on non Linux environments. But also can apply same suggestion as above to avoid func duplication.
5ac6db6
to
80ee51f
Compare
Thanks for your suggestions. I have changed them. |
654bf0f
to
7c6d2dd
Compare
crates/runc/src/lib.rs
Outdated
@@ -368,6 +368,22 @@ pub trait Spawner: Debug { | |||
impl Runc { | |||
async fn launch(&self, cmd: Command, combined_output: bool) -> Result<Response> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async fn launch(&self, cmd: Command, combined_output: bool) -> Result<Response> {
There are some problems here, the cmd variable needs to be mutable, but if on a non-Linux environment, cmd cannot be mutable. If you don't use a separate function name, the code looks ugly because you need to create a new mut cmd and use a #[cfg] block. Is there a better way?
let mut vars: Vec<(&str, &str)> = Vec::new(); | ||
#[cfg(target_os = "linux")] | ||
let mut thp_disabled = String::new(); | ||
#[cfg(not(target_os = "linux"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let disabled = if cfg!(target_os = "linux") {
// Query whether THP is disabled.
if let Ok(x) = prctl::get_thp_disable() {
let _ = prctl::set_thp_disable(true);
true
} else {
false
}
} else {
false
};
let vars = vec![("THP_DISABLED", &disabled.to_string())]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it sounds like #[cfg(target_os = "linux")]
is needed, it still might be clearer to do something like:
let thp_disabled = false;
#[cfg(target_os = "linux")]
let thp_disabled = match prctl::get_thp_disable() {
Ok(x) => {
let _ = prctl::set_thp_disable(true);
true
}
Err(_) => false,
};
let vars: Vec<(&str, &str)> = vec![("THP_DISABLED", &disabled.to_string())];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we need the return value of the prctl::get_thp_disable
function and assign this return value to the variable thp_disable. That is, the x
in Ok(x)
is needed, besides, x
may be true or false. We should not just judge whether it is Ok()
or Err()
. This value means the state before setting the set_thp_disable, and will be used to set_thp_disable
before starting runc later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we need the return value of the
prctl::get_thp_disable
function and assign this return value to the variable thp_disable. That is, thex
inOk(x)
is needed, besides,x
may be true or false. We should not just judge whether it isOk()
orErr()
.
get_thp_disable
returns a Result<bool, i32>
so we could still return x
instead of converting to string here. using the bool
makes it clearer in the code than x.tostring
and string::new()
which is semantically unclear what string:new()
is in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are actually 3 states here, true
, false
and error
. When the get_thp_disable
function returns error, it means that the thp parameter cannot be obtained, if the set_thp_disable
is executed at this time, it will cause runc to be unable to recover the value of thp_disable, therefore, thp_disabled
parameter needs 3 states, true
, false
, error
, due to the variable life cycle, so string is used here to return.
crates/runc/src/lib.rs
Outdated
@@ -368,6 +368,22 @@ pub trait Spawner: Debug { | |||
impl Runc { | |||
async fn launch(&self, cmd: Command, combined_output: bool) -> Result<Response> { | |||
debug!("Execute command {:?}", cmd); | |||
#[cfg(target_os = "linux")] | |||
let mut cmd = cmd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since runc
is Linux only, we can rewrite this to something like:
async fn launch(&self, mut cmd: Command, combined_output: bool) -> Result<Response> {
debug!("Execute command {:?}", cmd);
if let Ok(thp) = std::env::var("THP_DISABLED") {
if let Ok(thp_disabled) = thp.parse::<bool>() {
unsafe {
cmd.pre_exec(move || {
#[cfg(target_os = "linux")]
if let Err(e) = prctl::set_thp_disable(thp_disabled) {
log::debug!("set_thp_disable err: {}", e);
};
Ok(())
});
}
}
}
Thanks for your suggestions. |
3e833d4
to
e42fb05
Compare
let _ = prctl::set_thp_disable(true); | ||
x.to_string() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible that you get Ok(false)
and the set it to true and this will return false? The does weren't clear (https://docs.rs/prctl/latest/prctl/fn.get_thp_disable.html)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our goal is to set thp disabled = true
on the shim side and then restore thp disabled before starting runc.
So we only need to focus on the return value of the function get_thp_disabled
, which is Result<bool, i32>
.
The return value of the function set_thp_disabled
is Result<(), i32>
, we don't care if the setting is successful, because even if the setting failed, we should not exit the shim process, therefore, there is no need to pay attention to the set_thp_disabled
function's return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of the function
set_thp_disabled
isResult<(), i32>
, we don't care if the setting is successful, because even if the setting failed, we should not exit the shim process, therefore, there is no need to pay attention to theset_thp_disabled
function's return value.
could you add a comment in the code that indicates this? I worry about someone doing maitaince long term and wondering why return value and failure case is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion, done.
78f505e
to
50b1963
Compare
@zzyyzte thanks for your patience. I guess I might be missing something but I don't see how the current configuration configurable? it seems to always try to |
Yes, we need to explicitly set |
Oh, read @mxpv could you clarify, otherwise looks good. |
@zzzzzzzzzy9 could you pls rebase your PR to pick up latest CI changes? |
Done. @mxpv |
@mxpv May need to merge again? |
@zzzzzzzzzy9 would you please remove that merge commit by rebase? It's conflict right now. If you don't mind, I can help to handle this. |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #195 +/- ##
==========================================
- Coverage 37.98% 37.89% -0.10%
==========================================
Files 55 55
Lines 5060 5083 +23
==========================================
+ Hits 1922 1926 +4
- Misses 3138 3157 +19
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
If /sys/kernel/mm/transparent_hugepage/enabled=always, the shim process will use huge pages, which will consume a lot of memory. Just like this: ps -efo pid,rss,comm | grep shim PID RSS COMMAND 2614 7464 containerd-shim I don't think shim needs to use huge pages, and if we turn off the huge pages option, we can save a lot of memory resources. After we set THP_DISABLE=true: ps -efo pid,comm,rss PID COMMAND RSS 1629841 containerd-shim 5648 containerd | |--shim1 --start | |--shim2 (this shim will on host) | |--runc create (when containerd send create request by ttrpc) | |--runc init (this is the pid 1 in container) we should set thp_disabled=1 in shim1 --start, because if we set this in shim 2, the huge page has been setted while func main() running, we set thp_disabled cannot change the setted huge pages. So We need to set thp_disabled=1 in shim1 so that shim2 inherits the settings of the parent process shim1, and shim2 has closed the hugepage when it starts. For runc processes, we need to set thp_disabled='before' in shim2 after fork() and before execve(). So we use cmd.pre_exec to do this.
If /sys/kernel/mm/transparent_hugepage/enabled=always, the shim process will use huge pages, which will consume a lot of memory.
Just like this:
I don't think shim needs to use huge pages, and if we turn off the huge pages option, we can save a lot of memory resources.
After we set THP_DISABLE=true:
ps -efo pid,rss,comm PID RSS COMMAND 2470 5444 containerd-shim cat /proc/2470/smaps | grep -i hugepages AnonHugePages: 0 kB ...