Exit worker threads early when there are no tasks after a while #377

Open
hai-rise opened this issue Oct 9, 2024 · 11 comments

@hai-rise
Contributor

hai-rise commented Oct 9, 2024

No description provided.

@hai-rise
Contributor Author

hai-rise commented Oct 9, 2024

@byhashdong Here it is, sir!

@byhashdong
Contributor

byhashdong commented Oct 14, 2024

  • A thread pool (of parked threads) may work better. Early experiments with rayon were very poor, so we may need to roll a specialized implementation.
    • After trying rayon, I got the same result: performance is worse than with the original thread pool implementation.
  • We should start post-processing as soon as the scheduler flags completion, instead of waiting for the scoped threads to join (very sad tail-end latency).
    • In a quick test, the gap between all threads completing and the scope completing was normally less than 100us.
  • We should reconsider exiting idle threads early, e.g. when a thread has found no new work after 100 iterations.
    • I'm confused: given the while task.is_some() { loop in the code, a thread should already exit when there is no new work to do. @hai-rise

@hai-rise
Contributor Author

after trying rayon, get same result, the performance is worse than original thread pool implementation

Hopefully one day we understand why!

having a little test, found time gap between all threads completed and scoped completed normally less than 100us.

Ah yes, the scope should end immediately when all threads join. The original idea is more about starting post-processing as soon as the Scheduler announces that execution is done, even before all threads have finished joining!

confused, while task.is_some() { as the code, thread should exit if there is no new work to do.

While looking for a Scheduler::next_task, we currently loop until execution is done or aborted:

pevm/src/scheduler.rs

Lines 118 to 119 in bed6f22

thread::yield_now();
continue;

We can add a counter here to return a None to join a thread early when it fails to find a job after X iterations.
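A minimal sketch of that idea, not the actual pevm code: the helper name next_task_bounded, the try_find closure, and the max_empty_polls limit are all hypothetical stand-ins, and the real loop would also keep its done/aborted checks.

```rust
use std::thread;

/// Polls `try_find` until it returns a task, and gives up after
/// `max_empty_polls` empty polls so the worker thread can join early.
/// `try_find` stands in for one pass of the scheduler's search loop
/// (hypothetical; the real `Scheduler::next_task` also checks for
/// completion and abortion).
fn next_task_bounded<T>(
    mut try_find: impl FnMut() -> Option<T>,
    max_empty_polls: usize,
) -> Option<T> {
    let mut empty_polls = 0;
    loop {
        if let Some(task) = try_find() {
            return Some(task);
        }
        empty_polls += 1;
        if empty_polls >= max_empty_polls {
            // No job found after X iterations: return None to end the worker.
            return None;
        }
        // Same back-off as the current loop.
        thread::yield_now();
    }
}
```

The limit is a plain parameter here because the right value is exactly what still needs to be measured.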

@byhashdong
Contributor

Do you mean we still keep thread::yield_now() but add a counter to make sure a thread does not live too long without any task to execute?

@hai-rise
Contributor Author

hai-rise commented Oct 16, 2024

@byhashdong Yes, something like that.

FYI, yield_now was better than spin_loop and much better than not doing either (aggressive loop that may cause congestion at the atomics within) on my x86, but it didn't yield much difference on ARM. Hopefully, we will know the optimal strategy for each input block, and how to play nice with the OS scheduler!
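For reference, the three waiting strategies compared above can be sketched side by side. This is only an illustration: WaitStrategy and wait_until_set are made-up names, and the flag stands in for whatever condition a worker is waiting on.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// The three ways to wait discussed above (hypothetical enum for illustration).
#[derive(Clone, Copy, Debug)]
enum WaitStrategy {
    /// Hand the rest of the time slice back to the OS scheduler.
    Yield,
    /// Emit a CPU spin-wait hint and stay on the core.
    Spin,
    /// Re-check immediately, hammering the atomic.
    Busy,
}

/// Waits until `flag` is set and returns how many times it had to back off.
fn wait_until_set(flag: &AtomicBool, strategy: WaitStrategy) -> u64 {
    let mut polls = 0;
    while !flag.load(Ordering::Acquire) {
        polls += 1;
        match strategy {
            WaitStrategy::Yield => thread::yield_now(),
            WaitStrategy::Spin => hint::spin_loop(),
            WaitStrategy::Busy => {}
        }
    }
    polls
}
```

Swapping the strategy behind one enum like this makes it easy to benchmark each variant on x86 and ARM without touching the rest of the loop.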

@byhashdong
Contributor

byhashdong commented Oct 17, 2024

I added some stats code and ran cargo bench --bench mainnet. The results are below:

  • total 2153984 threads launched
  • distribution of cycles as below:
    sec          cycles of yields
    < 50         616820
    50 ~ 100     99462
    100 ~ 200    217636
    200 ~ 500    435890
    > 500        593066


Test code is attached as below:

diff --git a/src/pevm.rs b/src/pevm.rs
index a42f46c..9557aee 100644
--- a/src/pevm.rs
+++ b/src/pevm.rs
@@ -172,7 +172,8 @@ impl Pevm {
         thread::scope(|scope| {
             for _ in 0..concurrency_level.into() {
                 scope.spawn(|| {
-                    let mut task = scheduler.next_task();
+                    let mut task_none_counter = 0;
+                    let mut task = scheduler.next_task(&mut task_none_counter);
                     while task.is_some() {
                         task = match task.unwrap() {
                             Task::Execution(tx_version) => {
@@ -194,9 +195,15 @@ impl Pevm {
                         }

                         if task.is_none() {
-                            task = scheduler.next_task();
+                            task = scheduler.next_task(&mut task_none_counter);
                         }
                     }
+
+                    println!(
+                        "Thread {:?} task_none_counter: {}",
+                        thread::current().id(),
+                        task_none_counter
+                    );
                 });
             }
         });
diff --git a/src/scheduler.rs b/src/scheduler.rs
index 6c0e764..e5b3136 100644
--- a/src/scheduler.rs
+++ b/src/scheduler.rs
@@ -105,7 +105,7 @@ impl Scheduler {
         None
     }

-    pub(crate) fn next_task(&self) -> Option<Task> {
+    pub(crate) fn next_task(&self, task_none_counter: &mut usize) -> Option<Task> {
         while !self.aborted.load(Ordering::Acquire) {
             let execution_idx = self.execution_idx.load(Ordering::Acquire);
             let validation_idx = self.validation_idx.load(Ordering::Acquire);
@@ -115,6 +115,8 @@ impl Scheduler {
                 {
                     break;
                 }
+
+                *task_none_counter += 1;
                 thread::yield_now();
                 continue;
             }
@@ -161,6 +163,7 @@ impl Scheduler {
                 return Some(Task::Execution(tx_version));
             }
         }
+
         None
     }

@byhashdong
Contributor

Will do benchmark tests to determine the counter value!

@byhashdong
Contributor

byhashdong commented Oct 18, 2024

I ran the benchmarks with several counter values, and 500 looks like the best one. @hai-rise

counter    improved    not changed    regressed
20         82          9              79
100        34          108            28
200        12          146            12
300        6           152            12
400        15          150            5
500        80          74             16
600        20          132            18

🟢: improved
🟡: not changed
🔴: regressed

block 20 100 200 300 400 500 600
46147 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🔴 p: 🟡
116525 s: 🟢 p: 🟢 s: 🔴 p: 🔴 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
930196 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
1150000 s: 🔴 p: 🔴 s: 🔴 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🔴 p: 🟡
1796867 s: 🟢 p: 🟢 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🔴 p: 🔴 s: 🟡 p: 🟡
2179522 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
2462997 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴
2641321 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
2674998 s: 🔴 p: 🔴 s: 🔴 p: 🔴 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🔴 p: 🔴
2675000 s: 🔴 p: 🔴 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🔴 p: 🔴
2688148 s: 🔴 p: 🔴 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
3356896 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
4330482 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
4369999 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🟡 p: 🔴
4370000 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟢 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
4864590 s: 🔴 p: 🔴 s: 🟢 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
5283152 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
5526571 s: 🔴 p: 🔴 s: 🔴 p: 🟢 s: 🔴 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
5891667 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🔴 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟢 p: 🟡
6137495 s: 🟡 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡
6196166 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🔴
7279999 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🔴
7280000 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
8038679 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
8889776 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
9068998 s: 🟢 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
9069000 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡
10760440 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
11114732 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🔴 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
11743952 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
11814555 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
12047794 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
12159808 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
12243999 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
12244000 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
12300570 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
12459406 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
12520364 s: 🟢 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🔴
12522062 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🔴
12964999 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
12965000 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🔴
13217637 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
13287210 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
14029313 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
14334629 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
14383540 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
14396881 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
14545870 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡
15199017 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
15274915 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡
15537393 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
15537394 s: 🔴 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🔴
15538827 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
15752489 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
16146267 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢
16257471 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
17034869 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟢
17034870 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
17666333 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
18085863 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
18426253 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🔴
18988207 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
19426586 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟢
19426587 s: 🟢 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
19444337 s: 🟢 p: 🟢 s: 🟢 p: 🔴 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡
19469101 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
19498855 s: 🔴 p: 🔴 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
19505152 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡
19606599 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢
19638737 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
19716145 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢
19737292 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🟡
19807137 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
19860366 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
19910734 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟡 s: 🔴 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟡
19917570 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟡
19923400 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
19929064 s: 🟢 p: 🟢 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
19932148 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
19932703 s: 🟢 p: 🟢 s: 🟡 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🔴 s: 🟡 p: 🔴
19932810 s: 🟢 p: 🟢 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟢 s: 🟡 p: 🟢
19933122 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡
19933597 s: 🔴 p: 🔴 s: 🟡 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🟢
19933612 s: 🔴 p: 🔴 s: 🟢 p: 🟢 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🟡 p: 🔴
19934116 s: 🔴 p: 🔴 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟡 p: 🟡 s: 🟢 p: 🟢 s: 🔴 p: 🟡

byhashdong added a commit to byhashdong/pevm that referenced this issue Oct 18, 2024
hai-rise changed the title from "Reuse a rayon thread pool in Pevm between runs" to "Exit worker threads early when there are no tasks after a while" on Oct 18, 2024
@hai-rise
Contributor Author

hai-rise commented Oct 19, 2024

@byhashdong Is s sequential? If so it should be unaffected by the counter, right?

I've also played with a few numbers and cannot find a sweet spot. There are always strong wins (no late dependencies so exiting workers early removes tail latency) and losses (many dependencies so when we exit threads early we lack workers when we reset the execution index way back early in the block).

I think we should bring more variables into the equation. For instance, we could count only empty tasks and exit threads when num_validated + concurrency_level > block_size - min_validation_idx 🤔.
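Written out literally, that condition might look like the sketch below. The function name should_exit_early is hypothetical, and the parameters simply mirror the names in the formula; how each value would be read from the Scheduler is left open.

```rust
/// Literal transcription of the tentative exit condition:
/// num_validated + concurrency_level > block_size - min_validation_idx.
/// Saturating arithmetic is an added assumption so that the subtraction
/// cannot underflow if min_validation_idx ever exceeds block_size.
fn should_exit_early(
    num_validated: usize,
    concurrency_level: usize,
    block_size: usize,
    min_validation_idx: usize,
) -> bool {
    let remaining = block_size.saturating_sub(min_validation_idx);
    num_validated.saturating_add(concurrency_level) > remaining
}
```

For example, with a block of 100 transactions, min_validation_idx at 10, and 8 workers, the condition is false with 10 validated transactions (18 > 90 does not hold) and true with 90 (98 > 90).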

@byhashdong
Contributor

Yes, s means sequential. It definitely shouldn't be affected by the counter. I think the sequential volatility may have been introduced by my dev machine's workload!

Sad to hear there were no positive results on your side. I think it may be related to whether the counter lives inside or outside next_task.
On my side, the counter lives outside next_task, so it counts all yields over the thread's lifetime. A counter inside next_task only counts the yields within that single call. I think that is the main difference. Maybe you could try putting the counter outside next_task on your side.
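To make the difference concrete, here is a small simulation, not pevm code: CounterScope and tasks_before_exit are hypothetical names, and polls is a made-up sequence of next_task outcomes (true = found a task, false = one empty yield).

```rust
/// Where the idle counter lives (hypothetical enum for illustration).
#[derive(Clone, Copy, PartialEq)]
enum CounterScope {
    /// Counter inside next_task: starts from zero on every call.
    PerCall,
    /// Counter outside next_task: accumulates over the thread's lifetime.
    PerThread,
}

/// Returns how many tasks a worker completes before the idle counter
/// reaches `limit` and the worker exits.
fn tasks_before_exit(polls: &[bool], limit: usize, scope: CounterScope) -> usize {
    let mut idle = 0;
    let mut tasks = 0;
    for &found in polls {
        if found {
            tasks += 1;
            if scope == CounterScope::PerCall {
                // The next call to next_task starts with a fresh counter.
                idle = 0;
            }
        } else {
            idle += 1;
            if idle >= limit {
                break;
            }
        }
    }
    tasks
}
```

With polls = [false, false, true, false, false, true] and a limit of 3, the per-call counter lets the worker finish both tasks, while the per-thread counter exits after the first one, so the same limit means something quite different in the two setups.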

@hai-rise
Contributor Author

@byhashdong For micro-benchmarks like this it's best to close most processes (especially large ones like browsers and code editors with rust-analyzer) to minimize noise. And good point on per-task counter vs. all-task counter. I'll try different formulas next week 🙏.
