Full Scale Performance: Sharrow On #12
First ran sharrow compile with the following settings:
Run completed in 76 minutes. Then ran in production mode
Run completed in 7.7 hours with a memory peak at about 163 GB in trip destination. Followed by multiprocessing
Run completed in 110 minutes (1.8 hours).
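For a rough sense of scale, the two production-mode timings above can be compared directly. This is a minimal sketch using only the figures reported in this comment; the variable names are illustrative, and since the number of processes used is not stated here, no per-process efficiency is computed.

```python
# Wall-clock times as reported above (production mode, after the compile run).
single_process_hours = 7.7     # single process
multiprocess_minutes = 110.0   # multiprocessing

# Put both runs on the same unit before comparing.
single_process_minutes = single_process_hours * 60.0
speedup = single_process_minutes / multiprocess_minutes

print(f"single process:  {single_process_minutes:.0f} min")
print(f"multiprocessing: {multiprocess_minutes:.0f} min")
print(f"speedup:         {speedup:.1f}x")
```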
Ran with 100% of households and sharrow on, single process. Run completed in 1090.3 minutes (18.2 hours), much longer than the 7.7 hours posted above. This run used PR #867 at commit c9d4205. Timing statements comparing the earlier run to this one show large differences, mainly in the destination models:
Will try again with the main branch of ActivitySim instead of PR #867 to see if that makes a difference.
Ran using an older environment that has the current version of ActivitySim (main@bd48d3db) but sharrow v2.8.2 instead of the previous run's main@8d63a66 (> v2.9.1). Numba was also older: 0.56.4 compared to 0.59.1. The results were nearly identical: run time was 1080.3 minutes. One difference between this current set of runs and the 7.7 hour run above is the server. The 7.7 hour run was done on SANDAG's 1 TB RAM, 40-core machine. These were done on RSG's 500 GB RAM, 24-core machine.
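To put the single-process numbers side by side, here is a small sketch that converts the reported times to hours and expresses each as a multiple of the earlier 7.7 hour run. The labels and variable names are my own shorthand. Note the runs were on different servers, so the ratios are not a like-for-like comparison.

```python
MINUTES_PER_HOUR = 60.0

# Single-process run times reported in this thread, in minutes.
runs_minutes = {
    "SANDAG server, earlier run": 7.7 * MINUTES_PER_HOUR,
    "RSG server, PR #867 (c9d4205)": 1090.3,
    "RSG server, sharrow v2.8.2 environment": 1080.3,
}

baseline = runs_minutes["SANDAG server, earlier run"]
ratios = {label: minutes / baseline for label, minutes in runs_minutes.items()}

for label, minutes in runs_minutes.items():
    hours = minutes / MINUTES_PER_HOUR
    print(f"{label}: {hours:.1f} h ({ratios[label]:.2f}x the earlier run)")

# Relative difference between the two RSG runs (newer vs. older environment).
rsg_new = runs_minutes["RSG server, PR #867 (c9d4205)"]
rsg_old_env = runs_minutes["RSG server, sharrow v2.8.2 environment"]
relative_change = (rsg_new - rsg_old_env) / rsg_new
print(f"difference between RSG runs: {relative_change:.1%}")
```

The two RSG runs differ by under 1%, which fits the observation that swapping the sharrow and Numba versions did not explain the slowdown.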
Sharrow, single process, MTC extended model ran in 10.7 hours on WSP's 512 GB RAM AMD server, using the latest versions of everything as of June 26. Memory peak was 145 GB in trip destination. ActivitySim: pr/867@c9d4205
Per the discussion at ActivitySim/sandag-abm3-example#6 (comment), ran many runs with different NUMBA multithreading settings (i.e. changing only ...). All runs were performed on the same RSG machine with 24 threads. Some observations:
Running the same tests as above and on the same machine, but using multiprocessing instead of multi-threading: Comments:
This is the issue to report on memory usage and runtime performance when using sharrow...