(Better to do this after #42 and #39 are merged, which are higher priority and a bit linked.)
A while ago I tried to launch the simulator at large scale on Google Cloud (GCP). I had to fix a few things in Simulator.run to make it work, in particular the use of jax.lax.fori_loop and the neighbor rebuilding. After that it worked well (I managed to run a simulation with 30K entities, which was cool).
Before we lost access to GCP, I saved the code I had there locally on my laptop, intending to commit it here. I have just looked at it, but I can't actually see any relevant change in the diff (no major difference from what is currently in the repo). So it is quite possible that this already works with the current code.
@corentinlger Can you please try it on JZ? (see steps to reproduce below).
Steps to Reproduce
On JZ, make a script to launch a simulation with:
A large number of agents and objects (I succeeded with 30K entities on GCP, but you can try with fewer, e.g. 10K, if the GPU you get does not have enough memory)
Use these parameters: box_size=1000., neighbor_radius=10., use_fori_loop=True, num_steps_lax=1000, freq=-1
Execute Simulator.run (no server) in non-threaded mode with a single timestep (so that it only runs within the fori_loop call, the number of timesteps being set by num_steps_lax). Use a large GPU (e.g. a V100 or A100) and make sure the simulation runs on it (e.g. by checking jax.devices)
Please report here if there are issues. If it works, it would be interesting to know how many entities you managed to simulate, for how many timesteps, and with how much GPU memory.
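To make the device check and the "everything inside one fori_loop call" idea concrete, here is a minimal standalone JAX sketch. It does not use the Simulator class or its API: the constants only mirror the parameter values listed above, and the step function is a hypothetical toy update (entities drifting in a periodic box), assumed purely for illustration.

```python
import jax
import jax.numpy as jnp
from jax import lax

# Hypothetical stand-ins mirroring the values above; NOT the Simulator's API.
BOX_SIZE = 1000.0
NUM_STEPS_LAX = 1000
N_ENTITIES = 10_000


def report_devices():
    """Return the platform of each device JAX can see (e.g. 'gpu' or 'cpu')."""
    return [d.platform for d in jax.devices()]


def step(_, state):
    """Toy update: move every entity and wrap it inside the periodic box."""
    pos, vel = state
    pos = jnp.mod(pos + vel, BOX_SIZE)
    return pos, vel


@jax.jit
def run(pos, vel):
    # All NUM_STEPS_LAX iterations happen inside a single compiled fori_loop.
    return lax.fori_loop(0, NUM_STEPS_LAX, step, (pos, vel))


if __name__ == "__main__":
    # If this prints only 'cpu', the run is not using the GPU.
    print("devices:", report_devices())
    key_pos, key_vel = jax.random.split(jax.random.PRNGKey(0))
    pos = jax.random.uniform(key_pos, (N_ENTITIES, 2), maxval=BOX_SIZE)
    vel = jax.random.normal(key_vel, (N_ENTITIES, 2))
    final_pos, _ = run(pos, vel)
    final_pos.block_until_ready()  # wait for the device to finish
    print("final positions shape:", final_pos.shape)
```

Increasing N_ENTITIES in a sketch like this only exercises array memory; the real test still needs the actual Simulator.run with neighbor rebuilding, as described in the steps above.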
Thanks!