Is it possible to run FlexFlow only like a simulator? #1172

Open
OneMoreProblem opened this issue Oct 5, 2023 · 5 comments
Labels: question (Further information is requested)

OneMoreProblem commented Oct 5, 2023

I found machine_config_example. Can I use it with the simulator to imitate a real execution and predict execution time?

For example, I have a machine without a GPU. Is it possible to simulate the execution time on a machine with 16 GPUs under different parallelism strategies? How can I access simulation data such as memory usage and forward/backward execution times?

lockshaw self-assigned this Oct 6, 2023
lockshaw added the question label Oct 6, 2023

lockshaw (Collaborator) commented Oct 6, 2023

It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.
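
As background on what an analytical cost model means here: instead of timing each operator on a real GPU, the simulator would estimate its runtime from a machine description. Below is a minimal sketch of that idea; the class and field names are illustrative assumptions, not FlexFlow's interface or the schema of machine_config_example.

```python
from dataclasses import dataclass

# Hypothetical machine description; field names are illustrative only.
@dataclass
class Machine:
    peak_flops: float       # per-GPU peak compute throughput, FLOP/s
    mem_bandwidth: float    # per-GPU memory bandwidth, B/s
    interconnect_bw: float  # GPU-to-GPU link bandwidth, B/s

# Hypothetical summary of one operator's work.
@dataclass
class OpProfile:
    flops: float        # floating-point operations performed
    bytes_moved: float  # bytes read from and written to GPU memory
    comm_bytes: float   # bytes exchanged with other GPUs

def estimate_op_time(op: OpProfile, m: Machine) -> float:
    """Roofline-style estimate: the kernel is bound by either compute or
    memory traffic, and inter-GPU communication is added on top."""
    compute_s = op.flops / m.peak_flops
    memory_s = op.bytes_moved / m.mem_bandwidth
    comm_s = op.comm_bytes / m.interconnect_bw
    return max(compute_s, memory_s) + comm_s

# Example: a GEMM-like layer on an A100-class GPU (numbers are rough).
machine = Machine(peak_flops=19.5e12, mem_bandwidth=1.6e12, interconnect_bw=300e9)
gemm = OpProfile(flops=2 * 1024 * 1024 * 4096,
                 bytes_moved=(1024 * 4096 + 4096 * 1024 + 1024 * 1024) * 4,
                 comm_bytes=0)
print(f"estimated time: {estimate_op_time(gemm, machine) * 1e6:.1f} us")
```

The point of a model like this is that it needs no GPU at all, which is what would eventually remove the single-GPU requirement mentioned above.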

cnjsdfcy commented

Hi @lockshaw, in which branch is the cost model interface being worked on? Thanks!

Rashad-CSU commented

> It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.

Hi,

Currently, I am using `flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000` to run a model on one GPU. My machine has only one GPU. What command should I use to simulate it on several (e.g. 16) GPUs and create a tree structure?

Thanks,
Rashad
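
For context on that command: the `-ll:` options are flags of the underlying Legion runtime rather than FlexFlow-specific settings. My reading of them (not stated in this thread) is annotated below; the memory sizes are in MB.

```sh
# Rashad's command with the Legion runtime flags annotated:
#   -ll:gpu 1        number of GPUs to use per process
#   -ll:fsize 30000  GPU framebuffer memory to reserve, in MB
#   -ll:zsize 3000   zero-copy (pinned host) memory to reserve, in MB
flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py \
    -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000
```

These flags describe the machine the process actually runs on, so simply raising `-ll:gpu` is not by itself a way to simulate a 16-GPU machine; see the reply below.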

lockshaw (Collaborator) commented

> Hi @lockshaw, in which branch is the cost model interface being worked on? Thanks!

@cnjsdfcy #622

lockshaw (Collaborator) commented

> > It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.
>
> Hi,
>
> Currently, I am using `flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000` to run a model on one GPU. My machine has only one GPU. What command should I use to simulate it on several (e.g. 16) GPUs and create a tree structure?
>
> Thanks, Rashad

AFAIK there isn't a convenient way to do this on #475, which I'm guessing you're using (unless @goliaro knows otherwise). We're working on making it easier to use the simulator as a separate component on #622, but it's definitely not ready for public usage yet. If you're interested in this I recommend following that branch as we're hoping to have it merged in the next couple months.
