Is it possible to run FlexFlow only like a simulator? #1172

Open
OneMoreProblem opened this issue Oct 5, 2023 · 5 comments
Labels: question (Further information is requested)

OneMoreProblem commented Oct 5, 2023

I found machine_config_example. Can I use it with the simulator to imitate a real execution and predict execution time?

For example, I have a machine without a GPU. Is it possible to simulate the execution time on a machine with 16 GPUs under different parallelism strategies? How can I access simulation data such as memory usage and forward/backward execution times?

lockshaw self-assigned this Oct 6, 2023
lockshaw added the question label Oct 6, 2023

lockshaw (Collaborator) commented Oct 6, 2023

It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.
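
As background on what an analytical cost model means here: instead of timing each operator on a real GPU, the simulator would estimate its runtime from a machine description. Below is a minimal sketch of that idea; the class and field names are illustrative assumptions, not FlexFlow's interface or the schema of machine_config_example.

```python
from dataclasses import dataclass

# Hypothetical machine description; field names are illustrative only.
@dataclass
class Machine:
    peak_flops: float       # per-GPU peak compute throughput, FLOP/s
    mem_bandwidth: float    # per-GPU memory bandwidth, B/s
    interconnect_bw: float  # GPU-to-GPU link bandwidth, B/s

# Hypothetical summary of one operator's work.
@dataclass
class OpProfile:
    flops: float        # floating-point operations performed
    bytes_moved: float  # bytes read from and written to GPU memory
    comm_bytes: float   # bytes exchanged with other GPUs

def estimate_op_time(op: OpProfile, m: Machine) -> float:
    """Roofline-style estimate: the kernel is bound by either compute or
    memory traffic, and inter-GPU communication is added on top."""
    compute_s = op.flops / m.peak_flops
    memory_s = op.bytes_moved / m.mem_bandwidth
    comm_s = op.comm_bytes / m.interconnect_bw
    return max(compute_s, memory_s) + comm_s

# Example: a GEMM-like layer on an A100-class GPU (numbers are rough).
machine = Machine(peak_flops=19.5e12, mem_bandwidth=1.6e12, interconnect_bw=300e9)
gemm = OpProfile(flops=2 * 1024 * 1024 * 4096,
                 bytes_moved=(1024 * 4096 + 4096 * 1024 + 1024 * 1024) * 4,
                 comm_bytes=0)
print(f"estimated time: {estimate_op_time(gemm, machine) * 1e6:.1f} us")
```

The point of a model like this is that it needs no GPU at all, which is what would eventually remove the single-GPU requirement mentioned above.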

cnjsdfcy commented

Hi @lockshaw, in which branch is the cost model interface being worked on? Thanks!

Rashad-CSU commented

> It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.

Hi,

Currently, I am using `flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000` to run a model on one GPU. My machine has only one GPU. What command should I use to simulate it on several (e.g. 16) GPUs and create a tree structure?

Thanks,
Rashad
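
For context on that command: the `-ll:` options are flags of the underlying Legion runtime rather than FlexFlow-specific settings. My reading of them (not stated in this thread) is annotated below; the memory sizes are in MB.

```sh
# Rashad's command with the Legion runtime flags annotated:
#   -ll:gpu 1        number of GPUs to use per process
#   -ll:fsize 30000  GPU framebuffer memory to reserve, in MB
#   -ll:zsize 3000   zero-copy (pinned host) memory to reserve, in MB
flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py \
    -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000
```

These flags describe the machine the process actually runs on, so simply raising `-ll:gpu` is not by itself a way to simulate a 16-GPU machine; see the reply below.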

lockshaw (Collaborator) commented

> Hi @lockshaw, in which branch is the cost model interface being worked on? Thanks!

@cnjsdfcy #622

lockshaw (Collaborator) commented

> > It is possible as long as you have access to a single GPU (we use the GPU to profile kernel execution times). We're working on providing an interface for users to supply an analytical cost model for GPU execution, but as of now this is not stable.
>
> Hi,
>
> Currently, I am using `flexflow_python "$FF_HOME"/examples/python/native/cifar10_cnn.py -ll:gpu 1 -ll:fsize 30000 -ll:zsize 3000` to run a model on one GPU. My machine has only one GPU. What command should I use to simulate it on several (e.g. 16) GPUs and create a tree structure?
>
> Thanks, Rashad

AFAIK there isn't a convenient way to do this on #475, which I'm guessing you're using (unless @goliaro knows otherwise). We're working on making it easier to use the simulator as a separate component on #622, but it's definitely not ready for public usage yet. If you're interested in this I recommend following that branch as we're hoping to have it merged in the next couple months.
