Feature request: color-coded graphs for performance visualization #20
PS: it's probably obvious, but by color-coding I mean, e.g., a gradient from blue to red, where the darkest blue == max seconds of processing time and the darkest red == least seconds, based on a simple min/max normalization over all computed graph elements and a best-estimate allocation of the profiled runtimes per element, shown on your graph viz...
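As a minimal sketch of the min/max normalization being proposed (assuming per-element runtimes have already been collected; all names here are illustrative):

```python
def runtime_to_hex(runtime, min_rt, max_rt):
    """Map a runtime to a hex color: darkest red at the min, darkest blue at the max."""
    if max_rt == min_rt:
        t = 0.5  # degenerate case: all elements took the same time
    else:
        t = (runtime - min_rt) / (max_rt - min_rt)  # min/max normalization to 0..1
    red = int(255 * (1 - t))   # fastest elements -> strongest red
    blue = int(255 * t)        # slowest elements -> strongest blue
    return f"#{red:02x}00{blue:02x}"

# Example: normalize a set of per-element runtimes into colors.
runtimes = [0.002, 0.010, 0.050, 0.120]
lo, hi = min(runtimes), max(runtimes)
colors = [runtime_to_hex(r, lo, hi) for r in runtimes]
```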
This is a great suggestion, thanks so much! TorchLens already logs all that info, so it’s just a matter of allowing the visuals to reflect it. Currently the color scheme is one that I devised to try to make the salient aspects of the network pop out, with minimal tinkering from the user. But, it could be a helpful “power feature” to allow the user to override these defaults as needed. The tricky thing would be balancing simplicity and flexibility. Currently, my philosophy is to be conservative about hard-coding new use-cases, while giving users all the data and flexibility they need to do anything they want. What about something like the following: provide a set of optional visualization arguments for variables like the color, shape, and size of the box for each layer. For each such argument, allow the user to pass in a function for determining the value of that variable. For instance,
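As a rough illustration of that idea (the function signature and metadata fields below are hypothetical, not the actual torchlens API):

```python
def node_color(model_info, layer_info):
    """Illustrative user-supplied function: color a layer's box by its
    share of the model's total runtime. Field names are assumed."""
    frac = layer_info["runtime"] / model_info["total_runtime"]
    return "red" if frac > 0.25 else "gray"

# Hypothetical usage: show_model_graph(model, x, vis_node_color=node_color)
```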
So the user would pass in a function that takes as input the data structures for both the overall model and each individual layer, and computes the visualization variables from the metadata provided by TorchLens. I'll have to think about this, but does this seem sufficient for your use case?
That makes sense! I would instantly use that. When I had the vision of color-coding a graphical network representing Torch runtime / memory stats, etc., I also posted it to a related open source project called TorchExplorer. There I also published a visual demo based on a Turbo colormap, including Python code, which simulates a dummy neural network I put together for visual purposes. I'd love to port that same exact code to real graphical networks of the very complicated neural networks I'm trying to profile now (LLMs, vision transformers, ...). Let me know if you push any code for custom color-coding based on e.g. runtime; I'd love to follow up and share some examples with fine-tuned colormaps that I think could look really awesome (like below) and be very useful.
I'll put it on top of my list and keep you posted once it's implemented! Can't wait to see what you do with it.
Sounds great! |
Okay, here's the interface after some initial tinkering... does this seem natural and intuitive to your eyes? Basically, you provide a dict specifying the visualization arguments you want to override (these correspond to the various graphviz functions), where each value is either a literal value (for things that don't depend on the particular data in the model or layer) or a function (for things that depend on metadata about the model or layer). If it looks good I'll polish and push it.
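The dict-based interface described here might look roughly like the following sketch (the keys and the resolution helper are illustrative, not the actual torchlens names):

```python
# Each override is either a literal (fixed for every layer) or a function
# of the model/layer metadata. Keys and field names below are illustrative.
vis_opts = {
    "node_shape": "box",  # literal: same for every layer
    "node_fillcolor": lambda model, layer: (  # function: depends on metadata
        "#ffcccc" if layer["runtime"] > 0.01 else "#ccccff"
    ),
}

def resolve(opt, model, layer):
    """Resolve one override entry to a concrete graphviz attribute value."""
    return opt(model, layer) if callable(opt) else opt
```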
Here's another example where the nodes are colored based on their runtime, and sized based on their filesize, by passing in the following function:
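The function referenced here isn't preserved in this thread, but a function of that kind might be sketched as follows, assuming illustrative metadata field names and min/max normalization (using matplotlib's Turbo colormap, as in the earlier demo):

```python
import matplotlib.cm as cm
import matplotlib.colors as mcolors

def node_style(model, layer):
    """Illustrative override function: color a node by runtime (Turbo
    colormap), size it by tensor filesize. Field names are assumed."""
    rt_norm = mcolors.Normalize(model["min_runtime"], model["max_runtime"])
    fs_norm = mcolors.Normalize(model["min_fsize"], model["max_fsize"])
    color = mcolors.to_hex(cm.turbo(rt_norm(layer["runtime"])))
    width = 0.5 + 1.5 * fs_norm(layer["fsize"])  # graphviz node width, inches
    return {"fillcolor": color, "width": str(width)}
```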
These both look great! In terms of the API you propose, I think it's simple and practical enough, so long as the intent of enabling custom color-coding (and size-coding -- awesome work) is clear. You might consider wrapping up these examples with your final API in a demo Jupyter notebook.

The size-coding is super nice and innovative. Most intuitively, it seems like size can do a great job of showing off differences in memory requirements, which I think is what you've started with here.

In general, I think a really good legend will be key to bringing the colors and sizes together into a useful data-visualization exploration, because it should be possible to quickly reference the colors/sizes against a ground-truth legend. So it would be great to have a big legend that shows both the range of colors across the full colormap (as in my example above) and a similar custom legend showing the min/max sizes with numerical values alongside.

This is practically already very useful, and I'm excited to pick it up and share some demos with some of the complex neural net modules I'm currently seeking to optimize the performance of...
PS I just saw that you're a postgrad researcher at Columbia. I studied ML there for comp sci grad school :) |
Thanks a ton for the feedback—yup, I'll def add this to the tutorials once I've finalized things (there's a Colab tutorial notebook linked on the main readme for torchlens).

To be clear, I'm probably not going to hard-code these literal visualization options (e.g., this specific way of coloring and sizing the nodes), but rather just provide the override options so users can do so themselves as they like, if only because I can't anticipate all the use cases and graphic design is not my main specialty (the upside is that this should allow unlimited flexibility for visualizing things based on the model metadata). But perhaps it would be useful to provide some examples, along with accompanying code, for people to riff off of.

Regarding the legend, that might be tricky because torchlens runs on graphviz, and graphviz doesn't have a built-in way of doing legends. The only way would be to make the legend with some other tool like matplotlib, and then paste the graph image and the legend image together. I'd have to think about whether this would make sense to add, since it might be a specialized use case and I want to avoid feature creep. But if there's sufficient user demand I can brainstorm ways to streamline this.

The tensor_fsize field is computed using the sys.getsizeof function from the Python standard library. Currently I don't break it down by CPU/GPU, but that would be easy to do.

And yes, I'm at Columbia, in the neuroscience department :) I'll keep you posted once I've pushed the new code.
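The matplotlib-plus-pasting approach mentioned here could be sketched like this (file names, labels, and the colormap choice are all illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; we only save to files
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import matplotlib.cm as cm
from PIL import Image

def make_legend(vmin, vmax, path="legend.png"):
    """Render a standalone colorbar legend with matplotlib."""
    fig, ax = plt.subplots(figsize=(1.2, 4))
    sm = cm.ScalarMappable(norm=mcolors.Normalize(vmin, vmax), cmap="turbo")
    sm.set_array([])  # no data array needed for a standalone colorbar
    fig.colorbar(sm, cax=ax, label="runtime (s)")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)

def paste_side_by_side(graph_path, legend_path, out_path):
    """Paste the graphviz-rendered graph and the legend into one image."""
    g, l = Image.open(graph_path), Image.open(legend_path)
    canvas = Image.new("RGB", (g.width + l.width, max(g.height, l.height)), "white")
    canvas.paste(g, (0, 0))
    canvas.paste(l, (g.width, 0))
    canvas.save(out_path)
```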
Apologies for the slow rollout on this, but this functionality has now been added to the main branch in the latest release. |
Cool! |
I welcome any feedback you've got! Closing for now, cheers |
Gathering data from https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html, it would be fantastic if there were a library with a one-line API comparable to what TensorBoard previously offered with TensorFlow: color-coded graph visualization of performance metrics per computational graph element -- namely runtime, but memory metrics would also be of interest. E.g., see https://branyang.gitbooks.io/tfdocs/content/get_started/graph_viz.html
The problem is that TensorBoard's PyTorch support is apparently a mess right now...
Please ping me if this is of interest to develop; I think it would greatly help ML developers to be able to both visualize graphs and visualize the performance bottlenecks within them...