Hi, thanks for opening the source code!
I'm curious how much faster the CUDA implementation is than PyTorch's `torch.bmm` when the input to each MLP is the same. The `test()` code in multi_module.py fails to run because of some flags, so I don't know how to benchmark the CUDA implementation against `bmm`. Could you please give me some guidance? Thanks!
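For reference, this is a minimal sketch of how I would time the `torch.bmm` baseline. The tensor sizes and the `benchmark` helper are my own assumptions, not anything from the repository. It synchronizes the GPU before reading the clock, since CUDA kernels launch asynchronously, and falls back to the CPU when no GPU is available. I assume the CUDA implementation could be passed to the same helper, but I don't know its call signature.

```python
import time

import torch


def benchmark(fn, *args, warmup=10, iters=100):
    """Return the mean wall-clock time of fn(*args) in milliseconds."""
    use_cuda = torch.cuda.is_available()
    # Warm-up runs so one-time setup costs are not measured.
    for _ in range(warmup):
        fn(*args)
    if use_cuda:
        torch.cuda.synchronize()  # wait for pending kernels before timing
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if use_cuda:
        torch.cuda.synchronize()  # wait for the timed kernels to finish
    return (time.perf_counter() - start) * 1000.0 / iters


device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical sizes: 8 MLP layers, batch of 256, 64 -> 64 features.
num_mlps, batch, d_in, d_out = 8, 256, 64, 64
x = torch.randn(num_mlps, batch, d_in, device=device)
w = torch.randn(num_mlps, d_in, d_out, device=device)

bmm_ms = benchmark(torch.bmm, x, w)
print(f"torch.bmm: {bmm_ms:.3f} ms per call")

# Placeholder, not a real name from the repo:
# custom_ms = benchmark(custom_cuda_op, x, w)
```

Is this a fair way to compare the two, or is there a recommended script for it?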