[BugFix] Add sync to cudagraph module #1026
Merged
facebook-github-bot added the CLA Signed label on Oct 4, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 73.7580μs | 26.4435μs | 37.8165 KOps/s | 40.9303 KOps/s | |
test_plain_set_stack_nested | 64.8110μs | 26.7570μs | 37.3733 KOps/s | 40.2900 KOps/s | |
test_plain_set_nested_inplace | 70.9330μs | 29.2945μs | 34.1361 KOps/s | 37.3383 KOps/s | |
test_plain_set_stack_nested_inplace | 68.2790μs | 28.7535μs | 34.7784 KOps/s | 37.3494 KOps/s | |
test_items | 23.0930μs | 4.1523μs | 240.8323 KOps/s | 231.5482 KOps/s | |
test_items_nested | 0.4683ms | 0.3881ms | 2.5770 KOps/s | 2.6132 KOps/s | |
test_items_nested_locked | 0.5548ms | 0.3891ms | 2.5701 KOps/s | 2.6116 KOps/s | |
test_items_nested_leaf | 0.1612ms | 80.7941μs | 12.3771 KOps/s | 12.2949 KOps/s | |
test_items_stack_nested | 0.7027ms | 0.3951ms | 2.5312 KOps/s | 2.5774 KOps/s | |
test_items_stack_nested_leaf | 0.1551ms | 84.2178μs | 11.8740 KOps/s | 12.0376 KOps/s | |
test_items_stack_nested_locked | 0.7329ms | 0.3937ms | 2.5399 KOps/s | 2.5864 KOps/s | |
test_keys | 30.3270μs | 3.5332μs | 283.0297 KOps/s | 284.7011 KOps/s | |
test_keys_nested | 0.2678ms | 0.1362ms | 7.3430 KOps/s | 7.5227 KOps/s | |
test_keys_nested_locked | 1.5895ms | 0.1416ms | 7.0631 KOps/s | 7.2001 KOps/s | |
test_keys_nested_leaf | 0.2002ms | 0.1193ms | 8.3823 KOps/s | 8.6125 KOps/s | |
test_keys_stack_nested | 0.2318ms | 0.1344ms | 7.4395 KOps/s | 7.5322 KOps/s | |
test_keys_stack_nested_leaf | 0.2098ms | 0.1179ms | 8.4848 KOps/s | 8.6193 KOps/s | |
test_keys_stack_nested_locked | 0.2492ms | 0.1398ms | 7.1547 KOps/s | 7.2435 KOps/s | |
test_values | 6.2398μs | 1.0726μs | 932.3001 KOps/s | 937.1496 KOps/s | |
test_values_nested | 0.1869ms | 96.9536μs | 10.3142 KOps/s | 10.3774 KOps/s | |
test_values_nested_locked | 0.1698ms | 97.4017μs | 10.2668 KOps/s | 10.4452 KOps/s | |
test_values_nested_leaf | 0.1618ms | 83.6408μs | 11.9559 KOps/s | 12.3478 KOps/s | |
test_values_stack_nested | 0.1725ms | 98.1995μs | 10.1833 KOps/s | 10.4195 KOps/s | |
test_values_stack_nested_leaf | 0.1436ms | 82.4443μs | 12.1294 KOps/s | 12.3271 KOps/s | |
test_values_stack_nested_locked | 0.1557ms | 97.0292μs | 10.3062 KOps/s | 10.5310 KOps/s | |
test_membership | 21.3000μs | 0.9287μs | 1.0768 MOps/s | 1.1263 MOps/s | |
test_membership_nested | 23.1740μs | 2.8657μs | 348.9513 KOps/s | 366.4396 KOps/s | |
test_membership_nested_leaf | 24.6670μs | 2.8847μs | 346.6537 KOps/s | 364.7127 KOps/s | |
test_membership_stacked_nested | 22.9630μs | 2.8755μs | 347.7653 KOps/s | 364.9114 KOps/s | |
test_membership_stacked_nested_leaf | 29.9260μs | 2.9144μs | 343.1201 KOps/s | 368.7271 KOps/s | |
test_membership_nested_last | 25.4070μs | 4.3319μs | 230.8475 KOps/s | 232.3815 KOps/s | |
test_membership_nested_leaf_last | 26.6190μs | 4.3652μs | 229.0871 KOps/s | 237.4847 KOps/s | |
test_membership_stacked_nested_last | 28.3930μs | 4.3585μs | 229.4387 KOps/s | 237.6236 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.0390μs | 4.3346μs | 230.7034 KOps/s | 235.6261 KOps/s | |
test_nested_getleaf | 33.7930μs | 10.5925μs | 94.4068 KOps/s | 93.8171 KOps/s | |
test_nested_get | 55.2940μs | 9.9542μs | 100.4598 KOps/s | 98.3939 KOps/s | |
test_stacked_getleaf | 36.0180μs | 10.5987μs | 94.3512 KOps/s | 94.7916 KOps/s | |
test_stacked_get | 32.6510μs | 9.9791μs | 100.2095 KOps/s | 99.7176 KOps/s | |
test_nested_getitemleaf | 34.3140μs | 10.9187μs | 91.5858 KOps/s | 90.7088 KOps/s | |
test_nested_getitem | 54.3820μs | 10.1169μs | 98.8447 KOps/s | 97.2975 KOps/s | |
test_stacked_getitemleaf | 44.9150μs | 10.8791μs | 91.9194 KOps/s | 90.7819 KOps/s | |
test_stacked_getitem | 30.8270μs | 10.0875μs | 99.1325 KOps/s | 95.8756 KOps/s | |
test_lock_nested | 84.1360ms | 0.5978ms | 1.6729 KOps/s | 1.9214 KOps/s | |
test_lock_stack_nested | 0.6979ms | 0.4749ms | 2.1057 KOps/s | 2.0698 KOps/s | |
test_unlock_nested | 83.2556ms | 0.5150ms | 1.9419 KOps/s | 2.2917 KOps/s | |
test_unlock_stack_nested | 0.6588ms | 0.3913ms | 2.5554 KOps/s | 2.4934 KOps/s | |
test_flatten_speed | 0.1911ms | 0.1016ms | 9.8407 KOps/s | 9.9553 KOps/s | |
test_unflatten_speed | 1.0573ms | 0.5350ms | 1.8690 KOps/s | 1.9370 KOps/s | |
test_common_ops | 4.5509ms | 1.2034ms | 830.9866 Ops/s | 898.3887 Ops/s | |
test_creation | 19.5170μs | 2.1418μs | 466.9060 KOps/s | 473.7156 KOps/s | |
test_creation_empty | 47.5590μs | 20.1023μs | 49.7456 KOps/s | 56.1580 KOps/s | |
test_creation_nested_1 | 55.9050μs | 23.4589μs | 42.6277 KOps/s | 47.7704 KOps/s | |
test_creation_nested_2 | 61.5160μs | 27.6225μs | 36.2024 KOps/s | 39.6664 KOps/s | |
test_clone | 0.1201ms | 17.1299μs | 58.3774 KOps/s | 58.2338 KOps/s | |
test_getitem[int] | 1.1468ms | 16.9050μs | 59.1542 KOps/s | 57.6708 KOps/s | |
test_getitem[slice_int] | 0.1455ms | 31.9062μs | 31.3419 KOps/s | 32.2207 KOps/s | |
test_getitem[range] | 0.2710ms | 60.5421μs | 16.5174 KOps/s | 17.2812 KOps/s | |
test_getitem[tuple] | 0.1414ms | 25.8072μs | 38.7489 KOps/s | 38.8393 KOps/s | |
test_getitem[list] | 0.1880ms | 55.1552μs | 18.1307 KOps/s | 18.6575 KOps/s | |
test_setitem_dim[int] | 61.4160μs | 34.1248μs | 29.3042 KOps/s | 30.1376 KOps/s | |
test_setitem_dim[slice_int] | 0.1097ms | 62.7226μs | 15.9432 KOps/s | 16.2444 KOps/s | |
test_setitem_dim[range] | 0.1385ms | 87.2948μs | 11.4554 KOps/s | 11.9410 KOps/s | |
test_setitem_dim[tuple] | 80.2610μs | 51.3278μs | 19.4826 KOps/s | 19.8325 KOps/s | |
test_setitem | 88.3360μs | 32.0764μs | 31.1756 KOps/s | 32.7089 KOps/s | |
test_set | 83.0760μs | 30.9376μs | 32.3231 KOps/s | 34.3094 KOps/s | |
test_set_shared | 3.2701ms | 0.2189ms | 4.5688 KOps/s | 4.5577 KOps/s | |
test_update | 0.1481ms | 40.3122μs | 24.8064 KOps/s | 26.7878 KOps/s | |
test_update_nested | 0.1257ms | 50.9592μs | 19.6235 KOps/s | 20.8724 KOps/s | |
test_update__nested | 1.0217ms | 37.6594μs | 26.5538 KOps/s | 26.1721 KOps/s | |
test_set_nested | 0.1714ms | 34.9620μs | 28.6025 KOps/s | 31.5271 KOps/s | |
test_set_nested_new | 0.1367ms | 38.2407μs | 26.1502 KOps/s | 26.6935 KOps/s | |
test_select | 0.1340ms | 55.8132μs | 17.9169 KOps/s | 18.5139 KOps/s | |
test_select_nested | 0.1265ms | 60.9997μs | 16.3935 KOps/s | 16.8152 KOps/s | |
test_exclude_nested | 0.1776ms | 77.1810μs | 12.9566 KOps/s | 13.4215 KOps/s | |
test_empty[True] | 0.6525ms | 0.3526ms | 2.8363 KOps/s | 2.8365 KOps/s | |
test_empty[False] | 8.5637μs | 1.3037μs | 767.0563 KOps/s | 801.6773 KOps/s | |
test_unbind_speed | 0.6510ms | 0.3065ms | 3.2630 KOps/s | 3.1473 KOps/s | |
test_unbind_speed_stack0 | 0.5954ms | 0.3022ms | 3.3090 KOps/s | 3.2202 KOps/s | |
test_unbind_speed_stack1 | 91.6826ms | 0.8020ms | 1.2469 KOps/s | 1.2813 KOps/s | |
test_split | 3.2178ms | 2.0625ms | 484.8526 Ops/s | 449.4527 Ops/s | |
test_chunk | 89.9785ms | 2.2424ms | 445.9494 Ops/s | 451.2638 Ops/s | |
test_creation[device0] | 0.2292ms | 0.1166ms | 8.5729 KOps/s | 8.5724 KOps/s | |
test_creation_from_tensor | 5.7147ms | 0.1183ms | 8.4529 KOps/s | 8.4557 KOps/s | |
test_add_one[memmap_tensor0] | 0.2123ms | 7.4447μs | 134.3231 KOps/s | 134.9379 KOps/s | |
test_contiguous[memmap_tensor0] | 17.3220μs | 1.9316μs | 517.7183 KOps/s | 527.0806 KOps/s | |
test_stack[memmap_tensor0] | 55.4650μs | 5.7497μs | 173.9212 KOps/s | 176.8150 KOps/s | |
test_memmaptd_index | 1.0988ms | 0.4265ms | 2.3448 KOps/s | 2.4384 KOps/s | |
test_memmaptd_index_astensor | 1.2452ms | 0.5274ms | 1.8963 KOps/s | 1.9403 KOps/s | |
test_memmaptd_index_op | 1.7066ms | 1.0992ms | 909.7751 Ops/s | 954.8079 Ops/s | |
test_serialize_model | 0.2091s | 0.1295s | 7.7238 Ops/s | 8.5527 Ops/s | |
test_serialize_model_pickle | 0.4730s | 0.3917s | 2.5529 Ops/s | 2.5166 Ops/s | |
test_serialize_weights | 0.1242s | 0.1165s | 8.5841 Ops/s | 8.5972 Ops/s | |
test_serialize_weights_returnearly | 0.1689s | 0.1602s | 6.2429 Ops/s | 6.4266 Ops/s | |
test_serialize_weights_pickle | 0.5189s | 0.4225s | 2.3667 Ops/s | 2.4715 Ops/s | |
test_serialize_weights_filesystem | 0.2231s | 0.1524s | 6.5611 Ops/s | 7.1210 Ops/s | |
test_serialize_model_filesystem | 0.1568s | 0.1455s | 6.8721 Ops/s | 6.5340 Ops/s | |
test_reshape_pytree | 90.7600μs | 39.0870μs | 25.5840 KOps/s | 25.7282 KOps/s | |
test_reshape_td | 0.1174ms | 45.6259μs | 21.9174 KOps/s | 21.1107 KOps/s | |
test_view_pytree | 90.5200μs | 38.8738μs | 25.7243 KOps/s | 26.0067 KOps/s | |
test_view_td | 0.1285ms | 52.8314μs | 18.9281 KOps/s | 18.3564 KOps/s | |
test_unbind_pytree | 78.6080μs | 36.5084μs | 27.3909 KOps/s | 28.1187 KOps/s | |
test_unbind_td | 0.3059ms | 46.3309μs | 21.5839 KOps/s | 21.3131 KOps/s | |
test_split_pytree | 85.1500μs | 38.2810μs | 26.1226 KOps/s | 26.1316 KOps/s | |
test_split_td | 0.4566ms | 61.0102μs | 16.3907 KOps/s | 16.7720 KOps/s | |
test_add_pytree | 98.3750μs | 44.9851μs | 22.2296 KOps/s | 22.2010 KOps/s | |
test_add_td | 0.2719ms | 90.4186μs | 11.0597 KOps/s | 11.5658 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1074ms | 59.0086μs | 16.9467 KOps/s | 17.1838 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3576ms | 0.2013ms | 4.9687 KOps/s | 5.1075 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1280ms | 56.8126μs | 17.6017 KOps/s | 17.6212 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2885ms | 0.1403ms | 7.1253 KOps/s | 7.1658 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 81.5250μs | 23.2872μs | 42.9421 KOps/s | 43.3749 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1576ms | 73.7437μs | 13.5605 KOps/s | 13.4352 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1457ms | 75.8423μs | 13.1853 KOps/s | 13.3912 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1274ms | 68.5130μs | 14.5958 KOps/s | 14.6828 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3864ms | 0.1806ms | 5.5375 KOps/s | 5.5560 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4562ms | 0.2391ms | 4.1821 KOps/s | 4.2065 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1087ms | 49.0932μs | 20.3694 KOps/s | 20.6967 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1498ms | 76.9514μs | 12.9952 KOps/s | 12.7903 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2854ms | 0.1731ms | 5.7763 KOps/s | 5.7805 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4765ms | 0.2847ms | 3.5121 KOps/s | 3.5194 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4792ms | 0.2761ms | 3.6219 KOps/s | 3.6121 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3383ms | 0.1819ms | 5.4986 KOps/s | 5.5951 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1642ms | 73.8821μs | 13.5351 KOps/s | 13.6423 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1157ms | 50.3107μs | 19.8765 KOps/s | 20.8773 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4557ms | 0.2313ms | 4.3237 KOps/s | 4.3344 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2840ms | 0.1755ms | 5.6987 KOps/s | 5.8243 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2102ms | 0.1105ms | 9.0520 KOps/s | 8.9367 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1682ms | 78.2182μs | 12.7847 KOps/s | 12.7389 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1617ms | 79.1303μs | 12.6374 KOps/s | 13.0793 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1453ms | 68.9800μs | 14.4970 KOps/s | 14.6847 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2933ms | 0.1952ms | 5.1228 KOps/s | 5.1342 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.0140ms | 1.8097ms | 552.5834 Ops/s | 572.9522 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4111ms | 0.1949ms | 5.1300 KOps/s | 5.1650 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.8455ms | 1.0976ms | 911.0601 Ops/s | 912.5589 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.4990ms | 0.4207ms | 2.3772 KOps/s | 2.4170 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.5957ms | 4.2028ms | 237.9386 Ops/s | 252.4610 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 96.1910μs | 35.6506μs | 28.0500 KOps/s | 29.2637 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.1608ms | 50.8379μs | 19.6704 KOps/s | 20.5411 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 96.7720μs | 30.5153μs | 32.7704 KOps/s | 32.4667 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 74.7410μs | 30.3705μs | 32.9267 KOps/s | 34.7078 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 95.3890μs | 30.4359μs | 32.8560 KOps/s | 33.0795 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 81.4130μs | 29.8484μs | 33.5026 KOps/s | 34.4463 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1594ms | 75.5160μs | 13.2422 KOps/s | 13.6777 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5549ms | 28.7525μs | 34.7796 KOps/s | 35.2401 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2016ms | 70.2774μs | 14.2293 KOps/s | 14.9017 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 82.1340μs | 24.0611μs | 41.5608 KOps/s | 42.8101 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1651ms | 69.1798μs | 14.4551 KOps/s | 14.8678 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 60.7640μs | 23.6238μs | 42.3301 KOps/s | 42.9500 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1531ms | 75.2762μs | 13.2844 KOps/s | 13.6248 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1097ms | 28.3047μs | 35.3298 KOps/s | 36.2266 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1516ms | 68.7121μs | 14.5535 KOps/s | 14.8571 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 70.7230μs | 23.6120μs | 42.3514 KOps/s | 43.0664 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1511ms | 68.9939μs | 14.4940 KOps/s | 15.0056 KOps/s | |
test_compile_indexing[int-pytree-eager] | 86.2820μs | 23.9320μs | 41.7850 KOps/s | 42.0776 KOps/s | |
test_mod_add[eager] | 94.0060μs | 27.5001μs | 36.3635 KOps/s | 38.9514 KOps/s | |
test_mod_add[compile] | 87.1840μs | 39.5345μs | 25.2944 KOps/s | 26.6504 KOps/s | |
test_mod_add[compile-overhead] | 92.0530μs | 39.3325μs | 25.4243 KOps/s | 25.7760 KOps/s | |
test_mod_wrap[eager] | 0.4440ms | 0.2114ms | 4.7307 KOps/s | 4.9270 KOps/s | |
test_mod_wrap[compile] | 0.4385ms | 0.2370ms | 4.2196 KOps/s | 4.4040 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4142ms | 0.2330ms | 4.2922 KOps/s | 4.4145 KOps/s | |
test_mod_wrap_and_backward[eager] | 11.9082ms | 10.4843ms | 95.3804 Ops/s | 92.2352 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.9915ms | 10.5152ms | 95.1002 Ops/s | 86.1214 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.2716ms | 10.5610ms | 94.6879 Ops/s | 87.7345 Ops/s | |
test_seq_add[eager] | 0.1809ms | 95.2630μs | 10.4972 KOps/s | 10.7663 KOps/s | |
test_seq_add[compile] | 0.1382ms | 65.1759μs | 15.3431 KOps/s | 15.4203 KOps/s | |
test_seq_add[compile-overhead] | 0.1304ms | 64.6524μs | 15.4673 KOps/s | 15.6688 KOps/s | |
test_seq_wrap[eager] | 0.6237ms | 0.3988ms | 2.5074 KOps/s | 2.6421 KOps/s | |
test_seq_wrap[compile] | 1.2107ms | 0.2717ms | 3.6805 KOps/s | 3.7369 KOps/s | |
test_seq_wrap[compile-overhead] | 1.2315ms | 0.2710ms | 3.6904 KOps/s | 3.7753 KOps/s | |
test_func_call_runtime[False-eager] | 0.9446ms | 0.5309ms | 1.8836 KOps/s | 1.9678 KOps/s | |
test_func_call_runtime[False-compile] | 1.0155ms | 0.5031ms | 1.9876 KOps/s | 2.0193 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8735ms | 0.5043ms | 1.9830 KOps/s | 2.0066 KOps/s | |
test_func_call_runtime[True-eager] | 1.2074ms | 0.7620ms | 1.3124 KOps/s | 1.3584 KOps/s | |
test_func_call_runtime[True-compile] | 0.9142ms | 0.5172ms | 1.9336 KOps/s | 1.9845 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6455ms | 0.5165ms | 1.9360 KOps/s | 1.9711 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0419ms | 0.5370ms | 1.8621 KOps/s | 1.9640 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6824ms | 0.5062ms | 1.9754 KOps/s | 2.0015 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6391ms | 0.5064ms | 1.9748 KOps/s | 1.9969 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0749ms | 0.9103ms | 1.0986 KOps/s | 1.1304 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1427ms | 0.7589ms | 1.3177 KOps/s | 1.3622 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8618ms | 0.7535ms | 1.3271 KOps/s | 1.3589 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5718ms | 1.9410ms | 515.2000 Ops/s | 525.1234 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6922ms | 1.9984ms | 500.4032 Ops/s | 509.5431 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.5975ms | 1.9978ms | 500.5463 Ops/s | 513.3802 Ops/s | |
test_distributed | 0.2711ms | 0.1245ms | 8.0346 KOps/s | 7.8031 KOps/s | |
test_tdmodule | 70.3330μs | 19.2502μs | 51.9476 KOps/s | 58.3347 KOps/s | |
test_tdmodule_dispatch | 66.3140μs | 38.4895μs | 25.9811 KOps/s | 29.1391 KOps/s | |
test_tdseq | 49.3630μs | 22.0910μs | 45.2674 KOps/s | 50.8114 KOps/s | |
test_tdseq_dispatch | 79.7200μs | 44.1328μs | 22.6589 KOps/s | 25.0570 KOps/s | |
test_instantiation_functorch | 1.7339ms | 1.5936ms | 627.5120 Ops/s | 641.6514 Ops/s | |
test_instantiation_td | 2.1891ms | 1.2216ms | 818.5940 Ops/s | 853.8919 Ops/s | |
test_exec_functorch | 0.2867ms | 0.1896ms | 5.2746 KOps/s | 5.3838 KOps/s | |
test_exec_functional_call | 0.4379ms | 0.1755ms | 5.6966 KOps/s | 5.8665 KOps/s | |
test_exec_td | 0.3660ms | 0.2001ms | 4.9968 KOps/s | 5.0085 KOps/s | |
test_exec_td_decorator | 0.8959ms | 0.2382ms | 4.1974 KOps/s | 4.3425 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9286ms | 0.6968ms | 1.4351 KOps/s | 1.4463 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.0887ms | 0.6979ms | 1.4328 KOps/s | 1.4778 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6168ms | 0.5412ms | 1.8479 KOps/s | 1.8832 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7616ms | 0.5421ms | 1.8448 KOps/s | 1.8745 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3197ms | 0.6604ms | 1.5142 KOps/s | 1.5690 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0489ms | 0.6635ms | 1.5072 KOps/s | 1.5740 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9190ms | 0.5462ms | 1.8309 KOps/s | 1.8892 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8697ms | 0.5453ms | 1.8340 KOps/s | 1.8883 KOps/s | |
test_to_module_speed[True] | 2.0614ms | 1.4555ms | 687.0623 Ops/s | 718.1580 Ops/s | |
test_to_module_speed[False] | 2.2297ms | 1.4277ms | 700.4133 Ops/s | 731.4227 Ops/s | |
test_tc_init | 98.3650μs | 47.7083μs | 20.9607 KOps/s | 21.9232 KOps/s | |
test_tc_init_nested | 0.1700ms | 96.6748μs | 10.3440 KOps/s | 10.9682 KOps/s | |
test_tc_first_layer_tensor | 37.0600μs | 1.5824μs | 631.9673 KOps/s | 632.1583 KOps/s | |
test_tc_first_layer_nontensor | 24.4550μs | 4.9329μs | 202.7212 KOps/s | 210.0437 KOps/s | |
test_tc_second_layer_tensor | 22.1620μs | 2.9379μs | 340.3789 KOps/s | 347.3818 KOps/s | |
test_tc_second_layer_nontensor | 28.8440μs | 6.1208μs | 163.3770 KOps/s | 164.7053 KOps/s | |
test_unbind | 0.4667s | 13.1081ms | 76.2886 Ops/s | 75.7584 Ops/s | |
test_full_like | 8.7797ms | 7.2483ms | 137.9626 Ops/s | 144.9704 Ops/s | |
test_zeros_like | 3.1614ms | 2.7655ms | 361.5937 Ops/s | 346.6296 Ops/s | |
test_ones_like | 3.5754ms | 3.1884ms | 313.6374 Ops/s | 290.0997 Ops/s | |
test_clone | 5.6838ms | 5.0558ms | 197.7914 Ops/s | 206.7255 Ops/s | |
test_squeeze | 66.0740μs | 12.8179μs | 78.0158 KOps/s | 78.8672 KOps/s | |
test_unsqueeze | 0.1600ms | 95.0781μs | 10.5177 KOps/s | 10.8456 KOps/s | |
test_split | 0.5694ms | 0.1959ms | 5.1049 KOps/s | 5.1302 KOps/s | |
test_permute | 0.3564ms | 0.2208ms | 4.5300 KOps/s | 4.4661 KOps/s | |
test_stack | 27.3719ms | 24.6537ms | 40.5619 Ops/s | 41.3618 Ops/s | |
test_cat | 30.6883ms | 24.4095ms | 40.9677 Ops/s | 41.6513 Ops/s |
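
The Change column is empty in this capture, so the difference between the Ops and Ops on Repo HEAD columns has to be read off by hand. A minimal sketch of that arithmetic follows, assuming Change is meant as the signed relative difference of Ops against Ops on Repo HEAD. This is an assumption for illustration, not the benchmark bot's actual formula, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '37.8165 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Signed fractional change of the PR's throughput relative to repo HEAD.

    Assumed definition (not taken from the bot): (ops - head) / head.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# Values copied from the test_plain_set_nested row of the table above.
change = relative_change("37.8165 KOps/s", "40.9303 KOps/s")
print(f"test_plain_set_nested: {change:+.1%}")
```

Normalizing to operations per second before comparing matters because the two columns do not always share a unit: a row can report one column in KOps/s and the other in Ops/s.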

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1489ms | 16.1417μs | 61.9515 KOps/s | 58.4373 KOps/s | |
test_plain_set_stack_nested | 35.6500μs | 16.2227μs | 61.6422 KOps/s | 57.7120 KOps/s | |
test_plain_set_nested_inplace | 41.5100μs | 17.3759μs | 57.5509 KOps/s | 54.3644 KOps/s | |
test_plain_set_stack_nested_inplace | 41.8800μs | 17.1283μs | 58.3830 KOps/s | 54.9699 KOps/s | |
test_items | 22.0100μs | 2.8591μs | 349.7640 KOps/s | 348.8000 KOps/s | |
test_items_nested | 0.3748ms | 0.3464ms | 2.8866 KOps/s | 2.8758 KOps/s | |
test_items_nested_locked | 0.4044ms | 0.3479ms | 2.8741 KOps/s | 2.9106 KOps/s | |
test_items_nested_leaf | 0.1107ms | 69.1224μs | 14.4671 KOps/s | 14.4991 KOps/s | |
test_items_stack_nested | 0.4012ms | 0.3474ms | 2.8786 KOps/s | 2.8888 KOps/s | |
test_items_stack_nested_leaf | 0.1071ms | 70.1134μs | 14.2626 KOps/s | 14.1808 KOps/s | |
test_items_stack_nested_locked | 0.3930ms | 0.3497ms | 2.8593 KOps/s | 2.8835 KOps/s | |
test_keys | 43.1700μs | 3.4201μs | 292.3907 KOps/s | 294.3286 KOps/s | |
test_keys_nested | 96.4410μs | 70.9389μs | 14.0966 KOps/s | 14.0830 KOps/s | |
test_keys_nested_locked | 2.5071ms | 76.1701μs | 13.1285 KOps/s | 13.0032 KOps/s | |
test_keys_nested_leaf | 92.8710μs | 61.1906μs | 16.3424 KOps/s | 16.3798 KOps/s | |
test_keys_stack_nested | 97.1720μs | 70.9975μs | 14.0850 KOps/s | 14.0844 KOps/s | |
test_keys_stack_nested_leaf | 93.8720μs | 62.3452μs | 16.0397 KOps/s | 15.7732 KOps/s | |
test_keys_stack_nested_locked | 0.1061ms | 76.2984μs | 13.1064 KOps/s | 12.9315 KOps/s | |
test_values | 6.2502μs | 0.8422μs | 1.1874 MOps/s | 1.1914 MOps/s | |
test_values_nested | 81.8410μs | 48.5034μs | 20.6171 KOps/s | 20.4728 KOps/s | |
test_values_nested_locked | 76.2710μs | 50.2826μs | 19.8876 KOps/s | 19.8357 KOps/s | |
test_values_nested_leaf | 69.5210μs | 42.6050μs | 23.4714 KOps/s | 23.3568 KOps/s | |
test_values_stack_nested | 77.8110μs | 50.4327μs | 19.8284 KOps/s | 19.6929 KOps/s | |
test_values_stack_nested_leaf | 77.7810μs | 43.3359μs | 23.0755 KOps/s | 22.6558 KOps/s | |
test_values_stack_nested_locked | 79.6810μs | 51.0940μs | 19.5718 KOps/s | 19.1012 KOps/s | |
test_membership | 1.8295μs | 0.4979μs | 2.0085 MOps/s | 1.9956 MOps/s | |
test_membership_nested | 13.0400μs | 1.8358μs | 544.7291 KOps/s | 543.1131 KOps/s | |
test_membership_nested_leaf | 11.2533μs | 1.8094μs | 552.6595 KOps/s | 554.2519 KOps/s | |
test_membership_stacked_nested | 23.8700μs | 1.8810μs | 531.6355 KOps/s | 537.0805 KOps/s | |
test_membership_stacked_nested_leaf | 22.7410μs | 1.9252μs | 519.4236 KOps/s | 535.7859 KOps/s | |
test_membership_nested_last | 32.1010μs | 2.9296μs | 341.3405 KOps/s | 345.5148 KOps/s | |
test_membership_nested_leaf_last | 32.4210μs | 2.9595μs | 337.8955 KOps/s | 347.3320 KOps/s | |
test_membership_stacked_nested_last | 37.9900μs | 5.6149μs | 178.0970 KOps/s | 288.9760 KOps/s | |
test_membership_stacked_nested_leaf_last | 33.9900μs | 5.5775μs | 179.2902 KOps/s | 286.8494 KOps/s | |
test_nested_getleaf | 32.5900μs | 6.0451μs | 165.4239 KOps/s | 165.6837 KOps/s | |
test_nested_get | 36.7210μs | 5.7204μs | 174.8142 KOps/s | 176.5633 KOps/s | |
test_stacked_getleaf | 37.3300μs | 5.9741μs | 167.3898 KOps/s | 165.9591 KOps/s | |
test_stacked_get | 31.8200μs | 5.6263μs | 177.7378 KOps/s | 177.8625 KOps/s | |
test_nested_getitemleaf | 29.4100μs | 6.1413μs | 162.8319 KOps/s | 162.6734 KOps/s | |
test_nested_getitem | 30.8210μs | 5.7666μs | 173.4115 KOps/s | 175.4934 KOps/s | |
test_stacked_getitemleaf | 36.0610μs | 6.0703μs | 164.7370 KOps/s | 164.3580 KOps/s | |
test_stacked_getitem | 39.4300μs | 5.6938μs | 175.6299 KOps/s | 176.2203 KOps/s | |
test_lock_nested | 4.8927ms | 0.4302ms | 2.3245 KOps/s | 2.3745 KOps/s | |
test_lock_stack_nested | 0.4180ms | 0.3802ms | 2.6301 KOps/s | 2.6640 KOps/s | |
test_unlock_nested | 0.7546ms | 0.3631ms | 2.7542 KOps/s | 2.8192 KOps/s | |
test_unlock_stack_nested | 0.3559ms | 0.3173ms | 3.1516 KOps/s | 3.1950 KOps/s | |
test_flatten_speed | 0.1166ms | 83.9826μs | 11.9072 KOps/s | 12.0913 KOps/s | |
test_unflatten_speed | 0.3617ms | 0.3245ms | 3.0814 KOps/s | 3.1131 KOps/s | |
test_common_ops | 1.6489ms | 1.3117ms | 762.3637 Ops/s | 778.9759 Ops/s | |
test_creation | 26.4000μs | 1.4647μs | 682.7285 KOps/s | 679.5280 KOps/s | |
test_creation_empty | 42.0700μs | 14.5315μs | 68.8160 KOps/s | 62.5276 KOps/s | |
test_creation_nested_1 | 65.0310μs | 15.9073μs | 62.8642 KOps/s | 55.5439 KOps/s | |
test_creation_nested_2 | 53.9010μs | 18.7307μs | 53.3883 KOps/s | 49.1935 KOps/s | |
test_clone | 71.2410μs | 28.0817μs | 35.6103 KOps/s | 35.1167 KOps/s | |
test_getitem[int] | 92.4959ms | 23.3966μs | 42.7413 KOps/s | 66.7016 KOps/s | |
test_getitem[slice_int] | 0.1183ms | 27.3719μs | 36.5339 KOps/s | 37.7743 KOps/s | |
test_getitem[range] | 0.2251ms | 0.1081ms | 9.2537 KOps/s | 9.3656 KOps/s | |
test_getitem[tuple] | 0.1158ms | 24.0477μs | 41.5840 KOps/s | 43.3538 KOps/s | |
test_getitem[list] | 0.1923ms | 98.2641μs | 10.1767 KOps/s | 10.4652 KOps/s | |
test_setitem_dim[int] | 93.1210μs | 44.3907μs | 22.5272 KOps/s | 22.9853 KOps/s | |
test_setitem_dim[slice_int] | 91.2310μs | 66.6790μs | 14.9972 KOps/s | 15.0369 KOps/s | |
test_setitem_dim[range] | 0.1610ms | 0.1257ms | 7.9556 KOps/s | 8.0026 KOps/s | |
test_setitem_dim[tuple] | 85.5510μs | 60.2998μs | 16.5838 KOps/s | 16.7890 KOps/s | |
test_setitem | 67.1910μs | 40.9077μs | 24.4453 KOps/s | 23.6405 KOps/s | |
test_set | 82.8210μs | 41.3750μs | 24.1692 KOps/s | 22.9953 KOps/s | |
test_set_shared | 0.3515ms | 52.8942μs | 18.9057 KOps/s | 17.7779 KOps/s | |
test_update | 82.6310μs | 49.4037μs | 20.2414 KOps/s | 19.5711 KOps/s | |
test_update_nested | 98.0710μs | 58.3691μs | 17.1323 KOps/s | 16.7788 KOps/s | |
test_update__nested | 96.0610μs | 60.0040μs | 16.6656 KOps/s | 16.3018 KOps/s | |
test_set_nested | 74.6210μs | 42.4665μs | 23.5480 KOps/s | 22.8792 KOps/s | |
test_set_nested_new | 96.2310μs | 46.2171μs | 21.6370 KOps/s | 21.0158 KOps/s | |
test_select | 99.6210μs | 59.5547μs | 16.7913 KOps/s | 16.4820 KOps/s | |
test_select_nested | 0.3466ms | 42.5560μs | 23.4985 KOps/s | 23.6869 KOps/s | |
test_exclude_nested | 0.1069ms | 57.6251μs | 17.3535 KOps/s | 17.5553 KOps/s | |
test_empty[True] | 0.2943ms | 0.2549ms | 3.9239 KOps/s | 3.9121 KOps/s | |
test_empty[False] | 3.2490μs | 0.7363μs | 1.3582 MOps/s | 1.3545 MOps/s | |
test_to | 52.0410μs | 25.9858μs | 38.4826 KOps/s | 37.9993 KOps/s | |
test_to_nonblocking | 57.8810μs | 24.6455μs | 40.5754 KOps/s | 40.3062 KOps/s | |
test_unbind_speed | 0.3140ms | 0.2768ms | 3.6130 KOps/s | 3.6484 KOps/s | |
test_unbind_speed_stack0 | 0.3079ms | 0.2738ms | 3.6527 KOps/s | 3.7524 KOps/s | |
test_unbind_speed_stack1 | 91.9810ms | 0.7024ms | 1.4237 KOps/s | 1.4561 KOps/s | |
test_split | 94.2540ms | 2.1321ms | 469.0250 Ops/s | 475.7444 Ops/s | |
test_chunk | 94.2802ms | 2.1281ms | 469.8962 Ops/s | 475.1637 Ops/s | |
test_creation[device0] | 0.3944ms | 0.1243ms | 8.0435 KOps/s | 8.0346 KOps/s | |
test_creation_from_tensor | 0.3510ms | 0.1296ms | 7.7151 KOps/s | 7.8308 KOps/s | |
test_add_one[memmap_tensor0] | 0.2914ms | 8.3668μs | 119.5200 KOps/s | 118.4392 KOps/s | |
test_contiguous[memmap_tensor0] | 32.7600μs | 2.1168μs | 472.4078 KOps/s | 477.6385 KOps/s | |
test_stack[memmap_tensor0] | 45.0010μs | 6.4359μs | 155.3793 KOps/s | 158.2423 KOps/s | |
test_memmaptd_index | 1.2591ms | 0.4199ms | 2.3814 KOps/s | 2.4214 KOps/s | |
test_memmaptd_index_astensor | 0.9348ms | 0.4886ms | 2.0468 KOps/s | 2.0621 KOps/s | |
test_memmaptd_index_op | 1.4167ms | 0.9911ms | 1.0090 KOps/s | 979.4465 Ops/s | |
test_serialize_model | 0.1314s | 0.1305s | 7.6626 Ops/s | 7.7008 Ops/s | |
test_serialize_model_pickle | 1.3725s | 1.2174s | 0.8215 Ops/s | 0.8237 Ops/s | |
test_serialize_weights | 0.1301s | 0.1295s | 7.7233 Ops/s | 7.6889 Ops/s | |
test_serialize_weights_returnearly | 0.2278s | 56.5210ms | 17.6925 Ops/s | 18.0318 Ops/s | |
test_serialize_weights_pickle | 1.4282s | 1.2258s | 0.8158 Ops/s | 0.8206 Ops/s | |
test_reshape_pytree | 67.8110μs | 34.1943μs | 29.2446 KOps/s | 29.9168 KOps/s | |
test_reshape_td | 81.8010μs | 40.7243μs | 24.5554 KOps/s | 25.2506 KOps/s | |
test_view_pytree | 69.2310μs | 34.7347μs | 28.7897 KOps/s | 29.9431 KOps/s | |
test_view_td | 79.0710μs | 47.4942μs | 21.0552 KOps/s | 22.4955 KOps/s | |
test_unbind_pytree | 66.0510μs | 33.2401μs | 30.0841 KOps/s | 30.9366 KOps/s | |
test_unbind_td | 0.5378ms | 44.3298μs | 22.5582 KOps/s | 24.9962 KOps/s | |
test_split_pytree | 0.5233ms | 45.5658μs | 21.9463 KOps/s | 22.7359 KOps/s | |
test_split_td | 0.1539ms | 55.7483μs | 17.9378 KOps/s | 16.3080 KOps/s | |
test_add_pytree | 92.5310μs | 56.1147μs | 17.8207 KOps/s | 18.6886 KOps/s | |
test_add_td | 0.1555ms | 94.4896μs | 10.5832 KOps/s | 10.9770 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2971ms | 0.1597ms | 6.2633 KOps/s | 6.2578 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2668ms | 0.1673ms | 5.9788 KOps/s | 6.1568 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 1.0143ms | 0.1415ms | 7.0654 KOps/s | 7.1066 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2702ms | 0.1765ms | 5.6659 KOps/s | 5.4382 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1242ms | 20.1079μs | 49.7318 KOps/s | 46.7270 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1405ms | 47.8500μs | 20.8986 KOps/s | 20.3777 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2816ms | 63.5416μs | 15.7377 KOps/s | 15.5139 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1338ms | 49.6463μs | 20.1425 KOps/s | 20.0131 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4202ms | 0.3120ms | 3.2049 KOps/s | 3.2212 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3318ms | 0.2369ms | 4.2206 KOps/s | 4.3532 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2307ms | 0.1246ms | 8.0234 KOps/s | 7.9244 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1637ms | 64.1707μs | 15.5834 KOps/s | 15.1299 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4228ms | 0.3128ms | 3.1974 KOps/s | 3.2165 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6867ms | 0.5936ms | 1.6846 KOps/s | 1.6294 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3840ms | 0.2825ms | 3.5392 KOps/s | 3.6012 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4280ms | 0.3128ms | 3.1968 KOps/s | 3.1821 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1749ms | 74.7114μs | 13.3848 KOps/s | 13.4886 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1829ms | 0.1297ms | 7.7111 KOps/s | 7.8658 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6185ms | 0.5172ms | 1.9336 KOps/s | 1.8771 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4174ms | 0.3121ms | 3.2039 KOps/s | 3.2231 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1194ms | 17.6752μs | 56.5763 KOps/s | 51.5416 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1267ms | 39.2814μs | 25.4573 KOps/s | 23.7342 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1508ms | 70.0508μs | 14.2754 KOps/s | 14.3238 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1301ms | 52.6951μs | 18.9771 KOps/s | 18.9315 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.2898ms | 0.8000ms | 1.2500 KOps/s | 1.1406 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.1964ms | 3.0740ms | 325.3119 Ops/s | 329.2184 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2823ms | 0.7998ms | 1.2502 KOps/s | 1.1566 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.3232ms | 3.1717ms | 315.2858 Ops/s | 318.5201 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1601ms | 0.1090ms | 9.1763 KOps/s | 9.5085 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2338ms | 59.9290μs | 16.6864 KOps/s | 16.7876 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1842ms | 0.1004ms | 9.9649 KOps/s | 9.9820 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1412ms | 40.6674μs | 24.5897 KOps/s | 23.5637 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1848ms | 0.1012ms | 9.8825 KOps/s | 9.8365 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1380ms | 40.2142μs | 24.8669 KOps/s | 23.7332 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1898ms | 0.1332ms | 7.5080 KOps/s | 7.5102 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1541ms | 23.8995μs | 41.8418 KOps/s | 42.3625 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2384ms | 0.1271ms | 7.8681 KOps/s | 7.8521 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1037ms | 19.5834μs | 51.0636 KOps/s | 50.1381 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2113ms | 0.1281ms | 7.8072 KOps/s | 7.8005 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 70.2110μs | 19.5631μs | 51.1167 KOps/s | 51.3059 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2250ms | 0.1345ms | 7.4348 KOps/s | 7.4638 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5175ms | 23.7158μs | 42.1660 KOps/s | 41.6298 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2178ms | 0.1282ms | 7.8025 KOps/s | 7.8209 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1093ms | 23.9501μs | 41.7534 KOps/s | 50.5352 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2246ms | 0.1321ms | 7.5703 KOps/s | 7.8288 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1526ms | 19.6071μs | 51.0020 KOps/s | 40.9872 KOps/s | |
test_mod_add[eager] | 0.1314ms | 30.9777μs | 32.2813 KOps/s | 31.1002 KOps/s | |
test_mod_add[compile] | 0.5138ms | 70.1574μs | 14.2537 KOps/s | 14.3123 KOps/s | |
test_mod_add[compile-overhead] | 0.2619ms | 0.1387ms | 7.2096 KOps/s | 6.8230 KOps/s | |
test_mod_wrap[eager] | 0.9883ms | 0.7984ms | 1.2525 KOps/s | 1.2489 KOps/s | |
test_mod_wrap[compile] | 2.1132ms | 0.8347ms | 1.1980 KOps/s | 1.2081 KOps/s | |
test_mod_wrap[compile-overhead] | 4.8848ms | 3.0787ms | 324.8119 Ops/s | 326.8383 Ops/s | |
test_mod_wrap_and_backward[eager] | 4.6125ms | 4.2018ms | 237.9909 Ops/s | 243.1239 Ops/s | |
test_mod_wrap_and_backward[compile] | 4.5741ms | 4.1308ms | 242.0830 Ops/s | 244.6037 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3789ms | 0.9696ms | 1.0314 KOps/s | 982.0624 Ops/s | |
test_seq_add[eager] | 0.1638ms | 95.4403μs | 10.4777 KOps/s | 10.2845 KOps/s | |
test_seq_add[compile] | 0.2344ms | 79.4675μs | 12.5838 KOps/s | 12.0156 KOps/s | |
test_seq_add[compile-overhead] | 0.4968ms | 0.1113ms | 8.9857 KOps/s | 8.8688 KOps/s | |
test_seq_wrap[eager] | 1.3082ms | 0.9332ms | 1.0716 KOps/s | 1.0838 KOps/s | |
test_seq_wrap[compile] | 1.2349ms | 0.8468ms | 1.1810 KOps/s | 1.1952 KOps/s | |
test_seq_wrap[compile-overhead] | 0.6084ms | 0.2158ms | 4.6341 KOps/s | 4.5977 KOps/s | |
test_func_call_runtime[False-eager] | 2.7724ms | 2.3820ms | 419.8135 Ops/s | 421.5495 Ops/s | |
test_func_call_runtime[False-compile] | 2.7811ms | 2.3999ms | 416.6877 Ops/s | 422.6779 Ops/s | |
test_func_call_runtime[False-compile-overhead] | 0.7556ms | 0.3523ms | 2.8386 KOps/s | 2.8527 KOps/s | |
test_func_call_runtime[True-eager] | 2.7214ms | 2.5486ms | 392.3697 Ops/s | 398.8938 Ops/s | |
test_func_call_runtime[True-compile] | 2.5880ms | 2.4373ms | 410.2950 Ops/s | 418.7598 Ops/s | |
test_func_call_runtime[True-compile-overhead] | 0.4239ms | 0.3724ms | 2.6856 KOps/s | 2.6657 KOps/s | |
test_func_call_cm_runtime[False-eager] | 2.7780ms | 2.3820ms | 419.8170 Ops/s | 422.6378 Ops/s | |
test_func_call_cm_runtime[False-compile] | 2.8126ms | 2.3948ms | 417.5677 Ops/s | 422.0705 Ops/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4084ms | 0.3549ms | 2.8176 KOps/s | 2.8437 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.8499ms | 2.6494ms | 377.4495 Ops/s | 382.7642 Ops/s | |
test_func_call_cm_runtime[True-compile] | 2.6599ms | 2.4766ms | 403.7804 Ops/s | 415.6312 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5101ms | 0.3998ms | 2.5012 KOps/s | 2.4989 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.2191ms | 3.7718ms | 265.1249 Ops/s | 266.8760 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.8436ms | 2.4492ms | 408.2998 Ops/s | 410.8199 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5303ms | 0.4006ms | 2.4961 KOps/s | 2.4762 KOps/s | |
test_distributed | 6.3285ms | 0.2156ms | 4.6385 KOps/s | 8.9053 KOps/s | |
test_tdmodule | 52.0110μs | 14.0984μs | 70.9302 KOps/s | 67.5723 KOps/s | |
test_tdmodule_dispatch | 54.6810μs | 27.5461μs | 36.3028 KOps/s | 34.1814 KOps/s | |
test_tdseq | 35.4800μs | 14.8496μs | 67.3417 KOps/s | 62.9167 KOps/s | |
test_tdseq_dispatch | 52.5510μs | 30.4040μs | 32.8904 KOps/s | 30.9515 KOps/s | |
test_instantiation_functorch | 1.9637ms | 1.8014ms | 555.1144 Ops/s | 555.3556 Ops/s | |
test_instantiation_td | 1.7056ms | 1.1557ms | 865.2400 Ops/s | 852.9471 Ops/s | |
test_exec_functorch | 1.1037ms | 0.9998ms | 1.0002 KOps/s | 1.0108 KOps/s | |
test_exec_functional_call | 1.1243ms | 1.0110ms | 989.1038 Ops/s | 1.0079 KOps/s | |
test_exec_td | 1.1845ms | 1.0397ms | 961.8200 Ops/s | 979.7895 Ops/s | |
test_exec_td_decorator | 1.5996ms | 1.0823ms | 923.9622 Ops/s | 953.7011 Ops/s | |
test_vmap_mlp_speed[True-True] | 2.0782ms | 1.2748ms | 784.4134 Ops/s | 798.1068 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.3599ms | 1.2608ms | 793.1615 Ops/s | 796.1943 Ops/s | |
test_vmap_mlp_speed[False-True] | 1.2389ms | 1.1549ms | 865.9086 Ops/s | 872.0375 Ops/s | |
test_vmap_mlp_speed[False-False] | 1.2371ms | 1.1540ms | 866.5188 Ops/s | 868.9582 Ops/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3729ms | 1.2566ms | 795.8073 Ops/s | 814.7839 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.7842ms | 1.2606ms | 793.3041 Ops/s | 814.5220 Ops/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.5069ms | 1.1828ms | 845.4760 Ops/s | 872.6298 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.3433ms | 1.1813ms | 846.5034 Ops/s | 872.4035 Ops/s | |
test_vmap_transformer_speed[True-True] | 13.4719ms | 13.3353ms | 74.9887 Ops/s | 76.6320 Ops/s | |
test_vmap_transformer_speed[True-False] | 13.3946ms | 13.3071ms | 75.1478 Ops/s | 76.8351 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.2649ms | 13.1291ms | 76.1669 Ops/s | 78.3988 Ops/s | |
test_vmap_transformer_speed[False-False] | 13.2927ms | 13.1431ms | 76.0855 Ops/s | 78.5312 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 34.4970ms | 34.3550ms | 29.1078 Ops/s | 29.9254 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 34.4677ms | 34.3167ms | 29.1403 Ops/s | 29.8389 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 34.3954ms | 34.1311ms | 29.2988 Ops/s | 30.1020 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 34.4002ms | 34.1845ms | 29.2530 Ops/s | 30.1944 Ops/s | |
test_to_module_speed[True] | 1.3399ms | 0.9720ms | 1.0288 KOps/s | 1.0140 KOps/s | |
test_to_module_speed[False] | 1.3750ms | 0.9536ms | 1.0487 KOps/s | 1.0491 KOps/s | |
test_tc_init | 0.1113ms | 31.5894μs | 31.6562 KOps/s | 29.9761 KOps/s | |
test_tc_init_nested | 0.1163ms | 65.4467μs | 15.2796 KOps/s | 14.3754 KOps/s | |
test_tc_first_layer_tensor | 10.8159μs | 0.6655μs | 1.5026 MOps/s | 1.4907 MOps/s | |
test_tc_first_layer_nontensor | 19.1610μs | 2.1957μs | 455.4258 KOps/s | 455.9964 KOps/s | |
test_tc_second_layer_tensor | 20.3903μs | 1.3545μs | 738.2763 KOps/s | 732.2792 KOps/s | |
test_tc_second_layer_nontensor | 78.6910μs | 2.8642μs | 349.1427 KOps/s | 344.6868 KOps/s | |
test_unbind | 0.1945s | 12.1680ms | 82.1829 Ops/s | 91.3035 Ops/s | |
test_full_like | 0.6565ms | 0.5740ms | 1.7423 KOps/s | 1.7497 KOps/s | |
test_zeros_like | 0.2687ms | 0.1979ms | 5.0523 KOps/s | 5.0533 KOps/s | |
test_ones_like | 0.2809ms | 0.1978ms | 5.0567 KOps/s | 5.0545 KOps/s | |
test_clone | 0.4559ms | 0.4137ms | 2.4173 KOps/s | 2.4158 KOps/s | |
test_squeeze | 39.7300μs | 9.4073μs | 106.3005 KOps/s | 108.3317 KOps/s | |
test_unsqueeze | 0.2787ms | 70.1260μs | 14.2600 KOps/s | 13.9495 KOps/s | |
test_split | 0.2548ms | 0.1498ms | 6.6769 KOps/s | 6.6782 KOps/s | |
test_permute | 0.2633ms | 0.1707ms | 5.8570 KOps/s | 5.8643 KOps/s | |
test_stack | 1.2584ms | 0.8481ms | 1.1790 KOps/s | 1.1633 KOps/s | |
test_cat | 1.2551ms | 1.2316ms | 811.9654 Ops/s | 812.0526 Ops/s |
Labels
bug: Something isn't working
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.