[Feature] tensorclass nocast #1079
Merged
vmoens added a commit that referenced this pull request Nov 7, 2024
ghstack-source-id: edaba79a8a3b42cb3dac19b9fc145c1ceca4c70f Pull Request resolved: #1079
facebook-github-bot added the CLA Signed label Nov 7, 2024 (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 35.7370μs | 18.0181μs | 55.4998 KOps/s | 52.4439 KOps/s | |
test_plain_set_stack_nested | 44.1330μs | 18.1879μs | 54.9815 KOps/s | 53.1036 KOps/s | |
test_plain_set_nested_inplace | 77.3650μs | 19.3381μs | 51.7114 KOps/s | 47.7484 KOps/s | |
test_plain_set_stack_nested_inplace | 52.4980μs | 19.5185μs | 51.2335 KOps/s | 47.8676 KOps/s | |
test_items | 31.7500μs | 4.5180μs | 221.3357 KOps/s | 246.7659 KOps/s | |
test_items_nested | 0.4567ms | 0.3480ms | 2.8735 KOps/s | 2.9142 KOps/s | |
test_items_nested_locked | 0.6132ms | 0.3475ms | 2.8777 KOps/s | 2.9105 KOps/s | |
test_items_nested_leaf | 0.1496ms | 72.5862μs | 13.7767 KOps/s | 14.0442 KOps/s | |
test_items_stack_nested | 0.4428ms | 0.3520ms | 2.8406 KOps/s | 2.8728 KOps/s | |
test_items_stack_nested_leaf | 0.1756ms | 75.2691μs | 13.2857 KOps/s | 13.4260 KOps/s | |
test_items_stack_nested_locked | 0.4245ms | 0.3517ms | 2.8432 KOps/s | 2.8986 KOps/s | |
test_keys | 21.2900μs | 3.5411μs | 282.3969 KOps/s | 286.4285 KOps/s | |
test_keys_nested | 0.2270ms | 0.1355ms | 7.3806 KOps/s | 7.3986 KOps/s | |
test_keys_nested_locked | 2.2349ms | 0.1412ms | 7.0832 KOps/s | 7.0934 KOps/s | |
test_keys_nested_leaf | 0.2274ms | 0.1166ms | 8.5791 KOps/s | 8.6259 KOps/s | |
test_keys_stack_nested | 0.2201ms | 0.1393ms | 7.1763 KOps/s | 7.4157 KOps/s | |
test_keys_stack_nested_leaf | 0.2738ms | 0.1187ms | 8.4274 KOps/s | 8.7934 KOps/s | |
test_keys_stack_nested_locked | 0.2359ms | 0.1422ms | 7.0332 KOps/s | 7.2016 KOps/s | |
test_values | 6.3080μs | 1.0707μs | 933.9942 KOps/s | 937.3920 KOps/s | |
test_values_nested | 0.1102ms | 56.5088μs | 17.6964 KOps/s | 17.9312 KOps/s | |
test_values_nested_locked | 0.1029ms | 56.4108μs | 17.7271 KOps/s | 18.0758 KOps/s | |
test_values_nested_leaf | 0.1254ms | 61.0482μs | 16.3805 KOps/s | 15.5969 KOps/s | |
test_values_stack_nested | 0.1056ms | 56.6767μs | 17.6439 KOps/s | 17.6217 KOps/s | |
test_values_stack_nested_leaf | 0.1465ms | 61.8227μs | 16.1753 KOps/s | 16.7037 KOps/s | |
test_values_stack_nested_locked | 0.1087ms | 56.9754μs | 17.5514 KOps/s | 17.4581 KOps/s | |
test_membership | 3.9374μs | 0.7422μs | 1.3473 MOps/s | 1.0910 MOps/s | |
test_membership_nested | 28.8440μs | 2.7177μs | 367.9604 KOps/s | 355.9888 KOps/s | |
test_membership_nested_leaf | 31.8690μs | 2.7036μs | 369.8782 KOps/s | 352.2468 KOps/s | |
test_membership_stacked_nested | 24.2250μs | 2.6992μs | 370.4805 KOps/s | 355.8600 KOps/s | |
test_membership_stacked_nested_leaf | 23.5640μs | 2.7154μs | 368.2653 KOps/s | 357.8738 KOps/s | |
test_membership_nested_last | 32.3610μs | 4.0883μs | 244.5975 KOps/s | 237.3680 KOps/s | |
test_membership_nested_leaf_last | 20.8790μs | 4.0915μs | 244.4117 KOps/s | 239.7337 KOps/s | |
test_membership_stacked_nested_last | 46.3570μs | 4.0609μs | 246.2497 KOps/s | 74.6666 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.8730μs | 4.0481μs | 247.0281 KOps/s | 75.0461 KOps/s | |
test_nested_getleaf | 33.0520μs | 10.7467μs | 93.0516 KOps/s | 92.8077 KOps/s | |
test_nested_get | 36.5790μs | 10.0614μs | 99.3894 KOps/s | 98.1616 KOps/s | |
test_stacked_getleaf | 31.8000μs | 10.8578μs | 92.0998 KOps/s | 94.1373 KOps/s | |
test_stacked_get | 32.3800μs | 10.1540μs | 98.4830 KOps/s | 99.6276 KOps/s | |
test_nested_getitemleaf | 38.3620μs | 10.8422μs | 92.2326 KOps/s | 88.5686 KOps/s | |
test_nested_getitem | 34.5450μs | 10.4400μs | 95.7853 KOps/s | 93.0703 KOps/s | |
test_stacked_getitemleaf | 37.5910μs | 10.9264μs | 91.5211 KOps/s | 90.0914 KOps/s | |
test_stacked_getitem | 29.2150μs | 9.9537μs | 100.4649 KOps/s | 94.9927 KOps/s | |
test_lock_nested | 4.9633ms | 0.4484ms | 2.2301 KOps/s | 2.2478 KOps/s | |
test_lock_stack_nested | 0.6533ms | 0.4128ms | 2.4223 KOps/s | 2.4883 KOps/s | |
test_unlock_nested | 0.6685ms | 0.3588ms | 2.7867 KOps/s | 2.7385 KOps/s | |
test_unlock_stack_nested | 0.5045ms | 0.3298ms | 3.0321 KOps/s | 3.0853 KOps/s | |
test_flatten_speed | 0.2007ms | 93.2183μs | 10.7275 KOps/s | 10.7470 KOps/s | |
test_unflatten_speed | 0.6010ms | 0.4680ms | 2.1368 KOps/s | 2.1287 KOps/s | |
test_common_ops | 2.5378ms | 0.7886ms | 1.2681 KOps/s | 1.2217 KOps/s | |
test_creation | 16.2010μs | 2.0777μs | 481.3021 KOps/s | 481.3593 KOps/s | |
test_creation_empty | 28.8140μs | 10.4359μs | 95.8230 KOps/s | 76.2920 KOps/s | |
test_creation_nested_1 | 53.0290μs | 13.4403μs | 74.4030 KOps/s | 61.2435 KOps/s | |
test_creation_nested_2 | 38.4620μs | 17.4525μs | 57.2984 KOps/s | 49.9053 KOps/s | |
test_clone | 51.0160μs | 13.3779μs | 74.7504 KOps/s | 76.5482 KOps/s | |
test_getitem[int] | 1.5646ms | 12.9895μs | 76.9853 KOps/s | 79.9085 KOps/s | |
test_getitem[slice_int] | 0.1511ms | 24.3357μs | 41.0919 KOps/s | 42.2951 KOps/s | |
test_getitem[range] | 0.1777ms | 47.7756μs | 20.9312 KOps/s | 20.0662 KOps/s | |
test_getitem[tuple] | 0.1361ms | 19.9206μs | 50.1993 KOps/s | 45.6384 KOps/s | |
test_getitem[list] | 0.1993ms | 43.4672μs | 23.0059 KOps/s | 22.3212 KOps/s | |
test_setitem_dim[int] | 56.5660μs | 26.3314μs | 37.9775 KOps/s | 40.9247 KOps/s | |
test_setitem_dim[slice_int] | 92.1430μs | 51.9253μs | 19.2584 KOps/s | 19.8486 KOps/s | |
test_setitem_dim[range] | 0.1191ms | 73.7480μs | 13.5597 KOps/s | 13.1639 KOps/s | |
test_setitem_dim[tuple] | 88.8370μs | 41.8637μs | 23.8870 KOps/s | 25.0037 KOps/s | |
test_setitem | 76.7140μs | 20.8926μs | 47.8639 KOps/s | 46.0013 KOps/s | |
test_set | 69.4010μs | 20.0432μs | 49.8922 KOps/s | 47.6191 KOps/s | |
test_set_shared | 3.4956ms | 0.1677ms | 5.9638 KOps/s | 5.8681 KOps/s | |
test_update | 0.2046ms | 22.6901μs | 44.0721 KOps/s | 39.9354 KOps/s | |
test_update_nested | 83.4770μs | 33.6634μs | 29.7058 KOps/s | 28.4225 KOps/s | |
test_update__nested | 1.0879ms | 33.6165μs | 29.7473 KOps/s | 31.1592 KOps/s | |
test_set_nested | 67.1260μs | 22.4872μs | 44.4698 KOps/s | 42.9145 KOps/s | |
test_set_nested_new | 68.8890μs | 27.6203μs | 36.2053 KOps/s | 36.0056 KOps/s | |
test_select | 0.1002ms | 43.9074μs | 22.7752 KOps/s | 22.9555 KOps/s | |
test_select_nested | 0.1370ms | 59.9419μs | 16.6828 KOps/s | 16.7120 KOps/s | |
test_exclude_nested | 0.3710ms | 75.1667μs | 13.3038 KOps/s | 13.4006 KOps/s | |
test_empty[True] | 0.6592ms | 0.3511ms | 2.8479 KOps/s | 2.8242 KOps/s | |
test_empty[False] | 7.0408μs | 1.2248μs | 816.4775 KOps/s | 817.3841 KOps/s | |
test_unbind_speed | 0.4559ms | 0.2628ms | 3.8058 KOps/s | 3.7455 KOps/s | |
test_unbind_speed_stack0 | 0.5421ms | 0.2597ms | 3.8508 KOps/s | 3.9276 KOps/s | |
test_unbind_speed_stack1 | 0.1033s | 0.7727ms | 1.2942 KOps/s | 1.4623 KOps/s | |
test_split | 2.4262ms | 1.5543ms | 643.3880 Ops/s | 585.2477 Ops/s | |
test_chunk | 0.1029s | 1.8628ms | 536.8221 Ops/s | 576.8334 Ops/s | |
test_consolidate_njt[False-None] | 10.5504ms | 8.3304ms | 120.0429 Ops/s | 119.8952 Ops/s | |
test_creation[device0] | 3.3934ms | 91.8440μs | 10.8880 KOps/s | 10.8246 KOps/s | |
test_creation_from_tensor | 0.2199ms | 92.1050μs | 10.8572 KOps/s | 10.5566 KOps/s | |
test_add_one[memmap_tensor0] | 0.1278ms | 4.8846μs | 204.7261 KOps/s | 196.6347 KOps/s | |
test_contiguous[memmap_tensor0] | 11.2810μs | 0.5308μs | 1.8838 MOps/s | 1.9696 MOps/s | |
test_stack[memmap_tensor0] | 39.9950μs | 3.6128μs | 276.7932 KOps/s | 286.3311 KOps/s | |
test_memmaptd_index | 1.0595ms | 0.2446ms | 4.0884 KOps/s | 4.2273 KOps/s | |
test_memmaptd_index_astensor | 0.5818ms | 0.3215ms | 3.1103 KOps/s | 3.1892 KOps/s | |
test_memmaptd_index_op | 1.3277ms | 0.5959ms | 1.6780 KOps/s | 1.6001 KOps/s | |
test_serialize_model | 0.1233s | 0.1142s | 8.7575 Ops/s | 7.7501 Ops/s | |
test_serialize_model_pickle | 0.4459s | 0.3909s | 2.5582 Ops/s | 2.5628 Ops/s | |
test_serialize_weights | 0.1184s | 0.1127s | 8.8735 Ops/s | 8.7927 Ops/s | |
test_serialize_weights_returnearly | 0.3712s | 0.1838s | 5.4421 Ops/s | 6.3134 Ops/s | |
test_serialize_weights_pickle | 0.5584s | 0.4259s | 2.3477 Ops/s | 2.4156 Ops/s | |
test_serialize_weights_filesystem | 0.1514s | 0.1417s | 7.0547 Ops/s | 7.0884 Ops/s | |
test_serialize_model_filesystem | 0.2475s | 0.1577s | 6.3425 Ops/s | 6.5121 Ops/s | |
test_reshape_pytree | 57.0270μs | 27.7441μs | 36.0438 KOps/s | 37.2882 KOps/s | |
test_reshape_td | 66.2240μs | 33.6085μs | 29.7543 KOps/s | 29.6472 KOps/s | |
test_view_pytree | 69.2100μs | 27.9172μs | 35.8202 KOps/s | 36.7701 KOps/s | |
test_view_td | 79.4390μs | 38.7276μs | 25.8214 KOps/s | 25.8395 KOps/s | |
test_unbind_pytree | 64.9220μs | 30.9231μs | 32.3382 KOps/s | 32.8046 KOps/s | |
test_unbind_td | 0.3695ms | 38.8446μs | 25.7436 KOps/s | 25.2122 KOps/s | |
test_split_pytree | 68.9390μs | 30.4431μs | 32.8481 KOps/s | 33.2388 KOps/s | |
test_split_td | 0.2072ms | 44.1554μs | 22.6473 KOps/s | 22.3341 KOps/s | |
test_add_pytree | 83.9980μs | 36.7528μs | 27.2088 KOps/s | 27.7711 KOps/s | |
test_add_td | 0.1105ms | 57.5790μs | 17.3675 KOps/s | 16.4651 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1167ms | 62.6911μs | 15.9512 KOps/s | 15.7283 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3589ms | 0.1618ms | 6.1798 KOps/s | 6.1963 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1217ms | 45.5715μs | 21.9435 KOps/s | 21.1877 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2368ms | 0.1208ms | 8.2799 KOps/s | 8.4510 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 75.1110μs | 26.4905μs | 37.7493 KOps/s | 38.2204 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1231ms | 53.5416μs | 18.6771 KOps/s | 18.4240 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1539ms | 80.7230μs | 12.3880 KOps/s | 12.3112 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1398ms | 70.0590μs | 14.2737 KOps/s | 14.5081 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1683ms | 0.1043ms | 9.5873 KOps/s | 9.3456 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3914ms | 0.1980ms | 5.0515 KOps/s | 5.0330 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1005ms | 45.1260μs | 22.1602 KOps/s | 21.6256 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4575ms | 63.1559μs | 15.8338 KOps/s | 16.0914 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2370ms | 0.1058ms | 9.4535 KOps/s | 9.6551 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4524ms | 0.2041ms | 4.8988 KOps/s | 4.9792 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4236ms | 0.2108ms | 4.7433 KOps/s | 4.7039 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2052ms | 0.1072ms | 9.3244 KOps/s | 9.4668 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1256ms | 57.4110μs | 17.4183 KOps/s | 18.0221 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 94.6480μs | 45.5939μs | 21.9328 KOps/s | 21.5944 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6358ms | 0.1598ms | 6.2592 KOps/s | 6.2932 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1973ms | 0.1029ms | 9.7166 KOps/s | 9.4741 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 59.4310μs | 21.6171μs | 46.2597 KOps/s | 48.0648 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1080ms | 58.6696μs | 17.0446 KOps/s | 16.8261 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1581ms | 83.2630μs | 12.0101 KOps/s | 12.0159 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1388ms | 71.0704μs | 14.0706 KOps/s | 14.0953 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3291ms | 0.2080ms | 4.8069 KOps/s | 4.7692 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1753ms | 1.3477ms | 741.9890 Ops/s | 746.0653 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4413ms | 0.2025ms | 4.9382 KOps/s | 4.8180 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0158ms | 0.7850ms | 1.2738 KOps/s | 1.2798 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.7601ms | 0.4571ms | 2.1879 KOps/s | 2.2218 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.8468ms | 2.6271ms | 380.6510 Ops/s | 350.1844 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 88.8870μs | 36.0294μs | 27.7551 KOps/s | 27.1504 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5489ms | 31.2384μs | 32.0119 KOps/s | 29.0147 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 75.5620μs | 29.5701μs | 33.8179 KOps/s | 32.9327 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 91.7120μs | 23.3451μs | 42.8356 KOps/s | 42.1537 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 2.2067ms | 30.4486μs | 32.8422 KOps/s | 31.7888 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 77.7860μs | 23.2813μs | 42.9529 KOps/s | 42.1495 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.4310ms | 55.1900μs | 18.1192 KOps/s | 19.4207 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5029ms | 18.8396μs | 53.0796 KOps/s | 49.8230 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 91.1320μs | 44.6250μs | 22.4090 KOps/s | 22.5956 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 49.1530μs | 18.9955μs | 52.6439 KOps/s | 52.7944 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1005ms | 45.5552μs | 21.9514 KOps/s | 21.7221 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 70.8830μs | 19.0254μs | 52.5612 KOps/s | 52.5559 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1208ms | 52.7771μs | 18.9476 KOps/s | 18.8859 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8739ms | 18.8672μs | 53.0021 KOps/s | 50.5193 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.4028ms | 46.0212μs | 21.7291 KOps/s | 21.8814 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 48.4300μs | 19.0080μs | 52.6094 KOps/s | 53.1703 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1201ms | 45.3711μs | 22.0405 KOps/s | 21.2772 KOps/s | |
test_compile_indexing[int-pytree-eager] | 60.1830μs | 19.0096μs | 52.6049 KOps/s | 52.4292 KOps/s | |
test_mod_add[eager] | 76.0930μs | 26.7289μs | 37.4127 KOps/s | 36.0345 KOps/s | |
test_mod_add[compile] | 99.0050μs | 45.6401μs | 21.9106 KOps/s | 21.2927 KOps/s | |
test_mod_add[compile-overhead] | 93.1640μs | 46.0403μs | 21.7201 KOps/s | 21.2051 KOps/s | |
test_mod_wrap[eager] | 0.3709ms | 0.2140ms | 4.6732 KOps/s | 4.5774 KOps/s | |
test_mod_wrap[compile] | 1.6736ms | 0.2013ms | 4.9676 KOps/s | 4.8223 KOps/s | |
test_mod_wrap[compile-overhead] | 1.3655ms | 0.1992ms | 5.0197 KOps/s | 4.8011 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.0992ms | 11.3379ms | 88.1999 Ops/s | 86.1489 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.5619ms | 12.3259ms | 81.1303 Ops/s | 82.3767 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.5416ms | 12.0633ms | 82.8957 Ops/s | 79.2946 Ops/s | |
test_seq_add[eager] | 0.1656ms | 92.9535μs | 10.7581 KOps/s | 10.4442 KOps/s | |
test_seq_add[compile] | 0.1330ms | 61.3547μs | 16.2987 KOps/s | 16.1934 KOps/s | |
test_seq_add[compile-overhead] | 0.1109ms | 58.6007μs | 17.0646 KOps/s | 16.4461 KOps/s | |
test_seq_wrap[eager] | 0.6324ms | 0.3897ms | 2.5663 KOps/s | 2.4438 KOps/s | |
test_seq_wrap[compile] | 0.3130ms | 0.2234ms | 4.4772 KOps/s | 4.3869 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4161ms | 0.2233ms | 4.4789 KOps/s | 4.3871 KOps/s | |
test_func_call_runtime[False-eager] | 0.9098ms | 0.5458ms | 1.8322 KOps/s | 1.8399 KOps/s | |
test_func_call_runtime[False-compile] | 0.8227ms | 0.4225ms | 2.3668 KOps/s | 2.3196 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.9050ms | 0.4243ms | 2.3567 KOps/s | 2.3294 KOps/s | |
test_func_call_runtime[True-eager] | 1.4058ms | 0.7626ms | 1.3113 KOps/s | 1.3280 KOps/s | |
test_func_call_runtime[True-compile] | 0.7734ms | 0.4627ms | 2.1612 KOps/s | 2.1125 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6923ms | 0.4608ms | 2.1702 KOps/s | 2.1107 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8343ms | 0.5409ms | 1.8486 KOps/s | 1.8493 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8923ms | 0.4171ms | 2.3974 KOps/s | 2.3182 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7913ms | 0.4188ms | 2.3876 KOps/s | 2.3157 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0546ms | 0.8886ms | 1.1254 KOps/s | 1.1327 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.4139ms | 0.5047ms | 1.9813 KOps/s | 1.9983 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9966ms | 0.4853ms | 2.0608 KOps/s | 2.0107 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5805ms | 1.8579ms | 538.2392 Ops/s | 521.9543 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8836ms | 0.5148ms | 1.9427 KOps/s | 1.7542 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8680ms | 0.5175ms | 1.9323 KOps/s | 1.8975 KOps/s | |
test_distributed | 0.3428ms | 0.1257ms | 7.9544 KOps/s | 7.6433 KOps/s | |
test_tdmodule | 80.9320μs | 18.7006μs | 53.4741 KOps/s | 48.9255 KOps/s | |
test_tdmodule_dispatch | 66.3150μs | 37.0627μs | 26.9813 KOps/s | 24.8053 KOps/s | |
test_tdseq | 45.2450μs | 21.7565μs | 45.9634 KOps/s | 42.8222 KOps/s | |
test_tdseq_dispatch | 70.1220μs | 42.2478μs | 23.6699 KOps/s | 21.7059 KOps/s | |
test_instantiation_functorch | 2.4428ms | 1.5468ms | 646.5121 Ops/s | 638.1164 Ops/s | |
test_exec_functorch | 0.6924ms | 0.1935ms | 5.1686 KOps/s | 5.5144 KOps/s | |
test_exec_functional_call | 0.4521ms | 0.1788ms | 5.5938 KOps/s | 5.7740 KOps/s | |
test_exec_td_decorator | 0.5067ms | 0.2355ms | 4.2456 KOps/s | 4.3491 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8643ms | 0.6235ms | 1.6037 KOps/s | 1.5516 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9012ms | 0.6246ms | 1.6009 KOps/s | 1.5290 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7005ms | 0.5113ms | 1.9558 KOps/s | 1.8840 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8778ms | 0.5129ms | 1.9498 KOps/s | 1.9004 KOps/s | |
test_to_module_speed[True] | 1.7055ms | 1.2822ms | 779.9359 Ops/s | 767.8285 Ops/s | |
test_to_module_speed[False] | 1.7990ms | 1.2503ms | 799.7960 Ops/s | 788.2808 Ops/s | |
test_tc_init | 85.8920μs | 46.1494μs | 21.6688 KOps/s | 21.3074 KOps/s | |
test_tc_init_nested | 0.1596ms | 92.3282μs | 10.8309 KOps/s | 10.5579 KOps/s | |
test_tc_first_layer_tensor | 22.9030μs | 1.5136μs | 660.6686 KOps/s | 648.0851 KOps/s | |
test_tc_first_layer_nontensor | 21.8210μs | 4.6741μs | 213.9443 KOps/s | 213.9693 KOps/s | |
test_tc_second_layer_tensor | 21.2790μs | 2.8073μs | 356.2084 KOps/s | 356.1363 KOps/s | |
test_tc_second_layer_nontensor | 27.7120μs | 5.9634μs | 167.6908 KOps/s | 166.0582 KOps/s | |
test_unbind | 0.2188s | 12.3631ms | 80.8860 Ops/s | 82.4936 Ops/s | |
test_full_like | 7.6290ms | 6.9082ms | 144.7547 Ops/s | 142.6469 Ops/s | |
test_zeros_like | 3.1740ms | 2.7324ms | 365.9819 Ops/s | 360.7920 Ops/s | |
test_ones_like | 3.5748ms | 3.1192ms | 320.5949 Ops/s | 323.2390 Ops/s | |
test_clone | 5.6219ms | 5.0288ms | 198.8557 Ops/s | 201.1185 Ops/s | |
test_squeeze | 58.9110μs | 12.0821μs | 82.7670 KOps/s | 81.4488 KOps/s | |
test_unsqueeze | 0.2882ms | 89.0995μs | 11.2234 KOps/s | 11.1497 KOps/s | |
test_split | 0.3777ms | 0.1887ms | 5.3004 KOps/s | 5.2787 KOps/s | |
test_permute | 0.2892ms | 0.2118ms | 4.7215 KOps/s | 4.5527 KOps/s | |
test_stack | 27.7589ms | 24.8751ms | 40.2008 Ops/s | 40.1956 Ops/s | |
test_cat | 26.8506ms | 24.6642ms | 40.5446 Ops/s | 40.6990 Ops/s |
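
The Change column in the table above is blank, so the difference between this PR and repo HEAD has to be read off the two Ops columns by hand. A minimal sketch of that arithmetic follows, assuming Change is meant to be the relative difference in throughput (that definition is an assumption, not something the page states). The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
# Multipliers for the unit suffixes that appear in the two Ops columns.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '55.4998 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_on_head: str) -> float:
    """Relative throughput difference of the PR versus repo HEAD.

    Positive means the PR run was faster. This definition of "Change"
    is an assumption; the table itself leaves the column empty.
    """
    return parse_ops(ops) / parse_ops(ops_on_head) - 1.0


# Two rows copied from the table above.
rows = {
    "test_plain_set_nested": ("55.4998 KOps/s", "52.4439 KOps/s"),
    "test_membership_stacked_nested_last": ("246.2497 KOps/s", "74.6666 KOps/s"),
}
for name, (ops, head) in rows.items():
    print(f"{name}: {relative_change(ops, head):+.1%}")
```

Single-run differences of a few percent are usually within noise for microbenchmarks like these, so only large gaps are worth a closer look.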

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 35.0310μs | 10.1768μs | 98.2627 KOps/s | 82.2395 KOps/s | |
test_plain_set_stack_nested | 96.4710μs | 10.2586μs | 97.4796 KOps/s | 82.4650 KOps/s | |
test_plain_set_nested_inplace | 39.4000μs | 11.0025μs | 90.8886 KOps/s | 76.9744 KOps/s | |
test_plain_set_stack_nested_inplace | 38.0210μs | 11.0029μs | 90.8855 KOps/s | 76.2296 KOps/s | |
test_items | 39.9510μs | 2.9485μs | 339.1551 KOps/s | 340.9414 KOps/s | |
test_items_nested | 0.3940ms | 0.3255ms | 3.0721 KOps/s | 3.0274 KOps/s | |
test_items_nested_locked | 0.3822ms | 0.3285ms | 3.0444 KOps/s | 3.0071 KOps/s | |
test_items_nested_leaf | 89.3720μs | 59.0285μs | 16.9410 KOps/s | 16.8641 KOps/s | |
test_items_stack_nested | 0.3722ms | 0.3277ms | 3.0517 KOps/s | 3.0569 KOps/s | |
test_items_stack_nested_leaf | 87.5920μs | 60.1244μs | 16.6322 KOps/s | 16.9709 KOps/s | |
test_items_stack_nested_locked | 0.3858ms | 0.3304ms | 3.0271 KOps/s | 3.0534 KOps/s | |
test_keys | 26.2910μs | 3.5471μs | 281.9175 KOps/s | 287.9638 KOps/s | |
test_keys_nested | 96.8120μs | 69.5183μs | 14.3847 KOps/s | 14.2009 KOps/s | |
test_keys_nested_locked | 0.7125ms | 75.1058μs | 13.3146 KOps/s | 13.1526 KOps/s | |
test_keys_nested_leaf | 94.1420μs | 60.9425μs | 16.4089 KOps/s | 16.2243 KOps/s | |
test_keys_stack_nested | 0.1083ms | 69.7572μs | 14.3354 KOps/s | 14.1347 KOps/s | |
test_keys_stack_nested_leaf | 0.1444ms | 60.7577μs | 16.4588 KOps/s | 16.2880 KOps/s | |
test_keys_stack_nested_locked | 0.1290ms | 75.4536μs | 13.2532 KOps/s | 13.1615 KOps/s | |
test_values | 5.9168μs | 0.8585μs | 1.1648 MOps/s | 1.1518 MOps/s | |
test_values_nested | 59.5110μs | 31.7554μs | 31.4907 KOps/s | 31.6818 KOps/s | |
test_values_nested_locked | 61.7210μs | 33.0517μs | 30.2556 KOps/s | 30.1854 KOps/s | |
test_values_nested_leaf | 61.8210μs | 33.9953μs | 29.4158 KOps/s | 29.6605 KOps/s | |
test_values_stack_nested | 63.5420μs | 31.9471μs | 31.3017 KOps/s | 31.4970 KOps/s | |
test_values_stack_nested_leaf | 73.5820μs | 34.2408μs | 29.2050 KOps/s | 29.5871 KOps/s | |
test_values_stack_nested_locked | 58.4110μs | 33.4687μs | 29.8787 KOps/s | 30.0363 KOps/s | |
test_membership | 2.3406μs | 0.5254μs | 1.9035 MOps/s | 1.8941 MOps/s | |
test_membership_nested | 41.0310μs | 2.0192μs | 495.2451 KOps/s | 506.5130 KOps/s | |
test_membership_nested_leaf | 22.8705μs | 2.0006μs | 499.8462 KOps/s | 501.6784 KOps/s | |
test_membership_stacked_nested | 28.7210μs | 2.0591μs | 485.6496 KOps/s | 488.3188 KOps/s | |
test_membership_stacked_nested_leaf | 37.8210μs | 2.0712μs | 482.8124 KOps/s | 489.3576 KOps/s | |
test_membership_nested_last | 35.9010μs | 2.9153μs | 343.0234 KOps/s | 344.3796 KOps/s | |
test_membership_nested_leaf_last | 24.8110μs | 2.9394μs | 340.2035 KOps/s | 347.0322 KOps/s | |
test_membership_stacked_nested_last | 31.0300μs | 4.6451μs | 215.2824 KOps/s | 343.2127 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.0210μs | 4.6600μs | 214.5925 KOps/s | 347.6389 KOps/s | |
test_nested_getleaf | 38.1910μs | 6.1011μs | 163.9049 KOps/s | 162.7249 KOps/s | |
test_nested_get | 44.7610μs | 5.7404μs | 174.2033 KOps/s | 171.7809 KOps/s | |
test_stacked_getleaf | 28.6710μs | 6.0731μs | 164.6608 KOps/s | 162.9629 KOps/s | |
test_stacked_get | 37.2910μs | 5.7659μs | 173.4324 KOps/s | 172.1317 KOps/s | |
test_nested_getitemleaf | 26.7810μs | 6.1545μs | 162.4840 KOps/s | 162.2781 KOps/s | |
test_nested_getitem | 32.9210μs | 5.8659μs | 170.4760 KOps/s | 171.0022 KOps/s | |
test_stacked_getitemleaf | 49.3410μs | 6.1309μs | 163.1069 KOps/s | 161.5343 KOps/s | |
test_stacked_getitem | 37.4510μs | 5.8210μs | 171.7929 KOps/s | 169.9485 KOps/s | |
test_lock_nested | 0.7771ms | 0.3776ms | 2.6486 KOps/s | 2.5840 KOps/s | |
test_lock_stack_nested | 0.4044ms | 0.3411ms | 2.9319 KOps/s | 2.8555 KOps/s | |
test_unlock_nested | 0.6257ms | 0.3170ms | 3.1550 KOps/s | 3.1356 KOps/s | |
test_unlock_stack_nested | 0.3177ms | 0.2795ms | 3.5780 KOps/s | 3.4697 KOps/s | |
test_flatten_speed | 0.1170ms | 73.2629μs | 13.6495 KOps/s | 13.6405 KOps/s | |
test_unflatten_speed | 0.3556ms | 0.2974ms | 3.3621 KOps/s | 3.3733 KOps/s | |
test_common_ops | 91.5859ms | 0.6449ms | 1.5506 KOps/s | 1.5598 KOps/s | |
test_creation | 0.1710ms | 1.4766μs | 677.2189 KOps/s | 667.4650 KOps/s | |
test_creation_empty | 29.2800μs | 6.3671μs | 157.0585 KOps/s | 98.4644 KOps/s | |
test_creation_nested_1 | 45.3510μs | 7.9159μs | 126.3281 KOps/s | 84.3819 KOps/s | |
test_creation_nested_2 | 36.5210μs | 10.3950μs | 96.2001 KOps/s | 69.5087 KOps/s | |
test_clone | 76.9610μs | 10.7934μs | 92.6489 KOps/s | 94.4487 KOps/s | |
test_getitem[int] | 1.4839ms | 11.1269μs | 89.8726 KOps/s | 90.7909 KOps/s | |
test_getitem[slice_int] | 0.1111ms | 21.9621μs | 45.5330 KOps/s | 44.5709 KOps/s | |
test_getitem[range] | 0.1330ms | 38.3623μs | 26.0672 KOps/s | 25.8478 KOps/s | |
test_getitem[tuple] | 0.1066ms | 18.4794μs | 54.1144 KOps/s | 53.2023 KOps/s | |
test_getitem[list] | 0.2385ms | 33.8670μs | 29.5272 KOps/s | 28.4877 KOps/s | |
test_setitem_dim[int] | 45.0210μs | 19.5314μs | 51.1996 KOps/s | 49.9684 KOps/s | |
test_setitem_dim[slice_int] | 56.4610μs | 39.1556μs | 25.5391 KOps/s | 24.7288 KOps/s | |
test_setitem_dim[range] | 80.3710μs | 53.7859μs | 18.5922 KOps/s | 18.4789 KOps/s | |
test_setitem_dim[tuple] | 56.0310μs | 32.4976μs | 30.7715 KOps/s | 29.4109 KOps/s | |
test_setitem | 77.0310μs | 14.5284μs | 68.8306 KOps/s | 59.9399 KOps/s | |
test_set | 75.9810μs | 14.0851μs | 70.9969 KOps/s | 62.1010 KOps/s | |
test_set_shared | 1.4695ms | 0.1499ms | 6.6693 KOps/s | 6.6685 KOps/s | |
test_update | 0.4118ms | 15.9398μs | 62.7361 KOps/s | 49.8767 KOps/s | |
test_update_nested | 77.2610μs | 21.0244μs | 47.5637 KOps/s | 39.3033 KOps/s | |
test_update__nested | 1.1231ms | 25.5784μs | 39.0955 KOps/s | 39.8933 KOps/s | |
test_set_nested | 74.9420μs | 15.3244μs | 65.2556 KOps/s | 56.4335 KOps/s | |
test_set_nested_new | 77.3010μs | 17.5185μs | 57.0825 KOps/s | 50.5599 KOps/s | |
test_select | 89.6420μs | 29.3698μs | 34.0486 KOps/s | 31.4129 KOps/s | |
test_select_nested | 82.4420μs | 43.2206μs | 23.1371 KOps/s | 23.1033 KOps/s | |
test_exclude_nested | 0.3945ms | 60.5685μs | 16.5102 KOps/s | 16.3368 KOps/s | |
test_empty[True] | 0.3092ms | 0.2586ms | 3.8676 KOps/s | 3.8139 KOps/s | |
test_empty[False] | 5.2931μs | 0.7779μs | 1.2854 MOps/s | 1.2801 MOps/s | |
test_to | 85.8410μs | 54.7260μs | 18.2728 KOps/s | 17.6390 KOps/s | |
test_to_nonblocking | 0.1009ms | 47.8275μs | 20.9085 KOps/s | 20.9324 KOps/s | |
test_unbind_speed | 1.7604ms | 0.2409ms | 4.1503 KOps/s | 4.0957 KOps/s | |
test_unbind_speed_stack0 | 0.2957ms | 0.2359ms | 4.2397 KOps/s | 4.1663 KOps/s | |
test_unbind_speed_stack1 | 91.3214ms | 0.6528ms | 1.5319 KOps/s | 1.6392 KOps/s | |
test_split | 92.9293ms | 1.7033ms | 587.0883 Ops/s | 589.8618 Ops/s | |
test_chunk | 94.0945ms | 1.6954ms | 589.8378 Ops/s | 586.0991 Ops/s | |
test_consolidate[False-None] | 95.8091ms | 2.9185ms | 342.6473 Ops/s | 345.0007 Ops/s | |
test_consolidate[default-None] | 1.8840ms | 1.7165ms | 582.5863 Ops/s | 601.7089 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8960ms | 1.7502ms | 571.3515 Ops/s | 584.7536 Ops/s | |
test_consolidate_njt[False-None] | 7.2592ms | 6.8223ms | 146.5791 Ops/s | 148.3299 Ops/s | |
test_to[False-False-None] | 1.8122ms | 1.7259ms | 579.3928 Ops/s | 573.1748 Ops/s | |
test_to[True-False-None] | 1.4986ms | 1.3874ms | 720.7515 Ops/s | 735.1736 Ops/s | |
test_to[within-False-None] | 4.4479ms | 4.2211ms | 236.9027 Ops/s | 241.4695 Ops/s | |
test_to[True-default-None] | 5.4815ms | 5.3087ms | 188.3684 Ops/s | 185.4414 Ops/s | |
test_to_njt[False-False-None] | 7.1902ms | 7.0945ms | 140.9540 Ops/s | 137.2819 Ops/s | |
test_to_njt[True-False-None] | 5.8558ms | 5.7297ms | 174.5285 Ops/s | 166.6167 Ops/s | |
test_to_njt[within-False-None] | 12.7574ms | 12.5823ms | 79.4766 Ops/s | 55.3953 Ops/s | |
test_creation[device0] | 0.5490ms | 80.9387μs | 12.3550 KOps/s | 11.9120 KOps/s | |
test_creation_from_tensor | 0.6139ms | 84.5892μs | 11.8218 KOps/s | 11.3050 KOps/s | |
test_add_one[memmap_tensor0] | 0.3788ms | 7.0275μs | 142.2990 KOps/s | 147.3025 KOps/s | |
test_contiguous[memmap_tensor0] | 1.9650μs | 0.4327μs | 2.3108 MOps/s | 2.2607 MOps/s | |
test_stack[memmap_tensor0] | 41.5700μs | 4.7362μs | 211.1419 KOps/s | 215.9946 KOps/s | |
test_memmaptd_index | 1.9945ms | 0.2605ms | 3.8392 KOps/s | 3.8493 KOps/s | |
test_memmaptd_index_astensor | 0.5760ms | 0.3212ms | 3.1132 KOps/s | 3.1546 KOps/s | |
test_memmaptd_index_op | 0.9951ms | 0.5846ms | 1.7106 KOps/s | 1.5513 KOps/s | |
test_serialize_model | 0.1322s | 0.1310s | 7.6365 Ops/s | 7.6184 Ops/s | |
test_serialize_model_pickle | 1.3642s | 1.2192s | 0.8202 Ops/s | 0.8435 Ops/s | |
test_serialize_weights | 0.1305s | 0.1299s | 7.6992 Ops/s | 7.6227 Ops/s | |
test_serialize_weights_returnearly | 0.3808s | 56.5784ms | 17.6746 Ops/s | 23.4963 Ops/s | |
test_serialize_weights_pickle | 1.3484s | 1.2115s | 0.8254 Ops/s | 0.8223 Ops/s | |
test_reshape_pytree | 55.3810μs | 23.8292μs | 41.9654 KOps/s | 43.7552 KOps/s | |
test_reshape_td | 58.5310μs | 27.6495μs | 36.1670 KOps/s | 36.5508 KOps/s | |
test_view_pytree | 64.6410μs | 23.7894μs | 42.0355 KOps/s | 43.6796 KOps/s | |
test_view_td | 58.3510μs | 30.6823μs | 32.5921 KOps/s | 33.5884 KOps/s | |
test_unbind_pytree | 61.0220μs | 29.3364μs | 34.0874 KOps/s | 34.9760 KOps/s | |
test_unbind_td | 0.6365ms | 36.5960μs | 27.3254 KOps/s | 26.6769 KOps/s | |
test_split_pytree | 69.7220μs | 31.9858μs | 31.2639 KOps/s | 31.8591 KOps/s | |
test_split_td | 0.7549ms | 41.3214μs | 24.2005 KOps/s | 23.4899 KOps/s | |
test_add_pytree | 62.3110μs | 35.1413μs | 28.4566 KOps/s | 28.5887 KOps/s | |
test_add_td | 80.2220μs | 45.6729μs | 21.8948 KOps/s | 19.4589 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1708ms | 0.1238ms | 8.0756 KOps/s | 8.0449 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2263ms | 0.1280ms | 7.8134 KOps/s | 7.5251 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1432ms | 0.1001ms | 9.9910 KOps/s | 9.8305 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.5173ms | 0.1570ms | 6.3679 KOps/s | 6.4103 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 58.6910μs | 22.7975μs | 43.8644 KOps/s | 42.6051 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 67.4320μs | 27.2609μs | 36.6826 KOps/s | 35.7335 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2249ms | 65.2031μs | 15.3367 KOps/s | 15.2446 KOps/s | |
test_compile_copy_nested[pytree-eager] | 98.1420μs | 50.0983μs | 19.9607 KOps/s | 19.8803 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1972ms | 0.1445ms | 6.9207 KOps/s | 6.9496 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2920ms | 0.2118ms | 4.7209 KOps/s | 4.7715 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1628ms | 0.1003ms | 9.9749 KOps/s | 10.1325 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1107ms | 53.6690μs | 18.6327 KOps/s | 18.8449 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1952ms | 0.1512ms | 6.6149 KOps/s | 6.9073 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5981ms | 0.5141ms | 1.9450 KOps/s | 1.9699 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3511ms | 0.2488ms | 4.0193 KOps/s | 3.9901 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2219ms | 0.1473ms | 6.7893 KOps/s | 6.6924 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1462ms | 62.9927μs | 15.8748 KOps/s | 15.5044 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1636ms | 0.1044ms | 9.5789 KOps/s | 9.7961 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5073ms | 0.4181ms | 2.3919 KOps/s | 2.3978 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1850ms | 0.1451ms | 6.8895 KOps/s | 7.0175 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 66.0110μs | 19.5632μs | 51.1163 KOps/s | 42.5040 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 59.8810μs | 27.4902μs | 36.3766 KOps/s | 36.6348 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1699ms | 70.1934μs | 14.2463 KOps/s | 14.2942 KOps/s | |
test_compile_copy_flat[pytree-eager] | 83.2720μs | 52.2484μs | 19.1393 KOps/s | 19.2605 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6704ms | 0.4557ms | 2.1943 KOps/s | 2.1833 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7729ms | 2.6753ms | 373.7913 Ops/s | 372.3847 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6442ms | 0.4420ms | 2.2624 KOps/s | 2.2310 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8987ms | 2.7779ms | 359.9804 Ops/s | 358.5840 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1985ms | 0.1222ms | 8.1821 KOps/s | 8.5858 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5592ms | 84.7839μs | 11.7947 KOps/s | 12.1795 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1708ms | 0.1143ms | 8.7467 KOps/s | 9.1977 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1540ms | 73.4673μs | 13.6115 KOps/s | 13.4155 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1771ms | 0.1093ms | 9.1481 KOps/s | 8.6147 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1412ms | 72.6122μs | 13.7718 KOps/s | 13.5268 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1567ms | 0.1038ms | 9.6307 KOps/s | 9.6758 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1444ms | 18.3245μs | 54.5719 KOps/s | 53.2916 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1500ms | 0.1003ms | 9.9657 KOps/s | 10.0149 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 56.8410μs | 17.7205μs | 56.4318 KOps/s | 57.9026 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2226ms | 0.1050ms | 9.5213 KOps/s | 9.9814 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 46.9710μs | 17.6228μs | 56.7446 KOps/s | 57.4485 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1657ms | 0.1085ms | 9.2194 KOps/s | 9.6135 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5431ms | 18.2247μs | 54.8707 KOps/s | 54.2229 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1580ms | 0.1052ms | 9.5037 KOps/s | 10.0208 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 47.4410μs | 17.8516μs | 56.0175 KOps/s | 58.1033 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2066ms | 0.1008ms | 9.9187 KOps/s | 10.0110 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.8310μs | 17.7363μs | 56.3816 KOps/s | 58.3072 KOps/s | |
test_mod_add[eager] | 71.5310μs | 30.5685μs | 32.7134 KOps/s | 29.0886 KOps/s | |
test_mod_add[compile] | 0.1403ms | 82.1176μs | 12.1777 KOps/s | 12.5724 KOps/s | |
test_mod_add[compile-overhead] | 0.3129ms | 0.1655ms | 6.0429 KOps/s | 5.8337 KOps/s | |
test_mod_wrap[eager] | 0.3579ms | 0.2493ms | 4.0110 KOps/s | 3.8367 KOps/s | |
test_mod_wrap[compile] | 1.6258ms | 0.3040ms | 3.2893 KOps/s | 3.4525 KOps/s | |
test_mod_wrap[compile-overhead] | 7.8212ms | 4.0297ms | 248.1548 Ops/s | 244.9692 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7882ms | 1.4625ms | 683.7544 Ops/s | 673.8177 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5790ms | 1.3883ms | 720.3134 Ops/s | 718.4868 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.5047ms | 1.0378ms | 963.5707 Ops/s | 964.2210 Ops/s | |
test_seq_add[eager] | 0.2622ms | 98.5454μs | 10.1476 KOps/s | 9.8234 KOps/s | |
test_seq_add[compile] | 0.1669ms | 88.6702μs | 11.2777 KOps/s | 11.1171 KOps/s | |
test_seq_add[compile-overhead] | 0.1995ms | 0.1291ms | 7.7445 KOps/s | 7.6782 KOps/s | |
test_seq_wrap[eager] | 0.4799ms | 0.3823ms | 2.6160 KOps/s | 2.3620 KOps/s | |
test_seq_wrap[compile] | 0.3641ms | 0.3055ms | 3.2734 KOps/s | 3.2493 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2794ms | 0.2225ms | 4.4947 KOps/s | 4.4709 KOps/s | |
test_func_call_runtime[False-eager] | 0.8943ms | 0.7747ms | 1.2909 KOps/s | 1.2780 KOps/s | |
test_func_call_runtime[False-compile] | 1.0675ms | 0.7644ms | 1.3082 KOps/s | 1.3091 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4494ms | 0.3683ms | 2.7152 KOps/s | 2.7596 KOps/s | |
test_func_call_runtime[True-eager] | 1.0802ms | 0.9650ms | 1.0363 KOps/s | 1.0571 KOps/s | |
test_func_call_runtime[True-compile] | 0.8497ms | 0.7851ms | 1.2738 KOps/s | 1.2792 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4591ms | 0.3811ms | 2.6240 KOps/s | 2.6188 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9216ms | 0.7950ms | 1.2578 KOps/s | 1.2869 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9232ms | 0.7655ms | 1.3064 KOps/s | 1.3086 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4249ms | 0.3640ms | 2.7471 KOps/s | 2.7432 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1345ms | 1.0305ms | 970.3570 Ops/s | 955.8643 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1008ms | 0.8287ms | 1.2068 KOps/s | 1.2375 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5113ms | 0.4091ms | 2.4447 KOps/s | 2.4428 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5245ms | 2.0736ms | 482.2443 Ops/s | 472.9845 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8923ms | 0.8221ms | 1.2164 KOps/s | 1.1985 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4818ms | 0.4122ms | 2.4260 KOps/s | 2.4188 KOps/s | |
test_distributed | 1.8840ms | 0.1726ms | 5.7932 KOps/s | 8.5428 KOps/s | |
test_tdmodule | 19.8410μs | 12.6664μs | 78.9490 KOps/s | 60.0057 KOps/s | |
test_tdmodule_dispatch | 0.5671ms | 25.4580μs | 39.2804 KOps/s | 33.1475 KOps/s | |
test_tdseq | 34.4000μs | 14.4061μs | 69.4151 KOps/s | 59.8132 KOps/s | |
test_tdseq_dispatch | 47.3900μs | 28.0356μs | 35.6689 KOps/s | 29.7759 KOps/s | |
test_instantiation_functorch | 1.7156ms | 1.5878ms | 629.7985 Ops/s | 622.2150 Ops/s | |
test_exec_functorch | 0.1992ms | 0.1531ms | 6.5322 KOps/s | 6.5260 KOps/s | |
test_exec_functional_call | 0.2884ms | 0.1465ms | 6.8282 KOps/s | 6.7970 KOps/s | |
test_exec_td_decorator | 0.3703ms | 0.1912ms | 5.2297 KOps/s | 5.1960 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8194ms | 0.6710ms | 1.4902 KOps/s | 1.4712 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8247ms | 0.6694ms | 1.4939 KOps/s | 1.4698 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6946ms | 0.5939ms | 1.6839 KOps/s | 1.6813 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7053ms | 0.5897ms | 1.6958 KOps/s | 1.6833 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0912ms | 19.4778ms | 51.3404 Ops/s | 51.4546 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.0861ms | 19.3992ms | 51.5486 Ops/s | 51.3207 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.8472ms | 19.2748ms | 51.8812 Ops/s | 51.9377 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3483ms | 19.2320ms | 51.9968 Ops/s | 51.8194 Ops/s | |
test_to_module_speed[True] | 1.0529ms | 0.9492ms | 1.0535 KOps/s | 1.0479 KOps/s | |
test_to_module_speed[False] | 1.3328ms | 0.9349ms | 1.0697 KOps/s | 1.0705 KOps/s | |
test_tc_init | 66.6910μs | 33.1237μs | 30.1899 KOps/s | 25.5730 KOps/s | |
test_tc_init_nested | 0.1558ms | 68.0515μs | 14.6948 KOps/s | 12.6578 KOps/s | |
test_tc_first_layer_tensor | 6.9416μs | 0.6988μs | 1.4311 MOps/s | 1.3771 MOps/s | |
test_tc_first_layer_nontensor | 26.5810μs | 2.3378μs | 427.7589 KOps/s | 431.1265 KOps/s | |
test_tc_second_layer_tensor | 16.7002μs | 1.4291μs | 699.7492 KOps/s | 691.9296 KOps/s | |
test_tc_second_layer_nontensor | 27.4110μs | 3.0725μs | 325.4707 KOps/s | 325.8801 KOps/s | |
test_unbind | 0.2372s | 10.1452ms | 98.5684 Ops/s | 149.2625 Ops/s | |
test_full_like | 9.6740ms | 9.0997ms | 109.8940 Ops/s | 105.6444 Ops/s | |
test_zeros_like | 9.2343ms | 7.2806ms | 137.3519 Ops/s | 114.3920 Ops/s | |
test_ones_like | 5.2493ms | 4.3184ms | 231.5694 Ops/s | 233.6542 Ops/s | |
test_clone | 6.6839ms | 6.3527ms | 157.4129 Ops/s | 158.2744 Ops/s | |
test_squeeze | 58.1110μs | 9.4484μs | 105.8382 KOps/s | 106.1178 KOps/s | |
test_unsqueeze | 0.1591ms | 74.5539μs | 13.4131 KOps/s | 13.6494 KOps/s | |
test_split | 0.3982ms | 0.1700ms | 5.8810 KOps/s | 5.8153 KOps/s | |
test_permute | 0.2534ms | 0.1920ms | 5.2092 KOps/s | 5.2861 KOps/s | |
test_stack | 50.9252ms | 50.5664ms | 19.7760 Ops/s | 19.7380 Ops/s | |
test_cat | 50.5639ms | 49.9669ms | 20.0133 Ops/s | 19.9372 Ops/s |
This was referenced Nov 7, 2024
vmoens added a commit that referenced this pull request Nov 7, 2024
ghstack-source-id: edaba79a8a3b42cb3dac19b9fc145c1ceca4c70f Pull Request resolved: #1079
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement
New feature or request
Stack from ghstack (oldest at bottom):