[Performance] Faster to #1073
Merged
facebook-github-bot added the CLA Signed label on Nov 5, 2024
(This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request on Nov 5, 2024
ghstack-source-id: 3ff1db59f081b75f24c34c5239f88b5c5de8dbe4 Pull Request resolved: #1073
vmoens added a commit that referenced this pull request on Nov 5, 2024
ghstack-source-id: 63222e1497d3be45d831003c387926fbfaade67d Pull Request resolved: #1073
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.2990μs | 17.8200μs | 56.1167 KOps/s | 56.1628 KOps/s | |
test_plain_set_stack_nested | 47.2490μs | 17.7486μs | 56.3423 KOps/s | 54.9797 KOps/s | |
test_plain_set_nested_inplace | 44.3930μs | 19.4382μs | 51.4450 KOps/s | 49.8002 KOps/s | |
test_plain_set_stack_nested_inplace | 43.5620μs | 19.3934μs | 51.5639 KOps/s | 48.6930 KOps/s | |
test_items | 21.5410μs | 4.1548μs | 240.6866 KOps/s | 239.4114 KOps/s | |
test_items_nested | 0.6309ms | 0.3434ms | 2.9122 KOps/s | 2.9192 KOps/s | |
test_items_nested_locked | 0.4906ms | 0.3432ms | 2.9137 KOps/s | 2.9411 KOps/s | |
test_items_nested_leaf | 0.1324ms | 71.5837μs | 13.9697 KOps/s | 14.0722 KOps/s | |
test_items_stack_nested | 0.6586ms | 0.3457ms | 2.8924 KOps/s | 2.8903 KOps/s | |
test_items_stack_nested_leaf | 0.1340ms | 70.9951μs | 14.0855 KOps/s | 13.6424 KOps/s | |
test_items_stack_nested_locked | 0.7924ms | 0.3536ms | 2.8281 KOps/s | 2.9091 KOps/s | |
test_keys | 44.2130μs | 3.5082μs | 285.0457 KOps/s | 284.7310 KOps/s | |
test_keys_nested | 0.2257ms | 0.1387ms | 7.2081 KOps/s | 7.2309 KOps/s | |
test_keys_nested_locked | 0.9177ms | 0.1419ms | 7.0474 KOps/s | 6.9348 KOps/s | |
test_keys_nested_leaf | 0.2047ms | 0.1193ms | 8.3791 KOps/s | 8.5264 KOps/s | |
test_keys_stack_nested | 0.2469ms | 0.1391ms | 7.1899 KOps/s | 7.3426 KOps/s | |
test_keys_stack_nested_leaf | 0.1776ms | 0.1194ms | 8.3727 KOps/s | 8.5260 KOps/s | |
test_keys_stack_nested_locked | 0.2668ms | 0.1432ms | 6.9852 KOps/s | 7.0605 KOps/s | |
test_values | 7.0212μs | 1.0595μs | 943.8252 KOps/s | 966.4813 KOps/s | |
test_values_nested | 0.1101ms | 55.3433μs | 18.0690 KOps/s | 18.1599 KOps/s | |
test_values_nested_locked | 0.1094ms | 56.2113μs | 17.7900 KOps/s | 18.1499 KOps/s | |
test_values_nested_leaf | 0.1119ms | 60.4521μs | 16.5420 KOps/s | 16.0083 KOps/s | |
test_values_stack_nested | 0.1044ms | 55.3497μs | 18.0669 KOps/s | 17.8017 KOps/s | |
test_values_stack_nested_leaf | 0.1169ms | 60.9613μs | 16.4038 KOps/s | 16.4589 KOps/s | |
test_values_stack_nested_locked | 90.4590μs | 55.5665μs | 17.9965 KOps/s | 17.8069 KOps/s | |
test_membership | 23.7940μs | 0.9062μs | 1.1036 MOps/s | 1.1441 MOps/s | |
test_membership_nested | 26.0890μs | 2.7585μs | 362.5187 KOps/s | 359.7531 KOps/s | |
test_membership_nested_leaf | 25.9780μs | 2.7639μs | 361.8013 KOps/s | 362.4091 KOps/s | |
test_membership_stacked_nested | 17.9840μs | 2.7069μs | 369.4254 KOps/s | 363.5144 KOps/s | |
test_membership_stacked_nested_leaf | 22.3220μs | 2.7457μs | 364.2034 KOps/s | 358.5334 KOps/s | |
test_membership_nested_last | 31.6090μs | 4.1563μs | 240.6006 KOps/s | 239.5194 KOps/s | |
test_membership_nested_leaf_last | 29.5950μs | 4.1653μs | 240.0816 KOps/s | 237.2419 KOps/s | |
test_membership_stacked_nested_last | 29.5750μs | 4.1565μs | 240.5845 KOps/s | 207.1252 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.6730μs | 4.1579μs | 240.5046 KOps/s | 204.6098 KOps/s | |
test_nested_getleaf | 38.4630μs | 10.8353μs | 92.2909 KOps/s | 94.2970 KOps/s | |
test_nested_get | 41.3380μs | 10.1073μs | 98.9384 KOps/s | 98.2556 KOps/s | |
test_stacked_getleaf | 40.2250μs | 10.5488μs | 94.7976 KOps/s | 90.6993 KOps/s | |
test_stacked_get | 35.6270μs | 10.1976μs | 98.0624 KOps/s | 94.9593 KOps/s | |
test_nested_getitemleaf | 36.4080μs | 11.0517μs | 90.4836 KOps/s | 88.4393 KOps/s | |
test_nested_getitem | 35.2160μs | 10.3491μs | 96.6264 KOps/s | 95.7829 KOps/s | |
test_stacked_getitemleaf | 35.1960μs | 11.0328μs | 90.6386 KOps/s | 88.6585 KOps/s | |
test_stacked_getitem | 32.3710μs | 10.2527μs | 97.5355 KOps/s | 96.6931 KOps/s | |
test_lock_nested | 0.8824ms | 0.4344ms | 2.3018 KOps/s | 2.2647 KOps/s | |
test_lock_stack_nested | 0.6297ms | 0.4085ms | 2.4479 KOps/s | 2.4165 KOps/s | |
test_unlock_nested | 0.7405ms | 0.3503ms | 2.8550 KOps/s | 2.7929 KOps/s | |
test_unlock_stack_nested | 0.5797ms | 0.3278ms | 3.0506 KOps/s | 3.0446 KOps/s | |
test_flatten_speed | 0.1709ms | 92.4950μs | 10.8114 KOps/s | 10.7628 KOps/s | |
test_unflatten_speed | 0.6408ms | 0.4780ms | 2.0920 KOps/s | 2.0711 KOps/s | |
test_common_ops | 3.9015ms | 0.7705ms | 1.2979 KOps/s | 1.3131 KOps/s | |
test_creation | 71.2230μs | 2.1863μs | 457.3941 KOps/s | 473.2352 KOps/s | |
test_creation_empty | 32.2800μs | 10.5734μs | 94.5766 KOps/s | 89.4496 KOps/s | |
test_creation_nested_1 | 67.2190μs | 13.2542μs | 75.4477 KOps/s | 73.1560 KOps/s | |
test_creation_nested_2 | 48.4610μs | 17.8656μs | 55.9734 KOps/s | 56.1110 KOps/s | |
test_clone | 0.1328ms | 13.4580μs | 74.3053 KOps/s | 75.2244 KOps/s | |
test_getitem[int] | 1.1134ms | 12.7875μs | 78.2016 KOps/s | 80.0654 KOps/s | |
test_getitem[slice_int] | 0.1420ms | 24.7269μs | 40.4418 KOps/s | 42.0545 KOps/s | |
test_getitem[range] | 0.3793ms | 49.2948μs | 20.2861 KOps/s | 20.7958 KOps/s | |
test_getitem[tuple] | 0.1408ms | 19.8701μs | 50.3269 KOps/s | 51.2595 KOps/s | |
test_getitem[list] | 0.3375ms | 45.2292μs | 22.1096 KOps/s | 23.2676 KOps/s | |
test_setitem_dim[int] | 45.6660μs | 24.8899μs | 40.1769 KOps/s | 38.6312 KOps/s | |
test_setitem_dim[slice_int] | 90.7500μs | 50.1290μs | 19.9485 KOps/s | 19.5636 KOps/s | |
test_setitem_dim[range] | 0.1366ms | 73.8851μs | 13.5345 KOps/s | 13.7992 KOps/s | |
test_setitem_dim[tuple] | 68.5990μs | 39.4453μs | 25.3516 KOps/s | 24.2017 KOps/s | |
test_setitem | 0.1850ms | 20.2834μs | 49.3015 KOps/s | 49.3389 KOps/s | |
test_set | 0.1148ms | 19.6692μs | 50.8409 KOps/s | 50.4272 KOps/s | |
test_set_shared | 1.1244ms | 0.1681ms | 5.9491 KOps/s | 5.9586 KOps/s | |
test_update | 0.1437ms | 22.1836μs | 45.0783 KOps/s | 44.0594 KOps/s | |
test_update_nested | 0.1157ms | 31.7233μs | 31.5225 KOps/s | 30.2316 KOps/s | |
test_update__nested | 0.5760ms | 32.6600μs | 30.6185 KOps/s | 29.8200 KOps/s | |
test_set_nested | 0.1507ms | 21.5254μs | 46.4567 KOps/s | 45.5103 KOps/s | |
test_set_nested_new | 0.1556ms | 26.0534μs | 38.3828 KOps/s | 37.8636 KOps/s | |
test_select | 0.1335ms | 41.8542μs | 23.8925 KOps/s | 23.8011 KOps/s | |
test_select_nested | 0.1205ms | 60.0558μs | 16.6512 KOps/s | 16.9956 KOps/s | |
test_exclude_nested | 0.1479ms | 75.3266μs | 13.2755 KOps/s | 13.2264 KOps/s | |
test_empty[True] | 0.8190ms | 0.3525ms | 2.8371 KOps/s | 2.8344 KOps/s | |
test_empty[False] | 6.1065μs | 1.2100μs | 826.4435 KOps/s | 824.5927 KOps/s | |
test_unbind_speed | 0.4623ms | 0.2582ms | 3.8726 KOps/s | 3.8686 KOps/s | |
test_unbind_speed_stack0 | 0.4364ms | 0.2567ms | 3.8963 KOps/s | 3.8522 KOps/s | |
test_unbind_speed_stack1 | 97.9686ms | 0.7531ms | 1.3278 KOps/s | 1.4432 KOps/s | |
test_split | 0.1018s | 1.7804ms | 561.6864 Ops/s | 578.0106 Ops/s | |
test_chunk | 97.2289ms | 1.7852ms | 560.1719 Ops/s | 578.1846 Ops/s | |
test_consolidate_njt[False-None] | 8.3939ms | 8.0624ms | 124.0319 Ops/s | 122.8318 Ops/s | |
test_creation[device0] | 3.3135ms | 93.2586μs | 10.7229 KOps/s | 10.9147 KOps/s | |
test_creation_from_tensor | 0.2296ms | 93.9300μs | 10.6462 KOps/s | 10.4856 KOps/s | |
test_add_one[memmap_tensor0] | 0.2155ms | 4.9572μs | 201.7251 KOps/s | 202.0930 KOps/s | |
test_contiguous[memmap_tensor0] | 25.9480μs | 0.5035μs | 1.9862 MOps/s | 1.9585 MOps/s | |
test_stack[memmap_tensor0] | 46.8780μs | 3.4381μs | 290.8581 KOps/s | 310.5030 KOps/s | |
test_memmaptd_index | 0.9030ms | 0.2331ms | 4.2899 KOps/s | 4.3050 KOps/s | |
test_memmaptd_index_astensor | 0.6107ms | 0.3116ms | 3.2092 KOps/s | 3.2393 KOps/s | |
test_memmaptd_index_op | 1.0679ms | 0.5825ms | 1.7168 KOps/s | 1.7235 KOps/s | |
test_serialize_model | 0.1310s | 0.1149s | 8.7065 Ops/s | 7.6711 Ops/s | |
test_serialize_model_pickle | 0.4513s | 0.3954s | 2.5293 Ops/s | 2.5263 Ops/s | |
test_serialize_weights | 0.2162s | 0.1303s | 7.6747 Ops/s | 8.7639 Ops/s | |
test_serialize_weights_returnearly | 0.1706s | 0.1560s | 6.4095 Ops/s | 6.3970 Ops/s | |
test_serialize_weights_pickle | 1.0832s | 0.7089s | 1.4106 Ops/s | 2.4931 Ops/s | |
test_serialize_weights_filesystem | 0.1439s | 0.1385s | 7.2184 Ops/s | 6.4490 Ops/s | |
test_serialize_model_filesystem | 0.2406s | 0.1519s | 6.5847 Ops/s | 6.6488 Ops/s | |
test_reshape_pytree | 57.2980μs | 27.3174μs | 36.6067 KOps/s | 36.8492 KOps/s | |
test_reshape_td | 67.5070μs | 32.7840μs | 30.5027 KOps/s | 31.0518 KOps/s | |
test_view_pytree | 62.3160μs | 26.9517μs | 37.1035 KOps/s | 36.5710 KOps/s | |
test_view_td | 98.9360μs | 38.4685μs | 25.9953 KOps/s | 26.5757 KOps/s | |
test_unbind_pytree | 67.2860μs | 29.8744μs | 33.4735 KOps/s | 33.4130 KOps/s | |
test_unbind_td | 0.3172ms | 38.5889μs | 25.9142 KOps/s | 26.1629 KOps/s | |
test_split_pytree | 0.1034ms | 29.7083μs | 33.6606 KOps/s | 33.4438 KOps/s | |
test_split_td | 0.5381ms | 44.9261μs | 22.2588 KOps/s | 22.3729 KOps/s | |
test_add_pytree | 84.8080μs | 36.8572μs | 27.1317 KOps/s | 27.2292 KOps/s | |
test_add_td | 0.1553ms | 57.4017μs | 17.4211 KOps/s | 17.9941 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1281ms | 63.4179μs | 15.7684 KOps/s | 15.9544 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4826ms | 0.1599ms | 6.2544 KOps/s | 6.2981 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1331ms | 47.2039μs | 21.1847 KOps/s | 21.5256 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2285ms | 0.1200ms | 8.3356 KOps/s | 8.5073 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 82.4640μs | 26.3692μs | 37.9231 KOps/s | 38.5604 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1145ms | 54.8099μs | 18.2449 KOps/s | 18.7881 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1881ms | 78.6969μs | 12.7070 KOps/s | 12.7077 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1426ms | 68.8531μs | 14.5237 KOps/s | 14.8297 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2228ms | 0.1061ms | 9.4212 KOps/s | 9.3660 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3382ms | 0.1983ms | 5.0432 KOps/s | 5.0629 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1459ms | 45.3763μs | 22.0380 KOps/s | 21.7243 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4868ms | 61.5346μs | 16.2510 KOps/s | 16.0087 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1987ms | 0.1033ms | 9.6803 KOps/s | 9.7822 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4288ms | 0.2031ms | 4.9226 KOps/s | 4.9738 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3860ms | 0.2095ms | 4.7737 KOps/s | 4.7788 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1879ms | 0.1055ms | 9.4760 KOps/s | 9.4921 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2056ms | 54.9300μs | 18.2050 KOps/s | 18.1000 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.4249ms | 47.8958μs | 20.8787 KOps/s | 20.7904 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2338ms | 0.1600ms | 6.2502 KOps/s | 6.3361 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.5083ms | 0.1065ms | 9.3916 KOps/s | 9.6113 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 67.8270μs | 22.5115μs | 44.4217 KOps/s | 46.9900 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1985ms | 62.6575μs | 15.9598 KOps/s | 16.7792 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1676ms | 81.1881μs | 12.3171 KOps/s | 12.2728 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1302ms | 69.3311μs | 14.4235 KOps/s | 14.6891 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3042ms | 0.2114ms | 4.7308 KOps/s | 4.8012 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.4144ms | 1.2763ms | 783.5191 Ops/s | 770.4657 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3062ms | 0.2059ms | 4.8578 KOps/s | 4.8872 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3437ms | 0.7864ms | 1.2717 KOps/s | 1.2816 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.7030ms | 0.4612ms | 2.1681 KOps/s | 2.1617 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.1288ms | 2.6000ms | 384.6108 Ops/s | 382.4691 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1108ms | 37.4728μs | 26.6861 KOps/s | 26.9911 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4875ms | 33.7251μs | 29.6515 KOps/s | 29.9533 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 86.2420μs | 29.8705μs | 33.4779 KOps/s | 33.9261 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 72.3850μs | 23.5050μs | 42.5441 KOps/s | 42.9644 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 93.1040μs | 30.4497μs | 32.8411 KOps/s | 33.3173 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 68.2070μs | 23.5981μs | 42.3764 KOps/s | 43.1632 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1151ms | 52.1376μs | 19.1800 KOps/s | 19.0217 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6453ms | 20.5774μs | 48.5970 KOps/s | 48.4249 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1105ms | 44.2860μs | 22.5805 KOps/s | 22.7310 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 78.0160μs | 19.2471μs | 51.9560 KOps/s | 51.5905 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 93.7160μs | 45.2594μs | 22.0949 KOps/s | 22.2192 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 75.1810μs | 19.0724μs | 52.4319 KOps/s | 51.4405 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1105ms | 52.7182μs | 18.9688 KOps/s | 18.8541 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8565ms | 19.9989μs | 50.0027 KOps/s | 50.8581 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 95.6090μs | 45.0318μs | 22.2065 KOps/s | 22.3106 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 70.9430μs | 19.0639μs | 52.4552 KOps/s | 52.9583 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1010ms | 45.1441μs | 22.1513 KOps/s | 22.3579 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.4547ms | 19.2136μs | 52.0464 KOps/s | 52.5060 KOps/s | |
test_mod_add[eager] | 63.4380μs | 25.7472μs | 38.8392 KOps/s | 38.3506 KOps/s | |
test_mod_add[compile] | 85.5300μs | 45.2368μs | 22.1059 KOps/s | 22.2236 KOps/s | |
test_mod_add[compile-overhead] | 96.5300μs | 44.9266μs | 22.2585 KOps/s | 22.3061 KOps/s | |
test_mod_wrap[eager] | 0.4591ms | 0.2134ms | 4.6850 KOps/s | 4.6366 KOps/s | |
test_mod_wrap[compile] | 1.5411ms | 0.2107ms | 4.7459 KOps/s | 4.8365 KOps/s | |
test_mod_wrap[compile-overhead] | 1.5693ms | 0.2081ms | 4.8051 KOps/s | 4.8470 KOps/s | |
test_mod_wrap_and_backward[eager] | 14.4694ms | 12.6446ms | 79.0853 Ops/s | 83.3196 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.3591ms | 12.7340ms | 78.5299 Ops/s | 79.0981 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.0178ms | 12.7734ms | 78.2876 Ops/s | 83.1785 Ops/s | |
test_seq_add[eager] | 0.2169ms | 88.1651μs | 11.3424 KOps/s | 10.7916 KOps/s | |
test_seq_add[compile] | 0.1401ms | 60.8539μs | 16.4328 KOps/s | 16.7219 KOps/s | |
test_seq_add[compile-overhead] | 0.1181ms | 59.1354μs | 16.9103 KOps/s | 16.7706 KOps/s | |
test_seq_wrap[eager] | 0.5478ms | 0.3848ms | 2.5990 KOps/s | 2.5581 KOps/s | |
test_seq_wrap[compile] | 0.4271ms | 0.2301ms | 4.3464 KOps/s | 4.3768 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4197ms | 0.2273ms | 4.3997 KOps/s | 4.3984 KOps/s | |
test_func_call_runtime[False-eager] | 0.7829ms | 0.5603ms | 1.7847 KOps/s | 1.8154 KOps/s | |
test_func_call_runtime[False-compile] | 1.0595ms | 0.4294ms | 2.3288 KOps/s | 2.3443 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.9045ms | 0.4343ms | 2.3028 KOps/s | 2.3460 KOps/s | |
test_func_call_runtime[True-eager] | 1.5158ms | 0.7707ms | 1.2976 KOps/s | 1.2981 KOps/s | |
test_func_call_runtime[True-compile] | 0.8386ms | 0.4719ms | 2.1192 KOps/s | 2.1604 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9125ms | 0.4763ms | 2.0994 KOps/s | 2.1472 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0529ms | 0.5587ms | 1.7899 KOps/s | 1.8141 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8634ms | 0.4281ms | 2.3361 KOps/s | 2.3579 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9062ms | 0.4307ms | 2.3216 KOps/s | 2.3632 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2821ms | 0.9067ms | 1.1029 KOps/s | 1.1017 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6268ms | 0.4961ms | 2.0156 KOps/s | 2.0566 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6746ms | 0.4957ms | 2.0173 KOps/s | 2.0438 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6459ms | 1.8928ms | 528.3296 Ops/s | 529.2865 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7094ms | 0.5215ms | 1.9177 KOps/s | 1.9053 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7367ms | 0.5284ms | 1.8924 KOps/s | 1.9165 KOps/s | |
test_distributed | 0.2873ms | 0.1285ms | 7.7814 KOps/s | 7.7080 KOps/s | |
test_tdmodule | 45.2340μs | 18.5054μs | 54.0383 KOps/s | 51.6002 KOps/s | |
test_tdmodule_dispatch | 71.8440μs | 37.1797μs | 26.8964 KOps/s | 27.1718 KOps/s | |
test_tdseq | 36.7480μs | 20.7307μs | 48.2376 KOps/s | 44.5433 KOps/s | |
test_tdseq_dispatch | 68.3280μs | 41.1711μs | 24.2889 KOps/s | 23.2586 KOps/s | |
test_instantiation_functorch | 2.1661ms | 1.5504ms | 645.0150 Ops/s | 660.3664 Ops/s | |
test_exec_functorch | 0.2860ms | 0.1816ms | 5.5066 KOps/s | 5.6163 KOps/s | |
test_exec_functional_call | 0.4429ms | 0.1724ms | 5.8020 KOps/s | 5.6822 KOps/s | |
test_exec_td_decorator | 0.5232ms | 0.2263ms | 4.4198 KOps/s | 4.3143 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8752ms | 0.6355ms | 1.5735 KOps/s | 1.5804 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8777ms | 0.6347ms | 1.5756 KOps/s | 1.5365 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9235ms | 0.5225ms | 1.9140 KOps/s | 1.9125 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8155ms | 0.5227ms | 1.9130 KOps/s | 1.9091 KOps/s | |
test_to_module_speed[True] | 1.3765ms | 1.2878ms | 776.5270 Ops/s | 766.7625 Ops/s | |
test_to_module_speed[False] | 1.7717ms | 1.2637ms | 791.3442 Ops/s | 777.5382 Ops/s | |
test_tc_init | 86.8430μs | 42.9718μs | 23.2711 KOps/s | 22.7022 KOps/s | |
test_tc_init_nested | 0.1554ms | 87.8757μs | 11.3797 KOps/s | 11.2877 KOps/s | |
test_tc_first_layer_tensor | 22.8130μs | 1.5419μs | 648.5586 KOps/s | 659.2786 KOps/s | |
test_tc_first_layer_nontensor | 43.5610μs | 4.8714μs | 205.2785 KOps/s | 212.3511 KOps/s | |
test_tc_second_layer_tensor | 44.4030μs | 2.7882μs | 358.6495 KOps/s | 357.6847 KOps/s | |
test_tc_second_layer_nontensor | 43.3210μs | 6.1785μs | 161.8514 KOps/s | 167.6861 KOps/s | |
test_unbind | 0.2127s | 12.2958ms | 81.3285 Ops/s | 78.9360 Ops/s | |
test_full_like | 17.7212ms | 11.6405ms | 85.9068 Ops/s | 85.3003 Ops/s | |
test_zeros_like | 9.8019ms | 7.1102ms | 140.6421 Ops/s | 133.7339 Ops/s | |
test_ones_like | 13.0645ms | 7.8133ms | 127.9874 Ops/s | 131.5689 Ops/s | |
test_clone | 14.0065ms | 9.4608ms | 105.6988 Ops/s | 109.0471 Ops/s | |
test_squeeze | 60.0530μs | 12.0184μs | 83.2056 KOps/s | 83.5152 KOps/s | |
test_unsqueeze | 0.1653ms | 87.3775μs | 11.4446 KOps/s | 11.2195 KOps/s | |
test_split | 0.5998ms | 0.1929ms | 5.1842 KOps/s | 5.3098 KOps/s | |
test_permute | 0.4274ms | 0.2194ms | 4.5578 KOps/s | 4.5810 KOps/s | |
test_stack | 28.3028ms | 23.9160ms | 41.8131 Ops/s | 42.1023 Ops/s | |
test_cat | 27.3812ms | 23.5713ms | 42.4246 Ops/s | 41.8822 Ops/s |
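
The Change column is empty in the rows above, so the comparison between Ops and Ops on Repo HEAD has to be read off by eye. The following is a minimal sketch, assuming the rows keep the pipe-separated layout shown here, of how that relative difference could be computed. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of tensordict or its benchmark tooling.

```python
# Unit multipliers for the throughput columns ("Ops/s", "KOps/s", "MOps/s").
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '56.1167 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of Ops relative to Ops on Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    # Column order assumed from the table header: Name, Max, Mean, Ops, Ops on Repo HEAD.
    name, ops, ops_head = cells[0], cells[3], cells[4]
    new, base = parse_ops(ops), parse_ops(ops_head)
    return name, (new - base) / base


if __name__ == "__main__":
    row = (
        "test_membership_stacked_nested_last | 29.5750μs | 4.1565μs | "
        "240.5845 KOps/s | 207.1252 KOps/s | |"
    )
    name, change = relative_change(row)
    # Reports a gain of roughly 16% for this row.
    print(f"{name}: {change:+.1%}")
```

A positive value means the PR branch ran more operations per second than Repo HEAD. Single-run differences of a few percent are likely within measurement noise, so only larger gaps are worth flagging.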
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 26.8710μs | 10.5217μs | 95.0421 KOps/s | 93.8905 KOps/s | |
test_plain_set_stack_nested | 28.3500μs | 10.6472μs | 93.9213 KOps/s | 93.0242 KOps/s | |
test_plain_set_nested_inplace | 0.1002ms | 11.4753μs | 87.1437 KOps/s | 85.4243 KOps/s | |
test_plain_set_stack_nested_inplace | 57.2410μs | 11.4953μs | 86.9922 KOps/s | 86.4631 KOps/s | |
test_items | 27.6110μs | 2.8884μs | 346.2156 KOps/s | 344.1281 KOps/s | |
test_items_nested | 0.4581ms | 0.3219ms | 3.1061 KOps/s | 3.1187 KOps/s | |
test_items_nested_locked | 0.4806ms | 0.3237ms | 3.0894 KOps/s | 3.1160 KOps/s | |
test_items_nested_leaf | 0.1027ms | 57.8740μs | 17.2789 KOps/s | 17.2580 KOps/s | |
test_items_stack_nested | 0.4341ms | 0.3217ms | 3.1087 KOps/s | 3.0393 KOps/s | |
test_items_stack_nested_leaf | 92.6010μs | 57.6787μs | 17.3374 KOps/s | 17.1618 KOps/s | |
test_items_stack_nested_locked | 0.4253ms | 0.3249ms | 3.0781 KOps/s | 3.0941 KOps/s | |
test_keys | 25.7900μs | 3.4725μs | 287.9769 KOps/s | 290.3897 KOps/s | |
test_keys_nested | 0.1315ms | 70.4455μs | 14.1954 KOps/s | 14.1550 KOps/s | |
test_keys_nested_locked | 0.8795ms | 75.8876μs | 13.1774 KOps/s | 13.0238 KOps/s | |
test_keys_nested_leaf | 0.1107ms | 62.0557μs | 16.1146 KOps/s | 16.0852 KOps/s | |
test_keys_stack_nested | 0.1199ms | 70.2233μs | 14.2403 KOps/s | 13.9370 KOps/s | |
test_keys_stack_nested_leaf | 91.0820μs | 61.5284μs | 16.2527 KOps/s | 16.0384 KOps/s | |
test_keys_stack_nested_locked | 0.1227ms | 75.7085μs | 13.2086 KOps/s | 13.0761 KOps/s | |
test_values | 7.2500μs | 0.9565μs | 1.0455 MOps/s | 1.1818 MOps/s | |
test_values_nested | 70.5210μs | 31.3366μs | 31.9116 KOps/s | 32.0888 KOps/s | |
test_values_nested_locked | 0.3976ms | 32.6232μs | 30.6530 KOps/s | 30.3984 KOps/s | |
test_values_nested_leaf | 0.4183ms | 33.6350μs | 29.7309 KOps/s | 29.6178 KOps/s | |
test_values_stack_nested | 65.5810μs | 31.4300μs | 31.8168 KOps/s | 31.8268 KOps/s | |
test_values_stack_nested_leaf | 94.2510μs | 33.6985μs | 29.6749 KOps/s | 29.4258 KOps/s | |
test_values_stack_nested_locked | 0.4143ms | 32.8537μs | 30.4380 KOps/s | 30.3574 KOps/s | |
test_membership | 18.9108μs | 0.5204μs | 1.9214 MOps/s | 1.9742 MOps/s | |
test_membership_nested | 0.1938ms | 1.8762μs | 533.0005 KOps/s | 527.7844 KOps/s | |
test_membership_nested_leaf | 15.3655μs | 1.9008μs | 526.0910 KOps/s | 534.9057 KOps/s | |
test_membership_stacked_nested | 24.7810μs | 1.9421μs | 514.9099 KOps/s | 503.5655 KOps/s | |
test_membership_stacked_nested_leaf | 17.0200μs | 1.9254μs | 519.3706 KOps/s | 504.5934 KOps/s | |
test_membership_nested_last | 0.3864ms | 2.8390μs | 352.2412 KOps/s | 355.2958 KOps/s | |
test_membership_nested_leaf_last | 0.3814ms | 2.8198μs | 354.6384 KOps/s | 352.7081 KOps/s | |
test_membership_stacked_nested_last | 30.4410μs | 2.8300μs | 353.3626 KOps/s | 359.1944 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.3700μs | 2.8224μs | 354.3113 KOps/s | 356.5444 KOps/s | |
test_nested_getleaf | 0.3893ms | 6.0453μs | 165.4183 KOps/s | 166.3351 KOps/s | |
test_nested_get | 0.3912ms | 5.7040μs | 175.3140 KOps/s | 175.0504 KOps/s | |
test_stacked_getleaf | 40.4410μs | 6.0079μs | 166.4474 KOps/s | 166.7393 KOps/s | |
test_stacked_get | 0.3876ms | 5.7083μs | 175.1833 KOps/s | 175.0716 KOps/s | |
test_nested_getitemleaf | 33.9000μs | 6.0865μs | 164.2979 KOps/s | 163.6521 KOps/s | |
test_nested_getitem | 0.3946ms | 5.7711μs | 173.2763 KOps/s | 173.5886 KOps/s | |
test_stacked_getitemleaf | 29.9900μs | 6.1148μs | 163.5375 KOps/s | 164.0938 KOps/s | |
test_stacked_getitem | 0.3859ms | 5.7919μs | 172.6563 KOps/s | 173.1619 KOps/s | |
test_lock_nested | 0.8724ms | 0.3648ms | 2.7411 KOps/s | 2.6882 KOps/s | |
test_lock_stack_nested | 0.3683ms | 0.3388ms | 2.9513 KOps/s | 2.9217 KOps/s | |
test_unlock_nested | 0.6587ms | 0.3099ms | 3.2266 KOps/s | 3.2328 KOps/s | |
test_unlock_stack_nested | 0.6511ms | 0.2790ms | 3.5847 KOps/s | 3.5553 KOps/s | |
test_flatten_speed | 0.4677ms | 72.2459μs | 13.8416 KOps/s | 13.9166 KOps/s | |
test_unflatten_speed | 0.6550ms | 0.2878ms | 3.4751 KOps/s | 3.4141 KOps/s | |
test_common_ops | 1.5473ms | 0.5932ms | 1.6858 KOps/s | 1.6777 KOps/s | |
test_creation | 0.1810ms | 1.4796μs | 675.8383 KOps/s | 664.5598 KOps/s | |
test_creation_empty | 28.4300μs | 7.2499μs | 137.9333 KOps/s | 133.9479 KOps/s | |
test_creation_nested_1 | 0.3906ms | 8.8471μs | 113.0312 KOps/s | 109.8027 KOps/s | |
test_creation_nested_2 | 34.2300μs | 11.3687μs | 87.9608 KOps/s | 86.0556 KOps/s | |
test_clone | 37.8510μs | 11.6057μs | 86.1646 KOps/s | 90.9333 KOps/s | |
test_getitem[int] | 1.2811ms | 11.0245μs | 90.7072 KOps/s | 89.7130 KOps/s | |
test_getitem[slice_int] | 0.4056ms | 21.5146μs | 46.4801 KOps/s | 46.2337 KOps/s | |
test_getitem[range] | 0.1374ms | 39.8716μs | 25.0805 KOps/s | 24.9862 KOps/s | |
test_getitem[tuple] | 0.1046ms | 18.8382μs | 53.0836 KOps/s | 52.8734 KOps/s | |
test_getitem[list] | 0.4238ms | 34.8759μs | 28.6731 KOps/s | 28.8505 KOps/s | |
test_setitem_dim[int] | 39.9700μs | 20.0659μs | 49.8358 KOps/s | 50.1193 KOps/s | |
test_setitem_dim[slice_int] | 62.9210μs | 38.7021μs | 25.8384 KOps/s | 25.6963 KOps/s | |
test_setitem_dim[range] | 77.5210μs | 55.5497μs | 18.0019 KOps/s | 18.1281 KOps/s | |
test_setitem_dim[tuple] | 55.2910μs | 33.0631μs | 30.2452 KOps/s | 29.7989 KOps/s | |
test_setitem | 0.3991ms | 15.9791μs | 62.5819 KOps/s | 64.6692 KOps/s | |
test_set | 0.1207ms | 15.0372μs | 66.5016 KOps/s | 66.8706 KOps/s | |
test_set_shared | 1.8141ms | 0.1486ms | 6.7296 KOps/s | 6.7358 KOps/s | |
test_update | 1.2865ms | 17.5105μs | 57.1085 KOps/s | 57.0048 KOps/s | |
test_update_nested | 0.4036ms | 22.2373μs | 44.9694 KOps/s | 44.0000 KOps/s | |
test_update__nested | 0.1280ms | 25.1719μs | 39.7269 KOps/s | 40.6120 KOps/s | |
test_set_nested | 0.3999ms | 16.4447μs | 60.8100 KOps/s | 57.7487 KOps/s | |
test_set_nested_new | 0.1165ms | 18.6128μs | 53.7264 KOps/s | 49.7702 KOps/s | |
test_select | 0.4127ms | 30.5320μs | 32.7525 KOps/s | 32.0495 KOps/s | |
test_select_nested | 76.9410μs | 41.8571μs | 23.8908 KOps/s | 24.1279 KOps/s | |
test_exclude_nested | 0.4404ms | 59.4539μs | 16.8197 KOps/s | 16.9945 KOps/s | |
test_empty[True] | 0.6283ms | 0.2549ms | 3.9231 KOps/s | 3.9314 KOps/s | |
test_empty[False] | 38.4136μs | 0.7416μs | 1.3484 MOps/s | 1.3570 MOps/s | |
test_to | 85.4320μs | 55.9723μs | 17.8660 KOps/s | 16.8566 KOps/s | |
test_to_nonblocking | 0.4392ms | 48.3636μs | 20.6767 KOps/s | 18.9736 KOps/s | |
test_unbind_speed | 0.2739ms | 0.2352ms | 4.2512 KOps/s | 4.1930 KOps/s | |
test_unbind_speed_stack0 | 0.6236ms | 0.2353ms | 4.2498 KOps/s | 4.2043 KOps/s | |
test_unbind_speed_stack1 | 92.3932ms | 0.6543ms | 1.5283 KOps/s | 1.5028 KOps/s | |
test_split | 92.6383ms | 1.6098ms | 621.1975 Ops/s | 575.0972 Ops/s | |
test_chunk | 94.9654ms | 1.7403ms | 574.6094 Ops/s | 685.3991 Ops/s | |
test_consolidate[False-None] | 3.2561ms | 2.6545ms | 376.7146 Ops/s | 347.7775 Ops/s | |
test_consolidate[default-None] | 1.8207ms | 1.7001ms | 588.1983 Ops/s | 598.8832 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8401ms | 1.6934ms | 590.5216 Ops/s | 584.9597 Ops/s | |
test_consolidate_njt[False-None] | 6.9644ms | 6.6839ms | 149.6134 Ops/s | 148.5429 Ops/s | |
test_to[False-False-None] | 1.8091ms | 1.7631ms | 567.1864 Ops/s | 482.8376 Ops/s | |
test_to[True-False-None] | 1.6258ms | 1.3767ms | 726.3558 Ops/s | 715.5130 Ops/s | |
test_to[within-False-None] | 4.1863ms | 4.0673ms | 245.8608 Ops/s | 241.2657 Ops/s | |
test_to[True-default-None] | 5.6343ms | 5.2034ms | 192.1813 Ops/s | 187.6993 Ops/s | |
test_to_njt[False-False-None] | 7.2010ms | 7.0748ms | 141.3475 Ops/s | 127.2648 Ops/s | |
test_to_njt[True-False-None] | 5.8147ms | 5.6423ms | 177.2341 Ops/s | 166.0823 Ops/s | |
test_to_njt[within-False-None] | 12.5532ms | 12.3906ms | 80.7061 Ops/s | 75.9551 Ops/s | |
test_creation[device0] | 0.3812ms | 79.8014μs | 12.5311 KOps/s | 11.6946 KOps/s | |
test_creation_from_tensor | 0.6103ms | 83.2098μs | 12.0178 KOps/s | 11.1977 KOps/s | |
test_add_one[memmap_tensor0] | 0.4208ms | 7.2930μs | 137.1171 KOps/s | 137.9714 KOps/s | |
test_contiguous[memmap_tensor0] | 1.7900μs | 0.4346μs | 2.3008 MOps/s | 2.3042 MOps/s | |
test_stack[memmap_tensor0] | 37.7810μs | 4.8843μs | 204.7377 KOps/s | 201.2213 KOps/s | |
test_memmaptd_index | 1.9694ms | 0.2585ms | 3.8679 KOps/s | 3.8556 KOps/s | |
test_memmaptd_index_astensor | 0.5986ms | 0.3158ms | 3.1662 KOps/s | 3.1356 KOps/s | |
test_memmaptd_index_op | 1.0078ms | 0.5940ms | 1.6836 KOps/s | 1.6822 KOps/s | |
test_serialize_model | 0.1305s | 0.1295s | 7.7245 Ops/s | 7.6867 Ops/s | |
test_serialize_model_pickle | 1.3490s | 1.2167s | 0.8219 Ops/s | 0.8419 Ops/s | |
test_serialize_weights | 0.1304s | 0.1291s | 7.7455 Ops/s | 7.7193 Ops/s | |
test_serialize_weights_returnearly | 0.3508s | 65.7085ms | 15.2187 Ops/s | 23.3183 Ops/s | |
test_serialize_weights_pickle | 1.3742s | 1.2175s | 0.8214 Ops/s | 0.8399 Ops/s | |
test_reshape_pytree | 53.3210μs | 22.4613μs | 44.5210 KOps/s | 43.6607 KOps/s | |
test_reshape_td | 52.9800μs | 27.3626μs | 36.5463 KOps/s | 36.9970 KOps/s | |
test_view_pytree | 55.4710μs | 22.1071μs | 45.2343 KOps/s | 43.9263 KOps/s | |
test_view_td | 58.6210μs | 30.3173μs | 32.9845 KOps/s | 31.7099 KOps/s | |
test_unbind_pytree | 56.9210μs | 28.1314μs | 35.5475 KOps/s | 35.2596 KOps/s | |
test_unbind_td | 0.8952ms | 36.9144μs | 27.0897 KOps/s | 26.8170 KOps/s | |
test_split_pytree | 66.6210μs | 30.3083μs | 32.9942 KOps/s | 32.9196 KOps/s | |
test_split_td | 0.1766ms | 39.8841μs | 25.0727 KOps/s | 25.2286 KOps/s | |
test_add_pytree | 76.8310μs | 37.1182μs | 26.9409 KOps/s | 27.7704 KOps/s | |
test_add_td | 86.9010μs | 48.0388μs | 20.8165 KOps/s | 19.9195 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1711ms | 0.1244ms | 8.0415 KOps/s | 8.0422 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2701ms | 0.1266ms | 7.9006 KOps/s | 7.7761 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1418ms | 0.1037ms | 9.6449 KOps/s | 9.8664 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5836ms | 0.1611ms | 6.2063 KOps/s | 6.5024 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 68.3710μs | 22.9738μs | 43.5278 KOps/s | 41.7793 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1081ms | 26.8698μs | 37.2165 KOps/s | 36.4486 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1421ms | 63.3368μs | 15.7886 KOps/s | 15.3201 KOps/s | |
test_compile_copy_nested[pytree-eager] | 78.5610μs | 49.1318μs | 20.3534 KOps/s | 20.0700 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1819ms | 0.1402ms | 7.1343 KOps/s | 6.8676 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3590ms | 0.2069ms | 4.8329 KOps/s | 4.7091 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1340ms | 96.3000μs | 10.3842 KOps/s | 10.1660 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1353ms | 53.0337μs | 18.8559 KOps/s | 18.9519 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1791ms | 0.1376ms | 7.2662 KOps/s | 6.9226 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5898ms | 0.5219ms | 1.9161 KOps/s | 2.0243 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3544ms | 0.2483ms | 4.0280 KOps/s | 4.0193 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1822ms | 0.1415ms | 7.0651 KOps/s | 6.9913 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1404ms | 62.4288μs | 16.0182 KOps/s | 15.7654 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2317ms | 97.2035μs | 10.2877 KOps/s | 9.8277 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4817ms | 0.4388ms | 2.2789 KOps/s | 2.4126 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1827ms | 0.1381ms | 7.2404 KOps/s | 7.1308 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2022ms | 27.6392μs | 36.1805 KOps/s | 52.3759 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 64.4710μs | 27.0393μs | 36.9832 KOps/s | 36.6417 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1283ms | 69.3859μs | 14.4121 KOps/s | 14.3415 KOps/s | |
test_compile_copy_flat[pytree-eager] | 82.5610μs | 51.6750μs | 19.3517 KOps/s | 19.4240 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6596ms | 0.3963ms | 2.5236 KOps/s | 2.1927 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.2028ms | 2.7944ms | 357.8566 Ops/s | 360.4011 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6352ms | 0.4407ms | 2.2691 KOps/s | 2.2206 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.0330ms | 2.8566ms | 350.0643 Ops/s | 354.7956 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.4100ms | 0.1149ms | 8.7055 KOps/s | 8.1365 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5805ms | 83.1589μs | 12.0252 KOps/s | 11.4177 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.3953ms | 0.1119ms | 8.9362 KOps/s | 9.1326 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2099ms | 70.0180μs | 14.2820 KOps/s | 14.1111 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2066ms | 0.1111ms | 9.0022 KOps/s | 9.2159 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1192ms | 70.0232μs | 14.2810 KOps/s | 14.1374 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2548ms | 0.1075ms | 9.2988 KOps/s | 9.8400 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1538ms | 18.0104μs | 55.5235 KOps/s | 54.5182 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2728ms | 98.8945μs | 10.1118 KOps/s | 10.2490 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1219ms | 16.1665μs | 61.8564 KOps/s | 62.0558 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2918ms | 0.1023ms | 9.7780 KOps/s | 10.2091 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 48.6610μs | 16.1136μs | 62.0595 KOps/s | 62.5605 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2421ms | 0.1063ms | 9.4080 KOps/s | 9.7531 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5720ms | 17.9233μs | 55.7933 KOps/s | 55.5097 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2615ms | 0.1032ms | 9.6944 KOps/s | 10.1572 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 47.2610μs | 16.0919μs | 62.1432 KOps/s | 62.0259 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2018ms | 99.2004μs | 10.0806 KOps/s | 10.1627 KOps/s | |
test_compile_indexing[int-pytree-eager] | 46.1310μs | 16.1806μs | 61.8026 KOps/s | 62.4001 KOps/s | |
test_mod_add[eager] | 75.9210μs | 31.9680μs | 31.2813 KOps/s | 30.7202 KOps/s | |
test_mod_add[compile] | 0.2473ms | 77.7375μs | 12.8638 KOps/s | 12.7043 KOps/s | |
test_mod_add[compile-overhead] | 0.3106ms | 0.1611ms | 6.2068 KOps/s | 5.8737 KOps/s | |
test_mod_wrap[eager] | 0.3262ms | 0.2552ms | 3.9188 KOps/s | 3.8908 KOps/s | |
test_mod_wrap[compile] | 1.5851ms | 0.2858ms | 3.4990 KOps/s | 3.4370 KOps/s | |
test_mod_wrap[compile-overhead] | 7.8839ms | 4.0651ms | 245.9975 Ops/s | 255.6534 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6328ms | 1.5078ms | 663.2340 Ops/s | 682.0321 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5596ms | 1.4029ms | 712.7888 Ops/s | 719.0793 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4938ms | 1.0241ms | 976.5118 Ops/s | 968.8107 Ops/s | |
test_seq_add[eager] | 0.2399ms | 96.2835μs | 10.3860 KOps/s | 9.8328 KOps/s | |
test_seq_add[compile] | 0.2189ms | 92.1589μs | 10.8508 KOps/s | 10.7371 KOps/s | |
test_seq_add[compile-overhead] | 0.1698ms | 0.1276ms | 7.8385 KOps/s | 7.7521 KOps/s | |
test_seq_wrap[eager] | 0.5133ms | 0.3976ms | 2.5149 KOps/s | 2.4448 KOps/s | |
test_seq_wrap[compile] | 0.4351ms | 0.3053ms | 3.2758 KOps/s | 3.2632 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2950ms | 0.2248ms | 4.4483 KOps/s | 4.4305 KOps/s | |
test_func_call_runtime[False-eager] | 0.8946ms | 0.7535ms | 1.3271 KOps/s | 1.3058 KOps/s | |
test_func_call_runtime[False-compile] | 0.9822ms | 0.7701ms | 1.2986 KOps/s | 1.2936 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5436ms | 0.3639ms | 2.7483 KOps/s | 2.7270 KOps/s | |
test_func_call_runtime[True-eager] | 1.1954ms | 0.9186ms | 1.0886 KOps/s | 1.0803 KOps/s | |
test_func_call_runtime[True-compile] | 0.8531ms | 0.7839ms | 1.2757 KOps/s | 1.2612 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4628ms | 0.3840ms | 2.6042 KOps/s | 2.5780 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8326ms | 0.7533ms | 1.3276 KOps/s | 1.3114 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8386ms | 0.7591ms | 1.3174 KOps/s | 1.3061 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4370ms | 0.3650ms | 2.7397 KOps/s | 2.7204 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1903ms | 1.0204ms | 979.9999 Ops/s | 970.9429 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8930ms | 0.8113ms | 1.2326 KOps/s | 1.2234 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5348ms | 0.4097ms | 2.4407 KOps/s | 2.4149 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5776ms | 2.1108ms | 473.7562 Ops/s | 472.4186 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9731ms | 0.8220ms | 1.2165 KOps/s | 1.1942 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4852ms | 0.4132ms | 2.4203 KOps/s | 2.4113 KOps/s | |
test_distributed | 3.7597ms | 0.1341ms | 7.4567 KOps/s | 8.7914 KOps/s | |
test_tdmodule | 0.3124ms | 14.5291μs | 68.8276 KOps/s | 72.7571 KOps/s | |
test_tdmodule_dispatch | 46.7700μs | 27.3677μs | 36.5395 KOps/s | 37.1916 KOps/s | |
test_tdseq | 38.6410μs | 15.6519μs | 63.8899 KOps/s | 65.5575 KOps/s | |
test_tdseq_dispatch | 53.4700μs | 30.4383μs | 32.8534 KOps/s | 33.0635 KOps/s | |
test_instantiation_functorch | 1.7186ms | 1.5747ms | 635.0537 Ops/s | 630.3275 Ops/s | |
test_exec_functorch | 0.2545ms | 0.1505ms | 6.6456 KOps/s | 6.7186 KOps/s | |
test_exec_functional_call | 0.1922ms | 0.1430ms | 6.9953 KOps/s | 7.0488 KOps/s | |
test_exec_td_decorator | 0.3758ms | 0.1925ms | 5.1944 KOps/s | 5.3703 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8242ms | 0.7014ms | 1.4257 KOps/s | 1.4088 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8253ms | 0.6930ms | 1.4431 KOps/s | 1.4069 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7571ms | 0.6091ms | 1.6418 KOps/s | 1.5981 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7117ms | 0.5951ms | 1.6804 KOps/s | 1.6323 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0284ms | 19.7176ms | 50.7161 Ops/s | 50.9890 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.4206ms | 19.7829ms | 50.5487 Ops/s | 51.0459 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7722ms | 19.6211ms | 50.9655 Ops/s | 51.4541 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6675ms | 19.5746ms | 51.0866 Ops/s | 51.3989 Ops/s | |
test_to_module_speed[True] | 2.1616ms | 0.9405ms | 1.0632 KOps/s | 1.0560 KOps/s | |
test_to_module_speed[False] | 1.0108ms | 0.9166ms | 1.0909 KOps/s | 1.0937 KOps/s | |
test_tc_init | 77.0210μs | 37.0087μs | 27.0207 KOps/s | 29.4483 KOps/s | |
test_tc_init_nested | 0.1150ms | 68.6509μs | 14.5665 KOps/s | 13.9524 KOps/s | |
test_tc_first_layer_tensor | 4.7229μs | 0.6898μs | 1.4498 MOps/s | 1.4425 MOps/s | |
test_tc_first_layer_nontensor | 25.8100μs | 2.3107μs | 432.7720 KOps/s | 429.0044 KOps/s | |
test_tc_second_layer_tensor | 8.6277μs | 1.4112μs | 708.5949 KOps/s | 704.2830 KOps/s | |
test_tc_second_layer_nontensor | 25.2510μs | 3.0671μs | 326.0420 KOps/s | 325.8264 KOps/s | |
test_unbind | 7.0541ms | 6.7325ms | 148.5324 Ops/s | 148.5034 Ops/s | |
test_full_like | 10.8952ms | 9.3820ms | 106.5872 Ops/s | 103.9159 Ops/s | |
test_zeros_like | 9.1767ms | 7.1421ms | 140.0139 Ops/s | 113.9894 Ops/s | |
test_ones_like | 4.9481ms | 4.2736ms | 233.9934 Ops/s | 231.5253 Ops/s | |
test_clone | 6.8602ms | 6.4225ms | 155.7035 Ops/s | 155.5442 Ops/s | |
test_squeeze | 78.7710μs | 10.0152μs | 99.8487 KOps/s | 91.4597 KOps/s | |
test_unsqueeze | 0.1282ms | 77.1532μs | 12.9612 KOps/s | 13.5811 KOps/s | |
test_split | 0.3077ms | 0.1731ms | 5.7766 KOps/s | 5.9984 KOps/s | |
test_permute | 0.2368ms | 0.1894ms | 5.2809 KOps/s | 5.3346 KOps/s | |
test_stack | 50.8373ms | 50.5695ms | 19.7748 Ops/s | 19.7770 Ops/s | |
test_cat | 50.8380ms | 50.4965ms | 19.8033 Ops/s | 19.8564 Ops/s |
vmoens
added a commit
that referenced
this pull request
Nov 5, 2024
ghstack-source-id: 3dfb0b66fae82dc8cf5ef2a14eccb1bec5237ebb Pull Request resolved: #1073
Labels: CLA Signed, Performance
Stack from ghstack (oldest at bottom):