[Feature] TensorDict.softmax #1163
Merged
vmoens added a commit that referenced this pull request on Jan 7, 2025
ghstack-source-id: a88bebc23e6aaa02ec297db72dbda68ec9628ce7 Pull Request resolved: #1163
facebook-github-bot added the CLA Signed label on Jan 7, 2025
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 41.5370μs | 20.6825μs | 48.3500 KOps/s | 50.2283 KOps/s | |
test_plain_set_stack_nested | 46.6270μs | 21.0241μs | 47.5645 KOps/s | 50.3000 KOps/s | |
test_plain_set_nested_inplace | 75.5000μs | 22.8587μs | 43.7470 KOps/s | 45.6296 KOps/s | |
test_plain_set_stack_nested_inplace | 70.0400μs | 22.7993μs | 43.8609 KOps/s | 45.4315 KOps/s | |
test_items | 34.6240μs | 4.2400μs | 235.8478 KOps/s | 234.7238 KOps/s | |
test_items_nested | 0.6321ms | 0.4123ms | 2.4253 KOps/s | 2.4725 KOps/s | |
test_items_nested_locked | 0.7342ms | 0.4094ms | 2.4426 KOps/s | 2.4712 KOps/s | |
test_items_nested_leaf | 0.1469ms | 77.0068μs | 12.9859 KOps/s | 13.0968 KOps/s | |
test_items_stack_nested | 0.6686ms | 0.4118ms | 2.4286 KOps/s | 2.4431 KOps/s | |
test_items_stack_nested_leaf | 0.1358ms | 80.6572μs | 12.3982 KOps/s | 12.7020 KOps/s | |
test_items_stack_nested_locked | 0.8507ms | 0.4113ms | 2.4311 KOps/s | 2.4437 KOps/s | |
test_keys | 31.1280μs | 3.5769μs | 279.5703 KOps/s | 278.5926 KOps/s | |
test_keys_nested | 0.3112ms | 0.1683ms | 5.9419 KOps/s | 5.9631 KOps/s | |
test_keys_nested_locked | 0.8168ms | 0.1740ms | 5.7463 KOps/s | 5.7824 KOps/s | |
test_keys_nested_leaf | 1.8908ms | 0.1465ms | 6.8239 KOps/s | 6.8921 KOps/s | |
test_keys_stack_nested | 0.2594ms | 0.1654ms | 6.0472 KOps/s | 6.0089 KOps/s | |
test_keys_stack_nested_leaf | 0.2337ms | 0.1437ms | 6.9607 KOps/s | 6.9153 KOps/s | |
test_keys_stack_nested_locked | 0.2863ms | 0.1719ms | 5.8182 KOps/s | 5.8255 KOps/s | |
test_values | 10.2336μs | 1.0755μs | 929.7863 KOps/s | 965.1051 KOps/s | |
test_values_nested | 99.6960μs | 63.3521μs | 15.7848 KOps/s | 16.0818 KOps/s | |
test_values_nested_locked | 0.1190ms | 63.1521μs | 15.8348 KOps/s | 16.0108 KOps/s | |
test_values_nested_leaf | 0.1429ms | 72.0608μs | 13.8772 KOps/s | 13.8986 KOps/s | |
test_values_stack_nested | 0.1576ms | 63.1629μs | 15.8321 KOps/s | 14.8917 KOps/s | |
test_values_stack_nested_leaf | 0.1381ms | 72.0790μs | 13.8737 KOps/s | 13.6705 KOps/s | |
test_values_stack_nested_locked | 0.1099ms | 63.5926μs | 15.7251 KOps/s | 15.6537 KOps/s | |
test_membership | 13.9060μs | 0.8950μs | 1.1173 MOps/s | 1.1129 MOps/s | |
test_membership_nested | 55.7250μs | 2.9220μs | 342.2286 KOps/s | 344.5993 KOps/s | |
test_membership_nested_leaf | 44.5530μs | 2.9653μs | 337.2365 KOps/s | 347.9678 KOps/s | |
test_membership_stacked_nested | 45.7550μs | 2.9083μs | 343.8493 KOps/s | 345.5924 KOps/s | |
test_membership_stacked_nested_leaf | 23.7140μs | 2.9143μs | 343.1352 KOps/s | 345.7949 KOps/s | |
test_membership_nested_last | 28.7130μs | 4.3573μs | 229.5011 KOps/s | 232.4218 KOps/s | |
test_membership_nested_leaf_last | 35.3560μs | 4.3806μs | 228.2789 KOps/s | 227.4231 KOps/s | |
test_membership_stacked_nested_last | 33.9930μs | 5.1236μs | 195.1742 KOps/s | 231.0654 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.6300μs | 5.1558μs | 193.9568 KOps/s | 229.8296 KOps/s | |
test_nested_getleaf | 39.9040μs | 10.6623μs | 93.7883 KOps/s | 92.9834 KOps/s | |
test_nested_get | 31.3180μs | 10.1690μs | 98.3380 KOps/s | 97.1560 KOps/s | |
test_stacked_getleaf | 45.8150μs | 10.5876μs | 94.4501 KOps/s | 93.2720 KOps/s | |
test_stacked_get | 40.9260μs | 10.2551μs | 97.5128 KOps/s | 99.2313 KOps/s | |
test_nested_getitemleaf | 35.9670μs | 11.5475μs | 86.5990 KOps/s | 89.4442 KOps/s | |
test_nested_getitem | 66.6640μs | 10.0908μs | 99.0997 KOps/s | 95.4990 KOps/s | |
test_stacked_getitemleaf | 35.8670μs | 11.0900μs | 90.1716 KOps/s | 90.4617 KOps/s | |
test_stacked_getitem | 0.1015ms | 10.3202μs | 96.8973 KOps/s | 96.2920 KOps/s | |
test_lock_nested | 1.7318ms | 0.4606ms | 2.1709 KOps/s | 2.2143 KOps/s | |
test_lock_stack_nested | 0.6989ms | 0.4320ms | 2.3146 KOps/s | 2.3316 KOps/s | |
test_unlock_nested | 0.7824ms | 0.3763ms | 2.6577 KOps/s | 2.6725 KOps/s | |
test_unlock_stack_nested | 0.5296ms | 0.3447ms | 2.9013 KOps/s | 2.9212 KOps/s | |
test_flatten_speed | 0.2016ms | 0.1025ms | 9.7582 KOps/s | 10.2545 KOps/s | |
test_unflatten_speed | 0.9683ms | 0.5414ms | 1.8472 KOps/s | 1.8784 KOps/s | |
test_common_ops | 4.3316ms | 0.8173ms | 1.2235 KOps/s | 1.3503 KOps/s | |
test_creation | 59.3110μs | 2.5145μs | 397.6939 KOps/s | 398.8308 KOps/s | |
test_creation_empty | 34.6850μs | 12.2506μs | 81.6286 KOps/s | 100.8409 KOps/s | |
test_creation_nested_1 | 60.8940μs | 15.0769μs | 66.3268 KOps/s | 78.6192 KOps/s | |
test_creation_nested_2 | 50.2240μs | 19.7710μs | 50.5790 KOps/s | 57.7699 KOps/s | |
test_clone | 0.2061ms | 13.7638μs | 72.6543 KOps/s | 74.6216 KOps/s | |
test_getitem[int] | 0.7716ms | 13.1608μs | 75.9832 KOps/s | 80.0537 KOps/s | |
test_getitem[slice_int] | 0.1398ms | 25.2190μs | 39.6526 KOps/s | 42.0850 KOps/s | |
test_getitem[range] | 0.1743ms | 50.9847μs | 19.6137 KOps/s | 21.0974 KOps/s | |
test_getitem[tuple] | 0.1398ms | 20.8360μs | 47.9938 KOps/s | 50.6781 KOps/s | |
test_getitem[list] | 0.1757ms | 46.4644μs | 21.5219 KOps/s | 23.3545 KOps/s | |
test_setitem_dim[int] | 56.0840μs | 27.2023μs | 36.7616 KOps/s | 41.3567 KOps/s | |
test_setitem_dim[slice_int] | 96.2100μs | 55.4690μs | 18.0281 KOps/s | 19.3574 KOps/s | |
test_setitem_dim[range] | 0.1144ms | 75.5158μs | 13.2423 KOps/s | 13.7425 KOps/s | |
test_setitem_dim[tuple] | 76.5330μs | 43.6422μs | 22.9136 KOps/s | 24.5134 KOps/s | |
test_setitem | 82.5630μs | 21.5303μs | 46.4462 KOps/s | 52.0931 KOps/s | |
test_set | 0.1518ms | 21.2768μs | 46.9995 KOps/s | 53.5514 KOps/s | |
test_set_shared | 1.1785ms | 0.1707ms | 5.8587 KOps/s | 6.0046 KOps/s | |
test_update | 0.3041ms | 24.1677μs | 41.3776 KOps/s | 48.1336 KOps/s | |
test_update_nested | 0.3735ms | 34.5055μs | 28.9809 KOps/s | 32.2272 KOps/s | |
test_update__nested | 0.5907ms | 34.6472μs | 28.8623 KOps/s | 30.0819 KOps/s | |
test_set_nested | 0.2893ms | 23.4473μs | 42.6488 KOps/s | 48.7508 KOps/s | |
test_set_nested_new | 97.2110μs | 28.5926μs | 34.9741 KOps/s | 39.2617 KOps/s | |
test_select | 0.1053ms | 45.1337μs | 22.1564 KOps/s | 24.3946 KOps/s | |
test_select_nested | 0.1229ms | 62.9693μs | 15.8808 KOps/s | 16.1964 KOps/s | |
test_exclude_nested | 0.1558ms | 83.3851μs | 11.9925 KOps/s | 12.4878 KOps/s | |
test_empty[True] | 0.7173ms | 0.4244ms | 2.3563 KOps/s | 2.4443 KOps/s | |
test_empty[False] | 6.7400μs | 1.3977μs | 715.4814 KOps/s | 724.0916 KOps/s | |
test_unbind_speed | 0.5209ms | 0.2712ms | 3.6875 KOps/s | 3.7251 KOps/s | |
test_unbind_speed_stack0 | 0.4450ms | 0.2643ms | 3.7834 KOps/s | 3.7704 KOps/s | |
test_unbind_speed_stack1 | 99.9301ms | 0.7836ms | 1.2762 KOps/s | 1.3639 KOps/s | |
test_split | 1.7887ms | 1.6159ms | 618.8355 Ops/s | 572.1426 Ops/s | |
test_chunk | 0.1065s | 1.9526ms | 512.1475 Ops/s | 572.1762 Ops/s | |
test_consolidate_njt[False-None] | 8.6133ms | 8.2197ms | 121.6591 Ops/s | 124.1283 Ops/s | |
test_creation[device0] | 4.5817ms | 94.0892μs | 10.6282 KOps/s | 10.8214 KOps/s | |
test_creation_from_tensor | 0.2392ms | 94.6658μs | 10.5635 KOps/s | 10.4551 KOps/s | |
test_add_one[memmap_tensor0] | 0.2157ms | 4.9457μs | 202.1948 KOps/s | 215.6424 KOps/s | |
test_contiguous[memmap_tensor0] | 16.2710μs | 0.5116μs | 1.9548 MOps/s | 1.9567 MOps/s | |
test_stack[memmap_tensor0] | 51.4860μs | 3.4080μs | 293.4293 KOps/s | 303.3288 KOps/s | |
test_memmaptd_index | 0.9423ms | 0.2414ms | 4.1428 KOps/s | 4.1801 KOps/s | |
test_memmaptd_index_astensor | 0.7457ms | 0.3327ms | 3.0058 KOps/s | 3.0785 KOps/s | |
test_memmaptd_index_op | 0.9747ms | 0.6105ms | 1.6379 KOps/s | 1.8106 KOps/s | |
test_serialize_model | 0.1396s | 0.1167s | 8.5699 Ops/s | 8.6516 Ops/s | |
test_serialize_model_pickle | 0.4443s | 0.3881s | 2.5767 Ops/s | 2.5225 Ops/s | |
test_serialize_weights | 0.1191s | 0.1118s | 8.9415 Ops/s | 8.9459 Ops/s | |
test_serialize_weights_returnearly | 0.2569s | 0.1713s | 5.8385 Ops/s | 6.3687 Ops/s | |
test_serialize_weights_pickle | 0.5637s | 0.4202s | 2.3798 Ops/s | 2.5910 Ops/s | |
test_serialize_weights_filesystem | 0.1497s | 0.1459s | 6.8559 Ops/s | 7.0825 Ops/s | |
test_serialize_model_filesystem | 0.1579s | 0.1470s | 6.8014 Ops/s | 6.0522 Ops/s | |
test_reshape_pytree | 61.2740μs | 26.1822μs | 38.1939 KOps/s | 36.1869 KOps/s | |
test_reshape_td | 71.1330μs | 32.4472μs | 30.8193 KOps/s | 29.8905 KOps/s | |
test_view_pytree | 80.2300μs | 26.3375μs | 37.9686 KOps/s | 36.8598 KOps/s | |
test_view_td | 89.6470μs | 39.3617μs | 25.4054 KOps/s | 26.3126 KOps/s | |
test_unbind_pytree | 90.5980μs | 29.9187μs | 33.4239 KOps/s | 33.7910 KOps/s | |
test_unbind_td | 0.3213ms | 40.2202μs | 24.8631 KOps/s | 25.4624 KOps/s | |
test_split_pytree | 72.7350μs | 29.8642μs | 33.4849 KOps/s | 33.5438 KOps/s | |
test_split_td | 0.5292ms | 46.0176μs | 21.7308 KOps/s | 22.5075 KOps/s | |
test_add_pytree | 98.0220μs | 36.5232μs | 27.3799 KOps/s | 28.2580 KOps/s | |
test_add_td | 0.1535ms | 60.6723μs | 16.4820 KOps/s | 19.4438 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1326ms | 63.2784μs | 15.8032 KOps/s | 15.7512 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4057ms | 0.1713ms | 5.8363 KOps/s | 5.8779 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1220ms | 46.8053μs | 21.3651 KOps/s | 21.9480 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2588ms | 0.1234ms | 8.1044 KOps/s | 8.3408 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 60.5520μs | 26.6251μs | 37.5585 KOps/s | 39.9828 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1404ms | 59.6573μs | 16.7624 KOps/s | 16.8436 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1841ms | 79.2504μs | 12.6182 KOps/s | 12.4856 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1217ms | 66.8844μs | 14.9512 KOps/s | 14.4961 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2004ms | 0.1057ms | 9.4627 KOps/s | 9.3111 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3856ms | 0.2165ms | 4.6196 KOps/s | 4.6415 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1191ms | 46.3217μs | 21.5882 KOps/s | 21.7262 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4782ms | 66.2118μs | 15.1031 KOps/s | 15.6771 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1883ms | 0.1049ms | 9.5303 KOps/s | 9.8052 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3137ms | 0.2028ms | 4.9315 KOps/s | 4.9823 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4047ms | 0.2381ms | 4.1997 KOps/s | 4.2734 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2192ms | 0.1102ms | 9.0738 KOps/s | 9.3835 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2081ms | 61.6018μs | 16.2333 KOps/s | 16.9484 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.6302ms | 46.3413μs | 21.5790 KOps/s | 21.9848 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5670ms | 0.1582ms | 6.3192 KOps/s | 6.3745 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1819ms | 0.1039ms | 9.6263 KOps/s | 9.7487 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 66.2230μs | 20.9059μs | 47.8333 KOps/s | 46.8017 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1869ms | 69.0696μs | 14.4782 KOps/s | 15.0720 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1526ms | 81.4036μs | 12.2845 KOps/s | 12.1129 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1275ms | 69.4040μs | 14.4084 KOps/s | 14.2296 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2973ms | 0.2077ms | 4.8144 KOps/s | 4.9115 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7441ms | 1.3551ms | 737.9644 Ops/s | 767.7678 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3445ms | 0.2058ms | 4.8582 KOps/s | 5.0184 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0298ms | 0.7781ms | 1.2851 KOps/s | 1.3159 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5505ms | 0.4539ms | 2.2032 KOps/s | 2.2073 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.4719ms | 2.8530ms | 350.5109 Ops/s | 397.3091 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 89.0550μs | 35.7888μs | 27.9417 KOps/s | 27.6866 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5046ms | 34.8341μs | 28.7075 KOps/s | 30.7524 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 85.3390μs | 28.8157μs | 34.7034 KOps/s | 33.6261 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 63.1980μs | 23.7396μs | 42.1238 KOps/s | 43.1039 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 86.5020μs | 29.3176μs | 34.1092 KOps/s | 32.6862 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 62.9070μs | 23.6513μs | 42.2810 KOps/s | 43.0006 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1144ms | 51.6897μs | 19.3462 KOps/s | 19.2085 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6323ms | 21.0795μs | 47.4395 KOps/s | 49.7339 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 90.3680μs | 44.7894μs | 22.3267 KOps/s | 22.5137 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 55.8440μs | 19.1388μs | 52.2500 KOps/s | 52.3170 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1075ms | 45.5828μs | 21.9381 KOps/s | 22.3765 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 53.6400μs | 19.0920μs | 52.3780 KOps/s | 53.3801 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1181ms | 53.2007μs | 18.7967 KOps/s | 19.1123 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0372ms | 20.7369μs | 48.2232 KOps/s | 49.8400 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 97.2210μs | 45.3591μs | 22.0463 KOps/s | 22.1314 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 50.2140μs | 19.0929μs | 52.3754 KOps/s | 52.7099 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1135ms | 45.2173μs | 22.1154 KOps/s | 22.4261 KOps/s | |
test_compile_indexing[int-pytree-eager] | 70.3510μs | 18.8099μs | 53.1634 KOps/s | 52.4611 KOps/s | |
test_mod_add[eager] | 89.8070μs | 35.6456μs | 28.0540 KOps/s | 31.1388 KOps/s | |
test_mod_add[compile] | 0.1077ms | 47.4500μs | 21.0748 KOps/s | 20.6978 KOps/s | |
test_mod_add[compile-overhead] | 0.1341ms | 47.6796μs | 20.9733 KOps/s | 20.7871 KOps/s | |
test_mod_wrap[eager] | 0.4558ms | 0.2261ms | 4.4222 KOps/s | 4.5681 KOps/s | |
test_mod_wrap[compile] | 0.4436ms | 0.2189ms | 4.5680 KOps/s | 4.8991 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4186ms | 0.2074ms | 4.8207 KOps/s | 4.8474 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.5038ms | 11.8347ms | 84.4971 Ops/s | 88.1358 Ops/s | |
test_mod_wrap_and_backward[compile] | 17.7854ms | 12.6738ms | 78.9031 Ops/s | 79.9234 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 18.1711ms | 12.5378ms | 79.7590 Ops/s | 79.5108 Ops/s | |
test_seq_add[eager] | 0.2514ms | 0.1187ms | 8.4215 KOps/s | 8.9715 KOps/s | |
test_seq_add[compile] | 0.1289ms | 63.0348μs | 15.8642 KOps/s | 15.6342 KOps/s | |
test_seq_add[compile-overhead] | 0.1305ms | 60.3032μs | 16.5829 KOps/s | 16.2821 KOps/s | |
test_seq_wrap[eager] | 0.6984ms | 0.4460ms | 2.2422 KOps/s | 2.3753 KOps/s | |
test_seq_wrap[compile] | 0.3455ms | 0.2292ms | 4.3632 KOps/s | 4.3847 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3589ms | 0.2279ms | 4.3872 KOps/s | 4.4201 KOps/s | |
test_func_call_runtime[False-eager] | 0.9787ms | 0.5603ms | 1.7849 KOps/s | 1.8429 KOps/s | |
test_func_call_runtime[False-compile] | 0.8218ms | 0.4336ms | 2.3062 KOps/s | 2.4051 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8054ms | 0.4323ms | 2.3132 KOps/s | 2.3777 KOps/s | |
test_func_call_runtime[True-eager] | 1.2615ms | 0.7692ms | 1.3000 KOps/s | 1.3289 KOps/s | |
test_func_call_runtime[True-compile] | 0.8367ms | 0.4702ms | 2.1266 KOps/s | 2.1606 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5929ms | 0.4700ms | 2.1276 KOps/s | 2.1689 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9298ms | 0.5527ms | 1.8092 KOps/s | 1.8556 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5425ms | 0.4285ms | 2.3336 KOps/s | 2.3830 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5835ms | 0.4284ms | 2.3342 KOps/s | 2.3900 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4780ms | 0.9118ms | 1.0967 KOps/s | 1.1167 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8727ms | 0.4945ms | 2.0222 KOps/s | 2.0512 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8554ms | 0.4972ms | 2.0111 KOps/s | 2.0674 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.8388ms | 1.9272ms | 518.8976 Ops/s | 525.5861 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9042ms | 0.5249ms | 1.9050 KOps/s | 1.9304 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6972ms | 0.5236ms | 1.9099 KOps/s | 1.9402 KOps/s | |
test_distributed | 0.2557ms | 0.1264ms | 7.9085 KOps/s | 7.8330 KOps/s | |
test_tdmodule | 0.1221ms | 26.7583μs | 37.3715 KOps/s | 39.4510 KOps/s | |
test_tdmodule_dispatch | 82.7050μs | 48.7516μs | 20.5121 KOps/s | 21.7825 KOps/s | |
test_tdseq | 60.4130μs | 29.2122μs | 34.2322 KOps/s | 35.4830 KOps/s | |
test_tdseq_dispatch | 88.5650μs | 54.9177μs | 18.2091 KOps/s | 19.1863 KOps/s | |
test_instantiation_functorch | 2.0519ms | 1.5848ms | 630.9972 Ops/s | 650.7353 Ops/s | |
test_exec_functorch | 0.2797ms | 0.1822ms | 5.4872 KOps/s | 5.7314 KOps/s | |
test_exec_functional_call | 0.3120ms | 0.1765ms | 5.6661 KOps/s | 5.9956 KOps/s | |
test_exec_td_decorator | 0.4668ms | 0.2370ms | 4.2200 KOps/s | 4.3871 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2409ms | 0.6694ms | 1.4938 KOps/s | 1.5313 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8693ms | 0.6631ms | 1.5080 KOps/s | 1.5541 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7308ms | 0.5354ms | 1.8678 KOps/s | 1.9094 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8585ms | 0.5379ms | 1.8592 KOps/s | 1.9004 KOps/s | |
test_to_module_speed[True] | 1.5394ms | 1.3433ms | 744.4354 Ops/s | 742.7790 Ops/s | |
test_to_module_speed[False] | 1.5678ms | 1.3196ms | 757.8049 Ops/s | 761.5808 Ops/s | |
test_tc_init | 0.1266ms | 49.9162μs | 20.0336 KOps/s | 22.4835 KOps/s | |
test_tc_init_nested | 0.1535ms | 95.7153μs | 10.4477 KOps/s | 11.6603 KOps/s | |
test_tc_first_layer_tensor | 16.1000μs | 1.5962μs | 626.4991 KOps/s | 651.0546 KOps/s | |
test_tc_first_layer_nontensor | 36.9280μs | 4.7918μs | 208.6885 KOps/s | 214.1258 KOps/s | |
test_tc_second_layer_tensor | 43.7710μs | 2.9261μs | 341.7542 KOps/s | 349.9554 KOps/s | |
test_tc_second_layer_nontensor | 27.2510μs | 6.1724μs | 162.0106 KOps/s | 167.0392 KOps/s | |
test_unbind | 0.2205s | 15.0916ms | 66.2622 Ops/s | 78.2721 Ops/s | |
test_full_like | 9.9184ms | 7.5308ms | 132.7888 Ops/s | 71.3174 Ops/s | |
test_zeros_like | 3.8349ms | 2.9953ms | 333.8515 Ops/s | 134.6580 Ops/s | |
test_ones_like | 5.0114ms | 3.5879ms | 278.7138 Ops/s | 122.6627 Ops/s | |
test_clone | 7.5466ms | 5.5375ms | 180.5854 Ops/s | 104.5356 Ops/s | |
test_squeeze | 65.6720μs | 12.2545μs | 81.6029 KOps/s | 84.2360 KOps/s | |
test_unsqueeze | 0.1679ms | 94.1699μs | 10.6191 KOps/s | 10.8648 KOps/s | |
test_split | 0.5857ms | 0.2013ms | 4.9666 KOps/s | 5.1904 KOps/s | |
test_permute | 0.4045ms | 0.2141ms | 4.6711 KOps/s | 4.8359 KOps/s | |
test_stack | 31.5510ms | 25.2450ms | 39.6118 Ops/s | 38.3302 Ops/s | |
test_cat | 31.5497ms | 24.9342ms | 40.1055 Ops/s | 39.7962 Ops/s |
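
The Change column is empty in this capture of the table. As a rough aid for reading the rows, the sketch below derives a relative change from the two throughput columns. It assumes the change is defined as (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, which may not be the formula the benchmark bot uses; the helper names `parse_ops`, `relative_change`, and `parse_row` are hypothetical and not part of tensordict or its benchmark tooling.

```python
# Multipliers for the throughput units that appear in the tables above.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '48.3500 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed definition)."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


def parse_row(row: str):
    """Split one table row and return (benchmark name, percent change)."""
    cells = [c.strip() for c in row.split("|")]
    name, _max, _mean, ops, ops_head = cells[:5]
    return name, relative_change(ops, ops_head)


# A row copied from the table above.
row = "test_full_like | 9.9184ms | 7.5308ms | 132.7888 Ops/s | 71.3174 Ops/s | |"
name, change = parse_row(row)
print(f"{name}: {change:+.2f}%")
```

A positive value means the PR run was faster than HEAD for that benchmark. Single-run differences of a few percent are often within noise for microbenchmarks like these, so small values should be read with caution.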

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.9200μs | 11.5810μs | 86.3487 KOps/s | 78.0322 KOps/s | |
test_plain_set_stack_nested | 32.1800μs | 11.8552μs | 84.3509 KOps/s | 77.1647 KOps/s | |
test_plain_set_nested_inplace | 34.3010μs | 12.7964μs | 78.1470 KOps/s | 71.9067 KOps/s | |
test_plain_set_stack_nested_inplace | 33.8910μs | 12.6555μs | 79.0172 KOps/s | 71.6669 KOps/s | |
test_items | 20.9600μs | 2.8997μs | 344.8632 KOps/s | 344.2328 KOps/s | |
test_items_nested | 0.4840ms | 0.3628ms | 2.7565 KOps/s | 2.8037 KOps/s | |
test_items_nested_locked | 0.4016ms | 0.3632ms | 2.7532 KOps/s | 2.7807 KOps/s | |
test_items_nested_leaf | 83.6310μs | 59.1970μs | 16.8927 KOps/s | 16.9269 KOps/s | |
test_items_stack_nested | 0.3954ms | 0.3674ms | 2.7215 KOps/s | 2.7477 KOps/s | |
test_items_stack_nested_leaf | 88.9210μs | 61.5065μs | 16.2584 KOps/s | 16.6128 KOps/s | |
test_items_stack_nested_locked | 0.3970ms | 0.3668ms | 2.7265 KOps/s | 2.7536 KOps/s | |
test_keys | 22.7700μs | 3.4995μs | 285.7533 KOps/s | 289.4395 KOps/s | |
test_keys_nested | 0.1109ms | 81.5377μs | 12.2643 KOps/s | 12.3048 KOps/s | |
test_keys_nested_locked | 0.8283ms | 85.7745μs | 11.6585 KOps/s | 11.4837 KOps/s | |
test_keys_nested_leaf | 2.4591ms | 72.3896μs | 13.8141 KOps/s | 13.9386 KOps/s | |
test_keys_stack_nested | 0.1070ms | 81.9407μs | 12.2039 KOps/s | 11.8830 KOps/s | |
test_keys_stack_nested_leaf | 97.0620μs | 73.3438μs | 13.6344 KOps/s | 13.3885 KOps/s | |
test_keys_stack_nested_locked | 0.1337ms | 87.8783μs | 11.3794 KOps/s | 11.1518 KOps/s | |
test_values | 5.3985μs | 0.8549μs | 1.1697 MOps/s | 1.1654 MOps/s | |
test_values_nested | 54.2300μs | 34.8351μs | 28.7067 KOps/s | 28.9358 KOps/s | |
test_values_nested_locked | 61.8410μs | 36.0988μs | 27.7017 KOps/s | 27.5624 KOps/s | |
test_values_nested_leaf | 67.0910μs | 39.2313μs | 25.4898 KOps/s | 25.4928 KOps/s | |
test_values_stack_nested | 75.6910μs | 34.9375μs | 28.6225 KOps/s | 28.3641 KOps/s | |
test_values_stack_nested_leaf | 76.9410μs | 39.5764μs | 25.2676 KOps/s | 25.1888 KOps/s | |
test_values_stack_nested_locked | 67.8510μs | 36.6601μs | 27.2776 KOps/s | 27.0618 KOps/s | |
test_membership | 1.8215μs | 0.5202μs | 1.9223 MOps/s | 1.9486 MOps/s | |
test_membership_nested | 29.8910μs | 2.0965μs | 476.9938 KOps/s | 471.7162 KOps/s | |
test_membership_nested_leaf | 15.9350μs | 2.0096μs | 497.6038 KOps/s | 491.1910 KOps/s | |
test_membership_stacked_nested | 47.4900μs | 2.1027μs | 475.5694 KOps/s | 463.6338 KOps/s | |
test_membership_stacked_nested_leaf | 31.3410μs | 2.0822μs | 480.2601 KOps/s | 476.6002 KOps/s | |
test_membership_nested_last | 32.4210μs | 3.1054μs | 322.0221 KOps/s | 324.3384 KOps/s | |
test_membership_nested_leaf_last | 31.1900μs | 3.0975μs | 322.8447 KOps/s | 315.9784 KOps/s | |
test_membership_stacked_nested_last | 46.7910μs | 3.6182μs | 276.3821 KOps/s | 320.4616 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.8410μs | 3.5629μs | 280.6670 KOps/s | 321.6560 KOps/s | |
test_nested_getleaf | 39.3500μs | 6.0683μs | 164.7898 KOps/s | 163.3576 KOps/s | |
test_nested_get | 39.1810μs | 5.8815μs | 170.0254 KOps/s | 172.1438 KOps/s | |
test_stacked_getleaf | 33.3500μs | 6.1348μs | 163.0058 KOps/s | 161.7016 KOps/s | |
test_stacked_get | 35.4200μs | 5.8548μs | 170.7995 KOps/s | 171.9199 KOps/s | |
test_nested_getitemleaf | 86.1210μs | 6.3183μs | 158.2700 KOps/s | 159.9705 KOps/s | |
test_nested_getitem | 38.3800μs | 5.8832μs | 169.9748 KOps/s | 168.5819 KOps/s | |
test_stacked_getitemleaf | 27.5500μs | 6.2313μs | 160.4807 KOps/s | 160.3193 KOps/s | |
test_stacked_getitem | 33.0310μs | 5.8979μs | 169.5518 KOps/s | 168.8157 KOps/s | |
test_lock_nested | 2.3644ms | 0.3788ms | 2.6396 KOps/s | 2.6742 KOps/s | |
test_lock_stack_nested | 0.3903ms | 0.3476ms | 2.8771 KOps/s | 2.8604 KOps/s | |
test_unlock_nested | 0.6712ms | 0.3161ms | 3.1638 KOps/s | 3.1403 KOps/s | |
test_unlock_stack_nested | 0.3427ms | 0.2865ms | 3.4902 KOps/s | 3.4771 KOps/s | |
test_flatten_speed | 0.1248ms | 75.5681μs | 13.2331 KOps/s | 13.3965 KOps/s | |
test_unflatten_speed | 0.3631ms | 0.3191ms | 3.1339 KOps/s | 3.0652 KOps/s | |
test_common_ops | 1.5321ms | 0.5902ms | 1.6943 KOps/s | 1.5735 KOps/s | |
test_creation | 0.1272ms | 1.7430μs | 573.7153 KOps/s | 572.0595 KOps/s | |
test_creation_empty | 39.8800μs | 7.0503μs | 141.8386 KOps/s | 106.8111 KOps/s | |
test_creation_nested_1 | 33.1300μs | 8.7508μs | 114.2751 KOps/s | 90.1545 KOps/s | |
test_creation_nested_2 | 48.9010μs | 11.4340μs | 87.4588 KOps/s | 71.6002 KOps/s | |
test_clone | 37.7500μs | 10.9849μs | 91.0337 KOps/s | 93.7749 KOps/s | |
test_getitem[int] | 1.2967ms | 10.9822μs | 91.0567 KOps/s | 92.8675 KOps/s | |
test_getitem[slice_int] | 0.1043ms | 21.5910μs | 46.3155 KOps/s | 47.5745 KOps/s | |
test_getitem[range] | 0.1258ms | 37.3665μs | 26.7619 KOps/s | 26.9623 KOps/s | |
test_getitem[tuple] | 0.1162ms | 18.3482μs | 54.5014 KOps/s | 55.1638 KOps/s | |
test_getitem[list] | 0.1310ms | 33.8179μs | 29.5702 KOps/s | 31.0787 KOps/s | |
test_setitem_dim[int] | 52.2810μs | 19.0133μs | 52.5948 KOps/s | 53.2200 KOps/s | |
test_setitem_dim[slice_int] | 65.2610μs | 39.0678μs | 25.5965 KOps/s | 25.7278 KOps/s | |
test_setitem_dim[range] | 78.0710μs | 53.1712μs | 18.8072 KOps/s | 18.8772 KOps/s | |
test_setitem_dim[tuple] | 52.6110μs | 32.7586μs | 30.5264 KOps/s | 30.5054 KOps/s | |
test_setitem | 39.8100μs | 14.9014μs | 67.1076 KOps/s | 62.4946 KOps/s | |
test_set | 0.1102ms | 14.2981μs | 69.9396 KOps/s | 64.8961 KOps/s | |
test_set_shared | 1.4656ms | 0.1544ms | 6.4768 KOps/s | 6.5125 KOps/s | |
test_update | 0.4809ms | 16.4460μs | 60.8052 KOps/s | 52.5554 KOps/s | |
test_update_nested | 0.1116ms | 21.7484μs | 45.9805 KOps/s | 39.6300 KOps/s | |
test_update__nested | 0.5117ms | 26.3716μs | 37.9195 KOps/s | 38.5446 KOps/s | |
test_set_nested | 0.1053ms | 15.6984μs | 63.7008 KOps/s | 59.6106 KOps/s | |
test_set_nested_new | 0.1134ms | 19.4429μs | 51.4327 KOps/s | 52.4635 KOps/s | |
test_select | 0.2117ms | 31.6616μs | 31.5840 KOps/s | 31.1518 KOps/s | |
test_select_nested | 76.9510μs | 43.8946μs | 22.7819 KOps/s | 22.1929 KOps/s | |
test_exclude_nested | 95.7210μs | 64.4857μs | 15.5073 KOps/s | 15.2444 KOps/s | |
test_empty[True] | 0.3634ms | 0.2886ms | 3.4656 KOps/s | 3.4423 KOps/s | |
test_empty[False] | 3.6050μs | 0.8794μs | 1.1371 MOps/s | 1.1268 MOps/s | |
test_to | 85.9710μs | 55.5945μs | 17.9874 KOps/s | 17.8097 KOps/s | |
test_to_nonblocking | 0.1001ms | 47.8488μs | 20.8992 KOps/s | 20.5748 KOps/s | |
test_unbind_speed | 1.3167ms | 0.2413ms | 4.1447 KOps/s | 4.0966 KOps/s | |
test_unbind_speed_stack0 | 0.2985ms | 0.2388ms | 4.1873 KOps/s | 4.1201 KOps/s | |
test_unbind_speed_stack1 | 92.7464ms | 0.6705ms | 1.4915 KOps/s | 1.4725 KOps/s | |
test_split | 0.1023s | 1.6410ms | 609.3768 Ops/s | 580.4213 Ops/s | |
test_chunk | 94.5760ms | 1.6440ms | 608.2829 Ops/s | 689.6312 Ops/s | |
test_consolidate[False-None] | 97.0657ms | 2.9420ms | 339.9036 Ops/s | 338.7293 Ops/s | |
test_consolidate[default-None] | 1.8557ms | 1.6606ms | 602.1899 Ops/s | 587.1979 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8157ms | 1.6654ms | 600.4476 Ops/s | 582.3687 Ops/s | |
test_consolidate_njt[False-None] | 7.0018ms | 6.4374ms | 155.3426 Ops/s | 152.2004 Ops/s | |
test_to[False-False-None] | 1.7558ms | 1.6903ms | 591.6038 Ops/s | 575.6510 Ops/s | |
test_to[True-False-None] | 1.5757ms | 1.2922ms | 773.8556 Ops/s | 769.0263 Ops/s | |
test_to[within-False-None] | 4.1775ms | 4.0587ms | 246.3849 Ops/s | 241.0604 Ops/s | |
test_to[True-default-None] | 5.4672ms | 5.2765ms | 189.5183 Ops/s | 187.6885 Ops/s | |
test_to_njt[False-False-None] | 7.3878ms | 7.1939ms | 139.0063 Ops/s | 142.5187 Ops/s | |
test_to_njt[True-False-None] | 5.7487ms | 5.4789ms | 182.5194 Ops/s | 178.3925 Ops/s | |
test_to_njt[within-False-None] | 12.2645ms | 12.1485ms | 82.3147 Ops/s | 72.2224 Ops/s | |
test_creation[device0] | 0.5495ms | 81.1264μs | 12.3265 KOps/s | 11.2823 KOps/s | |
test_creation_from_tensor | 0.5017ms | 86.4161μs | 11.5719 KOps/s | 11.0408 KOps/s | |
test_add_one[memmap_tensor0] | 0.3691ms | 6.8015μs | 147.0270 KOps/s | 142.9026 KOps/s | |
test_contiguous[memmap_tensor0] | 2.1380μs | 0.4167μs | 2.4000 MOps/s | 2.4434 MOps/s | |
test_stack[memmap_tensor0] | 38.5510μs | 4.3727μs | 228.6923 KOps/s | 231.5721 KOps/s | |
test_memmaptd_index | 1.7255ms | 0.2484ms | 4.0251 KOps/s | 3.9610 KOps/s | |
test_memmaptd_index_astensor | 0.5828ms | 0.3094ms | 3.2325 KOps/s | 3.1156 KOps/s | |
test_memmaptd_index_op | 1.0012ms | 0.5616ms | 1.7805 KOps/s | 1.6743 KOps/s | |
test_serialize_model | 0.1321s | 0.1314s | 7.6118 Ops/s | 7.5890 Ops/s | |
test_serialize_model_pickle | 1.3460s | 1.1876s | 0.8420 Ops/s | 0.8209 Ops/s | |
test_serialize_weights | 0.1325s | 0.1310s | 7.6312 Ops/s | 7.6661 Ops/s | |
test_serialize_weights_returnearly | 0.3336s | 63.4441ms | 15.7619 Ops/s | 14.4756 Ops/s | |
test_serialize_weights_pickle | 1.3739s | 1.2163s | 0.8221 Ops/s | 0.8224 Ops/s | |
test_reshape_pytree | 51.9500μs | 22.3578μs | 44.7271 KOps/s | 44.5452 KOps/s | |
test_reshape_td | 59.6710μs | 26.9967μs | 37.0415 KOps/s | 35.7033 KOps/s | |
test_view_pytree | 50.1110μs | 21.6009μs | 46.2944 KOps/s | 44.9342 KOps/s | |
test_view_td | 59.8610μs | 30.3092μs | 32.9933 KOps/s | 29.2340 KOps/s | |
test_unbind_pytree | 50.2710μs | 27.8523μs | 35.9036 KOps/s | 35.2426 KOps/s | |
test_unbind_td | 0.6800ms | 37.0917μs | 26.9602 KOps/s | 26.6820 KOps/s | |
test_split_pytree | 84.6310μs | 29.8697μs | 33.4788 KOps/s | 32.6548 KOps/s | |
test_split_td | 0.8649ms | 37.5477μs | 26.6328 KOps/s | 25.4148 KOps/s | |
test_add_pytree | 67.4510μs | 35.3247μs | 28.3088 KOps/s | 29.1597 KOps/s | |
test_add_td | 80.0600μs | 49.3906μs | 20.2468 KOps/s | 18.9396 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1742ms | 0.1199ms | 8.3376 KOps/s | 8.1252 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2254ms | 0.1309ms | 7.6415 KOps/s | 7.5766 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1315ms | 96.1276μs | 10.4028 KOps/s | 10.1436 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3379ms | 0.1480ms | 6.7550 KOps/s | 6.4166 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 52.4910μs | 22.5080μs | 44.4286 KOps/s | 43.7646 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 69.2300μs | 29.6788μs | 33.6940 KOps/s | 33.2408 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4040ms | 65.1314μs | 15.3536 KOps/s | 15.0758 KOps/s | |
test_compile_copy_nested[pytree-eager] | 81.4810μs | 49.9144μs | 20.0343 KOps/s | 19.9567 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1813ms | 0.1393ms | 7.1808 KOps/s | 6.9355 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3023ms | 0.2128ms | 4.6990 KOps/s | 4.6472 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1363ms | 97.3162μs | 10.2758 KOps/s | 9.6295 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1087ms | 53.4577μs | 18.7064 KOps/s | 18.2958 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1706ms | 0.1338ms | 7.4749 KOps/s | 7.3360 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5219ms | 0.4769ms | 2.0967 KOps/s | 2.0564 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3650ms | 0.2544ms | 3.9309 KOps/s | 3.8534 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1849ms | 0.1407ms | 7.1077 KOps/s | 7.0782 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1432ms | 64.3637μs | 15.5367 KOps/s | 15.4052 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1378ms | 97.3720μs | 10.2699 KOps/s | 10.0357 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4483ms | 0.3965ms | 2.5222 KOps/s | 2.2499 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1897ms | 0.1332ms | 7.5076 KOps/s | 7.3430 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 65.2400μs | 18.6865μs | 53.5147 KOps/s | 53.8026 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 62.4600μs | 31.2137μs | 32.0372 KOps/s | 31.3977 KOps/s | |
test_compile_copy_flat[pytree-compile] | 98.6310μs | 71.5256μs | 13.9810 KOps/s | 14.0690 KOps/s | |
test_compile_copy_flat[pytree-eager] | 83.8610μs | 52.1710μs | 19.1677 KOps/s | 19.0547 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6039ms | 0.3871ms | 2.5834 KOps/s | 2.2439 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7966ms | 2.6388ms | 378.9656 Ops/s | 375.7283 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5617ms | 0.3783ms | 2.6435 KOps/s | 2.2959 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.0672ms | 2.6315ms | 380.0088 Ops/s | 380.8688 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5408ms | 0.1135ms | 8.8119 KOps/s | 8.4167 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5571ms | 79.3188μs | 12.6074 KOps/s | 12.1770 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1570ms | 0.1091ms | 9.1626 KOps/s | 9.1922 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.4904ms | 70.9064μs | 14.1031 KOps/s | 13.5226 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.5252ms | 0.1121ms | 8.9203 KOps/s | 8.5657 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4921ms | 71.1533μs | 14.0542 KOps/s | 13.2496 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1419ms | 0.1023ms | 9.7749 KOps/s | 9.0016 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4433ms | 17.4918μs | 57.1695 KOps/s | 42.6618 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.5276ms | 97.2176μs | 10.2862 KOps/s | 9.2987 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.4219ms | 16.2723μs | 61.4540 KOps/s | 56.3722 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5147ms | 98.0239μs | 10.2016 KOps/s | 9.2245 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 50.8900μs | 16.0014μs | 62.4943 KOps/s | 56.3723 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5183ms | 0.1028ms | 9.7264 KOps/s | 8.8503 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5705ms | 16.9816μs | 58.8872 KOps/s | 50.9253 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5053ms | 97.9179μs | 10.2126 KOps/s | 9.2642 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 50.8610μs | 15.9733μs | 62.6044 KOps/s | 55.4241 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5065ms | 97.7447μs | 10.2307 KOps/s | 9.2579 KOps/s | |
test_compile_indexing[int-pytree-eager] | 49.3310μs | 15.8674μs | 63.0221 KOps/s | 55.8574 KOps/s | |
test_mod_add[eager] | 83.9610μs | 37.1592μs | 26.9112 KOps/s | 22.6949 KOps/s | |
test_mod_add[compile] | 0.1314ms | 79.1119μs | 12.6403 KOps/s | 11.1728 KOps/s | |
test_mod_add[compile-overhead] | 0.3185ms | 0.1680ms | 5.9537 KOps/s | 5.7415 KOps/s | |
test_mod_wrap[eager] | 0.3485ms | 0.2541ms | 3.9362 KOps/s | 3.5982 KOps/s | |
test_mod_wrap[compile] | 0.3583ms | 0.2825ms | 3.5403 KOps/s | 3.3914 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1035ms | 3.7376ms | 267.5484 Ops/s | 284.0270 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4628ms | 1.3637ms | 733.3218 Ops/s | 685.6287 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4035ms | 1.2697ms | 787.5975 Ops/s | 731.0145 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3929ms | 0.9451ms | 1.0580 KOps/s | 917.4492 Ops/s | |
test_seq_add[eager] | 0.1737ms | 0.1144ms | 8.7427 KOps/s | 7.9733 KOps/s | |
test_seq_add[compile] | 0.2239ms | 88.2074μs | 11.3369 KOps/s | 10.6016 KOps/s | |
test_seq_add[compile-overhead] | 0.2220ms | 0.1308ms | 7.6478 KOps/s | 7.4455 KOps/s | |
test_seq_wrap[eager] | 0.4757ms | 0.4141ms | 2.4150 KOps/s | 2.2568 KOps/s | |
test_seq_wrap[compile] | 0.3512ms | 0.2982ms | 3.3532 KOps/s | 3.1383 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3031ms | 0.2252ms | 4.4396 KOps/s | 4.4258 KOps/s | |
test_func_call_runtime[False-eager] | 0.8095ms | 0.7347ms | 1.3612 KOps/s | 1.2808 KOps/s | |
test_func_call_runtime[False-compile] | 0.9993ms | 0.7399ms | 1.3515 KOps/s | 1.2427 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4137ms | 0.3639ms | 2.7481 KOps/s | 2.6134 KOps/s | |
test_func_call_runtime[True-eager] | 0.9577ms | 0.8982ms | 1.1134 KOps/s | 1.0099 KOps/s | |
test_func_call_runtime[True-compile] | 0.8223ms | 0.7602ms | 1.3154 KOps/s | 1.3178 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4333ms | 0.3861ms | 2.5903 KOps/s | 2.5933 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7937ms | 0.7291ms | 1.3715 KOps/s | 1.3189 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7917ms | 0.7450ms | 1.3422 KOps/s | 1.3419 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4233ms | 0.3664ms | 2.7293 KOps/s | 2.7303 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1019ms | 1.0022ms | 997.8537 Ops/s | 981.4942 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8486ms | 0.7868ms | 1.2710 KOps/s | 1.2595 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4643ms | 0.4096ms | 2.4417 KOps/s | 2.4079 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5653ms | 2.0881ms | 478.9064 Ops/s | 471.5418 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9146ms | 0.8147ms | 1.2274 KOps/s | 1.2337 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4895ms | 0.4137ms | 2.4171 KOps/s | 2.4154 KOps/s | |
test_distributed | 3.2015ms | 0.1859ms | 5.3778 KOps/s | 8.3218 KOps/s | |
test_tdmodule | 43.9300μs | 19.2564μs | 51.9307 KOps/s | 47.7854 KOps/s | |
test_tdmodule_dispatch | 58.5310μs | 34.3470μs | 29.1147 KOps/s | 25.6213 KOps/s | |
test_tdseq | 38.9710μs | 20.2790μs | 49.3121 KOps/s | 45.5595 KOps/s | |
test_tdseq_dispatch | 62.0210μs | 37.4576μs | 26.6969 KOps/s | 24.6543 KOps/s | |
test_instantiation_functorch | 1.6488ms | 1.5751ms | 634.8923 Ops/s | 636.5040 Ops/s | |
test_exec_functorch | 0.2058ms | 0.1481ms | 6.7522 KOps/s | 6.8860 KOps/s | |
test_exec_functional_call | 0.1855ms | 0.1394ms | 7.1745 KOps/s | 7.1819 KOps/s | |
test_exec_td_decorator | 0.3850ms | 0.1862ms | 5.3712 KOps/s | 5.3137 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7547ms | 0.6865ms | 1.4567 KOps/s | 1.3735 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8248ms | 0.6827ms | 1.4647 KOps/s | 1.3662 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7093ms | 0.5978ms | 1.6729 KOps/s | 1.5735 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7109ms | 0.5958ms | 1.6785 KOps/s | 1.5676 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.3633ms | 19.1812ms | 52.1343 Ops/s | 51.4574 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.9269ms | 19.2612ms | 51.9180 Ops/s | 51.2986 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.2654ms | 19.1301ms | 52.2736 Ops/s | 51.4826 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3817ms | 19.0823ms | 52.4045 Ops/s | 51.7839 Ops/s | |
test_to_module_speed[True] | 1.0657ms | 0.9748ms | 1.0258 KOps/s | 1.0112 KOps/s | |
test_to_module_speed[False] | 1.3397ms | 0.9604ms | 1.0412 KOps/s | 1.0360 KOps/s | |
test_tc_init | 61.0910μs | 35.2047μs | 28.4053 KOps/s | 26.6634 KOps/s | |
test_tc_init_nested | 0.1128ms | 69.5846μs | 14.3710 KOps/s | 13.4726 KOps/s | |
test_tc_first_layer_tensor | 20.7500μs | 0.8281μs | 1.2076 MOps/s | 1.2239 MOps/s | |
test_tc_first_layer_nontensor | 27.8310μs | 2.3162μs | 431.7339 KOps/s | 432.8706 KOps/s | |
test_tc_second_layer_tensor | 9.1653μs | 1.4424μs | 693.2900 KOps/s | 682.9727 KOps/s | |
test_tc_second_layer_nontensor | 25.9700μs | 3.0829μs | 324.3676 KOps/s | 323.9778 KOps/s | |
test_unbind | 0.2430s | 10.3176ms | 96.9216 Ops/s | 141.7124 Ops/s | |
test_full_like | 12.1733ms | 9.2502ms | 108.1055 Ops/s | 106.6248 Ops/s | |
test_zeros_like | 6.0543ms | 4.3367ms | 230.5900 Ops/s | 230.3083 Ops/s | |
test_ones_like | 5.0743ms | 4.4411ms | 225.1701 Ops/s | 225.5189 Ops/s | |
test_clone | 11.4594ms | 9.2436ms | 108.1829 Ops/s | 153.6019 Ops/s | |
test_squeeze | 62.8310μs | 9.8033μs | 102.0069 KOps/s | 103.6975 KOps/s | |
test_unsqueeze | 0.1233ms | 74.1847μs | 13.4799 KOps/s | 13.4128 KOps/s | |
test_split | 0.3816ms | 0.1629ms | 6.1382 KOps/s | 5.9653 KOps/s | |
test_permute | 0.2197ms | 0.1759ms | 5.6855 KOps/s | 5.3312 KOps/s | |
test_stack | 51.3354ms | 51.0684ms | 19.5816 Ops/s | 19.5028 Ops/s | |
test_cat | 52.8748ms | 51.7185ms | 19.3354 Ops/s | 19.2463 Ops/s |
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):