[Feature] inplace to method #1066
Merged
vmoens added a commit that referenced this pull request on Oct 31, 2024
ghstack-source-id: 32b7865fa9378527b6091d25d88f6d1ee385a4ac
Pull Request resolved: #1066
facebook-github-bot added the CLA Signed label on Oct 31, 2024

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 46.1670μs | 20.9429μs | 47.7490 KOps/s | 45.4332 KOps/s | |
test_plain_set_stack_nested | 58.2890μs | 21.1164μs | 47.3566 KOps/s | 45.1703 KOps/s | |
test_plain_set_nested_inplace | 86.2920μs | 22.8909μs | 43.6855 KOps/s | 41.3898 KOps/s | |
test_plain_set_stack_nested_inplace | 61.9370μs | 22.6680μs | 44.1151 KOps/s | 41.6934 KOps/s | |
test_items | 48.4510μs | 4.1533μs | 240.7745 KOps/s | 240.0567 KOps/s | |
test_items_nested | 0.5780ms | 0.3379ms | 2.9596 KOps/s | 2.8725 KOps/s | |
test_items_nested_locked | 0.4899ms | 0.3383ms | 2.9561 KOps/s | 2.8479 KOps/s | |
test_items_nested_leaf | 0.1320ms | 71.7154μs | 13.9440 KOps/s | 14.0440 KOps/s | |
test_items_stack_nested | 0.5759ms | 0.3400ms | 2.9415 KOps/s | 2.8189 KOps/s | |
test_items_stack_nested_leaf | 0.1580ms | 77.7271μs | 12.8655 KOps/s | 13.5260 KOps/s | |
test_items_stack_nested_locked | 0.5941ms | 0.3377ms | 2.9614 KOps/s | 2.8253 KOps/s | |
test_keys | 53.4310μs | 3.5802μs | 279.3136 KOps/s | 283.3679 KOps/s | |
test_keys_nested | 0.2336ms | 0.1379ms | 7.2521 KOps/s | 7.2878 KOps/s | |
test_keys_nested_locked | 0.7439ms | 0.1426ms | 7.0115 KOps/s | 6.9019 KOps/s | |
test_keys_nested_leaf | 0.1853ms | 0.1181ms | 8.4703 KOps/s | 8.4459 KOps/s | |
test_keys_stack_nested | 0.2327ms | 0.1369ms | 7.3046 KOps/s | 7.1018 KOps/s | |
test_keys_stack_nested_leaf | 0.1876ms | 0.1157ms | 8.6423 KOps/s | 8.4014 KOps/s | |
test_keys_stack_nested_locked | 0.2256ms | 0.1415ms | 7.0664 KOps/s | 6.9569 KOps/s | |
test_values | 9.6982μs | 1.0418μs | 959.9060 KOps/s | 884.5101 KOps/s | |
test_values_nested | 0.1239ms | 54.0378μs | 18.5056 KOps/s | 17.8612 KOps/s | |
test_values_nested_locked | 0.1387ms | 54.4313μs | 18.3718 KOps/s | 17.2713 KOps/s | |
test_values_nested_leaf | 0.1134ms | 58.9705μs | 16.9576 KOps/s | 16.4735 KOps/s | |
test_values_stack_nested | 0.1187ms | 55.1573μs | 18.1300 KOps/s | 17.5607 KOps/s | |
test_values_stack_nested_leaf | 0.1220ms | 59.1109μs | 16.9173 KOps/s | 16.4018 KOps/s | |
test_values_stack_nested_locked | 0.1325ms | 56.0455μs | 17.8426 KOps/s | 17.5876 KOps/s | |
test_membership | 17.0320μs | 0.8596μs | 1.1633 MOps/s | 1.1148 MOps/s | |
test_membership_nested | 51.4670μs | 2.6961μs | 370.9118 KOps/s | 358.0303 KOps/s | |
test_membership_nested_leaf | 54.8130μs | 2.7543μs | 363.0740 KOps/s | 353.3954 KOps/s | |
test_membership_stacked_nested | 29.4860μs | 2.6996μs | 370.4297 KOps/s | 366.1551 KOps/s | |
test_membership_stacked_nested_leaf | 22.6020μs | 2.7095μs | 369.0683 KOps/s | 362.0266 KOps/s | |
test_membership_nested_last | 63.6900μs | 3.9746μs | 251.5981 KOps/s | 238.6268 KOps/s | |
test_membership_nested_leaf_last | 22.9930μs | 3.9993μs | 250.0463 KOps/s | 243.1771 KOps/s | |
test_membership_stacked_nested_last | 62.6080μs | 3.9843μs | 250.9844 KOps/s | 195.0924 KOps/s | |
test_membership_stacked_nested_leaf_last | 22.8330μs | 3.9775μs | 251.4166 KOps/s | 194.0147 KOps/s | |
test_nested_getleaf | 61.7930μs | 10.5568μs | 94.7254 KOps/s | 94.8211 KOps/s | |
test_nested_get | 44.4540μs | 10.0406μs | 99.5955 KOps/s | 99.3527 KOps/s | |
test_stacked_getleaf | 65.2430μs | 10.5101μs | 95.1467 KOps/s | 94.4373 KOps/s | |
test_stacked_get | 55.6750μs | 10.0981μs | 99.0282 KOps/s | 98.1721 KOps/s | |
test_nested_getitemleaf | 31.6890μs | 10.9413μs | 91.3968 KOps/s | 91.2016 KOps/s | |
test_nested_getitem | 68.1640μs | 10.1128μs | 98.8847 KOps/s | 97.7223 KOps/s | |
test_stacked_getitemleaf | 47.2790μs | 10.9241μs | 91.5411 KOps/s | 91.2837 KOps/s | |
test_stacked_getitem | 58.0600μs | 10.1716μs | 98.3133 KOps/s | 97.7831 KOps/s | |
test_lock_nested | 1.1156ms | 0.4894ms | 2.0432 KOps/s | 2.0063 KOps/s | |
test_lock_stack_nested | 0.5829ms | 0.4579ms | 2.1837 KOps/s | 2.1757 KOps/s | |
test_unlock_nested | 0.9193ms | 0.4141ms | 2.4146 KOps/s | 2.4018 KOps/s | |
test_unlock_stack_nested | 0.4551ms | 0.3797ms | 2.6336 KOps/s | 2.6653 KOps/s | |
test_flatten_speed | 0.1555ms | 91.4205μs | 10.9385 KOps/s | 10.9349 KOps/s | |
test_unflatten_speed | 0.9400ms | 0.4756ms | 2.1026 KOps/s | 2.1239 KOps/s | |
test_common_ops | 2.1151ms | 1.1041ms | 905.6775 Ops/s | 857.6283 Ops/s | |
test_creation | 19.7970μs | 2.0819μs | 480.3386 KOps/s | 484.9301 KOps/s | |
test_creation_empty | 76.0430μs | 16.9018μs | 59.1655 KOps/s | 48.7025 KOps/s | |
test_creation_nested_1 | 50.5850μs | 19.9746μs | 50.0637 KOps/s | 42.0166 KOps/s | |
test_creation_nested_2 | 60.3240μs | 24.1817μs | 41.3536 KOps/s | 35.7312 KOps/s | |
test_clone | 0.1477ms | 16.9144μs | 59.1212 KOps/s | 58.3865 KOps/s | |
test_getitem[int] | 1.0153ms | 16.1043μs | 62.0951 KOps/s | 58.0459 KOps/s | |
test_getitem[slice_int] | 0.1422ms | 29.3899μs | 34.0253 KOps/s | 32.2973 KOps/s | |
test_getitem[range] | 0.1907ms | 57.6053μs | 17.3595 KOps/s | 16.7319 KOps/s | |
test_getitem[tuple] | 0.1422ms | 24.6151μs | 40.6255 KOps/s | 39.4732 KOps/s | |
test_getitem[list] | 0.2022ms | 53.0821μs | 18.8387 KOps/s | 18.3361 KOps/s | |
test_setitem_dim[int] | 79.9300μs | 32.6662μs | 30.6127 KOps/s | 30.1650 KOps/s | |
test_setitem_dim[slice_int] | 0.1040ms | 62.3028μs | 16.0506 KOps/s | 15.7892 KOps/s | |
test_setitem_dim[range] | 0.1337ms | 83.7317μs | 11.9429 KOps/s | 11.5702 KOps/s | |
test_setitem_dim[tuple] | 0.1373ms | 50.2577μs | 19.8975 KOps/s | 19.4766 KOps/s | |
test_setitem | 0.1110ms | 29.7028μs | 33.6669 KOps/s | 32.1597 KOps/s | |
test_set | 64.9120μs | 28.4617μs | 35.1349 KOps/s | 33.0120 KOps/s | |
test_set_shared | 3.2396ms | 0.2201ms | 4.5440 KOps/s | 4.4272 KOps/s | |
test_update | 0.4204ms | 35.5545μs | 28.1258 KOps/s | 26.0391 KOps/s | |
test_update_nested | 1.1170ms | 45.7225μs | 21.8711 KOps/s | 20.3383 KOps/s | |
test_update__nested | 95.6090μs | 40.0546μs | 24.9659 KOps/s | 23.7815 KOps/s | |
test_set_nested | 0.3002ms | 31.2762μs | 31.9732 KOps/s | 29.2876 KOps/s | |
test_set_nested_new | 0.3510ms | 35.5206μs | 28.1527 KOps/s | 25.8865 KOps/s | |
test_select | 0.3342ms | 53.6824μs | 18.6281 KOps/s | 17.3873 KOps/s | |
test_select_nested | 0.3838ms | 60.9803μs | 16.3988 KOps/s | 16.8286 KOps/s | |
test_exclude_nested | 0.1628ms | 75.2621μs | 13.2869 KOps/s | 13.3748 KOps/s | |
test_empty[True] | 0.5573ms | 0.3512ms | 2.8470 KOps/s | 2.7878 KOps/s | |
test_empty[False] | 6.8652μs | 1.2542μs | 797.3460 KOps/s | 805.4425 KOps/s | |
test_unbind_speed | 0.4572ms | 0.3050ms | 3.2783 KOps/s | 3.3336 KOps/s | |
test_unbind_speed_stack0 | 0.9315ms | 0.3041ms | 3.2883 KOps/s | 3.3966 KOps/s | |
test_unbind_speed_stack1 | 0.1173s | 0.8522ms | 1.1734 KOps/s | 1.3367 KOps/s | |
test_split | 0.1144s | 2.1895ms | 456.7323 Ops/s | 453.2778 Ops/s | |
test_chunk | 2.2049ms | 1.9688ms | 507.9122 Ops/s | 504.5663 Ops/s | |
test_creation[device0] | 0.2558ms | 0.1156ms | 8.6524 KOps/s | 8.4350 KOps/s | |
test_creation_from_tensor | 4.3544ms | 0.1185ms | 8.4386 KOps/s | 8.3198 KOps/s | |
test_add_one[memmap_tensor0] | 91.5720μs | 7.3269μs | 136.4834 KOps/s | 141.5601 KOps/s | |
test_contiguous[memmap_tensor0] | 27.2910μs | 1.8672μs | 535.5663 KOps/s | 533.8664 KOps/s | |
test_stack[memmap_tensor0] | 31.8100μs | 5.4783μs | 182.5374 KOps/s | 181.7802 KOps/s | |
test_memmaptd_index | 1.0072ms | 0.3981ms | 2.5120 KOps/s | 2.5366 KOps/s | |
test_memmaptd_index_astensor | 0.9036ms | 0.4715ms | 2.1207 KOps/s | 2.0799 KOps/s | |
test_memmaptd_index_op | 0.1177s | 1.1101ms | 900.8298 Ops/s | 956.0935 Ops/s | |
test_serialize_model | 0.1275s | 0.1195s | 8.3706 Ops/s | 8.1488 Ops/s | |
test_serialize_model_pickle | 0.4511s | 0.3999s | 2.5008 Ops/s | 2.5489 Ops/s | |
test_serialize_weights | 0.1270s | 0.1192s | 8.3858 Ops/s | 7.2882 Ops/s | |
test_serialize_weights_returnearly | 0.1617s | 0.1572s | 6.3605 Ops/s | 6.0065 Ops/s | |
test_serialize_weights_pickle | 0.5331s | 0.4259s | 2.3479 Ops/s | 1.0899 Ops/s | |
test_serialize_weights_filesystem | 0.1560s | 0.1456s | 6.8682 Ops/s | 6.8425 Ops/s | |
test_serialize_model_filesystem | 0.1669s | 0.1537s | 6.5065 Ops/s | 6.8344 Ops/s | |
test_reshape_pytree | 84.3490μs | 38.5780μs | 25.9215 KOps/s | 25.3850 KOps/s | |
test_reshape_td | 0.1185ms | 45.9454μs | 21.7650 KOps/s | 21.0986 KOps/s | |
test_view_pytree | 0.1026ms | 39.0295μs | 25.6217 KOps/s | 25.3634 KOps/s | |
test_view_td | 97.1130μs | 50.8303μs | 19.6733 KOps/s | 19.0634 KOps/s | |
test_unbind_pytree | 83.1360μs | 35.6014μs | 28.0888 KOps/s | 27.9091 KOps/s | |
test_unbind_td | 0.3200ms | 47.1326μs | 21.2168 KOps/s | 22.8455 KOps/s | |
test_split_pytree | 96.9020μs | 37.5107μs | 26.6591 KOps/s | 26.4192 KOps/s | |
test_split_td | 0.5027ms | 56.9549μs | 17.5578 KOps/s | 17.1655 KOps/s | |
test_add_pytree | 96.5910μs | 43.5265μs | 22.9745 KOps/s | 22.1305 KOps/s | |
test_add_td | 0.1539ms | 77.4867μs | 12.9054 KOps/s | 11.6506 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1647ms | 70.2337μs | 14.2382 KOps/s | 13.7545 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4184ms | 0.1885ms | 5.3040 KOps/s | 5.3154 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1301ms | 53.5861μs | 18.6616 KOps/s | 18.0335 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2794ms | 0.1447ms | 6.9118 KOps/s | 6.9236 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1121ms | 25.4259μs | 39.3300 KOps/s | 38.9705 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1530ms | 71.0660μs | 14.0714 KOps/s | 14.0544 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1487ms | 78.7490μs | 12.6986 KOps/s | 12.7404 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1524ms | 67.1805μs | 14.8853 KOps/s | 14.8879 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2453ms | 0.1137ms | 8.7954 KOps/s | 8.6484 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3595ms | 0.2031ms | 4.9244 KOps/s | 4.7985 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1688ms | 52.6788μs | 18.9830 KOps/s | 18.2186 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5234ms | 69.9675μs | 14.2924 KOps/s | 13.8949 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2002ms | 0.1116ms | 8.9600 KOps/s | 8.4204 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5893ms | 0.2992ms | 3.3422 KOps/s | 3.2860 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4420ms | 0.2158ms | 4.6349 KOps/s | 4.5004 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2411ms | 0.1142ms | 8.7557 KOps/s | 8.6858 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1735ms | 62.6444μs | 15.9631 KOps/s | 15.8214 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1482ms | 53.4595μs | 18.7057 KOps/s | 18.3044 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 1.5280ms | 0.2446ms | 4.0883 KOps/s | 4.0092 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2493ms | 0.1116ms | 8.9616 KOps/s | 8.8829 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 88.0450μs | 20.6605μs | 48.4015 KOps/s | 48.4686 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1367ms | 59.9024μs | 16.6938 KOps/s | 16.7074 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1519ms | 79.8175μs | 12.5286 KOps/s | 12.2772 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1834ms | 68.6795μs | 14.5604 KOps/s | 14.4755 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4594ms | 0.2188ms | 4.5711 KOps/s | 4.6421 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9152ms | 1.7100ms | 584.7816 Ops/s | 570.5405 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3204ms | 0.2139ms | 4.6748 KOps/s | 4.6431 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.9339ms | 1.1559ms | 865.0946 Ops/s | 843.5046 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.8084ms | 0.4662ms | 2.1450 KOps/s | 2.1129 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.0754ms | 3.8452ms | 260.0624 Ops/s | 247.8840 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1203ms | 43.7280μs | 22.8687 KOps/s | 21.8769 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6090ms | 49.8725μs | 20.0511 KOps/s | 19.5400 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 96.2910μs | 36.6975μs | 27.2498 KOps/s | 26.0237 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 89.4680μs | 29.1207μs | 34.3398 KOps/s | 33.6012 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 90.9410μs | 37.3903μs | 26.7449 KOps/s | 25.5748 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 97.3630μs | 28.7530μs | 34.7790 KOps/s | 33.9732 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1577ms | 76.5999μs | 13.0548 KOps/s | 12.8328 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5728ms | 28.6722μs | 34.8770 KOps/s | 34.1295 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1357ms | 69.7568μs | 14.3355 KOps/s | 14.3199 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1007ms | 23.7377μs | 42.1272 KOps/s | 41.2170 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1352ms | 70.1601μs | 14.2531 KOps/s | 14.1636 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 73.8790μs | 23.7016μs | 42.1912 KOps/s | 41.5830 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1864ms | 78.1573μs | 12.7947 KOps/s | 12.7884 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8219ms | 28.3456μs | 35.2788 KOps/s | 35.1627 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1402ms | 69.8445μs | 14.3175 KOps/s | 13.6660 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 73.3080μs | 23.6491μs | 42.2849 KOps/s | 41.9597 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1509ms | 69.8194μs | 14.3227 KOps/s | 13.6324 KOps/s | |
test_compile_indexing[int-pytree-eager] | 63.8300μs | 23.5241μs | 42.5096 KOps/s | 42.6020 KOps/s | |
test_mod_add[eager] | 72.4560μs | 24.3053μs | 41.1432 KOps/s | 36.8900 KOps/s | |
test_mod_add[compile] | 0.1076ms | 43.1117μs | 23.1955 KOps/s | 22.5564 KOps/s | |
test_mod_add[compile-overhead] | 0.1008ms | 42.6709μs | 23.4352 KOps/s | 22.1718 KOps/s | |
test_mod_wrap[eager] | 0.3390ms | 0.2077ms | 4.8135 KOps/s | 4.6721 KOps/s | |
test_mod_wrap[compile] | 1.8092ms | 0.1997ms | 5.0080 KOps/s | 4.8227 KOps/s | |
test_mod_wrap[compile-overhead] | 1.7529ms | 0.2016ms | 4.9606 KOps/s | 4.8613 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.5207ms | 11.3407ms | 88.1778 Ops/s | 90.2553 Ops/s | |
test_mod_wrap_and_backward[compile] | 19.0059ms | 11.8815ms | 84.1642 Ops/s | 89.7684 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.9569ms | 12.6119ms | 79.2902 Ops/s | 89.8863 Ops/s | |
test_seq_add[eager] | 0.1764ms | 87.1066μs | 11.4802 KOps/s | 10.3719 KOps/s | |
test_seq_add[compile] | 0.1219ms | 58.1629μs | 17.1931 KOps/s | 16.2814 KOps/s | |
test_seq_add[compile-overhead] | 0.1534ms | 57.5352μs | 17.3807 KOps/s | 16.7732 KOps/s | |
test_seq_wrap[eager] | 0.5616ms | 0.3743ms | 2.6716 KOps/s | 2.4670 KOps/s | |
test_seq_wrap[compile] | 0.3987ms | 0.2250ms | 4.4442 KOps/s | 4.3399 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3200ms | 0.2234ms | 4.4770 KOps/s | 4.3969 KOps/s | |
test_func_call_runtime[False-eager] | 0.9572ms | 0.5519ms | 1.8118 KOps/s | 1.8228 KOps/s | |
test_func_call_runtime[False-compile] | 0.5274ms | 0.4304ms | 2.3234 KOps/s | 2.2780 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5307ms | 0.4255ms | 2.3503 KOps/s | 2.2853 KOps/s | |
test_func_call_runtime[True-eager] | 0.8892ms | 0.7514ms | 1.3309 KOps/s | 1.3004 KOps/s | |
test_func_call_runtime[True-compile] | 0.6404ms | 0.4662ms | 2.1448 KOps/s | 2.1202 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8944ms | 0.4730ms | 2.1143 KOps/s | 2.1161 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7940ms | 0.5483ms | 1.8238 KOps/s | 1.8342 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6555ms | 0.4275ms | 2.3391 KOps/s | 2.3015 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5479ms | 0.4242ms | 2.3572 KOps/s | 2.3016 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0332ms | 0.8980ms | 1.1136 KOps/s | 1.1211 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6536ms | 0.4922ms | 2.0316 KOps/s | 2.0167 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5863ms | 0.4922ms | 2.0316 KOps/s | 2.0024 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5050ms | 1.9084ms | 524.0072 Ops/s | 528.7362 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8460ms | 0.5199ms | 1.9234 KOps/s | 1.9135 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6439ms | 0.5227ms | 1.9131 KOps/s | 1.9267 KOps/s | |
test_distributed | 0.3678ms | 0.1270ms | 7.8722 KOps/s | 7.7991 KOps/s | |
test_tdmodule | 55.3440μs | 18.1125μs | 55.2106 KOps/s | 51.8127 KOps/s | |
test_tdmodule_dispatch | 80.7820μs | 35.6345μs | 28.0627 KOps/s | 25.8066 KOps/s | |
test_tdseq | 42.0390μs | 20.9394μs | 47.7568 KOps/s | 44.4626 KOps/s | |
test_tdseq_dispatch | 82.3540μs | 41.7539μs | 23.9499 KOps/s | 22.5720 KOps/s | |
test_instantiation_functorch | 1.8436ms | 1.5452ms | 647.1466 Ops/s | 651.8166 Ops/s | |
test_exec_functorch | 0.3615ms | 0.1816ms | 5.5056 KOps/s | 5.4853 KOps/s | |
test_exec_functional_call | 0.3265ms | 0.1731ms | 5.7768 KOps/s | 5.7039 KOps/s | |
test_exec_td_decorator | 0.5766ms | 0.2283ms | 4.3802 KOps/s | 4.3810 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8702ms | 0.6419ms | 1.5579 KOps/s | 1.5522 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1209ms | 0.6485ms | 1.5420 KOps/s | 1.5557 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7337ms | 0.5235ms | 1.9103 KOps/s | 1.9168 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8346ms | 0.5262ms | 1.9003 KOps/s | 1.9165 KOps/s | |
test_to_module_speed[True] | 1.5546ms | 1.2929ms | 773.4827 Ops/s | 783.7500 Ops/s | |
test_to_module_speed[False] | 1.5330ms | 1.2657ms | 790.0819 Ops/s | 805.9318 Ops/s | |
test_tc_init | 81.2520μs | 41.7080μs | 23.9762 KOps/s | 21.4743 KOps/s | |
test_tc_init_nested | 0.1515ms | 84.1725μs | 11.8804 KOps/s | 10.4737 KOps/s | |
test_tc_first_layer_tensor | 27.8020μs | 1.5003μs | 666.5140 KOps/s | 655.2985 KOps/s | |
test_tc_first_layer_nontensor | 24.8270μs | 4.5434μs | 220.1012 KOps/s | 214.4797 KOps/s | |
test_tc_second_layer_tensor | 46.2070μs | 2.7897μs | 358.4557 KOps/s | 354.0361 KOps/s | |
test_tc_second_layer_nontensor | 30.3870μs | 5.8564μs | 170.7548 KOps/s | 167.0371 KOps/s | |
test_unbind | 0.2419s | 12.6617ms | 78.9781 Ops/s | 73.6440 Ops/s | |
test_full_like | 9.5052ms | 8.6345ms | 115.8150 Ops/s | 122.6403 Ops/s | |
test_zeros_like | 3.7492ms | 3.3267ms | 300.5989 Ops/s | 323.0903 Ops/s | |
test_ones_like | 4.6552ms | 4.0022ms | 249.8627 Ops/s | 269.7988 Ops/s | |
test_clone | 6.6665ms | 5.7364ms | 174.3263 Ops/s | 180.3168 Ops/s | |
test_squeeze | 64.5010μs | 11.9021μs | 84.0187 KOps/s | 87.0921 KOps/s | |
test_unsqueeze | 0.1537ms | 86.5096μs | 11.5594 KOps/s | 11.1162 KOps/s | |
test_split | 0.5450ms | 0.1897ms | 5.2709 KOps/s | 5.2453 KOps/s | |
test_permute | 0.4429ms | 0.2177ms | 4.5935 KOps/s | 4.6096 KOps/s | |
test_stack | 32.8868ms | 26.4424ms | 37.8180 Ops/s | 38.5957 Ops/s | |
test_cat | 29.0336ms | 25.9325ms | 38.5617 Ops/s | 38.7408 Ops/s |
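
The Change column is empty in this copy of the results. As a rough aid for reading the table, the sketch below computes a relative change per row, assuming that Change means the percentage difference between Ops and Ops on Repo HEAD; the benchmark bot's actual formula is not shown here, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
import re

# Multipliers for the throughput units that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.7490 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD).

    Assumption: the row uses the column order of the table above
    (Name | Max | Mean | Ops | Ops on Repo HEAD | Change).
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, ops_head = cells[0], cells[3], cells[4]
    new, base = parse_ops(ops), parse_ops(ops_head)
    return name, (new - base) / base * 100.0


# Two rows copied from the table above.
rows = [
    "test_plain_set_nested | 46.1670μs | 20.9429μs | 47.7490 KOps/s | 45.4332 KOps/s | |",
    "test_common_ops | 2.1151ms | 1.1041ms | 905.6775 Ops/s | 857.6283 Ops/s | |",
]
for row in rows:
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.2f}%")
```

A positive value means the PR branch ran more operations per second than repo HEAD under this assumption. Single-run differences of a few percent are often within benchmark noise, so the sign alone should not be read as a regression or an improvement.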

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 38.9210μs | 14.5762μs | 68.6049 KOps/s | 64.0178 KOps/s | |
test_plain_set_stack_nested | 38.6410μs | 14.6762μs | 68.1375 KOps/s | 63.3446 KOps/s | |
test_plain_set_nested_inplace | 39.2800μs | 15.6404μs | 63.9369 KOps/s | 59.7760 KOps/s | |
test_plain_set_stack_nested_inplace | 42.5810μs | 15.6738μs | 63.8008 KOps/s | 59.8821 KOps/s | |
test_items | 39.8700μs | 2.9055μs | 344.1699 KOps/s | 342.0513 KOps/s | |
test_items_nested | 0.3637ms | 0.3217ms | 3.1082 KOps/s | 3.0973 KOps/s | |
test_items_nested_locked | 0.3517ms | 0.3211ms | 3.1145 KOps/s | 3.0863 KOps/s | |
test_items_nested_leaf | 0.6102ms | 58.3294μs | 17.1440 KOps/s | 17.2420 KOps/s | |
test_items_stack_nested | 0.7068ms | 0.3227ms | 3.0991 KOps/s | 3.0965 KOps/s | |
test_items_stack_nested_leaf | 88.4210μs | 58.4659μs | 17.1040 KOps/s | 16.6262 KOps/s | |
test_items_stack_nested_locked | 0.7032ms | 0.3238ms | 3.0883 KOps/s | 3.0624 KOps/s | |
test_keys | 0.3790ms | 3.4616μs | 288.8877 KOps/s | 289.4447 KOps/s | |
test_keys_nested | 0.4505ms | 70.3871μs | 14.2072 KOps/s | 14.2678 KOps/s | |
test_keys_nested_locked | 2.4567ms | 76.2497μs | 13.1148 KOps/s | 13.1993 KOps/s | |
test_keys_nested_leaf | 95.0510μs | 61.9535μs | 16.1411 KOps/s | 16.3160 KOps/s | |
test_keys_stack_nested | 0.4506ms | 71.3005μs | 14.0252 KOps/s | 14.1768 KOps/s | |
test_keys_stack_nested_leaf | 0.4422ms | 62.6134μs | 15.9710 KOps/s | 15.9038 KOps/s | |
test_keys_stack_nested_locked | 0.1245ms | 76.0139μs | 13.1555 KOps/s | 13.1211 KOps/s | |
test_values | 4.8333μs | 0.8542μs | 1.1706 MOps/s | 1.1775 MOps/s | |
test_values_nested | 57.8310μs | 31.2295μs | 32.0210 KOps/s | 32.1747 KOps/s | |
test_values_nested_locked | 62.7010μs | 32.6542μs | 30.6240 KOps/s | 30.4721 KOps/s | |
test_values_nested_leaf | 70.2310μs | 33.2457μs | 30.0791 KOps/s | 29.7902 KOps/s | |
test_values_stack_nested | 67.6410μs | 31.3935μs | 31.8537 KOps/s | 31.2933 KOps/s | |
test_values_stack_nested_leaf | 67.9320μs | 33.8009μs | 29.5850 KOps/s | 29.1902 KOps/s | |
test_values_stack_nested_locked | 61.7710μs | 32.8491μs | 30.4423 KOps/s | 30.0234 KOps/s | |
test_membership | 2.3231μs | 0.5059μs | 1.9768 MOps/s | 1.9706 MOps/s | |
test_membership_nested | 20.7805μs | 1.8738μs | 533.6757 KOps/s | 524.7453 KOps/s | |
test_membership_nested_leaf | 18.0805μs | 1.8793μs | 532.1194 KOps/s | 524.9581 KOps/s | |
test_membership_stacked_nested | 22.3910μs | 1.9290μs | 518.4028 KOps/s | 510.5460 KOps/s | |
test_membership_stacked_nested_leaf | 36.6910μs | 1.9450μs | 514.1324 KOps/s | 513.2274 KOps/s | |
test_membership_nested_last | 29.9110μs | 2.8213μs | 354.4424 KOps/s | 351.8159 KOps/s | |
test_membership_nested_leaf_last | 26.9410μs | 2.8484μs | 351.0776 KOps/s | 350.5424 KOps/s | |
test_membership_stacked_nested_last | 19.3710μs | 2.7976μs | 357.4501 KOps/s | 265.8115 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.8910μs | 2.8175μs | 354.9242 KOps/s | 264.4126 KOps/s | |
test_nested_getleaf | 29.2000μs | 6.0158μs | 166.2277 KOps/s | 164.4757 KOps/s | |
test_nested_get | 31.3010μs | 5.7096μs | 175.1436 KOps/s | 174.6097 KOps/s | |
test_stacked_getleaf | 31.1800μs | 6.0260μs | 165.9472 KOps/s | 164.0427 KOps/s | |
test_stacked_get | 25.4000μs | 5.6956μs | 175.5754 KOps/s | 173.5459 KOps/s | |
test_nested_getitemleaf | 39.5210μs | 6.1103μs | 163.6577 KOps/s | 163.6639 KOps/s | |
test_nested_getitem | 40.5010μs | 5.7961μs | 172.5294 KOps/s | 173.1937 KOps/s | |
test_stacked_getitemleaf | 58.7410μs | 6.0716μs | 164.7022 KOps/s | 163.5617 KOps/s | |
test_stacked_getitem | 38.4510μs | 5.7670μs | 173.4012 KOps/s | 172.9840 KOps/s | |
test_lock_nested | 4.3128ms | 0.4295ms | 2.3281 KOps/s | 2.3793 KOps/s | |
test_lock_stack_nested | 0.4512ms | 0.3888ms | 2.5718 KOps/s | 2.6162 KOps/s | |
test_unlock_nested | 0.7413ms | 0.3656ms | 2.7349 KOps/s | 2.7649 KOps/s | |
test_unlock_stack_nested | 0.3729ms | 0.3284ms | 3.0448 KOps/s | 3.1088 KOps/s | |
test_flatten_speed | 99.1420μs | 72.7470μs | 13.7463 KOps/s | 13.7620 KOps/s | |
test_unflatten_speed | 0.6743ms | 0.2928ms | 3.4158 KOps/s | 3.4014 KOps/s | |
test_common_ops | 1.6578ms | 1.2647ms | 790.6987 Ops/s | 775.2771 Ops/s | |
test_creation | 18.7110μs | 1.4784μs | 676.4156 KOps/s | 660.4119 KOps/s | |
test_creation_empty | 0.4003ms | 15.4165μs | 64.8655 KOps/s | 56.3439 KOps/s | |
test_creation_nested_1 | 0.9524ms | 17.2214μs | 58.0674 KOps/s | 51.1674 KOps/s | |
test_creation_nested_2 | 0.4106ms | 20.1534μs | 49.6193 KOps/s | 44.6737 KOps/s | |
test_clone | 61.4310μs | 30.4601μs | 32.8298 KOps/s | 33.3617 KOps/s | |
test_getitem[int] | 1.0550ms | 17.0979μs | 58.4866 KOps/s | 61.0305 KOps/s | |
test_getitem[slice_int] | 0.4231ms | 30.2267μs | 33.0834 KOps/s | 34.3000 KOps/s | |
test_getitem[range] | 0.2588ms | 0.1189ms | 8.4132 KOps/s | 8.6358 KOps/s | |
test_getitem[tuple] | 0.1364ms | 26.2093μs | 38.1544 KOps/s | 39.2995 KOps/s | |
test_getitem[list] | 91.6393ms | 0.1194ms | 8.3764 KOps/s | 9.6231 KOps/s | |
test_setitem_dim[int] | 70.8110μs | 47.4560μs | 21.0722 KOps/s | 21.7845 KOps/s | |
test_setitem_dim[slice_int] | 0.1012ms | 70.4191μs | 14.2007 KOps/s | 14.5792 KOps/s | |
test_setitem_dim[range] | 0.1585ms | 0.1313ms | 7.6141 KOps/s | 7.6500 KOps/s | |
test_setitem_dim[tuple] | 0.4623ms | 64.5967μs | 15.4807 KOps/s | 15.9508 KOps/s | |
test_setitem | 83.3010μs | 42.8247μs | 23.3510 KOps/s | 23.0154 KOps/s | |
test_set | 0.4367ms | 42.4817μs | 23.5396 KOps/s | 23.3629 KOps/s | |
test_set_shared | 0.3550ms | 51.8719μs | 19.2783 KOps/s | 19.3823 KOps/s | |
test_update | 0.4438ms | 50.5028μs | 19.8009 KOps/s | 18.6748 KOps/s | |
test_update_nested | 88.3920μs | 57.5523μs | 17.3755 KOps/s | 16.7754 KOps/s | |
test_update__nested | 0.1930ms | 61.5783μs | 16.2395 KOps/s | 15.2379 KOps/s | |
test_set_nested | 0.4427ms | 44.8055μs | 22.3187 KOps/s | 22.1174 KOps/s | |
test_set_nested_new | 88.2720μs | 48.3842μs | 20.6679 KOps/s | 20.6629 KOps/s | |
test_select | 0.4533ms | 61.8733μs | 16.1621 KOps/s | 16.1151 KOps/s | |
test_select_nested | 72.3820μs | 42.2160μs | 23.6877 KOps/s | 23.6696 KOps/s | |
test_exclude_nested | 0.4391ms | 59.8375μs | 16.7119 KOps/s | 16.8766 KOps/s | |
test_empty[True] | 0.6399ms | 0.2585ms | 3.8680 KOps/s | 3.8822 KOps/s | |
test_empty[False] | 3.7490μs | 0.7347μs | 1.3610 MOps/s | 1.3285 MOps/s | |
test_to | 0.4421ms | 51.9114μs | 19.2636 KOps/s | 38.2793 KOps/s | |
test_to_nonblocking | 93.8920μs | 51.5598μs | 19.3949 KOps/s | 39.7727 KOps/s | |
test_unbind_speed | 0.6693ms | 0.2827ms | 3.5372 KOps/s | 3.5620 KOps/s | |
test_unbind_speed_stack0 | 0.3283ms | 0.2851ms | 3.5071 KOps/s | 3.6151 KOps/s | |
test_unbind_speed_stack1 | 90.7231ms | 0.7165ms | 1.3957 KOps/s | 1.4334 KOps/s | |
test_split | 92.4373ms | 2.2239ms | 449.6539 Ops/s | 454.5879 Ops/s | |
test_chunk | 94.4260ms | 2.2303ms | 448.3664 Ops/s | 454.8678 Ops/s | |
test_to[False] | 6.8019ms | 6.2585ms | 159.7818 Ops/s | 284.6573 Ops/s | |
test_to[True] | 4.8540ms | 4.4440ms | 225.0211 Ops/s | 230.6385 Ops/s | |
test_to_njt[False] | 0.3492s | 0.2731s | 3.6610 Ops/s | 3.9724 Ops/s | |
test_to_njt[True] | 0.2631s | 0.2620s | 3.8163 Ops/s | 3.8603 Ops/s | |
test_creation[device0] | 0.3013ms | 0.1306ms | 7.6540 KOps/s | 7.6366 KOps/s | |
test_creation_from_tensor | 0.3791ms | 0.1381ms | 7.2407 KOps/s | 7.3909 KOps/s | |
test_add_one[memmap_tensor0] | 0.1589ms | 9.3862μs | 106.5393 KOps/s | 106.6438 KOps/s | |
test_contiguous[memmap_tensor0] | 30.4110μs | 2.2371μs | 447.0158 KOps/s | 448.3855 KOps/s | |
test_stack[memmap_tensor0] | 36.2500μs | 7.1697μs | 139.4758 KOps/s | 144.1815 KOps/s | |
test_memmaptd_index | 0.9921ms | 0.4387ms | 2.2793 KOps/s | 2.3957 KOps/s | |
test_memmaptd_index_astensor | 1.0761ms | 0.5014ms | 1.9946 KOps/s | 2.0774 KOps/s | |
test_memmaptd_index_op | 1.4253ms | 1.0559ms | 947.0431 Ops/s | 945.3433 Ops/s | |
test_serialize_model | 0.1314s | 0.1302s | 7.6797 Ops/s | 7.5902 Ops/s | |
test_serialize_model_pickle | 1.3486s | 1.1888s | 0.8412 Ops/s | 0.8386 Ops/s | |
test_serialize_weights | 0.1315s | 0.1298s | 7.7045 Ops/s | 7.6603 Ops/s | |
test_serialize_weights_returnearly | 0.1940s | 55.1817ms | 18.1219 Ops/s | 17.7589 Ops/s | |
test_serialize_weights_pickle | 1.3478s | 1.2193s | 0.8202 Ops/s | 0.8384 Ops/s | |
test_reshape_pytree | 82.8410μs | 35.3908μs | 28.2560 KOps/s | 28.2813 KOps/s | |
test_reshape_td | 69.2710μs | 41.8384μs | 23.9015 KOps/s | 24.2577 KOps/s | |
test_view_pytree | 81.5210μs | 36.2567μs | 27.5811 KOps/s | 28.6009 KOps/s | |
test_view_td | 88.5620μs | 46.8492μs | 21.3451 KOps/s | 20.6715 KOps/s | |
test_unbind_pytree | 76.2210μs | 35.1734μs | 28.4306 KOps/s | 28.5145 KOps/s | |
test_unbind_td | 0.4890ms | 43.0364μs | 23.2361 KOps/s | 23.0495 KOps/s | |
test_split_pytree | 0.1434ms | 47.6406μs | 20.9905 KOps/s | 17.9256 KOps/s | |
test_split_td | 0.7219ms | 58.5329μs | 17.0844 KOps/s | 17.4198 KOps/s | |
test_add_pytree | 0.1064ms | 59.5753μs | 16.7855 KOps/s | 16.7662 KOps/s | |
test_add_td | 0.1547ms | 0.1003ms | 9.9663 KOps/s | 9.9304 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2281ms | 0.1672ms | 5.9801 KOps/s | 5.9416 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2738ms | 0.1531ms | 6.5323 KOps/s | 6.5228 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2131ms | 0.1601ms | 6.2443 KOps/s | 6.3119 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2404ms | 0.1891ms | 5.2895 KOps/s | 5.2633 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 80.6420μs | 22.3677μs | 44.7073 KOps/s | 46.0843 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1037ms | 45.5840μs | 21.9375 KOps/s | 21.9365 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2271ms | 65.9091μs | 15.1724 KOps/s | 15.2722 KOps/s | |
test_compile_copy_nested[pytree-eager] | 82.8520μs | 50.5058μs | 19.7997 KOps/s | 20.2735 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3923ms | 0.3224ms | 3.1015 KOps/s | 3.1152 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3527ms | 0.2175ms | 4.5973 KOps/s | 4.6334 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1766ms | 0.1338ms | 7.4722 KOps/s | 7.4626 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1497ms | 60.2269μs | 16.6039 KOps/s | 16.2520 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4055ms | 0.3317ms | 3.0143 KOps/s | 3.0122 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8575ms | 0.6434ms | 1.5542 KOps/s | 1.5639 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3847ms | 0.2579ms | 3.8771 KOps/s | 3.8565 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4376ms | 0.3235ms | 3.0912 KOps/s | 3.0909 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1530ms | 70.5107μs | 14.1822 KOps/s | 13.8737 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1854ms | 0.1348ms | 7.4173 KOps/s | 7.5095 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6230ms | 0.5377ms | 1.8596 KOps/s | 1.8783 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4201ms | 0.3310ms | 3.0214 KOps/s | 3.0124 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.5310μs | 18.9593μs | 52.7445 KOps/s | 52.4418 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 61.4610μs | 28.5309μs | 35.0497 KOps/s | 34.7272 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1042ms | 68.8854μs | 14.5169 KOps/s | 14.3918 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.8720μs | 51.6480μs | 19.3618 KOps/s | 19.5006 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3586ms | 0.8240ms | 1.2135 KOps/s | 1.1129 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5859ms | 3.3594ms | 297.6763 Ops/s | 298.9936 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3520ms | 0.8260ms | 1.2107 KOps/s | 1.0994 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4855ms | 3.4111ms | 293.1596 Ops/s | 290.6509 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1881ms | 0.1253ms | 7.9829 KOps/s | 8.0833 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1829ms | 62.9408μs | 15.8879 KOps/s | 15.9774 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1754ms | 0.1194ms | 8.3721 KOps/s | 8.4761 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 90.7710μs | 43.6093μs | 22.9309 KOps/s | 22.2171 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1778ms | 0.1203ms | 8.3130 KOps/s | 8.4161 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 88.2020μs | 43.1700μs | 23.1642 KOps/s | 21.9169 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2276ms | 0.1523ms | 6.5665 KOps/s | 6.6019 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1536ms | 26.9553μs | 37.0985 KOps/s | 37.6464 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2140ms | 0.1463ms | 6.8331 KOps/s | 6.8985 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 80.7820μs | 20.9793μs | 47.6660 KOps/s | 48.1755 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1946ms | 0.1468ms | 6.8102 KOps/s | 6.8710 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 84.7420μs | 20.5641μs | 48.6284 KOps/s | 48.3358 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2260ms | 0.1528ms | 6.5439 KOps/s | 6.5554 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4499ms | 27.1810μs | 36.7904 KOps/s | 37.6622 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1994ms | 0.1472ms | 6.7927 KOps/s | 6.8266 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 54.6110μs | 20.8005μs | 48.0757 KOps/s | 47.8494 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1903ms | 0.1468ms | 6.8126 KOps/s | 6.8394 KOps/s | |
test_compile_indexing[int-pytree-eager] | 81.8220μs | 20.9714μs | 47.6840 KOps/s | 48.0167 KOps/s | |
test_mod_add[eager] | 92.4920μs | 32.7965μs | 30.4910 KOps/s | 28.0567 KOps/s | |
test_mod_add[compile] | 0.1914ms | 78.4166μs | 12.7524 KOps/s | 13.0245 KOps/s | |
test_mod_add[compile-overhead] | 0.3097ms | 0.1582ms | 6.3207 KOps/s | 5.8568 KOps/s | |
test_mod_wrap[eager] | 0.3277ms | 0.2518ms | 3.9714 KOps/s | 3.9294 KOps/s | |
test_mod_wrap[compile] | 0.3925ms | 0.2937ms | 3.4043 KOps/s | 3.4496 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7459ms | 4.0861ms | 244.7345 Ops/s | 244.0916 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5402ms | 1.3821ms | 723.5542 Ops/s | 680.0592 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5336ms | 1.2881ms | 776.3554 Ops/s | 717.1450 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3570ms | 0.9217ms | 1.0850 KOps/s | 973.4171 Ops/s | |
test_seq_add[eager] | 0.1558ms | 99.1935μs | 10.0813 KOps/s | 9.4922 KOps/s | |
test_seq_add[compile] | 0.1664ms | 93.9275μs | 10.6465 KOps/s | 11.5457 KOps/s | |
test_seq_add[compile-overhead] | 0.2020ms | 0.1305ms | 7.6636 KOps/s | 7.4060 KOps/s | |
test_seq_wrap[eager] | 0.4610ms | 0.3930ms | 2.5443 KOps/s | 2.4165 KOps/s | |
test_seq_wrap[compile] | 0.4440ms | 0.3040ms | 3.2895 KOps/s | 3.1008 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2815ms | 0.2336ms | 4.2800 KOps/s | 4.2830 KOps/s | |
test_func_call_runtime[False-eager] | 0.8288ms | 0.7647ms | 1.3077 KOps/s | 1.2581 KOps/s | |
test_func_call_runtime[False-compile] | 0.8461ms | 0.7646ms | 1.3080 KOps/s | 1.3014 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4121ms | 0.3685ms | 2.7138 KOps/s | 2.7285 KOps/s | |
test_func_call_runtime[True-eager] | 1.1603ms | 0.9267ms | 1.0791 KOps/s | 1.0848 KOps/s | |
test_func_call_runtime[True-compile] | 0.9025ms | 0.7925ms | 1.2618 KOps/s | 1.2517 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4895ms | 0.3885ms | 2.5741 KOps/s | 2.5936 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8114ms | 0.7556ms | 1.3234 KOps/s | 1.3259 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0402ms | 0.7663ms | 1.3049 KOps/s | 1.3072 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5096ms | 0.3723ms | 2.6862 KOps/s | 2.7251 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1041ms | 1.0222ms | 978.2481 Ops/s | 981.6874 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8706ms | 0.8186ms | 1.2216 KOps/s | 1.2190 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5044ms | 0.4224ms | 2.3675 KOps/s | 2.4135 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6114ms | 2.1093ms | 474.0819 Ops/s | 471.7941 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8858ms | 0.8296ms | 1.2054 KOps/s | 1.2041 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4829ms | 0.4182ms | 2.3913 KOps/s | 2.4084 KOps/s | |
test_distributed | 4.8203ms | 0.1989ms | 5.0271 KOps/s | 8.7187 KOps/s | |
test_tdmodule | 35.2110μs | 14.5144μs | 68.8971 KOps/s | 59.9470 KOps/s | |
test_tdmodule_dispatch | 64.0210μs | 28.8494μs | 34.6627 KOps/s | 30.8147 KOps/s | |
test_tdseq | 22.4600μs | 15.7599μs | 63.4520 KOps/s | 56.6837 KOps/s | |
test_tdseq_dispatch | 52.2510μs | 31.8053μs | 31.4413 KOps/s | 27.7387 KOps/s | |
test_instantiation_functorch | 2.1036ms | 1.8979ms | 526.9049 Ops/s | 521.9126 Ops/s | |
test_exec_functorch | 0.2588ms | 0.2137ms | 4.6787 KOps/s | 4.4608 KOps/s | |
test_exec_functional_call | 0.2956ms | 0.2152ms | 4.6469 KOps/s | 4.3682 KOps/s | |
test_exec_td_decorator | 0.4318ms | 0.2633ms | 3.7978 KOps/s | 3.6742 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7948ms | 0.6770ms | 1.4770 KOps/s | 1.4313 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8209ms | 0.6792ms | 1.4723 KOps/s | 1.4353 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7485ms | 0.6072ms | 1.6470 KOps/s | 1.6507 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9454ms | 0.6277ms | 1.5932 KOps/s | 1.6081 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.9896ms | 19.7891ms | 50.5329 Ops/s | 50.8032 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.6757ms | 19.6354ms | 50.9284 Ops/s | 50.5734 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5849ms | 19.4855ms | 51.3203 Ops/s | 51.0405 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.7442ms | 19.5589ms | 51.1276 Ops/s | 51.1945 Ops/s | |
test_to_module_speed[True] | 1.3518ms | 0.9390ms | 1.0650 KOps/s | 1.0735 KOps/s | |
test_to_module_speed[False] | 1.3097ms | 0.9109ms | 1.0978 KOps/s | 1.0888 KOps/s | |
test_tc_init | 72.0620μs | 36.0971μs | 27.7030 KOps/s | 27.3361 KOps/s | |
test_tc_init_nested | 0.1064ms | 73.1997μs | 13.6613 KOps/s | 13.8468 KOps/s | |
test_tc_first_layer_tensor | 6.3487μs | 0.6944μs | 1.4400 MOps/s | 1.4244 MOps/s | |
test_tc_first_layer_nontensor | 30.9400μs | 2.3250μs | 430.1156 KOps/s | 429.7754 KOps/s | |
test_tc_second_layer_tensor | 8.9450μs | 1.3981μs | 715.2579 KOps/s | 707.9661 KOps/s | |
test_tc_second_layer_nontensor | 32.5400μs | 3.0404μs | 328.9019 KOps/s | 326.3531 KOps/s | |
test_unbind | 0.1909s | 11.7910ms | 84.8104 Ops/s | 94.5314 Ops/s | |
test_full_like | 0.6585ms | 0.5733ms | 1.7444 KOps/s | 1.7446 KOps/s | |
test_zeros_like | 0.2785ms | 0.1979ms | 5.0529 KOps/s | 5.0527 KOps/s | |
test_ones_like | 0.2832ms | 0.1978ms | 5.0568 KOps/s | 5.0578 KOps/s | |
test_clone | 0.4530ms | 0.4147ms | 2.4113 KOps/s | 2.4116 KOps/s | |
test_squeeze | 44.5210μs | 10.8086μs | 92.5191 KOps/s | 106.1070 KOps/s | |
test_unsqueeze | 0.2191ms | 75.5202μs | 13.2415 KOps/s | 13.8833 KOps/s | |
test_split | 0.5040ms | 0.1742ms | 5.7391 KOps/s | 6.0187 KOps/s | |
test_permute | 0.2884ms | 0.1914ms | 5.2250 KOps/s | 5.4988 KOps/s | |
test_stack | 1.2493ms | 0.8588ms | 1.1644 KOps/s | 1.2848 KOps/s | |
test_cat | 1.2683ms | 1.2313ms | 812.1332 Ops/s | 811.9938 Ops/s |
vmoens added a commit that referenced this pull request on Nov 1, 2024
ghstack-source-id: eb25717dd0c9d4581b0ba19aff241e968f8face0 Pull Request resolved: #1066
vmoens added a commit that referenced this pull request on Nov 1, 2024
ghstack-source-id: 21cbc9f21287d3041c95d31fe2e5259b4ed36a42 Pull Request resolved: #1066
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement
New feature or request
Stack from ghstack (oldest at bottom):