[Feature] NonTensorData(*sequence_of_any) #1160

Merged: 2 commits, Jan 7, 2025

vmoens (Contributor) commented Jan 7, 2025

[ghstack-poisoned]
facebook-github-bot added the CLA Signed label Jan 7, 2025
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 97575533232f31a72147f9403cbf9dccfb814514
Pull Request resolved: #1160
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 537f3d87b0677a1ae4992ca581a585420a10a284
Pull Request resolved: #1160

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}26$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 68.6280μs 21.1340μs 47.3171 KOps/s 49.0188 KOps/s $\color{#d91a1a}-3.47\%$
test_plain_set_stack_nested 79.0970μs 21.4530μs 46.6136 KOps/s 48.3604 KOps/s $\color{#d91a1a}-3.61\%$
test_plain_set_nested_inplace 56.5460μs 23.3862μs 42.7602 KOps/s 45.1671 KOps/s $\textbf{\color{#d91a1a}-5.33\%}$
test_plain_set_stack_nested_inplace 56.0340μs 23.3733μs 42.7839 KOps/s 45.0935 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_items 31.2890μs 4.1850μs 238.9476 KOps/s 229.8417 KOps/s $\color{#35bf28}+3.96\%$
test_items_nested 0.5324ms 0.4077ms 2.4526 KOps/s 2.4648 KOps/s $\color{#d91a1a}-0.49\%$
test_items_nested_locked 0.4650ms 0.4065ms 2.4598 KOps/s 2.4589 KOps/s $\color{#35bf28}+0.03\%$
test_items_nested_leaf 0.1652ms 77.1105μs 12.9684 KOps/s 12.8772 KOps/s $\color{#35bf28}+0.71\%$
test_items_stack_nested 0.4890ms 0.4081ms 2.4505 KOps/s 2.4330 KOps/s $\color{#35bf28}+0.72\%$
test_items_stack_nested_leaf 0.1205ms 79.0836μs 12.6448 KOps/s 12.5265 KOps/s $\color{#35bf28}+0.94\%$
test_items_stack_nested_locked 0.8533ms 0.4111ms 2.4324 KOps/s 2.4237 KOps/s $\color{#35bf28}+0.36\%$
test_keys 31.9990μs 3.4759μs 287.6959 KOps/s 280.8588 KOps/s $\color{#35bf28}+2.43\%$
test_keys_nested 0.2940ms 0.1672ms 5.9812 KOps/s 5.9464 KOps/s $\color{#35bf28}+0.59\%$
test_keys_nested_locked 0.7986ms 0.1732ms 5.7747 KOps/s 5.6704 KOps/s $\color{#35bf28}+1.84\%$
test_keys_nested_leaf 0.2969ms 0.1462ms 6.8399 KOps/s 6.7757 KOps/s $\color{#35bf28}+0.95\%$
test_keys_stack_nested 0.2362ms 0.1642ms 6.0915 KOps/s 5.9336 KOps/s $\color{#35bf28}+2.66\%$
test_keys_stack_nested_leaf 0.2068ms 0.1424ms 7.0235 KOps/s 6.8031 KOps/s $\color{#35bf28}+3.24\%$
test_keys_stack_nested_locked 0.2189ms 0.1707ms 5.8588 KOps/s 5.7399 KOps/s $\color{#35bf28}+2.07\%$
test_values 5.9792μs 1.0544μs 948.4265 KOps/s 953.7975 KOps/s $\color{#d91a1a}-0.56\%$
test_values_nested 0.1012ms 62.3722μs 16.0328 KOps/s 15.8551 KOps/s $\color{#35bf28}+1.12\%$
test_values_nested_locked 0.1625ms 64.6945μs 15.4573 KOps/s 16.0161 KOps/s $\color{#d91a1a}-3.49\%$
test_values_nested_leaf 0.1242ms 71.9122μs 13.9058 KOps/s 12.8833 KOps/s $\textbf{\color{#35bf28}+7.94\%}$
test_values_stack_nested 0.1381ms 64.4783μs 15.5091 KOps/s 15.6818 KOps/s $\color{#d91a1a}-1.10\%$
test_values_stack_nested_leaf 0.1332ms 72.5449μs 13.7846 KOps/s 13.7018 KOps/s $\color{#35bf28}+0.60\%$
test_values_stack_nested_locked 0.1261ms 63.3251μs 15.7915 KOps/s 15.4618 KOps/s $\color{#35bf28}+2.13\%$
test_membership 19.6470μs 0.8627μs 1.1592 MOps/s 1.0964 MOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_membership_nested 23.6440μs 2.9785μs 335.7424 KOps/s 336.8660 KOps/s $\color{#d91a1a}-0.33\%$
test_membership_nested_leaf 20.6990μs 3.0054μs 332.7360 KOps/s 334.9208 KOps/s $\color{#d91a1a}-0.65\%$
test_membership_stacked_nested 23.9350μs 2.9744μs 336.2002 KOps/s 331.1960 KOps/s $\color{#35bf28}+1.51\%$
test_membership_stacked_nested_leaf 28.6530μs 2.9655μs 337.2147 KOps/s 318.9688 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_membership_nested_last 38.8320μs 4.4399μs 225.2297 KOps/s 220.2946 KOps/s $\color{#35bf28}+2.24\%$
test_membership_nested_leaf_last 30.9270μs 4.5110μs 221.6811 KOps/s 223.3789 KOps/s $\color{#d91a1a}-0.76\%$
test_membership_stacked_nested_last 21.9010μs 5.8012μs 172.3779 KOps/s 224.7782 KOps/s $\textbf{\color{#d91a1a}-23.31\%}$
test_membership_stacked_nested_leaf_last 34.9850μs 5.8003μs 172.4040 KOps/s 224.1399 KOps/s $\textbf{\color{#d91a1a}-23.08\%}$
test_nested_getleaf 50.6970μs 10.8871μs 91.8521 KOps/s 93.7093 KOps/s $\color{#d91a1a}-1.98\%$
test_nested_get 34.9750μs 10.2360μs 97.6941 KOps/s 99.1216 KOps/s $\color{#d91a1a}-1.44\%$
test_stacked_getleaf 42.3690μs 10.8384μs 92.2646 KOps/s 94.0887 KOps/s $\color{#d91a1a}-1.94\%$
test_stacked_get 35.7670μs 10.3823μs 96.3181 KOps/s 98.3205 KOps/s $\color{#d91a1a}-2.04\%$
test_nested_getitemleaf 37.1090μs 11.3873μs 87.8169 KOps/s 89.3123 KOps/s $\color{#d91a1a}-1.67\%$
test_nested_getitem 58.8270μs 10.3693μs 96.4385 KOps/s 94.4946 KOps/s $\color{#35bf28}+2.06\%$
test_stacked_getitemleaf 81.5120μs 10.9147μs 91.6196 KOps/s 86.6340 KOps/s $\textbf{\color{#35bf28}+5.75\%}$
test_stacked_getitem 28.0420μs 10.4335μs 95.8456 KOps/s 96.0666 KOps/s $\color{#d91a1a}-0.23\%$
test_lock_nested 6.9320ms 0.4689ms 2.1327 KOps/s 2.1630 KOps/s $\color{#d91a1a}-1.40\%$
test_lock_stack_nested 0.4889ms 0.4307ms 2.3219 KOps/s 2.3023 KOps/s $\color{#35bf28}+0.85\%$
test_unlock_nested 0.7308ms 0.3795ms 2.6348 KOps/s 2.6075 KOps/s $\color{#35bf28}+1.05\%$
test_unlock_stack_nested 0.6524ms 0.3505ms 2.8528 KOps/s 2.8390 KOps/s $\color{#35bf28}+0.49\%$
test_flatten_speed 0.1675ms 0.1013ms 9.8737 KOps/s 9.8483 KOps/s $\color{#35bf28}+0.26\%$
test_unflatten_speed 0.6994ms 0.5335ms 1.8745 KOps/s 1.8852 KOps/s $\color{#d91a1a}-0.57\%$
test_common_ops 4.8774ms 0.8229ms 1.2152 KOps/s 1.2980 KOps/s $\textbf{\color{#d91a1a}-6.39\%}$
test_creation 20.8090μs 2.5434μs 393.1787 KOps/s 394.1561 KOps/s $\color{#d91a1a}-0.25\%$
test_creation_empty 64.6030μs 12.5705μs 79.5513 KOps/s 93.8527 KOps/s $\textbf{\color{#d91a1a}-15.24\%}$
test_creation_nested_1 40.5760μs 16.1636μs 61.8673 KOps/s 72.0278 KOps/s $\textbf{\color{#d91a1a}-14.11\%}$
test_creation_nested_2 42.6600μs 20.4137μs 48.9866 KOps/s 54.7966 KOps/s $\textbf{\color{#d91a1a}-10.60\%}$
test_clone 57.4470μs 13.5526μs 73.7865 KOps/s 72.7400 KOps/s $\color{#35bf28}+1.44\%$
test_getitem[int] 1.3334ms 12.9852μs 77.0106 KOps/s 77.0767 KOps/s $\color{#d91a1a}-0.09\%$
test_getitem[slice_int] 0.1450ms 25.3428μs 39.4589 KOps/s 40.2339 KOps/s $\color{#d91a1a}-1.93\%$
test_getitem[range] 0.1829ms 46.5887μs 21.4644 KOps/s 20.8038 KOps/s $\color{#35bf28}+3.18\%$
test_getitem[tuple] 0.1373ms 20.7847μs 48.1123 KOps/s 49.3096 KOps/s $\color{#d91a1a}-2.43\%$
test_getitem[list] 0.1864ms 42.4598μs 23.5517 KOps/s 22.8656 KOps/s $\color{#35bf28}+3.00\%$
test_setitem_dim[int] 54.0810μs 25.3404μs 39.4627 KOps/s 39.9398 KOps/s $\color{#d91a1a}-1.19\%$
test_setitem_dim[slice_int] 91.7610μs 50.4189μs 19.8338 KOps/s 19.8371 KOps/s $\color{#d91a1a}-0.02\%$
test_setitem_dim[range] 0.1569ms 73.1572μs 13.6692 KOps/s 13.8999 KOps/s $\color{#d91a1a}-1.66\%$
test_setitem_dim[tuple] 0.1043ms 40.9195μs 24.4382 KOps/s 24.8584 KOps/s $\color{#d91a1a}-1.69\%$
test_setitem 0.1322ms 21.8516μs 45.7632 KOps/s 50.5423 KOps/s $\textbf{\color{#d91a1a}-9.46\%}$
test_set 0.1354ms 20.9753μs 47.6752 KOps/s 51.5369 KOps/s $\textbf{\color{#d91a1a}-7.49\%}$
test_set_shared 3.3823ms 0.1746ms 5.7258 KOps/s 5.9207 KOps/s $\color{#d91a1a}-3.29\%$
test_update 0.1835ms 25.2250μs 39.6433 KOps/s 45.4085 KOps/s $\textbf{\color{#d91a1a}-12.70\%}$
test_update_nested 0.1490ms 34.9834μs 28.5850 KOps/s 30.2902 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_update__nested 0.1566ms 34.1063μs 29.3201 KOps/s 28.8148 KOps/s $\color{#35bf28}+1.75\%$
test_set_nested 83.1950μs 23.4938μs 42.5643 KOps/s 45.8934 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_set_nested_new 0.1178ms 27.7943μs 35.9786 KOps/s 37.1681 KOps/s $\color{#d91a1a}-3.20\%$
test_select 0.2144ms 46.1740μs 21.6572 KOps/s 23.0813 KOps/s $\textbf{\color{#d91a1a}-6.17\%}$
test_select_nested 0.1440ms 63.4092μs 15.7706 KOps/s 15.4995 KOps/s $\color{#35bf28}+1.75\%$
test_exclude_nested 0.1702ms 82.6551μs 12.0985 KOps/s 11.9113 KOps/s $\color{#35bf28}+1.57\%$
test_empty[True] 0.8837ms 0.4137ms 2.4174 KOps/s 2.3833 KOps/s $\color{#35bf28}+1.43\%$
test_empty[False] 9.5527μs 1.4122μs 708.1309 KOps/s 708.5565 KOps/s $\color{#d91a1a}-0.06\%$
test_unbind_speed 0.4154ms 0.2729ms 3.6642 KOps/s 3.5924 KOps/s $\color{#35bf28}+2.00\%$
test_unbind_speed_stack0 0.3839ms 0.2716ms 3.6814 KOps/s 3.7106 KOps/s $\color{#d91a1a}-0.79\%$
test_unbind_speed_stack1 0.1005s 0.7998ms 1.2503 KOps/s 1.3586 KOps/s $\textbf{\color{#d91a1a}-7.97\%}$
test_split 98.2846ms 1.7530ms 570.4447 Ops/s 551.9107 Ops/s $\color{#35bf28}+3.36\%$
test_chunk 0.1110s 1.7860ms 559.9031 Ops/s 556.7153 Ops/s $\color{#35bf28}+0.57\%$
test_consolidate_njt[False-None] 10.5405ms 8.3637ms 119.5649 Ops/s 123.5189 Ops/s $\color{#d91a1a}-3.20\%$
test_creation[device0] 0.2425ms 91.7683μs 10.8970 KOps/s 10.8233 KOps/s $\color{#35bf28}+0.68\%$
test_creation_from_tensor 3.3504ms 95.5036μs 10.4708 KOps/s 10.3382 KOps/s $\color{#35bf28}+1.28\%$
test_add_one[memmap_tensor0] 0.1563ms 5.0474μs 198.1207 KOps/s 216.4794 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_contiguous[memmap_tensor0] 12.4830μs 0.4993μs 2.0029 MOps/s 1.9159 MOps/s $\color{#35bf28}+4.54\%$
test_stack[memmap_tensor0] 29.4660μs 3.4089μs 293.3472 KOps/s 288.9249 KOps/s $\color{#35bf28}+1.53\%$
test_memmaptd_index 1.0087ms 0.2435ms 4.1072 KOps/s 4.2112 KOps/s $\color{#d91a1a}-2.47\%$
test_memmaptd_index_astensor 0.7140ms 0.3328ms 3.0045 KOps/s 3.0781 KOps/s $\color{#d91a1a}-2.39\%$
test_memmaptd_index_op 1.0143ms 0.6236ms 1.6036 KOps/s 1.7424 KOps/s $\textbf{\color{#d91a1a}-7.96\%}$
test_serialize_model 0.1244s 0.1157s 8.6410 Ops/s 8.4617 Ops/s $\color{#35bf28}+2.12\%$
test_serialize_model_pickle 0.5111s 0.3956s 2.5276 Ops/s 2.4897 Ops/s $\color{#35bf28}+1.52\%$
test_serialize_weights 0.1240s 0.1171s 8.5384 Ops/s 8.8516 Ops/s $\color{#d91a1a}-3.54\%$
test_serialize_weights_returnearly 0.1883s 0.1627s 6.1466 Ops/s 6.2988 Ops/s $\color{#d91a1a}-2.42\%$
test_serialize_weights_pickle 0.5437s 0.4130s 2.4213 Ops/s 2.4505 Ops/s $\color{#d91a1a}-1.19\%$
test_serialize_weights_filesystem 0.1560s 0.1440s 6.9441 Ops/s 6.4323 Ops/s $\textbf{\color{#35bf28}+7.96\%}$
test_serialize_model_filesystem 0.1535s 0.1485s 6.7342 Ops/s 6.6944 Ops/s $\color{#35bf28}+0.60\%$
test_reshape_pytree 77.0630μs 26.1110μs 38.2980 KOps/s 36.7753 KOps/s $\color{#35bf28}+4.14\%$
test_reshape_td 80.7800μs 33.8865μs 29.5103 KOps/s 29.2985 KOps/s $\color{#35bf28}+0.72\%$
test_view_pytree 70.6510μs 26.3056μs 38.0148 KOps/s 36.9631 KOps/s $\color{#35bf28}+2.85\%$
test_view_td 95.0170μs 39.7095μs 25.1829 KOps/s 25.3174 KOps/s $\color{#d91a1a}-0.53\%$
test_unbind_pytree 65.2020μs 29.9056μs 33.4385 KOps/s 33.5401 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_td 0.3377ms 40.6444μs 24.6036 KOps/s 24.7819 KOps/s $\color{#d91a1a}-0.72\%$
test_split_pytree 72.1640μs 29.2516μs 34.1861 KOps/s 33.1924 KOps/s $\color{#35bf28}+2.99\%$
test_split_td 0.5091ms 45.4837μs 21.9859 KOps/s 21.5784 KOps/s $\color{#35bf28}+1.89\%$
test_add_pytree 0.1047ms 35.6226μs 28.0721 KOps/s 27.8776 KOps/s $\color{#35bf28}+0.70\%$
test_add_td 0.1251ms 63.5144μs 15.7445 KOps/s 16.5901 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_compile_add_one_nested[tensordict-compile] 0.1499ms 62.7755μs 15.9298 KOps/s 16.4208 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_add_one_nested[tensordict-eager] 1.5111ms 0.1736ms 5.7591 KOps/s 5.8103 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_nested[pytree-compile] 0.1013ms 45.7069μs 21.8785 KOps/s 22.3717 KOps/s $\color{#d91a1a}-2.20\%$
test_compile_add_one_nested[pytree-eager] 0.2262ms 0.1186ms 8.4310 KOps/s 8.3987 KOps/s $\color{#35bf28}+0.38\%$
test_compile_copy_nested[tensordict-compile] 76.7030μs 26.5956μs 37.6002 KOps/s 38.3492 KOps/s $\color{#d91a1a}-1.95\%$
test_compile_copy_nested[tensordict-eager] 0.1168ms 59.2376μs 16.8812 KOps/s 16.8519 KOps/s $\color{#35bf28}+0.17\%$
test_compile_copy_nested[pytree-compile] 0.1503ms 77.8510μs 12.8451 KOps/s 12.4902 KOps/s $\color{#35bf28}+2.84\%$
test_compile_copy_nested[pytree-eager] 0.1251ms 67.2063μs 14.8796 KOps/s 14.6584 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_flat[tensordict-compile] 0.1892ms 0.1039ms 9.6282 KOps/s 9.8051 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_add_one_flat[tensordict-eager] 0.3876ms 0.2155ms 4.6395 KOps/s 4.6481 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_add_one_flat[tensorclass-compile] 0.1188ms 44.7989μs 22.3220 KOps/s 22.9017 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_add_one_flat[tensorclass-eager] 0.4690ms 63.6520μs 15.7104 KOps/s 15.7129 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_add_one_flat[pytree-compile] 0.1958ms 0.1016ms 9.8415 KOps/s 9.8517 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_add_one_flat[pytree-eager] 0.4872ms 0.2017ms 4.9585 KOps/s 4.9993 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_add_self_flat[tensordict-eager] 0.4289ms 0.2304ms 4.3398 KOps/s 4.2686 KOps/s $\color{#35bf28}+1.67\%$
test_compile_add_self_flat[tensordict-compile] 0.2038ms 0.1045ms 9.5695 KOps/s 9.7337 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_add_self_flat[tensorclass-eager] 0.1544ms 59.8145μs 16.7184 KOps/s 17.0326 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_add_self_flat[tensorclass-compile] 0.4861ms 44.9431μs 22.2503 KOps/s 22.7171 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_self_flat[pytree-eager] 0.5763ms 0.1577ms 6.3398 KOps/s 6.1418 KOps/s $\color{#35bf28}+3.22\%$
test_compile_add_self_flat[pytree-compile] 0.2141ms 0.1021ms 9.7982 KOps/s 9.6539 KOps/s $\color{#35bf28}+1.49\%$
test_compile_copy_flat[tensordict-compile] 74.8200μs 22.2168μs 45.0111 KOps/s 48.3748 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_compile_copy_flat[tensordict-eager] 0.1369ms 67.4535μs 14.8250 KOps/s 14.9214 KOps/s $\color{#d91a1a}-0.65\%$
test_compile_copy_flat[pytree-compile] 0.1531ms 80.4221μs 12.4344 KOps/s 12.0891 KOps/s $\color{#35bf28}+2.86\%$
test_compile_copy_flat[pytree-eager] 0.1286ms 68.1836μs 14.6663 KOps/s 14.2757 KOps/s $\color{#35bf28}+2.74\%$
test_compile_assign_and_add[tensordict-compile] 0.3647ms 0.2074ms 4.8211 KOps/s 4.9045 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_assign_and_add[tensordict-eager] 1.5543ms 1.3296ms 752.0839 Ops/s 741.7592 Ops/s $\color{#35bf28}+1.39\%$
test_compile_assign_and_add[pytree-compile] 0.3638ms 0.2027ms 4.9339 KOps/s 5.0757 KOps/s $\color{#d91a1a}-2.79\%$
test_compile_assign_and_add[pytree-eager] 1.0031ms 0.7780ms 1.2853 KOps/s 1.3012 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_assign_and_add_stack[compile] 0.5470ms 0.4546ms 2.1996 KOps/s 2.2426 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_assign_and_add_stack[eager] 3.7284ms 2.7967ms 357.5676 Ops/s 370.9456 Ops/s $\color{#d91a1a}-3.61\%$
test_compile_indexing[tensor-tensordict-compile] 0.1101ms 35.5634μs 28.1188 KOps/s 28.8526 KOps/s $\color{#d91a1a}-2.54\%$
test_compile_indexing[tensor-tensordict-eager] 0.5126ms 32.9530μs 30.3462 KOps/s 30.4283 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_indexing[tensor-tensorclass-compile] 99.6660μs 29.1886μs 34.2599 KOps/s 35.0980 KOps/s $\color{#d91a1a}-2.39\%$
test_compile_indexing[tensor-tensorclass-eager] 88.9160μs 22.7394μs 43.9765 KOps/s 43.3668 KOps/s $\color{#35bf28}+1.41\%$
test_compile_indexing[tensor-pytree-compile] 73.1970μs 29.3679μs 34.0507 KOps/s 34.2770 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_indexing[tensor-pytree-eager] 74.2490μs 22.6757μs 44.1002 KOps/s 43.3492 KOps/s $\color{#35bf28}+1.73\%$
test_compile_indexing[slice-tensordict-compile] 0.1554ms 50.6712μs 19.7351 KOps/s 19.8064 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_indexing[slice-tensordict-eager] 0.7338ms 20.4737μs 48.8432 KOps/s 48.2546 KOps/s $\color{#35bf28}+1.22\%$
test_compile_indexing[slice-tensorclass-compile] 98.1230μs 43.3024μs 23.0934 KOps/s 23.1618 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[slice-tensorclass-eager] 79.4680μs 18.9345μs 52.8136 KOps/s 51.6601 KOps/s $\color{#35bf28}+2.23\%$
test_compile_indexing[slice-pytree-compile] 0.1079ms 43.9935μs 22.7306 KOps/s 22.7700 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[slice-pytree-eager] 0.1002ms 18.6117μs 53.7296 KOps/s 51.5806 KOps/s $\color{#35bf28}+4.17\%$
test_compile_indexing[int-tensordict-compile] 0.1305ms 51.5744μs 19.3895 KOps/s 19.2467 KOps/s $\color{#35bf28}+0.74\%$
test_compile_indexing[int-tensordict-eager] 1.0134ms 20.0009μs 49.9977 KOps/s 48.4635 KOps/s $\color{#35bf28}+3.17\%$
test_compile_indexing[int-tensorclass-compile] 0.1436ms 44.0142μs 22.7200 KOps/s 22.9940 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_indexing[int-tensorclass-eager] 65.7430μs 18.6850μs 53.5188 KOps/s 51.9562 KOps/s $\color{#35bf28}+3.01\%$
test_compile_indexing[int-pytree-compile] 0.1115ms 44.3257μs 22.5603 KOps/s 22.5220 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[int-pytree-eager] 66.2540μs 18.6918μs 53.4993 KOps/s 52.0195 KOps/s $\color{#35bf28}+2.84\%$
test_mod_add[eager] 93.9250μs 36.4117μs 27.4637 KOps/s 29.1261 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_mod_add[compile] 0.1207ms 46.4070μs 21.5485 KOps/s 20.7893 KOps/s $\color{#35bf28}+3.65\%$
test_mod_add[compile-overhead] 0.2527ms 48.0259μs 20.8221 KOps/s 20.5109 KOps/s $\color{#35bf28}+1.52\%$
test_mod_wrap[eager] 0.3440ms 0.2218ms 4.5094 KOps/s 4.4056 KOps/s $\color{#35bf28}+2.35\%$
test_mod_wrap[compile] 0.3601ms 0.2035ms 4.9138 KOps/s 4.8672 KOps/s $\color{#35bf28}+0.96\%$
test_mod_wrap[compile-overhead] 0.3686ms 0.2019ms 4.9533 KOps/s 4.8053 KOps/s $\color{#35bf28}+3.08\%$
test_mod_wrap_and_backward[eager] 12.7225ms 11.1293ms 89.8525 Ops/s 85.7906 Ops/s $\color{#35bf28}+4.73\%$
test_mod_wrap_and_backward[compile] 12.6280ms 11.0244ms 90.7080 Ops/s 87.3060 Ops/s $\color{#35bf28}+3.90\%$
test_mod_wrap_and_backward[compile-overhead] 12.3143ms 11.1315ms 89.8349 Ops/s 77.6533 Ops/s $\textbf{\color{#35bf28}+15.69\%}$
test_seq_add[eager] 0.2800ms 0.1226ms 8.1547 KOps/s 8.6144 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_seq_add[compile] 0.1286ms 62.3551μs 16.0372 KOps/s 16.1851 KOps/s $\color{#d91a1a}-0.91\%$
test_seq_add[compile-overhead] 0.1409ms 60.0272μs 16.6591 KOps/s 16.6438 KOps/s $\color{#35bf28}+0.09\%$
test_seq_wrap[eager] 0.6941ms 0.4452ms 2.2462 KOps/s 2.2168 KOps/s $\color{#35bf28}+1.33\%$
test_seq_wrap[compile] 0.3247ms 0.2217ms 4.5101 KOps/s 4.4029 KOps/s $\color{#35bf28}+2.44\%$
test_seq_wrap[compile-overhead] 0.3910ms 0.2238ms 4.4681 KOps/s 4.3314 KOps/s $\color{#35bf28}+3.16\%$
test_func_call_runtime[False-eager] 0.7176ms 0.5416ms 1.8464 KOps/s 1.8280 KOps/s $\color{#35bf28}+1.01\%$
test_func_call_runtime[False-compile] 0.5103ms 0.4166ms 2.4006 KOps/s 2.3601 KOps/s $\color{#35bf28}+1.72\%$
test_func_call_runtime[False-compile-overhead] 0.9885ms 0.4154ms 2.4074 KOps/s 2.3728 KOps/s $\color{#35bf28}+1.46\%$
test_func_call_runtime[True-eager] 1.2952ms 0.7535ms 1.3272 KOps/s 1.2994 KOps/s $\color{#35bf28}+2.14\%$
test_func_call_runtime[True-compile] 0.5757ms 0.4522ms 2.2114 KOps/s 2.1535 KOps/s $\color{#35bf28}+2.69\%$
test_func_call_runtime[True-compile-overhead] 0.8783ms 0.4551ms 2.1974 KOps/s 2.1578 KOps/s $\color{#35bf28}+1.83\%$
test_func_call_cm_runtime[False-eager] 0.9184ms 0.5439ms 1.8387 KOps/s 1.8282 KOps/s $\color{#35bf28}+0.58\%$
test_func_call_cm_runtime[False-compile] 0.6516ms 0.4141ms 2.4148 KOps/s 2.3456 KOps/s $\color{#35bf28}+2.95\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5172ms 0.4129ms 2.4221 KOps/s 2.3612 KOps/s $\color{#35bf28}+2.58\%$
test_func_call_cm_runtime[True-eager] 1.4993ms 0.8999ms 1.1113 KOps/s 1.0807 KOps/s $\color{#35bf28}+2.83\%$
test_func_call_cm_runtime[True-compile] 0.7176ms 0.4805ms 2.0813 KOps/s 2.0537 KOps/s $\color{#35bf28}+1.34\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8638ms 0.4804ms 2.0815 KOps/s 2.0401 KOps/s $\color{#35bf28}+2.03\%$
test_vmap_func_call_cm_runtime[eager] 3.0887ms 1.9147ms 522.2677 Ops/s 521.4903 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_func_call_cm_runtime[compile] 0.6777ms 0.5135ms 1.9472 KOps/s 1.9115 KOps/s $\color{#35bf28}+1.87\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.1487ms 0.5144ms 1.9439 KOps/s 1.9322 KOps/s $\color{#35bf28}+0.61\%$
test_distributed 0.2734ms 0.1251ms 7.9946 KOps/s 7.8134 KOps/s $\color{#35bf28}+2.32\%$
test_tdmodule 0.1695ms 27.2426μs 36.7072 KOps/s 38.8975 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_tdmodule_dispatch 80.1390μs 49.1110μs 20.3620 KOps/s 21.3200 KOps/s $\color{#d91a1a}-4.49\%$
test_tdseq 47.7990μs 29.3763μs 34.0410 KOps/s 36.2610 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_tdseq_dispatch 91.2610μs 55.4449μs 18.0359 KOps/s 19.3757 KOps/s $\textbf{\color{#d91a1a}-6.91\%}$
test_instantiation_functorch 1.7771ms 1.5230ms 656.5894 Ops/s 653.2495 Ops/s $\color{#35bf28}+0.51\%$
test_exec_functorch 0.3419ms 0.1815ms 5.5105 KOps/s 5.5346 KOps/s $\color{#d91a1a}-0.44\%$
test_exec_functional_call 0.4365ms 0.1703ms 5.8716 KOps/s 5.7854 KOps/s $\color{#35bf28}+1.49\%$
test_exec_td_decorator 0.4600ms 0.2316ms 4.3187 KOps/s 4.2365 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed_decorator[True-True] 0.9305ms 0.6600ms 1.5151 KOps/s 1.5182 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed_decorator[True-False] 0.9845ms 0.6614ms 1.5120 KOps/s 1.5185 KOps/s $\color{#d91a1a}-0.43\%$
test_vmap_mlp_speed_decorator[False-True] 0.7210ms 0.5271ms 1.8971 KOps/s 1.8742 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed_decorator[False-False] 0.8563ms 0.5316ms 1.8812 KOps/s 1.8825 KOps/s $\color{#d91a1a}-0.07\%$
test_to_module_speed[True] 2.5100ms 1.3671ms 731.4660 Ops/s 714.9622 Ops/s $\color{#35bf28}+2.31\%$
test_to_module_speed[False] 2.0083ms 1.3286ms 752.6757 Ops/s 744.9655 Ops/s $\color{#35bf28}+1.03\%$
test_tc_init 88.6830μs 46.6785μs 21.4232 KOps/s 20.7869 KOps/s $\color{#35bf28}+3.06\%$
test_tc_init_nested 0.1934ms 94.7268μs 10.5567 KOps/s 10.2888 KOps/s $\color{#35bf28}+2.60\%$
test_tc_first_layer_tensor 46.9960μs 1.5365μs 650.8370 KOps/s 649.1385 KOps/s $\color{#35bf28}+0.26\%$
test_tc_first_layer_nontensor 44.4630μs 4.7172μs 211.9907 KOps/s 207.7409 KOps/s $\color{#35bf28}+2.05\%$
test_tc_second_layer_tensor 50.4330μs 2.8487μs 351.0336 KOps/s 343.1875 KOps/s $\color{#35bf28}+2.29\%$
test_tc_second_layer_nontensor 42.1980μs 6.0743μs 164.6280 KOps/s 164.9944 KOps/s $\color{#d91a1a}-0.22\%$
test_unbind 0.2285s 14.3304ms 69.7819 Ops/s 77.6004 Ops/s $\textbf{\color{#d91a1a}-10.08\%}$
test_full_like 8.8891ms 7.2024ms 138.8432 Ops/s 140.4832 Ops/s $\color{#d91a1a}-1.17\%$
test_zeros_like 3.8835ms 2.7633ms 361.8809 Ops/s 365.5348 Ops/s $\color{#d91a1a}-1.00\%$
test_ones_like 4.3613ms 3.3481ms 298.6777 Ops/s 315.1473 Ops/s $\textbf{\color{#d91a1a}-5.23\%}$
test_clone 5.7142ms 5.0569ms 197.7478 Ops/s 201.3130 Ops/s $\color{#d91a1a}-1.77\%$
test_squeeze 62.8760μs 12.7837μs 78.2249 KOps/s 79.1612 KOps/s $\color{#d91a1a}-1.18\%$
test_unsqueeze 0.1577ms 91.4005μs 10.9409 KOps/s 10.6624 KOps/s $\color{#35bf28}+2.61\%$
test_split 0.5029ms 0.1951ms 5.1255 KOps/s 5.1340 KOps/s $\color{#d91a1a}-0.17\%$
test_permute 0.3476ms 0.2096ms 4.7702 KOps/s 4.9308 KOps/s $\color{#d91a1a}-3.26\%$
test_stack 29.5472ms 25.1984ms 39.6851 Ops/s 40.0574 Ops/s $\color{#d91a1a}-0.93\%$
test_cat 29.4842ms 25.1652ms 39.7374 Ops/s 40.5052 Ops/s $\color{#d91a1a}-1.90\%$
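
The Change column above appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bolded entries seem to be the ones the summary counts as improved or worsened. The sketch below reproduces that arithmetic for three rows of the CPU table. The helper names `relative_change` and `classify` are hypothetical, and the 5% threshold is an assumption inferred from which rows are bolded, not something the benchmark bot states.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the baseline (repo HEAD).

    Both arguments must be expressed in the same unit (e.g. both in KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change as improved, worsened, or unchanged.

    The 5% default is an assumption inferred from the bolded rows in the
    table, not a documented setting of the benchmark workflow.
    """
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table, same unit per row.
rows = {
    "test_plain_set_nested": (47.3171, 49.0188),                  # KOps/s
    "test_membership_stacked_nested_last": (172.3779, 224.7782),  # KOps/s
    "test_mod_wrap_and_backward[compile-overhead]": (89.8349, 77.6533),  # Ops/s
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, a row only counts toward the Improved or Worsened totals once its change exceeds the threshold, which would explain why most of the small red and green entries are not reflected in the summary line.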

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}30$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 31.7710μs 11.7886μs 84.8279 KOps/s 77.4740 KOps/s $\textbf{\color{#35bf28}+9.49\%}$
test_plain_set_stack_nested 35.2510μs 12.0581μs 82.9316 KOps/s 75.8615 KOps/s $\textbf{\color{#35bf28}+9.32\%}$
test_plain_set_nested_inplace 37.5410μs 12.8692μs 77.7052 KOps/s 70.7535 KOps/s $\textbf{\color{#35bf28}+9.83\%}$
test_plain_set_stack_nested_inplace 48.3620μs 12.8672μs 77.7172 KOps/s 70.9076 KOps/s $\textbf{\color{#35bf28}+9.60\%}$
test_items 27.0210μs 2.8747μs 347.8678 KOps/s 341.1938 KOps/s $\color{#35bf28}+1.96\%$
test_items_nested 0.5021ms 0.3664ms 2.7296 KOps/s 2.7449 KOps/s $\color{#d91a1a}-0.56\%$
test_items_nested_locked 0.4570ms 0.3638ms 2.7484 KOps/s 2.7462 KOps/s $\color{#35bf28}+0.08\%$
test_items_nested_leaf 82.9440μs 58.6789μs 17.0419 KOps/s 17.1106 KOps/s $\color{#d91a1a}-0.40\%$
test_items_stack_nested 0.3898ms 0.3661ms 2.7318 KOps/s 2.7412 KOps/s $\color{#d91a1a}-0.34\%$
test_items_stack_nested_leaf 90.8340μs 59.5320μs 16.7977 KOps/s 17.2017 KOps/s $\color{#d91a1a}-2.35\%$
test_items_stack_nested_locked 0.3896ms 0.3634ms 2.7516 KOps/s 2.7531 KOps/s $\color{#d91a1a}-0.05\%$
test_keys 30.4920μs 3.4832μs 287.0884 KOps/s 288.7427 KOps/s $\color{#d91a1a}-0.57\%$
test_keys_nested 0.1178ms 81.7679μs 12.2297 KOps/s 12.3582 KOps/s $\color{#d91a1a}-1.04\%$
test_keys_nested_locked 0.7557ms 87.8903μs 11.3778 KOps/s 11.4711 KOps/s $\color{#d91a1a}-0.81\%$
test_keys_nested_leaf 0.1085ms 72.0446μs 13.8803 KOps/s 13.9420 KOps/s $\color{#d91a1a}-0.44\%$
test_keys_stack_nested 0.1259ms 82.0995μs 12.1803 KOps/s 12.3433 KOps/s $\color{#d91a1a}-1.32\%$
test_keys_stack_nested_leaf 0.1061ms 73.2788μs 13.6465 KOps/s 13.8940 KOps/s $\color{#d91a1a}-1.78\%$
test_keys_stack_nested_locked 0.1344ms 88.5728μs 11.2902 KOps/s 11.4845 KOps/s $\color{#d91a1a}-1.69\%$
test_values 8.1538μs 0.8544μs 1.1704 MOps/s 1.1739 MOps/s $\color{#d91a1a}-0.30\%$
test_values_nested 64.0530μs 34.4823μs 29.0004 KOps/s 29.0007 KOps/s $-0.00\%$
test_values_nested_locked 73.4530μs 36.0686μs 27.7249 KOps/s 27.6226 KOps/s $\color{#35bf28}+0.37\%$
test_values_nested_leaf 73.6130μs 38.8819μs 25.7189 KOps/s 25.4558 KOps/s $\color{#35bf28}+1.03\%$
test_values_stack_nested 68.7330μs 35.0180μs 28.5568 KOps/s 29.0285 KOps/s $\color{#d91a1a}-1.63\%$
test_values_stack_nested_leaf 64.3830μs 39.1591μs 25.5368 KOps/s 25.3263 KOps/s $\color{#35bf28}+0.83\%$
test_values_stack_nested_locked 66.5830μs 36.3167μs 27.5355 KOps/s 27.5235 KOps/s $\color{#35bf28}+0.04\%$
test_membership 1.7636μs 0.5113μs 1.9558 MOps/s 1.9746 MOps/s $\color{#d91a1a}-0.95\%$
test_membership_nested 17.1860μs 2.0133μs 496.6892 KOps/s 498.5110 KOps/s $\color{#d91a1a}-0.37\%$
test_membership_nested_leaf 19.3410μs 2.0268μs 493.3976 KOps/s 490.8595 KOps/s $\color{#35bf28}+0.52\%$
test_membership_stacked_nested 27.7420μs 2.0712μs 482.8169 KOps/s 478.9149 KOps/s $\color{#35bf28}+0.81\%$
test_membership_stacked_nested_leaf 39.2520μs 2.0788μs 481.0533 KOps/s 461.7857 KOps/s $\color{#35bf28}+4.17\%$
test_membership_nested_last 33.1020μs 3.1119μs 321.3482 KOps/s 325.2775 KOps/s $\color{#d91a1a}-1.21\%$
test_membership_nested_leaf_last 39.6220μs 3.1213μs 320.3801 KOps/s 315.6347 KOps/s $\color{#35bf28}+1.50\%$
test_membership_stacked_nested_last 25.9420μs 3.8734μs 258.1711 KOps/s 314.7711 KOps/s $\textbf{\color{#d91a1a}-17.98\%}$
test_membership_stacked_nested_leaf_last 38.8920μs 3.8396μs 260.4420 KOps/s 316.6546 KOps/s $\textbf{\color{#d91a1a}-17.75\%}$
test_nested_getleaf 56.2820μs 6.2263μs 160.6079 KOps/s 161.4637 KOps/s $\color{#d91a1a}-0.53\%$
test_nested_get 36.1120μs 5.9003μs 169.4843 KOps/s 169.8194 KOps/s $\color{#d91a1a}-0.20\%$
test_stacked_getleaf 30.5720μs 6.1939μs 161.4498 KOps/s 163.0279 KOps/s $\color{#d91a1a}-0.97\%$
test_stacked_get 41.3420μs 5.8548μs 170.8005 KOps/s 171.6912 KOps/s $\color{#d91a1a}-0.52\%$
test_nested_getitemleaf 28.8220μs 6.2893μs 159.0012 KOps/s 160.6485 KOps/s $\color{#d91a1a}-1.03\%$
test_nested_getitem 38.8320μs 6.0168μs 166.2023 KOps/s 165.9326 KOps/s $\color{#35bf28}+0.16\%$
test_stacked_getitemleaf 31.1110μs 6.2956μs 158.8411 KOps/s 159.8270 KOps/s $\color{#d91a1a}-0.62\%$
test_stacked_getitem 41.0120μs 5.9796μs 167.2355 KOps/s 167.8746 KOps/s $\color{#d91a1a}-0.38\%$
test_lock_nested 0.7500ms 0.3819ms 2.6188 KOps/s 2.6493 KOps/s $\color{#d91a1a}-1.15\%$
test_lock_stack_nested 0.4469ms 0.3521ms 2.8397 KOps/s 2.8719 KOps/s $\color{#d91a1a}-1.12\%$
test_unlock_nested 0.6277ms 0.3216ms 3.1091 KOps/s 3.1246 KOps/s $\color{#d91a1a}-0.50\%$
test_unlock_stack_nested 0.3519ms 0.2895ms 3.4538 KOps/s 3.4850 KOps/s $\color{#d91a1a}-0.90\%$
test_flatten_speed 0.1591ms 74.3220μs 13.4550 KOps/s 13.1786 KOps/s $\color{#35bf28}+2.10\%$
test_unflatten_speed 0.4006ms 0.3228ms 3.0976 KOps/s 3.1471 KOps/s $\color{#d91a1a}-1.57\%$
test_common_ops 1.4993ms 0.6110ms 1.6367 KOps/s 1.5580 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_creation 0.1004ms 1.7828μs 560.9136 KOps/s 568.8705 KOps/s $\color{#d91a1a}-1.40\%$
test_creation_empty 38.1320μs 7.4516μs 134.1998 KOps/s 102.9610 KOps/s $\textbf{\color{#35bf28}+30.34\%}$
test_creation_nested_1 47.6630μs 9.2018μs 108.6746 KOps/s 88.1544 KOps/s $\textbf{\color{#35bf28}+23.28\%}$
test_creation_nested_2 44.6020μs 11.9680μs 83.5561 KOps/s 70.9100 KOps/s $\textbf{\color{#35bf28}+17.83\%}$
test_clone 77.7940μs 11.4975μs 86.9752 KOps/s 94.9310 KOps/s $\textbf{\color{#d91a1a}-8.38\%}$
test_getitem[int] 1.4997ms 11.2394μs 88.9729 KOps/s 92.3789 KOps/s $\color{#d91a1a}-3.69\%$
test_getitem[slice_int] 0.1131ms 21.8580μs 45.7498 KOps/s 48.2051 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_getitem[range] 0.1302ms 39.0622μs 25.6002 KOps/s 26.2315 KOps/s $\color{#d91a1a}-2.41\%$
test_getitem[tuple] 0.1096ms 18.9917μs 52.6545 KOps/s 53.8047 KOps/s $\color{#d91a1a}-2.14\%$
test_getitem[list] 0.2073ms 34.7575μs 28.7707 KOps/s 29.3210 KOps/s $\color{#d91a1a}-1.88\%$
test_setitem_dim[int] 39.6920μs 19.9442μs 50.1399 KOps/s 51.0703 KOps/s $\color{#d91a1a}-1.82\%$
test_setitem_dim[slice_int] 62.3730μs 39.5586μs 25.2790 KOps/s 25.6311 KOps/s $\color{#d91a1a}-1.37\%$
test_setitem_dim[range] 83.9840μs 54.2221μs 18.4426 KOps/s 18.6988 KOps/s $\color{#d91a1a}-1.37\%$
test_setitem_dim[tuple] 74.3740μs 33.9679μs 29.4395 KOps/s 31.1166 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_setitem 91.3150μs 15.4529μs 64.7127 KOps/s 63.0062 KOps/s $\color{#35bf28}+2.71\%$
test_set 85.9540μs 14.7203μs 67.9336 KOps/s 64.6081 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_set_shared 1.5386ms 0.1537ms 6.5048 KOps/s 6.5236 KOps/s $\color{#d91a1a}-0.29\%$
test_update 0.3258ms 17.2613μs 57.9331 KOps/s 52.0636 KOps/s $\textbf{\color{#35bf28}+11.27\%}$
test_update_nested 99.8350μs 22.5582μs 44.3298 KOps/s 39.7149 KOps/s $\textbf{\color{#35bf28}+11.62\%}$
test_update__nested 1.0933ms 27.1096μs 36.8873 KOps/s 38.5410 KOps/s $\color{#d91a1a}-4.29\%$
test_set_nested 90.8740μs 16.5590μs 60.3900 KOps/s 59.2550 KOps/s $\color{#35bf28}+1.92\%$
test_set_nested_new 0.1031ms 18.4019μs 54.3421 KOps/s 51.2897 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_select 0.1177ms 30.7590μs 32.5108 KOps/s 31.8098 KOps/s $\color{#35bf28}+2.20\%$
test_select_nested 75.3240μs 43.5157μs 22.9802 KOps/s 22.9282 KOps/s $\color{#35bf28}+0.23\%$
test_exclude_nested 98.2850μs 62.7353μs 15.9400 KOps/s 15.6571 KOps/s $\color{#35bf28}+1.81\%$
test_empty[True] 0.3243ms 0.2895ms 3.4545 KOps/s 3.4674 KOps/s $\color{#d91a1a}-0.37\%$
test_empty[False] 3.7702μs 0.8207μs 1.2184 MOps/s 1.2151 MOps/s $\color{#35bf28}+0.27\%$
test_to 86.7450μs 57.1041μs 17.5119 KOps/s 17.6838 KOps/s $\color{#d91a1a}-0.97\%$
test_to_nonblocking 0.1975ms 49.0998μs 20.3667 KOps/s 20.3721 KOps/s $\color{#d91a1a}-0.03\%$
test_unbind_speed 1.5944ms 0.2420ms 4.1314 KOps/s 4.2686 KOps/s $\color{#d91a1a}-3.21\%$
test_unbind_speed_stack0 0.2893ms 0.2453ms 4.0767 KOps/s 4.2121 KOps/s $\color{#d91a1a}-3.21\%$
test_unbind_speed_stack1 92.8715ms 0.6791ms 1.4726 KOps/s 1.4657 KOps/s $\color{#35bf28}+0.47\%$
test_split 93.9551ms 1.7599ms 568.2249 Ops/s 631.7508 Ops/s $\textbf{\color{#d91a1a}-10.06\%}$
test_chunk 1.5958ms 1.4861ms 672.9023 Ops/s 579.3242 Ops/s $\textbf{\color{#35bf28}+16.15\%}$
test_consolidate[False-None] 97.7936ms 3.0082ms 332.4280 Ops/s 373.2395 Ops/s $\textbf{\color{#d91a1a}-10.93\%}$
test_consolidate[default-None] 1.8904ms 1.7194ms 581.6011 Ops/s 583.8379 Ops/s $\color{#d91a1a}-0.38\%$
test_consolidate[reduce-overhead-None] 1.8913ms 1.7551ms 569.7643 Ops/s 565.6528 Ops/s $\color{#35bf28}+0.73\%$
test_consolidate_njt[False-None] 6.9938ms 6.6924ms 149.4233 Ops/s 111.8498 Ops/s $\textbf{\color{#35bf28}+33.59\%}$
test_to[False-False-None] 1.8447ms 1.7683ms 565.5155 Ops/s 563.9923 Ops/s $\color{#35bf28}+0.27\%$
test_to[True-False-None] 1.6387ms 1.4075ms 710.4881 Ops/s 753.8072 Ops/s $\textbf{\color{#d91a1a}-5.75\%}$
test_to[within-False-None] 4.3345ms 4.2356ms 236.0928 Ops/s 239.5364 Ops/s $\color{#d91a1a}-1.44\%$
test_to[True-default-None] 6.1017ms 5.6414ms 177.2609 Ops/s 177.9046 Ops/s $\color{#d91a1a}-0.36\%$
test_to_njt[False-False-None] 7.5545ms 7.3812ms 135.4799 Ops/s 135.1494 Ops/s $\color{#35bf28}+0.24\%$
test_to_njt[True-False-None] 6.0272ms 5.7950ms 172.5632 Ops/s 172.8597 Ops/s $\color{#d91a1a}-0.17\%$
test_to_njt[within-False-None] 12.9977ms 12.4825ms 80.1120 Ops/s 78.9143 Ops/s $\color{#35bf28}+1.52\%$
test_creation[device0] 0.4562ms 85.1956μs 11.7377 KOps/s 12.2803 KOps/s $\color{#d91a1a}-4.42\%$
test_creation_from_tensor 0.4470ms 88.4552μs 11.3052 KOps/s 11.5249 KOps/s $\color{#d91a1a}-1.91\%$
test_add_one[memmap_tensor0] 0.4007ms 7.3275μs 136.4730 KOps/s 145.5067 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_contiguous[memmap_tensor0] 2.0356μs 0.4163μs 2.4019 MOps/s 2.3580 MOps/s $\color{#35bf28}+1.86\%$
test_stack[memmap_tensor0] 29.7020μs 4.4969μs 222.3771 KOps/s 230.7603 KOps/s $\color{#d91a1a}-3.63\%$
test_memmaptd_index 1.6745ms 0.2638ms 3.7913 KOps/s 3.9550 KOps/s $\color{#d91a1a}-4.14\%$
test_memmaptd_index_astensor 0.5919ms 0.3231ms 3.0955 KOps/s 3.1778 KOps/s $\color{#d91a1a}-2.59\%$
test_memmaptd_index_op 1.0845ms 0.6033ms 1.6575 KOps/s 1.6252 KOps/s $\color{#35bf28}+1.99\%$
test_serialize_model 0.1320s 0.1311s 7.6258 Ops/s 7.6286 Ops/s $\color{#d91a1a}-0.04\%$
test_serialize_model_pickle 1.3447s 1.1910s 0.8396 Ops/s 0.8220 Ops/s $\color{#35bf28}+2.14\%$
test_serialize_weights 0.1313s 0.1302s 7.6831 Ops/s 7.6856 Ops/s $\color{#d91a1a}-0.03\%$
test_serialize_weights_returnearly 0.3463s 55.8492ms 17.9054 Ops/s 14.6026 Ops/s $\textbf{\color{#35bf28}+22.62\%}$
test_serialize_weights_pickle 1.3558s 1.2158s 0.8225 Ops/s 0.8228 Ops/s $\color{#d91a1a}-0.04\%$
test_reshape_pytree 0.1576ms 22.4945μs 44.4553 KOps/s 43.5934 KOps/s $\color{#35bf28}+1.98\%$
test_reshape_td 58.3630μs 27.3697μs 36.5367 KOps/s 32.7952 KOps/s $\textbf{\color{#35bf28}+11.41\%}$
test_view_pytree 78.9140μs 22.1609μs 45.1245 KOps/s 43.3792 KOps/s $\color{#35bf28}+4.02\%$
test_view_td 0.1161ms 31.0849μs 32.1700 KOps/s 29.6924 KOps/s $\textbf{\color{#35bf28}+8.34\%}$
test_unbind_pytree 60.5930μs 28.4944μs 35.0947 KOps/s 33.7676 KOps/s $\color{#35bf28}+3.93\%$
test_unbind_td 0.7628ms 37.5982μs 26.5970 KOps/s 25.4106 KOps/s $\color{#35bf28}+4.67\%$
test_split_pytree 64.8130μs 30.9489μs 32.3114 KOps/s 30.8577 KOps/s $\color{#35bf28}+4.71\%$
test_split_td 0.9319ms 38.9773μs 25.6559 KOps/s 26.4752 KOps/s $\color{#d91a1a}-3.09\%$
test_add_pytree 0.1722ms 36.0090μs 27.7708 KOps/s 27.6758 KOps/s $\color{#35bf28}+0.34\%$
test_add_td 90.1550μs 47.3224μs 21.1316 KOps/s 18.1062 KOps/s $\textbf{\color{#35bf28}+16.71\%}$
test_compile_add_one_nested[tensordict-compile] 0.1721ms 0.1230ms 8.1324 KOps/s 7.9331 KOps/s $\color{#35bf28}+2.51\%$
test_compile_add_one_nested[tensordict-eager] 0.2801ms 0.1303ms 7.6717 KOps/s 7.6824 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_nested[pytree-compile] 0.2111ms 97.5316μs 10.2531 KOps/s 9.9801 KOps/s $\color{#35bf28}+2.74\%$
test_compile_add_one_nested[pytree-eager] 1.4590ms 0.1536ms 6.5092 KOps/s 6.6491 KOps/s $\color{#d91a1a}-2.10\%$
test_compile_copy_nested[tensordict-compile] 0.1685ms 23.4560μs 42.6330 KOps/s 44.4450 KOps/s $\color{#d91a1a}-4.08\%$
test_compile_copy_nested[tensordict-eager] 0.1193ms 29.8053μs 33.5511 KOps/s 33.6150 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_copy_nested[pytree-compile] 0.4764ms 65.0116μs 15.3819 KOps/s 15.1187 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_nested[pytree-eager] 81.9540μs 49.3945μs 20.2452 KOps/s 20.1842 KOps/s $\color{#35bf28}+0.30\%$
test_compile_add_one_flat[tensordict-compile] 0.2032ms 0.1460ms 6.8494 KOps/s 7.0118 KOps/s $\color{#d91a1a}-2.32\%$
test_compile_add_one_flat[tensordict-eager] 0.3546ms 0.2159ms 4.6319 KOps/s 4.6876 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_add_one_flat[tensorclass-compile] 0.1443ms 0.1003ms 9.9743 KOps/s 10.0513 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_add_one_flat[tensorclass-eager] 0.1508ms 55.2916μs 18.0859 KOps/s 17.5116 KOps/s $\color{#35bf28}+3.28\%$
test_compile_add_one_flat[pytree-compile] 0.1901ms 0.1432ms 6.9827 KOps/s 7.2921 KOps/s $\color{#d91a1a}-4.24\%$
test_compile_add_one_flat[pytree-eager] 0.6533ms 0.5058ms 1.9770 KOps/s 2.0778 KOps/s $\color{#d91a1a}-4.85\%$
test_compile_add_self_flat[tensordict-eager] 0.3883ms 0.2628ms 3.8054 KOps/s 3.8787 KOps/s $\color{#d91a1a}-1.89\%$
test_compile_add_self_flat[tensordict-compile] 0.3244ms 0.1528ms 6.5434 KOps/s 6.9868 KOps/s $\textbf{\color{#d91a1a}-6.35\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2736ms 67.5349μs 14.8072 KOps/s 15.3292 KOps/s $\color{#d91a1a}-3.41\%$
test_compile_add_self_flat[tensorclass-compile] 0.1441ms 0.1018ms 9.8269 KOps/s 10.0238 KOps/s $\color{#d91a1a}-1.96\%$
test_compile_add_self_flat[pytree-eager] 0.4652ms 0.4241ms 2.3579 KOps/s 2.4489 KOps/s $\color{#d91a1a}-3.72\%$
test_compile_add_self_flat[pytree-compile] 0.2329ms 0.1428ms 7.0024 KOps/s 7.3561 KOps/s $\color{#d91a1a}-4.81\%$
test_compile_copy_flat[tensordict-compile] 0.1311ms 24.8624μs 40.2214 KOps/s 52.6049 KOps/s $\textbf{\color{#d91a1a}-23.54\%}$
test_compile_copy_flat[tensordict-eager] 68.3440μs 31.1428μs 32.1102 KOps/s 31.7637 KOps/s $\color{#35bf28}+1.09\%$
test_compile_copy_flat[pytree-compile] 0.1108ms 70.9672μs 14.0910 KOps/s 14.0261 KOps/s $\color{#35bf28}+0.46\%$
test_compile_copy_flat[pytree-eager] 81.4040μs 51.5893μs 19.3839 KOps/s 19.4041 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_assign_and_add[tensordict-compile] 1.6628ms 0.4080ms 2.4512 KOps/s 2.2211 KOps/s $\textbf{\color{#35bf28}+10.36\%}$
test_compile_assign_and_add[tensordict-eager] 3.0084ms 2.7690ms 361.1468 Ops/s 375.8325 Ops/s $\color{#d91a1a}-3.91\%$
test_compile_assign_and_add[pytree-compile] 1.5951ms 0.4347ms 2.3003 KOps/s 2.2569 KOps/s $\color{#35bf28}+1.92\%$
test_compile_assign_and_add[pytree-eager] 2.9376ms 2.7308ms 366.1895 Ops/s 371.4134 Ops/s $\color{#d91a1a}-1.41\%$
test_compile_indexing[tensor-tensordict-compile] 0.7804ms 0.1154ms 8.6651 KOps/s 8.6048 KOps/s $\color{#35bf28}+0.70\%$
test_compile_indexing[tensor-tensordict-eager] 0.5868ms 83.8885μs 11.9206 KOps/s 11.5897 KOps/s $\color{#35bf28}+2.85\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5378ms 0.1139ms 8.7792 KOps/s 9.2219 KOps/s $\color{#d91a1a}-4.80\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2197ms 70.4852μs 14.1874 KOps/s 13.5388 KOps/s $\color{#35bf28}+4.79\%$
test_compile_indexing[tensor-pytree-compile] 0.3066ms 0.1110ms 9.0080 KOps/s 9.1634 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_indexing[tensor-pytree-eager] 0.2102ms 69.9705μs 14.2917 KOps/s 14.2479 KOps/s $\color{#35bf28}+0.31\%$
test_compile_indexing[slice-tensordict-compile] 0.2360ms 0.1036ms 9.6531 KOps/s 9.7726 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_indexing[slice-tensordict-eager] 0.1679ms 17.6460μs 56.6700 KOps/s 56.2552 KOps/s $\color{#35bf28}+0.74\%$
test_compile_indexing[slice-tensorclass-compile] 0.1440ms 0.1019ms 9.8150 KOps/s 10.1128 KOps/s $\color{#d91a1a}-2.95\%$
test_compile_indexing[slice-tensorclass-eager] 56.2820μs 16.1243μs 62.0181 KOps/s 60.8756 KOps/s $\color{#35bf28}+1.88\%$
test_compile_indexing[slice-pytree-compile] 0.2884ms 0.1022ms 9.7843 KOps/s 10.0248 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[slice-pytree-eager] 0.1436ms 16.1676μs 61.8522 KOps/s 62.8394 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_indexing[int-tensordict-compile] 0.2582ms 0.1039ms 9.6249 KOps/s 9.4870 KOps/s $\color{#35bf28}+1.45\%$
test_compile_indexing[int-tensordict-eager] 0.5496ms 17.4839μs 57.1955 KOps/s 53.0392 KOps/s $\textbf{\color{#35bf28}+7.84\%}$
test_compile_indexing[int-tensorclass-compile] 0.2642ms 0.1038ms 9.6302 KOps/s 10.0209 KOps/s $\color{#d91a1a}-3.90\%$
test_compile_indexing[int-tensorclass-eager] 78.9240μs 18.1503μs 55.0954 KOps/s 63.4309 KOps/s $\textbf{\color{#d91a1a}-13.14\%}$
test_compile_indexing[int-pytree-compile] 0.2456ms 99.2221μs 10.0784 KOps/s 9.9649 KOps/s $\color{#35bf28}+1.14\%$
test_compile_indexing[int-pytree-eager] 42.9820μs 16.0917μs 62.1438 KOps/s 64.1328 KOps/s $\color{#d91a1a}-3.10\%$
test_mod_add[eager] 0.1909ms 38.7928μs 25.7780 KOps/s 25.5903 KOps/s $\color{#35bf28}+0.73\%$
test_mod_add[compile] 0.2705ms 81.0498μs 12.3381 KOps/s 12.2104 KOps/s $\color{#35bf28}+1.05\%$
test_mod_add[compile-overhead] 0.3598ms 0.1743ms 5.7366 KOps/s 5.6120 KOps/s $\color{#35bf28}+2.22\%$
test_mod_wrap[eager] 0.3356ms 0.2567ms 3.8963 KOps/s 3.8709 KOps/s $\color{#35bf28}+0.66\%$
test_mod_wrap[compile] 0.7153ms 0.2901ms 3.4470 KOps/s 3.4266 KOps/s $\color{#35bf28}+0.60\%$
test_mod_wrap[compile-overhead] 7.2735ms 3.6823ms 271.5670 Ops/s 269.6477 Ops/s $\color{#35bf28}+0.71\%$
test_mod_wrap_and_backward[eager] 1.5853ms 1.3879ms 720.4955 Ops/s 680.1990 Ops/s $\textbf{\color{#35bf28}+5.92\%}$
test_mod_wrap_and_backward[compile] 1.4133ms 1.2924ms 773.7690 Ops/s 716.0286 Ops/s $\textbf{\color{#35bf28}+8.06\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3983ms 0.9379ms 1.0663 KOps/s 939.4165 Ops/s $\textbf{\color{#35bf28}+13.50\%}$
test_seq_add[eager] 0.2655ms 0.1185ms 8.4388 KOps/s 8.2959 KOps/s $\color{#35bf28}+1.72\%$
test_seq_add[compile] 0.2040ms 89.6851μs 11.1501 KOps/s 11.1152 KOps/s $\color{#35bf28}+0.31\%$
test_seq_add[compile-overhead] 0.1867ms 0.1320ms 7.5744 KOps/s 7.5474 KOps/s $\color{#35bf28}+0.36\%$
test_seq_wrap[eager] 0.5676ms 0.4261ms 2.3469 KOps/s 2.2948 KOps/s $\color{#35bf28}+2.27\%$
test_seq_wrap[compile] 0.5248ms 0.3136ms 3.1889 KOps/s 3.2660 KOps/s $\color{#d91a1a}-2.36\%$
test_seq_wrap[compile-overhead] 0.2759ms 0.2340ms 4.2734 KOps/s 4.3431 KOps/s $\color{#d91a1a}-1.60\%$
test_func_call_runtime[False-eager] 1.0230ms 0.8264ms 1.2101 KOps/s 1.3195 KOps/s $\textbf{\color{#d91a1a}-8.29\%}$
test_func_call_runtime[False-compile] 0.8607ms 0.8054ms 1.2416 KOps/s 1.3191 KOps/s $\textbf{\color{#d91a1a}-5.87\%}$
test_func_call_runtime[False-compile-overhead] 0.5019ms 0.3698ms 2.7043 KOps/s 2.6863 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_runtime[True-eager] 1.0685ms 0.9239ms 1.0823 KOps/s 1.0805 KOps/s $\color{#35bf28}+0.16\%$
test_func_call_runtime[True-compile] 0.9603ms 0.7822ms 1.2784 KOps/s 1.2822 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_runtime[True-compile-overhead] 0.4583ms 0.3887ms 2.5724 KOps/s 2.5550 KOps/s $\color{#35bf28}+0.68\%$
test_func_call_cm_runtime[False-eager] 0.8351ms 0.7623ms 1.3118 KOps/s 1.3241 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_cm_runtime[False-compile] 0.8922ms 0.7585ms 1.3184 KOps/s 1.3066 KOps/s $\color{#35bf28}+0.90\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4254ms 0.3717ms 2.6904 KOps/s 2.6793 KOps/s $\color{#35bf28}+0.41\%$
test_func_call_cm_runtime[True-eager] 1.1929ms 1.0245ms 976.1081 Ops/s 966.5917 Ops/s $\color{#35bf28}+0.98\%$
test_func_call_cm_runtime[True-compile] 0.9510ms 0.8077ms 1.2380 KOps/s 1.2404 KOps/s $\color{#d91a1a}-0.19\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5606ms 0.4173ms 2.3964 KOps/s 2.3863 KOps/s $\color{#35bf28}+0.42\%$
test_vmap_func_call_cm_runtime[eager] 2.5936ms 2.1238ms 470.8482 Ops/s 464.9323 Ops/s $\color{#35bf28}+1.27\%$
test_vmap_func_call_cm_runtime[compile] 0.9879ms 0.8247ms 1.2126 KOps/s 1.2037 KOps/s $\color{#35bf28}+0.75\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4664ms 0.4200ms 2.3808 KOps/s 2.3632 KOps/s $\color{#35bf28}+0.74\%$
test_distributed 2.4645ms 0.2379ms 4.2030 KOps/s 8.3390 KOps/s $\textbf{\color{#d91a1a}-49.60\%}$
test_tdmodule 59.2430μs 19.4323μs 51.4607 KOps/s 48.7222 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_tdmodule_dispatch 71.0430μs 34.2953μs 29.1585 KOps/s 26.8377 KOps/s $\textbf{\color{#35bf28}+8.65\%}$
test_tdseq 59.0730μs 20.4471μs 48.9066 KOps/s 46.1611 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_tdseq_dispatch 57.4730μs 37.7919μs 26.4607 KOps/s 24.6151 KOps/s $\textbf{\color{#35bf28}+7.50\%}$
test_instantiation_functorch 1.6583ms 1.5770ms 634.1164 Ops/s 629.2838 Ops/s $\color{#35bf28}+0.77\%$
test_exec_functorch 0.2797ms 0.1501ms 6.6633 KOps/s 6.7818 KOps/s $\color{#d91a1a}-1.75\%$
test_exec_functional_call 0.2247ms 0.1438ms 6.9559 KOps/s 7.0323 KOps/s $\color{#d91a1a}-1.09\%$
test_exec_td_decorator 0.3770ms 0.1928ms 5.1855 KOps/s 5.2133 KOps/s $\color{#d91a1a}-0.53\%$
test_vmap_mlp_speed_decorator[True-True] 0.8185ms 0.6957ms 1.4375 KOps/s 1.4108 KOps/s $\color{#35bf28}+1.89\%$
test_vmap_mlp_speed_decorator[True-False] 0.8491ms 0.6950ms 1.4388 KOps/s 1.4346 KOps/s $\color{#35bf28}+0.30\%$
test_vmap_mlp_speed_decorator[False-True] 0.7493ms 0.6092ms 1.6415 KOps/s 1.6424 KOps/s $\color{#d91a1a}-0.05\%$
test_vmap_mlp_speed_decorator[False-False] 0.7257ms 0.6098ms 1.6400 KOps/s 1.6082 KOps/s $\color{#35bf28}+1.98\%$
test_vmap_transformer_speed_decorator[True-True] 20.4219ms 19.6264ms 50.9517 Ops/s 51.0086 Ops/s $\color{#d91a1a}-0.11\%$
test_vmap_transformer_speed_decorator[True-False] 19.7641ms 19.6047ms 51.0083 Ops/s 50.6519 Ops/s $\color{#35bf28}+0.70\%$
test_vmap_transformer_speed_decorator[False-True] 19.7663ms 19.5324ms 51.1970 Ops/s 51.6000 Ops/s $\color{#d91a1a}-0.78\%$
test_vmap_transformer_speed_decorator[False-False] 19.5800ms 19.4628ms 51.3802 Ops/s 51.4877 Ops/s $\color{#d91a1a}-0.21\%$
test_to_module_speed[True] 2.2443ms 0.9793ms 1.0212 KOps/s 1.0164 KOps/s $\color{#35bf28}+0.47\%$
test_to_module_speed[False] 1.1178ms 0.9679ms 1.0332 KOps/s 1.0396 KOps/s $\color{#d91a1a}-0.62\%$
test_tc_init 63.5430μs 36.2732μs 27.5685 KOps/s 25.5773 KOps/s $\textbf{\color{#35bf28}+7.78\%}$
test_tc_init_nested 0.1069ms 72.1049μs 13.8687 KOps/s 12.9197 KOps/s $\textbf{\color{#35bf28}+7.35\%}$
test_tc_first_layer_tensor 5.0003μs 0.7155μs 1.3977 MOps/s 1.4282 MOps/s $\color{#d91a1a}-2.13\%$
test_tc_first_layer_nontensor 31.6020μs 2.2811μs 438.3923 KOps/s 449.3955 KOps/s $\color{#d91a1a}-2.45\%$
test_tc_second_layer_tensor 8.0355μs 1.4406μs 694.1323 KOps/s 696.0600 KOps/s $\color{#d91a1a}-0.28\%$
test_tc_second_layer_nontensor 46.4720μs 3.0239μs 330.6971 KOps/s 334.2481 KOps/s $\color{#d91a1a}-1.06\%$
test_unbind 0.2171s 12.1382ms 82.3842 Ops/s 141.0333 Ops/s $\textbf{\color{#d91a1a}-41.59\%}$
test_full_like 9.8978ms 9.4271ms 106.0767 Ops/s 107.1698 Ops/s $\color{#d91a1a}-1.02\%$
test_zeros_like 4.8474ms 4.2227ms 236.8155 Ops/s 232.9392 Ops/s $\color{#35bf28}+1.66\%$
test_ones_like 5.0777ms 4.3558ms 229.5785 Ops/s 235.8710 Ops/s $\color{#d91a1a}-2.67\%$
test_clone 7.8881ms 6.6058ms 151.3823 Ops/s 153.0043 Ops/s $\color{#d91a1a}-1.06\%$
test_squeeze 58.7130μs 9.6897μs 103.2019 KOps/s 100.9199 KOps/s $\color{#35bf28}+2.26\%$
test_unsqueeze 0.1267ms 72.3036μs 13.8306 KOps/s 13.0079 KOps/s $\textbf{\color{#35bf28}+6.32\%}$
test_split 0.4181ms 0.1621ms 6.1692 KOps/s 5.9763 KOps/s $\color{#35bf28}+3.23\%$
test_permute 0.2314ms 0.1807ms 5.5326 KOps/s 5.5319 KOps/s $\color{#35bf28}+0.01\%$
test_stack 52.0863ms 51.3397ms 19.4781 Ops/s 19.5855 Ops/s $\color{#d91a1a}-0.55\%$
test_cat 51.9216ms 51.3332ms 19.4806 Ops/s 19.5738 Ops/s $\color{#d91a1a}-0.48\%$
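
The Change column above appears to be the relative difference between the "Ops" value measured on this PR and the "Ops on Repo HEAD" value, after converting both to the same unit. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code: the helper names `to_ops_per_second` and `percent_change` are hypothetical, and the formula is an assumption inferred from the rows themselves (for example `test_distributed` and `test_unbind`).

```python
# Assumed unit scale factors for the throughput columns in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput reading such as (4.2030, 'KOps/s') to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the test_distributed and test_unbind rows.
distributed = percent_change(
    to_ops_per_second(4.2030, "KOps/s"),
    to_ops_per_second(8.3390, "KOps/s"),
)
unbind = percent_change(
    to_ops_per_second(82.3842, "Ops/s"),
    to_ops_per_second(141.0333, "Ops/s"),
)
print(f"test_distributed: {distributed:+.2f}%")
print(f"test_unbind: {unbind:+.2f}%")
```

Rows that mix units (for example KOps/s against Ops/s) may differ from the reported figure in the last digit, because the table shows rounded throughput values.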

@vmoens vmoens merged commit fef37d6 into gh/vmoens/43/base Jan 7, 2025
50 of 55 checks passed
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 537f3d87b0677a1ae4992ca581a585420a10a284
Pull Request resolved: #1160
@vmoens vmoens deleted the gh/vmoens/43/head branch January 7, 2025 11:22
@vmoens vmoens added the enhancement New feature or request label Jan 7, 2025