[BugFix] Add sync to cudagraph module #1026

Merged
merged 1 commit into from
Oct 4, 2024

Conversation

vmoens
Contributor

@vmoens vmoens commented Oct 4, 2024

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed label Oct 4, 2024
@vmoens vmoens added the bug label Oct 4, 2024
@vmoens vmoens merged commit 362c072 into main Oct 4, 2024
34 of 39 checks passed
@vmoens vmoens deleted the add-sync-cudagraphs branch October 4, 2024 08:38

github-actions bot commented Oct 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}28$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 73.7580μs 26.4435μs 37.8165 KOps/s 40.9303 KOps/s $\textbf{\color{#d91a1a}-7.61\%}$
test_plain_set_stack_nested 64.8110μs 26.7570μs 37.3733 KOps/s 40.2900 KOps/s $\textbf{\color{#d91a1a}-7.24\%}$
test_plain_set_nested_inplace 70.9330μs 29.2945μs 34.1361 KOps/s 37.3383 KOps/s $\textbf{\color{#d91a1a}-8.58\%}$
test_plain_set_stack_nested_inplace 68.2790μs 28.7535μs 34.7784 KOps/s 37.3494 KOps/s $\textbf{\color{#d91a1a}-6.88\%}$
test_items 23.0930μs 4.1523μs 240.8323 KOps/s 231.5482 KOps/s $\color{#35bf28}+4.01\%$
test_items_nested 0.4683ms 0.3881ms 2.5770 KOps/s 2.6132 KOps/s $\color{#d91a1a}-1.39\%$
test_items_nested_locked 0.5548ms 0.3891ms 2.5701 KOps/s 2.6116 KOps/s $\color{#d91a1a}-1.59\%$
test_items_nested_leaf 0.1612ms 80.7941μs 12.3771 KOps/s 12.2949 KOps/s $\color{#35bf28}+0.67\%$
test_items_stack_nested 0.7027ms 0.3951ms 2.5312 KOps/s 2.5774 KOps/s $\color{#d91a1a}-1.79\%$
test_items_stack_nested_leaf 0.1551ms 84.2178μs 11.8740 KOps/s 12.0376 KOps/s $\color{#d91a1a}-1.36\%$
test_items_stack_nested_locked 0.7329ms 0.3937ms 2.5399 KOps/s 2.5864 KOps/s $\color{#d91a1a}-1.80\%$
test_keys 30.3270μs 3.5332μs 283.0297 KOps/s 284.7011 KOps/s $\color{#d91a1a}-0.59\%$
test_keys_nested 0.2678ms 0.1362ms 7.3430 KOps/s 7.5227 KOps/s $\color{#d91a1a}-2.39\%$
test_keys_nested_locked 1.5895ms 0.1416ms 7.0631 KOps/s 7.2001 KOps/s $\color{#d91a1a}-1.90\%$
test_keys_nested_leaf 0.2002ms 0.1193ms 8.3823 KOps/s 8.6125 KOps/s $\color{#d91a1a}-2.67\%$
test_keys_stack_nested 0.2318ms 0.1344ms 7.4395 KOps/s 7.5322 KOps/s $\color{#d91a1a}-1.23\%$
test_keys_stack_nested_leaf 0.2098ms 0.1179ms 8.4848 KOps/s 8.6193 KOps/s $\color{#d91a1a}-1.56\%$
test_keys_stack_nested_locked 0.2492ms 0.1398ms 7.1547 KOps/s 7.2435 KOps/s $\color{#d91a1a}-1.23\%$
test_values 6.2398μs 1.0726μs 932.3001 KOps/s 937.1496 KOps/s $\color{#d91a1a}-0.52\%$
test_values_nested 0.1869ms 96.9536μs 10.3142 KOps/s 10.3774 KOps/s $\color{#d91a1a}-0.61\%$
test_values_nested_locked 0.1698ms 97.4017μs 10.2668 KOps/s 10.4452 KOps/s $\color{#d91a1a}-1.71\%$
test_values_nested_leaf 0.1618ms 83.6408μs 11.9559 KOps/s 12.3478 KOps/s $\color{#d91a1a}-3.17\%$
test_values_stack_nested 0.1725ms 98.1995μs 10.1833 KOps/s 10.4195 KOps/s $\color{#d91a1a}-2.27\%$
test_values_stack_nested_leaf 0.1436ms 82.4443μs 12.1294 KOps/s 12.3271 KOps/s $\color{#d91a1a}-1.60\%$
test_values_stack_nested_locked 0.1557ms 97.0292μs 10.3062 KOps/s 10.5310 KOps/s $\color{#d91a1a}-2.13\%$
test_membership 21.3000μs 0.9287μs 1.0768 MOps/s 1.1263 MOps/s $\color{#d91a1a}-4.40\%$
test_membership_nested 23.1740μs 2.8657μs 348.9513 KOps/s 366.4396 KOps/s $\color{#d91a1a}-4.77\%$
test_membership_nested_leaf 24.6670μs 2.8847μs 346.6537 KOps/s 364.7127 KOps/s $\color{#d91a1a}-4.95\%$
test_membership_stacked_nested 22.9630μs 2.8755μs 347.7653 KOps/s 364.9114 KOps/s $\color{#d91a1a}-4.70\%$
test_membership_stacked_nested_leaf 29.9260μs 2.9144μs 343.1201 KOps/s 368.7271 KOps/s $\textbf{\color{#d91a1a}-6.94\%}$
test_membership_nested_last 25.4070μs 4.3319μs 230.8475 KOps/s 232.3815 KOps/s $\color{#d91a1a}-0.66\%$
test_membership_nested_leaf_last 26.6190μs 4.3652μs 229.0871 KOps/s 237.4847 KOps/s $\color{#d91a1a}-3.54\%$
test_membership_stacked_nested_last 28.3930μs 4.3585μs 229.4387 KOps/s 237.6236 KOps/s $\color{#d91a1a}-3.44\%$
test_membership_stacked_nested_leaf_last 26.0390μs 4.3346μs 230.7034 KOps/s 235.6261 KOps/s $\color{#d91a1a}-2.09\%$
test_nested_getleaf 33.7930μs 10.5925μs 94.4068 KOps/s 93.8171 KOps/s $\color{#35bf28}+0.63\%$
test_nested_get 55.2940μs 9.9542μs 100.4598 KOps/s 98.3939 KOps/s $\color{#35bf28}+2.10\%$
test_stacked_getleaf 36.0180μs 10.5987μs 94.3512 KOps/s 94.7916 KOps/s $\color{#d91a1a}-0.46\%$
test_stacked_get 32.6510μs 9.9791μs 100.2095 KOps/s 99.7176 KOps/s $\color{#35bf28}+0.49\%$
test_nested_getitemleaf 34.3140μs 10.9187μs 91.5858 KOps/s 90.7088 KOps/s $\color{#35bf28}+0.97\%$
test_nested_getitem 54.3820μs 10.1169μs 98.8447 KOps/s 97.2975 KOps/s $\color{#35bf28}+1.59\%$
test_stacked_getitemleaf 44.9150μs 10.8791μs 91.9194 KOps/s 90.7819 KOps/s $\color{#35bf28}+1.25\%$
test_stacked_getitem 30.8270μs 10.0875μs 99.1325 KOps/s 95.8756 KOps/s $\color{#35bf28}+3.40\%$
test_lock_nested 84.1360ms 0.5978ms 1.6729 KOps/s 1.9214 KOps/s $\textbf{\color{#d91a1a}-12.93\%}$
test_lock_stack_nested 0.6979ms 0.4749ms 2.1057 KOps/s 2.0698 KOps/s $\color{#35bf28}+1.73\%$
test_unlock_nested 83.2556ms 0.5150ms 1.9419 KOps/s 2.2917 KOps/s $\textbf{\color{#d91a1a}-15.27\%}$
test_unlock_stack_nested 0.6588ms 0.3913ms 2.5554 KOps/s 2.4934 KOps/s $\color{#35bf28}+2.49\%$
test_flatten_speed 0.1911ms 0.1016ms 9.8407 KOps/s 9.9553 KOps/s $\color{#d91a1a}-1.15\%$
test_unflatten_speed 1.0573ms 0.5350ms 1.8690 KOps/s 1.9370 KOps/s $\color{#d91a1a}-3.51\%$
test_common_ops 4.5509ms 1.2034ms 830.9866 Ops/s 898.3887 Ops/s $\textbf{\color{#d91a1a}-7.50\%}$
test_creation 19.5170μs 2.1418μs 466.9060 KOps/s 473.7156 KOps/s $\color{#d91a1a}-1.44\%$
test_creation_empty 47.5590μs 20.1023μs 49.7456 KOps/s 56.1580 KOps/s $\textbf{\color{#d91a1a}-11.42\%}$
test_creation_nested_1 55.9050μs 23.4589μs 42.6277 KOps/s 47.7704 KOps/s $\textbf{\color{#d91a1a}-10.77\%}$
test_creation_nested_2 61.5160μs 27.6225μs 36.2024 KOps/s 39.6664 KOps/s $\textbf{\color{#d91a1a}-8.73\%}$
test_clone 0.1201ms 17.1299μs 58.3774 KOps/s 58.2338 KOps/s $\color{#35bf28}+0.25\%$
test_getitem[int] 1.1468ms 16.9050μs 59.1542 KOps/s 57.6708 KOps/s $\color{#35bf28}+2.57\%$
test_getitem[slice_int] 0.1455ms 31.9062μs 31.3419 KOps/s 32.2207 KOps/s $\color{#d91a1a}-2.73\%$
test_getitem[range] 0.2710ms 60.5421μs 16.5174 KOps/s 17.2812 KOps/s $\color{#d91a1a}-4.42\%$
test_getitem[tuple] 0.1414ms 25.8072μs 38.7489 KOps/s 38.8393 KOps/s $\color{#d91a1a}-0.23\%$
test_getitem[list] 0.1880ms 55.1552μs 18.1307 KOps/s 18.6575 KOps/s $\color{#d91a1a}-2.82\%$
test_setitem_dim[int] 61.4160μs 34.1248μs 29.3042 KOps/s 30.1376 KOps/s $\color{#d91a1a}-2.77\%$
test_setitem_dim[slice_int] 0.1097ms 62.7226μs 15.9432 KOps/s 16.2444 KOps/s $\color{#d91a1a}-1.85\%$
test_setitem_dim[range] 0.1385ms 87.2948μs 11.4554 KOps/s 11.9410 KOps/s $\color{#d91a1a}-4.07\%$
test_setitem_dim[tuple] 80.2610μs 51.3278μs 19.4826 KOps/s 19.8325 KOps/s $\color{#d91a1a}-1.76\%$
test_setitem 88.3360μs 32.0764μs 31.1756 KOps/s 32.7089 KOps/s $\color{#d91a1a}-4.69\%$
test_set 83.0760μs 30.9376μs 32.3231 KOps/s 34.3094 KOps/s $\textbf{\color{#d91a1a}-5.79\%}$
test_set_shared 3.2701ms 0.2189ms 4.5688 KOps/s 4.5577 KOps/s $\color{#35bf28}+0.24\%$
test_update 0.1481ms 40.3122μs 24.8064 KOps/s 26.7878 KOps/s $\textbf{\color{#d91a1a}-7.40\%}$
test_update_nested 0.1257ms 50.9592μs 19.6235 KOps/s 20.8724 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_update__nested 1.0217ms 37.6594μs 26.5538 KOps/s 26.1721 KOps/s $\color{#35bf28}+1.46\%$
test_set_nested 0.1714ms 34.9620μs 28.6025 KOps/s 31.5271 KOps/s $\textbf{\color{#d91a1a}-9.28\%}$
test_set_nested_new 0.1367ms 38.2407μs 26.1502 KOps/s 26.6935 KOps/s $\color{#d91a1a}-2.04\%$
test_select 0.1340ms 55.8132μs 17.9169 KOps/s 18.5139 KOps/s $\color{#d91a1a}-3.22\%$
test_select_nested 0.1265ms 60.9997μs 16.3935 KOps/s 16.8152 KOps/s $\color{#d91a1a}-2.51\%$
test_exclude_nested 0.1776ms 77.1810μs 12.9566 KOps/s 13.4215 KOps/s $\color{#d91a1a}-3.46\%$
test_empty[True] 0.6525ms 0.3526ms 2.8363 KOps/s 2.8365 KOps/s $-0.00\%$
test_empty[False] 8.5637μs 1.3037μs 767.0563 KOps/s 801.6773 KOps/s $\color{#d91a1a}-4.32\%$
test_unbind_speed 0.6510ms 0.3065ms 3.2630 KOps/s 3.1473 KOps/s $\color{#35bf28}+3.68\%$
test_unbind_speed_stack0 0.5954ms 0.3022ms 3.3090 KOps/s 3.2202 KOps/s $\color{#35bf28}+2.76\%$
test_unbind_speed_stack1 91.6826ms 0.8020ms 1.2469 KOps/s 1.2813 KOps/s $\color{#d91a1a}-2.68\%$
test_split 3.2178ms 2.0625ms 484.8526 Ops/s 449.4527 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_chunk 89.9785ms 2.2424ms 445.9494 Ops/s 451.2638 Ops/s $\color{#d91a1a}-1.18\%$
test_creation[device0] 0.2292ms 0.1166ms 8.5729 KOps/s 8.5724 KOps/s $+0.01\%$
test_creation_from_tensor 5.7147ms 0.1183ms 8.4529 KOps/s 8.4557 KOps/s $\color{#d91a1a}-0.03\%$
test_add_one[memmap_tensor0] 0.2123ms 7.4447μs 134.3231 KOps/s 134.9379 KOps/s $\color{#d91a1a}-0.46\%$
test_contiguous[memmap_tensor0] 17.3220μs 1.9316μs 517.7183 KOps/s 527.0806 KOps/s $\color{#d91a1a}-1.78\%$
test_stack[memmap_tensor0] 55.4650μs 5.7497μs 173.9212 KOps/s 176.8150 KOps/s $\color{#d91a1a}-1.64\%$
test_memmaptd_index 1.0988ms 0.4265ms 2.3448 KOps/s 2.4384 KOps/s $\color{#d91a1a}-3.84\%$
test_memmaptd_index_astensor 1.2452ms 0.5274ms 1.8963 KOps/s 1.9403 KOps/s $\color{#d91a1a}-2.27\%$
test_memmaptd_index_op 1.7066ms 1.0992ms 909.7751 Ops/s 954.8079 Ops/s $\color{#d91a1a}-4.72\%$
test_serialize_model 0.2091s 0.1295s 7.7238 Ops/s 8.5527 Ops/s $\textbf{\color{#d91a1a}-9.69\%}$
test_serialize_model_pickle 0.4730s 0.3917s 2.5529 Ops/s 2.5166 Ops/s $\color{#35bf28}+1.44\%$
test_serialize_weights 0.1242s 0.1165s 8.5841 Ops/s 8.5972 Ops/s $\color{#d91a1a}-0.15\%$
test_serialize_weights_returnearly 0.1689s 0.1602s 6.2429 Ops/s 6.4266 Ops/s $\color{#d91a1a}-2.86\%$
test_serialize_weights_pickle 0.5189s 0.4225s 2.3667 Ops/s 2.4715 Ops/s $\color{#d91a1a}-4.24\%$
test_serialize_weights_filesystem 0.2231s 0.1524s 6.5611 Ops/s 7.1210 Ops/s $\textbf{\color{#d91a1a}-7.86\%}$
test_serialize_model_filesystem 0.1568s 0.1455s 6.8721 Ops/s 6.5340 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_reshape_pytree 90.7600μs 39.0870μs 25.5840 KOps/s 25.7282 KOps/s $\color{#d91a1a}-0.56\%$
test_reshape_td 0.1174ms 45.6259μs 21.9174 KOps/s 21.1107 KOps/s $\color{#35bf28}+3.82\%$
test_view_pytree 90.5200μs 38.8738μs 25.7243 KOps/s 26.0067 KOps/s $\color{#d91a1a}-1.09\%$
test_view_td 0.1285ms 52.8314μs 18.9281 KOps/s 18.3564 KOps/s $\color{#35bf28}+3.11\%$
test_unbind_pytree 78.6080μs 36.5084μs 27.3909 KOps/s 28.1187 KOps/s $\color{#d91a1a}-2.59\%$
test_unbind_td 0.3059ms 46.3309μs 21.5839 KOps/s 21.3131 KOps/s $\color{#35bf28}+1.27\%$
test_split_pytree 85.1500μs 38.2810μs 26.1226 KOps/s 26.1316 KOps/s $\color{#d91a1a}-0.03\%$
test_split_td 0.4566ms 61.0102μs 16.3907 KOps/s 16.7720 KOps/s $\color{#d91a1a}-2.27\%$
test_add_pytree 98.3750μs 44.9851μs 22.2296 KOps/s 22.2010 KOps/s $\color{#35bf28}+0.13\%$
test_add_td 0.2719ms 90.4186μs 11.0597 KOps/s 11.5658 KOps/s $\color{#d91a1a}-4.38\%$
test_compile_add_one_nested[tensordict-compile] 0.1074ms 59.0086μs 16.9467 KOps/s 17.1838 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_add_one_nested[tensordict-eager] 0.3576ms 0.2013ms 4.9687 KOps/s 5.1075 KOps/s $\color{#d91a1a}-2.72\%$
test_compile_add_one_nested[pytree-compile] 0.1280ms 56.8126μs 17.6017 KOps/s 17.6212 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_one_nested[pytree-eager] 0.2885ms 0.1403ms 7.1253 KOps/s 7.1658 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_copy_nested[tensordict-compile] 81.5250μs 23.2872μs 42.9421 KOps/s 43.3749 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_copy_nested[tensordict-eager] 0.1576ms 73.7437μs 13.5605 KOps/s 13.4352 KOps/s $\color{#35bf28}+0.93\%$
test_compile_copy_nested[pytree-compile] 0.1457ms 75.8423μs 13.1853 KOps/s 13.3912 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_copy_nested[pytree-eager] 0.1274ms 68.5130μs 14.5958 KOps/s 14.6828 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_one_flat[tensordict-compile] 0.3864ms 0.1806ms 5.5375 KOps/s 5.5560 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_flat[tensordict-eager] 0.4562ms 0.2391ms 4.1821 KOps/s 4.2065 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_flat[tensorclass-compile] 0.1087ms 49.0932μs 20.3694 KOps/s 20.6967 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_add_one_flat[tensorclass-eager] 0.1498ms 76.9514μs 12.9952 KOps/s 12.7903 KOps/s $\color{#35bf28}+1.60\%$
test_compile_add_one_flat[pytree-compile] 0.2854ms 0.1731ms 5.7763 KOps/s 5.7805 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_add_one_flat[pytree-eager] 0.4765ms 0.2847ms 3.5121 KOps/s 3.5194 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_self_flat[tensordict-eager] 0.4792ms 0.2761ms 3.6219 KOps/s 3.6121 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_self_flat[tensordict-compile] 0.3383ms 0.1819ms 5.4986 KOps/s 5.5951 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_self_flat[tensorclass-eager] 0.1642ms 73.8821μs 13.5351 KOps/s 13.6423 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_add_self_flat[tensorclass-compile] 0.1157ms 50.3107μs 19.8765 KOps/s 20.8773 KOps/s $\color{#d91a1a}-4.79\%$
test_compile_add_self_flat[pytree-eager] 0.4557ms 0.2313ms 4.3237 KOps/s 4.3344 KOps/s $\color{#d91a1a}-0.25\%$
test_compile_add_self_flat[pytree-compile] 0.2840ms 0.1755ms 5.6987 KOps/s 5.8243 KOps/s $\color{#d91a1a}-2.16\%$
test_compile_copy_flat[tensordict-compile] 0.2102ms 0.1105ms 9.0520 KOps/s 8.9367 KOps/s $\color{#35bf28}+1.29\%$
test_compile_copy_flat[tensordict-eager] 0.1682ms 78.2182μs 12.7847 KOps/s 12.7389 KOps/s $\color{#35bf28}+0.36\%$
test_compile_copy_flat[pytree-compile] 0.1617ms 79.1303μs 12.6374 KOps/s 13.0793 KOps/s $\color{#d91a1a}-3.38\%$
test_compile_copy_flat[pytree-eager] 0.1453ms 68.9800μs 14.4970 KOps/s 14.6847 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_assign_and_add[tensordict-compile] 0.2933ms 0.1952ms 5.1228 KOps/s 5.1342 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_assign_and_add[tensordict-eager] 3.0140ms 1.8097ms 552.5834 Ops/s 572.9522 Ops/s $\color{#d91a1a}-3.56\%$
test_compile_assign_and_add[pytree-compile] 0.4111ms 0.1949ms 5.1300 KOps/s 5.1650 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_assign_and_add[pytree-eager] 1.8455ms 1.0976ms 911.0601 Ops/s 912.5589 Ops/s $\color{#d91a1a}-0.16\%$
test_compile_assign_and_add_stack[compile] 0.4990ms 0.4207ms 2.3772 KOps/s 2.4170 KOps/s $\color{#d91a1a}-1.65\%$
test_compile_assign_and_add_stack[eager] 4.5957ms 4.2028ms 237.9386 Ops/s 252.4610 Ops/s $\textbf{\color{#d91a1a}-5.75\%}$
test_compile_indexing[tensor-tensordict-compile] 96.1910μs 35.6506μs 28.0500 KOps/s 29.2637 KOps/s $\color{#d91a1a}-4.15\%$
test_compile_indexing[tensor-tensordict-eager] 1.1608ms 50.8379μs 19.6704 KOps/s 20.5411 KOps/s $\color{#d91a1a}-4.24\%$
test_compile_indexing[tensor-tensorclass-compile] 96.7720μs 30.5153μs 32.7704 KOps/s 32.4667 KOps/s $\color{#35bf28}+0.94\%$
test_compile_indexing[tensor-tensorclass-eager] 74.7410μs 30.3705μs 32.9267 KOps/s 34.7078 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_compile_indexing[tensor-pytree-compile] 95.3890μs 30.4359μs 32.8560 KOps/s 33.0795 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[tensor-pytree-eager] 81.4130μs 29.8484μs 33.5026 KOps/s 34.4463 KOps/s $\color{#d91a1a}-2.74\%$
test_compile_indexing[slice-tensordict-compile] 0.1594ms 75.5160μs 13.2422 KOps/s 13.6777 KOps/s $\color{#d91a1a}-3.18\%$
test_compile_indexing[slice-tensordict-eager] 0.5549ms 28.7525μs 34.7796 KOps/s 35.2401 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[slice-tensorclass-compile] 0.2016ms 70.2774μs 14.2293 KOps/s 14.9017 KOps/s $\color{#d91a1a}-4.51\%$
test_compile_indexing[slice-tensorclass-eager] 82.1340μs 24.0611μs 41.5608 KOps/s 42.8101 KOps/s $\color{#d91a1a}-2.92\%$
test_compile_indexing[slice-pytree-compile] 0.1651ms 69.1798μs 14.4551 KOps/s 14.8678 KOps/s $\color{#d91a1a}-2.78\%$
test_compile_indexing[slice-pytree-eager] 60.7640μs 23.6238μs 42.3301 KOps/s 42.9500 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_indexing[int-tensordict-compile] 0.1531ms 75.2762μs 13.2844 KOps/s 13.6248 KOps/s $\color{#d91a1a}-2.50\%$
test_compile_indexing[int-tensordict-eager] 1.1097ms 28.3047μs 35.3298 KOps/s 36.2266 KOps/s $\color{#d91a1a}-2.48\%$
test_compile_indexing[int-tensorclass-compile] 0.1516ms 68.7121μs 14.5535 KOps/s 14.8571 KOps/s $\color{#d91a1a}-2.04\%$
test_compile_indexing[int-tensorclass-eager] 70.7230μs 23.6120μs 42.3514 KOps/s 43.0664 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_indexing[int-pytree-compile] 0.1511ms 68.9939μs 14.4940 KOps/s 15.0056 KOps/s $\color{#d91a1a}-3.41\%$
test_compile_indexing[int-pytree-eager] 86.2820μs 23.9320μs 41.7850 KOps/s 42.0776 KOps/s $\color{#d91a1a}-0.70\%$
test_mod_add[eager] 94.0060μs 27.5001μs 36.3635 KOps/s 38.9514 KOps/s $\textbf{\color{#d91a1a}-6.64\%}$
test_mod_add[compile] 87.1840μs 39.5345μs 25.2944 KOps/s 26.6504 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_mod_add[compile-overhead] 92.0530μs 39.3325μs 25.4243 KOps/s 25.7760 KOps/s $\color{#d91a1a}-1.36\%$
test_mod_wrap[eager] 0.4440ms 0.2114ms 4.7307 KOps/s 4.9270 KOps/s $\color{#d91a1a}-3.98\%$
test_mod_wrap[compile] 0.4385ms 0.2370ms 4.2196 KOps/s 4.4040 KOps/s $\color{#d91a1a}-4.19\%$
test_mod_wrap[compile-overhead] 0.4142ms 0.2330ms 4.2922 KOps/s 4.4145 KOps/s $\color{#d91a1a}-2.77\%$
test_mod_wrap_and_backward[eager] 11.9082ms 10.4843ms 95.3804 Ops/s 92.2352 Ops/s $\color{#35bf28}+3.41\%$
test_mod_wrap_and_backward[compile] 11.9915ms 10.5152ms 95.1002 Ops/s 86.1214 Ops/s $\textbf{\color{#35bf28}+10.43\%}$
test_mod_wrap_and_backward[compile-overhead] 12.2716ms 10.5610ms 94.6879 Ops/s 87.7345 Ops/s $\textbf{\color{#35bf28}+7.93\%}$
test_seq_add[eager] 0.1809ms 95.2630μs 10.4972 KOps/s 10.7663 KOps/s $\color{#d91a1a}-2.50\%$
test_seq_add[compile] 0.1382ms 65.1759μs 15.3431 KOps/s 15.4203 KOps/s $\color{#d91a1a}-0.50\%$
test_seq_add[compile-overhead] 0.1304ms 64.6524μs 15.4673 KOps/s 15.6688 KOps/s $\color{#d91a1a}-1.29\%$
test_seq_wrap[eager] 0.6237ms 0.3988ms 2.5074 KOps/s 2.6421 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_seq_wrap[compile] 1.2107ms 0.2717ms 3.6805 KOps/s 3.7369 KOps/s $\color{#d91a1a}-1.51\%$
test_seq_wrap[compile-overhead] 1.2315ms 0.2710ms 3.6904 KOps/s 3.7753 KOps/s $\color{#d91a1a}-2.25\%$
test_func_call_runtime[False-eager] 0.9446ms 0.5309ms 1.8836 KOps/s 1.9678 KOps/s $\color{#d91a1a}-4.28\%$
test_func_call_runtime[False-compile] 1.0155ms 0.5031ms 1.9876 KOps/s 2.0193 KOps/s $\color{#d91a1a}-1.57\%$
test_func_call_runtime[False-compile-overhead] 0.8735ms 0.5043ms 1.9830 KOps/s 2.0066 KOps/s $\color{#d91a1a}-1.18\%$
test_func_call_runtime[True-eager] 1.2074ms 0.7620ms 1.3124 KOps/s 1.3584 KOps/s $\color{#d91a1a}-3.39\%$
test_func_call_runtime[True-compile] 0.9142ms 0.5172ms 1.9336 KOps/s 1.9845 KOps/s $\color{#d91a1a}-2.56\%$
test_func_call_runtime[True-compile-overhead] 0.6455ms 0.5165ms 1.9360 KOps/s 1.9711 KOps/s $\color{#d91a1a}-1.78\%$
test_func_call_cm_runtime[False-eager] 1.0419ms 0.5370ms 1.8621 KOps/s 1.9640 KOps/s $\textbf{\color{#d91a1a}-5.19\%}$
test_func_call_cm_runtime[False-compile] 0.6824ms 0.5062ms 1.9754 KOps/s 2.0015 KOps/s $\color{#d91a1a}-1.30\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6391ms 0.5064ms 1.9748 KOps/s 1.9969 KOps/s $\color{#d91a1a}-1.10\%$
test_func_call_cm_runtime[True-eager] 1.0749ms 0.9103ms 1.0986 KOps/s 1.1304 KOps/s $\color{#d91a1a}-2.82\%$
test_func_call_cm_runtime[True-compile] 1.1427ms 0.7589ms 1.3177 KOps/s 1.3622 KOps/s $\color{#d91a1a}-3.26\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8618ms 0.7535ms 1.3271 KOps/s 1.3589 KOps/s $\color{#d91a1a}-2.34\%$
test_vmap_func_call_cm_runtime[eager] 2.5718ms 1.9410ms 515.2000 Ops/s 525.1234 Ops/s $\color{#d91a1a}-1.89\%$
test_vmap_func_call_cm_runtime[compile] 2.6922ms 1.9984ms 500.4032 Ops/s 509.5431 Ops/s $\color{#d91a1a}-1.79\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.5975ms 1.9978ms 500.5463 Ops/s 513.3802 Ops/s $\color{#d91a1a}-2.50\%$
test_distributed 0.2711ms 0.1245ms 8.0346 KOps/s 7.8031 KOps/s $\color{#35bf28}+2.97\%$
test_tdmodule 70.3330μs 19.2502μs 51.9476 KOps/s 58.3347 KOps/s $\textbf{\color{#d91a1a}-10.95\%}$
test_tdmodule_dispatch 66.3140μs 38.4895μs 25.9811 KOps/s 29.1391 KOps/s $\textbf{\color{#d91a1a}-10.84\%}$
test_tdseq 49.3630μs 22.0910μs 45.2674 KOps/s 50.8114 KOps/s $\textbf{\color{#d91a1a}-10.91\%}$
test_tdseq_dispatch 79.7200μs 44.1328μs 22.6589 KOps/s 25.0570 KOps/s $\textbf{\color{#d91a1a}-9.57\%}$
test_instantiation_functorch 1.7339ms 1.5936ms 627.5120 Ops/s 641.6514 Ops/s $\color{#d91a1a}-2.20\%$
test_instantiation_td 2.1891ms 1.2216ms 818.5940 Ops/s 853.8919 Ops/s $\color{#d91a1a}-4.13\%$
test_exec_functorch 0.2867ms 0.1896ms 5.2746 KOps/s 5.3838 KOps/s $\color{#d91a1a}-2.03\%$
test_exec_functional_call 0.4379ms 0.1755ms 5.6966 KOps/s 5.8665 KOps/s $\color{#d91a1a}-2.90\%$
test_exec_td 0.3660ms 0.2001ms 4.9968 KOps/s 5.0085 KOps/s $\color{#d91a1a}-0.23\%$
test_exec_td_decorator 0.8959ms 0.2382ms 4.1974 KOps/s 4.3425 KOps/s $\color{#d91a1a}-3.34\%$
test_vmap_mlp_speed[True-True] 0.9286ms 0.6968ms 1.4351 KOps/s 1.4463 KOps/s $\color{#d91a1a}-0.77\%$
test_vmap_mlp_speed[True-False] 1.0887ms 0.6979ms 1.4328 KOps/s 1.4778 KOps/s $\color{#d91a1a}-3.04\%$
test_vmap_mlp_speed[False-True] 0.6168ms 0.5412ms 1.8479 KOps/s 1.8832 KOps/s $\color{#d91a1a}-1.88\%$
test_vmap_mlp_speed[False-False] 0.7616ms 0.5421ms 1.8448 KOps/s 1.8745 KOps/s $\color{#d91a1a}-1.59\%$
test_vmap_mlp_speed_decorator[True-True] 1.3197ms 0.6604ms 1.5142 KOps/s 1.5690 KOps/s $\color{#d91a1a}-3.49\%$
test_vmap_mlp_speed_decorator[True-False] 1.0489ms 0.6635ms 1.5072 KOps/s 1.5740 KOps/s $\color{#d91a1a}-4.24\%$
test_vmap_mlp_speed_decorator[False-True] 0.9190ms 0.5462ms 1.8309 KOps/s 1.8892 KOps/s $\color{#d91a1a}-3.08\%$
test_vmap_mlp_speed_decorator[False-False] 0.8697ms 0.5453ms 1.8340 KOps/s 1.8883 KOps/s $\color{#d91a1a}-2.88\%$
test_to_module_speed[True] 2.0614ms 1.4555ms 687.0623 Ops/s 718.1580 Ops/s $\color{#d91a1a}-4.33\%$
test_to_module_speed[False] 2.2297ms 1.4277ms 700.4133 Ops/s 731.4227 Ops/s $\color{#d91a1a}-4.24\%$
test_tc_init 98.3650μs 47.7083μs 20.9607 KOps/s 21.9232 KOps/s $\color{#d91a1a}-4.39\%$
test_tc_init_nested 0.1700ms 96.6748μs 10.3440 KOps/s 10.9682 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_tc_first_layer_tensor 37.0600μs 1.5824μs 631.9673 KOps/s 632.1583 KOps/s $\color{#d91a1a}-0.03\%$
test_tc_first_layer_nontensor 24.4550μs 4.9329μs 202.7212 KOps/s 210.0437 KOps/s $\color{#d91a1a}-3.49\%$
test_tc_second_layer_tensor 22.1620μs 2.9379μs 340.3789 KOps/s 347.3818 KOps/s $\color{#d91a1a}-2.02\%$
test_tc_second_layer_nontensor 28.8440μs 6.1208μs 163.3770 KOps/s 164.7053 KOps/s $\color{#d91a1a}-0.81\%$
test_unbind 0.4667s 13.1081ms 76.2886 Ops/s 75.7584 Ops/s $\color{#35bf28}+0.70\%$
test_full_like 8.7797ms 7.2483ms 137.9626 Ops/s 144.9704 Ops/s $\color{#d91a1a}-4.83\%$
test_zeros_like 3.1614ms 2.7655ms 361.5937 Ops/s 346.6296 Ops/s $\color{#35bf28}+4.32\%$
test_ones_like 3.5754ms 3.1884ms 313.6374 Ops/s 290.0997 Ops/s $\textbf{\color{#35bf28}+8.11\%}$
test_clone 5.6838ms 5.0558ms 197.7914 Ops/s 206.7255 Ops/s $\color{#d91a1a}-4.32\%$
test_squeeze 66.0740μs 12.8179μs 78.0158 KOps/s 78.8672 KOps/s $\color{#d91a1a}-1.08\%$
test_unsqueeze 0.1600ms 95.0781μs 10.5177 KOps/s 10.8456 KOps/s $\color{#d91a1a}-3.02\%$
test_split 0.5694ms 0.1959ms 5.1049 KOps/s 5.1302 KOps/s $\color{#d91a1a}-0.49\%$
test_permute 0.3564ms 0.2208ms 4.5300 KOps/s 4.4661 KOps/s $\color{#35bf28}+1.43\%$
test_stack 27.3719ms 24.6537ms 40.5619 Ops/s 41.3618 Ops/s $\color{#d91a1a}-1.93\%$
test_cat 30.6883ms 24.4095ms 40.9677 Ops/s 41.6513 Ops/s $\color{#d91a1a}-1.64\%$
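
The Change column in the table above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The bolded entries, which seem to be the ones counted in the "Improved" and "Worsened" totals, look like those with a change of roughly 5% or more in either direction. A minimal sketch of that arithmetic follows; the function names and the 5% threshold are assumptions inferred from the rows, not taken from the benchmark tooling itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD.

    Both values must be expressed in the same unit (e.g. both in KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; threshold_pct is an assumed cutoff inferred from the bolded rows."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (37.8165, 40.9303),
    "test_split": (484.8526, 449.4527),
    "test_items": (240.8323, 231.5482),
}

report = {}
for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    report[name] = (round(change, 2), classify(change))

for name, (change, label) in report.items():
    print(f"{name}: {change:+.2f}% ({label})")
```

Because the table prints rounded values, recomputing a change that is very close to zero from the displayed numbers may differ from the reported figure in the last digit. Rows that mix units, such as KOps/s against Ops/s, need converting to a common unit first.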


github-actions bot commented Oct 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1489ms 16.1417μs 61.9515 KOps/s 58.4373 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_plain_set_stack_nested 35.6500μs 16.2227μs 61.6422 KOps/s 57.7120 KOps/s $\textbf{\color{#35bf28}+6.81\%}$
test_plain_set_nested_inplace 41.5100μs 17.3759μs 57.5509 KOps/s 54.3644 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_plain_set_stack_nested_inplace 41.8800μs 17.1283μs 58.3830 KOps/s 54.9699 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_items 22.0100μs 2.8591μs 349.7640 KOps/s 348.8000 KOps/s $\color{#35bf28}+0.28\%$
test_items_nested 0.3748ms 0.3464ms 2.8866 KOps/s 2.8758 KOps/s $\color{#35bf28}+0.38\%$
test_items_nested_locked 0.4044ms 0.3479ms 2.8741 KOps/s 2.9106 KOps/s $\color{#d91a1a}-1.25\%$
test_items_nested_leaf 0.1107ms 69.1224μs 14.4671 KOps/s 14.4991 KOps/s $\color{#d91a1a}-0.22\%$
test_items_stack_nested 0.4012ms 0.3474ms 2.8786 KOps/s 2.8888 KOps/s $\color{#d91a1a}-0.35\%$
test_items_stack_nested_leaf 0.1071ms 70.1134μs 14.2626 KOps/s 14.1808 KOps/s $\color{#35bf28}+0.58\%$
test_items_stack_nested_locked 0.3930ms 0.3497ms 2.8593 KOps/s 2.8835 KOps/s $\color{#d91a1a}-0.84\%$
test_keys 43.1700μs 3.4201μs 292.3907 KOps/s 294.3286 KOps/s $\color{#d91a1a}-0.66\%$
test_keys_nested 96.4410μs 70.9389μs 14.0966 KOps/s 14.0830 KOps/s $\color{#35bf28}+0.10\%$
test_keys_nested_locked 2.5071ms 76.1701μs 13.1285 KOps/s 13.0032 KOps/s $\color{#35bf28}+0.96\%$
test_keys_nested_leaf 92.8710μs 61.1906μs 16.3424 KOps/s 16.3798 KOps/s $\color{#d91a1a}-0.23\%$
test_keys_stack_nested 97.1720μs 70.9975μs 14.0850 KOps/s 14.0844 KOps/s $+0.00\%$
test_keys_stack_nested_leaf 93.8720μs 62.3452μs 16.0397 KOps/s 15.7732 KOps/s $\color{#35bf28}+1.69\%$
test_keys_stack_nested_locked 0.1061ms 76.2984μs 13.1064 KOps/s 12.9315 KOps/s $\color{#35bf28}+1.35\%$
test_values 6.2502μs 0.8422μs 1.1874 MOps/s 1.1914 MOps/s $\color{#d91a1a}-0.33\%$
test_values_nested 81.8410μs 48.5034μs 20.6171 KOps/s 20.4728 KOps/s $\color{#35bf28}+0.70\%$
test_values_nested_locked 76.2710μs 50.2826μs 19.8876 KOps/s 19.8357 KOps/s $\color{#35bf28}+0.26\%$
test_values_nested_leaf 69.5210μs 42.6050μs 23.4714 KOps/s 23.3568 KOps/s $\color{#35bf28}+0.49\%$
test_values_stack_nested 77.8110μs 50.4327μs 19.8284 KOps/s 19.6929 KOps/s $\color{#35bf28}+0.69\%$
test_values_stack_nested_leaf 77.7810μs 43.3359μs 23.0755 KOps/s 22.6558 KOps/s $\color{#35bf28}+1.85\%$
test_values_stack_nested_locked 79.6810μs 51.0940μs 19.5718 KOps/s 19.1012 KOps/s $\color{#35bf28}+2.46\%$
test_membership 1.8295μs 0.4979μs 2.0085 MOps/s 1.9956 MOps/s $\color{#35bf28}+0.65\%$
test_membership_nested 13.0400μs 1.8358μs 544.7291 KOps/s 543.1131 KOps/s $\color{#35bf28}+0.30\%$
test_membership_nested_leaf 11.2533μs 1.8094μs 552.6595 KOps/s 554.2519 KOps/s $\color{#d91a1a}-0.29\%$
test_membership_stacked_nested 23.8700μs 1.8810μs 531.6355 KOps/s 537.0805 KOps/s $\color{#d91a1a}-1.01\%$
test_membership_stacked_nested_leaf 22.7410μs 1.9252μs 519.4236 KOps/s 535.7859 KOps/s $\color{#d91a1a}-3.05\%$
test_membership_nested_last 32.1010μs 2.9296μs 341.3405 KOps/s 345.5148 KOps/s $\color{#d91a1a}-1.21\%$
test_membership_nested_leaf_last 32.4210μs 2.9595μs 337.8955 KOps/s 347.3320 KOps/s $\color{#d91a1a}-2.72\%$
test_membership_stacked_nested_last 37.9900μs 5.6149μs 178.0970 KOps/s 288.9760 KOps/s $\textbf{\color{#d91a1a}-38.37\%}$
test_membership_stacked_nested_leaf_last 33.9900μs 5.5775μs 179.2902 KOps/s 286.8494 KOps/s $\textbf{\color{#d91a1a}-37.50\%}$
test_nested_getleaf 32.5900μs 6.0451μs 165.4239 KOps/s 165.6837 KOps/s $\color{#d91a1a}-0.16\%$
test_nested_get 36.7210μs 5.7204μs 174.8142 KOps/s 176.5633 KOps/s $\color{#d91a1a}-0.99\%$
test_stacked_getleaf 37.3300μs 5.9741μs 167.3898 KOps/s 165.9591 KOps/s $\color{#35bf28}+0.86\%$
test_stacked_get 31.8200μs 5.6263μs 177.7378 KOps/s 177.8625 KOps/s $\color{#d91a1a}-0.07\%$
test_nested_getitemleaf 29.4100μs 6.1413μs 162.8319 KOps/s 162.6734 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getitem 30.8210μs 5.7666μs 173.4115 KOps/s 175.4934 KOps/s $\color{#d91a1a}-1.19\%$
test_stacked_getitemleaf 36.0610μs 6.0703μs 164.7370 KOps/s 164.3580 KOps/s $\color{#35bf28}+0.23\%$
test_stacked_getitem 39.4300μs 5.6938μs 175.6299 KOps/s 176.2203 KOps/s $\color{#d91a1a}-0.33\%$
test_lock_nested 4.8927ms 0.4302ms 2.3245 KOps/s 2.3745 KOps/s $\color{#d91a1a}-2.11\%$
test_lock_stack_nested 0.4180ms 0.3802ms 2.6301 KOps/s 2.6640 KOps/s $\color{#d91a1a}-1.27\%$
test_unlock_nested 0.7546ms 0.3631ms 2.7542 KOps/s 2.8192 KOps/s $\color{#d91a1a}-2.30\%$
test_unlock_stack_nested 0.3559ms 0.3173ms 3.1516 KOps/s 3.1950 KOps/s $\color{#d91a1a}-1.36\%$
test_flatten_speed 0.1166ms 83.9826μs 11.9072 KOps/s 12.0913 KOps/s $\color{#d91a1a}-1.52\%$
test_unflatten_speed 0.3617ms 0.3245ms 3.0814 KOps/s 3.1131 KOps/s $\color{#d91a1a}-1.02\%$
test_common_ops 1.6489ms 1.3117ms 762.3637 Ops/s 778.9759 Ops/s $\color{#d91a1a}-2.13\%$
test_creation 26.4000μs 1.4647μs 682.7285 KOps/s 679.5280 KOps/s $\color{#35bf28}+0.47\%$
test_creation_empty 42.0700μs 14.5315μs 68.8160 KOps/s 62.5276 KOps/s $\textbf{\color{#35bf28}+10.06\%}$
test_creation_nested_1 65.0310μs 15.9073μs 62.8642 KOps/s 55.5439 KOps/s $\textbf{\color{#35bf28}+13.18\%}$
test_creation_nested_2 53.9010μs 18.7307μs 53.3883 KOps/s 49.1935 KOps/s $\textbf{\color{#35bf28}+8.53\%}$
test_clone 71.2410μs 28.0817μs 35.6103 KOps/s 35.1167 KOps/s $\color{#35bf28}+1.41\%$
test_getitem[int] 92.4959ms 23.3966μs 42.7413 KOps/s 66.7016 KOps/s $\textbf{\color{#d91a1a}-35.92\%}$
test_getitem[slice_int] 0.1183ms 27.3719μs 36.5339 KOps/s 37.7743 KOps/s $\color{#d91a1a}-3.28\%$
test_getitem[range] 0.2251ms 0.1081ms 9.2537 KOps/s 9.3656 KOps/s $\color{#d91a1a}-1.19\%$
test_getitem[tuple] 0.1158ms 24.0477μs 41.5840 KOps/s 43.3538 KOps/s $\color{#d91a1a}-4.08\%$
test_getitem[list] 0.1923ms 98.2641μs 10.1767 KOps/s 10.4652 KOps/s $\color{#d91a1a}-2.76\%$
test_setitem_dim[int] 93.1210μs 44.3907μs 22.5272 KOps/s 22.9853 KOps/s $\color{#d91a1a}-1.99\%$
test_setitem_dim[slice_int] 91.2310μs 66.6790μs 14.9972 KOps/s 15.0369 KOps/s $\color{#d91a1a}-0.26\%$
test_setitem_dim[range] 0.1610ms 0.1257ms 7.9556 KOps/s 8.0026 KOps/s $\color{#d91a1a}-0.59\%$
test_setitem_dim[tuple] 85.5510μs 60.2998μs 16.5838 KOps/s 16.7890 KOps/s $\color{#d91a1a}-1.22\%$
test_setitem 67.1910μs 40.9077μs 24.4453 KOps/s 23.6405 KOps/s $\color{#35bf28}+3.40\%$
test_set 82.8210μs 41.3750μs 24.1692 KOps/s 22.9953 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_set_shared 0.3515ms 52.8942μs 18.9057 KOps/s 17.7779 KOps/s $\textbf{\color{#35bf28}+6.34\%}$
test_update 82.6310μs 49.4037μs 20.2414 KOps/s 19.5711 KOps/s $\color{#35bf28}+3.42\%$
test_update_nested 98.0710μs 58.3691μs 17.1323 KOps/s 16.7788 KOps/s $\color{#35bf28}+2.11\%$
test_update__nested 96.0610μs 60.0040μs 16.6656 KOps/s 16.3018 KOps/s $\color{#35bf28}+2.23\%$
test_set_nested 74.6210μs 42.4665μs 23.5480 KOps/s 22.8792 KOps/s $\color{#35bf28}+2.92\%$
test_set_nested_new 96.2310μs 46.2171μs 21.6370 KOps/s 21.0158 KOps/s $\color{#35bf28}+2.96\%$
test_select 99.6210μs 59.5547μs 16.7913 KOps/s 16.4820 KOps/s $\color{#35bf28}+1.88\%$
test_select_nested 0.3466ms 42.5560μs 23.4985 KOps/s 23.6869 KOps/s $\color{#d91a1a}-0.80\%$
test_exclude_nested 0.1069ms 57.6251μs 17.3535 KOps/s 17.5553 KOps/s $\color{#d91a1a}-1.15\%$
test_empty[True] 0.2943ms 0.2549ms 3.9239 KOps/s 3.9121 KOps/s $\color{#35bf28}+0.30\%$
test_empty[False] 3.2490μs 0.7363μs 1.3582 MOps/s 1.3545 MOps/s $\color{#35bf28}+0.27\%$
test_to 52.0410μs 25.9858μs 38.4826 KOps/s 37.9993 KOps/s $\color{#35bf28}+1.27\%$
test_to_nonblocking 57.8810μs 24.6455μs 40.5754 KOps/s 40.3062 KOps/s $\color{#35bf28}+0.67\%$
test_unbind_speed 0.3140ms 0.2768ms 3.6130 KOps/s 3.6484 KOps/s $\color{#d91a1a}-0.97\%$
test_unbind_speed_stack0 0.3079ms 0.2738ms 3.6527 KOps/s 3.7524 KOps/s $\color{#d91a1a}-2.66\%$
test_unbind_speed_stack1 91.9810ms 0.7024ms 1.4237 KOps/s 1.4561 KOps/s $\color{#d91a1a}-2.23\%$
test_split 94.2540ms 2.1321ms 469.0250 Ops/s 475.7444 Ops/s $\color{#d91a1a}-1.41\%$
test_chunk 94.2802ms 2.1281ms 469.8962 Ops/s 475.1637 Ops/s $\color{#d91a1a}-1.11\%$
test_creation[device0] 0.3944ms 0.1243ms 8.0435 KOps/s 8.0346 KOps/s $\color{#35bf28}+0.11\%$
test_creation_from_tensor 0.3510ms 0.1296ms 7.7151 KOps/s 7.8308 KOps/s $\color{#d91a1a}-1.48\%$
test_add_one[memmap_tensor0] 0.2914ms 8.3668μs 119.5200 KOps/s 118.4392 KOps/s $\color{#35bf28}+0.91\%$
test_contiguous[memmap_tensor0] 32.7600μs 2.1168μs 472.4078 KOps/s 477.6385 KOps/s $\color{#d91a1a}-1.10\%$
test_stack[memmap_tensor0] 45.0010μs 6.4359μs 155.3793 KOps/s 158.2423 KOps/s $\color{#d91a1a}-1.81\%$
test_memmaptd_index 1.2591ms 0.4199ms 2.3814 KOps/s 2.4214 KOps/s $\color{#d91a1a}-1.65\%$
test_memmaptd_index_astensor 0.9348ms 0.4886ms 2.0468 KOps/s 2.0621 KOps/s $\color{#d91a1a}-0.74\%$
test_memmaptd_index_op 1.4167ms 0.9911ms 1.0090 KOps/s 979.4465 Ops/s $\color{#35bf28}+3.02\%$
test_serialize_model 0.1314s 0.1305s 7.6626 Ops/s 7.7008 Ops/s $\color{#d91a1a}-0.50\%$
test_serialize_model_pickle 1.3725s 1.2174s 0.8215 Ops/s 0.8237 Ops/s $\color{#d91a1a}-0.28\%$
test_serialize_weights 0.1301s 0.1295s 7.7233 Ops/s 7.6889 Ops/s $\color{#35bf28}+0.45\%$
test_serialize_weights_returnearly 0.2278s 56.5210ms 17.6925 Ops/s 18.0318 Ops/s $\color{#d91a1a}-1.88\%$
test_serialize_weights_pickle 1.4282s 1.2258s 0.8158 Ops/s 0.8206 Ops/s $\color{#d91a1a}-0.59\%$
test_reshape_pytree 67.8110μs 34.1943μs 29.2446 KOps/s 29.9168 KOps/s $\color{#d91a1a}-2.25\%$
test_reshape_td 81.8010μs 40.7243μs 24.5554 KOps/s 25.2506 KOps/s $\color{#d91a1a}-2.75\%$
test_view_pytree 69.2310μs 34.7347μs 28.7897 KOps/s 29.9431 KOps/s $\color{#d91a1a}-3.85\%$
test_view_td 79.0710μs 47.4942μs 21.0552 KOps/s 22.4955 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_unbind_pytree 66.0510μs 33.2401μs 30.0841 KOps/s 30.9366 KOps/s $\color{#d91a1a}-2.76\%$
test_unbind_td 0.5378ms 44.3298μs 22.5582 KOps/s 24.9962 KOps/s $\textbf{\color{#d91a1a}-9.75\%}$
test_split_pytree 0.5233ms 45.5658μs 21.9463 KOps/s 22.7359 KOps/s $\color{#d91a1a}-3.47\%$
test_split_td 0.1539ms 55.7483μs 17.9378 KOps/s 16.3080 KOps/s $\textbf{\color{#35bf28}+9.99\%}$
test_add_pytree 92.5310μs 56.1147μs 17.8207 KOps/s 18.6886 KOps/s $\color{#d91a1a}-4.64\%$
test_add_td 0.1555ms 94.4896μs 10.5832 KOps/s 10.9770 KOps/s $\color{#d91a1a}-3.59\%$
test_compile_add_one_nested[tensordict-compile] 0.2971ms 0.1597ms 6.2633 KOps/s 6.2578 KOps/s $\color{#35bf28}+0.09\%$
test_compile_add_one_nested[tensordict-eager] 0.2668ms 0.1673ms 5.9788 KOps/s 6.1568 KOps/s $\color{#d91a1a}-2.89\%$
test_compile_add_one_nested[pytree-compile] 1.0143ms 0.1415ms 7.0654 KOps/s 7.1066 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_nested[pytree-eager] 0.2702ms 0.1765ms 5.6659 KOps/s 5.4382 KOps/s $\color{#35bf28}+4.19\%$
test_compile_copy_nested[tensordict-compile] 0.1242ms 20.1079μs 49.7318 KOps/s 46.7270 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_compile_copy_nested[tensordict-eager] 0.1405ms 47.8500μs 20.8986 KOps/s 20.3777 KOps/s $\color{#35bf28}+2.56\%$
test_compile_copy_nested[pytree-compile] 0.2816ms 63.5416μs 15.7377 KOps/s 15.5139 KOps/s $\color{#35bf28}+1.44\%$
test_compile_copy_nested[pytree-eager] 0.1338ms 49.6463μs 20.1425 KOps/s 20.0131 KOps/s $\color{#35bf28}+0.65\%$
test_compile_add_one_flat[tensordict-compile] 0.4202ms 0.3120ms 3.2049 KOps/s 3.2212 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_add_one_flat[tensordict-eager] 0.3318ms 0.2369ms 4.2206 KOps/s 4.3532 KOps/s $\color{#d91a1a}-3.05\%$
test_compile_add_one_flat[tensorclass-compile] 0.2307ms 0.1246ms 8.0234 KOps/s 7.9244 KOps/s $\color{#35bf28}+1.25\%$
test_compile_add_one_flat[tensorclass-eager] 0.1637ms 64.1707μs 15.5834 KOps/s 15.1299 KOps/s $\color{#35bf28}+3.00\%$
test_compile_add_one_flat[pytree-compile] 0.4228ms 0.3128ms 3.1974 KOps/s 3.2165 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_one_flat[pytree-eager] 0.6867ms 0.5936ms 1.6846 KOps/s 1.6294 KOps/s $\color{#35bf28}+3.39\%$
test_compile_add_self_flat[tensordict-eager] 0.3840ms 0.2825ms 3.5392 KOps/s 3.6012 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_self_flat[tensordict-compile] 0.4280ms 0.3128ms 3.1968 KOps/s 3.1821 KOps/s $\color{#35bf28}+0.46\%$
test_compile_add_self_flat[tensorclass-eager] 0.1749ms 74.7114μs 13.3848 KOps/s 13.4886 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_add_self_flat[tensorclass-compile] 0.1829ms 0.1297ms 7.7111 KOps/s 7.8658 KOps/s $\color{#d91a1a}-1.97\%$
test_compile_add_self_flat[pytree-eager] 0.6185ms 0.5172ms 1.9336 KOps/s 1.8771 KOps/s $\color{#35bf28}+3.01\%$
test_compile_add_self_flat[pytree-compile] 0.4174ms 0.3121ms 3.2039 KOps/s 3.2231 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_copy_flat[tensordict-compile] 0.1194ms 17.6752μs 56.5763 KOps/s 51.5416 KOps/s $\textbf{\color{#35bf28}+9.77\%}$
test_compile_copy_flat[tensordict-eager] 0.1267ms 39.2814μs 25.4573 KOps/s 23.7342 KOps/s $\textbf{\color{#35bf28}+7.26\%}$
test_compile_copy_flat[pytree-compile] 0.1508ms 70.0508μs 14.2754 KOps/s 14.3238 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_copy_flat[pytree-eager] 0.1301ms 52.6951μs 18.9771 KOps/s 18.9315 KOps/s $\color{#35bf28}+0.24\%$
test_compile_assign_and_add[tensordict-compile] 2.2898ms 0.8000ms 1.2500 KOps/s 1.1406 KOps/s $\textbf{\color{#35bf28}+9.59\%}$
test_compile_assign_and_add[tensordict-eager] 3.1964ms 3.0740ms 325.3119 Ops/s 329.2184 Ops/s $\color{#d91a1a}-1.19\%$
test_compile_assign_and_add[pytree-compile] 2.2823ms 0.7998ms 1.2502 KOps/s 1.1566 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_compile_assign_and_add[pytree-eager] 3.3232ms 3.1717ms 315.2858 Ops/s 318.5201 Ops/s $\color{#d91a1a}-1.02\%$
test_compile_indexing[tensor-tensordict-compile] 0.1601ms 0.1090ms 9.1763 KOps/s 9.5085 KOps/s $\color{#d91a1a}-3.49\%$
test_compile_indexing[tensor-tensordict-eager] 0.2338ms 59.9290μs 16.6864 KOps/s 16.7876 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1842ms 0.1004ms 9.9649 KOps/s 9.9820 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1412ms 40.6674μs 24.5897 KOps/s 23.5637 KOps/s $\color{#35bf28}+4.35\%$
test_compile_indexing[tensor-pytree-compile] 0.1848ms 0.1012ms 9.8825 KOps/s 9.8365 KOps/s $\color{#35bf28}+0.47\%$
test_compile_indexing[tensor-pytree-eager] 0.1380ms 40.2142μs 24.8669 KOps/s 23.7332 KOps/s $\color{#35bf28}+4.78\%$
test_compile_indexing[slice-tensordict-compile] 0.1898ms 0.1332ms 7.5080 KOps/s 7.5102 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[slice-tensordict-eager] 0.1541ms 23.8995μs 41.8418 KOps/s 42.3625 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[slice-tensorclass-compile] 0.2384ms 0.1271ms 7.8681 KOps/s 7.8521 KOps/s $\color{#35bf28}+0.20\%$
test_compile_indexing[slice-tensorclass-eager] 0.1037ms 19.5834μs 51.0636 KOps/s 50.1381 KOps/s $\color{#35bf28}+1.85\%$
test_compile_indexing[slice-pytree-compile] 0.2113ms 0.1281ms 7.8072 KOps/s 7.8005 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[slice-pytree-eager] 70.2110μs 19.5631μs 51.1167 KOps/s 51.3059 KOps/s $\color{#d91a1a}-0.37\%$
test_compile_indexing[int-tensordict-compile] 0.2250ms 0.1345ms 7.4348 KOps/s 7.4638 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_indexing[int-tensordict-eager] 0.5175ms 23.7158μs 42.1660 KOps/s 41.6298 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[int-tensorclass-compile] 0.2178ms 0.1282ms 7.8025 KOps/s 7.8209 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_indexing[int-tensorclass-eager] 0.1093ms 23.9501μs 41.7534 KOps/s 50.5352 KOps/s $\textbf{\color{#d91a1a}-17.38\%}$
test_compile_indexing[int-pytree-compile] 0.2246ms 0.1321ms 7.5703 KOps/s 7.8288 KOps/s $\color{#d91a1a}-3.30\%$
test_compile_indexing[int-pytree-eager] 0.1526ms 19.6071μs 51.0020 KOps/s 40.9872 KOps/s $\textbf{\color{#35bf28}+24.43\%}$
test_mod_add[eager] 0.1314ms 30.9777μs 32.2813 KOps/s 31.1002 KOps/s $\color{#35bf28}+3.80\%$
test_mod_add[compile] 0.5138ms 70.1574μs 14.2537 KOps/s 14.3123 KOps/s $\color{#d91a1a}-0.41\%$
test_mod_add[compile-overhead] 0.2619ms 0.1387ms 7.2096 KOps/s 6.8230 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_mod_wrap[eager] 0.9883ms 0.7984ms 1.2525 KOps/s 1.2489 KOps/s $\color{#35bf28}+0.29\%$
test_mod_wrap[compile] 2.1132ms 0.8347ms 1.1980 KOps/s 1.2081 KOps/s $\color{#d91a1a}-0.83\%$
test_mod_wrap[compile-overhead] 4.8848ms 3.0787ms 324.8119 Ops/s 326.8383 Ops/s $\color{#d91a1a}-0.62\%$
test_mod_wrap_and_backward[eager] 4.6125ms 4.2018ms 237.9909 Ops/s 243.1239 Ops/s $\color{#d91a1a}-2.11\%$
test_mod_wrap_and_backward[compile] 4.5741ms 4.1308ms 242.0830 Ops/s 244.6037 Ops/s $\color{#d91a1a}-1.03\%$
test_mod_wrap_and_backward[compile-overhead] 1.3789ms 0.9696ms 1.0314 KOps/s 982.0624 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_seq_add[eager] 0.1638ms 95.4403μs 10.4777 KOps/s 10.2845 KOps/s $\color{#35bf28}+1.88\%$
test_seq_add[compile] 0.2344ms 79.4675μs 12.5838 KOps/s 12.0156 KOps/s $\color{#35bf28}+4.73\%$
test_seq_add[compile-overhead] 0.4968ms 0.1113ms 8.9857 KOps/s 8.8688 KOps/s $\color{#35bf28}+1.32\%$
test_seq_wrap[eager] 1.3082ms 0.9332ms 1.0716 KOps/s 1.0838 KOps/s $\color{#d91a1a}-1.13\%$
test_seq_wrap[compile] 1.2349ms 0.8468ms 1.1810 KOps/s 1.1952 KOps/s $\color{#d91a1a}-1.19\%$
test_seq_wrap[compile-overhead] 0.6084ms 0.2158ms 4.6341 KOps/s 4.5977 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_runtime[False-eager] 2.7724ms 2.3820ms 419.8135 Ops/s 421.5495 Ops/s $\color{#d91a1a}-0.41\%$
test_func_call_runtime[False-compile] 2.7811ms 2.3999ms 416.6877 Ops/s 422.6779 Ops/s $\color{#d91a1a}-1.42\%$
test_func_call_runtime[False-compile-overhead] 0.7556ms 0.3523ms 2.8386 KOps/s 2.8527 KOps/s $\color{#d91a1a}-0.49\%$
test_func_call_runtime[True-eager] 2.7214ms 2.5486ms 392.3697 Ops/s 398.8938 Ops/s $\color{#d91a1a}-1.64\%$
test_func_call_runtime[True-compile] 2.5880ms 2.4373ms 410.2950 Ops/s 418.7598 Ops/s $\color{#d91a1a}-2.02\%$
test_func_call_runtime[True-compile-overhead] 0.4239ms 0.3724ms 2.6856 KOps/s 2.6657 KOps/s $\color{#35bf28}+0.75\%$
test_func_call_cm_runtime[False-eager] 2.7780ms 2.3820ms 419.8170 Ops/s 422.6378 Ops/s $\color{#d91a1a}-0.67\%$
test_func_call_cm_runtime[False-compile] 2.8126ms 2.3948ms 417.5677 Ops/s 422.0705 Ops/s $\color{#d91a1a}-1.07\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4084ms 0.3549ms 2.8176 KOps/s 2.8437 KOps/s $\color{#d91a1a}-0.92\%$
test_func_call_cm_runtime[True-eager] 2.8499ms 2.6494ms 377.4495 Ops/s 382.7642 Ops/s $\color{#d91a1a}-1.39\%$
test_func_call_cm_runtime[True-compile] 2.6599ms 2.4766ms 403.7804 Ops/s 415.6312 Ops/s $\color{#d91a1a}-2.85\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5101ms 0.3998ms 2.5012 KOps/s 2.4989 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_func_call_cm_runtime[eager] 4.2191ms 3.7718ms 265.1249 Ops/s 266.8760 Ops/s $\color{#d91a1a}-0.66\%$
test_vmap_func_call_cm_runtime[compile] 2.8436ms 2.4492ms 408.2998 Ops/s 410.8199 Ops/s $\color{#d91a1a}-0.61\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5303ms 0.4006ms 2.4961 KOps/s 2.4762 KOps/s $\color{#35bf28}+0.81\%$
test_distributed 6.3285ms 0.2156ms 4.6385 KOps/s 8.9053 KOps/s $\textbf{\color{#d91a1a}-47.91\%}$
test_tdmodule 52.0110μs 14.0984μs 70.9302 KOps/s 67.5723 KOps/s $\color{#35bf28}+4.97\%$
test_tdmodule_dispatch 54.6810μs 27.5461μs 36.3028 KOps/s 34.1814 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_tdseq 35.4800μs 14.8496μs 67.3417 KOps/s 62.9167 KOps/s $\textbf{\color{#35bf28}+7.03\%}$
test_tdseq_dispatch 52.5510μs 30.4040μs 32.8904 KOps/s 30.9515 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_instantiation_functorch 1.9637ms 1.8014ms 555.1144 Ops/s 555.3556 Ops/s $\color{#d91a1a}-0.04\%$
test_instantiation_td 1.7056ms 1.1557ms 865.2400 Ops/s 852.9471 Ops/s $\color{#35bf28}+1.44\%$
test_exec_functorch 1.1037ms 0.9998ms 1.0002 KOps/s 1.0108 KOps/s $\color{#d91a1a}-1.05\%$
test_exec_functional_call 1.1243ms 1.0110ms 989.1038 Ops/s 1.0079 KOps/s $\color{#d91a1a}-1.87\%$
test_exec_td 1.1845ms 1.0397ms 961.8200 Ops/s 979.7895 Ops/s $\color{#d91a1a}-1.83\%$
test_exec_td_decorator 1.5996ms 1.0823ms 923.9622 Ops/s 953.7011 Ops/s $\color{#d91a1a}-3.12\%$
test_vmap_mlp_speed[True-True] 2.0782ms 1.2748ms 784.4134 Ops/s 798.1068 Ops/s $\color{#d91a1a}-1.72\%$
test_vmap_mlp_speed[True-False] 1.3599ms 1.2608ms 793.1615 Ops/s 796.1943 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_mlp_speed[False-True] 1.2389ms 1.1549ms 865.9086 Ops/s 872.0375 Ops/s $\color{#d91a1a}-0.70\%$
test_vmap_mlp_speed[False-False] 1.2371ms 1.1540ms 866.5188 Ops/s 868.9582 Ops/s $\color{#d91a1a}-0.28\%$
test_vmap_mlp_speed_decorator[True-True] 1.3729ms 1.2566ms 795.8073 Ops/s 814.7839 Ops/s $\color{#d91a1a}-2.33\%$
test_vmap_mlp_speed_decorator[True-False] 1.7842ms 1.2606ms 793.3041 Ops/s 814.5220 Ops/s $\color{#d91a1a}-2.60\%$
test_vmap_mlp_speed_decorator[False-True] 1.5069ms 1.1828ms 845.4760 Ops/s 872.6298 Ops/s $\color{#d91a1a}-3.11\%$
test_vmap_mlp_speed_decorator[False-False] 1.3433ms 1.1813ms 846.5034 Ops/s 872.4035 Ops/s $\color{#d91a1a}-2.97\%$
test_vmap_transformer_speed[True-True] 13.4719ms 13.3353ms 74.9887 Ops/s 76.6320 Ops/s $\color{#d91a1a}-2.14\%$
test_vmap_transformer_speed[True-False] 13.3946ms 13.3071ms 75.1478 Ops/s 76.8351 Ops/s $\color{#d91a1a}-2.20\%$
test_vmap_transformer_speed[False-True] 13.2649ms 13.1291ms 76.1669 Ops/s 78.3988 Ops/s $\color{#d91a1a}-2.85\%$
test_vmap_transformer_speed[False-False] 13.2927ms 13.1431ms 76.0855 Ops/s 78.5312 Ops/s $\color{#d91a1a}-3.11\%$
test_vmap_transformer_speed_decorator[True-True] 34.4970ms 34.3550ms 29.1078 Ops/s 29.9254 Ops/s $\color{#d91a1a}-2.73\%$
test_vmap_transformer_speed_decorator[True-False] 34.4677ms 34.3167ms 29.1403 Ops/s 29.8389 Ops/s $\color{#d91a1a}-2.34\%$
test_vmap_transformer_speed_decorator[False-True] 34.3954ms 34.1311ms 29.2988 Ops/s 30.1020 Ops/s $\color{#d91a1a}-2.67\%$
test_vmap_transformer_speed_decorator[False-False] 34.4002ms 34.1845ms 29.2530 Ops/s 30.1944 Ops/s $\color{#d91a1a}-3.12\%$
test_to_module_speed[True] 1.3399ms 0.9720ms 1.0288 KOps/s 1.0140 KOps/s $\color{#35bf28}+1.46\%$
test_to_module_speed[False] 1.3750ms 0.9536ms 1.0487 KOps/s 1.0491 KOps/s $\color{#d91a1a}-0.04\%$
test_tc_init 0.1113ms 31.5894μs 31.6562 KOps/s 29.9761 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_tc_init_nested 0.1163ms 65.4467μs 15.2796 KOps/s 14.3754 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_tc_first_layer_tensor 10.8159μs 0.6655μs 1.5026 MOps/s 1.4907 MOps/s $\color{#35bf28}+0.80\%$
test_tc_first_layer_nontensor 19.1610μs 2.1957μs 455.4258 KOps/s 455.9964 KOps/s $\color{#d91a1a}-0.13\%$
test_tc_second_layer_tensor 20.3903μs 1.3545μs 738.2763 KOps/s 732.2792 KOps/s $\color{#35bf28}+0.82\%$
test_tc_second_layer_nontensor 78.6910μs 2.8642μs 349.1427 KOps/s 344.6868 KOps/s $\color{#35bf28}+1.29\%$
test_unbind 0.1945s 12.1680ms 82.1829 Ops/s 91.3035 Ops/s $\textbf{\color{#d91a1a}-9.99\%}$
test_full_like 0.6565ms 0.5740ms 1.7423 KOps/s 1.7497 KOps/s $\color{#d91a1a}-0.42\%$
test_zeros_like 0.2687ms 0.1979ms 5.0523 KOps/s 5.0533 KOps/s $\color{#d91a1a}-0.02\%$
test_ones_like 0.2809ms 0.1978ms 5.0567 KOps/s 5.0545 KOps/s $\color{#35bf28}+0.04\%$
test_clone 0.4559ms 0.4137ms 2.4173 KOps/s 2.4158 KOps/s $\color{#35bf28}+0.06\%$
test_squeeze 39.7300μs 9.4073μs 106.3005 KOps/s 108.3317 KOps/s $\color{#d91a1a}-1.87\%$
test_unsqueeze 0.2787ms 70.1260μs 14.2600 KOps/s 13.9495 KOps/s $\color{#35bf28}+2.23\%$
test_split 0.2548ms 0.1498ms 6.6769 KOps/s 6.6782 KOps/s $\color{#d91a1a}-0.02\%$
test_permute 0.2633ms 0.1707ms 5.8570 KOps/s 5.8643 KOps/s $\color{#d91a1a}-0.12\%$
test_stack 1.2584ms 0.8481ms 1.1790 KOps/s 1.1633 KOps/s $\color{#35bf28}+1.35\%$
test_cat 1.2551ms 1.2316ms 811.9654 Ops/s 812.0526 Ops/s $\color{#d91a1a}-0.01\%$
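
Reading the table: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces three rows under that assumption. The helper name `relative_change` is hypothetical and is not part of the benchmark tooling. Both values must be in the same unit before comparing, since some rows mix KOps/s and Ops/s.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to the HEAD run.

    Assumption: this mirrors how the Change column is derived; the formula
    is inferred from the table values, not taken from the benchmark code.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) copied from the table, both in KOps/s.
rows = {
    "test_distributed": (4.6385, 8.9053),
    "test_set": (24.1692, 22.9953),
    "test_compile_indexing[int-pytree-eager]": (51.0020, 40.9872),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Under this reading, a negative value means the PR run was slower than HEAD for that benchmark. Several of the larger swings come in opposite pairs on similar tests (for example the two `int-*-eager` indexing rows), which suggests run-to-run noise should be ruled out before treating any single row as a regression.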
