[BugFix] Add checks and to_module to tensorclass accepted methods #1124

Merged: vmoens merged 1 commit into gh/vmoens/35/base from gh/vmoens/35/head on Dec 3, 2024

Conversation

vmoens
Contributor

@vmoens vmoens commented Dec 3, 2024

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 3, 2024

github-actions bot commented Dec 3, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: 33. Worsened: 11.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 37.4500μs | 17.0258μs | 58.7345 KOps/s | 52.7479 KOps/s | **+11.35%** |
| test_plain_set_stack_nested | 40.0240μs | 17.2340μs | 58.0248 KOps/s | 52.6685 KOps/s | **+10.17%** |
| test_plain_set_nested_inplace | 49.5020μs | 18.5169μs | 54.0047 KOps/s | 47.9791 KOps/s | **+12.56%** |
| test_plain_set_stack_nested_inplace | 73.1030μs | 18.3429μs | 54.5171 KOps/s | 47.9031 KOps/s | **+13.81%** |
| test_items | 18.1130μs | 4.0954μs | 244.1749 KOps/s | 242.2623 KOps/s | +0.79% |
| test_items_nested | 0.7111ms | 0.3945ms | 2.5351 KOps/s | 2.4953 KOps/s | +1.59% |
| test_items_nested_locked | 0.8630ms | 0.3958ms | 2.5265 KOps/s | 2.4868 KOps/s | +1.60% |
| test_items_nested_leaf | 0.1379ms | 70.9784μs | 14.0888 KOps/s | 13.9735 KOps/s | +0.82% |
| test_items_stack_nested | 0.4731ms | 0.3957ms | 2.5269 KOps/s | 2.4913 KOps/s | +1.43% |
| test_items_stack_nested_leaf | 0.1441ms | 74.3404μs | 13.4516 KOps/s | 14.0453 KOps/s | -4.23% |
| test_items_stack_nested_locked | 0.5474ms | 0.3982ms | 2.5111 KOps/s | 2.4935 KOps/s | +0.71% |
| test_keys | 20.8080μs | 3.5182μs | 284.2380 KOps/s | 272.5323 KOps/s | +4.30% |
| test_keys_nested | 0.2542ms | 0.1345ms | 7.4354 KOps/s | 7.1769 KOps/s | +3.60% |
| test_keys_nested_locked | 1.7585ms | 0.1399ms | 7.1473 KOps/s | 6.9024 KOps/s | +3.55% |
| test_keys_nested_leaf | 0.2255ms | 0.1167ms | 8.5664 KOps/s | 8.5435 KOps/s | +0.27% |
| test_keys_stack_nested | 0.2292ms | 0.1341ms | 7.4594 KOps/s | 7.3255 KOps/s | +1.83% |
| test_keys_stack_nested_leaf | 0.2037ms | 0.1145ms | 8.7371 KOps/s | 8.5558 KOps/s | +2.12% |
| test_keys_stack_nested_locked | 0.2678ms | 0.1392ms | 7.1813 KOps/s | 7.0046 KOps/s | +2.52% |
| test_values | 6.8046μs | 1.0262μs | 974.4582 KOps/s | 918.1887 KOps/s | **+6.13%** |
| test_values_nested | 0.1021ms | 55.2734μs | 18.0919 KOps/s | 17.7725 KOps/s | +1.80% |
| test_values_nested_locked | 0.1487ms | 55.9556μs | 17.8713 KOps/s | 17.7586 KOps/s | +0.63% |
| test_values_nested_leaf | 0.1141ms | 59.5442μs | 16.7942 KOps/s | 16.3693 KOps/s | +2.60% |
| test_values_stack_nested | 0.1141ms | 56.5788μs | 17.6745 KOps/s | 17.3761 KOps/s | +1.72% |
| test_values_stack_nested_leaf | 0.1164ms | 59.8313μs | 16.7137 KOps/s | 16.3186 KOps/s | +2.42% |
| test_values_stack_nested_locked | 0.1131ms | 57.4609μs | 17.4031 KOps/s | 17.8327 KOps/s | -2.41% |
| test_membership | 16.4310μs | 0.8915μs | 1.1217 MOps/s | 1.0895 MOps/s | +2.95% |
| test_membership_nested | 27.2810μs | 2.9570μs | 338.1812 KOps/s | 337.4686 KOps/s | +0.21% |
| test_membership_nested_leaf | 25.5180μs | 2.9780μs | 335.7994 KOps/s | 339.2059 KOps/s | -1.00% |
| test_membership_stacked_nested | 30.0960μs | 2.9854μs | 334.9617 KOps/s | 337.9795 KOps/s | -0.89% |
| test_membership_stacked_nested_leaf | 39.9040μs | 2.9579μs | 338.0804 KOps/s | 330.7299 KOps/s | +2.22% |
| test_membership_nested_last | 33.0120μs | 4.2962μs | 232.7617 KOps/s | 224.6322 KOps/s | +3.62% |
| test_membership_nested_leaf_last | 34.5250μs | 4.2917μs | 233.0091 KOps/s | 223.6956 KOps/s | +4.16% |
| test_membership_stacked_nested_last | 25.7280μs | 5.9200μs | 168.9199 KOps/s | 230.6234 KOps/s | **-26.76%** |
| test_membership_stacked_nested_leaf_last | 50.0730μs | 5.9150μs | 169.0612 KOps/s | 230.9921 KOps/s | **-26.81%** |
| test_nested_getleaf | 35.4760μs | 10.7446μs | 93.0699 KOps/s | 91.4708 KOps/s | +1.75% |
| test_nested_get | 35.3660μs | 10.2080μs | 97.9622 KOps/s | 95.8913 KOps/s | +2.16% |
| test_stacked_getleaf | 37.3400μs | 10.8304μs | 92.3324 KOps/s | 92.5684 KOps/s | -0.26% |
| test_stacked_get | 37.5200μs | 10.2378μs | 97.6772 KOps/s | 95.4963 KOps/s | +2.28% |
| test_nested_getitemleaf | 32.6810μs | 11.3021μs | 88.4793 KOps/s | 88.5926 KOps/s | -0.13% |
| test_nested_getitem | 53.7500μs | 10.6713μs | 93.7091 KOps/s | 94.3214 KOps/s | -0.65% |
| test_stacked_getitemleaf | 34.3640μs | 11.2300μs | 89.0472 KOps/s | 89.8695 KOps/s | -0.92% |
| test_stacked_getitem | 31.8700μs | 10.4209μs | 95.9610 KOps/s | 95.0110 KOps/s | +1.00% |
| test_lock_nested | 4.2806ms | 0.4418ms | 2.2636 KOps/s | 2.2599 KOps/s | +0.16% |
| test_lock_stack_nested | 0.7955ms | 0.4069ms | 2.4577 KOps/s | 2.4073 KOps/s | +2.10% |
| test_unlock_nested | 0.8039ms | 0.3537ms | 2.8269 KOps/s | 2.7669 KOps/s | +2.17% |
| test_unlock_stack_nested | 0.5685ms | 0.3248ms | 3.0789 KOps/s | 3.0051 KOps/s | +2.45% |
| test_flatten_speed | 0.1733ms | 93.2450μs | 10.7244 KOps/s | 10.6224 KOps/s | +0.96% |
| test_unflatten_speed | 5.6787ms | 0.4952ms | 2.0196 KOps/s | 2.0334 KOps/s | -0.68% |
| test_common_ops | 1.5430ms | 0.7405ms | 1.3505 KOps/s | 1.2430 KOps/s | **+8.65%** |
| test_creation | 32.7510μs | 2.0936μs | 477.6456 KOps/s | 444.0407 KOps/s | **+7.57%** |
| test_creation_empty | 30.8670μs | 9.3761μs | 106.6536 KOps/s | 77.6313 KOps/s | **+37.38%** |
| test_creation_nested_1 | 37.1790μs | 12.1225μs | 82.4912 KOps/s | 62.9145 KOps/s | **+31.12%** |
| test_creation_nested_2 | 56.8560μs | 16.5685μs | 60.3556 KOps/s | 49.5442 KOps/s | **+21.82%** |
| test_clone | 0.1024ms | 13.0624μs | 76.5553 KOps/s | 76.0180 KOps/s | +0.71% |
| test_getitem[int] | 1.2343ms | 12.5623μs | 79.6030 KOps/s | 80.6359 KOps/s | -1.28% |
| test_getitem[slice_int] | 0.1450ms | 24.2630μs | 41.2149 KOps/s | 41.6361 KOps/s | -1.01% |
| test_getitem[range] | 0.1864ms | 47.8343μs | 20.9055 KOps/s | 20.5761 KOps/s | +1.60% |
| test_getitem[tuple] | 0.1289ms | 19.7994μs | 50.5067 KOps/s | 51.3571 KOps/s | -1.66% |
| test_getitem[list] | 0.1755ms | 44.2354μs | 22.6063 KOps/s | 23.2569 KOps/s | -2.80% |
| test_setitem_dim[int] | 59.4410μs | 25.4192μs | 39.3403 KOps/s | 39.1646 KOps/s | +0.45% |
| test_setitem_dim[slice_int] | 96.7800μs | 52.0058μs | 19.2286 KOps/s | 18.7229 KOps/s | +2.70% |
| test_setitem_dim[range] | 0.1171ms | 72.2659μs | 13.8378 KOps/s | 13.4650 KOps/s | +2.77% |
| test_setitem_dim[tuple] | 96.2290μs | 40.3597μs | 24.7772 KOps/s | 24.1178 KOps/s | +2.73% |
| test_setitem | 93.1640μs | 19.4375μs | 51.4469 KOps/s | 46.7196 KOps/s | **+10.12%** |
| test_set | 69.3490μs | 18.7166μs | 53.4284 KOps/s | 47.6549 KOps/s | **+12.12%** |
| test_set_shared | 3.3601ms | 0.1677ms | 5.9627 KOps/s | 5.9092 KOps/s | +0.91% |
| test_update | 0.1262ms | 20.5300μs | 48.7091 KOps/s | 40.6462 KOps/s | **+19.84%** |
| test_update_nested | 0.1023ms | 31.0283μs | 32.2286 KOps/s | 28.3095 KOps/s | **+13.84%** |
| test_update__nested | 1.0136ms | 32.0699μs | 31.1819 KOps/s | 30.6235 KOps/s | +1.82% |
| test_set_nested | 71.7140μs | 20.5922μs | 48.5622 KOps/s | 43.0646 KOps/s | **+12.77%** |
| test_set_nested_new | 99.5760μs | 24.9288μs | 40.1143 KOps/s | 35.5442 KOps/s | **+12.86%** |
| test_select | 0.1187ms | 41.4179μs | 24.1441 KOps/s | 21.9770 KOps/s | **+9.86%** |
| test_select_nested | 0.1287ms | 59.0850μs | 16.9248 KOps/s | 16.5963 KOps/s | +1.98% |
| test_exclude_nested | 0.1567ms | 77.9125μs | 12.8349 KOps/s | 12.8045 KOps/s | +0.24% |
| test_empty[True] | 0.5752ms | 0.3799ms | 2.6322 KOps/s | 2.6082 KOps/s | +0.92% |
| test_empty[False] | 13.4650μs | 1.2624μs | 792.1563 KOps/s | 753.9427 KOps/s | **+5.07%** |
| test_unbind_speed | 0.3460ms | 0.2611ms | 3.8298 KOps/s | 3.7505 KOps/s | +2.11% |
| test_unbind_speed_stack0 | 0.3943ms | 0.2547ms | 3.9267 KOps/s | 3.8111 KOps/s | +3.03% |
| test_unbind_speed_stack1 | 0.1029s | 0.7543ms | 1.3258 KOps/s | 1.4105 KOps/s | **-6.00%** |
| test_split | 0.1026s | 1.7272ms | 578.9597 Ops/s | 585.5891 Ops/s | -1.13% |
| test_chunk | 0.1062s | 1.7346ms | 576.4926 Ops/s | 581.8272 Ops/s | -0.92% |
| test_consolidate_njt[False-None] | 8.5788ms | 7.9606ms | 125.6192 Ops/s | 124.1947 Ops/s | +1.15% |
| test_creation[device0] | 4.2820ms | 93.7170μs | 10.6704 KOps/s | 10.6598 KOps/s | +0.10% |
| test_creation_from_tensor | 0.2396ms | 93.6263μs | 10.6808 KOps/s | 10.5740 KOps/s | +1.01% |
| test_add_one[memmap_tensor0] | 0.1993ms | 5.0313μs | 198.7549 KOps/s | 210.0212 KOps/s | **-5.36%** |
| test_contiguous[memmap_tensor0] | 12.9240μs | 0.5075μs | 1.9704 MOps/s | 1.9079 MOps/s | +3.27% |
| test_stack[memmap_tensor0] | 31.8800μs | 3.6112μs | 276.9177 KOps/s | 298.6828 KOps/s | **-7.29%** |
| test_memmaptd_index | 1.0012ms | 0.2365ms | 4.2279 KOps/s | 4.2717 KOps/s | -1.03% |
| test_memmaptd_index_astensor | 0.5747ms | 0.3146ms | 3.1786 KOps/s | 3.1823 KOps/s | -0.11% |
| test_memmaptd_index_op | 1.2879ms | 0.5581ms | 1.7919 KOps/s | 1.6485 KOps/s | **+8.70%** |
| test_serialize_model | 0.1253s | 0.1138s | 8.7866 Ops/s | 7.3136 Ops/s | **+20.14%** |
| test_serialize_model_pickle | 0.5125s | 0.4046s | 2.4716 Ops/s | 2.5720 Ops/s | -3.90% |
| test_serialize_weights | 0.1214s | 0.1181s | 8.4698 Ops/s | 8.9325 Ops/s | **-5.18%** |
| test_serialize_weights_returnearly | 0.2576s | 0.1737s | 5.7558 Ops/s | 6.5750 Ops/s | **-12.46%** |
| test_serialize_weights_pickle | 0.5910s | 0.4834s | 2.0687 Ops/s | 2.2738 Ops/s | **-9.02%** |
| test_serialize_weights_filesystem | 0.1495s | 0.1414s | 7.0743 Ops/s | 6.9255 Ops/s | +2.15% |
| test_serialize_model_filesystem | 0.1675s | 0.1511s | 6.6186 Ops/s | 6.6226 Ops/s | -0.06% |
| test_reshape_pytree | 55.2030μs | 26.4363μs | 37.8267 KOps/s | 36.6140 KOps/s | +3.31% |
| test_reshape_td | 0.1013ms | 33.1161μs | 30.1968 KOps/s | 29.8631 KOps/s | +1.12% |
| test_view_pytree | 81.1720μs | 26.4854μs | 37.7567 KOps/s | 36.8772 KOps/s | +2.38% |
| test_view_td | 80.0290μs | 38.7510μs | 25.8058 KOps/s | 25.1178 KOps/s | +2.74% |
| test_unbind_pytree | 68.6580μs | 29.5914μs | 33.7936 KOps/s | 33.1448 KOps/s | +1.96% |
| test_unbind_td | 0.3277ms | 38.3005μs | 26.1093 KOps/s | 25.3616 KOps/s | +2.95% |
| test_split_pytree | 75.5710μs | 29.2411μs | 34.1985 KOps/s | 33.7016 KOps/s | +1.47% |
| test_split_td | 0.5080ms | 44.3731μs | 22.5362 KOps/s | 23.0237 KOps/s | -2.12% |
| test_add_pytree | 0.1006ms | 36.0835μs | 27.7135 KOps/s | 27.8359 KOps/s | -0.44% |
| test_add_td | 0.1282ms | 51.8719μs | 19.2783 KOps/s | 17.5982 KOps/s | **+9.55%** |
| test_compile_add_one_nested[tensordict-compile] | 0.1511ms | 61.7202μs | 16.2021 KOps/s | 16.6398 KOps/s | -2.63% |
| test_compile_add_one_nested[tensordict-eager] | 1.4092ms | 0.1646ms | 6.0759 KOps/s | 6.1360 KOps/s | -0.98% |
| test_compile_add_one_nested[pytree-compile] | 0.1053ms | 45.6860μs | 21.8885 KOps/s | 22.6698 KOps/s | -3.45% |
| test_compile_add_one_nested[pytree-eager] | 0.2250ms | 0.1186ms | 8.4306 KOps/s | 8.3878 KOps/s | +0.51% |
| test_compile_copy_nested[tensordict-compile] | 86.7510μs | 26.1579μs | 38.2293 KOps/s | 38.2124 KOps/s | +0.04% |
| test_compile_copy_nested[tensordict-eager] | 0.1096ms | 53.0442μs | 18.8522 KOps/s | 18.2603 KOps/s | +3.24% |
| test_compile_copy_nested[pytree-compile] | 0.1534ms | 79.1702μs | 12.6310 KOps/s | 12.3741 KOps/s | +2.08% |
| test_compile_copy_nested[pytree-eager] | 0.1633ms | 66.6592μs | 15.0017 KOps/s | 14.4148 KOps/s | +4.07% |
| test_compile_add_one_flat[tensordict-compile] | 0.1788ms | 0.1053ms | 9.5010 KOps/s | 9.7275 KOps/s | -2.33% |
| test_compile_add_one_flat[tensordict-eager] | 0.4019ms | 0.2003ms | 4.9920 KOps/s | 5.0507 KOps/s | -1.16% |
| test_compile_add_one_flat[tensorclass-compile] | 0.1110ms | 46.2041μs | 21.6431 KOps/s | 22.5141 KOps/s | -3.87% |
| test_compile_add_one_flat[tensorclass-eager] | 0.4860ms | 63.1533μs | 15.8345 KOps/s | 16.4584 KOps/s | -3.79% |
| test_compile_add_one_flat[pytree-compile] | 0.2480ms | 0.1051ms | 9.5179 KOps/s | 10.0067 KOps/s | -4.88% |
| test_compile_add_one_flat[pytree-eager] | 0.3596ms | 0.2011ms | 4.9737 KOps/s | 4.9914 KOps/s | -0.35% |
| test_compile_add_self_flat[tensordict-eager] | 0.4234ms | 0.2132ms | 4.6895 KOps/s | 4.7632 KOps/s | -1.55% |
| test_compile_add_self_flat[tensordict-compile] | 0.2220ms | 0.1080ms | 9.2630 KOps/s | 9.7636 KOps/s | **-5.13%** |
| test_compile_add_self_flat[tensorclass-eager] | 0.1659ms | 55.3298μs | 18.0734 KOps/s | 18.5281 KOps/s | -2.45% |
| test_compile_add_self_flat[tensorclass-compile] | 0.1025ms | 46.1543μs | 21.6664 KOps/s | 22.2983 KOps/s | -2.83% |
| test_compile_add_self_flat[pytree-eager] | 0.5878ms | 0.1605ms | 6.2297 KOps/s | 6.2623 KOps/s | -0.52% |
| test_compile_add_self_flat[pytree-compile] | 0.2222ms | 0.1039ms | 9.6269 KOps/s | 9.8828 KOps/s | -2.59% |
| test_compile_copy_flat[tensordict-compile] | 56.9460μs | 22.7425μs | 43.9705 KOps/s | 47.2003 KOps/s | **-6.84%** |
| test_compile_copy_flat[tensordict-eager] | 0.1281ms | 57.2661μs | 17.4623 KOps/s | 16.6210 KOps/s | **+5.06%** |
| test_compile_copy_flat[pytree-compile] | 0.1893ms | 80.4551μs | 12.4293 KOps/s | 12.4715 KOps/s | -0.34% |
| test_compile_copy_flat[pytree-eager] | 0.1278ms | 67.8155μs | 14.7459 KOps/s | 14.5552 KOps/s | +1.31% |
| test_compile_assign_and_add[tensordict-compile] | 0.3223ms | 0.2078ms | 4.8121 KOps/s | 4.9111 KOps/s | -2.02% |
| test_compile_assign_and_add[tensordict-eager] | 2.1768ms | 1.2817ms | 780.2344 Ops/s | 775.7503 Ops/s | +0.58% |
| test_compile_assign_and_add[pytree-compile] | 0.3124ms | 0.2085ms | 4.7969 KOps/s | 4.9878 KOps/s | -3.83% |
| test_compile_assign_and_add[pytree-eager] | 1.0054ms | 0.7835ms | 1.2763 KOps/s | 1.2914 KOps/s | -1.16% |
| test_compile_assign_and_add_stack[compile] | 0.5518ms | 0.4566ms | 2.1899 KOps/s | 2.2036 KOps/s | -0.62% |
| test_compile_assign_and_add_stack[eager] | 4.0958ms | 2.5702ms | 389.0732 Ops/s | 377.5060 Ops/s | +3.06% |
| test_compile_indexing[tensor-tensordict-compile] | 98.0330μs | 36.2893μs | 27.5564 KOps/s | 28.1301 KOps/s | -2.04% |
| test_compile_indexing[tensor-tensordict-eager] | 0.5254ms | 33.0743μs | 30.2350 KOps/s | 30.5851 KOps/s | -1.14% |
| test_compile_indexing[tensor-tensorclass-compile] | 71.1730μs | 29.3092μs | 34.1190 KOps/s | 35.0108 KOps/s | -2.55% |
| test_compile_indexing[tensor-tensorclass-eager] | 76.9330μs | 23.2270μs | 43.0533 KOps/s | 42.8700 KOps/s | +0.43% |
| test_compile_indexing[tensor-pytree-compile] | 79.3980μs | 30.8154μs | 32.4513 KOps/s | 33.5860 KOps/s | -3.38% |
| test_compile_indexing[tensor-pytree-eager] | 85.5390μs | 23.1366μs | 43.2215 KOps/s | 43.0619 KOps/s | +0.37% |
| test_compile_indexing[slice-tensordict-compile] | 0.1153ms | 52.3159μs | 19.1147 KOps/s | 19.4634 KOps/s | -1.79% |
| test_compile_indexing[slice-tensordict-eager] | 0.6064ms | 19.8987μs | 50.2546 KOps/s | 51.3307 KOps/s | -2.10% |
| test_compile_indexing[slice-tensorclass-compile] | 90.4890μs | 44.2683μs | 22.5895 KOps/s | 22.3964 KOps/s | +0.86% |
| test_compile_indexing[slice-tensorclass-eager] | 70.0810μs | 18.6409μs | 53.6454 KOps/s | 53.1633 KOps/s | +0.91% |
| test_compile_indexing[slice-pytree-compile] | 0.1163ms | 45.1071μs | 22.1695 KOps/s | 22.1928 KOps/s | -0.11% |
| test_compile_indexing[slice-pytree-eager] | 64.7110μs | 18.7871μs | 53.2279 KOps/s | 52.3683 KOps/s | +1.64% |
| test_compile_indexing[int-tensordict-compile] | 0.1120ms | 53.4093μs | 18.7233 KOps/s | 18.8574 KOps/s | -0.71% |
| test_compile_indexing[int-tensordict-eager] | 0.9027ms | 19.8317μs | 50.4243 KOps/s | 51.4601 KOps/s | -2.01% |
| test_compile_indexing[int-tensorclass-compile] | 0.1325ms | 44.8914μs | 22.2760 KOps/s | 22.2351 KOps/s | +0.18% |
| test_compile_indexing[int-tensorclass-eager] | 0.3128ms | 18.7036μs | 53.4657 KOps/s | 53.6346 KOps/s | -0.31% |
| test_compile_indexing[int-pytree-compile] | 0.1092ms | 45.1487μs | 22.1490 KOps/s | 22.0530 KOps/s | +0.44% |
| test_compile_indexing[int-pytree-eager] | 81.0810μs | 18.5433μs | 53.9278 KOps/s | 53.0167 KOps/s | +1.72% |
| test_mod_add[eager] | 0.1606ms | 33.4803μs | 29.8683 KOps/s | 29.0639 KOps/s | +2.77% |
| test_mod_add[compile] | 0.1267ms | 48.1558μs | 20.7659 KOps/s | 21.2196 KOps/s | -2.14% |
| test_mod_add[compile-overhead] | 0.1106ms | 48.0660μs | 20.8047 KOps/s | 21.0405 KOps/s | -1.12% |
| test_mod_wrap[eager] | 0.3552ms | 0.2244ms | 4.4561 KOps/s | 4.4522 KOps/s | +0.09% |
| test_mod_wrap[compile] | 0.2902ms | 0.2083ms | 4.8002 KOps/s | 4.7476 KOps/s | +1.11% |
| test_mod_wrap[compile-overhead] | 0.4047ms | 0.2132ms | 4.6895 KOps/s | 4.8976 KOps/s | -4.25% |
| test_mod_wrap_and_backward[eager] | 12.7439ms | 11.1289ms | 89.8558 Ops/s | 85.0455 Ops/s | **+5.66%** |
| test_mod_wrap_and_backward[compile] | 13.1055ms | 11.0733ms | 90.3070 Ops/s | 72.5835 Ops/s | **+24.42%** |
| test_mod_wrap_and_backward[compile-overhead] | 13.7246ms | 11.2135ms | 89.1780 Ops/s | 74.1538 Ops/s | **+20.26%** |
| test_seq_add[eager] | 0.2112ms | 0.1113ms | 8.9861 KOps/s | 8.7936 KOps/s | +2.19% |
| test_seq_add[compile] | 0.1193ms | 62.8215μs | 15.9181 KOps/s | 15.9947 KOps/s | -0.48% |
| test_seq_add[compile-overhead] | 0.1199ms | 61.9793μs | 16.1344 KOps/s | 16.6895 KOps/s | -3.33% |
| test_seq_wrap[eager] | 0.7434ms | 0.4389ms | 2.2787 KOps/s | 2.2150 KOps/s | +2.88% |
| test_seq_wrap[compile] | 0.4155ms | 0.2315ms | 4.3188 KOps/s | 4.3018 KOps/s | +0.40% |
| test_seq_wrap[compile-overhead] | 0.3729ms | 0.2312ms | 4.3259 KOps/s | 4.3944 KOps/s | -1.56% |
| test_func_call_runtime[False-eager] | 0.9365ms | 0.5700ms | 1.7545 KOps/s | 1.8060 KOps/s | -2.85% |
| test_func_call_runtime[False-compile] | 0.5148ms | 0.4293ms | 2.3295 KOps/s | 2.3429 KOps/s | -0.57% |
| test_func_call_runtime[False-compile-overhead] | 0.5965ms | 0.4304ms | 2.3235 KOps/s | 2.3419 KOps/s | -0.79% |
| test_func_call_runtime[True-eager] | 0.9142ms | 0.7662ms | 1.3052 KOps/s | 1.3150 KOps/s | -0.75% |
| test_func_call_runtime[True-compile] | 0.6648ms | 0.4687ms | 2.1334 KOps/s | 2.1577 KOps/s | -1.13% |
| test_func_call_runtime[True-compile-overhead] | 0.8519ms | 0.4724ms | 2.1170 KOps/s | 2.0789 KOps/s | +1.83% |
| test_func_call_cm_runtime[False-eager] | 0.9199ms | 0.5694ms | 1.7564 KOps/s | 1.8363 KOps/s | -4.35% |
| test_func_call_cm_runtime[False-compile] | 0.5700ms | 0.4298ms | 2.3264 KOps/s | 2.3547 KOps/s | -1.20% |
| test_func_call_cm_runtime[False-compile-overhead] | 0.8051ms | 0.4331ms | 2.3087 KOps/s | 2.3425 KOps/s | -1.44% |
| test_func_call_cm_runtime[True-eager] | 1.3621ms | 0.9060ms | 1.1038 KOps/s | 1.1027 KOps/s | +0.10% |
| test_func_call_cm_runtime[True-compile] | 0.6307ms | 0.4937ms | 2.0254 KOps/s | 2.0241 KOps/s | +0.06% |
| test_func_call_cm_runtime[True-compile-overhead] | 0.6448ms | 0.4942ms | 2.0237 KOps/s | 2.0327 KOps/s | -0.44% |
| test_vmap_func_call_cm_runtime[eager] | 2.4341ms | 1.8856ms | 530.3403 Ops/s | 526.7730 Ops/s | +0.68% |
| test_vmap_func_call_cm_runtime[compile] | 0.9370ms | 0.5222ms | 1.9148 KOps/s | 1.9540 KOps/s | -2.00% |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.8837ms | 0.5223ms | 1.9146 KOps/s | 1.9741 KOps/s | -3.02% |
| test_distributed | 0.2615ms | 0.1268ms | 7.8888 KOps/s | 7.8136 KOps/s | +0.96% |
| test_tdmodule | 0.1476ms | 25.8608μs | 38.6685 KOps/s | 36.9225 KOps/s | +4.73% |
| test_tdmodule_dispatch | 83.0350μs | 46.3987μs | 21.5523 KOps/s | 20.0987 KOps/s | **+7.23%** |
| test_tdseq | 41.2860μs | 25.5448μs | 39.1469 KOps/s | 36.2861 KOps/s | **+7.88%** |
| test_tdseq_dispatch | 78.6570μs | 48.4778μs | 20.6280 KOps/s | 18.7811 KOps/s | **+9.83%** |
| test_instantiation_functorch | 1.8556ms | 1.5340ms | 651.8959 Ops/s | 657.0488 Ops/s | -0.78% |
| test_exec_functorch | 0.4019ms | 0.1865ms | 5.3615 KOps/s | 5.5090 KOps/s | -2.68% |
| test_exec_functional_call | 0.2817ms | 0.1785ms | 5.6033 KOps/s | 5.6777 KOps/s | -1.31% |
| test_exec_td_decorator | 0.5285ms | 0.2348ms | 4.2595 KOps/s | 4.3500 KOps/s | -2.08% |
| test_vmap_mlp_speed_decorator[True-True] | 1.1284ms | 0.6491ms | 1.5407 KOps/s | 1.5532 KOps/s | -0.81% |
| test_vmap_mlp_speed_decorator[True-False] | 0.9813ms | 0.6579ms | 1.5199 KOps/s | 1.5580 KOps/s | -2.44% |
| test_vmap_mlp_speed_decorator[False-True] | 0.6991ms | 0.5213ms | 1.9184 KOps/s | 1.9378 KOps/s | -1.00% |
| test_vmap_mlp_speed_decorator[False-False] | 0.8274ms | 0.5234ms | 1.9106 KOps/s | 1.9507 KOps/s | -2.05% |
| test_to_module_speed[True] | 1.4780ms | 1.2837ms | 779.0081 Ops/s | 771.6343 Ops/s | +0.96% |
| test_to_module_speed[False] | 1.4180ms | 1.2537ms | 797.6661 Ops/s | 788.0908 Ops/s | +1.22% |
| test_tc_init | 92.5320μs | 44.3521μs | 22.5469 KOps/s | 21.0721 KOps/s | **+7.00%** |
| test_tc_init_nested | 0.1737ms | 91.4100μs | 10.9397 KOps/s | 10.2961 KOps/s | **+6.25%** |
| test_tc_first_layer_tensor | 25.5470μs | 1.5574μs | 642.0812 KOps/s | 666.3338 KOps/s | -3.64% |
| test_tc_first_layer_nontensor | 39.4440μs | 4.7193μs | 211.8945 KOps/s | 210.7385 KOps/s | +0.55% |
| test_tc_second_layer_tensor | 31.7670μs | 2.8599μs | 349.6664 KOps/s | 358.4796 KOps/s | -2.46% |
| test_tc_second_layer_nontensor | 33.1010μs | 6.0920μs | 164.1498 KOps/s | 161.2049 KOps/s | +1.83% |
| test_unbind | 0.2367s | 13.9643ms | 71.6112 Ops/s | 78.3597 Ops/s | **-8.61%** |
| test_full_like | 9.1208ms | 7.9085ms | 126.4465 Ops/s | 131.2055 Ops/s | -3.63% |
| test_zeros_like | 3.4658ms | 2.9561ms | 338.2804 Ops/s | 141.5701 Ops/s | **+138.95%** |
| test_ones_like | 12.1109ms | 6.3150ms | 158.3542 Ops/s | 133.7445 Ops/s | **+18.40%** |
| test_clone | 15.7974ms | 8.2543ms | 121.1488 Ops/s | 108.3381 Ops/s | **+11.82%** |
| test_squeeze | 74.9790μs | 11.8126μs | 84.6552 KOps/s | 81.9765 KOps/s | +3.27% |
| test_unsqueeze | 0.2018ms | 90.4912μs | 11.0508 KOps/s | 11.0686 KOps/s | -0.16% |
| test_split | 0.5026ms | 0.1903ms | 5.2540 KOps/s | 5.1659 KOps/s | +1.71% |
| test_permute | 0.3125ms | 0.2189ms | 4.5687 KOps/s | 4.4737 KOps/s | +2.12% |
| test_stack | 33.8552ms | 25.9890ms | 38.4778 Ops/s | 38.3691 Ops/s | +0.28% |
| test_cat | 32.8985ms | 25.8968ms | 38.6148 Ops/s | 38.9478 Ops/s | -0.85% |
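
The Change column appears to be the relative difference between the Ops value measured on this PR and the Ops value on repo HEAD. The sketch below is not the benchmark bot's actual code; it is a minimal reconstruction under that assumption, and the helper names `parse_ops` and `relative_change` are hypothetical. It reproduces the reported figure for `test_plain_set_nested` from the two throughput values in the table.

```python
# Assumed unit scales for the throughput strings that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '58.7345 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_on_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD (assumed formula)."""
    new = parse_ops(ops)
    base = parse_ops(ops_on_head)
    return (new / base - 1.0) * 100.0


# Values copied from the test_plain_set_nested row.
print(f"{relative_change('58.7345 KOps/s', '52.7479 KOps/s'):+.2f}%")  # +11.35%
```

Because the table shows rounded throughput values, recomputed percentages may differ from the reported ones in the last digit for some rows.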

@vmoens vmoens merged commit 31eb86f into gh/vmoens/35/base Dec 3, 2024
50 of 53 checks passed
vmoens added a commit that referenced this pull request Dec 3, 2024
ghstack-source-id: 5a1cd0d2ac9a0880111f503fc9cb12519d85ef42
Pull Request resolved: #1124
@vmoens vmoens deleted the gh/vmoens/35/head branch December 3, 2024 15:09
@vmoens vmoens added the bug label Dec 3, 2024