[CUDA] Refactor GroupNorm and add common vectorize implementation #19158
Azure Pipelines / orttraining-ortmodule-distributed (ORTModuleDistributedTest Onnxruntime_Linux_GPU_ORTModule_Distributed_Test)
succeeded
Jan 26, 2024 in 56m 51s