Environment
Hardware Environment(Ascend/GPU/CPU): CPU/GPU
Software Environment:
OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
GCC/Compiler version (if compiled from source):
Describe the current behavior
Unlike other frameworks, MindSpore offers no flexible way to convert the data types of a model's parameters, which limits its usefulness in several scenarios. This design has several flaws:
Limited Memory Flexibility: Different data types require different amounts of memory, so the ability to switch types helps optimize memory usage on constrained devices (see the sketch after this list). A fixed data type can lead to unnecessary memory consumption or prevent larger models from being loaded on memory-limited devices.
Reduced Compatibility and Portability: In multi-framework environments (e.g., converting models between PyTorch and MindSpore), restrictions on data types increase the difficulty of model transfer. This can lead to inconsistent model performance across platforms, affecting model accuracy and reliability.
Lowered User Experience: Developers often need to adjust data types based on specific task requirements, especially when balancing accuracy and performance. Without flexible data type conversion, users face limitations in training and deploying models, reducing the overall usability of the framework.
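As a rough illustration of the memory point above, the snippet below (a minimal NumPy-only sketch, not MindSpore-specific) shows how the storage cost of the same weight matrix changes with its data type:

import numpy as np

# The same 1000x1000 weight matrix stored at different precisions.
weights = np.random.randn(1000, 1000)

for dtype in (np.float16, np.float32, np.float64):
    cast = weights.astype(dtype)
    # nbytes reports the raw buffer size: roughly 2 MB, 4 MB, and 8 MB.
    print(dtype.__name__, cast.nbytes / (1024 * 1024), "MB")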
Describe the expected behavior
Like other frameworks (e.g., PyTorch, TensorFlow, PaddlePaddle), MindSpore should allow the data type of a model's parameters to be converted, for example to float16 or float64.
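For comparison, this is the behavior the report expects, shown here with standard PyTorch calls (a minimal sketch for illustration only, not part of the original report):

import torch

model = torch.nn.Linear(4, 2)

# Casting the whole module also casts its parameters in place.
model = model.half()
print(next(model.parameters()).dtype)   # torch.float16

model = model.double()
print(next(model.parameters()).dtype)   # torch.float64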
Steps to reproduce the issue
import mindspore
import numpy as np
import random

class Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT(mindspore.nn.Cell):
    def __init__(self):
        super(Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT, self).__init__()
        self.conv1_mutated = mindspore.nn.Conv2d(in_channels=1, out_channels=6, kernel_size=(8, 8), stride=(1, 1), pad_mode="pad", padding=(0, 0, 0, 0), dilation=(1, 1), group=1, has_bias=True)
        self.tail_flatten = mindspore.nn.Flatten(start_dim=1, end_dim=-1)
        self.tail_fc = mindspore.nn.Dense(in_channels=2646, out_channels=10)

    def construct(self, input):
        conv1_output = self.conv1_mutated(input)
        tail_flatten_output = self.tail_flatten(conv1_output)
        tail_fc_output = self.tail_fc(tail_flatten_output)
        return tail_fc_output

def set_mindspore_params(ms_model, init_params):
    import mindspore
    for name, param in ms_model.parameters_and_names():
        if name in init_params:
            target_params = init_params[name]
            if len(target_params.shape) == 2:
                target_params = init_params[name].T
            if str(target_params.dtype).endswith('float16'):
                dtype = mindspore.float16
            elif str(target_params.dtype).endswith('float64'):
                dtype = mindspore.float64
            else:
                dtype = mindspore.float32
            param.set_data(mindspore.Tensor(target_params, dtype))
    return ms_model

def set_paddle_params(paddle_model, init_params):
    import paddle
    for name, param in paddle_model.named_parameters():
        if name in init_params:
            param_data = init_params[name]
            if "weight" in name:
                if len(param_data.shape) == 2:
                    if param_data.shape == (param.shape[1], param.shape[0]):
                        param_data = param_data.T  # transpose the matrix
            param.set_value(paddle.to_tensor(param_data, dtype=param_data.dtype, place=param.place))
    return paddle_model

# tf can set the data type of the model by changing the dtype of the parameters
def tf_change_model_dtype(model, is_gpu):
    import tensorflow as tf
    dtype = "float16" if is_gpu else "float64"
    for variable in model.variables:
        new_variable = tf.cast(variable, dtype)
        model.variables.append(new_variable)
    return model

# torch can set the data type of the model through function calls
def torch_change_model_dtype(model, is_gpu):
    if not is_gpu:
        model = model.double()
    else:
        model = model.half()
    return model

# paddle can set the data type of the model by setting the default dtype
def paddle_change_model_dtype(model_class, is_gpu, init_params):
    import paddle
    paddle.set_device('gpu:1') if is_gpu else paddle.set_device('cpu')
    paddle.set_default_dtype('float16') if is_gpu else paddle.set_default_dtype('float64')
    model = model_class()
    new_init_params = {}
    if is_gpu:
        for name, param in init_params.items():
            new_init_params[name] = param.astype('float16')
    else:
        for name, param in init_params.items():
            new_init_params[name] = param.astype('float64')
    model = set_paddle_params(model, new_init_params)
    paddle.set_default_dtype('float32')
    return model

# When we try all three methods, we cannot change the data type of the ms model
def ms_change_model_dtype(model_class, is_gpu, init_params):
    import mindspore
    mindspore.context.set_context(device_target=('GPU' if is_gpu else 'CPU'))
    model = model_class
    new_init_params = {}
    for name, param in init_params.items():
        new_init_params[name] = param.astype('float16')
    model = set_mindspore_params(model, new_init_params)
    return model

def get_mindspore_params(model):
    params = {}
    for name, param in model.parameters_and_names():
        target_params = param.numpy()
        if len(target_params.shape) == 2:
            target_params = target_params.T
        params[name] = target_params
    return params

ms_model = Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT()
ms_input = mindspore.Tensor(np.random.randn(1, 1, 28, 28).astype(np.float16))
init_params = get_mindspore_params(ms_model)
ms_model = ms_change_model_dtype(ms_model, False, init_params)
ms_model(ms_input)
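A quick way to see whether the cast above took effect is to print the parameter dtypes after set_mindspore_params has run; this is a small sketch added for illustration, not part of the original report:

# Inspect the parameter dtypes after the attempted conversion. If MindSpore
# honored the float16 cast the way other frameworks do, every parameter
# would report Float16 here.
for name, param in ms_model.parameters_and_names():
    print(name, param.dtype)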