MindSpore cannot convert the data types of model parameters #313

Open
PhyllisJi opened this issue Oct 31, 2024 · 0 comments
Environment

Hardware Environment (Ascend/GPU/CPU): CPU/GPU

Software Environment:

  • MindSpore version (source or binary): 2.2.14
  • Python version (e.g., Python 3.7.5): 3.8
  • OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
  • GCC/Compiler version (if compiled from source):

Describe the current behavior

Unlike other mainstream frameworks, MindSpore offers no flexible way to convert the data types of model parameters after a model has been built. This design has several drawbacks:

  1. Limited Memory Flexibility: Different data types require different amounts of memory, and the ability to switch types helps optimize memory usage on constrained devices. A fixed data type can lead to unnecessary memory consumption or prevent larger models from being loaded on memory-limited devices (see the sketch after this list).

  2. Reduced Compatibility and Portability: In multi-framework environments (e.g., converting models between PyTorch and MindSpore), restrictions on data types make model transfer harder. This can lead to inconsistent model behavior across platforms, affecting accuracy and reliability.

  3. Poorer User Experience: Developers often need to adjust data types to the task at hand, especially when balancing accuracy against performance. Without flexible data type conversion, users are limited in how they can train and deploy models, reducing the overall usability of the framework.
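
As a concrete illustration of point 1, here is a minimal NumPy-only sketch (not part of the original report) of how the parameter dtype drives memory footprint; the shape matches the conv kernel in the reproduction below:

import numpy as np

weights = np.random.randn(6, 1, 8, 8).astype(np.float32)
print(weights.nbytes)                     # 1536 bytes at float32
print(weights.astype(np.float16).nbytes)  # 768 bytes: half the memory
print(weights.astype(np.float64).nbytes)  # 3072 bytes: double the memory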

Describe the expected behavior

Parameter data types should be convertible after model construction, as in other mainstream frameworks (for example, PyTorch's model.half() / model.double()).
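
For example, a minimal sketch of the PyTorch behavior being compared against (not part of the original report):

import torch

net = torch.nn.Linear(4, 2)
print(next(net.parameters()).dtype)  # torch.float32
net = net.double()                   # a single call converts every parameter
print(next(net.parameters()).dtype)  # torch.float64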

Steps to reproduce the issue

import mindspore
import numpy as np
import random


class Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT(mindspore.nn.Cell):
    def __init__(self):
        super(Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT, self).__init__()
        self.conv1_mutated = mindspore.nn.Conv2d(in_channels=1, out_channels=6, kernel_size=(8, 8), stride=(1, 1), pad_mode="pad", padding=(0, 0, 0, 0), dilation=(1, 1), group=1, has_bias=True)
        self.tail_flatten = mindspore.nn.Flatten(start_dim=1, end_dim=-1)
        self.tail_fc = mindspore.nn.Dense(in_channels=2646, out_channels=10)

    def construct(self, input):
        conv1_output = self.conv1_mutated(input)
        tail_flatten_output = self.tail_flatten(conv1_output)
        tail_fc_output = self.tail_fc(tail_flatten_output)

        return tail_fc_output


def set_mindspore_params(ms_model, init_params):
    import mindspore

    for name, param in ms_model.parameters_and_names():
        if name in init_params:
            target_params = init_params[name]
            if len(target_params.shape) == 2:
                target_params = init_params[name].T

            if str(target_params.dtype).endswith('float16'):
                dtype = mindspore.float16
            elif str(target_params.dtype).endswith('float64'):
                dtype = mindspore.float64
            else:
                dtype = mindspore.float32

            param.set_data(mindspore.Tensor(target_params, dtype))
    return ms_model
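

# Helper (not part of the original report): inspect whether a conversion
# attempt actually changed the stored parameter dtypes.
def print_param_dtypes(model):
    for name, param in model.parameters_and_names():
        print(name, param.dtype)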


def set_paddle_params(paddle_model, init_params):
    import paddle

    for name, param in paddle_model.named_parameters():
        if name in init_params:
            param_data = init_params[name]
            if "weight" in name:
                if len(param_data.shape) == 2:
                    if param_data.shape == (param.shape[1], param.shape[0]):
                        param_data = param_data.T  # transpose the matrix to match the expected layout
            param.set_value(paddle.to_tensor(param_data, dtype=param_data.dtype, place=param.place))
    return paddle_model


# tf can produce parameters in a new dtype by casting each variable
def tf_change_model_dtype(model, is_gpu):
    import tensorflow as tf

    dtype = "float16" if is_gpu else "float64"
    # tf.cast returns a new tensor in the target dtype; appending to the list
    # returned by the model.variables property does not modify the model, so
    # the cast values are collected and returned alongside the model.
    cast_variables = [tf.cast(variable, dtype) for variable in model.variables]
    return model, cast_variables


# torch can set the data type of the model through function calls
def torch_change_model_dtype(model, is_gpu):
    if not is_gpu:
        model = model.double()
    else:
        model = model.half()

    return model


# paddle can set the data type of the model by changing the default dtype before construction
def paddle_change_model_dtype(model_class, is_gpu, init_params):
    import paddle

    paddle.set_device('gpu:1' if is_gpu else 'cpu')
    paddle.set_default_dtype('float16' if is_gpu else 'float64')
    model = model_class()
    target_dtype = 'float16' if is_gpu else 'float64'
    new_init_params = {name: param.astype(target_dtype)
                       for name, param in init_params.items()}
    model = set_paddle_params(model, new_init_params)
    paddle.set_default_dtype('float32')  # restore the default

    return model


# When we try the same approaches, we cannot change the data type of the MindSpore model
def ms_change_model_dtype(model, is_gpu, init_params):
    import mindspore

    mindspore.context.set_context(device_target='GPU' if is_gpu else 'CPU')
    # Cast every saved parameter to float16 and try to write it back.
    new_init_params = {name: param.astype('float16')
                       for name, param in init_params.items()}
    model = set_mindspore_params(model, new_init_params)
    return model


def get_mindspore_params(model):
    params = {}
    for name, param in model.parameters_and_names():
        target_params = param.numpy()
        if len(target_params.shape) == 2:
            # Transpose 2-D weights to the layout used by the other frameworks.
            target_params = target_params.T
        params[name] = target_params
    return params


ms_model = Model_ObroUQ4fAbnFbmOInLYaYRvj8vfg1MYT()
ms_input = mindspore.Tensor(np.random.randn(1, 1, 28, 28).astype(np.float16))

init_params = get_mindspore_params(ms_model)
ms_model = ms_change_model_dtype(ms_model, False, init_params)
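# Check (not in the original report): per this report, the parameters are
# expected to still be float32 after the conversion attempt, so the float16
# forward pass below fails with a dtype mismatch.
print_param_dtypes(ms_model)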
ms_model(ms_input)