How to load multiple decoders for a T5 model #483

Open
KyrieIrving24 opened this issue Jul 18, 2022 · 1 comment

Comments

@KyrieIrving24

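Please provide as much of the following information as possible when asking a question:

Basic information

  • Operating system:
  • Python version: 3.6
  • Tensorflow version: 1.15
  • Keras version: 2.3.1
  • bert4keras version: 0.11.3
  • Pure keras or tf.keras:
  • Pretrained model loaded: mt5.1.1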

Core code

import tensorflow as tf


class Multi_decoder(tf.keras.Model):
    """Wraps a T5 encoder and a T5 decoder in a subclassed tf.keras model."""

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        encoder_input, decoder_input = inputs
        # The encoder is expected to return both its encodings and its masks.
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
        decoder_outputs = self.decoder([decoder_input, encoder_encodings, encoder_masks])
        return decoder_outputs

Output

 Traceback (most recent call last):
  File "call.py", line 175, in <module>
    model.fit(x=[batch_t_token_ids, batch_p_token_ids], y=batch_p_token_ids, batch_size=batch_size, epochs=epochs, callbacks=[evaluator])
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    use_multiprocessing=use_multiprocessing)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 643, in fit
    shuffle=shuffle)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2418, in _standardize_user_data
    all_inputs, y_input, dict_inputs = self._build_model_with_inputs(x, y)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2621, in _build_model_with_inputs
    self._set_inputs(cast_inputs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2708, in _set_inputs
    outputs = self(inputs, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
NameError: in converted code:

    call.py:23 call  *
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/base_layer.py:506 __call__  *
        output_shape = self.compute_output_shape(input_shape)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/network.py:656 compute_output_shape  *
        output_shape = layer.compute_output_shape(
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:173 compute_output_shape  *
        output_shape = self._compute_elemwise_op_output_shape(output_shape,
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:50 _compute_elemwise_op_output_shape  *
        for i, j in zip(shape1[-len(shape2):], shape2):
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:339 for_stmt
        return _py_for_stmt(iter_, extra_test, body, get_state, set_state, init_vars)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:348 _py_for_stmt
        if extra_test is not None and not extra_test(*state):
    /var/folders/v_/m84qz0751dv95zzxzwll27840000gp/T/tmpnbr9tvxs.py:158 extra_test
        return ag__.not_(do_return_2)

    NameError: free variable 'do_return_2' referenced before assignment in enclosing scope
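
The traceback above contains frames from two different Keras installations: `tensorflow_core/python/keras/` (tf.keras) and `site-packages/keras/` (standalone Keras). The following is a small sketch for spotting that kind of mix in a traceback string. The helper names and the shortened sample paths are hypothetical, and the check only looks at file paths, so treat the result as a hint rather than a diagnosis.

```python
import re

# Frames from standalone Keras live under site-packages/keras/, while
# tf.keras frames in TF 1.15 live under tensorflow_core/python/keras/.
STANDALONE_KERAS = re.compile(r"site-packages/keras/")
TF_KERAS = re.compile(r"tensorflow_core/python/keras/")


def keras_flavors(traceback_text):
    """Return which Keras implementations appear in a traceback string."""
    flavors = set()
    for line in traceback_text.splitlines():
        if STANDALONE_KERAS.search(line):
            flavors.add("keras")
        if TF_KERAS.search(line):
            flavors.add("tf.keras")
    return flavors


def mixes_keras_flavors(traceback_text):
    """True when both standalone Keras and tf.keras frames are present."""
    return keras_flavors(traceback_text) == {"keras", "tf.keras"}


# Hypothetical, shortened paths modelled on the traceback in this issue.
sample = """\
  File "/env/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    /env/lib/python3.6/site-packages/keras/engine/base_layer.py:506 __call__  *
"""
print(mixes_keras_flavors(sample))
```

If both flavors show up, it may be worth checking that the wrapper model and the layers it calls come from the same Keras implementation.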

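What I tried

Hello, I would like to try multiple decoders based on T5, that is, take T5 apart and duplicate the decoder several times. My idea is to load multiple decoders through build_transformer_model. For now there is still only a single decoder, and even that already fails to run. Can this be achieved with code written this way?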

@bojone
Owner

bojone commented Jul 21, 2022

Judging by the error message, this does not seem to be related to the model implementation?
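
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the frames under `site-packages/keras/` suggest that bert4keras is running on standalone Keras, while `Multi_decoder` subclasses `tf.keras.Model`. bert4keras reads the `TF_KERAS` environment variable at import time to switch to tf.keras, so a minimal sketch of keeping both sides on the same backend could look like this. The `backend_name` variable is hypothetical, and the `try`/`except` guard is only there so the snippet also runs where bert4keras is not installed.

```python
import os

# Must be set BEFORE bert4keras is imported: the backend is chosen
# once, at import time, from this environment variable.
os.environ["TF_KERAS"] = "1"

try:
    # With TF_KERAS=1, bert4keras is expected to expose tf.keras here.
    from bert4keras.backend import keras
    backend_name = keras.__name__
except Exception:
    # bert4keras / TensorFlow not available in this environment.
    backend_name = None

print("Keras backend module:", backend_name)
```

Whether this resolves the `NameError` in this particular setup is untested. The alternative direction, building the combined model with the functional API of the same `keras` module that bert4keras uses, would also avoid mixing the two implementations.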
