MetaKD issue log #378

Open
songzetao opened this issue Aug 15, 2022 · 5 comments

Comments

@songzetao
Contributor

No description provided.

@songzetao
Contributor Author

When porting the torch import `from torch.optim.optimizer import required` to oneflow, I found that the oneflow optimizer needs it written as `from oneflow.nn.optimizer.optimizer import required`. Noting it here for the record.
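
For code that has to run against either backend, the import difference noted above could be wrapped in a fallback. This is only a minimal sketch: the helper name `import_required` and the fallback sentinel are hypothetical, and only the two module paths come from this thread.

```python
import importlib

# Module paths taken from this thread; they are tried in order.
CANDIDATE_MODULES = (
    "oneflow.nn.optimizer.optimizer",
    "torch.optim.optimizer",
)


def import_required(candidates=CANDIDATE_MODULES):
    """Return the `required` sentinel from the first importable module."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Backend not installed; try the next candidate.
            continue
        if hasattr(module, "required"):
            return module.required
    # Hypothetical fallback so callers still receive a unique sentinel object.
    return object()


required = import_required()
```

The rest of the optimizer code can then refer to `required` without caring which backend supplied it.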

@songzetao
Contributor Author

module 'oneflow.distributed' has no attribute 'is_initialized'

Reproduction code

import oneflow as torch
if torch.distributed.is_initialized():
    pass

Error message

Traceback (most recent call last):
  File "meta_teacher_train.py", line 20, in <module>
    initialize_easynlp()
  File "/workspace/models/KnowledgeDistillation/knowledge_distillation_metakd/metakd_oneflow/easynlp/utils/initializer.py", line 39, in initialize_easynlp
    _initialize_distributed()
  File "/workspace/models/KnowledgeDistillation/knowledge_distillation_metakd/metakd_oneflow/easynlp/utils/initializer.py", line 109, in _initialize_distributed
    if torch.distributed.is_initialized():
AttributeError: module 'oneflow.distributed' has no attribute 'is_initialized'

Environment

OneCloud platform, 4core-14Gi-P40 (1 card) machine. oneflow version: 0.8.1+cu112 (nightly), python version: 3.7.7
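
Until the attribute is available, one possible workaround is to guard the call. The sketch below is an assumption, not code from EasyNLP or oneflow: the helper name `distributed_is_initialized` is hypothetical, and it treats a backend without `is_initialized` as uninitialized, which may not suit every setup. Stand-in objects are used so the example runs without either library installed.

```python
from types import SimpleNamespace


def distributed_is_initialized(dist_module):
    """Call dist_module.is_initialized() if it exists, otherwise return False."""
    check = getattr(dist_module, "is_initialized", None)
    if callable(check):
        return bool(check())
    # Assumption: a backend lacking the function is treated as uninitialized.
    return False


# Stand-ins for a distributed module with and without the attribute.
with_attr = SimpleNamespace(is_initialized=lambda: True)
without_attr = SimpleNamespace()

print(distributed_is_initialized(with_attr))     # True
print(distributed_is_initialized(without_attr))  # False
```

In the failing call site, `torch.distributed` would be passed as `dist_module`.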

@songzetao
Contributor Author

'Tensor' object has no attribute 'is_sparse'

Reproduction code

import oneflow as flow
tensor = flow.randn(2, 3)
print(tensor.is_sparse)

Error message

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_54640/2438940525.py in <module>
      1 import oneflow as flow
      2 tensor = flow.randn(2, 3)
----> 3 print(tensor.is_sparse)

AttributeError: 'Tensor' object has no attribute 'is_sparse'

Comparison with torch

import torch as flow
tensor = flow.randn(2, 3)
print(tensor.is_sparse)
>>False

Environment

OneCloud platform, 4core-14Gi-P40 (1 card) machine. oneflow version: 0.8.1+cu112 (nightly), python version: 3.7.7
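
A caller that only needs to branch on sparsity could read the attribute defensively. This is a sketch under the assumption that a tensor without `is_sparse` is dense; the helper name `is_sparse_tensor` and the stand-in classes are hypothetical and exist only so the example runs without torch or oneflow.

```python
def is_sparse_tensor(tensor):
    """Return tensor.is_sparse when present; assume dense when it is missing."""
    return bool(getattr(tensor, "is_sparse", False))


class TensorWithFlag:
    """Stand-in for a tensor type that exposes is_sparse."""
    is_sparse = True


class TensorWithoutFlag:
    """Stand-in for a tensor type that lacks is_sparse."""


print(is_sparse_tensor(TensorWithFlag()))     # True
print(is_sparse_tensor(TensorWithoutFlag()))  # False
```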

@songzetao
Contributor Author

Calling mean(-1) on a 0-dim tensor crashes the program in oneflow, but not in torch

Reproduction code

import oneflow as flow
input = flow.randn(2, 3)
target = flow.randn(2, 3)
loss = flow.nn.functional.mse_loss(input, target) # compute the loss
print(loss)
print(loss.shape)
loss_mean = loss.mean(-1)
print(loss_mean)

Error message

loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
Canceled future for execute_request message before replies were done
The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click [here](https://aka.ms/vscodeJupyterKernelCrash) for more info. View Jupyter [log](command:jupyter.viewOutput) for further details.

Comparison with torch

import torch as flow
input = flow.randn(2, 3)
target = flow.randn(2, 3)
loss = flow.nn.functional.mse_loss(input, target)
print(loss)
print(loss.shape)
loss_mean = loss.mean(-1)
print(loss_mean)
>>> tensor(1.3708)
>>> torch.Size([])
>>> tensor(1.3708)

Environment

OneCloud platform, 4core-14Gi-P40 (1 card) machine. oneflow version: 0.8.1+cu112 (nightly), python version: 3.7.7
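
As a stopgap on the calling side, the reduction can be skipped when the tensor is already a scalar, since the mean of a single value is that value. A minimal sketch, assuming only that the tensor exposes `shape` and `mean`; the helper name `mean_last_dim` is hypothetical.

```python
def mean_last_dim(tensor):
    """Average over the last dimension, skipping the reduction for 0-dim tensors."""
    if len(tensor.shape) == 0:
        # A 0-dim tensor has nothing to reduce, so return it unchanged.
        return tensor
    return tensor.mean(-1)
```

With this guard, `loss_mean = mean_last_dim(loss)` replaces `loss.mean(-1)` in the reproduction above and never reaches the crashing call for a scalar loss.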

@Flowingsun007
Contributor

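This bug (mean(-1) crashing on a 0-dim tensor) is probably caused by CheckAxis in oneflow/core/functional/impl/common.cpp not handling the 0-dim case quite right. I'll take this one and submit a PR later.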
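
For illustration, an axis check that tolerates 0-dim tensors might look like the following. This is a Python sketch and not the actual CheckAxis code or the eventual fix: the function name `normalize_axis` is hypothetical, and treating a 0-dim tensor as having one axis (valid range [-1, 0]) is an assumption modeled on the torch result shown earlier in this thread.

```python
def normalize_axis(axis, ndim):
    """Map a possibly negative axis to a non-negative index, allowing ndim == 0."""
    # Assumption: a 0-dim tensor is range-checked as if it had one axis.
    effective_ndim = max(ndim, 1)
    if not -effective_ndim <= axis < effective_ndim:
        raise IndexError(
            f"Dimension out of range (expected to be in range of "
            f"[{-effective_ndim}, {effective_ndim - 1}], but got {axis})"
        )
    return axis + effective_ndim if axis < 0 else axis


print(normalize_axis(-1, 0))  # 0
print(normalize_axis(-1, 2))  # 1
```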
