`dp compress` with the PyTorch backend cannot do the compression (finetune → distillation based on deepmd-kit-v3-b3, dpgen2) #274

Comments
`dp --pt compress -i model.pth -o compress_model.pth` also does not work.

The descriptor setting during distillation uses `"attn_layer": 0`.

b3 only supports compression with the TF backend.

Does b4 support compression with the PyTorch backend?
We use the PyTorch backend for the distillation process and download the model at `/prep-run-train/output/models/task.0000` with the `dpgen2 download ...` command, which gives a model file `model.ckpt.pt`. We then freeze it with `dp --pt freeze -o model.pth` (the checkpoint file has to be added manually) and obtain `model.pth`. However, this `model.pth` cannot be compressed with the compression command `dp compress -i model.pth -o model-compress.pth`; it gives the following error message:
```
root@bohrium-12166-1204587:/personal/dpa2_hea/version10/distill_stable/iter-000002/prep-run-train/output/models/task.0000# dp compress -i model.pth -o model_compress.pth
2024-11-12 14:48:51.537628: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-12 14:48:51.537679: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-12 14:48:51.537696: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-12 14:48:51.544735: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:From /opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:108: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:tensorflow:disable_mixed_precision_graph_rewrite() called when mixed precision is already disabled.
Traceback (most recent call last):
  File "/opt/deepmd-kit-3.0.0b3/bin/dp", line 10, in <module>
    sys.exit(main())
  File "/opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/deepmd/main.py", line 923, in main
    deepmd_main(args)
  File "/opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/deepmd/tf/entrypoints/main.py", line 81, in main
    compress(**dict_args)
  File "/opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/deepmd/tf/entrypoints/compress.py", line 98, in compress
    graph, _ = load_graph_def(input)
  File "/opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/deepmd/tf/utils/graph.py", line 42, in load_graph_def
    graph_def.ParseFromString(f.read())
google.protobuf.message.DecodeError: Error parsing message with type 'tensorflow.GraphDef'
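```

For reference, the failing workflow can be reproduced with the commands reported above (run from the task directory; the interpretation in the comments is an assumption based on the traceback, which shows the call landing in `deepmd/tf/entrypoints/compress.py`):

```shell
# Freeze the PyTorch distillation model (the checkpoint file
# must be placed in the working directory manually first).
dp --pt freeze -o model.pth

# Fails: without --pt, dp dispatches compression to the TF backend,
# which tries to parse the .pth file as a tensorflow.GraphDef and
# raises google.protobuf.message.DecodeError.
dp compress -i model.pth -o model_compress.pth

# Also fails on b3: explicitly selecting the PyTorch backend does not
# help, because b3 only supports compression with the TF backend.
dp --pt compress -i model.pth -o compress_model.pth
```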