Training error #32

Open
yanduosha opened this issue Mar 21, 2020 · 1 comment
Comments

@yanduosha

Training fails with the error below. Could you help take a look?
shajunqin@tonly-Super-Server:~/face/face_landmark-master$ python3 train.py
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/shajunqin/.local/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
[2020-03-21 13:57:05,638] [INFO] The trainer start
2020-03-21 13:57:05.639374: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-03-21 13:57:05.643071: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.643261: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.644634: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.645885: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.646210: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.647953: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.649239: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.653271: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.654423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.654785: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-21 13:57:05.822849: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x49d5dc0 executing computations on platform CUDA. Devices:
2020-03-21 13:57:05.822906: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2020-03-21 13:57:05.846545: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299855000 Hz
2020-03-21 13:57:05.852721: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4b799f0 executing computations on platform Host. Devices:
2020-03-21 13:57:05.852772: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2020-03-21 13:57:05.853937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.854029: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.854067: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.854099: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.854130: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.854163: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.854199: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.854259: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.856208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.856290: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.857873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-21 13:57:05.857909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-03-21 13:57:05.857927: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-03-21 13:57:05.865393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5953 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
1 Physical GPUs, 1 Logical GPUs
2020-03-21 13:57:05.875745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
2020-03-21 13:57:05.875858: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-03-21 13:57:05.875911: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-03-21 13:57:05.875954: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-03-21 13:57:05.875995: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-03-21 13:57:05.876037: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-03-21 13:57:05.876078: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-03-21 13:57:05.876121: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-03-21 13:57:05.881381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-03-21 13:57:05.881438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-21 13:57:05.881461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-03-21 13:57:05.881477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-03-21 13:57:05.883184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 5953 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
[2020-03-21 13:57:07,515] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,516] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,517] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,525] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,526] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
[2020-03-21 13:57:07,526] [INFO] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Traceback (most recent call last):
File "train.py", line 108, in <module>
main()
File "train.py", line 51, in main
model(image)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 416, in __call__
self._initialize(args, kwds, add_initializers_to=initializer_map)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 359, in _initialize
*args, **kwds))
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1360, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1648, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1541, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 716, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 309, in wrapped_fn
return weak_wrapped_fn().wrapped(*args, **kwds)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2155, in bound_method_wrapper
return wrapped_fn(*args, **kwargs)
File "/home/shajunqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 706, in wrapper
raise e.ag_error_metadata.to_exception(type(e))
RuntimeError: in converted code:
relative to /home/shajunqin:

face/face_landmark-master/lib/core/model/shufflenet/simpleface.py:51 call  *
    x1, x2, x3 = self.backbone(inputs, training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
face/face_landmark-master/lib/core/model/shufflenet/shufflenet.py:197 call  *
    x=self.first_conv(inputs,training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py:262 call
    outputs = layer(inputs, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:667 __call__
    outputs = call_fn(inputs, *args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/normalization.py:651 call
    outputs = self._fused_batch_norm(inputs, training=training)
.local/lib/python3.6/site-packages/tensorflow/python/keras/layers/normalization.py:533 _fused_batch_norm
    self.add_update(mean_update)
.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:507 new_func
    return func(*args, **kwargs)
.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:1083 add_update
    '`add_update` was called in a cross-replica context. This is not '

RuntimeError: `add_update` was called in a cross-replica context. This is not expected. If you require this feature, please file an issue.
@610265158
Owner

Which branch are you using? Try running it on a single GPU.
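
A minimal sketch of one way to try the single-GPU suggestion, under assumptions not confirmed in this thread. The log already reports `1 Physical GPUs`, and the "cross-replica context" error usually means the model was called inside a `tf.distribute` strategy scope but outside a per-replica call. The sketch therefore both pins one GPU through `CUDA_VISIBLE_DEVICES` and skips the strategy scope when no strategy is passed. `pin_single_gpu` and `model_scope` are hypothetical helper names, not part of this repository or of TensorFlow.

```python
import contextlib
import os


def pin_single_gpu(index=0, env=None):
    """Expose only one GPU to CUDA by setting CUDA_VISIBLE_DEVICES.

    Call this before TensorFlow is imported, otherwise it has no effect.
    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env["CUDA_VISIBLE_DEVICES"]


@contextlib.contextmanager
def model_scope(strategy=None):
    """Enter strategy.scope() only when a strategy object is given.

    With strategy=None this is a no-op context, so layers such as
    BatchNormalization run without any distribution strategy.
    """
    if strategy is None:
        yield
    else:
        with strategy.scope():
            yield
```

In `train.py` this would mean calling `pin_single_gpu(0)` before `import tensorflow` and building and calling the model under `with model_scope():`. Whether the repository's training loop works without a strategy is untested here.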
