[Bug] EMD loss cannot handle input size less than 4096 either #21

Lotayou · 2019-01-22T07:54:48Z

@yulequan I just found that the EMD loss module also crashes when the input size is smaller than 4096.

Here is the error message:

Warning: Input parameter 2048 has been switched to 1722 for dyna_patch dataset...
vcl-dl-3
Namespace(batch_size=1, dataset='dyna_patch', gpu='0', learning_rate=0.001, log_dir='../model/debug', max_epoch=120, num_point=2048, phase='train', test_dir='../data/test_data/our_collected_data/MC_5k', up_ratio=2)
Traceback (most recent call last):
  File "main.py", line 277, in <module>
    assert not os.path.exists(os.path.join(MODEL_DIR, 'code/'))
AssertionError
(yanglingbo) ylb@vcl-dl-3:~/projects/3D_mesh_SR/PU-Net/code$ sh train_dyna_patch.sh
Warning: Input parameter 2048 has been switched to 1722 for dyna_patch dataset...
vcl-dl-3
Namespace(batch_size=1, dataset='dyna_patch', gpu='0', learning_rate=0.001, log_dir='../model/debug', max_epoch=120, num_point=2048, phase='train', test_dir='../data/test_data/our_collected_data/MC_5k', up_ratio=2)
2019-01-22 15:43:58.696137: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-01-22 15:44:00.175833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:02:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2019-01-22 15:44:00.176297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
use randominput, input h5 file is: ../h5_data/dyna_patch_dataset_pu_net.h5
Normalization the data
total 10220 samples
NUM_BATCH is 10220
True True
**** EPOCH 000 ****
2019-01-22 15:44:24.039949: E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2019-01-22 15:44:24.040244: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:203] Unexpected Event status: 1
Aborted (core dumped)

Data Processing
My original mesh contains 6890 points. To cope with the EMD size constraint, I split each human mesh into left and right halves of 3444 points each and chose a downsample ratio of r=2, so the downsampled input contains 1722 points and the output should contain 3444 points. However, the error still occurs, just as it does when my input has more than 4096 points. Meanwhile, training on the author-provided 4096-point dataset works without problems.
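The point counts above can be checked with a minimal sketch like the one below. The helper `patch_sizes` and the constant `EMD_MAX_POINTS` are hypothetical names used only for illustration, not part of PU-Net, and the 4096 limit is an assumption taken from the issue title and #3.

```python
# Hypothetical sanity check for the point counts described above.
# Neither the helper nor the constant exists in the PU-Net code base.

EMD_MAX_POINTS = 4096  # assumed EMD size limit, taken from the issue title and #3


def patch_sizes(num_half_points, up_ratio):
    """Return (input_points, output_points) for one half-mesh patch."""
    if num_half_points % up_ratio != 0:
        raise ValueError("point count must be divisible by up_ratio")
    num_input = num_half_points // up_ratio  # downsampled network input
    num_output = num_input * up_ratio        # upsampled network output
    return num_input, num_output


num_input, num_output = patch_sizes(3444, 2)
print(num_input, num_output)
print(num_output <= EMD_MAX_POINTS)
```

Both the 1722-point input and the 3444-point output stay under the assumed 4096-point limit, so the point count alone does not seem to explain the crash.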

Configuration

CUDA 9.0
CUDNN 7005
Python 3.6
Tensorflow 1.5.1

Also reported in #3.


MrXiaoZhen · 2019-04-29T06:32:21Z

Have you solved this problem?

MrXiaoZhen · 2019-04-29T06:32:36Z

@Lotayou
