dsync generates MPI error when I'm not the owner of the source path #550
Comments
The dsync utility synchronizes files and directories. This means that once the files and directories are copied over, the process checks for differences in owner (uid) and group (gid), among other attributes, and sets them in the destination to match the source. Example output: [2024-11-26T15:32:36] Setting ownership, permissions, and timestamps. In this case you would be better off using 'dcp', which does only the copy part. To preserve the owner and other attributes with dcp, you would have to specify the following flag: -p, --preserve - preserve permissions, ownership, timestamps (see also --xattrs)
If there is no objection, I will close this issue.
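To make the ownership step above concrete, here is a minimal sketch of the usual Linux permission rule for chown, which is the likely reason that matching a root-owned source fails for an ordinary user. This is an illustrative model, not mpifileutils code; the function name `can_replicate_ownership` and the numeric IDs are hypothetical, and whether dsync treats such a failure as fatal is not confirmed in this thread.

```python
def can_replicate_ownership(src_uid, src_gid, euid, member_gids):
    """Return True if a process with effective uid `euid` could chown a file
    it owns so that it matches (src_uid, src_gid) under typical Linux rules."""
    if euid == 0:
        return True  # root (CAP_CHOWN) may set any owner and group
    # Unprivileged: the owner cannot be changed, and the group may only be
    # changed to one the caller belongs to.
    return src_uid == euid and src_gid in member_gids


# Hypothetical numeric IDs: 1000 stands in for the reporting user.
print(can_replicate_ownership(0, 0, 1000, {1000}))        # source owned by root
print(can_replicate_ownership(1000, 1000, 1000, {1000}))  # source owned by the caller
```

Under this model, the root-owned source in the report cannot have its ownership reproduced by a non-root user, while a source owned by the caller can.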
AFAICS, one reason to use Running rsync as a non-root user will silently ignore the differences in the target file ownership if the user is not running as root (strace didn't show any failed attempts at |
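The rsync behaviour mentioned above can be sketched as a "best effort" policy: decide up front which ownership changes are permitted and skip the rest instead of attempting them and failing. The sketch below is an assumption-laden illustration in Python, not rsync's or dsync's actual implementation; `plan_ownership_update` is a hypothetical helper, and it returns -1 for "leave unchanged", which is the convention `os.chown` accepts.

```python
def plan_ownership_update(src_uid, src_gid, dst_uid, dst_gid, euid, member_gids):
    """Return the (uid, gid) pair to pass to os.chown for the destination,
    using -1 for any ID that should be left unchanged."""
    if euid == 0:
        # Root can set both IDs; only touch the ones that differ.
        new_uid = src_uid if src_uid != dst_uid else -1
        new_gid = src_gid if src_gid != dst_gid else -1
        return new_uid, new_gid
    # Non-root: never try to change the owner. Change the group only when the
    # caller owns the destination and belongs to the source's group.
    allowed = dst_uid == euid and src_gid in member_gids
    new_gid = src_gid if (src_gid != dst_gid and allowed) else -1
    return -1, new_gid
```

A caller would then skip the `os.chown` call entirely when the result is `(-1, -1)`, so a non-root run produces no failed chown attempt for a root-owned source.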
I totally agree with you on the use of dsync instead of dcp. I was trying to explain the error he saw. I was not able to reproduce it using another user's directory with read permission on it. Shall we close the issue?
Hello,
I've installed mpifileutils version 0.11.1 using spack.
I always get an MPI error when I am not the owner of the source file/directory, although I have at least read permissions.
The files are copied; however, this error is still generated. This is annoying because, when submitted as an LSF or SLURM job, it will be marked as failed.
No errors are generated if I am the owner of the source.
Example:
[aelmazaty@codon-dm-06 lsf-hx-wp]# ls -l /hps/scratch/sysinf/power_usage
-rw-r--r-- 1 root root 17035 Sep 5 2022 /hps/scratch/sysinf/power_usage
[aelmazaty@codon-dm-06 lsf-hx-wp]# mpirun -np 4 dsync -v --progress 1 /hps/scratch/sysinf/power_usage /hps/scratch/sysinf/aelmazaty/
[2023-06-13T16:01:14] Walking source path
[2023-06-13T16:01:14] Walking /hps/scratch/sysinf/power_usage
[2023-06-13T16:01:14] Walked 1 items in 0.001 secs (882.196 items/sec) ...
[2023-06-13T16:01:14] Walked 1 items in 0.001 seconds (818.132 items/sec)
[2023-06-13T16:01:14] Walking destination path
[2023-06-13T16:01:14] Walking /hps/scratch/sysinf/aelmazaty
[2023-06-13T16:01:14] Walked 1 items in 0.002 secs (617.520 items/sec) ...
[2023-06-13T16:01:14] Walked 1 items in 0.002 seconds (606.374 items/sec)
[2023-06-13T16:01:14] Comparing file sizes and modification times of 1 items
[2023-06-13T16:01:14] Started : Jun-13-2023, 16:01:14
[2023-06-13T16:01:14] Completed : Jun-13-2023, 16:01:14
[2023-06-13T16:01:14] Seconds : 0.000
[2023-06-13T16:01:14] Items : 1
[2023-06-13T16:01:14] Item Rate : 1 items in 0.000158 seconds (6310.263012 items/sec)
[2023-06-13T16:01:14] Updating timestamps on newly copied files
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[11234,1],0]
Exit code: 1
The file is copied; however, an error is generated.
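Batch schedulers such as LSF and SLURM generally derive the job state from the exit status of the job script, which is why the "Exit code: 1" above marks the job as failed even though the data arrived. A minimal sketch of a wrapper that surfaces that status so a job script can log it or act on it follows. The helper name `run_and_report` is hypothetical, and the child process is a stand-in Python command rather than a real mpirun/dsync invocation.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run `cmd`, pass its output through, and return its exit status unchanged."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    return result.returncode


# Stand-in child that exits with status 1, like the failing rank in the log above.
status = run_and_report([sys.executable, "-c", "import sys; sys.exit(1)"])
print("exit status:", status)
```

Note that simply discarding a non-zero status would also hide genuine copy failures, so any such wrapper should be paired with an independent check of the destination.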
When I try with a file I own:
[aelmazaty@codon-dm-06 lsf-hx-wp]# ls -l /hps/scratch/sysinf/power_usage_aelmazaty
-rw-r--r-- 1 aelmazaty systems 17035 Jun 13 13:54 /hps/scratch/sysinf/power_usage_aelmazaty
[aelmazaty@codon-dm-06 lsf-hx-wp]# mpirun -np 4 dsync -v --progress 1 /hps/scratch/sysinf/power_usage_aelmazaty /hps/scratch/sysinf/aelmazaty/
[2023-06-13T16:02:17] Walking source path
[2023-06-13T16:02:17] Walking /hps/scratch/sysinf/power_usage_aelmazaty
[2023-06-13T16:02:17] Walked 1 items in 0.001 secs (872.339 items/sec) ...
[2023-06-13T16:02:17] Walked 1 items in 0.001 seconds (804.228 items/sec)
[2023-06-13T16:02:17] Walking destination path
[2023-06-13T16:02:17] Walking /hps/scratch/sysinf/aelmazaty
[2023-06-13T16:02:17] Walked 1 items in 0.000 secs (2210.726 items/sec) ...
[2023-06-13T16:02:17] Walked 1 items in 0.000 seconds (2045.349 items/sec)
[2023-06-13T16:02:17] Comparing file sizes and modification times of 1 items
[2023-06-13T16:02:17] Started : Jun-13-2023, 16:02:17
[2023-06-13T16:02:17] Completed : Jun-13-2023, 16:02:17
[2023-06-13T16:02:17] Seconds : 0.000
[2023-06-13T16:02:17] Items : 1
[2023-06-13T16:02:17] Item Rate : 1 items in 0.000162 seconds (6177.720668 items/sec)
[2023-06-13T16:02:17] Deleting items from destination
[2023-06-13T16:02:17] Removing 1 items
[2023-06-13T16:02:17] Removed 1 items in 0.003 seconds (327.228 items/sec)
[2023-06-13T16:02:17] Copying items to destination
[2023-06-13T16:02:17] Copying to /hps/scratch/sysinf/aelmazaty
[2023-06-13T16:02:17] Items: 1
[2023-06-13T16:02:17] Directories: 0
[2023-06-13T16:02:17] Files: 1
[2023-06-13T16:02:17] Links: 0
[2023-06-13T16:02:17] Data: 16.636 KiB (16.636 KiB per file)
[2023-06-13T16:02:17] Creating 1 files.
[2023-06-13T16:02:17] Copying data.
[2023-06-13T16:02:17] Copy data: 16.636 KiB (17035 bytes)
[2023-06-13T16:02:17] Copy rate: 1.207 MiB/s (17035 bytes in 0.013 seconds)
[2023-06-13T16:02:17] Syncing data to disk.
[2023-06-13T16:02:17] Sync completed in 0.020 seconds.
[2023-06-13T16:02:17] Setting ownership, permissions, and timestamps.
[2023-06-13T16:02:17] Updated 1 items in 0.003 seconds (298.208 items/sec)
[2023-06-13T16:02:17] Syncing directory updates to disk.
[2023-06-13T16:02:17] Sync completed in 0.001 seconds.
[2023-06-13T16:02:17] Started: Jun-13-2023,16:02:17
[2023-06-13T16:02:17] Completed: Jun-13-2023,16:02:17
[2023-06-13T16:02:17] Seconds: 0.043
[2023-06-13T16:02:17] Items: 1
[2023-06-13T16:02:17] Directories: 0
[2023-06-13T16:02:17] Files: 1
[2023-06-13T16:02:17] Links: 0
[2023-06-13T16:02:17] Data: 16.636 KiB (17035 bytes)
[2023-06-13T16:02:17] Rate: 391.203 KiB/s (17035 bytes in 0.043 seconds)
[2023-06-13T16:02:17] Updating timestamps on newly copied files
It works normally without getting any errors.
I tried different Open MPI versions, all installed via spack; the latest is 4.1.5. I get the same error with all of them.
Is this a known issue? How can I avoid these errors?
Best regards,
Ahmed