Replies: 1 comment
From: ja***@ph*** (Jasmina Blecic)
Hi Ian,
can you send us your BART.cfg file? Usually this kind of error comes
Cheers
On 03/28/2016 09:26 PM, ia***@lp*** wrote:
The following is an archived message from the BART-user mailing list, which has now closed.
From: ia***@lp*** (ia***@lp***)
Date: Mon, 28 Mar 2016 10:26:45 -0700
Subject: [BART-user] BART: partial run, then crash w/MPI?
Hello,
While trying to run a four-molecule test case, the system starts the first
10 MCMC iterations but then crashes with what appears to be an MPI error
(see below). I do not encounter this problem when running the default,
single-molecule example described on the GitHub website.
Has anyone encountered a similar issue, or have any idea what my problem
might be?
Many thanks,
-Ian
.....
There are 100 layers and 10 species.
Iteration: 09999
Start MCMC chains (Mon Mar 28 10:20:55 2016)
Iteration: 09998
Iteration: 09997
Iteration: 09996
Iteration: 09995
Iteration: 09994
Iteration: 09993
Iteration: 09992
Iteration: 09991
Traceback (most recent call last):
  File "/home/ianc/python/BART_demo3/BART/modules/MCcubed//MCcubed/mccubed.py", line 681, in <module>
    main()
  File "/home/ianc/python/BART_demo3/BART/modules/MCcubed//MCcubed/mccubed.py", line 378, in main
    comm, resume, log, rms)
  File "/home/ianc/python/BART_demo3/BART/modules/MCcubed/MCcubed/../MCcubed/mc/mcmc.py", line 361, in mcmc
    mu.comm_gather(comm, mpimodels)
  File "/home/ianc/python/BART_demo3/BART/modules/MCcubed/MCcubed/../MCcubed/utils/mcutils.py", line 255, in comm_gather
    comm.Barrier()
  File "Comm.pyx", line 394, in mpi4py.MPI.Comm.Barrier (src/mpi4py.MPI.c:61978)
mpi4py.MPI.Exception: Other MPI error, error stack:
PMPI_Barrier(425)..............: MPI_Barrier(comm=0x84000000) failed
MPIR_Barrier_impl(331).........: Failure during collective
MPIR_Barrier_impl(323).........:
MPIR_Barrier_inter(208)........:
MPIR_Bcast_inter(1245).........:
MPIC_Send(63)..................:
MPIDI_EagerContigShortSend(262): failure occurred while attempting to send an eager message
MPIDI_CH3_iStartMsg(36)........: Communication error with rank 0
MPIR_Barrier_inter(198)........:
MPIR_Bcast_inter(1280).........:
MPIR_Bcast_intra(1155).........:
MPIR_Bcast_binomial(213).......: Failure during collective
MPIR_Bcast_inter(1263).........:
dequeue_and_set_error(596).....: Communication error with rank 0
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
Transit call with the best-fitting values.
Traceback (most recent call last):
  File "/home/ianc/python/BART_demo3/BART/BART.py", line 442, in <module>
    main()
  File "/home/ianc/python/BART_demo3/BART/BART.py", line 394, in main
    refpress, tconfig, date_dir, params, burnin, abun_basic)
  File "/home/ianc/python/BART_demo3/BART/code/bestFit.py", line 313, in callTransit
    sma, grav*1e2)
  File "/home/ianc/python/BART_demo3/BART/code/PT.py", line 653, in PT_line
    kappa = 10**(params[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
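A side note on the log above: the "EXIT CODE: 139" line is consistent with a process that was killed by a signal, since shells and process launchers conventionally report such a death as 128 plus the signal number. The following is only a minimal sketch for decoding that number, assuming a POSIX-like system; the helper name `describe_exit_code` is made up for illustration and is not part of BART, MCcubed, or mpi4py.

```python
import signal


def describe_exit_code(code):
    """Hypothetical helper: interpret a shell-style exit status.

    Assumes the common convention that a status above 128 means
    "terminated by signal (status - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return "killed by " + signal.Signals(signum).name
        except ValueError:
            # The number does not correspond to a signal known on this platform.
            return "killed by unknown signal %d" % signum
    return "exited with status %d" % code


# The value reported in the log above.
print(describe_exit_code(139))
```

On Linux this reports SIGSEGV (signal 11) for 139, which would point to a segmentation fault in one of the MPI processes rather than an ordinary Python exception. If so, the later IndexError in PT.py may just be a follow-on effect of the MCMC step ending without best-fitting parameters, though the log alone does not confirm that.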