only run CheckIlluminaDirectory w/ LINK_LOCS=true if HiSeqX/4k and CheckIlluminaDirectory failed previously (#641)

* remove broken symlinks in runs; call CheckIlluminaDirectory

HiSeq4000 and HiSeq X runs write out a single “s.locs” file to
Data/Intensities, rather than per-tile cluster location files within
Data/Intensities/L*/*.locs. Related to this, CheckIlluminaDirectory
with LINK_LOCS=‘true’ creates symlinks from where per-tile location
files would be to the absolute path of the s.locs file. This can cause
problems when moving a run to a new system, since the absolute paths
will be incorrect. This addition checks to see if s.locs is present. If
so, broken symlinks within Data/Intensities/L* are removed, and
LINK_LOCS is specified for the CheckIlluminaDirectory call that now
happens before demultiplexing. This call ensures the run looks correct,
and in the case of HiSeq4000/HiSeqX runs, also creates symlinks for
per-tile location files using absolute paths on the current system.
This is all something of a workaround for Broad-derived
HiSeq4000/HiSeqX runs, which are delivered with brittle absolute
symlinks. With this addition, runs from Broad walkup or elsewhere are
more cloud-compatible. Relatedly, a PR has been opened in the Picard
repository requesting that LINK_LOCS be able to create relative rather
than absolute symlinks:
broadinstitute/picard#877
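
To illustrate why absolute per-tile links are brittle, here is a minimal sketch that builds a toy run directory, moves it, and checks whether the link still resolves. The helper names (`make_run`, `link_survives_move`) and the tile file name `s_1_1101.locs` are hypothetical and used only for illustration; nothing here is taken from Picard or from this repository, and it assumes a POSIX system where symlinks can be created.

```python
import os
import shutil
import tempfile


def make_run(root, relative):
    """Create a toy run with one per-tile link pointing at s.locs.

    Returns the link path relative to the run root.
    """
    intensities = os.path.join(root, "Data", "Intensities")
    lane = os.path.join(intensities, "L001")
    os.makedirs(lane)
    target = os.path.join(intensities, "s.locs")
    with open(target, "w") as fh:
        fh.write("placeholder")
    link = os.path.join(lane, "s_1_1101.locs")  # illustrative tile name
    if relative:
        # Relative link: stored as "../s.locs", resolved from the lane dir.
        os.symlink(os.path.relpath(target, start=lane), link)
    else:
        # Absolute link: stores the full path on the *current* system.
        os.symlink(os.path.abspath(target), link)
    return os.path.relpath(link, root)


def link_survives_move(relative):
    """Move the toy run to a new location and report if the link resolves."""
    base = tempfile.mkdtemp()
    try:
        old_root = os.path.join(base, "old_system", "run")
        new_root = os.path.join(base, "new_system", "run")
        os.makedirs(old_root)
        os.makedirs(os.path.dirname(new_root))
        rel_link = make_run(old_root, relative)
        shutil.move(old_root, new_root)
        # os.path.exists follows symlinks, so a dangling link gives False.
        return os.path.exists(os.path.join(new_root, rel_link))
    finally:
        shutil.rmtree(base)
```

In this sketch the relative link keeps resolving after the move, while the absolute one dangles because it still names the old location.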

* remove_broken_symlinks -> find_broken_symlinks, with logging changes

remove_broken_symlinks has been renamed to find_broken_symlinks since
it now returns a list of broken links rather than removing them
directly. The function in util.file is now silent, and logging is
performed where the function is called in illumina.py.
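
The body of `util.file.find_broken_symlinks` is not shown in this commit, so the following is only a sketch of how such a silent, list-returning helper could look. The name `find_broken_symlinks_sketch` is hypothetical; the caller, not the helper, decides whether to log or unlink.

```python
import os


def find_broken_symlinks_sketch(root_path):
    """Return paths of symlinks under root_path whose targets do not exist.

    Silent by design: no logging and no deletion happen here.
    """
    broken = []
    # os.walk does not follow directory symlinks by default.
    for dirpath, dirnames, filenames in os.walk(root_path):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink checks the link itself; exists follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return broken
```

Returning a list keeps the utility free of side effects, so the calling code can log each path before calling `os.unlink` on it.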

* pass link_locs as python bool

* run CheckIlluminaDirectory w/o `LINK_LOCS=true` and then, if it fails, and if it’s a HiSeq X / 4000, run it again with `LINK_LOCS=true`

run CheckIlluminaDirectory w/o `LINK_LOCS=true` and then, if it fails,
and if it’s a HiSeq X / 4000, run it again with `LINK_LOCS=true`
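
The try-then-retry control flow described above can be sketched in isolation. Here `run_check` is a hypothetical callable standing in for the Picard wrapper call, and `check_with_fallback` is an illustrative name, not a function in this repository. Re-raising when no `s.locs` file is present is an assumption of this sketch.

```python
import os
import subprocess


def check_with_fallback(run_check, intensities_dir):
    """Run the check plainly; retry with link_locs=True only for s.locs runs.

    run_check: hypothetical callable accepting a link_locs keyword and
    raising subprocess.CalledProcessError on failure.
    Returns the link_locs value used by the call that succeeded.
    """
    try:
        run_check(link_locs=False)
        return False
    except subprocess.CalledProcessError:
        has_s_locs = os.path.exists(os.path.join(intensities_dir, "s.locs"))
        if not has_s_locs:
            # Not a single-s.locs run, so linking cannot help (assumption:
            # surface the original failure to the caller).
            raise
        run_check(link_locs=True)
        return True
```

Trying the plain check first avoids asking for link creation on runs where the per-tile links or files are already in place.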
tomkinsc authored Jul 26, 2017
1 parent c024abd commit 78baead
Showing 1 changed file with 31 additions and 20 deletions.
51 changes: 31 additions & 20 deletions illumina.py
@@ -114,26 +114,37 @@ def main_illumina_demux(args):
     # These links may break if the run directory is moved.
     # We should begin by removing broken links, if present,
     # and call CheckIlluminaDirectory ourselves if a 's.locs'
-    # file is present
-    if os.path.exists(os.path.join(illumina.get_intensities_dir(), "s.locs")):
-        # recurse to remove broken links in directory
-        log.info("This run has an 's.locs' file; checking for and removing broken per-tile symlinks...")
-        broken_links = util.file.find_broken_symlinks(illumina.get_intensities_dir())
-        if len(broken_links):
-            for lpath in broken_links:
-                log.info("Removing broken symlink: %s", lpath)
-                os.unlink(lpath)
-
-        # call CheckIlluminaDirectory with LINK_LOCS=true
-        link_locs=True
-
-    log.info("Checking run directory with Picard...")
-    tools.picard.CheckIlluminaDirectoryTool().execute(
-        illumina.get_BCLdir(),
-        args.lane,
-        illumina.get_RunInfo().get_read_structure(),
-        link_locs=link_locs
-    )
+    # file is present, but only if the directory check fails
+    # since link_locs=true tries to create symlinks even if they
+    # (or the files) already exist
+    try:
+        tools.picard.CheckIlluminaDirectoryTool().execute(
+            illumina.get_BCLdir(),
+            args.lane,
+            illumina.get_RunInfo().get_read_structure(),
+            link_locs=link_locs
+        )
+    except subprocess.CalledProcessError as e:
+        log.warning("CheckIlluminaDirectory failed for %s", illumina.get_BCLdir())
+        if os.path.exists(os.path.join(illumina.get_intensities_dir(), "s.locs")):
+            # recurse to remove broken links in directory
+            log.info("This run has an 's.locs' file; checking for and removing broken per-tile symlinks...")
+            broken_links = util.file.find_broken_symlinks(illumina.get_intensities_dir())
+            if len(broken_links):
+                for lpath in broken_links:
+                    log.info("Removing broken symlink: %s", lpath)
+                    os.unlink(lpath)
+
+            # call CheckIlluminaDirectory with LINK_LOCS=true
+            link_locs=True
+
+            log.info("Checking run directory with Picard...")
+            tools.picard.CheckIlluminaDirectoryTool().execute(
+                illumina.get_BCLdir(),
+                args.lane,
+                illumina.get_RunInfo().get_read_structure(),
+                link_locs=link_locs
+            )
 
     # Picard ExtractIlluminaBarcodes
     extract_input = util.file.mkstempfname('.txt', prefix='.'.join(['barcodeData', flowcell, str(args.lane)]))