f2s_child_worker::ERROR:Bad fast5:Fast5 file '***.fast5' could not be opened or is corrupted #139

Open
hanyishui0612 opened this issue Nov 26, 2024 · 7 comments

Comments

@hanyishui0612

When I run
slow5tools f2s ./fast5/pass/ -d ./vir1_nanopore_drs_2 -p 10

I get this error:
f2s_child_worker::ERROR:Bad fast5:Fast5 file './fast5/pass/186/W_003002_20180912_ch_261_strand.fast5' could not be opened or is corrupted.

First question: will the bad fast5 be ignored? Second question: will this affect the subsequent extraction of m6A modification sites? Looking forward to your reply.

@hasindu2008
Owner

Hi

slow5tools terminates as soon as it hits an error. This is different from a warning, after which it proceeds as normal. So the answers are:

  1. No, the bad fast5 is not ignored. The conversion terminates there and nothing further is converted.
  2. Yes. We need to exclude this corrupted file and reconvert the rest. See below:

In this case, can you move "./fast5/pass/186/W_003002_20180912_ch_261_strand.fast5" to somewhere outside ./fast5 (maybe fast5_corrupted) and relaunch the conversion? Hopefully, this is the only fast5 file that is corrupted. Also, can you share W_003002_20180912_ch_261_strand.fast5 so we can investigate what is wrong with it?
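
The move-and-relaunch step could be sketched roughly as follows. This is only an illustration: `quarantine_fast5` is a hypothetical helper name, not part of slow5tools, and the paths are the ones from the error message and command above. The output directory of the failed run may hold partial output, so it may need to be removed or replaced with a fresh one before relaunching.

```shell
#!/bin/bash
# Sketch: set one bad FAST5 file aside, then relaunch the conversion.
# quarantine_fast5 is a hypothetical helper, not a slow5tools command.

quarantine_fast5() {
	local bad_file=$1 quarantine_dir=$2
	# the quarantine directory should live outside the FAST5 input directory
	mkdir -p "$quarantine_dir" || return 1
	# -n: never overwrite a file of the same name that is already quarantined
	mv -n "$bad_file" "$quarantine_dir"/ || return 1
	# succeed only if the file is really gone from its original location
	[ ! -e "$bad_file" ]
}

# Usage (paths taken from the error message and command above):
# quarantine_fast5 ./fast5/pass/186/W_003002_20180912_ch_261_strand.fast5 ./fast5_corrupted
# slow5tools f2s ./fast5/pass/ -d ./vir1_nanopore_drs_2 -p 10
```

If another file fails after the relaunch, the same two steps can be repeated for the newly reported path.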

@hanyishui0612
Author

hanyishui0612 commented Nov 27, 2024 via email

@hasindu2008
Owner

Hi, I don't think the fast5 was attached.

@hanyishui0612
Author

> Hi the fast5 wasn't attached I think.

How can I fix this? I tried moving the bad fast5 somewhere else, but then a new bad fast5 appeared. What should I do? Looking forward to hearing from you.

@hasindu2008
Owner

Hey,

Is this a publicly available dataset that I can download and try?

@hanyishui0612
Author

hanyishui0612 commented Dec 11, 2024 via email

@hasindu2008
Owner

hasindu2008 commented Dec 14, 2024

Hey, I found a script I once wrote to clean up corrupted fast5 files in cases like this. Rename the .txt file to .sh. Then, in the script, set FAST5_DIR to where your input FAST5 files are, TMP_FAST5 to where you want the good fast5 files to be copied, and TMP_FAST5_QUARANTINE to where the bad ones should be moved. Once the script finishes, you can launch slow5tools f2s on the $TMP_FAST5 directory.

clean-crappy-single-fast5-aggressive.sh.txt

The script is short, so it is also copied below:

#!/bin/bash

# edit this to point to the input
FAST5_DIR=fast5

# This is the output, tmp_fast5 will have the good fast5 files, tmp_fast5_quarantine will have the bad fast5 files
TMP_FAST5=tmp_fast5
TMP_FAST5_QUARANTINE=tmp_fast5_quarantine

# Don't edit below

# fall back to defaults unless these are set in the environment
[ -z "${SLOW5TOOLS}" ] && SLOW5TOOLS=slow5tools
[ -z "${NUM_THREADS}" ] && NUM_THREADS=$(nproc)

# terminate script
die() {
	echo "$1" >&2
	echo
	exit 1
}

if [ "$#" -ne 0 ]; then
    die "Usage: $0 "
fi

parallel --version || die "Could not find command 'parallel'. On Ubuntu, install using  sudo apt-get install parallel"
$SLOW5TOOLS --version || die "Could not find $SLOW5TOOLS. Add slow5tools to PATH or set SLOW5TOOLS environment variable"

test -d $FAST5_DIR || die "Directory $FAST5_DIR does not exist."

test -d $TMP_FAST5 && die "Directory $TMP_FAST5 already exists. Please delete that first."
test -d $TMP_FAST5_QUARANTINE && die "Directory $TMP_FAST5_QUARANTINE already exists. Please delete that first."

cp -r $FAST5_DIR $TMP_FAST5 || die "Copying $FAST5_DIR to $TMP_FAST5 failed"
mkdir ${TMP_FAST5_QUARANTINE} || die "Creating directory ${TMP_FAST5_QUARANTINE} failed"


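clean_func(){

	file=$1
	TMP_FAST5=$2
	# try converting this one file; discard the output and keep only the exit status
	if ! "$SLOW5TOOLS" f2s "$file" > /dev/null 2> /dev/null
	then
		echo "moving $file, which slow5tools failed to convert"
		mv -i "$file" "${TMP_FAST5_QUARANTINE}"/ || { echo "mv $file failed"; exit 1;}
	fi
	exit 0

}
# export what clean_func needs, since parallel runs it in a new shell
export SLOW5TOOLS
export TMP_FAST5_QUARANTINE
export -f clean_func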

#classify
find $TMP_FAST5 -name '*.fast5' | parallel -I% --max-args 1 clean_func % $TMP_FAST5 || die "Cleaning up single-fast5 failed"
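
Once the script has finished, the follow-up steps might look roughly like the sketch below. It assumes the script's default directory names (tmp_fast5 and tmp_fast5_quarantine) and reuses the output directory and `-p 10` from the original command. `count_fast5` is a hypothetical helper added here only to show how many files ended up in each directory.

```shell
#!/bin/bash
# Sketch of the steps after the cleanup script has finished.
# count_fast5 is a hypothetical helper, not part of slow5tools or the script above.

count_fast5() {
	# print the number of .fast5 files found anywhere under directory $1
	find "$1" -type f -name '*.fast5' | wc -l | tr -d '[:space:]'
}

# Usage, from the directory where the cleanup script was run:
# echo "good: $(count_fast5 tmp_fast5)  quarantined: $(count_fast5 tmp_fast5_quarantine)"
# slow5tools f2s tmp_fast5 -d ./vir1_nanopore_drs_2 -p 10
```

A large quarantined count relative to the good count would suggest a wider problem with the dataset than a few damaged files.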
