mml@mml:/media/mml/6f60ef75-45fb-4532-9f2a-1a5d642a3093/3C_data/Ctrp_WT$ CreateScaffoldedFasta.pl PacBio_denovo.fasta out
Wed Apr 25 14:14:42 2018: CreateScaffoldedFasta.pl with input fasta = PacBio_denovo.fasta, OUTPUT_DIR = out
Wed Apr 25 14:14:42 2018: Found 7 ordering files ('group*.ordering' in out/main_results/).
Wed Apr 25 14:14:42 2018: Reading in sequences from assembly file PacBio_denovo.fasta
Wed Apr 25 14:14:42 2018: Found 141 contigs/scaffolds in assembly.
ERROR: Ordering file out/main_results/group0.ordering includes contig named 'tig00000015', not found in fasta file PacBio_denovo.fasta
Wed Apr 25 14:14:42 2018: Creating a scaffold from file out/main_results/group0.ordering...
However, PacBio_denovo.fasta does contain tig00000015.
I am unable to figure out how to fix this.
Bhagya C T
I have run into this problem as well with the FASTA output of FALCON. When parsing the FASTA file, the function "LoadFasta" does not appear to parse the header lines correctly: instead of splitting off the contig name (the token immediately following ">"), the variable contig_name holds the entire header line (without ">"). The following modification of "LoadFasta" splits it correctly, and with this change I have successfully created the Lachesis assembly FASTA file. I had not looked at Perl code for some time, so this is a workaround, perhaps not the solution the authors would have chosen:
# LoadFasta: Convert a fasta file to contigs.
# Outputs:
# 1. An array of contig names.
# 2. A hash of contig name to contig sequence.
sub LoadFasta( $ ) {
    #print localtime() . ": LoadFasta: $_[0]\n";
    open IN, '<', $_[0] or die;
    my $contig_name;
    my @contig_names;
    my %contig_seqs;
    while (<IN>) {
        chomp;
        if ( /^\>(.+)/ ) {
            # Header line: keep only the first space-delimited token as the contig name.
            ( $contig_name ) = split / /, $1;
            push @contig_names, $contig_name;
        }
        else {
            # Sequence line: append to the most recently seen contig.
            $contig_seqs{$contig_name} .= $_;
        }
    }
    close IN;
    die "ERROR: LoadFasta: Couldn't parse file $_[0] properly. Are you sure this is a FASTA file?" unless scalar @contig_names >= 1 && scalar keys %contig_seqs >= 1;
    return ( \@contig_names, \%contig_seqs );
}
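To illustrate why the lookup fails, here is a minimal standalone sketch of the mismatch described above. The helper name contig_id_from_header and the example header fields (len=, reads=) are made up for illustration and are not part of Lachesis or of any particular assembler's output; the sketch only assumes that the ordering file refers to the first whitespace-delimited token of the header.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lachesis): return the contig ID from a
# FASTA header line, i.e. the text after '>' up to the first whitespace.
sub contig_id_from_header {
    my ($header) = @_;
    return $header =~ /^>(\S+)/ ? $1 : undef;
}

# Example header; the fields after the ID are invented for illustration.
my $header = '>tig00000015 len=123456 reads=789';

# Capturing everything after '>' keeps the whole header line,
# which is the behavior reported above for the unmodified parser.
my ($whole_line) = $header =~ /^>(.+)/;

# Capturing only the first token gives the name used in the ordering file.
my $id = contig_id_from_header($header);

print "whole header: '$whole_line'\n";
print "contig ID:    '$id'\n";
```

If the hash of sequences is keyed on the whole header line, a lookup by the bare ID 'tig00000015' finds nothing, which would explain the "not found in fasta file" error even though the contig is present.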