Hi @NailouZhang,
Thank you for notifying me about this error. I am refactoring my scripts so that they no longer use this database.
I plan to push the new version of the code this month if everything goes well.
For now, you can skip the database installation, as it will no longer be needed. The first steps of the pipeline (readsoustraction, demultiplex, assembly, map) don't need the database anyway.
Hi @marieBvr,
I ran into the same issue as in #1. I continued with the rest of the installation pipeline anyway, but the size of taxonomy.tmp.sqlite did not change. Can you help me resolve it?
Sincerely yours,
Nailou
PS: The installation information is as follows:
# Activate the conda environment
cd ~/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot
source ~/20T/DataBase/SoftwaresEnsembel/MiniConda/Source.sh
conda activate VirAnnot
export PERL5LIB=/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/lib:$PERL5LIB
export PATH=$PATH:/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/Tools
export PATH=/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/tools:/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/launchers:$PATH
export PATH=$PATH:/home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/db
# Download & install the datasets
####################################################################################################
cd /home/stone/20T/DataBase/SoftwaresEnsembel/BigDataBase
#NCBI Taxonomy
#Download and extract NCBI taxonomy files.
#wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz ;
tar -xf taxdump.tar.gz;
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz ;
gunzip prot.accession2taxid.gz;
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/nucl_gb.accession2taxid.gz ;
gunzip nucl_gb.accession2taxid.gz;
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/dead_prot.accession2taxid.gz ;
gunzip dead_prot.accession2taxid.gz;
cat prot.accession2taxid dead_prot.accession2taxid > acc2taxid.prot
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/nucl_wgs.accession2taxid.gz ;
gunzip nucl_wgs.accession2taxid.gz;
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/dead_wgs.accession2taxid.gz ;
gunzip dead_wgs.accession2taxid.gz
cat nucl_wgs.accession2taxid nucl_gb.accession2taxid dead_wgs.accession2taxid > acc2taxid.nucl
#wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/dead_nucl.accession2taxid.gz;
gunzip dead_nucl.accession2taxid.gz;
#create gi_taxid_prot_new.dmp
awk '{ print $4 " " $3}' acc2taxid.prot > gi_taxid_prot_temp1.dmp
tail -n +2 gi_taxid_prot_temp1.dmp > gi_taxid_prot_temp2.dmp
rm gi_taxid_prot_temp1.dmp
tr ' ' '\t' < gi_taxid_prot_temp2.dmp > gi_taxid_prot_new.dmp
rm gi_taxid_prot_temp2.dmp
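The awk, tail, and tr steps above could probably be collapsed into a single awk pass. The sketch below assumes the standard four-column accession2taxid layout (accession, accession.version, taxid, gi) and that every header row has the literal word `gi` in column 4. Because acc2taxid.prot is a concatenation of two files, it may contain a second header row in the middle, which `tail -n +2` would not remove. The function name `make_gi_taxid` is hypothetical and not part of virAnnot.

```shell
# Hypothetical helper: print "gi<TAB>taxid" for every data row of an
# accession2taxid-style file, skipping header rows wherever they occur.
make_gi_taxid() {
    awk -v OFS='\t' '$4 != "gi" { print $4, $3 }' "$1"
}

# Usage, with the file names from the commands above:
#   make_gi_taxid acc2taxid.prot > gi_taxid_prot_new.dmp
```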
ln -s /home/stone/20T/DataBase/SoftwaresEnsembel/BigDataBase/* /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/db/
cd /home/stone/20T/DataBase/SoftwaresEnsembel/MAG/virAnnot/db/
mv gi_taxid_prot_new.dmp gi_taxid_prot.dmp
sed -i 's|#!|#!/|g' loadTaxonomy.pl
sed -i 's/exit(1)/exit(1)}/g' loadTaxonomy.pl
loadTaxonomy.pl \
-struct taxonomyStructure.sql \
-index taxonomyIndex.sql \
-acc_prot acc2taxid.prot \
-acc_nucl acc2taxid.nucl \
-names names.dmp \
-nodes nodes.dmp \
-gi_prot gi_taxid_prot.dmp
2022/01/03 13:53:34 INFO> loadTaxonomy.pl:120 main::_create_sqlite_db - Creating database.
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:76 main::_insertingCSVDataInDatabase - Inserting tables into database...
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:78 main::_insertingCSVDataInDatabase - nodes
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:78 main::_insertingCSVDataInDatabase - gi_prot
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:78 main::_insertingCSVDataInDatabase - names
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:78 main::_insertingCSVDataInDatabase - prot_accession2taxid
2022/01/03 13:53:35 INFO> loadTaxonomy.pl:78 main::_insertingCSVDataInDatabase - nucl_accession2taxid
ll -h taxonomy.tmp.sqlite
-rw-rw-r-- 1 stone stone 80K Jan 3 13:53 taxonomy.tmp.sqlite
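An 80K file is far smaller than the accession2taxid inputs, which suggests the tables may have been created but left empty. One way to check is to count the rows in each table with the `sqlite3` command-line shell. This is only a diagnostic sketch: the helper name `count_rows` is hypothetical, and the table names are read from `sqlite_master` rather than assumed.

```shell
# Hypothetical helper: print "table<TAB>row count" for every table in an SQLite file.
count_rows() {
    db=$1
    sqlite3 "$db" "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;" |
    while IFS= read -r table; do
        # </dev/null keeps the inner sqlite3 from touching the loop's input
        n=$(sqlite3 "$db" "SELECT COUNT(*) FROM \"$table\";" </dev/null)
        printf '%s\t%s\n' "$table" "$n"
    done
}

# Usage:
#   count_rows taxonomy.tmp.sqlite
```

If every count is 0, the problem is likely in the loading step rather than in the later pfam steps.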
tar -xzf fasta.tar.gz;
mkdir pfam
mv pfam*.FASTA pfam/
rm *.FASTA
mv pfam/pfam* .
ls -1 pfam*.FASTA |
sed 's,^\(.*\)\.FASTA,./gi2taxonomy.pl -i & -o \1.tax.txt -db taxonomy.tmp.sqlite -r,' | bash
# Create a file listing the paths of the *.tax.txt files:
listPath.pl -d . | grep 'tax.txt' > idx
# Compute taxonomy statistics for each domain and create an SQL file to load into the database:
taxo_profile_to_sql.pl -i idx > taxo_profile.sql # the output has just two lines???
# Load into the database: nothing seems to be imported, because the size of taxonomy.tmp.sqlite does not change
sqlite3 taxonomy.tmp.sqlite < taxo_profile.sql
####################################################################################################
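The per-file gi2taxonomy.pl commands built with sed above are piped straight into bash, so a quoting or escaping problem in the sed expression can fail without an obvious trace. As a sketch, the same command lines can be generated with a plain shell loop and inspected before they are run. The helper name `gen_cmds` is hypothetical. The options `-i`, `-o`, `-db` and `-r` are copied from the command in this post.

```shell
# Hypothetical helper: read FASTA file names on stdin and print one
# gi2taxonomy.pl command line per file (nothing is executed here).
gen_cmds() {
    while IFS= read -r f; do
        printf './gi2taxonomy.pl -i %s -o %s.tax.txt -db taxonomy.tmp.sqlite -r\n' \
            "$f" "${f%.FASTA}"
    done
}

# Inspect the generated commands first:
#   ls -1 pfam*.FASTA | gen_cmds
# Then run them:
#   ls -1 pfam*.FASTA | gen_cmds | bash
```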