Skip to content

Latest commit

 

History

History
executable file
·
68 lines (58 loc) · 2.61 KB

README.md

File metadata and controls

executable file
·
68 lines (58 loc) · 2.61 KB

kMetaShot

Table of content

  1. Introduction
  2. Install
  3. Usage

INTRODUCTION

The application of 2nd and 3rd generation High Throughput Sequencing (HTS) technologies has deeply reshaped experimental method to investigate microbial communities and obtain a taxonomic and functional profile of the invetigated community. Shotgun Metagenomics allow to quickly obtain a representation of microorganisms genomes characterizing a particular environment. In order to obtain a fast e reliable taxonomic classification of microorganisms genomes we present kMetaShot, an alignment-free taxonomic classifier based on k-mer/minimizer counting.

INSTALL

kMetaShot is available through conda. To install it type the following line:

 conda create --name kmetashot kmetashot -c conda-forge -c gdefazio

To activate the environment:

conda activate kmetashot

kMetaShot requires a reference file available at this link. It requires about 22Gb of storage.
To download it you can simply use wget:

wget http://srv00.recas.ba.infn.it/webshare/brunofosso/kMetaShot_reference.h5

Before to use kMetaShot you may test the installation typing the following line:

kMetaShot_test.py -r /path/to/kMetaShot_reference.h5

USAGE

kMetaShot_classifier_NV.py 
                -b bins/
                -r kMetaShot_reference/kMetaShot_bacteria_archaea.h5',
                -p 10
                -o output_dir
                -a 0.1
                
Arguments:
  -h, --help            show this help message and exit
  -b , --bins_dir (char)
                        Path to a directory containing bins fasta files or 
                        path to a multi-fasta file where each header corresponds
                        to a bin/MAG
  -r , --reference (char)
                        Path to HDF5 kMetaShot reference
  -p , --processes (int)
                        Number of child processes for a Multiprocess parallelism. 
                        Warning: high parallelism <==> high RAM usage
  -o , --out_dir (char)
                        Output directory
  -a , --ass2ref (float)
                        Classification filtering based on ass2ref parameter ranging
                        between 0 and 1. Default 0. 
                        ass2ref is a ratio between the number of MAG minimizers
                        and the reference minimizers related to the assigned strain

kmetashot is also available as Docker container:

docker run -it ibiomcnr/kmetashot kMetaShot_classifier_NV.py --help