Rawasm is a patch to the popular miniasm tool. It enables the construction of genome assembly from raw nanopore signals.

CMU-SAFARI/rawasm

Overview

Rawasm is the first software tool that enables the construction of genome assemblies from raw nanopore signals. It mostly reuses the miniasm features, but adds support for the FAST5, POD5, and SLOW5 formats. Rawasm can be used in a pipeline with RawHash2, using its Rawsamble overlapping feature.

Installation

To install Rawasm, do the following:

  • Clone the repository
    git clone https://github.com/CMU-SAFARI/rawasm.git rawasm
  • Install Rawasm
    cd rawasm && make

The Rawasm Makefile downloads a fresh copy of miniasm and patches it. The default installation directory is rawasm/miniasm, but you can change it in the Makefile. If you already have a local copy of miniasm, you can specify its location in the Makefile. In that case, use the following command, which skips the miniasm download:

make install

Rawasm is a self-contained implementation that can be downloaded and run. It uses a set of pre-compiled static libraries. However, depending on your system, you might need to recompile them. In that case, you have two options:

  • Refer to RawHash2 repo for SLOW5, POD5 and FAST5 libraries compilation.
  • Compile the libraries using gcc and ar from the original repos:
    • libhdf5.a : compile from the HDF Group repo.
    • libpod5_format.a, libarrow.a, libjemalloc_pic.a, libzstd.a : compile from POD5 format repo.
    • libslow5.a : compile from Slow5 Tools repo.

Finally, Rawasm requires libuuid. To build it as a static library, do the following:

  • Download libuuid
git clone https://github.com/cloudbase/libuuid.git libuuid
  • Build .o files (run the following for each .c or just write a Makefile/script to automate)
gcc -c -g -Wall -O2 -Wno-all  -Wno-write-strings -Wno-deprecated-declarations -Wcpp -I. file.c -o file.o
  • Generate the library
ar rcs libuuid.a *.o
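
The per-file compile step and the archive step above can be automated with a small script. The following is a minimal sketch, not part of Rawasm: `build_libuuid` is a hypothetical helper name, it assumes the `.c` files sit directly in the directory you pass to it, and it reuses the compiler flags shown above. `CC` and `AR` default to `gcc` and `ar`.

```shell
# Sketch: build libuuid.a from every .c file in a source directory.
# build_libuuid is a hypothetical helper, not shipped with Rawasm.
build_libuuid() {
    src_dir="${1:-.}"
    (
        cd "$src_dir" || exit 1
        for src in *.c; do
            # Skip the literal pattern when no .c files are present.
            [ -e "$src" ] || continue
            # Same flags as the manual command above, one object per source.
            "${CC:-gcc}" -c -g -Wall -O2 -Wno-all -Wno-write-strings \
                -Wno-deprecated-declarations -Wcpp -I. \
                "$src" -o "${src%.c}.o" || exit 1
        done
        # Archive all object files into the static library.
        "${AR:-ar}" rcs libuuid.a *.o
    )
}

# Usage (assumes the clone from the first step lives in ./libuuid):
#   build_libuuid libuuid && cp libuuid/libuuid.a lib/
```

The subshell keeps the `cd` from affecting the calling shell, and the function stops at the first file that fails to compile.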

All static libraries must be in the lib directory before running the make or make install command.
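
Before building, it can help to confirm that the libraries are in place. The sketch below is a hypothetical helper (`check_rawasm_libs` is not part of Rawasm) that checks for the library files named in this section; it assumes that list is complete, which may not hold for every system.

```shell
# Sketch: report which of the static libraries named in this README
# are missing from the lib directory.
# check_rawasm_libs is a hypothetical helper, not shipped with Rawasm.
check_rawasm_libs() {
    lib_dir="${1:-lib}"
    missing=0
    for lib in libhdf5.a libpod5_format.a libarrow.a libjemalloc_pic.a \
               libzstd.a libslow5.a libuuid.a; do
        if [ ! -f "$lib_dir/$lib" ]; then
            echo "missing: $lib_dir/$lib" >&2
            missing=1
        fi
    done
    # Returns 0 when every library is present, 1 otherwise.
    return "$missing"
}

# Usage, from the rawasm directory:
#   check_rawasm_libs lib && make
```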

Usage

By default, Rawasm supports all miniasm features. In addition, it introduces two new ones:

  • Processing of single or multiple FAST5, POD5, or S/BLOW5 input files
  • Output of the assembly as FAST5, POD5, or S/BLOW5 unitig files. For more information about unitigs, see miniasm.

Using Rawasm is straightforward.

./miniasm -f input_data[.fast5/pod5/slow5/blow5] overlaps.paf -H outdir > assembly.gfa

input_data can be either a single file (fast5, pod5, slow5, or blow5) or a directory containing multiple files. If it is a directory, do not mix files of different types. overlaps.paf is the all-vs-all overlaps file produced by RawHash2. outdir specifies the output directory for the unitig files. Rawasm creates one unitig file for each distinct unitig in the assembly, in the same format as the input. assembly.gfa is the text output of the assembly.
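
Because a directory input must not mix file types, a quick pre-flight check can catch mistakes before a long run. The following is a minimal sketch under stated assumptions: `single_signal_type` is a hypothetical helper, it looks only at the top level of the directory, and it treats each of the four extensions as a separate type (whether Rawasm accepts slow5 and blow5 together is not stated here).

```shell
# Sketch: verify that an input directory holds exactly one signal file type.
# single_signal_type is a hypothetical helper, not shipped with Rawasm.
single_signal_type() {
    dir="$1"
    # Collect the distinct extensions of signal files at the top level.
    types=$(find "$dir" -maxdepth 1 -type f \
                \( -name '*.fast5' -o -name '*.pod5' \
                   -o -name '*.slow5' -o -name '*.blow5' \) \
            | sed 's/.*\.//' | sort -u)
    # Count non-empty lines; grep exits non-zero on zero matches, hence || true.
    count=$(printf '%s\n' "$types" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one file type in $dir, found $count" >&2
        return 1
    fi
    # Print the single extension that was found.
    printf '%s\n' "$types"
}

# Usage with the command shown above (input_dir is a placeholder):
#   single_signal_type input_dir && \
#       ./miniasm -f input_dir overlaps.paf -H outdir > assembly.gfa
```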

Cite Rawsamble

If you use Rawasm in your work, please consider citing the following papers:

TBD
