GUIDELINES.md
Guidelines for bioconda recipes

Requirements of a good bioconda recipe

  • Follows the Filesystem Hierarchy Standard (FHS). This is most important for libraries and Java packages; for these cases use one of the recipes below as a guideline.

  • Has adequate tests (see "Testing" section below)

  • License allows redistribution and license is indicated in meta.yaml

  • Does not already exist in the main Anaconda channel or the r channel. Exceptions: Anaconda version is too old or some sort of patch is required that is not in the main Anaconda channel (TODO: examples of the latter?)

  • If the recipe installs custom wrapper scripts, usage notes should be added to extra -> notes in the meta.yaml.

  • If uploading of an unreleased version is necessary, please follow the versioning scheme of conda for pre- and post-releases (e.g. using a, b, rc, and dev suffixes, see here).

Examples

The following recipes serve as examples of good recipes that can be used as guides or templates when developing new recipes.

Python

Use conda skeleton pypi <packagename>, where packagename is a package available on PyPI. Either run that command in the recipes dir or move the recipe it creates to recipes. The generated recipe can typically be used without modification, though its dependencies may also need recipes.

If the recipe was created using conda skeleton pypi, then the default test is probably sufficient. The exception is when the package also installs a command-line tool, in which case that should be tested as well.

  • typical import check: pysam
  • import and command-line tests: chanjo

R (CRAN)

Use conda skeleton cran <packagename>, where packagename is the case-sensitive name of a package available on CRAN. Either run that command in the recipes dir or move the recipe it creates to recipes. The recipe name will have an r- prefix and will be converted to lowercase. The generated recipe can typically be used without modification, though its dependencies may also need recipes.

If the recipe was created using conda skeleton cran or the scripts/bioconductor_skeleton.py script, the default test is probably sufficient.

  • typical R recipe from CRAN: r-locfit
  • recipe for R package not on CRAN, also with patch: spp

R (Bioconductor)

Use scripts/bioconductor/bioconductor_skeleton.py <packagename>, where packagename is the case-sensitive name of a package available on Bioconductor. The recipe name will have a bioconductor- prefix and will be converted to lowercase. The generated recipe can typically be used without modification, though its dependencies may also need recipes. Recipes for dependencies with an r- prefix should be created using conda skeleton cran; see above.

Java

Add a wrapper script if the software is typically called via java -jar .... For example, fastqc already has a wrapper script, but gatk-framework does not. chromhmm mimics the installation pattern of gatk-framework.

JAR files should go in $PREFIX/share/$PKG_NAME-$PKG_VERSION-$PKG_BUILDNUM. A wrapper script should be placed here as well, and symlinked to $PREFIX/bin.

  • Example: gatk-framework
  • Example with patch to fix memory: fastqc
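The JAR layout described above can be sketched as a build.sh helper. This is a minimal sketch, not the method used by the gatk-framework or fastqc recipes: the function name install_jar, the fallback variable values, and the convention of naming the wrapper $PKG_NAME and the JAR $PKG_NAME.jar are all assumptions made for illustration.

```shell
#!/bin/bash
set -eu

# conda-build sets these during a build; the fallbacks are hypothetical
# values so the sketch can also run outside one.
PREFIX="${PREFIX:-$(mktemp -d)}"
PKG_NAME="${PKG_NAME:-mytool}"
PKG_VERSION="${PKG_VERSION:-1.0}"
PKG_BUILDNUM="${PKG_BUILDNUM:-0}"

# install_jar JAR: copy JAR into the versioned share directory, write a
# wrapper script beside it, and symlink the wrapper into $PREFIX/bin.
install_jar() {
    local jar="$1"
    local sharedir="share/$PKG_NAME-$PKG_VERSION-$PKG_BUILDNUM"
    local wrapper="$PREFIX/$sharedir/$PKG_NAME"

    mkdir -p "$PREFIX/$sharedir" "$PREFIX/bin"
    cp "$jar" "$wrapper.jar"

    # The heredoc is quoted, so nothing in the wrapper is expanded now.
    cat > "$wrapper" <<'EOF'
#!/usr/bin/env bash
# Follow symlinks back to the real wrapper; the JAR sits next to it
# under the same name plus ".jar".
src="$0"
while [ -h "$src" ]; do
    dir="$(cd -P "$(dirname "$src")" && pwd)"
    src="$(readlink "$src")"
    case "$src" in /*) ;; *) src="$dir/$src" ;; esac
done
exec java -jar "$src.jar" "$@"
EOF
    chmod +x "$wrapper"

    # A relative link keeps working if the environment is relocated.
    ln -sf "../$sharedir/$PKG_NAME" "$PREFIX/bin/$PKG_NAME"
}
```

In a recipe's build.sh this would be called once, for example as install_jar mytool.jar, where mytool.jar stands for whatever JAR the source distribution ships.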

Perl

Use conda skeleton cpan <packagename> to build a recipe for Perl and place the recipe in the recipes dir. The recipe will have the perl- prefix.

The recipe as generated by conda skeleton cpan must be changed. The run and build requirements must specify perl-threaded instead of perl. Since some bioconda packages depend on a version of Perl compiled with threading support, a choice was made to have all recipes use perl-threaded to avoid maintaining multiple versions of each Perl module.

An example of such a package is perl-module-build.

Alternatively, add perl-app-cpanminus to the build requirements of the recipe, after which the build.sh script can be simplified. An example of this simplification is perl-time-hires.

If the recipe was created with conda skeleton cpan, the tests are likely sufficient. Otherwise, test the import of modules (see the imports section of the meta.yaml files in above examples).

C/C++

In general, standard make should work. Other build tools (e.g., autoconf) and compilers (e.g., gcc) should be specified in the build requirements.

We have not yet decided whether to have gcc as a conda package or to assume it is in the build environment. Until this decision is made, please add gcc (for Linux packages) and llvm (for OSX packages) to the meta.yaml as follows:

requirements:
  build:
    - gcc   # [not osx]
    - llvm  # [osx]

  run:
    - libgcc    # [not osx]

If the package uses zlib, then please see the ZLIB section below.

If your package links dynamically against a particular library, it is often necessary to pin the version against which it was compiled, in order to avoid ABI incompatibilities. Instead of hardcoding a particular version in the recipe, we use jinja templates to achieve this. For example, bioconda provides an environment variable CONDA_BOOST that contains the current major version of boost. You should pin your boost dependency against that version. An example is the salmon recipe. The libraries that can currently be pinned are listed in scripts/env_matrix.yml. If you need to pin another library, please notify @bioconda/core, and we will set up a corresponding environment variable.

General command-line tools

If a command-line tool is installed, it should be tested. If it has a shebang line, it should be patched to use /usr/bin/env for more general use.

An example of this is fastq-screen.
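The shebang patch mentioned above could be done from build.sh along these lines. This is a sketch under assumptions: patch_shebang is a hypothetical helper name, and the fastq-screen recipe may take a different approach (for example a patch file).

```shell
#!/bin/bash
set -eu

# patch_shebang INTERPRETER FILE: if the first line of FILE is a shebang
# with a hard-coded path to INTERPRETER, replace it with
# "#!/usr/bin/env INTERPRETER". Other lines are left untouched.
patch_shebang() {
    local interp="$1" file="$2"
    # -i.bak is accepted by both GNU sed and BSD (OSX) sed.
    sed -i.bak "1s|^#!.*/${interp}.*\$|#!/usr/bin/env ${interp}|" "$file"
    rm -f "$file.bak"
}
```

Interpreter options on the original shebang line (such as perl -w) are dropped on purpose, because Linux passes everything after /usr/bin/env as a single argument. A call might look like patch_shebang perl "$PREFIX/bin/some_script", where some_script is a placeholder.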

For command-line tools, running the program with no arguments, checking the program's version (e.g. with -v), or checking the command-line help is sufficient if doing so returns an exit code of 0. Often the output is piped to /dev/null to avoid output during recipe builds.

Examples:

Metapackages

Metapackages tie together other packages. All they do is define dependencies. For example, the hubward-all metapackage specifies the various other conda packages needed to get a full hubward installation running just by installing one package. Other metapackages might tie together conda packages with a common theme, for example all UCSC utilities related to bigBed files, or a set of packages useful for variant calling.

For metapackages that are not anchored to a particular package (as in the last example above), we recommend semantic versioning starting at 1.0.0.

Other examples

Examples of somewhat non-standard recipes, in no particular order:

  • blast has an OS-specific installation -- OSX copies binaries while on Linux it is compiled.
  • graphviz has an OS-specific option to configure
  • crossmap removes libs that are shipped with the source distribution
  • hisat2 runs 2to3 to make it Python 3 compatible, and copies over individual scripts to the bin dir
  • htslib has a small test script that creates example data and runs multiple programs on it
  • spectacle runs 2to3 to make the wrapper script Python 3 compatible, patches the wrapper script to have a shebang line, deletes example data to avoid taking up space in the bioconda channel, and includes a script for downloading the example data separately.
  • gatk is a package for licensed software that cannot be redistributed. The package installs a placeholder script (in this case doubling as the jar wrapper) to alert the user if the program is not installed, along with a separate script (gatk-register) to copy in a user-supplied archive/binary to the conda environment

Name collisions

In some cases, there may be a name collision when writing a recipe. For example, the wget recipe is for the standard command-line tool. There is also a Python package called wget on PyPI. In this case, we prefixed the Python package with python- (see python-wget). A similar collision was resolved with weblogo and python-weblogo.

If in doubt about how to handle a naming collision, please submit an issue.

Tests

An adequate test must be included in the recipe. What counts as "adequate" depends on the recipe, but the test must be able to detect a successful installation. While many packages ship their own test suite (unit tests or otherwise), including these in the recipe is not recommended, since they may cause the build to time out on Travis-CI.

Note that a test must return an exit code of 0. The test can be in the test field of meta.yaml, or can be a separate script (see the relevant conda docs for testing).

It is recommended to pipe unneeded stdout/stderr to /dev/null to avoid cluttering the output in the Travis-CI build environment.
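A separate test script following these recommendations might look like the sketch below. The helper name quiet is an assumption, and bash --version is only a stand-in so the sketch runs anywhere; a real recipe would call the installed tool instead.

```shell
#!/bin/bash
# Sketch of a test script: with "set -e", the script exits non-zero as
# soon as any check fails.
set -e

# quiet CMD [ARGS...]: run a command with stdout and stderr discarded and
# return its exit code unchanged.
quiet() {
    "$@" > /dev/null 2>&1
}

# Stand-in check; replace with the packaged tool, e.g. a hypothetical
# "mytool --help".
quiet bash --version
```

Only the exit code matters to the build, so discarding the output loses nothing while keeping the build log short.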

Versions

If an older version is required, put it in a subdirectory of the recipe. Examples of this can be found in bowtie2, bx-python, and others.
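One way to set up such a subdirectory is sketched below. The helper archive_recipe_version and the example path are hypothetical, and the sketch assumes the recipe's files (meta.yaml, build.sh, patches) sit directly in the recipe directory; compare with the bowtie2 recipe before relying on this layout.

```shell
#!/bin/bash
set -eu

# archive_recipe_version RECIPE_DIR VERSION: copy the recipe's top-level
# files into RECIPE_DIR/VERSION, leaving the originals in place so they
# can then be updated to the newer version.
archive_recipe_version() {
    local recipe_dir="$1" version="$2" f
    mkdir -p "$recipe_dir/$version"
    for f in "$recipe_dir"/*; do
        # Skip subdirectories, including previously archived versions.
        if [ -f "$f" ]; then
            cp "$f" "$recipe_dir/$version/"
        fi
    done
}
```

For example, archive_recipe_version recipes/mytool 1.0 would produce recipes/mytool/1.0/meta.yaml alongside the top-level recipes/mytool/meta.yaml.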

ZLIB

If a package depends on zlib, then it will most likely also depend on gcc (on Linux) and llvm (on OSX). The meta.yaml requirements section should therefore at least have the following:

requirements:
  build:
    - gcc   # [not osx]
    - llvm  # [osx]
    - zlib

  run:
    - libgcc    # [not osx]
    - zlib

When building the package, you may get an error saying that zlib.h can't be found, despite having zlib listed in the dependencies. This is because the location of zlib often has to be specified in the build.sh script, which can be done like this:

export CFLAGS="-I$PREFIX/include"
export LDFLAGS="-L$PREFIX/lib"

Or sometimes:

export CPATH=${PREFIX}/include

Sometimes Makefiles may specify these locations, in which case they need to be edited. See the samtools recipe for an example of this.
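Where a Makefile hard-codes its own flags, one way to edit it from build.sh is a sed substitution. The sketch below assumes the Makefile contains plain "CFLAGS = ..." and "LDFLAGS = ..." assignment lines; the helper name add_prefix_paths and the fallback PREFIX are hypothetical, and the samtools recipe may handle this differently.

```shell
#!/bin/bash
set -eu

# conda-build sets PREFIX; the fallback is a hypothetical path so the
# sketch can run outside a build.
PREFIX="${PREFIX:-/tmp/example-prefix}"

# add_prefix_paths MAKEFILE: append the conda include and lib directories
# to existing "CFLAGS = ..." and "LDFLAGS = ..." lines.
add_prefix_paths() {
    local makefile="$1"
    # -i.bak is accepted by both GNU sed and BSD (OSX) sed.
    sed -i.bak \
        -e "s|^\(CFLAGS[[:space:]]*=.*\)\$|\1 -I${PREFIX}/include|" \
        -e "s|^\(LDFLAGS[[:space:]]*=.*\)\$|\1 -L${PREFIX}/lib|" \
        "$makefile"
    rm -f "$makefile.bak"
}
```

Lines using other assignment forms (such as CFLAGS += or CFLAGS :=) are not matched by this pattern and would need their own substitution or a patch file.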