WIsH (Who Is the Host) is a software tool that builds models for predicting the hosts of phages. The scripts in this repository search for the model parameters that give the most accurate predictions for a particular dataset.
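One way to picture this kind of parameter search is an exhaustive grid search scored against phages with known hosts. The sketch below is written in Python purely for illustration and is not the repository's R code. The names `accuracy`, `grid_search` and `toy_predict`, the parameter values, and the use of a benchmark set with known hosts are all assumptions.

```python
from itertools import product


def accuracy(predicted, known):
    """Fraction of phages whose predicted host matches the known host."""
    hits = sum(1 for phage, host in known.items() if predicted.get(phage) == host)
    return hits / len(known)


def grid_search(predict, known_hosts, grid):
    """Score every parameter combination and return (best_params, best_accuracy)."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = accuracy(predict(**params), known_hosts)
        if score > best_score:  # the first combination wins ties
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical benchmark: phages whose hosts are already known.
known_hosts = {"phage_1": "Escherichia", "phage_2": "Salmonella"}


def toy_predict(model_order, p_value_cutoff):
    """Stand-in for running WIsH and post-processing its output (assumption)."""
    if model_order == 8 and p_value_cutoff == 0.05:
        return {"phage_1": "Escherichia", "phage_2": "Salmonella"}
    return {"phage_1": "Escherichia", "phage_2": "Pseudomonas"}


grid = {"model_order": [6, 8], "p_value_cutoff": [0.05, 0.1]}
best_params, best_score = grid_search(toy_predict, known_hosts, grid)
print(best_params, best_score)
```

In a real run, `toy_predict` would be replaced by whatever step produces host predictions for a given parameter combination.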
In the present setting, there is a set of phages (e.g. phages from a metagenomic sample) whose hosts we would like to predict. WIsH is well suited to this kind of analysis when the contigs are short, which is often the case in NGS shotgun viromics data. The parameters considered are:
- Model order
- P-value cutoff
- Method for evaluating the prediction results (best prediction, majority vote of the X best predictions, or lowest common ancestor (LCA) of the X best predictions)
For more information on the parameters, please see the publication (and supplementary material) on WIsH.
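The p-value cutoff and the three evaluation methods listed above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the repository's R code. The function names, the tie-breaking rule, the simplified lineages, and the example p-values are all assumptions.

```python
from collections import Counter

# Simplified root-to-genus lineages, for illustration only.
LINEAGES = {
    "Escherichia": ["Bacteria", "Gammaproteobacteria", "Enterobacteriaceae", "Escherichia"],
    "Salmonella": ["Bacteria", "Gammaproteobacteria", "Enterobacteriaceae", "Salmonella"],
    "Pseudomonas": ["Bacteria", "Gammaproteobacteria", "Pseudomonadaceae", "Pseudomonas"],
}


def filter_by_p_value(predictions, cutoff):
    """Keep the hosts whose p-value is at or below the cutoff, preserving rank order."""
    return [host for host, p_value in predictions if p_value <= cutoff]


def best_prediction(ranked_hosts):
    """Return the top-ranked host, or None if nothing passed the cutoff."""
    return ranked_hosts[0] if ranked_hosts else None


def majority_vote(ranked_hosts, x):
    """Most frequent host among the X best predictions; ties go to the better-ranked host."""
    top = ranked_hosts[:x]
    if not top:
        return None
    counts = Counter(top)
    best_count = max(counts.values())
    return next(host for host in top if counts[host] == best_count)


def lowest_common_ancestor(ranked_hosts, x):
    """Longest shared lineage prefix of the X best predictions."""
    lineages = [LINEAGES[host] for host in ranked_hosts[:x]]
    common = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return common


# Hypothetical predictions for one phage, ranked best first: (host genus, p-value).
predictions = [("Escherichia", 0.01), ("Salmonella", 0.02),
               ("Salmonella", 0.03), ("Pseudomonas", 0.20)]

kept = filter_by_p_value(predictions, cutoff=0.05)
print(best_prediction(kept))
print(majority_vote(kept, 3))
print(lowest_common_ancestor(kept, 3))
```

The three methods can disagree on the same ranked list: the best prediction here is Escherichia, the majority vote of the three best is Salmonella, and the LCA of the three best stops at the family level.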
After installing R and RStudio (for versions, see 'Software' below), clone the repository:
git clone https://github.com/eregenyi/WIsH-model-optimization
All other scripts can be run from the master script.
Keep in mind that some parts of the code may need to be tailored to the needs of your own dataset.
The scripts were written using:
R version 3.4.0
RStudio Version 1.0.143
When contributing to the repository, for example by reporting a bug, submitting a fix or suggesting a new feature, please open an issue to discuss the changes you wish to make. Contributions are governed by our Code of Conduct and will be made under the GPLv3 license.
GPLv3 License - see the LICENSE file for details.
- construct dummy datasets for reproducibility