
Finding the optimal WIsH model for your phage dataset in R. Part of my thesis work, at the Laboratory of Viral Metagenomics at Rega Institute. @kuleuven


eregenyi/WIsH-model-optimization


Optimization of a WIsH model

WIsH (Who Is the Host) is a software tool that builds a model for predicting the hosts of phages. The scripts in this repository search for the model parameters that give the most accurate predictions for a particular dataset.

Hypothetical experimental setup

In this setting, there is a set of phages (e.g. phages from a metagenomic sample) whose hosts we would like to predict. WIsH is well suited to this analysis when the contigs are short, as is often the case in NGS shotgun viromics data.

Parameters to optimize

  • Model order
  • P-value cutoff
  • Evaluation method of prediction results (Best prediction, Majority vote of X best predictions, LCA of X best predictions)

For more information on the parameters, please see the publication (and supplementary material) on WIsH.
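
The three evaluation methods listed above can be illustrated with a minimal sketch in R. This is not the code from this repository: the data frame, its column names (host, loglik, pvalue, family, genus), the function names and all values are hypothetical and made up for illustration. It also assumes that a higher log-likelihood means a better prediction, that predictions with a p-value above the cutoff are discarded, and that ties in the majority vote go to the best-scoring candidate.

```r
# Hypothetical predictions for ONE phage. All hosts, scores and p-values
# are invented illustration values, not real WIsH output.
preds <- data.frame(
  host   = c("hostA", "hostB", "hostC", "hostD"),
  loglik = c(-1.30, -1.32, -1.35, -1.50),
  pvalue = c(0.01, 0.02, 0.03, 0.20),
  family = c("Enterobacteriaceae", "Enterobacteriaceae",
             "Enterobacteriaceae", "Bacillaceae"),
  genus  = c("Escherichia", "Escherichia", "Salmonella", "Bacillus"),
  stringsAsFactors = FALSE
)

# Keep predictions that pass the p-value cutoff, sort them best-first
# (assumption: higher log-likelihood is better) and return the top x.
top_hits <- function(preds, x, p_cutoff) {
  kept <- preds[preds$pvalue <= p_cutoff, , drop = FALSE]
  kept <- kept[order(kept$loglik, decreasing = TRUE), , drop = FALSE]
  head(kept, x)
}

# Best prediction: the taxon of the single best-scoring host.
best_prediction <- function(preds, p_cutoff, rank = "genus") {
  hits <- top_hits(preds, 1, p_cutoff)
  if (nrow(hits) == 0) return(NA_character_)
  hits[[rank]]
}

# Majority vote: the most frequent taxon among the x best predictions.
# Ties go to the tied taxon with the best-scoring host (assumption).
majority_vote <- function(preds, x, p_cutoff, rank = "genus") {
  hits <- top_hits(preds, x, p_cutoff)
  if (nrow(hits) == 0) return(NA_character_)
  votes <- table(hits[[rank]])
  winners <- names(votes)[votes == max(votes)]
  hits[[rank]][hits[[rank]] %in% winners][1]
}

# LCA: the most specific rank at which the x best predictions all agree.
# `ranks` must be ordered from broad to specific.
lca <- function(preds, x, p_cutoff, ranks = c("family", "genus")) {
  hits <- top_hits(preds, x, p_cutoff)
  if (nrow(hits) == 0) return(NA_character_)
  result <- NA_character_
  for (r in ranks) {
    taxa <- unique(hits[[r]])
    if (length(taxa) != 1) break
    result <- taxa
  }
  result
}

best_prediction(preds, p_cutoff = 0.05)       # "Escherichia"
majority_vote(preds, x = 3, p_cutoff = 0.05)  # "Escherichia"
lca(preds, x = 3, p_cutoff = 0.05)            # "Enterobacteriaceae"
```

With these made-up values, the p-value cutoff of 0.05 removes hostD, so the three remaining hits agree on the family but not on the genus. The LCA therefore reports a family, while the other two methods report a genus. This shows how the choice of evaluation method trades taxonomic resolution against agreement among the top predictions.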

Quick start

After installing R and RStudio (for versions, see 'Software' below), clone the repository.

git clone https://github.com/eregenyi/WIsH-model-optimization

All other scripts can be run from the master script.
Keep in mind that some parts of the code may need to be tailored to your own dataset.

Software

The scripts were written using:
R version 3.4.0
RStudio Version 1.0.143

Contributing

When contributing to the repository, for example by reporting a bug, submitting a fix or suggesting a new feature, please open an issue to discuss the changes you wish to make. Contributions are governed by our Code of Conduct and will be made under the GPLv3 license.

License

GPLv3 License - see the LICENSE file for details.

To do

  • construct dummy datasets for reproducibility
