<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>4. Building a standard database — ProPhyle 0.3.3.2 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=d10597a4" />
<link rel="stylesheet" type="text/css" href="_static/sphinx13.css?v=ec60a4c3" />
<script src="_static/documentation_options.js?v=e7e3125d"></script>
<script src="_static/doctools.js?v=9a2dae69"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<link rel="search" type="application/opensearchdescription+xml"
title="Search within ProPhyle 0.3.3.2 documentation"
href="_static/opensearch.xml"/>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="5. Building a custom database" href="custom_db.html" />
<link rel="prev" title="3. Installing ProPhyle" href="install.html" />
<link href='https://fonts.googleapis.com/css?family=Open+Sans:300,400,700'
rel='stylesheet' type='text/css' />
<style type="text/css">
table.right { float: right; margin-left: 20px; }
table.right td { border: 1px solid #ccc; }
</style>
<script type="text/javascript">
// intelligent scrolling of the sidebar content
$(window).scroll(function() {
var sb = $('.sphinxsidebarwrapper');
var win = $(window);
var sbh = sb.height();
var offset = $('.sphinxsidebar').position()['top'];
var wintop = win.scrollTop();
var winbot = wintop + win.innerHeight();
var curtop = sb.position()['top'];
var curbot = curtop + sbh;
// does sidebar fit in window?
if (sbh < win.innerHeight()) {
// yes: easy case -- always keep at the top
sb.css('top', $u.min([$u.max([0, wintop - offset - 10]),
$(document).height() - sbh - 200]));
} else {
// no: only scroll if top/bottom edge of sidebar is at
// top/bottom edge of window
if (curtop > wintop && curbot > winbot) {
sb.css('top', $u.max([wintop - offset - 10, 0]));
} else if (curtop < wintop && curbot < winbot) {
sb.css('top', $u.min([winbot - sbh - offset - 20,
$(document).height() - sbh - 200]));
}
}
});
</script>
</head><body>
<div class="pageheader">
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="install.html">Get it</a></li>
<li><a href="contents.html">Docs</a></li>
<li><a href="http://github.com/prophyle/prophyle">Extend/Develop</a></li>
</ul>
<div>
<a href="index.html">
<!--
<img src="_static/sphinxheader.png" alt="SPHINX" />
-->
<div style="color:white;font-size:240%">ProPhyle</div>
<div style="color:white;font-size:140%">DNA sequence classification</div>
</a>
</div>
</div>
<div class="related" role="navigation" aria-label="Related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="custom_db.html" title="5. Building a custom database"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="install.html" title="3. Installing ProPhyle"
accesskey="P">previous</a> |</li>
<li><a href="index.html">ProPhyle home</a> |</li>
<li><a href="contents.html">Documentation</a> »</li>
<li class="nav-item nav-item-this"><a href=""><span class="section-number">4. </span>Building a standard database</a></li>
</ul>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="Main">
<div class="sphinxsidebarwrapper">
<div>
<h3><a href="contents.html">Table of Contents</a></h3>
<ul>
<li><a class="reference internal" href="#">4. Building a standard database</a><ul>
<li><a class="reference internal" href="#downloading-genomes">4.1. Downloading genomes</a></li>
<li><a class="reference internal" href="#index-construction">4.2. Index construction</a></li>
</ul>
</li>
</ul>
</div>
<div>
<h4>Previous topic</h4>
<p class="topless"><a href="install.html"
title="previous chapter"><span class="section-number">3. </span>Installing ProPhyle</a></p>
</div>
<div>
<h4>Next topic</h4>
<p class="topless"><a href="custom_db.html"
title="next chapter"><span class="section-number">5. </span>Building a custom database</a></p>
</div>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="_sources/standard_db.rst.txt"
rel="nofollow">Show Source</a></li>
</ul>
</div>
<search id="searchbox" style="display: none" role="search">
<h3 id="searchlabel">Quick search</h3>
<div class="searchformwrapper">
<form class="search" action="search.html" method="get">
<input type="text" name="q" aria-labelledby="searchlabel" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false"/>
<input type="submit" value="Go" />
</form>
</div>
</search>
<script>document.getElementById('searchbox').style.display = "block"</script>
</div>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<section id="building-a-standard-database">
<span id="index-0"></span><span id="standard-db"></span><h1><span class="section-number">4. </span>Building a standard database<a class="headerlink" href="#building-a-standard-database" title="Link to this heading">¶</a></h1>
<p>ProPhyle comes with several genome libraries containing
RefSeq genomes, augmented with the NCBI taxonomy.</p>
<section id="downloading-genomes">
<h2><span class="section-number">4.1. </span>Downloading genomes<a class="headerlink" href="#downloading-genomes" title="Link to this heading">¶</a></h2>
<p>These libraries can be downloaded using <code class="docutils literal notranslate"><span class="pre">prophyle</span> <span class="pre">download</span> <span class="pre">&lt;library&gt;</span> <span class="pre">[&lt;library&gt;</span> <span class="pre">...]</span></code>,
where <code class="docutils literal notranslate"><span class="pre">&lt;library&gt;</span></code> should be replaced by <code class="docutils literal notranslate"><span class="pre">bacteria</span></code>, <code class="docutils literal notranslate"><span class="pre">viruses</span></code>, or <code class="docutils literal notranslate"><span class="pre">plasmids</span></code>.
The command also copies a prebuilt Newick/NHX tree for each specified library.
If the <cite>-d</cite> parameter is not specified, all files are placed in <cite>~/prophyle</cite>.</p>
<p>To download all viral and bacterial genomes from RefSeq, execute</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>prophyle<span class="w"> </span>download<span class="w"> </span>bacteria<span class="w"> </span>viruses
</pre></div>
</div>
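<p>As a minimal sketch (not part of ProPhyle itself), a small shell helper can check the library names before a download is started. The function name <cite>prophyle_download_cmd</cite> is hypothetical, and the helper only prints the resulting command so that it can be inspected first; it assumes that <cite>-d</cite> takes the target directory as its argument.</p>

```shell
# Hypothetical helper: print a `prophyle download` command for the given
# target directory and libraries, rejecting unknown library names.
# Usage: prophyle_download_cmd DIR LIBRARY [LIBRARY ...]
prophyle_download_cmd() {
    dir="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo "error: no library given" >&2
        return 1
    fi
    for lib in "$@"; do
        case "$lib" in
            bacteria|viruses|plasmids) ;;   # libraries named in this chapter
            *)
                echo "error: unknown library '$lib'" >&2
                return 1
                ;;
        esac
    done
    # Assumption: -d DIR selects the directory for downloaded files
    echo "prophyle download -d $dir $*"
}
```

<p>For instance, <cite>prophyle_download_cmd /data/prophyle bacteria viruses</cite> prints <cite>prophyle download -d /data/prophyle bacteria viruses</cite>, whereas an unsupported name such as <cite>fungi</cite> makes the helper fail before anything is downloaded.</p>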
</section>
<section id="index-construction">
<h2><span class="section-number">4.2. </span>Index construction<a class="headerlink" href="#index-construction" title="Link to this heading">¶</a></h2>
<p>Once a library is downloaded, a ProPhyle index can be constructed using</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>prophyle<span class="w"> </span>index<span class="w"> </span><span class="o">[</span>-g<span class="w"> </span>DIR<span class="o">]</span><span class="w"> </span><span class="o">[</span>-j<span class="w"> </span>INT<span class="o">]</span><span class="w"> </span><span class="o">[</span>-k<span class="w"> </span>INT<span class="o">]</span><span class="w"> </span><span class="o">[</span>-M<span class="o">]</span><span class="w"> </span><span class="o">[</span>-P<span class="o">]</span><span class="w"> </span><span class="o">[</span>-K<span class="o">]</span><span class="w"> </span>&lt;tree.nw&gt;<span class="w"> </span><span class="o">[</span>&lt;tree.nw&gt;<span class="w"> </span>...<span class="o">]</span><span class="w"> </span>&lt;index.dir&gt;
</pre></div>
</div>
<p><cite>&lt;tree.nw&gt;</cite> is a Newick/NHX tree for the index. The trees copied by the download command
are placed in <cite>~/prophyle</cite> by default and are named <cite>bacteria.nw</cite>, <cite>viruses.nw</cite>, etc.
<cite>&lt;index.dir&gt;</cite> is the directory where your index files will be placed.</p>
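<p>Before starting a long index construction, it can be useful to confirm that the expected tree files are really present. The following is only a sketch: the helper name <cite>missing_trees</cite> is hypothetical, and it assumes the <cite>LIBRARY.nw</cite> naming described above.</p>

```shell
# Hypothetical helper: print the tree files of the requested libraries that
# are missing (or empty) in a ProPhyle download directory.
# Usage: missing_trees DIR LIBRARY [LIBRARY ...]
missing_trees() {
    dir="$1"
    shift
    for lib in "$@"; do
        tree="$dir/$lib.nw"
        # -s: the file exists and is not empty
        [ -s "$tree" ] || echo "$tree"
    done
}
```

<p>Calling <cite>missing_trees ~/prophyle bacteria viruses</cite> prints nothing when both trees are in place, and otherwise lists the paths that still need to be downloaded.</p>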
<p>Several other parameters are available.
<cite>-j</cite> sets the number of CPU cores used for index construction (all cores are used by default).
<cite>-k</cite> sets the <em>k</em>-mer length (31 by default).
<cite>-M</cite> activates filtering of low-complexity regions using DustMasker. Make sure that the program is installed (try running <cite>dustmasker</cite>).
If multiple trees are given, they are merged, so node names from different trees can
collide. To prevent this, ProPhyle prepends numerical prefixes to the
node names (unless <cite>-P</cite> is used).
<cite>-K</cite> deactivates <em>k</em>-LCP array construction. The resulting index
is slightly smaller, but querying becomes much slower.</p>
<p>A complete command for index construction can look, for instance,
like this:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>prophyle<span class="w"> </span>index<span class="w"> </span>-k<span class="w"> </span><span class="m">25</span><span class="w"> </span>~/prophyle/bacteria.nw<span class="w"> </span>~/prophyle/viruses.nw<span class="w"> </span>my_BV_index
</pre></div>
</div>
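<p>Since <cite>-M</cite> depends on DustMasker being installed, the check can be scripted. The sketch below uses hypothetical helper names (<cite>masking_flag</cite>, <cite>prophyle_index_cmd</cite>) and only prints the command instead of running it; it adds <cite>-M</cite> only when a <cite>dustmasker</cite> executable is found on <cite>PATH</cite>.</p>

```shell
# Hypothetical helper: print "-M" only if the dustmasker executable is on PATH.
masking_flag() {
    if command -v dustmasker >/dev/null 2>/dev/null; then
        printf '%s\n' "-M"
    fi
}

# Hypothetical helper: print (rather than run) an index construction command.
# Usage: prophyle_index_cmd K INDEX_DIR TREE [TREE ...]
prophyle_index_cmd() {
    k="$1"
    index_dir="$2"
    shift 2
    flag=$(masking_flag)
    # ${flag:+$flag } expands to "-M " when the flag is set, and to nothing otherwise
    echo "prophyle index -k $k ${flag:+$flag }$* $index_dir"
}
```

<p>With DustMasker installed, <cite>prophyle_index_cmd 25 my_BV_index bacteria.nw viruses.nw</cite> prints the command above with <cite>-M</cite> added; without it, the flag is simply left out, so index construction does not depend on a missing program.</p>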
<p>Index construction might take several hours, depending on the database size, <em>k</em>, and the number
of cores used.</p>
</section>
</section>
<div class="clearer"></div>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="related" role="navigation" aria-label="Related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
>index</a></li>
<li class="right" >
<a href="custom_db.html" title="5. Building a custom database"
>next</a> |</li>
<li class="right" >
<a href="install.html" title="3. Installing ProPhyle"
>previous</a> |</li>
<li><a href="index.html">ProPhyle home</a> |</li>
<li><a href="contents.html">Documentation</a> »</li>
<li class="nav-item nav-item-this"><a href=""><span class="section-number">4. </span>Building a standard database</a></li>
</ul>
</div>
<div class="footer" role="contentinfo">
© Copyright 2015-2024, Karel Břinda, Kamil Salikhov, Simone Pignotti, Gregory Kucherov.
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 8.0.2.
</div>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-112241191-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-112241191-1');
</script>
</body>
</html>