From 76b3e8f5f38bf26c1bc35092aa4fe3734805395d Mon Sep 17 00:00:00 2001
From: "Documenter.jl" <documenter@juliadocs.github.io>
Date: Fri, 25 Oct 2024 10:22:00 +0000
Subject: [PATCH] build based on 295ba89

---
 dev/.documenter-siteinfo.json  |  2 +-
 dev/construction/index.html    | 12 ++++++------
 dev/counting/index.html        | 12 ++++++------
 dev/index.html                 |  2 +-
 dev/interfaces/index.html      |  2 +-
 dev/io/index.html              |  2 +-
 dev/predicates/index.html      |  6 +++---
 dev/random/index.html          |  4 ++--
 dev/recipes/index.html         |  2 +-
 dev/search_index.js            |  2 +-
 dev/sequence_search/index.html |  6 +++---
 dev/symbols/index.html         |  2 +-
 dev/transforms/index.html      | 12 ++++++------
 dev/types/index.html           |  2 +-
 14 files changed, 34 insertions(+), 34 deletions(-)
diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 364b959a..19fb67a3 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-10-24T17:54:15","documenter_version":"1.7.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-10-25T10:21:53","documenter_version":"1.7.0"}}
\ No newline at end of file
diff --git a/dev/construction/index.html b/dev/construction/index.html
index 62ecc191..2003f044 100644
--- a/dev/construction/index.html
+++ b/dev/construction/index.html
@@ -135,11 +135,11 @@
 &quot;TAGA&quot;
 
 julia&gt; string(push!(f(), DNA_A))
-&quot;TAGA&quot;</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/stringliterals.jl#L12-L45">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.@rna_str" href="#BioSequences.@rna_str"><code>BioSequences.@rna_str</code></a> — <span class="docstring-category">Macro</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>The <code>LongRNA{4}</code> equivalent to <code>@dna_str</code></p><p>See also: <a href="#BioSequences.@dna_str"><code>@dna_str</code></a>, <a href="#BioSequences.@aa_str"><code>@aa_str</code></a></p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; rna&quot;UCGUGAUGC&quot;
+&quot;TAGA&quot;</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/stringliterals.jl#L12-L45">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.@rna_str" href="#BioSequences.@rna_str"><code>BioSequences.@rna_str</code></a> — <span class="docstring-category">Macro</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>The <code>LongRNA{4}</code> equivalent to <code>@dna_str</code></p><p>See also: <a href="#BioSequences.@dna_str"><code>@dna_str</code></a>, <a href="#BioSequences.@aa_str"><code>@aa_str</code></a></p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; rna&quot;UCGUGAUGC&quot;
 9nt RNA Sequence:
-UCGUGAUGC</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/stringliterals.jl#L61-L72">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.@aa_str" href="#BioSequences.@aa_str"><code>BioSequences.@aa_str</code></a> — <span class="docstring-category">Macro</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>The <code>AminoAcidAlphabet</code> equivalent to <code>@dna_str</code></p><p>See also: <a href="#BioSequences.@dna_str"><code>@dna_str</code></a>, <a href="#BioSequences.@rna_str"><code>@rna_str</code></a></p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; aa&quot;PKLEQC&quot;
+UCGUGAUGC</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/stringliterals.jl#L61-L72">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.@aa_str" href="#BioSequences.@aa_str"><code>BioSequences.@aa_str</code></a> — <span class="docstring-category">Macro</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>The <code>AminoAcidAlphabet</code> equivalent to <code>@dna_str</code></p><p>See also: <a href="#BioSequences.@dna_str"><code>@dna_str</code></a>, <a href="#BioSequences.@rna_str"><code>@rna_str</code></a></p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; aa&quot;PKLEQC&quot;
 6aa Amino Acid Sequence:
-PKLEQC</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/stringliterals.jl#L89-L100">source</a></section></article><h2 id="Loose-parsing"><a class="docs-heading-anchor" href="#Loose-parsing">Loose parsing</a><a id="Loose-parsing-1"></a><a class="docs-heading-anchor-permalink" href="#Loose-parsing" title="Permalink"></a></h2><p>As of version 3.2.0, BioSequences.jl provide the <a href="#BioSequences.bioseq"><code>bioseq</code></a> function, which can be used to build a <code>LongSequence</code> from a string (or an <code>AbstractVector{UInt8}</code>) without knowing the correct <code>Alphabet</code>.</p><pre><code class="language-julia-repl hljs">julia&gt; bioseq(&quot;ATGTGCTGA&quot;)
+PKLEQC</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/stringliterals.jl#L89-L100">source</a></section></article><h2 id="Loose-parsing"><a class="docs-heading-anchor" href="#Loose-parsing">Loose parsing</a><a id="Loose-parsing-1"></a><a class="docs-heading-anchor-permalink" href="#Loose-parsing" title="Permalink"></a></h2><p>As of version 3.2.0, BioSequences.jl provides the <a href="#BioSequences.bioseq"><code>bioseq</code></a> function, which can be used to build a <code>LongSequence</code> from a string (or an <code>AbstractVector{UInt8}</code>) without knowing the correct <code>Alphabet</code>.</p><pre><code class="language-julia-repl hljs">julia&gt; bioseq(&quot;ATGTGCTGA&quot;)
 9nt DNA Sequence:
 ATGTGCTGA</code></pre><p>The function will prioritise 2-bit alphabets over 4-bit alphabets, and prefer smaller alphabets (like <code>DNAAlphabet{4}</code>) over larger (like <code>AminoAcidAlphabet</code>). If the input cannot be encoded by any of the built-in alphabets, an error is thrown:</p><pre><code class="language-julia-repl hljs">julia&gt; bioseq(&quot;0!(CC!;#&amp;&amp;%&quot;)
 ERROR: cannot encode 0x30 in AminoAcidAlphabet
@@ -153,7 +153,7 @@
 
 julia&gt; bioseq(&quot;PKMW#3&gt;&gt;0;kL&quot;)
 ERROR: cannot encode 0x23 in AminoAcidAlphabet
-[...]</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/constructors.jl#L111-L137">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.guess_alphabet" href="#BioSequences.guess_alphabet"><code>BioSequences.guess_alphabet</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">guess_alphabet(s::Union{AbstractString, AbstractVector{UInt8}}) -&gt; Union{Integer, Alphabet}</code></pre><p>Pick an <code>Alphabet</code> that can encode input <code>s</code>.  If no <code>Alphabet</code> can, return the index of the first byte of the input which is not encodable in any alphabet. This function only knows about the alphabets listed below. If multiple alphabets are possible, pick the first from the order below (i.e. <code>DNAAlphabet{2}()</code> if possible, otherwise <code>RNAAlphabet{2}()</code> etc).</p><ol><li><code>DNAAlphabet{2}()</code></li><li><code>RNAAlphabet{2}()</code></li><li><code>DNAAlphabet{4}()</code></li><li><code>RNAAlphabet{4}()</code></li><li><code>AminoAcidAlphabet()</code></li></ol><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>The functions <code>bioseq</code> and <code>guess_alphabet</code> are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. 
They are type unstable, and their heuristics <strong>are subject to change</strong> in minor versions.</p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; guess_alphabet(&quot;AGGCA&quot;)
+[...]</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/constructors.jl#L111-L137">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.guess_alphabet" href="#BioSequences.guess_alphabet"><code>BioSequences.guess_alphabet</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">guess_alphabet(s::Union{AbstractString, AbstractVector{UInt8}}) -&gt; Union{Integer, Alphabet}</code></pre><p>Pick an <code>Alphabet</code> that can encode input <code>s</code>.  If no <code>Alphabet</code> can, return the index of the first byte of the input which is not encodable in any alphabet. This function only knows about the alphabets listed below. If multiple alphabets are possible, pick the first from the order below (i.e. <code>DNAAlphabet{2}()</code> if possible, otherwise <code>RNAAlphabet{2}()</code> etc).</p><ol><li><code>DNAAlphabet{2}()</code></li><li><code>RNAAlphabet{2}()</code></li><li><code>DNAAlphabet{4}()</code></li><li><code>RNAAlphabet{4}()</code></li><li><code>AminoAcidAlphabet()</code></li></ol><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>The functions <code>bioseq</code> and <code>guess_alphabet</code> are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. 
They are type unstable, and their heuristics <strong>are subject to change</strong> in minor versions.</p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; guess_alphabet(&quot;AGGCA&quot;)
 DNAAlphabet{2}()
 
 julia&gt; guess_alphabet(&quot;WKLQSTV&quot;)
@@ -163,10 +163,10 @@
 5
 
 julia&gt; guess_alphabet(&quot;UAGCSKMU&quot;)
-RNAAlphabet{4}()</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L318-L350">source</a></section></article><h2 id="Comparison-to-other-sequence-types"><a class="docs-heading-anchor" href="#Comparison-to-other-sequence-types">Comparison to other sequence types</a><a id="Comparison-to-other-sequence-types-1"></a><a class="docs-heading-anchor-permalink" href="#Comparison-to-other-sequence-types" title="Permalink"></a></h2><p>Following Base standards, BioSequences do not compare equal to other containers even if they have the same elements. To e.g. compare a BioSequence with a vector of DNA, compare the elements themselves:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;GAGCTGA&quot;; vec = collect(seq);
+RNAAlphabet{4}()</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L338-L370">source</a></section></article><h2 id="Comparison-to-other-sequence-types"><a class="docs-heading-anchor" href="#Comparison-to-other-sequence-types">Comparison to other sequence types</a><a id="Comparison-to-other-sequence-types-1"></a><a class="docs-heading-anchor-permalink" href="#Comparison-to-other-sequence-types" title="Permalink"></a></h2><p>Following Base standards, BioSequences do not compare equal to other containers even if they have the same elements. To e.g. compare a BioSequence with a vector of DNA, compare the elements themselves:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;GAGCTGA&quot;; vec = collect(seq);
 
 julia&gt; seq == vec, isequal(seq, vec)
 (false, false)
 
 julia&gt; length(seq) == length(vec) &amp;&amp; all(i == j for (i, j) in zip(seq, vec))
-true </code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../types/">« BioSequences Types</a><a class="docs-footer-nextpage" href="../transforms/">Indexing &amp; modifying sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+true </code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../types/">« BioSequences Types</a><a class="docs-footer-nextpage" href="../transforms/">Indexing &amp; modifying sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/counting/index.html b/dev/counting/index.html
index 9920cebf..c6402be4 100644
--- a/dev/counting/index.html
+++ b/dev/counting/index.html
@@ -9,24 +9,24 @@
 3
 
 julia&gt; matches(dna&quot;AACA&quot;, dna&quot;AAG&quot;)
-2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L51-L72">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.mismatches" href="#BioSequences.mismatches"><code>BioSequences.mismatches</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">mismatches(a::BioSequence, b::BioSequences) -&gt; Int</code></pre><p>Count the number of positions in where <code>a</code> and <code>b</code> differ. If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. <code>DNA_A</code> does not match <code>DNA_N</code>.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences with differing lengths is deprecated. In a future, breaking release of BioSequences, this will error.</p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; mismatches(dna&quot;TAGCTA&quot;, dna&quot;TACNTA&quot;)
+2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L51-L72">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.mismatches" href="#BioSequences.mismatches"><code>BioSequences.mismatches</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">mismatches(a::BioSequence, b::BioSequence) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> and <code>b</code> differ. If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. <code>DNA_A</code> does not match <code>DNA_N</code>.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences with differing lengths is deprecated. In a future, breaking release of BioSequences, this will error.</p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; mismatches(dna&quot;TAGCTA&quot;, dna&quot;TACNTA&quot;)
 2
 
 julia&gt; mismatches(dna&quot;AACA&quot;, dna&quot;AAG&quot;)
-1</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L81-L102">source</a></section></article><h2 id="GC-content"><a class="docs-heading-anchor" href="#GC-content">GC content</a><a id="GC-content-1"></a><a class="docs-heading-anchor-permalink" href="#GC-content" title="Permalink"></a></h2><p>The convenience function <code>gc_content(seq)</code> is equivalent to <code>count(isGC, seq) / length(seq)</code>:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.gc_content" href="#BioSequences.gc_content"><code>BioSequences.gc_content</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">gc_content(seq::BioSequence) -&gt; Float64</code></pre><p>Calculate GC content of <code>seq</code>, i.e. the number of symbols that is <code>DNA_C</code>, <code>DNA_G</code>, <code>DNA_C</code> or <code>DNA_G</code> divided by the length of the sequence.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; gc_content(dna&quot;AGCTA&quot;)
+1</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L81-L102">source</a></section></article><h2 id="GC-content"><a class="docs-heading-anchor" href="#GC-content">GC content</a><a id="GC-content-1"></a><a class="docs-heading-anchor-permalink" href="#GC-content" title="Permalink"></a></h2><p>The convenience function <code>gc_content(seq)</code> is equivalent to <code>count(isGC, seq) / length(seq)</code>:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.gc_content" href="#BioSequences.gc_content"><code>BioSequences.gc_content</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">gc_content(seq::BioSequence) -&gt; Float64</code></pre><p>Calculate GC content of <code>seq</code>, i.e. the number of symbols that are <code>DNA_C</code>, <code>DNA_G</code>, <code>RNA_C</code> or <code>RNA_G</code>, divided by the length of the sequence.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; gc_content(dna&quot;AGCTA&quot;)
 0.4
 
 julia&gt; gc_content(rna&quot;UAGCGA&quot;)
-0.5</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L32-L46">source</a></section></article><h2 id="Deprecated-aliases"><a class="docs-heading-anchor" href="#Deprecated-aliases">Deprecated aliases</a><a id="Deprecated-aliases-1"></a><a class="docs-heading-anchor-permalink" href="#Deprecated-aliases" title="Permalink"></a></h2><p>Several of the optimised <code>count</code> methods have function names, which are deprecated:</p><table><tr><th style="text-align: left">Deprecated function</th><th style="text-align: left">Instead use</th></tr><tr><td style="text-align: left"><code>n_gaps</code></td><td style="text-align: left"><code>count(isgap, seq)</code></td></tr><tr><td style="text-align: left"><code>n_certain</code></td><td style="text-align: left"><code>count(iscertain, seq)</code></td></tr><tr><td style="text-align: left"><code>n_ambiguous</code></td><td style="text-align: left"><code>count(isambiguous, seq)</code></td></tr></table><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_gaps" href="#BioSequences.n_gaps"><code>BioSequences.n_gaps</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_gaps(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (or <code>b</code>, if present) have gaps. 
If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code></p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_gaps(dna&quot;--TAC-WN-ACY&quot;)
+0.5</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L32-L46">source</a></section></article><h2 id="Deprecated-aliases"><a class="docs-heading-anchor" href="#Deprecated-aliases">Deprecated aliases</a><a id="Deprecated-aliases-1"></a><a class="docs-heading-anchor-permalink" href="#Deprecated-aliases" title="Permalink"></a></h2><p>Several of the optimised <code>count</code> methods have function names, which are deprecated:</p><table><tr><th style="text-align: left">Deprecated function</th><th style="text-align: left">Instead use</th></tr><tr><td style="text-align: left"><code>n_gaps</code></td><td style="text-align: left"><code>count(isgap, seq)</code></td></tr><tr><td style="text-align: left"><code>n_certain</code></td><td style="text-align: left"><code>count(iscertain, seq)</code></td></tr><tr><td style="text-align: left"><code>n_ambiguous</code></td><td style="text-align: left"><code>count(isambiguous, seq)</code></td></tr></table><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_gaps" href="#BioSequences.n_gaps"><code>BioSequences.n_gaps</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_gaps(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (or <code>b</code>, if present) have gaps. 
If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code></p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_gaps(dna&quot;--TAC-WN-ACY&quot;)
 4
 
 julia&gt; n_gaps(dna&quot;TC-AC-&quot;, dna&quot;-CACG&quot;)
-2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L111-L130">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_certain" href="#BioSequences.n_certain"><code>BioSequences.n_certain</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_certain(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (and <code>b</code>, if present) have certain (i.e. non-ambigous and non-gap) symbols. If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. Gaps are not certain.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code></p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_certain(dna&quot;--TAC-WN-ACY&quot;)
+2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L111-L130">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_certain" href="#BioSequences.n_certain"><code>BioSequences.n_certain</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_certain(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (and <code>b</code>, if present) have certain (i.e. non-ambiguous and non-gap) symbols. If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. Gaps are not certain.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code></p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_certain(dna&quot;--TAC-WN-ACY&quot;)
 5
 
 julia&gt; n_certain(rna&quot;UAYWW&quot;, rna&quot;UAW&quot;)
-2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L173-L195">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_ambiguous" href="#BioSequences.n_ambiguous"><code>BioSequences.n_ambiguous</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_ambiguous(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (or <code>b</code>, if present) have ambigious symbols. If <code>b</code> is given, and the length of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. Gaps are not ambigous.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code></p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_ambiguous(dna&quot;--TAC-WN-ACY&quot;)
+2</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L173-L195">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.n_ambiguous" href="#BioSequences.n_ambiguous"><code>BioSequences.n_ambiguous</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">n_ambiguous(a::BioSequence, [b::BioSequence]) -&gt; Int</code></pre><p>Count the number of positions where <code>a</code> (or <code>b</code>, if present) have ambiguous symbols. If <code>b</code> is given, and the lengths of <code>a</code> and <code>b</code> differ, look only at the indices of the shorter sequence. Gaps are not ambiguous.</p><div class="admonition is-warning"><header class="admonition-header">Warning</header><div class="admonition-body"><p>Passing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a <code>MethodError</code>.</p></div></div><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; n_ambiguous(dna&quot;--TAC-WN-ACY&quot;)
 3
 
 julia&gt; n_ambiguous(rna&quot;UAYWW&quot;, rna&quot;UAW&quot;)
-1</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/counting.jl#L141-L162">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../sequence_search/">« Pattern matching and searching</a><a class="docs-footer-nextpage" href="../io/">I/O »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+1</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/counting.jl#L141-L162">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../sequence_search/">« Pattern matching and searching</a><a class="docs-footer-nextpage" href="../io/">I/O »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/index.html b/dev/index.html
index c7a42d23..08c77045 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
 <!DOCTYPE html>
-<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Home · BioSequences.jl</title><meta name="title" content="Home · BioSequences.jl"/><meta property="og:title" content="Home · BioSequences.jl"/><meta property="twitter:title" content="Home · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="assets/documenter.js"></script><script src="search_index.js"></script><script src="siteinfo.js"></script><script src="../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" 
href="assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href><img src="assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href>BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li class="is-active"><a class="tocitem" href>Home</a><ul class="internal"><li><a class="tocitem" href="#Description"><span>Description</span></a></li><li><a class="tocitem" href="#Installation"><span>Installation</span></a></li><li><a class="tocitem" href="#Testing"><span>Testing</span></a></li><li><a class="tocitem" href="#Contributing"><span>Contributing</span></a></li><li><a class="tocitem" href="#Questions?"><span>Questions?</span></a></li></ul></li><li><a class="tocitem" href="symbols/">Biological Symbols</a></li><li><a class="tocitem" href="types/">BioSequences Types</a></li><li><a class="tocitem" href="construction/">Constructing sequences</a></li><li><a class="tocitem" href="transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="predicates/">Predicates</a></li><li><a class="tocitem" href="random/">Random sequences</a></li><li><a class="tocitem" href="sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" 
href="counting/">Counting</a></li><li><a class="tocitem" href="io/">I/O</a></li><li><a class="tocitem" href="interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Home</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Home</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/index.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="BioSequences"><a class="docs-heading-anchor" href="#BioSequences">BioSequences</a><a id="BioSequences-1"></a><a class="docs-heading-anchor-permalink" href="#BioSequences" title="Permalink"></a></h1><p><a href="https://github.com/BioJulia/BioSequences.jl/releases/latest"><img 
src="https://img.shields.io/github/release/BioJulia/BioSequences.jl.svg" alt="Latest Release"/></a> <a href="https://github.com/BioJulia/BioSequences.jl/blob/master/LICENSE"><img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT license"/></a> <a href="https://biojulia.github.io/BioSequences.jl/stable"><img src="https://img.shields.io/badge/docs-stable-blue.svg" alt="Documentation"/></a> <a href="https://www.repostatus.org/#active"><img src="https://www.repostatus.org/badges/latest/active.svg" alt="Pkg Status"/></a></p><h2 id="Description"><a class="docs-heading-anchor" href="#Description">Description</a><a id="Description-1"></a><a class="docs-heading-anchor-permalink" href="#Description" title="Permalink"></a></h2><p>BioSequences provides data types and methods for common operations with biological sequences, including DNA, RNA, and amino acid sequences.</p><h2 id="Installation"><a class="docs-heading-anchor" href="#Installation">Installation</a><a id="Installation-1"></a><a class="docs-heading-anchor-permalink" href="#Installation" title="Permalink"></a></h2><p>You can install BioSequences from the julia REPL. 
Press <code>]</code> to enter pkg mode again, and enter the following:</p><pre><code class="language-julia hljs">add BioSequences</code></pre><p>If you are interested in the cutting edge of the development, please check out the master branch to try new features before release.</p><h2 id="Testing"><a class="docs-heading-anchor" href="#Testing">Testing</a><a id="Testing-1"></a><a class="docs-heading-anchor-permalink" href="#Testing" title="Permalink"></a></h2><p>BioSequences is tested against Julia <code>1.X</code> on Linux, OS X, and Windows.</p><p><a href="https://github.com/BioJulia/BioSequences.jl/actions?query=workflow%3A%22Unit+tests%22+branch%3Amaster"><img src="https://github.com/BioJulia/BioSequences.jl/workflows/Unit%20tests/badge.svg?branch=master" alt="Unit tests"/></a> <a href="https://github.com/BioJulia/BioSequences.jl/actions?query=workflow%3ADocumentation+branch%3Amaster"><img src="https://github.com/BioJulia/BioSequences.jl/workflows/Documentation/badge.svg?branch=master" alt="Documentation"/></a> <a href="https://codecov.io/gh/BioJulia/BioSequences.jl"><img src="https://codecov.io/gh/BioJulia/BioSequences.jl/branch/master/graph/badge.svg" alt/></a></p><h2 id="Contributing"><a class="docs-heading-anchor" href="#Contributing">Contributing</a><a id="Contributing-1"></a><a class="docs-heading-anchor-permalink" href="#Contributing" title="Permalink"></a></h2><p>We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.</p><p>Take a look at the <a href="https://github.com/BioJulia/Contributing">contributing files</a> detailed contributor and maintainer guidelines, and code of conduct.</p><h2 id="Questions?"><a class="docs-heading-anchor" href="#Questions?">Questions?</a><a id="Questions?-1"></a><a class="docs-heading-anchor-permalink" href="#Questions?" 
title="Permalink"></a></h2><p>If you have a question about contributing or using BioJulia software, come on over and chat to us on <a href="https://julialang.org/slack/">the #biology channel on the Julia SLack</a>, or you can try the <a href="https://discourse.julialang.org/c/domain/bio">Bio category of the Julia discourse site</a>.</p></article><nav class="docs-footer"><a class="docs-footer-nextpage" href="symbols/">Biological Symbols »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Home · BioSequences.jl</title><meta name="title" content="Home · BioSequences.jl"/><meta property="og:title" content="Home · BioSequences.jl"/><meta property="twitter:title" content="Home · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL="."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="assets/documenter.js"></script><script src="search_index.js"></script><script src="siteinfo.js"></script><script src="../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" 
href="assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href><img src="assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href>BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li class="is-active"><a class="tocitem" href>Home</a><ul class="internal"><li><a class="tocitem" href="#Description"><span>Description</span></a></li><li><a class="tocitem" href="#Installation"><span>Installation</span></a></li><li><a class="tocitem" href="#Testing"><span>Testing</span></a></li><li><a class="tocitem" href="#Contributing"><span>Contributing</span></a></li><li><a class="tocitem" href="#Questions?"><span>Questions?</span></a></li></ul></li><li><a class="tocitem" href="symbols/">Biological Symbols</a></li><li><a class="tocitem" href="types/">BioSequences Types</a></li><li><a class="tocitem" href="construction/">Constructing sequences</a></li><li><a class="tocitem" href="transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="predicates/">Predicates</a></li><li><a class="tocitem" href="random/">Random sequences</a></li><li><a class="tocitem" href="sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" 
href="counting/">Counting</a></li><li><a class="tocitem" href="io/">I/O</a></li><li><a class="tocitem" href="interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Home</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Home</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/index.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="BioSequences"><a class="docs-heading-anchor" href="#BioSequences">BioSequences</a><a id="BioSequences-1"></a><a class="docs-heading-anchor-permalink" href="#BioSequences" title="Permalink"></a></h1><p><a href="https://github.com/BioJulia/BioSequences.jl/releases/latest"><img 
src="https://img.shields.io/github/release/BioJulia/BioSequences.jl.svg" alt="Latest Release"/></a> <a href="https://github.com/BioJulia/BioSequences.jl/blob/master/LICENSE"><img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT license"/></a> <a href="https://biojulia.github.io/BioSequences.jl/stable"><img src="https://img.shields.io/badge/docs-stable-blue.svg" alt="Documentation"/></a> <a href="https://www.repostatus.org/#active"><img src="https://www.repostatus.org/badges/latest/active.svg" alt="Pkg Status"/></a></p><h2 id="Description"><a class="docs-heading-anchor" href="#Description">Description</a><a id="Description-1"></a><a class="docs-heading-anchor-permalink" href="#Description" title="Permalink"></a></h2><p>BioSequences provides data types and methods for common operations with biological sequences, including DNA, RNA, and amino acid sequences.</p><h2 id="Installation"><a class="docs-heading-anchor" href="#Installation">Installation</a><a id="Installation-1"></a><a class="docs-heading-anchor-permalink" href="#Installation" title="Permalink"></a></h2><p>You can install BioSequences from the julia REPL. 
Press <code>]</code> to enter pkg mode, and enter the following:</p><pre><code class="language-julia hljs">add BioSequences</code></pre><p>If you are interested in the cutting edge of the development, please check out the master branch to try new features before release.</p><h2 id="Testing"><a class="docs-heading-anchor" href="#Testing">Testing</a><a id="Testing-1"></a><a class="docs-heading-anchor-permalink" href="#Testing" title="Permalink"></a></h2><p>BioSequences is tested against Julia <code>1.X</code> on Linux, OS X, and Windows.</p><p><a href="https://github.com/BioJulia/BioSequences.jl/actions?query=workflow%3A%22Unit+tests%22+branch%3Amaster"><img src="https://github.com/BioJulia/BioSequences.jl/workflows/Unit%20tests/badge.svg?branch=master" alt="Unit tests"/></a> <a href="https://github.com/BioJulia/BioSequences.jl/actions?query=workflow%3ADocumentation+branch%3Amaster"><img src="https://github.com/BioJulia/BioSequences.jl/workflows/Documentation/badge.svg?branch=master" alt="Documentation"/></a> <a href="https://codecov.io/gh/BioJulia/BioSequences.jl"><img src="https://codecov.io/gh/BioJulia/BioSequences.jl/branch/master/graph/badge.svg" alt/></a></p><h2 id="Contributing"><a class="docs-heading-anchor" href="#Contributing">Contributing</a><a id="Contributing-1"></a><a class="docs-heading-anchor-permalink" href="#Contributing" title="Permalink"></a></h2><p>We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.</p><p>Take a look at the <a href="https://github.com/BioJulia/Contributing">contributing files</a> for detailed contributor and maintainer guidelines, and the code of conduct.</p><h2 id="Questions?"><a class="docs-heading-anchor" href="#Questions?">Questions?</a><a id="Questions?-1"></a><a class="docs-heading-anchor-permalink" href="#Questions?" 
title="Permalink"></a></h2><p>If you have a question about contributing or using BioJulia software, come on over and chat to us on <a href="https://julialang.org/slack/">the #biology channel on the Julia Slack</a>, or you can try the <a href="https://discourse.julialang.org/c/domain/bio">Bio category of the Julia Discourse site</a>.</p></article><nav class="docs-footer"><a class="docs-footer-nextpage" href="symbols/">Biological Symbols »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/interfaces/index.html b/dev/interfaces/index.html
index 98a77444..b4d9ed5e 100644
--- a/dev/interfaces/index.html
+++ b/dev/interfaces/index.html
@@ -59,4 +59,4 @@
 julia&gt; Base.copy(seq::Codon) = Codon(seq.x)
 
 julia&gt; BioSequences.has_interface(BioSequence, Codon, [RNA_C, RNA_U, RNA_A], false)
-true</code></pre><h2 id="Interface-checking-functions"><a class="docs-heading-anchor" href="#Interface-checking-functions">Interface checking functions</a><a id="Interface-checking-functions-1"></a><a class="docs-heading-anchor-permalink" href="#Interface-checking-functions" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.has_interface" href="#BioSequences.has_interface"><code>BioSequences.has_interface</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">function has_interface(::Type{Alphabet}, A::Alphabet)</code></pre><p>Returns whether <code>A</code> conforms to the <code>Alphabet</code> interface.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L41-L45">source</a></section><section><div><pre><code class="language-julia hljs">has_interface(::Type{BioSequence}, ::T, syms::Vector, mutable::Bool, compat::Bool=true)</code></pre><p>Check if type <code>T</code> conforms to the <code>BioSequence</code> interface. A <code>T</code> is constructed from the vector of element types <code>syms</code> which must not be empty. If the <code>mutable</code> flag is set, also check the mutable interface. 
If the <code>compat</code> flag is set, check for compatibility with existing alphabets.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L58-L65">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../io/">« I/O</a><a class="docs-footer-nextpage" href="../recipes/">Recipes »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+true</code></pre><h2 id="Interface-checking-functions"><a class="docs-heading-anchor" href="#Interface-checking-functions">Interface checking functions</a><a id="Interface-checking-functions-1"></a><a class="docs-heading-anchor-permalink" href="#Interface-checking-functions" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.has_interface" href="#BioSequences.has_interface"><code>BioSequences.has_interface</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">function has_interface(::Type{Alphabet}, A::Alphabet)</code></pre><p>Returns whether <code>A</code> conforms to the <code>Alphabet</code> interface.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L45-L49">source</a></section><section><div><pre><code class="language-julia hljs">has_interface(::Type{BioSequence}, ::T, syms::Vector, mutable::Bool, compat::Bool=true)</code></pre><p>Check if type <code>T</code> conforms to the <code>BioSequence</code> interface. A <code>T</code> is constructed from the vector of element types <code>syms</code> which must not be empty. If the <code>mutable</code> flag is set, also check the mutable interface. 
If the <code>compat</code> flag is set, check for compatibility with existing alphabets.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L58-L65">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../io/">« I/O</a><a class="docs-footer-nextpage" href="../recipes/">Recipes »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/io/index.html b/dev/io/index.html
index 5d14b662..9a77cc6f 100644
--- a/dev/io/index.html
+++ b/dev/io/index.html
@@ -1,2 +1,2 @@
 <!DOCTYPE html>
-<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>I/O · BioSequences.jl</title><meta name="title" content="I/O · BioSequences.jl"/><meta property="og:title" content="I/O · BioSequences.jl"/><meta property="twitter:title" content="I/O · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" 
type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li><a class="tocitem" href="../types/">BioSequences Types</a></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="../predicates/">Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li class="is-active"><a class="tocitem" href>I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static 
is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>I/O</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>I/O</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/io.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="I/O-for-sequencing-file-formats"><a class="docs-heading-anchor" href="#I/O-for-sequencing-file-formats">I/O for sequencing file formats</a><a id="I/O-for-sequencing-file-formats-1"></a><a class="docs-heading-anchor-permalink" href="#I/O-for-sequencing-file-formats" title="Permalink"></a></h1><p>Versions of BioSequences prior to v2.0 provided a FASTA, FASTQ, and 2Bit submodule for working with formatted sequence files.</p><p>After version v2.0, in order to neatly separate concerns, these submodules were removed.</p><p>Instead there will now be dedicated BioJulia packages for each format. 
Each of these will be compatible with BioSequences.</p><p>A list of all of the different formats and packages is provided below to help you find them quickly.</p><table><tr><th style="text-align: left">Format</th><th style="text-align: left">Package</th></tr><tr><td style="text-align: left">FASTA</td><td style="text-align: left"><a href="https://github.com/BioJulia/FASTX.jl">FASTX.jl</a></td></tr><tr><td style="text-align: left">FASTQ</td><td style="text-align: left"><a href="https://github.com/BioJulia/FASTX.jl">FASTX.jl</a></td></tr><tr><td style="text-align: left">2Bit</td><td style="text-align: left"><a href="https://github.com/BioJulia/TwoBit.jl">TwoBit.jl</a></td></tr></table></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../counting/">« Counting</a><a class="docs-footer-nextpage" href="../interfaces/">Implementing custom types »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on 
<span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>I/O · BioSequences.jl</title><meta name="title" content="I/O · BioSequences.jl"/><meta property="og:title" content="I/O · BioSequences.jl"/><meta property="twitter:title" content="I/O · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" 
type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li><a class="tocitem" href="../types/">BioSequences Types</a></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="../predicates/">Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li class="is-active"><a class="tocitem" href>I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static 
is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>I/O</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>I/O</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/io.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="I/O-for-sequencing-file-formats"><a class="docs-heading-anchor" href="#I/O-for-sequencing-file-formats">I/O for sequencing file formats</a><a id="I/O-for-sequencing-file-formats-1"></a><a class="docs-heading-anchor-permalink" href="#I/O-for-sequencing-file-formats" title="Permalink"></a></h1><p>Versions of BioSequences prior to v2.0 provided a FASTA, FASTQ, and 2Bit submodule for working with formatted sequence files.</p><p>After version v2.0, in order to neatly separate concerns, these submodules were removed.</p><p>Instead there will now be dedicated BioJulia packages for each format. 
Each of these will be compatible with BioSequences.</p><p>A list of all of the different formats and packages is provided below to help you find them quickly.</p><table><tr><th style="text-align: left">Format</th><th style="text-align: left">Package</th></tr><tr><td style="text-align: left">FASTA</td><td style="text-align: left"><a href="https://github.com/BioJulia/FASTX.jl">FASTX.jl</a></td></tr><tr><td style="text-align: left">FASTQ</td><td style="text-align: left"><a href="https://github.com/BioJulia/FASTX.jl">FASTX.jl</a></td></tr><tr><td style="text-align: left">2Bit</td><td style="text-align: left"><a href="https://github.com/BioJulia/TwoBit.jl">TwoBit.jl</a></td></tr></table></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../counting/">« Counting</a><a class="docs-footer-nextpage" href="../interfaces/">Implementing custom types »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on 
<span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/predicates/index.html b/dev/predicates/index.html
index a76f67db..f7a79c3a 100644
--- a/dev/predicates/index.html
+++ b/dev/predicates/index.html
@@ -1,12 +1,12 @@
 <!DOCTYPE html>
-<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Predicates · BioSequences.jl</title><meta name="title" content="Predicates · BioSequences.jl"/><meta property="og:title" content="Predicates · BioSequences.jl"/><meta property="twitter:title" content="Predicates · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link 
class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li><a class="tocitem" href="../types/">BioSequences Types</a></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li class="is-active"><a class="tocitem" href>Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li><a class="tocitem" href="../io/">I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span 
class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Predicates</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Predicates</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/predicates.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="Predicates"><a class="docs-heading-anchor" href="#Predicates">Predicates</a><a id="Predicates-1"></a><a class="docs-heading-anchor-permalink" href="#Predicates" title="Permalink"></a></h1><p>A number of predicate or query functions are supported for sequences, allowing you to check for certain properties of a sequence.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.isrepetitive" 
href="#BioSequences.isrepetitive"><code>BioSequences.isrepetitive</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">isrepetitive(seq::BioSequence, n::Integer = length(seq))</code></pre><p>Return <code>true</code> if and only if <code>seq</code> contains a repetitive subsequence of length <code>≥ n</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/predicates.jl#L27-L31">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ispalindromic" href="#BioSequences.ispalindromic"><code>BioSequences.ispalindromic</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">ispalindromic(seq::NucSeq) -&gt; Bool</code></pre><p>Check if <code>seq</code> is palindromic. A palindromic sequence is identical to its reverse-complement, so this should be equivalent to checking if <code>seq == reverse_complement(seq)</code>.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; ispalindromic(dna&quot;TGCA&quot;)
+<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Predicates · BioSequences.jl</title><meta name="title" content="Predicates · BioSequences.jl"/><meta property="og:title" content="Predicates · BioSequences.jl"/><meta property="twitter:title" content="Predicates · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" data-theme-name="catppuccin-macchiato"/><link 
class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li><a class="tocitem" href="../types/">BioSequences Types</a></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li class="is-active"><a class="tocitem" href>Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li><a class="tocitem" href="../io/">I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span 
class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Predicates</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Predicates</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/predicates.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="Predicates"><a class="docs-heading-anchor" href="#Predicates">Predicates</a><a id="Predicates-1"></a><a class="docs-heading-anchor-permalink" href="#Predicates" title="Permalink"></a></h1><p>A number of predicate or query functions are supported for sequences, allowing you to check for certain properties of a sequence.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.isrepetitive" 
href="#BioSequences.isrepetitive"><code>BioSequences.isrepetitive</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">isrepetitive(seq::BioSequence, n::Integer = length(seq))</code></pre><p>Return <code>true</code> if and only if <code>seq</code> contains a repetitive subsequence of length <code>≥ n</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/predicates.jl#L27-L31">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ispalindromic" href="#BioSequences.ispalindromic"><code>BioSequences.ispalindromic</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">ispalindromic(seq::NucSeq) -&gt; Bool</code></pre><p>Check if <code>seq</code> is palindromic. A palindromic sequence is identical to its reverse-complement, so this should be equivalent to checking if <code>seq == reverse_complement(seq)</code>.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; ispalindromic(dna&quot;TGCA&quot;)
 true
 
 julia&gt; ispalindromic(dna&quot;TCCT&quot;)
 false
 
 julia&gt; ispalindromic(rna&quot;ACGGU&quot;)
-false</code></pre><p>Return <code>true</code> if <code>seq</code> is a palindromic sequence; otherwise return <code>false</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/predicates.jl#L61-L81">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.hasambiguity" href="#BioSequences.hasambiguity"><code>BioSequences.hasambiguity</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">hasambiguity(seq::BioSequence)</code></pre><p>Returns <code>true</code> if <code>seq</code> has an ambiguous symbol; otherwise return <code>false</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/predicates.jl#L99-L103">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.iscanonical" href="#BioSequences.iscanonical"><code>BioSequences.iscanonical</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">iscanonical(seq::NucleotideSeq)</code></pre><p>Returns <code>true</code> if <code>seq</code> is canonical.</p><p>For any sequence, there is a reverse complement, which is the same sequence, but on the complimentary strand of DNA:</p><pre><code class="nohighlight hljs">-------&gt;
+false</code></pre><p>Return <code>true</code> if <code>seq</code> is a palindromic sequence; otherwise return <code>false</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/predicates.jl#L61-L81">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.hasambiguity" href="#BioSequences.hasambiguity"><code>BioSequences.hasambiguity</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">hasambiguity(seq::BioSequence)</code></pre><p>Return <code>true</code> if <code>seq</code> has an ambiguous symbol; otherwise return <code>false</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/predicates.jl#L99-L103">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.iscanonical" href="#BioSequences.iscanonical"><code>BioSequences.iscanonical</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">iscanonical(seq::NucleotideSeq)</code></pre><p>Return <code>true</code> if <code>seq</code> is canonical.</p><p>For any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:</p><pre><code class="nohighlight hljs">-------&gt;
 ATCGATCG
 CGATCGAT
-&lt;-------</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Calling <a href="../transforms/#BioSequences.reverse_complement"><code>reverse_complement</code></a> on a DNA sequence gives this reverse complement.</p></div></div><p>Of the two sequences, the <em>canonical</em> one is the lesser of the two, i.e. <code>canonical_seq &lt; other_seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/predicates.jl#L115-L136">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../transforms/">« Indexing &amp; modifying sequences</a><a class="docs-footer-nextpage" href="../random/">Random sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span 
class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+&lt;-------</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Calling <a href="../transforms/#BioSequences.reverse_complement"><code>reverse_complement</code></a> on a DNA sequence gives this reverse complement.</p></div></div><p>Of the two sequences, the <em>canonical</em> one is the lesser of the two, i.e. <code>canonical_seq &lt; other_seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/predicates.jl#L115-L136">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../transforms/">« Indexing &amp; modifying sequences</a><a class="docs-footer-nextpage" href="../random/">Random sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span 
class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/random/index.html b/dev/random/index.html
index 8648e87f..302b8737 100644
--- a/dev/random/index.html
+++ b/dev/random/index.html
@@ -1,8 +1,8 @@
 <!DOCTYPE html>
 <html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Random sequences · BioSequences.jl</title><meta name="title" content="Random sequences · BioSequences.jl"/><meta property="og:title" content="Random sequences · BioSequences.jl"/><meta property="twitter:title" content="Random sequences · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" 
data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li><a class="tocitem" href="../types/">BioSequences Types</a></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="../predicates/">Predicates</a></li><li class="is-active"><a class="tocitem" href>Random sequences</a><ul class="internal"><li><a class="tocitem" href="#Long-sequences"><span>Long sequences</span></a></li></ul></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li><a class="tocitem" href="../io/">I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom 
types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>Random sequences</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>Random sequences</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/random.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="Generating-random-sequences"><a class="docs-heading-anchor" href="#Generating-random-sequences">Generating random sequences</a><a id="Generating-random-sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Generating-random-sequences" title="Permalink"></a></h1><h2 id="Long-sequences"><a class="docs-heading-anchor" href="#Long-sequences">Long sequences</a><a id="Long-sequences-1"></a><a class="docs-heading-anchor-permalink" 
href="#Long-sequences" title="Permalink"></a></h2><p>You can generate random long sequences using the <code>randseq</code> function and the <code>Sampler</code>s implemented in BioSequences:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randseq" href="#BioSequences.randseq"><code>BioSequences.randseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randseq([rng::AbstractRNG], A::Alphabet, len::Integer)</code></pre><p>Generate a LongSequence{A} of length <code>len</code> from the specified alphabet, drawn from the default distribution. User-defined alphabets should implement this method to support random LongSequence generation.</p><p>For RNA and DNA alphabets, the default distribution is uniform across A, C, G, and T/U. For AminoAcidAlphabet, it is uniform across the 20 standard amino acids. For a user-defined alphabet A, the default is uniform across all elements of <code>symbols(A)</code>.</p><p><strong>Example:</strong></p><pre><code class="nohighlight hljs">julia&gt; seq = randseq(AminoAcidAlphabet(), 50)
 50aa Amino Acid Sequence:
-VFMHSIRMIRLMVHRSWKMHSARHVNFIRCQDKKWKSADGIYTDICKYSM</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L106-L125">source</a></section><section><div><pre><code class="language-julia hljs">randseq([rng::AbstractRNG], A::Alphabet, sp::Sampler, len::Integer)</code></pre><p>Generate a LongSequence{A} of length <code>len</code> with elements drawn from the given sampler.</p><p><strong>Example:</strong></p><pre><code class="nohighlight hljs"># Generate 50-length RNA with a 4% chance of N and 24% each for A, C, G, and U
+VFMHSIRMIRLMVHRSWKMHSARHVNFIRCQDKKWKSADGIYTDICKYSM</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L106-L125">source</a></section><section><div><pre><code class="language-julia hljs">randseq([rng::AbstractRNG], A::Alphabet, sp::Sampler, len::Integer)</code></pre><p>Generate a LongSequence{A} of length <code>len</code> with elements drawn from the given sampler.</p><p><strong>Example:</strong></p><pre><code class="nohighlight hljs"># Generate 50-length RNA with a 4% chance of N and 24% each for A, C, G, and U
 julia&gt; sp = SamplerWeighted(rna&quot;ACGUN&quot;, fill(0.24, 4))
 julia&gt; seq = randseq(RNAAlphabet{4}(), sp, 50)
 50nt RNA Sequence:
-CUNGGGCCCGGGNAAACGUGGUACACCCUGUUAAUAUCAACNNGCGCUNU</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L130-L144">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randdnaseq" href="#BioSequences.randdnaseq"><code>BioSequences.randdnaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randdnaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{DNAAlphabet{4}} sequence of length <code>len</code>, with bases sampled uniformly from [A, C, G, T]</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L199-L204">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randrnaseq" href="#BioSequences.randrnaseq"><code>BioSequences.randrnaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randrnaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{RNAAlphabet{4}} sequence of length <code>len</code>, with bases sampled uniformly from [A, C, G, U]</p></div><a class="docs-sourcelink" target="_blank" 
href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L207-L212">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randaaseq" href="#BioSequences.randaaseq"><code>BioSequences.randaaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randaaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{AminoAcidAlphabet} sequence of length <code>len</code>, with amino acids sampled uniformly from the 20 standard amino acids.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L215-L220">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.SamplerUniform" href="#BioSequences.SamplerUniform"><code>BioSequences.SamplerUniform</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">SamplerUniform{T}</code></pre><p>Uniform sampler of type T. 
Instantiate with a collection of eltype T containing the elements to sample.</p><p><strong>Examples</strong></p><pre><code class="nohighlight hljs">julia&gt; sp = SamplerUniform(rna&quot;ACGU&quot;);</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L12-L22">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.SamplerWeighted" href="#BioSequences.SamplerWeighted"><code>BioSequences.SamplerWeighted</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">SamplerWeighted{T}</code></pre><p>Weighted sampler of type T. Instantiate with a collection of eltype T containing the elements to sample, and an ordered collection of probabilities, one for each element except the last. 
The last probability is the remaining probability up to 1.</p><p><strong>Examples</strong></p><pre><code class="nohighlight hljs">julia&gt; sp = SamplerWeighted(rna&quot;ACGUN&quot;, fill(0.2475, 4));</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/randseq.jl#L41-L53">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../predicates/">« Predicates</a><a class="docs-footer-nextpage" href="../sequence_search/">Pattern matching and searching »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+CUNGGGCCCGGGNAAACGUGGUACACCCUGUUAAUAUCAACNNGCGCUNU</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L130-L144">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randdnaseq" href="#BioSequences.randdnaseq"><code>BioSequences.randdnaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randdnaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{DNAAlphabet{4}} sequence of length <code>len</code>, with bases sampled uniformly from [A, C, G, T]</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L199-L204">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randrnaseq" href="#BioSequences.randrnaseq"><code>BioSequences.randrnaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randrnaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{RNAAlphabet{4}} sequence of length <code>len</code>, with bases sampled uniformly from [A, C, G, U]</p></div><a class="docs-sourcelink" target="_blank" 
href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L207-L212">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.randaaseq" href="#BioSequences.randaaseq"><code>BioSequences.randaaseq</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">randaaseq([rng::AbstractRNG], len::Integer)</code></pre><p>Generate a random LongSequence{AminoAcidAlphabet} sequence of length <code>len</code>, with amino acids sampled uniformly from the 20 standard amino acids.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L215-L220">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.SamplerUniform" href="#BioSequences.SamplerUniform"><code>BioSequences.SamplerUniform</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">SamplerUniform{T}</code></pre><p>Uniform sampler of type T. 
Instantiate with a collection of eltype T containing the elements to sample.</p><p><strong>Examples</strong></p><pre><code class="nohighlight hljs">julia&gt; sp = SamplerUniform(rna&quot;ACGU&quot;);</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L12-L22">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.SamplerWeighted" href="#BioSequences.SamplerWeighted"><code>BioSequences.SamplerWeighted</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">SamplerWeighted{T}</code></pre><p>Weighted sampler of type T. Instantiate with a collection of eltype T containing the elements to sample, and an ordered collection of probabilities, one for each element except the last. 
The last probability is the remaining probability up to 1.</p><p><strong>Examples</strong></p><pre><code class="nohighlight hljs">julia&gt; sp = SamplerWeighted(rna&quot;ACGUN&quot;, fill(0.2475, 4));</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/randseq.jl#L41-L53">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../predicates/">« Predicates</a><a class="docs-footer-nextpage" href="../sequence_search/">Pattern matching and searching »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/recipes/index.html b/dev/recipes/index.html
index c0d3aa03..a4cf1ae7 100644
--- a/dev/recipes/index.html
+++ b/dev/recipes/index.html
@@ -29,4 +29,4 @@
  0  0  1  0  0  0  0  1  0  0
  0  0  1  0  0  1  0  0  0  0
  0  1  1  0  1  0  0  0  0  0
- 1  0  1  1  1  0  1  1  0  1</code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../interfaces/">« Implementing custom types</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+ 1  0  1  1  1  0  1  1  0  1</code></pre></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../interfaces/">« Implementing custom types</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/search_index.js b/dev/search_index.js
index 40c54a97..d5e66b43 100644
--- a/dev/search_index.js
+++ b/dev/search_index.js
@@ -1,3 +1,3 @@
 var documenterSearchIndex = {"docs":
-[{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"symbols/#Biological-symbols","page":"Biological Symbols","title":"Biological symbols","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"The BioSequences module reexports the biological symbol (character) types that are provided by BioSymbols.jl:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Type Meaning\nDNA DNA nucleotide\nRNA RNA nucleotide\nAminoAcid Amino acid","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"These symbols are elements of biological sequence types, just as characters are elements of strings.","category":"page"},{"location":"symbols/#DNA-and-RNA-nucleotides","page":"Biological Symbols","title":"DNA and RNA nucleotides","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Set of nucleotide symbols in BioSequences covers IUPAC nucleotide base plus a gap symbol:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbol Constant Meaning\n'A' DNA_A / RNA_A A; Adenine\n'C' DNA_C / RNA_C C; Cytosine\n'G' DNA_G / RNA_G G; Guanine\n'T' DNA_T T; Thymine (DNA only)\n'U' RNA_U U; Uracil (RNA only)\n'M' DNA_M / RNA_M A or C\n'R' DNA_R / RNA_R A or G\n'W' DNA_W / RNA_W A or T/U\n'S' DNA_S / RNA_S C or G\n'Y' DNA_Y / RNA_Y C or T/U\n'K' DNA_K / RNA_K G or T/U\n'V' DNA_V / RNA_V A or C or G; not T/U\n'H' DNA_H / RNA_H A or C or T; not G\n'D' DNA_D / RNA_D A or G or T/U; not C\n'B' DNA_B / RNA_B C or G or T/U; not A\n'N' DNA_N / RNA_N A or C or G or T/U\n'-' DNA_Gap / RNA_Gap Gap (none of the 
above)","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"https://www.bioinformatics.org/sms/iupac.html","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols are accessible as constants with DNA_ or RNA_ prefix:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> DNA_A\nDNA_A\n\njulia> DNA_T\nDNA_T\n\njulia> RNA_U\nRNA_U\n\njulia> DNA_Gap\nDNA_Gap\n\njulia> typeof(DNA_A)\nDNA\n\njulia> typeof(RNA_A)\nRNA\n","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols can be constructed by converting regular characters:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> convert(DNA, 'C')\nDNA_C\n\njulia> convert(DNA, 'C') === DNA_C\ntrue\n","category":"page"},{"location":"symbols/#Amino-acids","page":"Biological Symbols","title":"Amino acids","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Set of amino acid symbols also covers IUPAC amino acid symbols plus a gap symbol:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbol Constant Meaning\n'A' AA_A Alanine\n'R' AA_R Arginine\n'N' AA_N Asparagine\n'D' AA_D Aspartic acid (Aspartate)\n'C' AA_C Cysteine\n'Q' AA_Q Glutamine\n'E' AA_E Glutamic acid (Glutamate)\n'G' AA_G Glycine\n'H' AA_H Histidine\n'I' AA_I Isoleucine\n'L' AA_L Leucine\n'K' AA_K Lysine\n'M' AA_M Methionine\n'F' AA_F Phenylalanine\n'P' AA_P Proline\n'S' AA_S Serine\n'T' AA_T Threonine\n'W' AA_W Tryptophan\n'Y' AA_Y Tyrosine\n'V' AA_V Valine\n'O' AA_O Pyrrolysine\n'U' AA_U Selenocysteine\n'B' AA_B Aspartic acid or Asparagine\n'J' AA_J Leucine or Isoleucine\n'Z' AA_Z Glutamine or Glutamic acid\n'X' AA_X Any amino acid\n'*' AA_Term 
Termination codon\n'-' AA_Gap Gap (none of the above)","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"https://www.bioinformatics.org/sms/iupac.html","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols are accessible as constants with AA_ prefix:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> AA_A\nAA_A\n\njulia> AA_Q\nAA_Q\n\njulia> AA_Term\nAA_Term\n\njulia> typeof(AA_A)\nAminoAcid\n","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols can be constructed by converting regular characters:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> convert(AminoAcid, 'A')\nAA_A\n\njulia> convert(AminoAcid, 'P') === AA_P\ntrue\n","category":"page"},{"location":"symbols/#Other-functions","page":"Biological Symbols","title":"Other functions","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"alphabet\ngap\niscompatible\nisambiguous","category":"page"},{"location":"symbols/#BioSymbols.alphabet","page":"Biological Symbols","title":"BioSymbols.alphabet","text":"alphabet(DNA)\n\nGet all symbols of DNA in sorted order.\n\nExamples\n\njulia> alphabet(DNA)\n(DNA_Gap, DNA_A, DNA_C, DNA_M, DNA_G, DNA_R, DNA_S, DNA_V, DNA_T, DNA_W, DNA_Y, DNA_H, DNA_K, DNA_D, DNA_B, DNA_N)\n\njulia> issorted(alphabet(DNA))\ntrue\n\n\n\n\n\n\nalphabet(RNA)\n\nGet all symbols of RNA in sorted order.\n\nExamples\n\njulia> alphabet(RNA)\n(RNA_Gap, RNA_A, RNA_C, RNA_M, RNA_G, RNA_R, RNA_S, RNA_V, RNA_U, RNA_W, RNA_Y, RNA_H, RNA_K, RNA_D, RNA_B, RNA_N)\n\njulia> issorted(alphabet(RNA))\ntrue\n\n\n\n\n\n\nalphabet(AminoAcid)\n\nGet all symbols of AminoAcid in sorted order.\n\nExamples\n\njulia> alphabet(AminoAcid)\n(AA_A, AA_R, 
AA_N, AA_D, AA_C, AA_Q, AA_E, AA_G, AA_H, AA_I, AA_L, AA_K, AA_M, AA_F, AA_P, AA_S, AA_T, AA_W, AA_Y, AA_V, AA_O, AA_U, AA_B, AA_J, AA_Z, AA_X, AA_Term, AA_Gap)\n\njulia> issorted(alphabet(AminoAcid))\ntrue\n\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.gap","page":"Biological Symbols","title":"BioSymbols.gap","text":"gap(::Type{T})::T\n\nReturn the gap (indel) representation of T. By default, gap is defined for DNA, RNA, AminoAcid and Char.\n\nExamples\n\njulia> gap(RNA)\nRNA_Gap\n\njulia> gap(Char)\n'-': ASCII/Unicode U+002D (category Pd: Punctuation, dash)\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.iscompatible","page":"Biological Symbols","title":"BioSymbols.iscompatible","text":"iscompatible(x::S, y::S) where S <: BioSymbol\n\nTest if x and y are compatible with each other.\n\nExamples\n\njulia> iscompatible(AA_A, AA_R)\nfalse\n\njulia> iscompatible(AA_A, AA_X)\ntrue\n\njulia> iscompatible(DNA_A, DNA_A)\ntrue\n\njulia> iscompatible(DNA_C, DNA_N)  # DNA_N can be DNA_C\ntrue\n\njulia> iscompatible(DNA_C, DNA_R)  # DNA_R (A or G) cannot be DNA_C\nfalse\n\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.isambiguous","page":"Biological Symbols","title":"BioSymbols.isambiguous","text":"isambiguous(nt::NucleicAcid)\n\nTest if nt is an ambiguous nucleotide.\n\n\n\n\n\nisambiguous(aa::AminoAcid)\n\nTest if aa is an ambiguous amino acid.\n\n\n\n\n\n","category":"function"},{"location":"io/#I/O-for-sequencing-file-formats","page":"I/O","title":"I/O for sequencing file formats","text":"","category":"section"},{"location":"io/","page":"I/O","title":"I/O","text":"Versions of BioSequences prior to v2.0 provided a FASTA, FASTQ, and 2Bit submodule for working with formatted sequence files.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"After version v2.0, in order to neatly separate concerns, these submodules were 
removed.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"Instead there will now be dedicated BioJulia packages for each format. Each of these will be compatible with BioSequences.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"A list of all of the different formats and packages is provided below to help you find them quickly.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"Format Package\nFASTA FASTX.jl\nFASTQ FASTX.jl\n2Bit TwoBit.jl","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"counting/#Counting","page":"Counting","title":"Counting","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"BioSequences contains functionality to efficiently count biosymbols in a biosequence that satisfies some predicate.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"Consider a naive counting function like this:","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"function count_Ns(seq::BioSequence{<:DNAAlphabet})\n    ns = 0\n    for i in seq\n        ns += (i == DNA_N)::Bool\n    end\n    ns\nend ","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"This function can be more efficiently implemented by exploiting the internal data layout of certain biosequences. Therefore, Julia provides optimised methods for Base.count, such that count_Ns above can be more efficiently expressed count(==(DNA_N), seq).","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"note: Note\nIt is important to understand that this speed is achieved with custom methods of Base.count, and not by a generic mechanism that improves the speed of counting symbols in BioSequencein general. 
Hence, while count(==(DNA_N), seq) may be optimised, count(i -> i == DNA_N, seq) is not, as this is a different method.","category":"page"},{"location":"counting/#Currently-optimised-methods","page":"Counting","title":"Currently optimised methods","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"By default, only the BioSequence and Alphabet types found in BioSequences.jl have optimised methods.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"count(isGC, seq)\ncount(isambiguous, seq)\ncount(iscertain, seq)\ncount(isgap, seq)\ncount(==(biosymbol), seq) and count(isequal(biosymbol), seq)","category":"page"},{"location":"counting/#Matches-and-mismatches","page":"Counting","title":"Matches and mismatches","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"The methods matches and mismatches take two sequences and count the number of positions where the sequences are unequal or equal, respectively.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"They are equivalent to matches(a, b) = count(splat(==), zip(a, b)) (and with !=, respectively).","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"matches\nmismatches","category":"page"},{"location":"counting/#BioSequences.matches","page":"Counting","title":"BioSequences.matches","text":"matches(a::BioSequence, b::BioSequences) -> Int\n\nCount the number of positions in where a and b are equal. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. DNA_A does not match DNA_N.\n\nwarning: Warning\nPassing in two sequences with differing lengths is deprecated. 
In a future, breaking release of BioSequences, this will error.\n\nExamples\n\njulia> matches(dna\"TAWNNA\", dna\"TACCTA\")\n3\n\njulia> matches(dna\"AACA\", dna\"AAG\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.mismatches","page":"Counting","title":"BioSequences.mismatches","text":"mismatches(a::BioSequence, b::BioSequences) -> Int\n\nCount the number of positions in where a and b differ. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. DNA_A does not match DNA_N.\n\nwarning: Warning\nPassing in two sequences with differing lengths is deprecated. In a future, breaking release of BioSequences, this will error.\n\nExamples\n\njulia> mismatches(dna\"TAGCTA\", dna\"TACNTA\")\n2\n\njulia> mismatches(dna\"AACA\", dna\"AAG\")\n1\n\n\n\n\n\n","category":"function"},{"location":"counting/#GC-content","page":"Counting","title":"GC content","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"The convenience function gc_content(seq) is equivalent to count(isGC, seq) / length(seq):","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"gc_content","category":"page"},{"location":"counting/#BioSequences.gc_content","page":"Counting","title":"BioSequences.gc_content","text":"gc_content(seq::BioSequence) -> Float64\n\nCalculate GC content of seq, i.e. 
the number of symbols that is DNA_C, DNA_G, DNA_C or DNA_G divided by the length of the sequence.\n\nExamples\n\njulia> gc_content(dna\"AGCTA\")\n0.4\n\njulia> gc_content(rna\"UAGCGA\")\n0.5\n\n\n\n\n\n","category":"function"},{"location":"counting/#Deprecated-aliases","page":"Counting","title":"Deprecated aliases","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"Several of the optimised count methods have function names, which are deprecated:","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"Deprecated function Instead use\nn_gaps count(isgap, seq)\nn_certain count(iscertain, seq)\nn_ambiguous count(isambiguous, seq)","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"n_gaps\nn_certain\nn_ambiguous","category":"page"},{"location":"counting/#BioSequences.n_gaps","page":"Counting","title":"BioSequences.n_gaps","text":"n_gaps(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (or b, if present) have gaps. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence.\n\nwarning: Warning\nPassing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_gaps(dna\"--TAC-WN-ACY\")\n4\n\njulia> n_gaps(dna\"TC-AC-\", dna\"-CACG\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.n_certain","page":"Counting","title":"BioSequences.n_certain","text":"n_certain(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (and b, if present) have certain (i.e. non-ambigous and non-gap) symbols. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. Gaps are not certain.\n\nwarning: Warning\nPassing in two sequences is deprecated. 
In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_certain(dna\"--TAC-WN-ACY\")\n5\n\njulia> n_certain(rna\"UAYWW\", rna\"UAW\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.n_ambiguous","page":"Counting","title":"BioSequences.n_ambiguous","text":"n_ambiguous(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (or b, if present) have ambigious symbols. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. Gaps are not ambigous.\n\nwarning: Warning\nPassing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_ambiguous(dna\"--TAC-WN-ACY\")\n3\n\njulia> n_ambiguous(rna\"UAYWW\", rna\"UAW\")\n1\n\n\n\n\n\n","category":"function"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"interfaces/#Custom-BioSequences-types","page":"Implementing custom types","title":"Custom BioSequences types","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"If you're a developing your own Bioinformatics package or method, you may find that the reference implementation of concrete LongSequence types provided in this package are not optimal for your purposes.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"This page describes the interfaces for BioSequences' core types for developers or other packages implementing their own sequence types or extending BioSequences functionality.","category":"page"},{"location":"interfaces/#Implementing-custom-Alphabets","page":"Implementing custom types","title":"Implementing custom 
Alphabets","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Recall the required methods that define the Alphabet interface. ","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"To create an example custom alphabet, we need to create a singleton type, that implements a few methods in order to conform to the interface as described in the Alphabet documentation.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Let's do that for a restricted Amino Acid alphabet. We can test that it conforms to the interface with the BioSequences.has_interface function.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"julia> struct ReducedAAAlphabet <: Alphabet end\n\njulia> Base.eltype(::Type{ReducedAAAlphabet}) = AminoAcid\n\njulia> BioSequences.BitsPerSymbol(::ReducedAAAlphabet) = BioSequences.BitsPerSymbol{4}()\n\njulia> function BioSequences.symbols(::ReducedAAAlphabet)\n           (AA_L, AA_C, AA_A, AA_G, AA_S, AA_T, AA_P, AA_F,\n            AA_W, AA_E, AA_D, AA_N, AA_Q, AA_K, AA_H, AA_M)\n       end\n\njulia> const (ENC_LUT, DEC_LUT) = let\n           enc_lut = fill(0xff, length(alphabet(AminoAcid)))\n           dec_lut = fill(AA_A, length(symbols(ReducedAAAlphabet())))\n           for (i, aa) in enumerate(symbols(ReducedAAAlphabet()))\n               enc_lut[reinterpret(UInt8, aa) + 0x01] = i - 1\n               dec_lut[i] = aa\n           end\n           (Tuple(enc_lut), Tuple(dec_lut))\n       end\n((0x02, 0xff, 0x0b, 0x0a, 0x01, 0x0c, 0x09, 0x03, 0x0e, 0xff, 0x00, 0x0d, 0x0f, 0x07, 0x06, 0x04, 0x05, 0x08, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff), (AA_L, AA_C, AA_A, AA_G, AA_S, AA_T, AA_P, AA_F, AA_W, AA_E, AA_D, AA_N, AA_Q, AA_K, AA_H, 
AA_M))\n\njulia> function BioSequences.encode(::ReducedAAAlphabet, aa::AminoAcid)\n           i = reinterpret(UInt8, aa) + 0x01\n           (i ≥ length(ENC_LUT) || @inbounds ENC_LUT[i] === 0xff) && throw(DomainError(aa))\n           (@inbounds ENC_LUT[i]) % UInt\n       end\n\njulia> function BioSequences.decode(::ReducedAAAlphabet, x::UInt)\n           x ≥ length(DEC_LUT) && throw(DomainError(aa))\n           @inbounds DEC_LUT[x + UInt(1)]\n       end\n\njulia> BioSequences.has_interface(Alphabet, ReducedAAAlphabet())\ntrue\n","category":"page"},{"location":"interfaces/#Implementing-custom-BioSequences","page":"Implementing custom types","title":"Implementing custom BioSequences","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Recall the required methods that define the BioSequence interface. ","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"To create an example custom alphabet, we need to create a singleton type, that implements a few methods in order to conform to the interface as described in the BioSequence documentation.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Let's do that for a custom sequence type that is optimised to represent a small sequence: A Codon. 
We can test that it conforms to the interface with the BioSequences.has_interface function.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"julia> struct Codon <: BioSequence{RNAAlphabet{2}}\n           x::UInt8\n       end\n\njulia> function Codon(iterable)\n           length(iterable) == 3 || error(\"Must have length 3\")\n           x = zero(UInt)\n           for (i, nt) in enumerate(iterable)\n               x |= BioSequences.encode(Alphabet(Codon), convert(RNA, nt)) << (6-2i)\n           end\n           Codon(x % UInt8)\n       end\nCodon\n\njulia> Base.length(::Codon) = 3\n\njulia> BioSequences.encoded_data_eltype(::Type{Codon}) = UInt\n\njulia> function BioSequences.extract_encoded_element(x::Codon, i::Int)\n           ((x.x >>> (6-2i)) & 3) % UInt\n       end\n\njulia> Base.copy(seq::Codon) = Codon(seq.x)\n\njulia> BioSequences.has_interface(BioSequence, Codon, [RNA_C, RNA_U, RNA_A], false)\ntrue","category":"page"},{"location":"interfaces/#Interface-checking-functions","page":"Implementing custom types","title":"Interface checking functions","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"BioSequences.has_interface","category":"page"},{"location":"interfaces/#BioSequences.has_interface","page":"Implementing custom types","title":"BioSequences.has_interface","text":"function has_interface(::Type{Alphabet}, A::Alphabet)\n\nReturns whether A conforms to the Alphabet interface.\n\n\n\n\n\nhas_interface(::Type{BioSequence}, ::T, syms::Vector, mutable::Bool, compat::Bool=true)\n\nCheck if type T conforms to the BioSequence interface. A T is constructed from the vector of element types syms which must not be empty. If the mutable flag is set, also check the mutable interface. 
If the compat flag is set, check for compatibility with existing alphabets.\n\n\n\n\n\n","category":"function"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"random/#Generating-random-sequences","page":"Random sequences","title":"Generating random sequences","text":"","category":"section"},{"location":"random/#Long-sequences","page":"Random sequences","title":"Long sequences","text":"","category":"section"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"You can generate random long sequences using the randdna function and the Sampler's implemented in BioSequences:","category":"page"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"randseq\nranddnaseq\nrandrnaseq\nrandaaseq\nSamplerUniform\nSamplerWeighted","category":"page"},{"location":"random/#BioSequences.randseq","page":"Random sequences","title":"BioSequences.randseq","text":"randseq([rng::AbstractRNG], A::Alphabet, len::Integer)\n\nGenerate a LongSequence{A} of length len from the specified alphabet, drawn from the default distribution. User-defined alphabets should implement this method to implement random LongSequence generation.\n\nFor RNA and DNA alphabets, the default distribution is uniform across A, C, G, and T/U. For AminoAcidAlphabet, it is uniform across the 20 standard amino acids. 
For a user-defined alphabet A, default is uniform across all elements of symbols(A).\n\nExample:\n\njulia> seq = randseq(AminoAcidAlphabet(), 50)\n50aa Amino Acid Sequence:\nVFMHSIRMIRLMVHRSWKMHSARHVNFIRCQDKKWKSADGIYTDICKYSM\n\n\n\n\n\nrandseq([rng::AbstractRNG], A::Alphabet, sp::Sampler, len::Integer)\n\nGenerate a LongSequence{A} of length len with elements drawn from the given sampler.\n\nExample:\n\n# Generate 1000-length RNA with 4% chance of N, 24% for A, C, G, or U\njulia> sp = SamplerWeighted(rna\"ACGUN\", fill(0.24, 4))\njulia> seq = randseq(RNAAlphabet{4}(), sp, 50)\n50nt RNA Sequence:\nCUNGGGCCCGGGNAAACGUGGUACACCCUGUUAAUAUCAACNNGCGCUNU\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randdnaseq","page":"Random sequences","title":"BioSequences.randdnaseq","text":"randdnaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{DNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, T]\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randrnaseq","page":"Random sequences","title":"BioSequences.randrnaseq","text":"randrnaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{RNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, U]\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randaaseq","page":"Random sequences","title":"BioSequences.randaaseq","text":"randaaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{AminoAcidAlphabet} sequence of length len, with amino acids sampled uniformly from the 20 standard amino acids.\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.SamplerUniform","page":"Random sequences","title":"BioSequences.SamplerUniform","text":"SamplerUniform{T}\n\nUniform sampler of type T. 
Instantiate with a collection of eltype T containing the elements to sample.\n\nExamples\n\njulia> sp = SamplerUniform(rna\"ACGU\");\n\n\n\n\n\n","category":"type"},{"location":"random/#BioSequences.SamplerWeighted","page":"Random sequences","title":"BioSequences.SamplerWeighted","text":"SamplerWeighted{T}\n\nWeighted sampler of type T. Instantiate with a collection of eltype T containing the elements to sample, and an orderen collection of probabilities to sample each element except the last. The last probability is the remaining probability up to 1.\n\nExamples\n\njulia> sp = SamplerWeighted(rna\"ACGUN\", fill(0.2475, 4));\n\n\n\n\n\n","category":"type"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"transforms/#Indexing-and-modifying-sequences","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"","category":"section"},{"location":"transforms/#Indexing","page":"Indexing & modifying sequences","title":"Indexing","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Most BioSequence concrete subtypes for the most part behave like other vector or string types. 
They can be indexed using integers or ranges:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"For example, with LongSequences:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTTTANAGTNNAGTACC\"\n19nt DNA Sequence:\nACGTTTANAGTNNAGTACC\n\njulia> seq[5]\nDNA_T\n\njulia> seq[6:end]\n14nt DNA Sequence:\nTANAGTNNAGTACC\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"The biological symbol at a given locus in a biological sequence can be set using setindex:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTTTANAGTNNAGTACC\"\n19nt DNA Sequence:\nACGTTTANAGTNNAGTACC\n\njulia> seq[5] = DNA_A\nDNA_A\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"note: Note\nSome types such can be indexed using integers but not using ranges.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"For LongSequence types, indexing a sequence by range creates a copy of the original sequence, similar to Array in Julia's Base library. 
If you find yourself slowed down by the allocation of these subsequences, consider using a sequence view instead.","category":"page"},{"location":"transforms/#Modifying-sequences","page":"Indexing & modifying sequences","title":"Modifying sequences","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"In addition to setindex, many other modifying operations are possible for biological sequences such as push!, pop!, and insert!, which should be familiar to anyone used to editing arrays.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"push!(::BioSequences.BioSequence, ::Any)\npop!(::BioSequences.BioSequence)\npushfirst!(::BioSequences.BioSequence, ::Any)\npopfirst!(::BioSequences.BioSequence)\ninsert!(::BioSequences.BioSequence, ::Integer, ::Any)\ndeleteat!(::BioSequences.BioSequence, ::Integer)\nappend!(::BioSequences.BioSequence, ::BioSequences.BioSequence)\nresize!(::BioSequences.LongSequence, ::Integer)\nempty!(::BioSequences.BioSequence)","category":"page"},{"location":"transforms/#Base.push!-Tuple{BioSequence, Any}","page":"Indexing & modifying sequences","title":"Base.push!","text":"push!(seq::BioSequence, x)\n\nAppend a biological symbol x to a biological sequence seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.pop!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.pop!","text":"pop!(seq::BioSequence)\n\nRemove the symbol from the end of a biological sequence seq and return it. 
Returns a variable of eltype(seq).\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.pushfirst!-Tuple{BioSequence, Any}","page":"Indexing & modifying sequences","title":"Base.pushfirst!","text":"pushfirst!(seq, x)\n\nInsert a biological symbol x at the beginning of a biological sequence seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.popfirst!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.popfirst!","text":"popfirst!(seq)\n\nRemove the symbol from the beginning of a biological sequence seq and return it. Returns a variable of eltype(seq).\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.insert!-Tuple{BioSequence, Integer, Any}","page":"Indexing & modifying sequences","title":"Base.insert!","text":"insert!(seq::BioSequence, i, x)\n\nInsert a biological symbol x into a biological sequence seq, at the given index i.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.deleteat!-Tuple{BioSequence, Integer}","page":"Indexing & modifying sequences","title":"Base.deleteat!","text":"deleteat!(seq::BioSequence, i::Integer)\n\nDelete a biological symbol at a single position i in a biological sequence seq.\n\nModifies the input sequence.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.append!-Tuple{BioSequence, BioSequence}","page":"Indexing & modifying sequences","title":"Base.append!","text":"append!(seq, other)\n\nAdd a biological sequence other onto the end of biological sequence seq. Modifies and returns seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.resize!-Tuple{LongSequence, Integer}","page":"Indexing & modifying sequences","title":"Base.resize!","text":"resize!(seq, size, [force::Bool=false])\n\nResize a biological sequence seq, to a given size. Does not resize the underlying data array unless the new size does not fit. 
If force, always resize underlying data array.\n\nNote that resizing to a larger size, and then loading from uninitialized positions is not allowed and may cause undefined behaviour.  Make sure to always fill any uninitialized biosymbols after resizing.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.empty!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.empty!","text":"empty!(seq::BioSequence)\n\nCompletely empty a biological sequence seq of nucleotides.\n\n\n\n\n\n","category":"method"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Here are some examples:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACG\"\n3nt DNA Sequence:\nACG\n\njulia> push!(seq, DNA_T)\n4nt DNA Sequence:\nACGT\n\njulia> append!(seq, dna\"AT\")\n6nt DNA Sequence:\nACGTAT\n\njulia> deleteat!(seq, 2)\n5nt DNA Sequence:\nAGTAT\n\njulia> deleteat!(seq, 2:3)\n3nt DNA Sequence:\nAAT\n","category":"page"},{"location":"transforms/#Additional-transformations","page":"Indexing & modifying sequences","title":"Additional transformations","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"In addition to these basic modifying functions, other sequence transformations that are common in bioinformatics are also provided.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"reverse!(::BioSequences.LongSequence)\nreverse(::BioSequences.LongSequence{<:NucleicAcidAlphabet})\ncomplement!\ncomplement\nreverse_complement!\nreverse_complement\nungap!\nungap\ncanonical!\ncanonical","category":"page"},{"location":"transforms/#Base.reverse!-Tuple{LongSequence}","page":"Indexing & modifying 
sequences","title":"Base.reverse!","text":"reverse!(seq::LongSequence)\n\nReverse a biological sequence seq in place.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.reverse-Tuple{LongSequence{<:NucleicAcidAlphabet}}","page":"Indexing & modifying sequences","title":"Base.reverse","text":"reverse(seq::BioSequence)\n\nCreate reversed copy of a biological sequence.\n\n\n\n\n\nreverse(seq::LongSequence)\n\nCreate reversed copy of a biological sequence.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#BioSequences.complement!","page":"Indexing & modifying sequences","title":"BioSequences.complement!","text":"complement!(seq)\n\nMake a complement sequence of seq in place.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSymbols.complement","page":"Indexing & modifying sequences","title":"BioSymbols.complement","text":"complement(nt::NucleicAcid)\n\nReturn the complementary nucleotide of nt.\n\nThis function returns the union of all possible complementary nucleotides.\n\nExamples\n\njulia> complement(DNA_A)\nDNA_T\n\njulia> complement(DNA_N)\nDNA_N\n\njulia> complement(RNA_U)\nRNA_A\n\n\n\n\n\n\ncomplement(seq)\n\nMake a complement sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.reverse_complement!","page":"Indexing & modifying sequences","title":"BioSequences.reverse_complement!","text":"reverse_complement!(seq)\n\nMake a reversed complement sequence of seq in place.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.reverse_complement","page":"Indexing & modifying sequences","title":"BioSequences.reverse_complement","text":"reverse_complement(seq)\n\nMake a reversed complement sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ungap!","page":"Indexing & modifying sequences","title":"BioSequences.ungap!","text":"Remove gap characters from an input 
sequence.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ungap","page":"Indexing & modifying sequences","title":"BioSequences.ungap","text":"Create a copy of a sequence with gap characters removed.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.canonical!","page":"Indexing & modifying sequences","title":"BioSequences.canonical!","text":"canonical!(seq::NucleotideSeq)\n\nTransforms the seq into its canonical form, if it is not already canonical. Modifies the input sequence in place.\n\nFor any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:\n\n------->\nATCGATCG\nCGATCGAT\n<-------\n\nnote: Note\nUsing the reverse_complement of a DNA sequence will give this reverse complement.\n\nOf the two sequences, the canonical one is the lesser of the two, i.e. canonical_seq < other_seq.\n\nUsing this function on a seq will ensure it is the canonical version.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.canonical","page":"Indexing & modifying sequences","title":"BioSequences.canonical","text":"canonical(seq::NucleotideSeq)\n\nCreate the canonical sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Some examples:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTAT\"\n6nt DNA Sequence:\nACGTAT\n\njulia> reverse!(seq)\n6nt DNA Sequence:\nTATGCA\n\njulia> complement!(seq)\n6nt DNA Sequence:\nATACGT\n\njulia> reverse_complement!(seq)\n6nt DNA Sequence:\nACGTAT\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Many of these methods also have a version which makes a copy of the input sequence, so you get a modified 
copy, and don't alter the original sequence. Such methods are named the same, but without the exclamation mark. E.g. reverse instead of reverse!, and ungap instead of ungap!.  ","category":"page"},{"location":"transforms/#Translation","page":"Indexing & modifying sequences","title":"Translation","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Translation is a slightly more complex transformation for RNA Sequences and so we describe it here in more detail.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"The translate function translates a sequence of codons in an RNA sequence to an amino acid sequence based on a genetic code. The BioSequences package provides all NCBI-defined genetic codes and they are registered in ncbi_trans_table.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"translate\nncbi_trans_table","category":"page"},{"location":"transforms/#BioSequences.translate","page":"Indexing & modifying sequences","title":"BioSequences.translate","text":"translate(seq, code=standard_genetic_code, allow_ambiguous_codons=true, alternative_start=false)\n\nTranslate a LongRNA or a LongDNA to a LongAA.\n\nTranslation uses genetic code code to map codons to amino acids. See ncbi_trans_table for available genetic codes. If codons in the given sequence cannot determine a unique amino acid, they will be translated to AA_X if allow_ambiguous_codons is true and otherwise result in an error. 
For organisms that utilize alternative start codons, one can set alternative_start=true, in which case the first codon will always be converted to a methionine.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ncbi_trans_table","page":"Indexing & modifying sequences","title":"BioSequences.ncbi_trans_table","text":"Genetic code list of NCBI.\n\nThe standard genetic code is ncbi_trans_table[1] and others can be shown by show(ncbi_trans_table). For more details, consult the next link: http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes.\n\n\n\n\n\n","category":"constant"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> ncbi_trans_table\nTranslation Tables:\n  1. The Standard Code (standard_genetic_code)\n  2. The Vertebrate Mitochondrial Code (vertebrate_mitochondrial_genetic_code)\n  3. The Yeast Mitochondrial Code (yeast_mitochondrial_genetic_code)\n  4. The Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code (mold_mitochondrial_genetic_code)\n  5. The Invertebrate Mitochondrial Code (invertebrate_mitochondrial_genetic_code)\n  6. The Ciliate, Dasycladacean and Hexamita Nuclear Code (ciliate_nuclear_genetic_code)\n  9. The Echinoderm and Flatworm Mitochondrial Code (echinoderm_mitochondrial_genetic_code)\n 10. The Euplotid Nuclear Code (euplotid_nuclear_genetic_code)\n 11. The Bacterial, Archaeal and Plant Plastid Code (bacterial_plastid_genetic_code)\n 12. The Alternative Yeast Nuclear Code (alternative_yeast_nuclear_genetic_code)\n 13. The Ascidian Mitochondrial Code (ascidian_mitochondrial_genetic_code)\n 14. The Alternative Flatworm Mitochondrial Code (alternative_flatworm_mitochondrial_genetic_code)\n 15. Blepharisma Macronuclear Code (blepharisma_macronuclear_genetic_code)\n 16. Chlorophycean Mitochondrial Code (chlorophycean_mitochondrial_genetic_code)\n 21. 
Trematode Mitochondrial Code (trematode_mitochondrial_genetic_code)\n 22. Scenedesmus obliquus Mitochondrial Code (scenedesmus_obliquus_mitochondrial_genetic_code)\n 23. Thraustochytrium Mitochondrial Code (thraustochytrium_mitochondrial_genetic_code)\n 24. Pterobranchia Mitochondrial Code (pterobrachia_mitochondrial_genetic_code)\n 25. Candidate Division SR1 and Gracilibacteria Code (candidate_division_sr1_genetic_code)\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/#Construction-and-conversion","page":"Constructing sequences","title":"Construction & conversion","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Here we will showcase the various ways you can construct the various sequence types in BioSequences.","category":"page"},{"location":"construction/#Constructing-sequences","page":"Constructing sequences","title":"Constructing sequences","text":"","category":"section"},{"location":"construction/#From-strings","page":"Constructing sequences","title":"From strings","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequences can be constructed from strings using their constructors:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}(\"TTANC\")\n5nt DNA Sequence:\nTTANC\n\njulia> LongSequence{DNAAlphabet{2}}(\"TTAGC\")\n5nt DNA Sequence:\nTTAGC\n\njulia> LongRNA{4}(\"UUANC\")\n5nt RNA 
Sequence:\nUUANC\n\njulia> LongSequence{RNAAlphabet{2}}(\"UUAGC\")\n5nt RNA Sequence:\nUUAGC\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Type aliases can also be used for brevity.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}(\"TTANC\")\n5nt DNA Sequence:\nTTANC\n\njulia> LongDNA{2}(\"TTAGC\")\n5nt DNA Sequence:\nTTAGC\n\njulia> LongRNA{4}(\"UUANC\")\n5nt RNA Sequence:\nUUANC\n\njulia> LongRNA{2}(\"UUAGC\")\n5nt RNA Sequence:\nUUAGC","category":"page"},{"location":"construction/#Constructing-sequences-from-arrays-of-BioSymbols","page":"Constructing sequences","title":"Constructing sequences from arrays of BioSymbols","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequences can be constructed using vectors or arrays of a BioSymbol type:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}([DNA_T, DNA_T, DNA_A, DNA_N, DNA_C])\n5nt DNA Sequence:\nTTANC\n\njulia> LongSequence{DNAAlphabet{2}}([DNA_T, DNA_T, DNA_A, DNA_G, DNA_C])\n5nt DNA Sequence:\nTTAGC\n","category":"page"},{"location":"construction/#Constructing-sequences-from-other-sequences","page":"Constructing sequences","title":"Constructing sequences from other sequences","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You can create sequences by concatenating other sequences together:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{2}(\"ACGT\") * LongDNA{2}(\"TGCA\")\n8nt DNA Sequence:\nACGTTGCA\n\njulia> repeat(LongDNA{4}(\"TA\"), 10)\n20nt DNA Sequence:\nTATATATATATATATATATA\n\njulia> 
LongDNA{4}(\"TA\") ^ 10\n20nt DNA Sequence:\nTATATATATATATATATATA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequence views (LongSubSeqs) are special, in that they do not own their own data, and must be constructed from a LongSequence or another LongSubSeq:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> seq = LongDNA{4}(\"TACGGACATTA\")\n11nt DNA Sequence:\nTACGGACATTA\n\njulia> seqview = LongSubSeq(seq, 3:7)\n5nt DNA Sequence:\nCGGAC\n\njulia> seqview2 = @view seq[1:3]\n3nt DNA Sequence:\nTAC\n\njulia> typeof(seqview) == typeof(seqview2) && typeof(seqview) <: LongSubSeq\ntrue\n","category":"page"},{"location":"construction/#Conversion-of-sequence-types","page":"Constructing sequences","title":"Conversion of sequence types","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You can convert between sequence types, if the sequences are compatible - that is, if the source sequence does not contain symbols that are un-encodable by the destination type.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> dna = dna\"TTACGTAGACCG\"\n12nt DNA Sequence:\nTTACGTAGACCG\n\njulia> dna2 = convert(LongDNA{2}, dna)\n12nt DNA Sequence:\nTTACGTAGACCG","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DNA/RNA are special in that they can be converted to each other, despite containing distinct symbols. 
When doing so, DNA_T is converted to RNA_U and vice versa.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> convert(LongRNA{2}, dna\"TAGCTAGG\")\n8nt RNA Sequence:\nUAGCUAGG","category":"page"},{"location":"construction/#String-literals","page":"Constructing sequences","title":"String literals","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"BioSequences provides several string literal macros for creating sequences.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"note: Note\nWhen you use literals you may mix the case of characters.","category":"page"},{"location":"construction/#Long-sequence-literals","page":"Constructing sequences","title":"Long sequence literals","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> dna\"TACGTANNATC\"\n11nt DNA Sequence:\nTACGTANNATC\n\njulia> rna\"AUUUGNCCANU\"\n11nt RNA Sequence:\nAUUUGNCCANU\n\njulia> aa\"ARNDCQEGHILKMFPSTWYVX\"\n21aa Amino Acid Sequence:\nARNDCQEGHILKMFPSTWYVX","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"However, it should be noted that by default these sequence literals allocate the LongSequence object before the code containing the sequence literal is run. This means there may be occasions where your program does not behave as you first expect. 
For example consider the following code:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> function foo()\n           s = dna\"CTT\"\n           push!(s, DNA_A)\n       end\nfoo (generic function with 1 method)\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"d\n        push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You might expect that every time you call foo, that a DNA sequence CTTA would be returned. You might expect that this is because every time foo is called, a new DNA sequence variable CTT is created, and the A nucleotide is pushed to it, and the result, CTTA is returned. In other words you might expect the following output:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"However, this is not what happens, instead the following happens:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"s\n        push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n5nt DNA Sequence:\nCTTAA\n\njulia> foo()\n6nt DNA Sequence:\nCTTAAA\n","category":"page"},{"location":"construction/","page":"Constructing 
sequences","title":"Constructing sequences","text":"The reason for this is because the sequence literal is allocated only once before the first time the function foo is called and run. Therefore, s in foo is always a reference to that one sequence that was allocated. So one sequence is created before foo is called, and then it is pushed to every time foo is called. Thus, that one allocated sequence grows with every call of foo.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"If you wanted foo to create a new sequence each time it is called, then you can add a flag to the end of the sequence literal to dictate behaviour: A flag of 's' means 'static': the sequence will be allocated before code is run, as is the default behaviour described above. However providing 'd' flag changes the behaviour: 'd' means 'dynamic': the sequence will be allocated whilst the code is running, and not before. So to change foo so as it creates a new sequence each time it is called, simply add the 'd' flag to the sequence literal:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> function foo()\n           s = dna\"CTT\"d     # 'd' flag appended to the string literal.\n           push!(s, DNA_A)\n       end\nfoo (generic function with 1 method)\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Now every time foo is called, a new sequence CTT is created, and an A nucleotide is pushed to it:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"d\n     
   push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"So the take-home message of sequence literals is this:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Be careful when you are using sequence literals inside of functions, and inside the bodies of things like for loops. And if you use them and are unsure, use the 's' and 'd' flags to ensure the behaviour you get is the behaviour you intend.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"@dna_str\n@rna_str\n@aa_str","category":"page"},{"location":"construction/#BioSequences.@dna_str","page":"Constructing sequences","title":"BioSequences.@dna_str","text":"@dna_str(seq, flag=\"s\") -> LongDNA{4}\n\nCreate a LongDNA{4} sequence at parse time from string seq. If flag is \"s\" ('static', the default), the sequence is created at parse time, and inserted directly into the returned expression. A static string ought not to be mutated. Alternatively, if flag is \"d\" (dynamic), a new sequence is parsed and created whenever the code where the macro is placed is run.\n\nSee also: @aa_str, @rna_str\n\nExamples\n\nIn the example below, the static sequence is created once, at parse time, NOT when the function f is run. 
This means it is the same sequence that is pushed to repeatedly.\n\njulia> f() = dna\"TAG\";\n\njulia> string(push!(f(), DNA_A)) # NB: Mutates static string!\n\"TAGA\"\n\njulia> string(push!(f(), DNA_A))\n\"TAGAA\"\n\njulia> f() = dna\"TAG\"d; # dynamically make seq\n\njulia> string(push!(f(), DNA_A))\n\"TAGA\"\n\njulia> string(push!(f(), DNA_A))\n\"TAGA\"\n\n\n\n\n\n","category":"macro"},{"location":"construction/#BioSequences.@rna_str","page":"Constructing sequences","title":"BioSequences.@rna_str","text":"The LongRNA{4} equivalent to @dna_str\n\nSee also: @dna_str, @aa_str\n\nExamples\n\njulia> rna\"UCGUGAUGC\"\n9nt RNA Sequence:\nUCGUGAUGC\n\n\n\n\n\n","category":"macro"},{"location":"construction/#BioSequences.@aa_str","page":"Constructing sequences","title":"BioSequences.@aa_str","text":"The AminoAcidAlphabet equivalent to @dna_str\n\nSee also: @dna_str, @rna_str\n\nExamples\n\njulia> aa\"PKLEQC\"\n6aa Amino Acid Sequence:\nPKLEQC\n\n\n\n\n\n","category":"macro"},{"location":"construction/#Loose-parsing","page":"Constructing sequences","title":"Loose parsing","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"As of version 3.2.0, BioSequences.jl provides the bioseq function, which can be used to build a LongSequence from a string (or an AbstractVector{UInt8}) without knowing the correct Alphabet.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> bioseq(\"ATGTGCTGA\")\n9nt DNA Sequence:\nATGTGCTGA","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"The function will prioritise 2-bit alphabets over 4-bit alphabets, and prefer smaller alphabets (like DNAAlphabet{4}) over larger (like AminoAcidAlphabet). 
If the input cannot be encoded by any of the built-in alphabets, an error is thrown:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> bioseq(\"0!(CC!;#&&%\")\nERROR: cannot encode 0x30 in AminoAcidAlphabet\n[...]","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Note that this function is only intended to be used for interactive, ephemeral work. The function is necessarily type unstable, and the precise returned alphabet for a given input is a heuristic which is subject to change.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"bioseq\nguess_alphabet","category":"page"},{"location":"construction/#BioSequences.bioseq","page":"Constructing sequences","title":"BioSequences.bioseq","text":"bioseq(s::Union{AbstractString, AbstractVector{UInt8}}) -> LongSequence\n\nParse s into a LongSequence with an appropriate Alphabet, or throw an exception if no alphabet matches. See guess_alphabet for the available alphabets and the alphabet priority.\n\nwarning: Warning\nThe functions bioseq and guess_alphabet are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. They are type unstable, and their heuristics are subject to change in minor versions.\n\nExamples\n\njulia> bioseq(\"QMKLPEEFW\")\n9aa Amino Acid Sequence:\nQMKLPEEFW\n\njulia> bioseq(\"UAUGCUGUAGG\")\n11nt RNA Sequence:\nUAUGCUGUAGG\n\njulia> bioseq(\"PKMW#3>>0;kL\")\nERROR: cannot encode 0x23 in AminoAcidAlphabet\n[...]\n\n\n\n\n\n","category":"function"},{"location":"construction/#BioSequences.guess_alphabet","page":"Constructing sequences","title":"BioSequences.guess_alphabet","text":"guess_alphabet(s::Union{AbstractString, AbstractVector{UInt8}}) -> Union{Integer, Alphabet}\n\nPick an Alphabet that can encode input s.  
If no Alphabet can, return the index of the first byte of the input which is not encodable in any alphabet. This function only knows about the alphabets listed below. If multiple alphabets are possible, pick the first from the order below (i.e. DNAAlphabet{2}() if possible, otherwise RNAAlphabet{2}() etc).\n\nDNAAlphabet{2}()\nRNAAlphabet{2}()\nDNAAlphabet{4}()\nRNAAlphabet{4}()\nAminoAcidAlphabet()\n\nwarning: Warning\nThe functions bioseq and guess_alphabet are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. They are type unstable, and their heuristics are subject to change in minor versions.\n\nExamples\n\njulia> guess_alphabet(\"AGGCA\")\nDNAAlphabet{2}()\n\njulia> guess_alphabet(\"WKLQSTV\")\nAminoAcidAlphabet()\n\njulia> guess_alphabet(\"QAWT+!\")\n5\n\njulia> guess_alphabet(\"UAGCSKMU\")\nRNAAlphabet{4}()\n\n\n\n\n\n","category":"function"},{"location":"construction/#Comparison-to-other-sequence-types","page":"Constructing sequences","title":"Comparison to other sequence types","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Following Base standards, BioSequences do not compare equal to other containers even if they have the same elements. To e.g. 
compare a BioSequence with a vector of DNA, compare the elements themselves:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> seq = dna\"GAGCTGA\"; vec = collect(seq);\n\njulia> seq == vec, isequal(seq, vec)\n(false, false)\n\njulia> length(seq) == length(vec) && all(i == j for (i, j) in zip(seq, vec))\ntrue ","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"sequence_search/#Searching-for-sequence-motifs","page":"Pattern matching and searching","title":"Searching for sequence motifs","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"There are many ways to search for particular motifs in biological sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Exact searches, where you are looking for exact matches of a particular character or substring.\nApproximate searches, where you are looking for sequences that are sufficiently similar to a given sequence or family of sequences.\nSearches where you are looking for sequences that conform to some sort of pattern.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Like other Julia sequences such as Vector, you can search a BioSequence with the findfirst(predicate, collection) method pattern.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"All these kinds of searches are provided in BioSequences.jl, and they all conform to the findnext, findprev, and occursin patterns 
established in Base for String and collections like Vector.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The exception is searching using the specialised regex provided in this package, which, as you shall see, conforms to the match pattern established in Base for PCRE and Strings.","category":"page"},{"location":"sequence_search/#Symbol-search","page":"Pattern matching and searching","title":"Symbol search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> seq = dna\"ACAGCGTAGCT\";\n\njulia> findfirst(DNA_A, seq)\n1\n\njulia> findlast(DNA_A, seq)\n8\n\njulia> findnext(DNA_A, seq, 2)\n3\n\njulia> findprev(DNA_A, seq, 7)\n3\n\njulia> findall(DNA_A, seq)\n3-element Vector{Int64}:\n 1\n 3\n 8","category":"page"},{"location":"sequence_search/#Exact-search","page":"Pattern matching and searching","title":"Exact search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"ExactSearchQuery","category":"page"},{"location":"sequence_search/#BioSequences.ExactSearchQuery","page":"Pattern matching and searching","title":"BioSequences.ExactSearchQuery","text":"ExactSearchQuery{F<:Function,S<:BioSequence}\n\nQuery type for exact sequence search.\n\nAn exact search is one where you are looking in some given sequence for exact instances of some given substring.\n\nThese queries are used as a predicate for the Base.findnext, Base.findprev, Base.occursin, Base.findfirst, and Base.findlast functions.\n\nExamples\n\njulia> seq = dna\"ACAGCGTAGCT\";\n\njulia> query = ExactSearchQuery(dna\"AGC\");\n\njulia> findfirst(query, seq)\n3:5\n\njulia> findlast(query, seq)\n8:10\n\njulia> findnext(query, seq, 6)\n8:10\n\njulia> findprev(query, seq, 7)\n3:5\n\njulia> findall(query, 
seq)\n2-element Vector{UnitRange{Int64}}:\n 3:5\n 8:10\n\njulia> occursin(query, seq)\ntrue\n\n\nYou can pass a comparator function such as isequal or iscompatible to its constructor to modify the search behaviour.\n\nThe default is isequal, however, in biology, sometimes we want a more flexible comparison to find subsequences of compatible symbols.\n\njulia> query = ExactSearchQuery(dna\"CGT\", iscompatible);\n\njulia> findfirst(query, dna\"ACNT\")  # 'N' matches 'G'\n2:4\n\njulia> findfirst(query, dna\"ACGT\")  # 'G' matches 'N'\n2:4\n\njulia> occursin(ExactSearchQuery(dna\"CNT\", iscompatible), dna\"ACNT\")\ntrue\n\n\n\n\n\n\n","category":"type"},{"location":"sequence_search/#Allowing-mismatches","page":"Pattern matching and searching","title":"Allowing mismatches","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"ApproximateSearchQuery","category":"page"},{"location":"sequence_search/#BioSequences.ApproximateSearchQuery","page":"Pattern matching and searching","title":"BioSequences.ApproximateSearchQuery","text":"ApproximateSearchQuery{F<:Function,S<:BioSequence}\n\nQuery type for approximate sequence search.\n\nThese queries are used as a predicate for the Base.findnext, Base.findprev, Base.occursin, Base.findfirst, and Base.findlast functions.\n\nUsing these functions with these queries allows you to search a given sequence for a sub-sequence, whilst allowing a specific number of errors.\n\nIn other words they find a subsequence of the target sequence within a specific Levenshtein distance of the query sequence.\n\nExamples\n\njulia> seq = dna\"ACAGCGTAGCT\";\n\njulia> query = ApproximateSearchQuery(dna\"AGGG\");\n\njulia> findfirst(query, 0, seq) == nothing # nothing matches with no errors\ntrue\n\njulia> findfirst(query, 1, seq)  # seq[3:6] matches with one error\n3:6\n\njulia> findfirst(query, 2, seq)  # seq[1:4] matches with two errors\n1:4\n\n\nYou can 
pass a comparator function such as isequal or iscompatible to its constructor to modify the search behaviour.\n\nThe default is isequal, however, in biology, sometimes we want a more flexible comparison to find subsequences of compatible symbols.\n\njulia> query = ApproximateSearchQuery(dna\"AGGG\", iscompatible);\n\njulia> occursin(query, 1, dna\"AAGNGG\")    # 1 mismatch permitted (A vs G) & matched N\ntrue\n\njulia> findnext(query, 1, dna\"AAGNGG\", 1) # 1 mismatch permitted (A vs G) & matched N\n1:4\n\n\nnote: Note\nThis method of searching for motifs was implemented with smaller query motifs in mind. If you are looking to search for imperfect matches of longer sequences in this manner, you are likely better off using some kind of local-alignment algorithm or one of the BLAST variants.\n\n\n\n\n\n","category":"type"},{"location":"sequence_search/#Searching-according-to-a-pattern","page":"Pattern matching and searching","title":"Searching according to a pattern","text":"","category":"section"},{"location":"sequence_search/#Regular-expression-search","page":"Pattern matching and searching","title":"Regular expression search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Query patterns can be described using regular expressions. The syntax supports a subset of Perl and PROSITE's notation.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Biological regexes can be constructed using the BioRegex constructor, for example by doing BioRegex{AminoAcid}(\"MV+\"). 
For bioregex literals, it is instead recommended to use the @biore_str macro:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The Perl-like syntax starts with biore (BIOlogical REgular expression) and ends with a symbol option: \"dna\", \"rna\" or \"aa\". For example, biore\"A+\"dna is a regular expression for DNA sequences and biore\"A+\"aa is for amino acid sequences. Each symbol option can be abbreviated to its first character: \"d\", \"r\" or \"a\", respectively.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Here are examples of using regular expressions with BioSequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> match(biore\"A+C*\"dna, dna\"AAAACC\")\nRegexMatch(\"AAAACC\")\n\njulia> match(biore\"A+C*\"d, dna\"AAAACC\")\nRegexMatch(\"AAAACC\")\n\njulia> occursin(biore\"A+C*\"dna, dna\"AAC\")\ntrue\n\njulia> occursin(biore\"A+C*\"dna, dna\"C\")\nfalse\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"match returns a RegexMatch if a match is found; otherwise it returns nothing.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The table below summarizes available syntax elements.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Syntax Description Example\n| alternation \"A|T\" matches \"A\" and \"T\"\n* zero or more times repeat \"TA*\" matches \"T\", \"TA\" and \"TAA\"\n+ one or more times repeat \"TA+\" matches \"TA\" and \"TAA\"\n? 
zero or one time \"TA?\" matches \"T\" and \"TA\"\n{n,} n or more times repeat \"A{3,}\" matches \"AAA\" and \"AAAA\"\n{n,m} n-m times repeat \"A{3,5}\" matches \"AAA\", \"AAAA\" and \"AAAAA\"\n^ the start of the sequence \"^TAN*\" matches \"TATGT\"\n$ the end of the sequence \"N*TA$\" matches \"GCTA\"\n(...) pattern grouping \"(TA)+\" matches \"TA\" and \"TATA\"\n[...] one of symbols \"[ACG]+\" matches \"AGGC\"","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"eachmatch and findfirst are also defined, just like usual regex and strings found in Base.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> collect(matched(x) for x in eachmatch(biore\"TATA*?\"d, dna\"TATTATAATTA\")) # overlap\n4-element Vector{LongSequence{DNAAlphabet{4}}}:\n TAT  \n TAT\n TATA\n TATAA\n\njulia> collect(matched(x) for x in eachmatch(biore\"TATA*\"d, dna\"TATTATAATTA\", false)) # no overlap\n2-element Vector{LongSequence{DNAAlphabet{4}}}:\n TAT  \n TATAA\n\njulia> findfirst(biore\"TATA*\"d, dna\"TATTATAATTA\")\n1:3\n\njulia> findfirst(biore\"TATA*\"d, dna\"TATTATAATTA\", 2)\n4:8\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Noteworthy differences from strings are:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Ambiguous characters match any compatible characters (e.g. biore\"N\"d is equivalent to biore\"[ACGT]\"d).\nWhitespaces are ignored (e.g. biore\"A C G\"d is equivalent to biore\"ACG\"d).","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The PROSITE notation is described in ScanProsite - user manual. 
The syntax supports almost all notations including the extended syntax. The PROSITE notation starts with the prosite prefix and no symbol option is needed because it always describes patterns of amino acid sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> match(prosite\"[AC]-x-V-x(4)-{ED}\", aa\"CPVPQARG\")\nRegexMatch(\"CPVPQARG\")\n\njulia> match(prosite\"[AC]xVx(4){ED}\", aa\"CPVPQARG\")\nRegexMatch(\"CPVPQARG\")\n","category":"page"},{"location":"sequence_search/#Position-weight-matrix-search","page":"Pattern matching and searching","title":"Position weight matrix search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"A motif can be specified in a probabilistic way using a position weight matrix (PWM). This method searches for the first position in the sequence where a score calculated using a PWM is greater than or equal to a threshold t. More formally, denoting the sequence as S and the PWM value of symbol s at position j as M_{s,j}, the score starting from a position p is defined as","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"\\operatorname{score}(S, p) = \\sum_{i=1}^{L} M_{S[p+i-1], i}","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"and the search returns the smallest p that satisfies \\operatorname{score}(S, p) \\ge t.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"There are two kinds of matrices in this package: PFM and PWM. The PFM type is a position frequency matrix and stores symbol frequencies for each position. The PWM is a position weight matrix and stores symbol scores for each position. 
You can create a PFM from a set of sequences with the same length and then create a PWM from the PFM object.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> motifs = [dna\"TTA\", dna\"CTA\", dna\"ACA\", dna\"TCA\", dna\"GTA\"]\n5-element Vector{LongSequence{DNAAlphabet{4}}}:\n TTA\n CTA\n ACA\n TCA\n GTA\n\njulia> pfm = PFM(motifs)  # sequence set => PFM\n4×3 PFM{DNA, Int64}:\n A  1  0  5\n C  1  2  0\n G  1  0  0\n T  2  3  0\n\njulia> pwm = PWM(pfm)  # PFM => PWM\n4×3 PWM{DNA, Float64}:\n A -0.321928 -Inf       2.0\n C -0.321928  0.678072 -Inf\n G -0.321928 -Inf      -Inf\n T  0.678072  1.26303  -Inf\n\njulia> pwm = PWM(pfm .+ 0.01)  # add pseudo counts to avoid infinite values\n4×3 PWM{DNA, Float64}:\n A -0.319068 -6.97728   1.99139\n C -0.319068  0.673772 -6.97728\n G -0.319068 -6.97728  -6.97728\n T  0.673772  1.25634  -6.97728\n\njulia> pwm = PWM(pfm .+ 0.01, prior=[0.2, 0.3, 0.3, 0.2])  # GC-rich prior\n4×3 PWM{DNA, Float64}:\n A  0.00285965 -6.65535   2.31331\n C -0.582103    0.410737 -7.24031\n G -0.582103   -7.24031  -7.24031\n T  0.9957      1.57827  -6.65535\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The PWM_{s,j} matrix is computed from PFM_{s,j} and the prior probability p(s) as follows ([Wasserman2004]):","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"\\begin{align}\n    PWM_{s,j} &= \\log_2 \\frac{p(s,j)}{p(s)} \\\\\n    p(s,j)  &= \\frac{PFM_{s,j}}{\\sum_{s'} PFM_{s',j}}\n\\end{align}","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"However, if you just want to quickly conduct a search, constructing the PFM and PWM is done for you as a convenience if you build a PWMSearchQuery, using a collection of 
sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> motifs = [dna\"TTA\", dna\"CTA\", dna\"ACA\", dna\"TCA\", dna\"GTA\"]\n5-element Vector{LongSequence{DNAAlphabet{4}}}:\n TTA\n CTA\n ACA\n TCA\n GTA\n\njulia> subject = dna\"TATTATAATTA\";\n\njulia> qa = PWMSearchQuery(motifs, 1.0);\n\njulia> findfirst(qa, subject)\n3\n\njulia> findall(qa, subject)\n3-element Vector{Int64}:\n 3\n 5\n 9","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"[Wasserman2004]: https://doi.org/10.1038/nrg1315","category":"page"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"predicates/#Predicates","page":"Predicates","title":"Predicates","text":"","category":"section"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"A number of predicate or query functions are supported for sequences, allowing you to check for certain properties of a sequence.","category":"page"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"isrepetitive\nispalindromic\nhasambiguity\niscanonical","category":"page"},{"location":"predicates/#BioSequences.isrepetitive","page":"Predicates","title":"BioSequences.isrepetitive","text":"isrepetitive(seq::BioSequence, n::Integer = length(seq))\n\nReturn true if and only if seq contains a repetitive subsequence of length ≥ n.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.ispalindromic","page":"Predicates","title":"BioSequences.ispalindromic","text":"ispalindromic(seq::NucSeq) -> Bool\n\nCheck if seq is palindromic. 
A palindromic sequence is identical to its reverse-complement, so this should be equivalent to checking if seq == reverse_complement(seq).\n\nExamples\n\njulia> ispalindromic(dna\"TGCA\")\ntrue\n\njulia> ispalindromic(dna\"TCCT\")\nfalse\n\njulia> ispalindromic(rna\"ACGGU\")\nfalse\n\nReturn true if seq is a palindromic sequence; otherwise return false.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.hasambiguity","page":"Predicates","title":"BioSequences.hasambiguity","text":"hasambiguity(seq::BioSequence)\n\nReturns true if seq has an ambiguous symbol; otherwise returns false.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.iscanonical","page":"Predicates","title":"BioSequences.iscanonical","text":"iscanonical(seq::NucleotideSeq)\n\nReturns true if seq is canonical.\n\nFor any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:\n\n------->\nATCGATCG\nCGATCGAT\n<-------\n\nnote: Note\nUsing the reverse_complement of a DNA sequence will give this reverse complement.\n\nOf the two sequences, the canonical one is the lesser of the two, i.e. 
canonical_seq < other_seq.\n\n\n\n\n\n","category":"function"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\n    using BioSymbols\nend","category":"page"},{"location":"recipes/#Recipes","page":"Recipes","title":"Recipes","text":"","category":"section"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"This page provides tested example code to solve various common problems using BioSequences.","category":"page"},{"location":"recipes/#One-hot-encoding-biosequences","page":"Recipes","title":"One-hot encoding biosequences","text":"","category":"section"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"The types DNA, RNA and AminoAcid expose a binary representation through the exported function BioSymbols.compatbits, which is a one-hot encoding of:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"julia> using BioSymbols\n\njulia> compatbits(DNA_W)\n0x09\n\njulia> compatbits(AA_J)\n0x00000600","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"Each set bit in the encoding corresponds to a compatible unambiguous symbol. For example, for RNA, the four lower bits encode A, C, G, and U, in order. 
Hence, the symbol D, which is short for A, G or U, is encoded as 0x01 | 0x04 | 0x08 == 0x0d:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"julia> compatbits(RNA_D)\n0x0d\n\njulia> compatbits(RNA_A) | compatbits(DNA_G) | compatbits(RNA_U)\n0x0d","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"Using this, we can construct a function to one-hot encode sequences - in this example, nucleic acid sequences:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"function one_hot(s::NucSeq)\n    M = falses(4, length(s))\n    for (i, s) in enumerate(s)\n        bits = compatbits(s)\n        while !iszero(bits)\n            M[trailing_zeros(bits) + 1, i] = true\n            bits &= bits - one(bits) # clear lowest bit\n        end\n    end\n    M\nend\n\none_hot(dna\"TGNTKCTW-T\")\n\n# output\n\n4×10 BitMatrix:\n 0  0  1  0  0  0  0  1  0  0\n 0  0  1  0  0  1  0  0  0  0\n 0  1  1  0  1  0  0  0  0  0\n 1  0  1  1  1  0  1  1  0  1","category":"page"},{"location":"#BioSequences","page":"Home","title":"BioSequences","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"(Image: Latest Release) (Image: MIT license) (Image: Documentation) (Image: Pkg Status)","category":"page"},{"location":"#Description","page":"Home","title":"Description","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"BioSequences provides data types and methods for common operations with biological sequences, including DNA, RNA, and amino acid sequences.","category":"page"},{"location":"#Installation","page":"Home","title":"Installation","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"You can install BioSequences from the julia REPL. 
Press ] to enter pkg mode, and enter the following:","category":"page"},{"location":"","page":"Home","title":"Home","text":"add BioSequences","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you are interested in the cutting edge of development, please check out the master branch to try new features before release.","category":"page"},{"location":"#Testing","page":"Home","title":"Testing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"BioSequences is tested against Julia 1.X on Linux, OS X, and Windows.","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: Unit tests) (Image: Documentation) (Image: )","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Take a look at the contributing files for detailed contributor and maintainer guidelines, and the code of conduct.","category":"page"},{"location":"#Questions?","page":"Home","title":"Questions?","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you have a question about contributing or using BioJulia software, come on over and chat to us on the #biology channel on the Julia Slack, or you can try the Bio category of the Julia Discourse site.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"types/#Abstract-Types","page":"BioSequences Types","title":"Abstract Types","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences exports an abstract 
BioSequence type, and several concrete sequence types which inherit from it.","category":"page"},{"location":"types/#The-abstract-BioSequence","page":"BioSequences Types","title":"The abstract BioSequence","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences provides an abstract type called a BioSequence{A<:Alphabet}. This abstract type, and the methods and traits it supports, allows for many algorithms in BioSequences to be written as generically as possible, thus reducing the amount of code to read and understand, whilst maintaining high performance when such code is compiled for a concrete BioSequence subtype. Additionally, it allows new types to be implemented that are fully compatible with the rest of BioSequences, provided that key methods or traits are defined.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequence","category":"page"},{"location":"types/#BioSequences.BioSequence","page":"BioSequences Types","title":"BioSequences.BioSequence","text":"BioSequence{A <: Alphabet}\n\nBioSequence is the main abstract type of BioSequences. It abstracts over the internal representation of different biological sequences, and is parameterized by an Alphabet, which controls the element type.\n\nExtended help\n\nIts subtypes are characterized by:\n\nBeing a linear container type with random access and indices Base.OneTo(length(x)).\nContaining zero or more internal data elements of type encoded_data_eltype(typeof(x)).\nBeing associated with an Alphabet A, by being a subtype of BioSequence{A}.\n\nA BioSequence{A} is indexed by an integer. The biosequence subtype, the index and the alphabet A determine how to extract the internal encoded data. The alphabet decides how to decode the data to the element type of the biosequence. 
Hence, the element type and container type of a BioSequence are separated.\n\nSubtypes T of BioSequence must implement the following, with E being an encoded data type:\n\nBase.length(::T)::Int\nencoded_data_eltype(::Type{T})::Type{E}\nextract_encoded_element(::T, ::Integer)::E\ncopy(::T)\nT must be able to be constructed from any iterable with length defined and with a known, compatible element type.\n\nFurthermore, mutable sequences should implement\n\nencoded_setindex!(::T, ::E, ::Integer)\nT(undef, ::Int)\nresize!(::T, ::Int)\n\nFor compatibility with existing Alphabets, the encoded data eltype must be UInt.\n\n\n\n\n\n","category":"type"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Some aliases for BioSequence are also provided for your convenience:","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"NucSeq\nAASeq","category":"page"},{"location":"types/#BioSequences.NucSeq","page":"BioSequences Types","title":"BioSequences.NucSeq","text":"An alias for BioSequence{<:NucleicAcidAlphabet}\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AASeq","page":"BioSequences Types","title":"BioSequences.AASeq","text":"An alias for BioSequence{AminoAcidAlphabet}\n\n\n\n\n\n","category":"type"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Let's have a closer look at some of those methods that a subtype of BioSequence must implement. Check out the Julia Base library docs for length, copy and resize!.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"encoded_data_eltype\nextract_encoded_element\nencoded_setindex!","category":"page"},{"location":"types/#BioSequences.encoded_data_eltype","page":"BioSequences Types","title":"BioSequences.encoded_data_eltype","text":"encoded_data_eltype(::Type{<:BioSequence})\n\nReturns the element type of the encoded data of the BioSequence. 
This is the return type of extract_encoded_element, i.e. the data type that stores the biological symbols in the biosequence.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/#BioSequences.extract_encoded_element","page":"BioSequences Types","title":"BioSequences.extract_encoded_element","text":"extract_encoded_element(::BioSequence{A}, i::Integer)\n\nReturns the encoded element at position i. This data can be decoded using decode(A(), data) to yield the element type of the biosequence.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/#BioSequences.encoded_setindex!","page":"BioSequences Types","title":"BioSequences.encoded_setindex!","text":"encoded_setindex!(seq::BioSequence, x::E, i::Integer)\n\nGiven encoded data x of type encoded_data_eltype(typeof(seq)), sets the internal sequence data at the given index.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"For a correctly defined subtype of BioSequence that satisfies the interface, the vast majority of methods described in the rest of this manual should work out of the box. They can always be overloaded if needed. Indeed, some of the generic BioSequence methods are overloaded for LongSequence, for example for transformation and counting operations, where efficiency gains can be made due to the specific internal representation of the type.","category":"page"},{"location":"types/#The-abstract-Alphabet","page":"BioSequences Types","title":"The abstract Alphabet","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Alphabets control how biological symbols are encoded and decoded. 
They also confer many of the automatic traits and methods that any subtype of T<:BioSequence{A<:Alphabet} will get.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences.Alphabet\nBioSequences.AsciiAlphabet","category":"page"},{"location":"types/#BioSequences.Alphabet","page":"BioSequences Types","title":"BioSequences.Alphabet","text":"Alphabet\n\nAlphabet is the most important type trait for BioSequence. An Alphabet represents a set of biological symbols encoded by a sequence, e.g. A, C, G and T for a DNA Alphabet that requires only 2 bits to represent each symbol.\n\nExtended help\n\nSubtypes of Alphabet are singleton structs that may or may not be parameterized.\nAlphabets span over a finite set of biological symbols.\nThe alphabet controls the encoding from some internal \"encoded data\" to a BioSymbol  of the alphabet's element type, as well as the decoding, the inverse process.\nAn Alphabet's encode method must not produce invalid data. \n\nEvery subtype A of Alphabet must implement:\n\nBase.eltype(::Type{A})::Type{S} for some eltype S, which must be a BioSymbol.\nsymbols(::A)::Tuple{Vararg{S}}. 
This gives tuples of all symbols in the set of A.\nencode(::A, ::S)::E encodes a symbol to an internal data eltype E.\ndecode(::A, ::E)::S decodes an internal data eltype E to a symbol S.\nExcept for eltype which must follow Base conventions, all functions operating on Alphabet should operate on instances of the alphabet, not the type.\n\nIf you want interoperation with existing subtypes of BioSequence, the encoded representation E must be of type UInt, and you must also implement:\n\nBitsPerSymbol(::A)::BitsPerSymbol{N}, where the N must be zero or a power of two in [1, 2, 4, 8, 16, 32, [64 for 64-bit systems]].\n\nFor increased performance, see BioSequences.AsciiAlphabet\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AsciiAlphabet","page":"BioSequences Types","title":"BioSequences.AsciiAlphabet","text":"AsciiAlphabet\n\nTrait for alphabet using ASCII characters as String representation. Define codetype(A) = AsciiAlphabet() for a user-defined Alphabet A to gain speed. Methods needed: BioSymbols.stringbyte(::eltype(A)) and ascii_encode(A, ::UInt8).\n\n\n\n\n\n","category":"type"},{"location":"types/#Concrete-types","page":"BioSequences Types","title":"Concrete types","text":"","category":"section"},{"location":"types/#Implemented-alphabets","page":"BioSequences Types","title":"Implemented alphabets","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"DNAAlphabet\nRNAAlphabet\nAminoAcidAlphabet","category":"page"},{"location":"types/#BioSequences.DNAAlphabet","page":"BioSequences Types","title":"BioSequences.DNAAlphabet","text":"DNA nucleotide alphabet.\n\nDNAAlphabet has a parameter N which is a number that determines the BitsPerSymbol trait. 
Currently supported values of N are 2 and 4.\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.RNAAlphabet","page":"BioSequences Types","title":"BioSequences.RNAAlphabet","text":"RNA nucleotide alphabet.\n\nRNAAlphabet has a parameter N which is a number that determines the BitsPerSymbol trait. Currently supported values of N are 2 and 4.\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AminoAcidAlphabet","page":"BioSequences Types","title":"BioSequences.AminoAcidAlphabet","text":"Amino acid alphabet.\n\n\n\n\n\n","category":"type"},{"location":"types/#Long-Sequences","page":"BioSequences Types","title":"Long Sequences","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"LongSequence","category":"page"},{"location":"types/#BioSequences.LongSequence","page":"BioSequences Types","title":"BioSequences.LongSequence","text":"LongSequence{A <: Alphabet}\n\nGeneral-purpose BioSequence. This type is mutable and variable-length, and should be preferred for most use cases.\n\nExtended help\n\nLongSequence{A<:Alphabet} <: BioSequence{A} is parameterized by a concrete Alphabet type A that defines the domain (or set) of biological symbols permitted.\n\nAs the BioSequence interface definition implies, LongSequences store the biological symbol elements that they contain in a succinct encoded form that permits many operations to be done in an efficient bit-parallel manner. 
As per the interface of BioSequence, the Alphabet determines how an element is encoded or decoded when it is inserted or extracted from the sequence.\n\nFor example, AminoAcidAlphabet is associated with AminoAcid and hence an object of the LongSequence{AminoAcidAlphabet} type represents a sequence of amino acids.\n\nSymbols from multiple alphabets can't be intermixed in one sequence type.\n\nThe following table summarizes common LongSequence types that have been given aliases for convenience.\n\nType Symbol type Type alias\nLongSequence{DNAAlphabet{N}} DNA LongDNA{N}\nLongSequence{RNAAlphabet{N}} RNA LongRNA{N}\nLongSequence{AminoAcidAlphabet} AminoAcid LongAA\n\nThe LongDNA and LongRNA aliases use a DNAAlphabet{4}.\n\nDNAAlphabet{4} permits ambiguous nucleotides, and a sequence must use at least 4 bits to internally store each element (and indeed LongSequence does).\n\nIf you are sure that you are working with sequences with no ambiguous nucleotides, you can use LongSequences parameterised with DNAAlphabet{2} instead.\n\nDNAAlphabet{2} is an alphabet that uses two bits per base and limits to only unambiguous nucleotide symbols (A,C,G,T).\n\nChanging this single parameter, is all you need to do in order to benefit from memory savings. 
Some computations that use bitwise operations will also be dramatically faster.\n\nThe same applies to LongSequence{RNAAlphabet{4}}: simply replace the alphabet parameter with RNAAlphabet{2} in order to benefit.\n\n\n\n\n\n","category":"type"},{"location":"types/#Sequence-views","page":"BioSequences Types","title":"Sequence views","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Similar to how Base Julia offers views of array objects, BioSequences offers views of LongSequences: the LongSubSeq{A<:Alphabet}.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Conceptually, a LongSubSeq{A} is similar to a LongSequence{A}, but instead of storing its own data, it refers to the data of a LongSequence. Modifying the LongSequence will be reflected in the view, and vice versa. If the underlying LongSequence is truncated, the behaviour of a view is undefined. For the same reason, some operations are not supported for views, such as resizing.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"The purpose of LongSubSeq is that, since they only contain a pointer to the underlying array, an offset and a length, they are much lighter than LongSequences, and will be stack allocated on Julia 1.5 and newer. Thus, the user may construct millions of views without major performance implications.","category":"page"}]
+[{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"symbols/#Biological-symbols","page":"Biological Symbols","title":"Biological symbols","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"The BioSequences module reexports the biological symbol (character) types that are provided by BioSymbols.jl:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Type Meaning\nDNA DNA nucleotide\nRNA RNA nucleotide\nAminoAcid Amino acid","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"These symbols are elements of biological sequence types, just as characters are elements of strings.","category":"page"},{"location":"symbols/#DNA-and-RNA-nucleotides","page":"Biological Symbols","title":"DNA and RNA nucleotides","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Set of nucleotide symbols in BioSequences covers IUPAC nucleotide base plus a gap symbol:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbol Constant Meaning\n'A' DNA_A / RNA_A A; Adenine\n'C' DNA_C / RNA_C C; Cytosine\n'G' DNA_G / RNA_G G; Guanine\n'T' DNA_T T; Thymine (DNA only)\n'U' RNA_U U; Uracil (RNA only)\n'M' DNA_M / RNA_M A or C\n'R' DNA_R / RNA_R A or G\n'W' DNA_W / RNA_W A or T/U\n'S' DNA_S / RNA_S C or G\n'Y' DNA_Y / RNA_Y C or T/U\n'K' DNA_K / RNA_K G or T/U\n'V' DNA_V / RNA_V A or C or G; not T/U\n'H' DNA_H / RNA_H A or C or T; not G\n'D' DNA_D / RNA_D A or G or T/U; not C\n'B' DNA_B / RNA_B C or G or T/U; not A\n'N' DNA_N / RNA_N A or C or G or T/U\n'-' DNA_Gap / RNA_Gap Gap (none of the 
above)","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"https://www.bioinformatics.org/sms/iupac.html","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols are accessible as constants with DNA_ or RNA_ prefix:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> DNA_A\nDNA_A\n\njulia> DNA_T\nDNA_T\n\njulia> RNA_U\nRNA_U\n\njulia> DNA_Gap\nDNA_Gap\n\njulia> typeof(DNA_A)\nDNA\n\njulia> typeof(RNA_A)\nRNA\n","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols can be constructed by converting regular characters:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> convert(DNA, 'C')\nDNA_C\n\njulia> convert(DNA, 'C') === DNA_C\ntrue\n","category":"page"},{"location":"symbols/#Amino-acids","page":"Biological Symbols","title":"Amino acids","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Set of amino acid symbols also covers IUPAC amino acid symbols plus a gap symbol:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbol Constant Meaning\n'A' AA_A Alanine\n'R' AA_R Arginine\n'N' AA_N Asparagine\n'D' AA_D Aspartic acid (Aspartate)\n'C' AA_C Cysteine\n'Q' AA_Q Glutamine\n'E' AA_E Glutamic acid (Glutamate)\n'G' AA_G Glycine\n'H' AA_H Histidine\n'I' AA_I Isoleucine\n'L' AA_L Leucine\n'K' AA_K Lysine\n'M' AA_M Methionine\n'F' AA_F Phenylalanine\n'P' AA_P Proline\n'S' AA_S Serine\n'T' AA_T Threonine\n'W' AA_W Tryptophan\n'Y' AA_Y Tyrosine\n'V' AA_V Valine\n'O' AA_O Pyrrolysine\n'U' AA_U Selenocysteine\n'B' AA_B Aspartic acid or Asparagine\n'J' AA_J Leucine or Isoleucine\n'Z' AA_Z Glutamine or Glutamic acid\n'X' AA_X Any amino acid\n'*' AA_Term 
Termination codon\n'-' AA_Gap Gap (none of the above)","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"https://www.bioinformatics.org/sms/iupac.html","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols are accessible as constants with AA_ prefix:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> AA_A\nAA_A\n\njulia> AA_Q\nAA_Q\n\njulia> AA_Term\nAA_Term\n\njulia> typeof(AA_A)\nAminoAcid\n","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"Symbols can be constructed by converting regular characters:","category":"page"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"julia> convert(AminoAcid, 'A')\nAA_A\n\njulia> convert(AminoAcid, 'P') === AA_P\ntrue\n","category":"page"},{"location":"symbols/#Other-functions","page":"Biological Symbols","title":"Other functions","text":"","category":"section"},{"location":"symbols/","page":"Biological Symbols","title":"Biological Symbols","text":"alphabet\ngap\niscompatible\nisambiguous","category":"page"},{"location":"symbols/#BioSymbols.alphabet","page":"Biological Symbols","title":"BioSymbols.alphabet","text":"alphabet(DNA)\n\nGet all symbols of DNA in sorted order.\n\nExamples\n\njulia> alphabet(DNA)\n(DNA_Gap, DNA_A, DNA_C, DNA_M, DNA_G, DNA_R, DNA_S, DNA_V, DNA_T, DNA_W, DNA_Y, DNA_H, DNA_K, DNA_D, DNA_B, DNA_N)\n\njulia> issorted(alphabet(DNA))\ntrue\n\n\n\n\n\n\nalphabet(RNA)\n\nGet all symbols of RNA in sorted order.\n\nExamples\n\njulia> alphabet(RNA)\n(RNA_Gap, RNA_A, RNA_C, RNA_M, RNA_G, RNA_R, RNA_S, RNA_V, RNA_U, RNA_W, RNA_Y, RNA_H, RNA_K, RNA_D, RNA_B, RNA_N)\n\njulia> issorted(alphabet(RNA))\ntrue\n\n\n\n\n\n\nalphabet(AminoAcid)\n\nGet all symbols of AminoAcid in sorted order.\n\nExamples\n\njulia> alphabet(AminoAcid)\n(AA_A, AA_R, 
AA_N, AA_D, AA_C, AA_Q, AA_E, AA_G, AA_H, AA_I, AA_L, AA_K, AA_M, AA_F, AA_P, AA_S, AA_T, AA_W, AA_Y, AA_V, AA_O, AA_U, AA_B, AA_J, AA_Z, AA_X, AA_Term, AA_Gap)\n\njulia> issorted(alphabet(AminoAcid))\ntrue\n\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.gap","page":"Biological Symbols","title":"BioSymbols.gap","text":"gap(::Type{T})::T\n\nReturn the gap (indel) representation of T. By default, gap is defined for DNA, RNA, AminoAcid and Char.\n\nExamples\n\njulia> gap(RNA)\nRNA_Gap\n\njulia> gap(Char)\n'-': ASCII/Unicode U+002D (category Pd: Punctuation, dash)\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.iscompatible","page":"Biological Symbols","title":"BioSymbols.iscompatible","text":"iscompatible(x::S, y::S) where S <: BioSymbol\n\nTest if x and y are compatible with each other.\n\nExamples\n\njulia> iscompatible(AA_A, AA_R)\nfalse\n\njulia> iscompatible(AA_A, AA_X)\ntrue\n\njulia> iscompatible(DNA_A, DNA_A)\ntrue\n\njulia> iscompatible(DNA_C, DNA_N)  # DNA_N can be DNA_C\ntrue\n\njulia> iscompatible(DNA_C, DNA_R)  # DNA_R (A or G) cannot be DNA_C\nfalse\n\n\n\n\n\n\n","category":"function"},{"location":"symbols/#BioSymbols.isambiguous","page":"Biological Symbols","title":"BioSymbols.isambiguous","text":"isambiguous(nt::NucleicAcid)\n\nTest if nt is an ambiguous nucleotide.\n\n\n\n\n\nisambiguous(aa::AminoAcid)\n\nTest if aa is an ambiguous amino acid.\n\n\n\n\n\n","category":"function"},{"location":"io/#I/O-for-sequencing-file-formats","page":"I/O","title":"I/O for sequencing file formats","text":"","category":"section"},{"location":"io/","page":"I/O","title":"I/O","text":"Versions of BioSequences prior to v2.0 provided a FASTA, FASTQ, and 2Bit submodule for working with formatted sequence files.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"After version v2.0, in order to neatly separate concerns, these submodules were 
removed.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"Instead there will now be dedicated BioJulia packages for each format. Each of these will be compatible with BioSequences.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"A list of all of the different formats and packages is provided below to help you find them quickly.","category":"page"},{"location":"io/","page":"I/O","title":"I/O","text":"Format Package\nFASTA FASTX.jl\nFASTQ FASTX.jl\n2Bit TwoBit.jl","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"counting/#Counting","page":"Counting","title":"Counting","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"BioSequences contains functionality to efficiently count biosymbols in a biosequence that satisfies some predicate.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"Consider a naive counting function like this:","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"function count_Ns(seq::BioSequence{<:DNAAlphabet})\n    ns = 0\n    for i in seq\n        ns += (i == DNA_N)::Bool\n    end\n    ns\nend ","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"This function can be more efficiently implemented by exploiting the internal data layout of certain biosequences. Therefore, Julia provides optimised methods for Base.count, such that count_Ns above can be more efficiently expressed count(==(DNA_N), seq).","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"note: Note\nIt is important to understand that this speed is achieved with custom methods of Base.count, and not by a generic mechanism that improves the speed of counting symbols in BioSequencein general. 
Hence, while count(==(DNA_N), seq) may be optimised, count(i -> i == DNA_N, seq) is not, as this is a different method.","category":"page"},{"location":"counting/#Currently-optimised-methods","page":"Counting","title":"Currently optimised methods","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"By default, only the BioSequence and Alphabet types found in BioSequences.jl have optimised methods.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"count(isGC, seq)\ncount(isambiguous, seq)\ncount(iscertain, seq)\ncount(isgap, seq)\ncount(==(biosymbol), seq) and count(isequal(biosymbol), seq)","category":"page"},{"location":"counting/#Matches-and-mismatches","page":"Counting","title":"Matches and mismatches","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"The methods matches and mismatches take two sequences and count the number of positions where the sequences are unequal or equal, respectively.","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"They are equivalent to matches(a, b) = count(splat(==), zip(a, b)) (and with !=, respectively).","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"matches\nmismatches","category":"page"},{"location":"counting/#BioSequences.matches","page":"Counting","title":"BioSequences.matches","text":"matches(a::BioSequence, b::BioSequences) -> Int\n\nCount the number of positions in where a and b are equal. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. DNA_A does not match DNA_N.\n\nwarning: Warning\nPassing in two sequences with differing lengths is deprecated. 
In a future, breaking release of BioSequences, this will error.\n\nExamples\n\njulia> matches(dna\"TAWNNA\", dna\"TACCTA\")\n3\n\njulia> matches(dna\"AACA\", dna\"AAG\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.mismatches","page":"Counting","title":"BioSequences.mismatches","text":"mismatches(a::BioSequence, b::BioSequences) -> Int\n\nCount the number of positions in where a and b differ. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. This function does not provide any special handling of ambiguous symbols, so e.g. DNA_A does not match DNA_N.\n\nwarning: Warning\nPassing in two sequences with differing lengths is deprecated. In a future, breaking release of BioSequences, this will error.\n\nExamples\n\njulia> mismatches(dna\"TAGCTA\", dna\"TACNTA\")\n2\n\njulia> mismatches(dna\"AACA\", dna\"AAG\")\n1\n\n\n\n\n\n","category":"function"},{"location":"counting/#GC-content","page":"Counting","title":"GC content","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"The convenience function gc_content(seq) is equivalent to count(isGC, seq) / length(seq):","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"gc_content","category":"page"},{"location":"counting/#BioSequences.gc_content","page":"Counting","title":"BioSequences.gc_content","text":"gc_content(seq::BioSequence) -> Float64\n\nCalculate GC content of seq, i.e. 
the number of symbols that is DNA_C, DNA_G, DNA_C or DNA_G divided by the length of the sequence.\n\nExamples\n\njulia> gc_content(dna\"AGCTA\")\n0.4\n\njulia> gc_content(rna\"UAGCGA\")\n0.5\n\n\n\n\n\n","category":"function"},{"location":"counting/#Deprecated-aliases","page":"Counting","title":"Deprecated aliases","text":"","category":"section"},{"location":"counting/","page":"Counting","title":"Counting","text":"Several of the optimised count methods have function names, which are deprecated:","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"Deprecated function Instead use\nn_gaps count(isgap, seq)\nn_certain count(iscertain, seq)\nn_ambiguous count(isambiguous, seq)","category":"page"},{"location":"counting/","page":"Counting","title":"Counting","text":"n_gaps\nn_certain\nn_ambiguous","category":"page"},{"location":"counting/#BioSequences.n_gaps","page":"Counting","title":"BioSequences.n_gaps","text":"n_gaps(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (or b, if present) have gaps. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence.\n\nwarning: Warning\nPassing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_gaps(dna\"--TAC-WN-ACY\")\n4\n\njulia> n_gaps(dna\"TC-AC-\", dna\"-CACG\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.n_certain","page":"Counting","title":"BioSequences.n_certain","text":"n_certain(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (and b, if present) have certain (i.e. non-ambigous and non-gap) symbols. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. Gaps are not certain.\n\nwarning: Warning\nPassing in two sequences is deprecated. 
In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_certain(dna\"--TAC-WN-ACY\")\n5\n\njulia> n_certain(rna\"UAYWW\", rna\"UAW\")\n2\n\n\n\n\n\n","category":"function"},{"location":"counting/#BioSequences.n_ambiguous","page":"Counting","title":"BioSequences.n_ambiguous","text":"n_ambiguous(a::BioSequence, [b::BioSequence]) -> Int\n\nCount the number of positions where a (or b, if present) have ambigious symbols. If b is given, and the length of a and b differ, look only at the indices of the shorter sequence. Gaps are not ambigous.\n\nwarning: Warning\nPassing in two sequences is deprecated. In a future, breaking release of BioSequences, this will throw a MethodError\n\nExamples\n\njulia> n_ambiguous(dna\"--TAC-WN-ACY\")\n3\n\njulia> n_ambiguous(rna\"UAYWW\", rna\"UAW\")\n1\n\n\n\n\n\n","category":"function"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"interfaces/#Custom-BioSequences-types","page":"Implementing custom types","title":"Custom BioSequences types","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"If you're a developing your own Bioinformatics package or method, you may find that the reference implementation of concrete LongSequence types provided in this package are not optimal for your purposes.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"This page describes the interfaces for BioSequences' core types for developers or other packages implementing their own sequence types or extending BioSequences functionality.","category":"page"},{"location":"interfaces/#Implementing-custom-Alphabets","page":"Implementing custom types","title":"Implementing custom 
Alphabets","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Recall the required methods that define the Alphabet interface. ","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"To create an example custom alphabet, we need to create a singleton type, that implements a few methods in order to conform to the interface as described in the Alphabet documentation.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Let's do that for a restricted Amino Acid alphabet. We can test that it conforms to the interface with the BioSequences.has_interface function.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"julia> struct ReducedAAAlphabet <: Alphabet end\n\njulia> Base.eltype(::Type{ReducedAAAlphabet}) = AminoAcid\n\njulia> BioSequences.BitsPerSymbol(::ReducedAAAlphabet) = BioSequences.BitsPerSymbol{4}()\n\njulia> function BioSequences.symbols(::ReducedAAAlphabet)\n           (AA_L, AA_C, AA_A, AA_G, AA_S, AA_T, AA_P, AA_F,\n            AA_W, AA_E, AA_D, AA_N, AA_Q, AA_K, AA_H, AA_M)\n       end\n\njulia> const (ENC_LUT, DEC_LUT) = let\n           enc_lut = fill(0xff, length(alphabet(AminoAcid)))\n           dec_lut = fill(AA_A, length(symbols(ReducedAAAlphabet())))\n           for (i, aa) in enumerate(symbols(ReducedAAAlphabet()))\n               enc_lut[reinterpret(UInt8, aa) + 0x01] = i - 1\n               dec_lut[i] = aa\n           end\n           (Tuple(enc_lut), Tuple(dec_lut))\n       end\n((0x02, 0xff, 0x0b, 0x0a, 0x01, 0x0c, 0x09, 0x03, 0x0e, 0xff, 0x00, 0x0d, 0x0f, 0x07, 0x06, 0x04, 0x05, 0x08, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff), (AA_L, AA_C, AA_A, AA_G, AA_S, AA_T, AA_P, AA_F, AA_W, AA_E, AA_D, AA_N, AA_Q, AA_K, AA_H, 
AA_M))\n\njulia> function BioSequences.encode(::ReducedAAAlphabet, aa::AminoAcid)\n           i = reinterpret(UInt8, aa) + 0x01\n           (i ≥ length(ENC_LUT) || @inbounds ENC_LUT[i] === 0xff) && throw(DomainError(aa))\n           (@inbounds ENC_LUT[i]) % UInt\n       end\n\njulia> function BioSequences.decode(::ReducedAAAlphabet, x::UInt)\n           x ≥ length(DEC_LUT) && throw(DomainError(aa))\n           @inbounds DEC_LUT[x + UInt(1)]\n       end\n\njulia> BioSequences.has_interface(Alphabet, ReducedAAAlphabet())\ntrue\n","category":"page"},{"location":"interfaces/#Implementing-custom-BioSequences","page":"Implementing custom types","title":"Implementing custom BioSequences","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Recall the required methods that define the BioSequence interface. ","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"To create an example custom alphabet, we need to create a singleton type, that implements a few methods in order to conform to the interface as described in the BioSequence documentation.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"Let's do that for a custom sequence type that is optimised to represent a small sequence: A Codon. 
We can test that it conforms to the interface with the BioSequences.has_interface function.","category":"page"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"julia> struct Codon <: BioSequence{RNAAlphabet{2}}\n           x::UInt8\n       end\n\njulia> function Codon(iterable)\n           length(iterable) == 3 || error(\"Must have length 3\")\n           x = zero(UInt)\n           for (i, nt) in enumerate(iterable)\n               x |= BioSequences.encode(Alphabet(Codon), convert(RNA, nt)) << (6-2i)\n           end\n           Codon(x % UInt8)\n       end\nCodon\n\njulia> Base.length(::Codon) = 3\n\njulia> BioSequences.encoded_data_eltype(::Type{Codon}) = UInt\n\njulia> function BioSequences.extract_encoded_element(x::Codon, i::Int)\n           ((x.x >>> (6-2i)) & 3) % UInt\n       end\n\njulia> Base.copy(seq::Codon) = Codon(seq.x)\n\njulia> BioSequences.has_interface(BioSequence, Codon, [RNA_C, RNA_U, RNA_A], false)\ntrue","category":"page"},{"location":"interfaces/#Interface-checking-functions","page":"Implementing custom types","title":"Interface checking functions","text":"","category":"section"},{"location":"interfaces/","page":"Implementing custom types","title":"Implementing custom types","text":"BioSequences.has_interface","category":"page"},{"location":"interfaces/#BioSequences.has_interface","page":"Implementing custom types","title":"BioSequences.has_interface","text":"function has_interface(::Type{Alphabet}, A::Alphabet)\n\nReturns whether A conforms to the Alphabet interface.\n\n\n\n\n\nhas_interface(::Type{BioSequence}, ::T, syms::Vector, mutable::Bool, compat::Bool=true)\n\nCheck if type T conforms to the BioSequence interface. A T is constructed from the vector of element types syms which must not be empty. If the mutable flag is set, also check the mutable interface. 
If the compat flag is set, check for compatibility with existing alphabets.\n\n\n\n\n\n","category":"function"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"random/#Generating-random-sequences","page":"Random sequences","title":"Generating random sequences","text":"","category":"section"},{"location":"random/#Long-sequences","page":"Random sequences","title":"Long sequences","text":"","category":"section"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"You can generate random long sequences using the randdna function and the Sampler's implemented in BioSequences:","category":"page"},{"location":"random/","page":"Random sequences","title":"Random sequences","text":"randseq\nranddnaseq\nrandrnaseq\nrandaaseq\nSamplerUniform\nSamplerWeighted","category":"page"},{"location":"random/#BioSequences.randseq","page":"Random sequences","title":"BioSequences.randseq","text":"randseq([rng::AbstractRNG], A::Alphabet, len::Integer)\n\nGenerate a LongSequence{A} of length len from the specified alphabet, drawn from the default distribution. User-defined alphabets should implement this method to implement random LongSequence generation.\n\nFor RNA and DNA alphabets, the default distribution is uniform across A, C, G, and T/U. For AminoAcidAlphabet, it is uniform across the 20 standard amino acids. 
For a user-defined alphabet A, default is uniform across all elements of symbols(A).\n\nExample:\n\njulia> seq = randseq(AminoAcidAlphabet(), 50)\n50aa Amino Acid Sequence:\nVFMHSIRMIRLMVHRSWKMHSARHVNFIRCQDKKWKSADGIYTDICKYSM\n\n\n\n\n\nrandseq([rng::AbstractRNG], A::Alphabet, sp::Sampler, len::Integer)\n\nGenerate a LongSequence{A} of length len with elements drawn from the given sampler.\n\nExample:\n\n# Generate 1000-length RNA with 4% chance of N, 24% for A, C, G, or U\njulia> sp = SamplerWeighted(rna\"ACGUN\", fill(0.24, 4))\njulia> seq = randseq(RNAAlphabet{4}(), sp, 50)\n50nt RNA Sequence:\nCUNGGGCCCGGGNAAACGUGGUACACCCUGUUAAUAUCAACNNGCGCUNU\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randdnaseq","page":"Random sequences","title":"BioSequences.randdnaseq","text":"randdnaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{DNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, T]\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randrnaseq","page":"Random sequences","title":"BioSequences.randrnaseq","text":"randrnaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{RNAAlphabet{4}} sequence of length len, with bases sampled uniformly from [A, C, G, U]\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.randaaseq","page":"Random sequences","title":"BioSequences.randaaseq","text":"randaaseq([rng::AbstractRNG], len::Integer)\n\nGenerate a random LongSequence{AminoAcidAlphabet} sequence of length len, with amino acids sampled uniformly from the 20 standard amino acids.\n\n\n\n\n\n","category":"function"},{"location":"random/#BioSequences.SamplerUniform","page":"Random sequences","title":"BioSequences.SamplerUniform","text":"SamplerUniform{T}\n\nUniform sampler of type T. 
Instantiate with a collection of eltype T containing the elements to sample.\n\nExamples\n\njulia> sp = SamplerUniform(rna\"ACGU\");\n\n\n\n\n\n","category":"type"},{"location":"random/#BioSequences.SamplerWeighted","page":"Random sequences","title":"BioSequences.SamplerWeighted","text":"SamplerWeighted{T}\n\nWeighted sampler of type T. Instantiate with a collection of eltype T containing the elements to sample, and an orderen collection of probabilities to sample each element except the last. The last probability is the remaining probability up to 1.\n\nExamples\n\njulia> sp = SamplerWeighted(rna\"ACGUN\", fill(0.2475, 4));\n\n\n\n\n\n","category":"type"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"transforms/#Indexing-and-modifying-sequences","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"","category":"section"},{"location":"transforms/#Indexing","page":"Indexing & modifying sequences","title":"Indexing","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Most BioSequence concrete subtypes for the most part behave like other vector or string types. 
They can be indexed using integers or ranges:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"For example, with LongSequences:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTTTANAGTNNAGTACC\"\n19nt DNA Sequence:\nACGTTTANAGTNNAGTACC\n\njulia> seq[5]\nDNA_T\n\njulia> seq[6:end]\n14nt DNA Sequence:\nTANAGTNNAGTACC\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"The biological symbol at a given locus in a biological sequence can be set using setindex:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTTTANAGTNNAGTACC\"\n19nt DNA Sequence:\nACGTTTANAGTNNAGTACC\n\njulia> seq[5] = DNA_A\nDNA_A\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"note: Note\nSome types such can be indexed using integers but not using ranges.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"For LongSequence types, indexing a sequence by range creates a copy of the original sequence, similar to Array in Julia's Base library. 
If you find yourself slowed down by the allocation of these subsequences, consider using a sequence view instead.","category":"page"},{"location":"transforms/#Modifying-sequences","page":"Indexing & modifying sequences","title":"Modifying sequences","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"In addition to setindex, many other modifying operations are possible for biological sequences such as push!, pop!, and insert!, which should be familiar to anyone used to editing arrays.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"push!(::BioSequences.BioSequence, ::Any)\npop!(::BioSequences.BioSequence)\npushfirst!(::BioSequences.BioSequence, ::Any)\npopfirst!(::BioSequences.BioSequence)\ninsert!(::BioSequences.BioSequence, ::Integer, ::Any)\ndeleteat!(::BioSequences.BioSequence, ::Integer)\nappend!(::BioSequences.BioSequence, ::BioSequences.BioSequence)\nresize!(::BioSequences.LongSequence, ::Integer)\nempty!(::BioSequences.BioSequence)","category":"page"},{"location":"transforms/#Base.push!-Tuple{BioSequence, Any}","page":"Indexing & modifying sequences","title":"Base.push!","text":"push!(seq::BioSequence, x)\n\nAppend a biological symbol x to a biological sequence seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.pop!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.pop!","text":"pop!(seq::BioSequence)\n\nRemove the symbol from the end of a biological sequence seq and return it. 
Returns a variable of eltype(seq).\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.pushfirst!-Tuple{BioSequence, Any}","page":"Indexing & modifying sequences","title":"Base.pushfirst!","text":"pushfirst!(seq, x)\n\nInsert a biological symbol x at the beginning of a biological sequence seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.popfirst!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.popfirst!","text":"popfirst!(seq)\n\nRemove the symbol from the beginning of a biological sequence seq and return it. Returns a variable of eltype(seq).\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.insert!-Tuple{BioSequence, Integer, Any}","page":"Indexing & modifying sequences","title":"Base.insert!","text":"insert!(seq::BioSequence, i, x)\n\nInsert a biological symbol x into a biological sequence seq, at the given index i.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.deleteat!-Tuple{BioSequence, Integer}","page":"Indexing & modifying sequences","title":"Base.deleteat!","text":"deleteat!(seq::BioSequence, i::Integer)\n\nDelete a biological symbol at a single position i in a biological sequence seq.\n\nModifies the input sequence.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.append!-Tuple{BioSequence, BioSequence}","page":"Indexing & modifying sequences","title":"Base.append!","text":"append!(seq, other)\n\nAdd a biological sequence other onto the end of biological sequence seq. Modifies and returns seq.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.resize!-Tuple{LongSequence, Integer}","page":"Indexing & modifying sequences","title":"Base.resize!","text":"resize!(seq, size, [force::Bool=false])\n\nResize a biological sequence seq, to a given size. Does not resize the underlying data array unless the new size does not fit. 
If force, always resize underlying data array.\n\nNote that resizing to a larger size, and then loading from uninitialized positions is not allowed and may cause undefined behaviour.  Make sure to always fill any uninitialized biosymbols after resizing.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.empty!-Tuple{BioSequence}","page":"Indexing & modifying sequences","title":"Base.empty!","text":"empty!(seq::BioSequence)\n\nCompletely empty a biological sequence seq of nucleotides.\n\n\n\n\n\n","category":"method"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Here are some examples:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACG\"\n3nt DNA Sequence:\nACG\n\njulia> push!(seq, DNA_T)\n4nt DNA Sequence:\nACGT\n\njulia> append!(seq, dna\"AT\")\n6nt DNA Sequence:\nACGTAT\n\njulia> deleteat!(seq, 2)\n5nt DNA Sequence:\nAGTAT\n\njulia> deleteat!(seq, 2:3)\n3nt DNA Sequence:\nAAT\n","category":"page"},{"location":"transforms/#Additional-transformations","page":"Indexing & modifying sequences","title":"Additional transformations","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"In addition to these basic modifying functions, other sequence transformations that are common in bioinformatics are also provided.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"reverse!(::BioSequences.LongSequence)\nreverse(::BioSequences.LongSequence{<:NucleicAcidAlphabet})\ncomplement!\ncomplement\nreverse_complement!\nreverse_complement\nungap!\nungap\ncanonical!\ncanonical","category":"page"},{"location":"transforms/#Base.reverse!-Tuple{LongSequence}","page":"Indexing & modifying 
sequences","title":"Base.reverse!","text":"reverse!(seq::LongSequence)\n\nReverse a biological sequence seq in place.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#Base.reverse-Tuple{LongSequence{<:NucleicAcidAlphabet}}","page":"Indexing & modifying sequences","title":"Base.reverse","text":"reverse(seq::BioSequence)\n\nCreate reversed copy of a biological sequence.\n\n\n\n\n\nreverse(seq::LongSequence)\n\nCreate reversed copy of a biological sequence.\n\n\n\n\n\n","category":"method"},{"location":"transforms/#BioSequences.complement!","page":"Indexing & modifying sequences","title":"BioSequences.complement!","text":"complement!(seq)\n\nMake a complement sequence of seq in place.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSymbols.complement","page":"Indexing & modifying sequences","title":"BioSymbols.complement","text":"complement(nt::NucleicAcid)\n\nReturn the complementary nucleotide of nt.\n\nThis function returns the union of all possible complementary nucleotides.\n\nExamples\n\njulia> complement(DNA_A)\nDNA_T\n\njulia> complement(DNA_N)\nDNA_N\n\njulia> complement(RNA_U)\nRNA_A\n\n\n\n\n\n\ncomplement(seq)\n\nMake a complement sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.reverse_complement!","page":"Indexing & modifying sequences","title":"BioSequences.reverse_complement!","text":"reverse_complement!(seq)\n\nMake a reversed complement sequence of seq in place.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.reverse_complement","page":"Indexing & modifying sequences","title":"BioSequences.reverse_complement","text":"reverse_complement(seq)\n\nMake a reversed complement sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ungap!","page":"Indexing & modifying sequences","title":"BioSequences.ungap!","text":"Remove gap characters from an input 
sequence.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ungap","page":"Indexing & modifying sequences","title":"BioSequences.ungap","text":"Create a copy of a sequence with gap characters removed.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.canonical!","page":"Indexing & modifying sequences","title":"BioSequences.canonical!","text":"canonical!(seq::NucleotideSeq)\n\nTransforms the seq into its canonical form, if it is not already canonical. Modifies the input sequence in place.\n\nFor any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:\n\n------->\nATCGATCG\nCGATCGAT\n<-------\n\nnote: Note\nUsing the reverse_complement of a DNA sequence will give this reverse complement.\n\nOf the two sequences, the canonical one is the lesser of the two, i.e. canonical_seq < other_seq.\n\nUsing this function on a seq will ensure it is the canonical version.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.canonical","page":"Indexing & modifying sequences","title":"BioSequences.canonical","text":"canonical(seq::NucleotideSeq)\n\nCreate the canonical sequence of seq.\n\n\n\n\n\n","category":"function"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Some examples:","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> seq = dna\"ACGTAT\"\n6nt DNA Sequence:\nACGTAT\n\njulia> reverse!(seq)\n6nt DNA Sequence:\nTATGCA\n\njulia> complement!(seq)\n6nt DNA Sequence:\nATACGT\n\njulia> reverse_complement!(seq)\n6nt DNA Sequence:\nACGTAT\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Many of these methods also have a version which makes a copy of the input sequence, so you get a modified 
copy, and don't alter the original sequence. Such methods are named the same, but without the exclamation mark. E.g. reverse instead of reverse!, and ungap instead of ungap!.  ","category":"page"},{"location":"transforms/#Translation","page":"Indexing & modifying sequences","title":"Translation","text":"","category":"section"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"Translation is a slightly more complex transformation for RNA Sequences and so we describe it here in more detail.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"The translate function translates a sequence of codons in an RNA sequence to an amino acid sequence based on a genetic code. The BioSequences package provides all NCBI defined genetic codes and they are registered in ncbi_trans_table.","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"translate\nncbi_trans_table","category":"page"},{"location":"transforms/#BioSequences.translate","page":"Indexing & modifying sequences","title":"BioSequences.translate","text":"translate(seq, code=standard_genetic_code, allow_ambiguous_codons=true, alternative_start=false)\n\nTranslate a LongRNA or a LongDNA to a LongAA.\n\nTranslation uses genetic code code to map codons to amino acids. See ncbi_trans_table for available genetic codes. If codons in the given sequence cannot determine a unique amino acid, they will be translated to AA_X if allow_ambiguous_codons is true and otherwise result in an error. 
For organisms that utilize alternative start codons, one can set alternative_start=true, in which case the first codon will always be converted to a methionine.\n\n\n\n\n\n","category":"function"},{"location":"transforms/#BioSequences.ncbi_trans_table","page":"Indexing & modifying sequences","title":"BioSequences.ncbi_trans_table","text":"Genetic code list of NCBI.\n\nThe standard genetic code is ncbi_trans_table[1] and others can be shown by show(ncbi_trans_table). For more details, consult the next link: http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes.\n\n\n\n\n\n","category":"constant"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"julia> ncbi_trans_table\nTranslation Tables:\n  1. The Standard Code (standard_genetic_code)\n  2. The Vertebrate Mitochondrial Code (vertebrate_mitochondrial_genetic_code)\n  3. The Yeast Mitochondrial Code (yeast_mitochondrial_genetic_code)\n  4. The Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code (mold_mitochondrial_genetic_code)\n  5. The Invertebrate Mitochondrial Code (invertebrate_mitochondrial_genetic_code)\n  6. The Ciliate, Dasycladacean and Hexamita Nuclear Code (ciliate_nuclear_genetic_code)\n  9. The Echinoderm and Flatworm Mitochondrial Code (echinoderm_mitochondrial_genetic_code)\n 10. The Euplotid Nuclear Code (euplotid_nuclear_genetic_code)\n 11. The Bacterial, Archaeal and Plant Plastid Code (bacterial_plastid_genetic_code)\n 12. The Alternative Yeast Nuclear Code (alternative_yeast_nuclear_genetic_code)\n 13. The Ascidian Mitochondrial Code (ascidian_mitochondrial_genetic_code)\n 14. The Alternative Flatworm Mitochondrial Code (alternative_flatworm_mitochondrial_genetic_code)\n 15. Blepharisma Macronuclear Code (blepharisma_macronuclear_genetic_code)\n 16. Chlorophycean Mitochondrial Code (chlorophycean_mitochondrial_genetic_code)\n 21. 
Trematode Mitochondrial Code (trematode_mitochondrial_genetic_code)\n 22. Scenedesmus obliquus Mitochondrial Code (scenedesmus_obliquus_mitochondrial_genetic_code)\n 23. Thraustochytrium Mitochondrial Code (thraustochytrium_mitochondrial_genetic_code)\n 24. Pterobranchia Mitochondrial Code (pterobrachia_mitochondrial_genetic_code)\n 25. Candidate Division SR1 and Gracilibacteria Code (candidate_division_sr1_genetic_code)\n","category":"page"},{"location":"transforms/","page":"Indexing & modifying sequences","title":"Indexing & modifying sequences","text":"https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/#Construction-and-conversion","page":"Constructing sequences","title":"Construction & conversion","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Here we will showcase the various ways you can construct the various sequence types in BioSequences.","category":"page"},{"location":"construction/#Constructing-sequences","page":"Constructing sequences","title":"Constructing sequences","text":"","category":"section"},{"location":"construction/#From-strings","page":"Constructing sequences","title":"From strings","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequences can be constructed from strings using their constructors:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}(\"TTANC\")\n5nt DNA Sequence:\nTTANC\n\njulia> LongSequence{DNAAlphabet{2}}(\"TTAGC\")\n5nt DNA Sequence:\nTTAGC\n\njulia> LongRNA{4}(\"UUANC\")\n5nt RNA 
Sequence:\nUUANC\n\njulia> LongSequence{RNAAlphabet{2}}(\"UUAGC\")\n5nt RNA Sequence:\nUUAGC\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Type aliases can also be used for brevity.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}(\"TTANC\")\n5nt DNA Sequence:\nTTANC\n\njulia> LongDNA{2}(\"TTAGC\")\n5nt DNA Sequence:\nTTAGC\n\njulia> LongRNA{4}(\"UUANC\")\n5nt RNA Sequence:\nUUANC\n\njulia> LongRNA{2}(\"UUAGC\")\n5nt RNA Sequence:\nUUAGC","category":"page"},{"location":"construction/#Constructing-sequences-from-arrays-of-BioSymbols","page":"Constructing sequences","title":"Constructing sequences from arrays of BioSymbols","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequences can be constructed using vectors or arrays of a BioSymbol type:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{4}([DNA_T, DNA_T, DNA_A, DNA_N, DNA_C])\n5nt DNA Sequence:\nTTANC\n\njulia> LongSequence{DNAAlphabet{2}}([DNA_T, DNA_T, DNA_A, DNA_G, DNA_C])\n5nt DNA Sequence:\nTTAGC\n","category":"page"},{"location":"construction/#Constructing-sequences-from-other-sequences","page":"Constructing sequences","title":"Constructing sequences from other sequences","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You can create sequences, by concatenating other sequences together:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> LongDNA{2}(\"ACGT\") * LongDNA{2}(\"TGCA\")\n8nt DNA Sequence:\nACGTTGCA\n\njulia> repeat(LongDNA{4}(\"TA\"), 10)\n20nt DNA Sequence:\nTATATATATATATATATATA\n\njulia> 
LongDNA{4}(\"TA\") ^ 10\n20nt DNA Sequence:\nTATATATATATATATATATA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Sequence views (LongSubSeqs) are special, in that they do not own their own data, and must be constructed from a LongSequence or another LongSubSeq:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> seq = LongDNA{4}(\"TACGGACATTA\")\n11nt DNA Sequence:\nTACGGACATTA\n\njulia> seqview = LongSubSeq(seq, 3:7)\n5nt DNA Sequence:\nCGGAC\n\njulia> seqview2 = @view seq[1:3]\n3nt DNA Sequence:\nTAC\n\njulia> typeof(seqview) == typeof(seqview2) && typeof(seqview) <: LongSubSeq\ntrue\n","category":"page"},{"location":"construction/#Conversion-of-sequence-types","page":"Constructing sequences","title":"Conversion of sequence types","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You can convert between sequence types, if the sequences are compatible - that is, if the source sequence does not contain symbols that are un-encodable by the destination type.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> dna = dna\"TTACGTAGACCG\"\n12nt DNA Sequence:\nTTACGTAGACCG\n\njulia> dna2 = convert(LongDNA{2}, dna)\n12nt DNA Sequence:\nTTACGTAGACCG","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DNA/RNA are special in that they can be converted to each other, despite containing distinct symbols. 
When doing so, DNA_T is converted to RNA_U and vice versa.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> convert(LongRNA{2}, dna\"TAGCTAGG\")\n8nt RNA Sequence:\nUAGCUAGG","category":"page"},{"location":"construction/#String-literals","page":"Constructing sequences","title":"String literals","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"BioSequences provides several string literal macros for creating sequences.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"note: Note\nWhen you use literals you may mix the case of characters.","category":"page"},{"location":"construction/#Long-sequence-literals","page":"Constructing sequences","title":"Long sequence literals","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> dna\"TACGTANNATC\"\n11nt DNA Sequence:\nTACGTANNATC\n\njulia> rna\"AUUUGNCCANU\"\n11nt RNA Sequence:\nAUUUGNCCANU\n\njulia> aa\"ARNDCQEGHILKMFPSTWYVX\"\n21aa Amino Acid Sequence:\nARNDCQEGHILKMFPSTWYVX","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"However, it should be noted that by default these sequence literals allocate the LongSequence object before the code containing the sequence literal is run. This means there may be occasions where your program does not behave as you first expect. 
For example consider the following code:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> function foo()\n           s = dna\"CTT\"\n           push!(s, DNA_A)\n       end\nfoo (generic function with 1 method)\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"d\n        push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"You might expect that every time you call foo, that a DNA sequence CTTA would be returned. You might expect that this is because every time foo is called, a new DNA sequence variable CTT is created, and the A nucleotide is pushed to it, and the result, CTTA is returned. In other words you might expect the following output:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"However, this is not what happens, instead the following happens:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"s\n        push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n5nt DNA Sequence:\nCTTAA\n\njulia> foo()\n6nt DNA Sequence:\nCTTAAA\n","category":"page"},{"location":"construction/","page":"Constructing 
sequences","title":"Constructing sequences","text":"The reason for this is because the sequence literal is allocated only once before the first time the function foo is called and run. Therefore, s in foo is always a reference to that one sequence that was allocated. So one sequence is created before foo is called, and then it is pushed to every time foo is called. Thus, that one allocated sequence grows with every call of foo.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"If you wanted foo to create a new sequence each time it is called, then you can add a flag to the end of the sequence literal to dictate behaviour: A flag of 's' means 'static': the sequence will be allocated before code is run, as is the default behaviour described above. However providing 'd' flag changes the behaviour: 'd' means 'dynamic': the sequence will be allocated whilst the code is running, and not before. So to change foo so as it creates a new sequence each time it is called, simply add the 'd' flag to the sequence literal:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> function foo()\n           s = dna\"CTT\"d     # 'd' flag appended to the string literal.\n           push!(s, DNA_A)\n       end\nfoo (generic function with 1 method)\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Now every time foo is called, a new sequence CTT is created, and an A nucleotide is pushed to it:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\n    function foo()\n        s = dna\"CTT\"d\n     
   push!(s, DNA_A)\n    end\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n\njulia> foo()\n4nt DNA Sequence:\nCTTA\n","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"DocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"So the take home message of sequence literals is this:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Be careful when you are using sequence literals inside of functions, and inside the bodies of things like for loops. And if you use them and are unsure, use the  's' and 'd' flags to ensure the behaviour you get is the behaviour you intend.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"@dna_str\n@rna_str\n@aa_str","category":"page"},{"location":"construction/#BioSequences.@dna_str","page":"Constructing sequences","title":"BioSequences.@dna_str","text":"@dna_str(seq, flag=\"s\") -> LongDNA{4}\n\nCreate a LongDNA{4} sequence at parse time from string seq. If flag is \"s\" ('static', the default), the sequence is created at parse time, and inserted directly into the returned expression. A static string ought not to be mutated Alternatively, if flag is \"d\" (dynamic), a new sequence is parsed and created whenever the code where is macro is placed is run.\n\nSee also: @aa_str, @rna_str\n\nExamples\n\nIn the example below, the static sequence is created once, at parse time, NOT when the function f is run. 
This means it is the same sequence that is pushed to repeatedly.\n\njulia> f() = dna\"TAG\";\n\njulia> string(push!(f(), DNA_A)) # NB: Mutates static string!\n\"TAGA\"\n\njulia> string(push!(f(), DNA_A))\n\"TAGAA\"\n\njulia> f() = dna\"TAG\"d; # dynamically make seq\n\njulia> string(push!(f(), DNA_A))\n\"TAGA\"\n\njulia> string(push!(f(), DNA_A))\n\"TAGA\"\n\n\n\n\n\n","category":"macro"},{"location":"construction/#BioSequences.@rna_str","page":"Constructing sequences","title":"BioSequences.@rna_str","text":"The LongRNA{4} equivalent to @dna_str\n\nSee also: @dna_str, @aa_str\n\nExamples\n\njulia> rna\"UCGUGAUGC\"\n9nt RNA Sequence:\nUCGUGAUGC\n\n\n\n\n\n","category":"macro"},{"location":"construction/#BioSequences.@aa_str","page":"Constructing sequences","title":"BioSequences.@aa_str","text":"The AminoAcidAlphabet equivalent to @dna_str\n\nSee also: @dna_str, @rna_str\n\nExamples\n\njulia> aa\"PKLEQC\"\n6aa Amino Acid Sequence:\nPKLEQC\n\n\n\n\n\n","category":"macro"},{"location":"construction/#Loose-parsing","page":"Constructing sequences","title":"Loose parsing","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"As of version 3.2.0, BioSequences.jl provides the bioseq function, which can be used to build a LongSequence from a string (or an AbstractVector{UInt8}) without knowing the correct Alphabet.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> bioseq(\"ATGTGCTGA\")\n9nt DNA Sequence:\nATGTGCTGA","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"The function will prioritise 2-bit alphabets over 4-bit alphabets, and prefer smaller alphabets (like DNAAlphabet{4}) over larger (like AminoAcidAlphabet). 
If the input cannot be encoded by any of the built-in alphabets, an error is thrown:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> bioseq(\"0!(CC!;#&&%\")\nERROR: cannot encode 0x30 in AminoAcidAlphabet\n[...]","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Note that this function is only intended to be used for interactive, ephemeral work. The function is necessarily type unstable, and the precise returned alphabet for a given input is a heuristic which is subject to change.","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"bioseq\nguess_alphabet","category":"page"},{"location":"construction/#BioSequences.bioseq","page":"Constructing sequences","title":"BioSequences.bioseq","text":"bioseq(s::Union{AbstractString, AbstractVector{UInt8}}) -> LongSequence\n\nParse s into a LongSequence with an appropriate Alphabet, or throw an exception if no alphabet matches. See guess_alphabet for the available alphabets and the alphabet priority.\n\nwarning: Warning\nThe functions bioseq and guess_alphabet are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. They are type unstable, and their heuristics are subject to change in minor versions.\n\nExamples\n\njulia> bioseq(\"QMKLPEEFW\")\n9aa Amino Acid Sequence:\nQMKLPEEFW\n\njulia> bioseq(\"UAUGCUGUAGG\")\n11nt RNA Sequence:\nUAUGCUGUAGG\n\njulia> bioseq(\"PKMW#3>>0;kL\")\nERROR: cannot encode 0x23 in AminoAcidAlphabet\n[...]\n\n\n\n\n\n","category":"function"},{"location":"construction/#BioSequences.guess_alphabet","page":"Constructing sequences","title":"BioSequences.guess_alphabet","text":"guess_alphabet(s::Union{AbstractString, AbstractVector{UInt8}}) -> Union{Integer, Alphabet}\n\nPick an Alphabet that can encode input s.  
If no Alphabet can, return the index of the first byte of the input which is not encodable in any alphabet. This function only knows about the alphabets listed below. If multiple alphabets are possible, pick the first from the order below (i.e. DNAAlphabet{2}() if possible, otherwise RNAAlphabet{2}() etc).\n\nDNAAlphabet{2}()\nRNAAlphabet{2}()\nDNAAlphabet{4}()\nRNAAlphabet{4}()\nAminoAcidAlphabet()\n\nwarning: Warning\nThe functions bioseq and guess_alphabet are intended for use in interactive sessions, and are not suitable for use in packages or non-ephemeral work. They are type unstable, and their heuristics are subject to change in minor versions.\n\nExamples\n\njulia> guess_alphabet(\"AGGCA\")\nDNAAlphabet{2}()\n\njulia> guess_alphabet(\"WKLQSTV\")\nAminoAcidAlphabet()\n\njulia> guess_alphabet(\"QAWT+!\")\n5\n\njulia> guess_alphabet(\"UAGCSKMU\")\nRNAAlphabet{4}()\n\n\n\n\n\n","category":"function"},{"location":"construction/#Comparison-to-other-sequence-types","page":"Constructing sequences","title":"Comparison to other sequence types","text":"","category":"section"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"Following Base standards, BioSequences do not compare equal to other containers even if they have the same elements. To e.g. 
compare a BioSequence with a vector of DNA, compare the elements themselves:","category":"page"},{"location":"construction/","page":"Constructing sequences","title":"Constructing sequences","text":"julia> seq = dna\"GAGCTGA\"; vec = collect(seq);\n\njulia> seq == vec, isequal(seq, vec)\n(false, false)\n\njulia> length(seq) == length(vec) && all(i == j for (i, j) in zip(seq, vec))\ntrue ","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"sequence_search/#Searching-for-sequence-motifs","page":"Pattern matching and searching","title":"Searching for sequence motifs","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"There are many ways to search for particular motifs in biological sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Exact searches, where you are looking for exact matches of a particular character or substring.\nApproximate searches, where you are looking for sequences that are sufficiently similar to a given sequence or family of sequences.\nSearches where you are looking for sequences that conform to some sort of pattern.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Like other Julia sequences such as Vector, you can search a BioSequence with the findfirst(predicate, collection) method pattern.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"All these kinds of searches are provided in BioSequences.jl, and they all conform to the findnext, findprev, and occursin patterns 
established in Base for String and collections like Vector.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The exception is searching using the specialised regex provided in this package, which as you shall see, conforms to the match pattern established in Base for pcre and Strings.","category":"page"},{"location":"sequence_search/#Symbol-search","page":"Pattern matching and searching","title":"Symbol search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> seq = dna\"ACAGCGTAGCT\";\n\njulia> findfirst(DNA_A, seq)\n1\n\njulia> findlast(DNA_A, seq)\n8\n\njulia> findnext(DNA_A, seq, 2)\n3\n\njulia> findprev(DNA_A, seq, 7)\n3\n\njulia> findall(DNA_A, seq)\n3-element Vector{Int64}:\n 1\n 3\n 8","category":"page"},{"location":"sequence_search/#Exact-search","page":"Pattern matching and searching","title":"Exact search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"ExactSearchQuery","category":"page"},{"location":"sequence_search/#BioSequences.ExactSearchQuery","page":"Pattern matching and searching","title":"BioSequences.ExactSearchQuery","text":"ExactSearchQuery{F<:Function,S<:BioSequence}\n\nQuery type for exact sequence search.\n\nAn exact search is one where you are looking in some given sequence for exact instances of some given substring.\n\nThese queries are used as a predicate for the Base.findnext, Base.findprev, Base.occursin, Base.findfirst, and Base.findlast functions.\n\nExamples\n\njulia> seq = dna\"ACAGCGTAGCT\";\n\njulia> query = ExactSearchQuery(dna\"AGC\");\n\njulia> findfirst(query, seq)\n3:5\n\njulia> findlast(query, seq)\n8:10\n\njulia> findnext(query, seq, 6)\n8:10\n\njulia> findprev(query, seq, 7)\n3:5\n\njulia> findall(query, 
seq)\n2-element Vector{UnitRange{Int64}}:\n 3:5\n 8:10\n\njulia> occursin(query, seq)\ntrue\n\n\nYou can pass a comparator function such as isequal or iscompatible to its constructor to modify the search behaviour.\n\nThe default is isequal, however, in biology, sometimes we want a more flexible comparison to find subsequences of compatible symbols.\n\njulia> query = ExactSearchQuery(dna\"CGT\", iscompatible);\n\njulia> findfirst(query, dna\"ACNT\")  # 'N' matches 'G'\n2:4\n\njulia> findfirst(query, dna\"ACGT\")  # 'G' matches 'N'\n2:4\n\njulia> occursin(ExactSearchQuery(dna\"CNT\", iscompatible), dna\"ACNT\")\ntrue\n\n\n\n\n\n\n","category":"type"},{"location":"sequence_search/#Allowing-mismatches","page":"Pattern matching and searching","title":"Allowing mismatches","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"ApproximateSearchQuery","category":"page"},{"location":"sequence_search/#BioSequences.ApproximateSearchQuery","page":"Pattern matching and searching","title":"BioSequences.ApproximateSearchQuery","text":"ApproximateSearchQuery{F<:Function,S<:BioSequence}\n\nQuery type for approximate sequence search.\n\nThese queries are used as a predicate for the Base.findnext, Base.findprev, Base.occursin, Base.findfirst, and Base.findlast functions.\n\nUsing these functions with these queries allows you to search a given sequence for a sub-sequence, whilst allowing a specific number of errors.\n\nIn other words they find a subsequence of the target sequence within a specific Levenshtein distance of the query sequence.\n\nExamples\n\njulia> seq = dna\"ACAGCGTAGCT\";\n\njulia> query = ApproximateSearchQuery(dna\"AGGG\");\n\njulia> findfirst(query, 0, seq) == nothing # nothing matches with no errors\ntrue\n\njulia> findfirst(query, 1, seq)  # seq[3:6] matches with one error\n3:6\n\njulia> findfirst(query, 2, seq)  # seq[1:4] matches with two errors\n1:4\n\n\nYou can 
pass a comparator function such as isequal or iscompatible to its constructor to modify the search behaviour.\n\nThe default is isequal, however, in biology, sometimes we want a more flexible comparison to find subsequences of compatible symbols.\n\njulia> query = ApproximateSearchQuery(dna\"AGGG\", iscompatible);\n\njulia> occursin(query, 1, dna\"AAGNGG\")    # 1 mismatch permitted (A vs G) & matched N\ntrue\n\njulia> findnext(query, 1, dna\"AAGNGG\", 1) # 1 mismatch permitted (A vs G) & matched N\n1:4\n\n\nnote: Note\nThis method of searching for motifs was implemented with smaller query motifs in mind. If you are looking to search for imperfect matches of longer sequences in this manner, you are likely better off using some kind of local-alignment algorithm or one of the BLAST variants.\n\n\n\n\n\n","category":"type"},{"location":"sequence_search/#Searching-according-to-a-pattern","page":"Pattern matching and searching","title":"Searching according to a pattern","text":"","category":"section"},{"location":"sequence_search/#Regular-expression-search","page":"Pattern matching and searching","title":"Regular expression search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Query patterns can be described in regular expressions. The syntax supports a subset of Perl and PROSITE's notation.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Biological regexes can be constructed using the BioRegex constructor, for example by doing BioRegex{AminoAcid}(\"MV+\"). 
For bioregex literals, it is instead recommended using the @biore_str macro:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The Perl-like syntax starts with biore (BIOlogical REgular expression) and ends with a symbol option: \"dna\", \"rna\" or \"aa\". For example, biore\"A+\"dna is a regular expression for DNA sequences and biore\"A+\"aa is for amino acid sequences. The symbol options can be abbreviated to its first character: \"d\", \"r\" or \"a\", respectively.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Here are examples of using the regular expression for BioSequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> match(biore\"A+C*\"dna, dna\"AAAACC\")\nRegexMatch(\"AAAACC\")\n\njulia> match(biore\"A+C*\"d, dna\"AAAACC\")\nRegexMatch(\"AAAACC\")\n\njulia> occursin(biore\"A+C*\"dna, dna\"AAC\")\ntrue\n\njulia> occursin(biore\"A+C*\"dna, dna\"C\")\nfalse\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"match will return a RegexMatch if a match is found, otherwise it will return nothing if no match is found.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The table below summarizes available syntax elements.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Syntax Description Example\n| alternation \"A|T\" matches \"A\" and \"T\"\n* zero or more times repeat \"TA*\" matches \"T\", \"TA\" and \"TAA\"\n+ one or more times repeat \"TA+\" matches \"TA\" and \"TAA\"\n? 
zero or one time \"TA?\" matches \"T\" and \"TA\"\n{n,} n or more times repeat \"A{3,}\" matches \"AAA\" and \"AAAA\"\n{n,m} n-m times repeat \"A{3,5}\" matches \"AAA\", \"AAAA\" and \"AAAAA\"\n^ the start of the sequence \"^TAN*\" matches \"TATGT\"\n$ the end of the sequence \"N*TA$\" matches \"GCTA\"\n(...) pattern grouping \"(TA)+\" matches \"TA\" and \"TATA\"\n[...] one of symbols \"[ACG]+\" matches \"AGGC\"","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"eachmatch and findfirst are also defined, just like usual regex and strings found in Base.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> collect(matched(x) for x in eachmatch(biore\"TATA*?\"d, dna\"TATTATAATTA\")) # overlap\n4-element Vector{LongSequence{DNAAlphabet{4}}}:\n TAT  \n TAT\n TATA\n TATAA\n\njulia> collect(matched(x) for x in eachmatch(biore\"TATA*\"d, dna\"TATTATAATTA\", false)) # no overlap\n2-element Vector{LongSequence{DNAAlphabet{4}}}:\n TAT  \n TATAA\n\njulia> findfirst(biore\"TATA*\"d, dna\"TATTATAATTA\")\n1:3\n\njulia> findfirst(biore\"TATA*\"d, dna\"TATTATAATTA\", 2)\n4:8\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Noteworthy differences from strings are:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"Ambiguous characters match any compatible characters (e.g. biore\"N\"d is equivalent to biore\"[ACGT]\"d).\nWhitespaces are ignored (e.g. biore\"A C G\"d is equivalent to biore\"ACG\"d).","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The PROSITE notation is described in ScanProsite - user manual. 
The syntax supports almost all notations including the extended syntax. The PROSITE notation starts with prosite prefix and no symbol option is needed because it always describes patterns of amino acid sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> match(prosite\"[AC]-x-V-x(4)-{ED}\", aa\"CPVPQARG\")\nRegexMatch(\"CPVPQARG\")\n\njulia> match(prosite\"[AC]xVx(4){ED}\", aa\"CPVPQARG\")\nRegexMatch(\"CPVPQARG\")\n","category":"page"},{"location":"sequence_search/#Position-weight-matrix-search","page":"Pattern matching and searching","title":"Position weight matrix search","text":"","category":"section"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"A motif can be specified using position weight matrix (PWM) in a probabilistic way. This method searches for the first position in the sequence where a score calculated using a PWM is greater than or equal to a threshold. More formally, denoting the sequence as S and the PWM value of symbol s at position j as M_sj, the score starting from a position p is defined as","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"operatornamescore(S p) = sum_i=1^L M_Sp+i-1i","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"and the search returns the smallest p that satisfies operatornamescore(S p) ge t.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"There are two kinds of matrices in this package: PFM and PWM. The PFM type is a position frequency matrix and stores symbol frequencies for each position. The PWM is a position weight matrix and stores symbol scores for each position. 
You can create a PFM from a set of sequences with the same length and then create a PWM from the PFM object.","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> motifs = [dna\"TTA\", dna\"CTA\", dna\"ACA\", dna\"TCA\", dna\"GTA\"]\n5-element Vector{LongSequence{DNAAlphabet{4}}}:\n TTA\n CTA\n ACA\n TCA\n GTA\n\njulia> pfm = PFM(motifs)  # sequence set => PFM\n4×3 PFM{DNA, Int64}:\n A  1  0  5\n C  1  2  0\n G  1  0  0\n T  2  3  0\n\njulia> pwm = PWM(pfm)  # PFM => PWM\n4×3 PWM{DNA, Float64}:\n A -0.321928 -Inf       2.0\n C -0.321928  0.678072 -Inf\n G -0.321928 -Inf      -Inf\n T  0.678072  1.26303  -Inf\n\njulia> pwm = PWM(pfm .+ 0.01)  # add pseudo counts to avoid infinite values\n4×3 PWM{DNA, Float64}:\n A -0.319068 -6.97728   1.99139\n C -0.319068  0.673772 -6.97728\n G -0.319068 -6.97728  -6.97728\n T  0.673772  1.25634  -6.97728\n\njulia> pwm = PWM(pfm .+ 0.01, prior=[0.2, 0.3, 0.3, 0.2])  # GC-rich prior\n4×3 PWM{DNA, Float64}:\n A  0.00285965 -6.65535   2.31331\n C -0.582103    0.410737 -7.24031\n G -0.582103   -7.24031  -7.24031\n T  0.9957      1.57827  -6.65535\n","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"The PWM_sj matrix is computed from PFM_sj and the prior probability p(s) as follows ([Wasserman2004]):","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"beginalign\n    PWM_sj = log_2 fracp(sj)p(s) \n    p(sj)  = fracPFM_sjsum_s PFM_sj\nendalign","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"However, if you just want to quickly conduct a search, constructing the PFM and PWM is done for you as a convenience if you build a PWMSearchQuery, using a collection of 
sequences:","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"julia> motifs = [dna\"TTA\", dna\"CTA\", dna\"ACA\", dna\"TCA\", dna\"GTA\"]\n5-element Vector{LongSequence{DNAAlphabet{4}}}:\n TTA\n CTA\n ACA\n TCA\n GTA\n\njulia> subject = dna\"TATTATAATTA\";\n\njulia> qa = PWMSearchQuery(motifs, 1.0);\n\njulia> findfirst(qa, subject)\n3\n\njulia> findall(qa, subject)\n3-element Vector{Int64}:\n 3\n 5\n 9","category":"page"},{"location":"sequence_search/","page":"Pattern matching and searching","title":"Pattern matching and searching","text":"[Wasserman2004]: https://doi.org/10.1038/nrg1315","category":"page"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"predicates/#Predicates","page":"Predicates","title":"Predicates","text":"","category":"section"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"A number of predicate or query functions are supported for sequences, allowing you to check for certain properties of a sequence.","category":"page"},{"location":"predicates/","page":"Predicates","title":"Predicates","text":"isrepetitive\nispalindromic\nhasambiguity\niscanonical","category":"page"},{"location":"predicates/#BioSequences.isrepetitive","page":"Predicates","title":"BioSequences.isrepetitive","text":"isrepetitive(seq::BioSequence, n::Integer = length(seq))\n\nReturn true if and only if seq contains a repetitive subsequence of length ≥ n.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.ispalindromic","page":"Predicates","title":"BioSequences.ispalindromic","text":"ispalindromic(seq::NucSeq) -> Bool\n\nCheck if seq is palindromic. 
A palindromic sequence is identical to its reverse-complement, so this should be equivalent to checking if seq == reverse_complement(seq).\n\nExamples\n\njulia> ispalindromic(dna\"TGCA\")\ntrue\n\njulia> ispalindromic(dna\"TCCT\")\nfalse\n\njulia> ispalindromic(rna\"ACGGU\")\nfalse\n\nReturn true if seq is a palindromic sequence; otherwise return false.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.hasambiguity","page":"Predicates","title":"BioSequences.hasambiguity","text":"hasambiguity(seq::BioSequence)\n\nReturns true if seq has an ambiguous symbol; otherwise return false.\n\n\n\n\n\n","category":"function"},{"location":"predicates/#BioSequences.iscanonical","page":"Predicates","title":"BioSequences.iscanonical","text":"iscanonical(seq::NucleotideSeq)\n\nReturns true if seq is canonical.\n\nFor any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:\n\n------->\nATCGATCG\nCGATCGAT\n<-------\n\nnote: Note\nUsing the reverse_complement of a DNA sequence will give this reverse complement.\n\nOf the two sequences, the canonical one is the lesser of the two, i.e. 
canonical_seq < other_seq.\n\n\n\n\n\n","category":"function"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\n    using BioSymbols\nend","category":"page"},{"location":"recipes/#Recipes","page":"Recipes","title":"Recipes","text":"","category":"section"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"This page provides tested example code to solve various common problems using BioSequences.","category":"page"},{"location":"recipes/#One-hot-encoding-biosequences","page":"Recipes","title":"One-hot encoding biosequences","text":"","category":"section"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"The types DNA, RNA and AminoAcid expose a binary representation through the exported function BioSymbols.compatbits, which is a one-hot encoding of:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"julia> using BioSymbols\n\njulia> compatbits(DNA_W)\n0x09\n\njulia> compatbits(AA_J)\n0x00000600","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"Each set bit in the encoding corresponds to a compatible unambiguous symbol. For example, for RNA, the four lower bits encode A, C, G, and U, in order. 
Hence, the symbol D, which is short for A, G or U, is encoded as 0x01 | 0x04 | 0x08 == 0x0d:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"julia> compatbits(RNA_D)\n0x0d\n\njulia> compatbits(RNA_A) | compatbits(DNA_G) | compatbits(RNA_U)\n0x0d","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"Using this, we can construct a function to one-hot encode sequences - in this example, nucleic acid sequences:","category":"page"},{"location":"recipes/","page":"Recipes","title":"Recipes","text":"function one_hot(s::NucSeq)\n    M = falses(4, length(s))\n    for (i, s) in enumerate(s)\n        bits = compatbits(s)\n        while !iszero(bits)\n            M[trailing_zeros(bits) + 1, i] = true\n            bits &= bits - one(bits) # clear lowest bit\n        end\n    end\n    M\nend\n\none_hot(dna\"TGNTKCTW-T\")\n\n# output\n\n4×10 BitMatrix:\n 0  0  1  0  0  0  0  1  0  0\n 0  0  1  0  0  1  0  0  0  0\n 0  1  1  0  1  0  0  0  0  0\n 1  0  1  1  1  0  1  1  0  1","category":"page"},{"location":"#BioSequences","page":"Home","title":"BioSequences","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"(Image: Latest Release) (Image: MIT license) (Image: Documentation) (Image: Pkg Status)","category":"page"},{"location":"#Description","page":"Home","title":"Description","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"BioSequences provides data types and methods for common operations with biological sequences, including DNA, RNA, and amino acid sequences.","category":"page"},{"location":"#Installation","page":"Home","title":"Installation","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"You can install BioSequences from the julia REPL. 
Press ] to enter pkg mode, and enter the following:","category":"page"},{"location":"","page":"Home","title":"Home","text":"add BioSequences","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you are interested in the cutting edge of the development, please check out the master branch to try new features before release.","category":"page"},{"location":"#Testing","page":"Home","title":"Testing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"BioSequences is tested against Julia 1.X on Linux, OS X, and Windows.","category":"page"},{"location":"","page":"Home","title":"Home","text":"(Image: Unit tests) (Image: Documentation) (Image: )","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Take a look at the contributing files, the detailed contributor and maintainer guidelines, and the code of conduct.","category":"page"},{"location":"#Questions?","page":"Home","title":"Questions?","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you have a question about contributing or using BioJulia software, come on over and chat to us on the #biology channel on the Julia Slack, or you can try the Bio category of the Julia Discourse site.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"CurrentModule = BioSequences\nDocTestSetup = quote\n    using BioSequences\nend","category":"page"},{"location":"types/#Abstract-Types","page":"BioSequences Types","title":"Abstract Types","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences exports an abstract 
BioSequence type, and several concrete sequence types which inherit from it.","category":"page"},{"location":"types/#The-abstract-BioSequence","page":"BioSequences Types","title":"The abstract BioSequence","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences provides an abstract type called a BioSequence{A<:Alphabet}. This abstract type, and the methods and traits it supports, allows for many algorithms in BioSequences to be written as generically as possible, thus reducing the amount of code to read and understand, whilst maintaining high performance when such code is compiled for a concrete BioSequence subtype. Additionally, it allows new types to be implemented that are fully compatible with the rest of BioSequences, provided that key methods or traits are defined.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequence","category":"page"},{"location":"types/#BioSequences.BioSequence","page":"BioSequences Types","title":"BioSequences.BioSequence","text":"BioSequence{A <: Alphabet}\n\nBioSequence is the main abstract type of BioSequences. It abstracts over the internal representation of different biological sequences, and is parameterized by an Alphabet, which controls the element type.\n\nExtended help\n\nIts subtypes are characterized by:\n\nBeing a linear container type with random access and indices Base.OneTo(length(x)).\nContaining zero or more internal data elements of type encoded_data_eltype(typeof(x)).\nBeing associated with an Alphabet, A by being a subtype of BioSequence{A}.\n\nA BioSequence{A} is indexed by an integer. The biosequence subtype, the index and the alphabet A determine how to extract the internal encoded data. The alphabet decides how to decode the data to the element type of the biosequence. 
Hence, the element type and container type of a BioSequence are separated.\n\nSubtypes T of BioSequence must implement the following, with E being an encoded data type:\n\nBase.length(::T)::Int\nencoded_data_eltype(::Type{T})::Type{E}\nextract_encoded_element(::T, ::Integer)::E\ncopy(::T)\nT must be able to be constructed from any iterable with length defined and with a known, compatible element type.\n\nFurthermore, mutable sequences should implement\n\nencoded_setindex!(::T, ::E, ::Integer)\nT(undef, ::Int)\nresize!(::T, ::Int)\n\nFor compatibility with existing Alphabets, the encoded data eltype must be UInt.\n\n\n\n\n\n","category":"type"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Some aliases for BioSequence are also provided for your convenience:","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"NucSeq\nAASeq","category":"page"},{"location":"types/#BioSequences.NucSeq","page":"BioSequences Types","title":"BioSequences.NucSeq","text":"An alias for BioSequence{<:NucleicAcidAlphabet}\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AASeq","page":"BioSequences Types","title":"BioSequences.AASeq","text":"An alias for BioSequence{AminoAcidAlphabet}\n\n\n\n\n\n","category":"type"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Let's have a closer look at some of those methods that a subtype of BioSequence must implement. Check out the Julia Base library docs for length, copy and resize!.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"encoded_data_eltype\nextract_encoded_element\nencoded_setindex!","category":"page"},{"location":"types/#BioSequences.encoded_data_eltype","page":"BioSequences Types","title":"BioSequences.encoded_data_eltype","text":"encoded_data_eltype(::Type{<:BioSequence})\n\nReturns the element type of the encoded data of the BioSequence. 
This is the return type of extract_encoded_element, i.e. the data type that stores the biological symbols in the biosequence.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/#BioSequences.extract_encoded_element","page":"BioSequences Types","title":"BioSequences.extract_encoded_element","text":"extract_encoded_element(::BioSequence{A}, i::Integer)\n\nReturns the encoded element at position i. This data can be decoded using decode(A(), data) to yield the element type of the biosequence.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/#BioSequences.encoded_setindex!","page":"BioSequences Types","title":"BioSequences.encoded_setindex!","text":"encoded_setindex!(seq::BioSequence, x::E, i::Integer)\n\nGiven encoded data x of type encoded_data_eltype(typeof(seq)), sets the internal sequence data at the given index.\n\nSee also: BioSequence \n\n\n\n\n\n","category":"function"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"For a correctly defined subtype of BioSequence that satisfies the interface, the vast majority of methods described in the rest of this manual should work out of the box. But they can always be overloaded if needed. Indeed, some of the generic BioSequence methods are overloaded for LongSequence, for example for transformation and counting operations where efficiency gains can be made due to the specific internal representation of a specific type.","category":"page"},{"location":"types/#The-abstract-Alphabet","page":"BioSequences Types","title":"The abstract Alphabet","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Alphabets control how biological symbols are encoded and decoded. 
They also confer many of the automatic traits and methods that any subtype of T<:BioSequence{A<:Alphabet} will get.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"BioSequences.Alphabet\nBioSequences.AsciiAlphabet","category":"page"},{"location":"types/#BioSequences.Alphabet","page":"BioSequences Types","title":"BioSequences.Alphabet","text":"Alphabet\n\nAlphabet is the most important type trait for BioSequence. An Alphabet represents a set of biological symbols encoded by a sequence, e.g. A, C, G and T for a DNA Alphabet that requires only 2 bits to represent each symbol.\n\nExtended help\n\nSubtypes of Alphabet are singleton structs that may or may not be parameterized.\nAlphabets span over a finite set of biological symbols.\nThe alphabet controls the encoding from some internal \"encoded data\" to a BioSymbol  of the alphabet's element type, as well as the decoding, the inverse process.\nAn Alphabet's encode method must not produce invalid data. \n\nRequired methods\n\nEvery subtype A of Alphabet must implement:\n\nBase.eltype(::Type{A})::Type{S} for some eltype S, which must be a BioSymbol.\nsymbols(::A)::Tuple{Vararg{S}}. 
This gives tuples of all symbols in the set of A.\nencode(::A, ::S)::E encodes a symbol to an internal data eltype E.\ndecode(::A, ::E)::S decodes an internal data eltype E to a symbol S.\nExcept for eltype which must follow Base conventions, all functions operating on Alphabet should operate on instances of the alphabet, not the type.\n\nIf you want interoperation with existing subtypes of BioSequence, the encoded representation E must be of type UInt, and you must also implement:\n\nBitsPerSymbol(::A)::BitsPerSymbol{N}, where the N must be zero or a power of two in [1, 2, 4, 8, 16, 32, [64 for 64-bit systems]].\n\nOptional methods\n\nBitsPerSymbol for compatibility with existing BioSequences\nAsciiAlphabet for increased printing/writing efficiency\ntryencode for fallible encoding.\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AsciiAlphabet","page":"BioSequences Types","title":"BioSequences.AsciiAlphabet","text":"AsciiAlphabet\n\nTrait for alphabet using ASCII characters as String representation. Define codetype(A) = AsciiAlphabet() for a user-defined Alphabet A to gain speed. Methods needed: BioSymbols.stringbyte(::eltype(A)) and ascii_encode(A, ::UInt8).\n\n\n\n\n\n","category":"type"},{"location":"types/#Concrete-types","page":"BioSequences Types","title":"Concrete types","text":"","category":"section"},{"location":"types/#Implemented-alphabets","page":"BioSequences Types","title":"Implemented alphabets","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"DNAAlphabet\nRNAAlphabet\nAminoAcidAlphabet","category":"page"},{"location":"types/#BioSequences.DNAAlphabet","page":"BioSequences Types","title":"BioSequences.DNAAlphabet","text":"DNA nucleotide alphabet.\n\nDNAAlphabet has a parameter N which is a number that determines the BitsPerSymbol trait. 
Currently supported values of N are 2 and 4.\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.RNAAlphabet","page":"BioSequences Types","title":"BioSequences.RNAAlphabet","text":"RNA nucleotide alphabet.\n\nRNAAlphabet has a parameter N which is a number that determines the BitsPerSymbol trait. Currently supported values of N are 2 and 4.\n\n\n\n\n\n","category":"type"},{"location":"types/#BioSequences.AminoAcidAlphabet","page":"BioSequences Types","title":"BioSequences.AminoAcidAlphabet","text":"Amino acid alphabet.\n\n\n\n\n\n","category":"type"},{"location":"types/#Long-Sequences","page":"BioSequences Types","title":"Long Sequences","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"LongSequence","category":"page"},{"location":"types/#BioSequences.LongSequence","page":"BioSequences Types","title":"BioSequences.LongSequence","text":"LongSequence{A <: Alphabet}\n\nGeneral-purpose BioSequence. This type is mutable and variable-length, and should be preferred for most use cases.\n\nExtended help\n\nLongSequence{A<:Alphabet} <: BioSequence{A} is parameterized by a concrete Alphabet type A that defines the domain (or set) of biological symbols permitted.\n\nAs the BioSequence interface definition implies, LongSequences store the biological symbol elements that they contain in a succinct encoded form that permits many operations to be done in an efficient bit-parallel manner. 
As per the interface of BioSequence, the Alphabet determines how an element is encoded or decoded when it is inserted or extracted from the sequence.\n\nFor example, AminoAcidAlphabet is associated with AminoAcid and hence an object of the LongSequence{AminoAcidAlphabet} type represents a sequence of amino acids.\n\nSymbols from multiple alphabets can't be intermixed in one sequence type.\n\nThe following table summarizes common LongSequence types that have been given aliases for convenience.\n\nType Symbol type Type alias\nLongSequence{DNAAlphabet{N}} DNA LongDNA{N}\nLongSequence{RNAAlphabet{N}} RNA LongRNA{N}\nLongSequence{AminoAcidAlphabet} AminoAcid LongAA\n\nThe LongDNA and LongRNA aliases use a DNAAlphabet{4}.\n\nDNAAlphabet{4} permits ambiguous nucleotides, and a sequence must use at least 4 bits to internally store each element (and indeed LongSequence does).\n\nIf you are sure that you are working with sequences with no ambiguous nucleotides, you can use LongSequences parameterised with DNAAlphabet{2} instead.\n\nDNAAlphabet{2} is an alphabet that uses two bits per base and limits to only unambiguous nucleotide symbols (A,C,G,T).\n\nChanging this single parameter, is all you need to do in order to benefit from memory savings. 
Some computations that use bitwise operations will also be dramatically faster.\n\nThe same applies with LongSequence{RNAAlphabet{4}}, simply replace the alphabet parameter with RNAAlphabet{2} in order to benefit.\n\n\n\n\n\n","category":"type"},{"location":"types/#Sequence-views","page":"BioSequences Types","title":"Sequence views","text":"","category":"section"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Similar to how Base Julia offers views of array objects, BioSequences offers views of LongSequences - the LongSubSeq{A<:Alphabet}.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"Conceptually, a LongSubSeq{A} is similar to a LongSequence{A}, but instead of storing its own data, it refers to the data of a LongSequence. Modifying the LongSequence will be reflected in the view, and vice versa. If the underlying LongSequence is truncated, the behaviour of a view is undefined. For the same reason, some operations are not supported for views, such as resizing.","category":"page"},{"location":"types/","page":"BioSequences Types","title":"BioSequences Types","text":"The purpose of LongSubSeq is that, since they only contain a pointer to the underlying array, an offset and a length, they are much lighter than LongSequences, and will be stack allocated on Julia 1.5 and newer. Thus, the user may construct millions of views without major performance implications.","category":"page"}]
 }
diff --git a/dev/sequence_search/index.html b/dev/sequence_search/index.html
index 3830b361..08ff77c2 100644
--- a/dev/sequence_search/index.html
+++ b/dev/sequence_search/index.html
@@ -50,7 +50,7 @@
 
 julia&gt; occursin(ExactSearchQuery(dna&quot;CNT&quot;, iscompatible), dna&quot;ACNT&quot;)
 true
-</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/search/ExactSearchQuery.jl#L1-L60">source</a></section></article><h2 id="Allowing-mismatches"><a class="docs-heading-anchor" href="#Allowing-mismatches">Allowing mismatches</a><a id="Allowing-mismatches-1"></a><a class="docs-heading-anchor-permalink" href="#Allowing-mismatches" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ApproximateSearchQuery" href="#BioSequences.ApproximateSearchQuery"><code>BioSequences.ApproximateSearchQuery</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">ApproximateSearchQuery{F&lt;:Function,S&lt;:BioSequence}</code></pre><p>Query type for approximate sequence search.</p><p>These queries are used as a predicate for the <code>Base.findnext</code>, <code>Base.findprev</code>, <code>Base.occursin</code>, <code>Base.findfirst</code>, and <code>Base.findlast</code> functions.</p><p>Using these functions with these queries allows you to search a given sequence for a sub-sequence, whilst allowing a specific number of errors.</p><p>In other words they find a subsequence of the target sequence within a specific <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a> of the query sequence.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACAGCGTAGCT&quot;;
+</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/search/ExactSearchQuery.jl#L1-L60">source</a></section></article><h2 id="Allowing-mismatches"><a class="docs-heading-anchor" href="#Allowing-mismatches">Allowing mismatches</a><a id="Allowing-mismatches-1"></a><a class="docs-heading-anchor-permalink" href="#Allowing-mismatches" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ApproximateSearchQuery" href="#BioSequences.ApproximateSearchQuery"><code>BioSequences.ApproximateSearchQuery</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">ApproximateSearchQuery{F&lt;:Function,S&lt;:BioSequence}</code></pre><p>Query type for approximate sequence search.</p><p>These queries are used as a predicate for the <code>Base.findnext</code>, <code>Base.findprev</code>, <code>Base.occursin</code>, <code>Base.findfirst</code>, and <code>Base.findlast</code> functions.</p><p>Using these functions with these queries allows you to search a given sequence for a sub-sequence, whilst allowing a specific number of errors.</p><p>In other words they find a subsequence of the target sequence within a specific <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a> of the query sequence.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACAGCGTAGCT&quot;;
 
 julia&gt; query = ApproximateSearchQuery(dna&quot;AGGG&quot;);
 
@@ -69,7 +69,7 @@
 
 julia&gt; findnext(query, 1, dna&quot;AAGNGG&quot;, 1) # 1 mismatch permitted (A vs G) &amp; matched N
 1:4
-</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>This method of searching for motifs was implemented with smaller query motifs in mind.</p><p>If you are looking to search for imperfect matches of longer sequences in this manner, you are likely better off using some kind of local-alignment algorithm or one of the BLAST variants.</p></div></div></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/search/ApproxSearchQuery.jl#L9-L66">source</a></section></article><h2 id="Searching-according-to-a-pattern"><a class="docs-heading-anchor" href="#Searching-according-to-a-pattern">Searching according to a pattern</a><a id="Searching-according-to-a-pattern-1"></a><a class="docs-heading-anchor-permalink" href="#Searching-according-to-a-pattern" title="Permalink"></a></h2><h3 id="Regular-expression-search"><a class="docs-heading-anchor" href="#Regular-expression-search">Regular expression search</a><a id="Regular-expression-search-1"></a><a class="docs-heading-anchor-permalink" href="#Regular-expression-search" title="Permalink"></a></h3><p>Query patterns can be described in regular expressions. The syntax supports a subset of Perl and PROSITE&#39;s notation.</p><p>Biological regexes can be constructed using the <code>BioRegex</code> constructor, for example by doing <code>BioRegex{AminoAcid}(&quot;MV+&quot;)</code>. For bioregex literals, it is instead recommended using the <code>@biore_str</code> macro:</p><p>The Perl-like syntax starts with <code>biore</code> (BIOlogical REgular expression) and ends with a symbol option: &quot;dna&quot;, &quot;rna&quot; or &quot;aa&quot;. For example, <code>biore&quot;A+&quot;dna</code> is a regular expression for DNA sequences and <code>biore&quot;A+&quot;aa</code> is for amino acid sequences. 
The symbol options can be abbreviated to its first character: &quot;d&quot;, &quot;r&quot; or &quot;a&quot;, respectively.</p><p>Here are examples of using the regular expression for <code>BioSequence</code>s:</p><pre><code class="language-julia-repl hljs">julia&gt; match(biore&quot;A+C*&quot;dna, dna&quot;AAAACC&quot;)
+</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>This method of searching for motifs was implemented with smaller query motifs in mind.</p><p>If you are looking to search for imperfect matches of longer sequences in this manner, you are likely better off using some kind of local-alignment algorithm or one of the BLAST variants.</p></div></div></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/search/ApproxSearchQuery.jl#L9-L66">source</a></section></article><h2 id="Searching-according-to-a-pattern"><a class="docs-heading-anchor" href="#Searching-according-to-a-pattern">Searching according to a pattern</a><a id="Searching-according-to-a-pattern-1"></a><a class="docs-heading-anchor-permalink" href="#Searching-according-to-a-pattern" title="Permalink"></a></h2><h3 id="Regular-expression-search"><a class="docs-heading-anchor" href="#Regular-expression-search">Regular expression search</a><a id="Regular-expression-search-1"></a><a class="docs-heading-anchor-permalink" href="#Regular-expression-search" title="Permalink"></a></h3><p>Query patterns can be described in regular expressions. The syntax supports a subset of Perl and PROSITE&#39;s notation.</p><p>Biological regexes can be constructed using the <code>BioRegex</code> constructor, for example by doing <code>BioRegex{AminoAcid}(&quot;MV+&quot;)</code>. For bioregex literals, it is instead recommended using the <code>@biore_str</code> macro:</p><p>The Perl-like syntax starts with <code>biore</code> (BIOlogical REgular expression) and ends with a symbol option: &quot;dna&quot;, &quot;rna&quot; or &quot;aa&quot;. For example, <code>biore&quot;A+&quot;dna</code> is a regular expression for DNA sequences and <code>biore&quot;A+&quot;aa</code> is for amino acid sequences. 
The symbol options can each be abbreviated to their first character: &quot;d&quot;, &quot;r&quot; or &quot;a&quot;, respectively.</p><p>Here are examples of using regular expressions with <code>BioSequence</code>s:</p><pre><code class="language-julia-repl hljs">julia&gt; match(biore&quot;A+C*&quot;dna, dna&quot;AAAACC&quot;)
 RegexMatch(&quot;AAAACC&quot;)
 
 julia&gt; match(biore&quot;A+C*&quot;d, dna&quot;AAAACC&quot;)
@@ -159,4 +159,4 @@
 3-element Vector{Int64}:
  3
  5
- 9</code></pre><p>[Wasserman2004]: https://doi.org/10.1038/nrg1315</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../random/">« Random sequences</a><a class="docs-footer-nextpage" href="../counting/">Counting »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+ 9</code></pre><p>[Wasserman2004]: https://doi.org/10.1038/nrg1315</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../random/">« Random sequences</a><a class="docs-footer-nextpage" href="../counting/">Counting »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/symbols/index.html b/dev/symbols/index.html
index 09f067d6..94736969 100644
--- a/dev/symbols/index.html
+++ b/dev/symbols/index.html
@@ -70,4 +70,4 @@
 
 julia&gt; iscompatible(DNA_C, DNA_R)  # DNA_R (A or G) cannot be DNA_C
 false
-</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/BioSymbols.jl#L260-L285">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSymbols.isambiguous" href="#BioSymbols.isambiguous"><code>BioSymbols.isambiguous</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">isambiguous(nt::NucleicAcid)</code></pre><p>Test if <code>nt</code> is an ambiguous nucleotide.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/nucleicacid.jl#L390-L394">source</a></section><section><div><pre><code class="language-julia hljs">isambiguous(aa::AminoAcid)</code></pre><p>Test if <code>aa</code> is an ambiguous amino acid.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/aminoacid.jl#L179-L183">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« Home</a><a class="docs-footer-nextpage" href="../types/">BioSequences Types »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option 
value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/BioSymbols.jl#L260-L285">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSymbols.isambiguous" href="#BioSymbols.isambiguous"><code>BioSymbols.isambiguous</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">isambiguous(nt::NucleicAcid)</code></pre><p>Test if <code>nt</code> is an ambiguous nucleotide.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/nucleicacid.jl#L390-L394">source</a></section><section><div><pre><code class="language-julia hljs">isambiguous(aa::AminoAcid)</code></pre><p>Test if <code>aa</code> is an ambiguous amino acid.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/aminoacid.jl#L179-L183">source</a></section></article></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« Home</a><a class="docs-footer-nextpage" href="../types/">BioSequences Types »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option 
value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/transforms/index.html b/dev/transforms/index.html
index 780eee1f..42e4376e 100644
--- a/dev/transforms/index.html
+++ b/dev/transforms/index.html
@@ -15,7 +15,7 @@
 
 julia&gt; seq[5] = DNA_A
 DNA_A
-</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Some types such can be indexed using integers but not using ranges.</p></div></div><p>For <code>LongSequence</code> types, indexing a sequence by range creates a copy of the original sequence, similar to <code>Array</code> in Julia&#39;s <code>Base</code> library. If you find yourself slowed down by the allocation of these subsequences, consider using a sequence view instead.</p><h2 id="Modifying-sequences"><a class="docs-heading-anchor" href="#Modifying-sequences">Modifying sequences</a><a id="Modifying-sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Modifying-sequences" title="Permalink"></a></h2><p>In addition to <code>setindex</code>, many other modifying operations are possible for biological sequences such as <code>push!</code>, <code>pop!</code>, and <code>insert!</code>, which should be familiar to anyone used to editing arrays.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.push!-Tuple{BioSequence, Any}" href="#Base.push!-Tuple{BioSequence, Any}"><code>Base.push!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">push!(seq::BioSequence, x)</code></pre><p>Append a biological symbol <code>x</code> to a biological sequence <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L16-L20">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a 
class="docstring-binding" id="Base.pop!-Tuple{BioSequence}" href="#Base.pop!-Tuple{BioSequence}"><code>Base.pop!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">pop!(seq::BioSequence)</code></pre><p>Remove the symbol from the end of a biological sequence <code>seq</code> and return it. Returns a variable of <code>eltype(seq)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L28-L33">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.pushfirst!-Tuple{BioSequence, Any}" href="#Base.pushfirst!-Tuple{BioSequence, Any}"><code>Base.pushfirst!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">pushfirst!(seq, x)</code></pre><p>Insert a biological symbol <code>x</code> at the beginning of a biological sequence <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L113-L117">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.popfirst!-Tuple{BioSequence}" href="#Base.popfirst!-Tuple{BioSequence}"><code>Base.popfirst!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 
docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">popfirst!(seq)</code></pre><p>Remove the symbol from the beginning of a biological sequence <code>seq</code> and return it. Returns a variable of <code>eltype(seq)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L98-L103">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.insert!-Tuple{BioSequence, Integer, Any}" href="#Base.insert!-Tuple{BioSequence, Integer, Any}"><code>Base.insert!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">insert!(seq::BioSequence, i, x)</code></pre><p>Insert a biological symbol <code>x</code> into a biological sequence <code>seq</code>, at the given index <code>i</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L43-L48">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.deleteat!-Tuple{BioSequence, Integer}" href="#Base.deleteat!-Tuple{BioSequence, Integer}"><code>Base.deleteat!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">deleteat!(seq::BioSequence, 
i::Integer)</code></pre><p>Delete a biological symbol at a single position <code>i</code> in a biological sequence <code>seq</code>.</p><p>Modifies the input sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L71-L78">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.append!-Tuple{BioSequence, BioSequence}" href="#Base.append!-Tuple{BioSequence, BioSequence}"><code>Base.append!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">append!(seq, other)</code></pre><p>Add a biological sequence <code>other</code> onto the end of biological sequence <code>seq</code>. Modifies and returns <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L86-L91">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.resize!-Tuple{LongSequence, Integer}" href="#Base.resize!-Tuple{LongSequence, Integer}"><code>Base.resize!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">resize!(seq, size, [force::Bool=false])</code></pre><p>Resize a biological sequence <code>seq</code>, to a given <code>size</code>. 
Does not resize the underlying data array unless the new size does not fit. If <code>force</code>, always resize underlying data array.</p><p>Note that resizing to a larger size, and then loading from uninitialized positions is not allowed and may cause undefined behaviour.  Make sure to always fill any uninitialized biosymbols after resizing.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/transformations.jl#L5-L14">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.empty!-Tuple{BioSequence}" href="#Base.empty!-Tuple{BioSequence}"><code>Base.empty!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">empty!(seq::BioSequence)</code></pre><p>Completely empty a biological sequence <code>seq</code> of nucleotides.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L9-L13">source</a></section></article><p>Here are some examples:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACG&quot;
+</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Some types can be indexed using integers but not using ranges.</p></div></div><p>For <code>LongSequence</code> types, indexing a sequence by range creates a copy of the original sequence, similar to <code>Array</code> in Julia&#39;s <code>Base</code> library. If you find yourself slowed down by the allocation of these subsequences, consider using a sequence view instead.</p><h2 id="Modifying-sequences"><a class="docs-heading-anchor" href="#Modifying-sequences">Modifying sequences</a><a id="Modifying-sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Modifying-sequences" title="Permalink"></a></h2><p>In addition to <code>setindex!</code>, many other modifying operations are possible for biological sequences, such as <code>push!</code>, <code>pop!</code>, and <code>insert!</code>, which should be familiar to anyone used to editing arrays.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.push!-Tuple{BioSequence, Any}" href="#Base.push!-Tuple{BioSequence, Any}"><code>Base.push!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">push!(seq::BioSequence, x)</code></pre><p>Append a biological symbol <code>x</code> to a biological sequence <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L16-L20">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a 
class="docstring-binding" id="Base.pop!-Tuple{BioSequence}" href="#Base.pop!-Tuple{BioSequence}"><code>Base.pop!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">pop!(seq::BioSequence)</code></pre><p>Remove the symbol from the end of a biological sequence <code>seq</code> and return it. Returns a variable of <code>eltype(seq)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L28-L33">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.pushfirst!-Tuple{BioSequence, Any}" href="#Base.pushfirst!-Tuple{BioSequence, Any}"><code>Base.pushfirst!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">pushfirst!(seq, x)</code></pre><p>Insert a biological symbol <code>x</code> at the beginning of a biological sequence <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L113-L117">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.popfirst!-Tuple{BioSequence}" href="#Base.popfirst!-Tuple{BioSequence}"><code>Base.popfirst!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 
docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">popfirst!(seq)</code></pre><p>Remove the symbol from the beginning of a biological sequence <code>seq</code> and return it. Returns a variable of <code>eltype(seq)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L98-L103">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.insert!-Tuple{BioSequence, Integer, Any}" href="#Base.insert!-Tuple{BioSequence, Integer, Any}"><code>Base.insert!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">insert!(seq::BioSequence, i, x)</code></pre><p>Insert a biological symbol <code>x</code> into a biological sequence <code>seq</code>, at the given index <code>i</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L43-L48">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.deleteat!-Tuple{BioSequence, Integer}" href="#Base.deleteat!-Tuple{BioSequence, Integer}"><code>Base.deleteat!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">deleteat!(seq::BioSequence, 
i::Integer)</code></pre><p>Delete a biological symbol at a single position <code>i</code> in a biological sequence <code>seq</code>.</p><p>Modifies the input sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L71-L78">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.append!-Tuple{BioSequence, BioSequence}" href="#Base.append!-Tuple{BioSequence, BioSequence}"><code>Base.append!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">append!(seq, other)</code></pre><p>Add a biological sequence <code>other</code> onto the end of biological sequence <code>seq</code>. Modifies and returns <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L86-L91">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.resize!-Tuple{LongSequence, Integer}" href="#Base.resize!-Tuple{LongSequence, Integer}"><code>Base.resize!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">resize!(seq, size, [force::Bool=false])</code></pre><p>Resize a biological sequence <code>seq</code>, to a given <code>size</code>. 
Does not resize the underlying data array unless the new size does not fit. If <code>force</code>, always resize underlying data array.</p><p>Note that resizing to a larger size, and then loading from uninitialized positions is not allowed and may cause undefined behaviour.  Make sure to always fill any uninitialized biosymbols after resizing.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/transformations.jl#L5-L14">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.empty!-Tuple{BioSequence}" href="#Base.empty!-Tuple{BioSequence}"><code>Base.empty!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">empty!(seq::BioSequence)</code></pre><p>Completely empty a biological sequence <code>seq</code> of nucleotides.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L9-L13">source</a></section></article><p>Here are some examples:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACG&quot;
 3nt DNA Sequence:
 ACG
 
@@ -34,7 +34,7 @@
 julia&gt; deleteat!(seq, 2:3)
 3nt DNA Sequence:
 AAT
-</code></pre><h3 id="Additional-transformations"><a class="docs-heading-anchor" href="#Additional-transformations">Additional transformations</a><a id="Additional-transformations-1"></a><a class="docs-heading-anchor-permalink" href="#Additional-transformations" title="Permalink"></a></h3><p>In addition to these basic modifying functions, other sequence transformations that are common in bioinformatics are also provided.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.reverse!-Tuple{LongSequence}" href="#Base.reverse!-Tuple{LongSequence}"><code>Base.reverse!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse!(seq::LongSequence)</code></pre><p>Reverse a biological sequence <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/transformations.jl#L27-L31">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.reverse-Tuple{LongSequence{&lt;:NucleicAcidAlphabet}}" href="#Base.reverse-Tuple{LongSequence{&lt;:NucleicAcidAlphabet}}"><code>Base.reverse</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse(seq::BioSequence)</code></pre><p>Create reversed copy of a biological sequence.</p></div><a class="docs-sourcelink" target="_blank" 
href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L155-L159">source</a></section><section><div><pre><code class="language-julia hljs">reverse(seq::LongSequence)</code></pre><p>Create reversed copy of a biological sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/transformations.jl#L34-L38">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.complement!" href="#BioSequences.complement!"><code>BioSequences.complement!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">complement!(seq)</code></pre><p>Make a complement sequence of <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/transformations.jl#L115-L119">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSymbols.complement" href="#BioSymbols.complement"><code>BioSymbols.complement</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">complement(nt::NucleicAcid)</code></pre><p>Return the complementary nucleotide of <code>nt</code>.</p><p>This function returns the union of all possible complementary 
nucleotides.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; complement(DNA_A)
+</code></pre><h3 id="Additional-transformations"><a class="docs-heading-anchor" href="#Additional-transformations">Additional transformations</a><a id="Additional-transformations-1"></a><a class="docs-heading-anchor-permalink" href="#Additional-transformations" title="Permalink"></a></h3><p>In addition to these basic modifying functions, other sequence transformations that are common in bioinformatics are also provided.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.reverse!-Tuple{LongSequence}" href="#Base.reverse!-Tuple{LongSequence}"><code>Base.reverse!</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse!(seq::LongSequence)</code></pre><p>Reverse a biological sequence <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/transformations.jl#L27-L31">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="Base.reverse-Tuple{LongSequence{&lt;:NucleicAcidAlphabet}}" href="#Base.reverse-Tuple{LongSequence{&lt;:NucleicAcidAlphabet}}"><code>Base.reverse</code></a> — <span class="docstring-category">Method</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse(seq::BioSequence)</code></pre><p>Create reversed copy of a biological sequence.</p></div><a class="docs-sourcelink" target="_blank" 
href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L155-L159">source</a></section><section><div><pre><code class="language-julia hljs">reverse(seq::LongSequence)</code></pre><p>Create reversed copy of a biological sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/transformations.jl#L34-L38">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.complement!" href="#BioSequences.complement!"><code>BioSequences.complement!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">complement!(seq)</code></pre><p>Make a complement sequence of <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/transformations.jl#L115-L119">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSymbols.complement" href="#BioSymbols.complement"><code>BioSymbols.complement</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">complement(nt::NucleicAcid)</code></pre><p>Return the complementary nucleotide of <code>nt</code>.</p><p>This function returns the union of all possible complementary 
nucleotides.</p><p><strong>Examples</strong></p><pre><code class="language-julia-repl hljs">julia&gt; complement(DNA_A)
 DNA_T
 
 julia&gt; complement(DNA_N)
@@ -42,10 +42,10 @@
 
 julia&gt; complement(RNA_U)
 RNA_A
-</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/nucleicacid.jl#L408-L429">source</a></section><section><div><pre><code class="language-julia hljs">complement(seq)</code></pre><p>Make a complement sequence of <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L171-L175">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.reverse_complement!" href="#BioSequences.reverse_complement!"><code>BioSequences.reverse_complement!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse_complement!(seq)</code></pre><p>Make a reversed complement sequence of <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L182-L186">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.reverse_complement" href="#BioSequences.reverse_complement"><code>BioSequences.reverse_complement</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse_complement(seq)</code></pre><p>Make a reversed complement sequence of 
<code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L191-L195">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ungap!" href="#BioSequences.ungap!"><code>BioSequences.ungap!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Remove gap characters from an input sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L243">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ungap" href="#BioSequences.ungap"><code>BioSequences.ungap</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Create a copy of a sequence with gap characters removed.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L240">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.canonical!" 
href="#BioSequences.canonical!"><code>BioSequences.canonical!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">canonical!(seq::NucleotideSeq)</code></pre><p>Transforms the <code>seq</code> into its canonical form, if it is not already canonical. Modifies the input sequence inplace.</p><p>For any sequence, there is a reverse complement, which is the same sequence, but on the complimentary strand of DNA:</p><pre><code class="nohighlight hljs">-------&gt;
+</code></pre></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSymbols.jl/blob/v5.1.3/src/nucleicacid.jl#L408-L429">source</a></section><section><div><pre><code class="language-julia hljs">complement(seq)</code></pre><p>Make a complement sequence of <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L171-L175">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.reverse_complement!" href="#BioSequences.reverse_complement!"><code>BioSequences.reverse_complement!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse_complement!(seq)</code></pre><p>Make a reversed complement sequence of <code>seq</code> in place.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L182-L186">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.reverse_complement" href="#BioSequences.reverse_complement"><code>BioSequences.reverse_complement</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">reverse_complement(seq)</code></pre><p>Make a reversed complement sequence of 
<code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L191-L195">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ungap!" href="#BioSequences.ungap!"><code>BioSequences.ungap!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Remove gap characters from an input sequence.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L243">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ungap" href="#BioSequences.ungap"><code>BioSequences.ungap</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Create a copy of a sequence with gap characters removed.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L240">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.canonical!" 
href="#BioSequences.canonical!"><code>BioSequences.canonical!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">canonical!(seq::NucleotideSeq)</code></pre><p>Transforms <code>seq</code> into its canonical form, if it is not already canonical. Modifies the input sequence in place.</p><p>For any sequence, there is a reverse complement, which is the same sequence, but on the complementary strand of DNA:</p><pre><code class="nohighlight hljs">-------&gt;
 ATCGATCG
 CGATCGAT
-&lt;-------</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Using the <a href="#BioSequences.reverse_complement"><code>reverse_complement</code></a> of a DNA sequence will give give this reverse complement.</p></div></div><p>Of the two sequences, the <em>canonical</em> of the two sequences is the lesser of the two i.e. <code>canonical_seq &lt; other_seq</code>.</p><p>Using this function on a <code>seq</code> will ensure it is the canonical version.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L200-L224">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.canonical" href="#BioSequences.canonical"><code>BioSequences.canonical</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">canonical(seq::NucleotideSeq)</code></pre><p>Create the canonical sequence of <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/transformations.jl#L232-L237">source</a></section></article><p>Some examples:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACGTAT&quot;
+&lt;-------</code></pre><div class="admonition is-info"><header class="admonition-header">Note</header><div class="admonition-body"><p>Calling <a href="#BioSequences.reverse_complement"><code>reverse_complement</code></a> on a DNA sequence will give this reverse complement.</p></div></div><p>Of the two sequences, the <em>canonical</em> one is the lesser of the two, i.e. <code>canonical_seq &lt; other_seq</code>.</p><p>Using this function on a <code>seq</code> will ensure it is the canonical version.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L200-L224">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.canonical" href="#BioSequences.canonical"><code>BioSequences.canonical</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">canonical(seq::NucleotideSeq)</code></pre><p>Create the canonical sequence of <code>seq</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/transformations.jl#L232-L237">source</a></section></article><p>Some examples:</p><pre><code class="language-julia-repl hljs">julia&gt; seq = dna&quot;ACGTAT&quot;
 6nt DNA Sequence:
 ACGTAT
 
@@ -60,7 +60,7 @@
 julia&gt; reverse_complement!(seq)
 6nt DNA Sequence:
 ACGTAT
-</code></pre><p>Many of these methods also have a version which makes a copy of the input sequence, so you get a modified copy, and don&#39;t alter the original sequence. Such methods are named the same, but without the exclamation mark. E.g. <code>reverse</code> instead of <code>reverse!</code>, and <code>ungap</code> instead of <code>ungap!</code>.  </p><h4 id="Translation"><a class="docs-heading-anchor" href="#Translation">Translation</a><a id="Translation-1"></a><a class="docs-heading-anchor-permalink" href="#Translation" title="Permalink"></a></h4><p>Translation is a slightly more complex transformation for RNA Sequences and so we describe it here in more detail.</p><p>The <a href="#BioSequences.translate"><code>translate</code></a> function translates a sequence of codons in a RNA sequence to a amino acid sequence based on a genetic code. The <code>BioSequences</code> package provides all NCBI defined genetic codes and they are registered in <a href="#BioSequences.ncbi_trans_table"><code>ncbi_trans_table</code></a>.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.translate" href="#BioSequences.translate"><code>BioSequences.translate</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">translate(seq, code=standard_genetic_code, allow_ambiguous_codons=true, alternative_start=false)</code></pre><p>Translate an <code>LongRNA</code> or a <code>LongDNA</code> to an <code>LongAA</code>.</p><p>Translation uses genetic code <code>code</code> to map codons to amino acids. See <code>ncbi_trans_table</code> for available genetic codes. 
If codons in the given sequence cannot determine a unique amino acid, they will be translated to <code>AA_X</code> if <code>allow_ambiguous_codons</code> is <code>true</code> and otherwise result in an error. For organisms that utilize alternative start codons, one can set <code>alternative_start=true</code>, in which case the first codon will always be converted to a methionine.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/geneticcode.jl#L326-L338">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ncbi_trans_table" href="#BioSequences.ncbi_trans_table"><code>BioSequences.ncbi_trans_table</code></a> — <span class="docstring-category">Constant</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Genetic code list of NCBI.</p><p>The standard genetic code is <code>ncbi_trans_table[1]</code> and others can be shown by <code>show(ncbi_trans_table)</code>. For more details, consult the next link: http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/geneticcode.jl#L95-L102">source</a></section></article><pre><code class="language-julia-repl hljs">julia&gt; ncbi_trans_table
+</code></pre><p>Many of these methods also have a version which makes a copy of the input sequence, so you get a modified copy, and don&#39;t alter the original sequence. Such methods are named the same, but without the exclamation mark. E.g. <code>reverse</code> instead of <code>reverse!</code>, and <code>ungap</code> instead of <code>ungap!</code>.  </p><h4 id="Translation"><a class="docs-heading-anchor" href="#Translation">Translation</a><a id="Translation-1"></a><a class="docs-heading-anchor-permalink" href="#Translation" title="Permalink"></a></h4><p>Translation is a slightly more complex transformation for RNA sequences and so we describe it here in more detail.</p><p>The <a href="#BioSequences.translate"><code>translate</code></a> function translates a sequence of codons in an RNA sequence to an amino acid sequence based on a genetic code. The <code>BioSequences</code> package provides all NCBI-defined genetic codes, and they are registered in <a href="#BioSequences.ncbi_trans_table"><code>ncbi_trans_table</code></a>.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.translate" href="#BioSequences.translate"><code>BioSequences.translate</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">translate(seq, code=standard_genetic_code, allow_ambiguous_codons=true, alternative_start=false)</code></pre><p>Translate a <code>LongRNA</code> or a <code>LongDNA</code> to a <code>LongAA</code>.</p><p>Translation uses genetic code <code>code</code> to map codons to amino acids. See <code>ncbi_trans_table</code> for available genetic codes. 
If codons in the given sequence cannot determine a unique amino acid, they will be translated to <code>AA_X</code> if <code>allow_ambiguous_codons</code> is <code>true</code> and will otherwise result in an error. For organisms that utilize alternative start codons, one can set <code>alternative_start=true</code>, in which case the first codon will always be converted to a methionine.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/geneticcode.jl#L326-L338">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.ncbi_trans_table" href="#BioSequences.ncbi_trans_table"><code>BioSequences.ncbi_trans_table</code></a> — <span class="docstring-category">Constant</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Genetic code list of NCBI.</p><p>The standard genetic code is <code>ncbi_trans_table[1]</code> and others can be shown by <code>show(ncbi_trans_table)</code>. For more details, see http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/geneticcode.jl#L95-L102">source</a></section></article><pre><code class="language-julia-repl hljs">julia&gt; ncbi_trans_table
 Translation Tables:
   1. The Standard Code (standard_genetic_code)
   2. The Vertebrate Mitochondrial Code (vertebrate_mitochondrial_genetic_code)
@@ -81,4 +81,4 @@
  23. Thraustochytrium Mitochondrial Code (thraustochytrium_mitochondrial_genetic_code)
  24. Pterobranchia Mitochondrial Code (pterobrachia_mitochondrial_genetic_code)
  25. Candidate Division SR1 and Gracilibacteria Code (candidate_division_sr1_genetic_code)
-</code></pre><p><a href="https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes">https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes</a></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../construction/">« Constructing sequences</a><a class="docs-footer-nextpage" href="../predicates/">Predicates »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+</code></pre><p><a href="https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes">https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=cgencodes</a></p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../construction/">« Constructing sequences</a><a class="docs-footer-nextpage" href="../predicates/">Predicates »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/types/index.html b/dev/types/index.html
index f18782e3..17dc824f 100644
--- a/dev/types/index.html
+++ b/dev/types/index.html
@@ -1,2 +1,2 @@
 <!DOCTYPE html>
-<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>BioSequences Types · BioSequences.jl</title><meta name="title" content="BioSequences Types · BioSequences.jl"/><meta property="og:title" content="BioSequences Types · BioSequences.jl"/><meta property="twitter:title" content="BioSequences Types · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" 
data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li class="is-active"><a class="tocitem" href>BioSequences Types</a><ul class="internal"><li><a class="tocitem" href="#The-abstract-BioSequence"><span>The abstract BioSequence</span></a></li><li><a class="tocitem" href="#The-abstract-Alphabet"><span>The abstract Alphabet</span></a></li><li class="toplevel"><a class="tocitem" href="#Concrete-types"><span>Concrete types</span></a></li><li><a class="tocitem" href="#Implemented-alphabets"><span>Implemented alphabets</span></a></li><li><a class="tocitem" href="#Long-Sequences"><span>Long Sequences</span></a></li><li><a class="tocitem" href="#Sequence-views"><span>Sequence views</span></a></li></ul></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a 
class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="../predicates/">Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li><a class="tocitem" href="../io/">I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>BioSequences Types</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>BioSequences Types</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/types.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" 
id="documenter-page"><h1 id="Abstract-Types"><a class="docs-heading-anchor" href="#Abstract-Types">Abstract Types</a><a id="Abstract-Types-1"></a><a class="docs-heading-anchor-permalink" href="#Abstract-Types" title="Permalink"></a></h1><p>BioSequences exports an abstract <code>BioSequence</code> type, and several concrete sequence types which inherit from it.</p><h2 id="The-abstract-BioSequence"><a class="docs-heading-anchor" href="#The-abstract-BioSequence">The abstract BioSequence</a><a id="The-abstract-BioSequence-1"></a><a class="docs-heading-anchor-permalink" href="#The-abstract-BioSequence" title="Permalink"></a></h2><p>BioSequences provides an abstract type called a <code>BioSequence{A&lt;:Alphabet}</code>. This abstract type, and the methods and traits it supports, allows for many algorithms in BioSequences to be written as generically as possible, thus reducing the amount of code to read and understand, whilst maintaining high performance when such code is compiled for a concrete BioSequence subtype. Additionally, it allows new types to be implemented that are fully compatible with the rest of BioSequences, provided that key methods or traits are defined.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.BioSequence" href="#BioSequences.BioSequence"><code>BioSequences.BioSequence</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">BioSequence{A &lt;: Alphabet}</code></pre><p><code>BioSequence</code> is the main abstract type of <code>BioSequences</code>. 
It abstracts over the internal representation of different biological sequences, and is parameterized by an <code>Alphabet</code>, which controls the element type.</p><p><strong>Extended help</strong></p><p>Its subtypes are characterized by:</p><ul><li>Being a linear container type with random access and indices <code>Base.OneTo(length(x))</code>.</li><li>Containing zero or more internal data elements of type <code>encoded_data_eltype(typeof(x))</code>.</li><li>Being associated with an <code>Alphabet</code>, <code>A</code>, by being a subtype of <code>BioSequence{A}</code>.</li></ul><p>A <code>BioSequence{A}</code> is indexed by an integer. The biosequence subtype, the index and the alphabet <code>A</code> determine how to extract the internal encoded data. The alphabet decides how to decode the data to the element type of the biosequence. Hence, the element type and container type of a <code>BioSequence</code> are separated.</p><p>Subtypes <code>T</code> of <code>BioSequence</code> must implement the following, with <code>E</code> being an encoded data type:</p><ul><li><code>Base.length(::T)::Int</code></li><li><code>encoded_data_eltype(::Type{T})::Type{E}</code></li><li><code>extract_encoded_element(::T, ::Integer)::E</code></li><li><code>copy(::T)</code></li><li>T must be able to be constructed from any iterable with <code>length</code> defined and with a known, compatible element type.</li></ul><p>Furthermore, mutable sequences should implement</p><ul><li><code>encoded_setindex!(::T, ::E, ::Integer)</code></li><li><code>T(undef, ::Int)</code></li><li><code>resize!(::T, ::Int)</code></li></ul><p>For compatibility with existing <code>Alphabet</code>s, the encoded data eltype must be <code>UInt</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L7-L41">source</a></section></article><p>Some aliases for <code>BioSequence</code> are 
also provided for your convenience:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.NucSeq" href="#BioSequences.NucSeq"><code>BioSequences.NucSeq</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>An alias for <code>BioSequence{&lt;:NucleicAcidAlphabet}</code></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L231">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AASeq" href="#BioSequences.AASeq"><code>BioSequences.AASeq</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>An alias for <code>BioSequence{AminoAcidAlphabet}</code></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L240-L242">source</a></section></article><p>Let&#39;s have a closer look at some of those methods that a subtype of <code>BioSequence</code> must implement. 
Check out julia base library docs for <code>length</code>, <code>copy</code> and <code>resize!</code>.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.encoded_data_eltype" href="#BioSequences.encoded_data_eltype"><code>BioSequences.encoded_data_eltype</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">encoded_data_eltype(::Type{&lt;:BioSequence})</code></pre><p>Returns the element type of the encoded data of the <code>BioSequence</code>. This is the return type of <code>extract_encoded_element</code>, i.e. the data type that stores the biological symbols in the biosequence.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L192-L200">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.extract_encoded_element" href="#BioSequences.extract_encoded_element"><code>BioSequences.extract_encoded_element</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">extract_encoded_element(::BioSequence{A}, i::Integer)</code></pre><p>Returns the encoded element at position <code>i</code>. 
This data can be decoded using <code>decode(A(), data)</code> to yield the element type of the biosequence.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L203-L211">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.encoded_setindex!" href="#BioSequences.encoded_setindex!"><code>BioSequences.encoded_setindex!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">encoded_setindex!(seq::BioSequence, x::E, i::Integer)</code></pre><p>Given encoded data <code>x</code> of type <code>encoded_data_eltype(typeof(seq))</code>, sets the internal sequence data at the given index.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/biosequence/biosequence.jl#L215-L222">source</a></section></article><p>A correctly defined subtype of <code>BioSequence</code> that satisfies the interface, will find the vast majority of methods described in the rest of this manual should work out of the box for that type. But they can always be overloaded if needed. 
Indeed, some of the generic <code>BioSequence</code> methods are overloaded for <code>LongSequence</code>, for example for transformation and counting operations where efficiency gains can be made due to the specific internal representation of that type.</p><h2 id="The-abstract-Alphabet"><a class="docs-heading-anchor" href="#The-abstract-Alphabet">The abstract Alphabet</a><a id="The-abstract-Alphabet-1"></a><a class="docs-heading-anchor-permalink" href="#The-abstract-Alphabet" title="Permalink"></a></h2><p>Alphabets control how biological symbols are encoded and decoded. They also confer many of the automatic traits and methods that any subtype <code>T&lt;:BioSequence{A&lt;:Alphabet}</code> will get.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.Alphabet" href="#BioSequences.Alphabet"><code>BioSequences.Alphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">Alphabet</code></pre><p><code>Alphabet</code> is the most important type trait for <code>BioSequence</code>. An <code>Alphabet</code> represents a set of biological symbols encoded by a sequence, e.g. 
A, C, G and T for a DNA Alphabet that requires only 2 bits to represent each symbol.</p><p><strong>Extended help</strong></p><ul><li>Subtypes of Alphabet are singleton structs that may or may not be parameterized.</li><li>Alphabets span over a <em>finite</em> set of biological symbols.</li><li>The alphabet controls the encoding from some internal &quot;encoded data&quot; to a BioSymbol  of the alphabet&#39;s element type, as well as the decoding, the inverse process.</li><li>An <code>Alphabet</code>&#39;s <code>encode</code> method must not produce invalid data. </li></ul><p>Every subtype <code>A</code> of <code>Alphabet</code> must implement:</p><ul><li><code>Base.eltype(::Type{A})::Type{S}</code> for some eltype <code>S</code>, which must be a <code>BioSymbol</code>.</li><li><code>symbols(::A)::Tuple{Vararg{S}}</code>. This gives tuples of all symbols in the set of <code>A</code>.</li><li><code>encode(::A, ::S)::E</code> encodes a symbol to an internal data eltype <code>E</code>.</li><li><code>decode(::A, ::E)::S</code> decodes an internal data eltype <code>E</code> to a symbol <code>S</code>.</li><li>Except for <code>eltype</code> which must follow Base conventions, all functions operating on <code>Alphabet</code> should operate on instances of the alphabet, not the type.</li></ul><p>If you want interoperation with existing subtypes of <code>BioSequence</code>, the encoded representation <code>E</code> must be of type <code>UInt</code>, and you must also implement:</p><ul><li><code>BitsPerSymbol(::A)::BitsPerSymbol{N}</code>, where the <code>N</code> must be zero or a power of two in [1, 2, 4, 8, 16, 32, [64 for 64-bit systems]].</li></ul><p>For increased performance, see <a href="#BioSequences.AsciiAlphabet"><code>BioSequences.AsciiAlphabet</code></a></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L10-L38">source</a></section></article><article 
class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AsciiAlphabet" href="#BioSequences.AsciiAlphabet"><code>BioSequences.AsciiAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">AsciiAlphabet</code></pre><p>Trait for alphabet using ASCII characters as String representation. Define <code>codetype(A) = AsciiAlphabet()</code> for a user-defined <code>Alphabet</code> A to gain speed. Methods needed: <code>BioSymbols.stringbyte(::eltype(A))</code> and <code>ascii_encode(A, ::UInt8)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L250-L256">source</a></section></article><h1 id="Concrete-types"><a class="docs-heading-anchor" href="#Concrete-types">Concrete types</a><a id="Concrete-types-1"></a><a class="docs-heading-anchor-permalink" href="#Concrete-types" title="Permalink"></a></h1><h2 id="Implemented-alphabets"><a class="docs-heading-anchor" href="#Implemented-alphabets">Implemented alphabets</a><a id="Implemented-alphabets-1"></a><a class="docs-heading-anchor-permalink" href="#Implemented-alphabets" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.DNAAlphabet" href="#BioSequences.DNAAlphabet"><code>BioSequences.DNAAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>DNA nucleotide alphabet.</p><p><code>DNAAlphabet</code> has a parameter 
<code>N</code> which is a number that determines the <code>BitsPerSymbol</code> trait. Currently supported values of <code>N</code> are 2 and 4.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L138-L143">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.RNAAlphabet" href="#BioSequences.RNAAlphabet"><code>BioSequences.RNAAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>RNA nucleotide alphabet.</p><p><code>RNAAlphabet</code> has a parameter <code>N</code> which is a number that determines the <code>BitsPerSymbol</code> trait. Currently supported values of <code>N</code> are 2 and 4.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L147-L152">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AminoAcidAlphabet" href="#BioSequences.AminoAcidAlphabet"><code>BioSequences.AminoAcidAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Amino acid alphabet.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/alphabet.jl#L217-L219">source</a></section></article><h2 id="Long-Sequences"><a class="docs-heading-anchor" 
href="#Long-Sequences">Long Sequences</a><a id="Long-Sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Long-Sequences" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.LongSequence" href="#BioSequences.LongSequence"><code>BioSequences.LongSequence</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">LongSequence{A &lt;: Alphabet}</code></pre><p>General-purpose <code>BioSequence</code>. This type is mutable and variable-length, and should be preferred for most use cases.</p><p><strong>Extended help</strong></p><p><code>LongSequence{A&lt;:Alphabet} &lt;: BioSequence{A}</code> is parameterized by a concrete <code>Alphabet</code> type <code>A</code> that defines the domain (or set) of biological symbols permitted.</p><p>As the <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> interface definition implies, <code>LongSequence</code>s store the biological symbol elements that they contain in a succinct encoded form that permits many operations to be done in an efficient bit-parallel manner. 
As per the interface of <a href="#BioSequences.BioSequence"><code>BioSequence</code></a>, the <a href="#BioSequences.Alphabet"><code>Alphabet</code></a> determines how an element is encoded or decoded when it is inserted or extracted from the sequence.</p><p>For example, <a href="#BioSequences.AminoAcidAlphabet"><code>AminoAcidAlphabet</code></a> is associated with <code>AminoAcid</code> and hence an object of the <code>LongSequence{AminoAcidAlphabet}</code> type represents a sequence of amino acids.</p><p>Symbols from multiple alphabets can&#39;t be intermixed in one sequence type.</p><p>The following table summarizes common LongSequence types that have been given aliases for convenience.</p><table><tr><th style="text-align: left">Type</th><th style="text-align: left">Symbol type</th><th style="text-align: left">Type alias</th></tr><tr><td style="text-align: left"><code>LongSequence{DNAAlphabet{N}}</code></td><td style="text-align: left"><code>DNA</code></td><td style="text-align: left"><code>LongDNA{N}</code></td></tr><tr><td style="text-align: left"><code>LongSequence{RNAAlphabet{N}}</code></td><td style="text-align: left"><code>RNA</code></td><td style="text-align: left"><code>LongRNA{N}</code></td></tr><tr><td style="text-align: left"><code>LongSequence{AminoAcidAlphabet}</code></td><td style="text-align: left"><code>AminoAcid</code></td><td style="text-align: left"><code>LongAA</code></td></tr></table><p>The <code>LongDNA</code> and <code>LongRNA</code> aliases use a DNAAlphabet{4}.</p><p><code>DNAAlphabet{4}</code> permits ambiguous nucleotides, and a sequence must use at least 4 bits to internally store each element (and indeed <code>LongSequence</code> does).</p><p>If you are sure that you are working with sequences with no ambiguous nucleotides, you can use <code>LongSequences</code> parameterised with <code>DNAAlphabet{2}</code> instead.</p><p><code>DNAAlphabet{2}</code> is an alphabet that uses two bits per base and limits to only unambiguous nucleotide 
symbols (A,C,G,T).</p><p>Changing this single parameter is all you need to do in order to benefit from memory savings. Some computations that use bitwise operations will also be dramatically faster.</p><p>The same applies to <code>LongSequence{RNAAlphabet{4}}</code>: simply replace the alphabet parameter with <code>RNAAlphabet{2}</code> in order to benefit.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/8fa5616bb45b41c21892e60f87e3be5fdafc2e32/src/longsequences/longsequence.jl#L36-L85">source</a></section></article><h2 id="Sequence-views"><a class="docs-heading-anchor" href="#Sequence-views">Sequence views</a><a id="Sequence-views-1"></a><a class="docs-heading-anchor-permalink" href="#Sequence-views" title="Permalink"></a></h2><p>Similar to how Base Julia offers views of array objects, BioSequences offers views of <code>LongSequence</code>s: the <code>LongSubSeq{A&lt;:Alphabet}</code>.</p><p>Conceptually, a <code>LongSubSeq{A}</code> is similar to a <code>LongSequence{A}</code>, but instead of storing its own data, it refers to the data of a <code>LongSequence</code>. Modifications to the <code>LongSequence</code> are reflected in the view, and vice versa. If the underlying <code>LongSequence</code> is truncated, the behaviour of a view is undefined. For the same reason, some operations are not supported for views, such as resizing.</p><p>The purpose of <code>LongSubSeq</code> is that, since it only contains a pointer to the underlying array, an offset and a length, it is much lighter than a <code>LongSequence</code>, and will be stack allocated on Julia 1.5 and newer. 
Thus, the user may construct millions of views without major performance implications.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../symbols/">« Biological Symbols</a><a class="docs-footer-nextpage" href="../construction/">Constructing sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Thursday 24 October 2024 17:54">Thursday 24 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>BioSequences Types · BioSequences.jl</title><meta name="title" content="BioSequences Types · BioSequences.jl"/><meta property="og:title" content="BioSequences Types · BioSequences.jl"/><meta property="twitter:title" content="BioSequences Types · BioSequences.jl"/><meta name="description" content="Documentation for BioSequences.jl."/><meta property="og:description" content="Documentation for BioSequences.jl."/><meta property="twitter:description" content="Documentation for BioSequences.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" 
data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><a class="docs-logo" href="../"><img src="../assets/logo.png" alt="BioSequences.jl logo"/></a><div class="docs-package-name"><span class="docs-autofit"><a href="../">BioSequences.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">Home</a></li><li><a class="tocitem" href="../symbols/">Biological Symbols</a></li><li class="is-active"><a class="tocitem" href>BioSequences Types</a><ul class="internal"><li><a class="tocitem" href="#The-abstract-BioSequence"><span>The abstract BioSequence</span></a></li><li><a class="tocitem" href="#The-abstract-Alphabet"><span>The abstract Alphabet</span></a></li><li class="toplevel"><a class="tocitem" href="#Concrete-types"><span>Concrete types</span></a></li><li><a class="tocitem" href="#Implemented-alphabets"><span>Implemented alphabets</span></a></li><li><a class="tocitem" href="#Long-Sequences"><span>Long Sequences</span></a></li><li><a class="tocitem" href="#Sequence-views"><span>Sequence views</span></a></li></ul></li><li><a class="tocitem" href="../construction/">Constructing sequences</a></li><li><a 
class="tocitem" href="../transforms/">Indexing &amp; modifying sequences</a></li><li><a class="tocitem" href="../predicates/">Predicates</a></li><li><a class="tocitem" href="../random/">Random sequences</a></li><li><a class="tocitem" href="../sequence_search/">Pattern matching and searching</a></li><li><a class="tocitem" href="../counting/">Counting</a></li><li><a class="tocitem" href="../io/">I/O</a></li><li><a class="tocitem" href="../interfaces/">Implementing custom types</a></li><li><a class="tocitem" href="../recipes/">Recipes</a></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>BioSequences Types</a></li></ul><ul class="is-hidden-tablet"><li class="is-active"><a href>BioSequences Types</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/BioJulia/BioSequences.jl/blob/master/docs/src/types.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" 
id="documenter-page"><h1 id="Abstract-Types"><a class="docs-heading-anchor" href="#Abstract-Types">Abstract Types</a><a id="Abstract-Types-1"></a><a class="docs-heading-anchor-permalink" href="#Abstract-Types" title="Permalink"></a></h1><p>BioSequences exports an abstract <code>BioSequence</code> type, and several concrete sequence types which inherit from it.</p><h2 id="The-abstract-BioSequence"><a class="docs-heading-anchor" href="#The-abstract-BioSequence">The abstract BioSequence</a><a id="The-abstract-BioSequence-1"></a><a class="docs-heading-anchor-permalink" href="#The-abstract-BioSequence" title="Permalink"></a></h2><p>BioSequences provides an abstract type called a <code>BioSequence{A&lt;:Alphabet}</code>. This abstract type, and the methods and traits it supports, allows many algorithms in BioSequences to be written as generically as possible, thus reducing the amount of code to read and understand, whilst maintaining high performance when such code is compiled for a concrete BioSequence subtype. Additionally, it allows new types to be implemented that are fully compatible with the rest of BioSequences, provided that key methods or traits are defined.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.BioSequence" href="#BioSequences.BioSequence"><code>BioSequences.BioSequence</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">BioSequence{A &lt;: Alphabet}</code></pre><p><code>BioSequence</code> is the main abstract type of <code>BioSequences</code>. 
It abstracts over the internal representation of different biological sequences, and is parameterized by an <code>Alphabet</code>, which controls the element type.</p><p><strong>Extended help</strong></p><p>Its subtypes are characterized by:</p><ul><li>Being a linear container type with random access and indices <code>Base.OneTo(length(x))</code>.</li><li>Containing zero or more internal data elements of type <code>encoded_data_eltype(typeof(x))</code>.</li><li>Being associated with an <code>Alphabet</code>, <code>A</code> by being a subtype of <code>BioSequence{A}</code>.</li></ul><p>A <code>BioSequence{A}</code> is indexed by an integer. The biosequence subtype, the index and the alphabet <code>A</code> determine how to extract the internal encoded data. The alphabet decides how to decode the data to the element type of the biosequence. Hence, the element type and container type of a <code>BioSequence</code> are separated.</p><p>Subtypes <code>T</code> of <code>BioSequence</code> must implement the following, with <code>E</code> being an encoded data type:</p><ul><li><code>Base.length(::T)::Int</code></li><li><code>encoded_data_eltype(::Type{T})::Type{E}</code></li><li><code>extract_encoded_element(::T, ::Integer)::E</code></li><li><code>copy(::T)</code></li><li>T must be able to be constructed from any iterable with <code>length</code> defined and with a known, compatible element type.</li></ul><p>Furthermore, mutable sequences should implement</p><ul><li><code>encoded_setindex!(::T, ::E, ::Integer)</code></li><li><code>T(undef, ::Int)</code></li><li><code>resize!(::T, ::Int)</code></li></ul><p>For compatibility with existing <code>Alphabet</code>s, the encoded data eltype must be <code>UInt</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L7-L41">source</a></section></article><p>Some aliases for <code>BioSequence</code> are 
also provided for your convenience:</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.NucSeq" href="#BioSequences.NucSeq"><code>BioSequences.NucSeq</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>An alias for <code>BioSequence{&lt;:NucleicAcidAlphabet}</code></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L231">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AASeq" href="#BioSequences.AASeq"><code>BioSequences.AASeq</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>An alias for <code>BioSequence{AminoAcidAlphabet}</code></p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L240-L242">source</a></section></article><p>Let&#39;s have a closer look at some of those methods that a subtype of <code>BioSequence</code> must implement. 
See the Julia Base library documentation for <code>length</code>, <code>copy</code> and <code>resize!</code>.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.encoded_data_eltype" href="#BioSequences.encoded_data_eltype"><code>BioSequences.encoded_data_eltype</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">encoded_data_eltype(::Type{&lt;:BioSequence})</code></pre><p>Returns the element type of the encoded data of the <code>BioSequence</code>. This is the return type of <code>extract_encoded_element</code>, i.e. the data type that stores the biological symbols in the biosequence.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L192-L200">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.extract_encoded_element" href="#BioSequences.extract_encoded_element"><code>BioSequences.extract_encoded_element</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">extract_encoded_element(::BioSequence{A}, i::Integer)</code></pre><p>Returns the encoded element at position <code>i</code>. 
This data can be decoded using <code>decode(A(), data)</code> to yield the element type of the biosequence.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L203-L211">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.encoded_setindex!" href="#BioSequences.encoded_setindex!"><code>BioSequences.encoded_setindex!</code></a> — <span class="docstring-category">Function</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">encoded_setindex!(seq::BioSequence, x::E, i::Integer)</code></pre><p>Given encoded data <code>x</code> of type <code>encoded_data_eltype(typeof(seq))</code>, sets the internal sequence data at the given index.</p><p>See also: <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> </p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/biosequence/biosequence.jl#L215-L222">source</a></section></article><p>For a correctly defined subtype of <code>BioSequence</code> that satisfies the interface, the vast majority of methods described in the rest of this manual should work out of the box. These methods can always be overloaded if needed. 
Indeed, some of the generic <code>BioSequence</code> methods are overloaded for <code>LongSequence</code>, for example for transformation and counting operations, where efficiency gains can be made due to the specific internal representation of that type.</p><h2 id="The-abstract-Alphabet"><a class="docs-heading-anchor" href="#The-abstract-Alphabet">The abstract Alphabet</a><a id="The-abstract-Alphabet-1"></a><a class="docs-heading-anchor-permalink" href="#The-abstract-Alphabet" title="Permalink"></a></h2><p>Alphabets control how biological symbols are encoded and decoded. They also confer many of the automatic traits and methods that any subtype <code>T&lt;:BioSequence{A&lt;:Alphabet}</code> will get.</p><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.Alphabet" href="#BioSequences.Alphabet"><code>BioSequences.Alphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">Alphabet</code></pre><p><code>Alphabet</code> is the most important type trait for <code>BioSequence</code>. An <code>Alphabet</code> represents a set of biological symbols encoded by a sequence, e.g. 
A, C, G and T for a DNA Alphabet that requires only 2 bits to represent each symbol.</p><p><strong>Extended help</strong></p><ul><li>Subtypes of Alphabet are singleton structs that may or may not be parameterized.</li><li>Alphabets span over a <em>finite</em> set of biological symbols.</li><li>The alphabet controls the encoding of a BioSymbol of the alphabet&#39;s element type to some internal &quot;encoded data&quot;, as well as the decoding, the inverse process.</li><li>An <code>Alphabet</code>&#39;s <code>encode</code> method must not produce invalid data.</li></ul><p><strong>Required methods</strong></p><p>Every subtype <code>A</code> of <code>Alphabet</code> must implement:</p><ul><li><code>Base.eltype(::Type{A})::Type{S}</code> for some eltype <code>S</code>, which must be a <code>BioSymbol</code>.</li><li><code>symbols(::A)::Tuple{Vararg{S}}</code>. This gives a tuple of all symbols in the set of <code>A</code>.</li><li><code>encode(::A, ::S)::E</code> encodes a symbol to an internal data eltype <code>E</code>.</li><li><code>decode(::A, ::E)::S</code> decodes an internal data eltype <code>E</code> to a symbol <code>S</code>.</li><li>Except for <code>eltype</code> which must follow Base conventions, all functions operating on <code>Alphabet</code> should operate on instances of the alphabet, not the type.</li></ul><p>If you want interoperation with existing subtypes of <code>BioSequence</code>, the encoded representation <code>E</code> must be of type <code>UInt</code>, and you must also implement:</p><ul><li><code>BitsPerSymbol(::A)::BitsPerSymbol{N}</code>, where the <code>N</code> must be zero or a power of two in [1, 2, 4, 8, 16, 32, [64 for 64-bit systems]].</li></ul><p><strong>Optional methods</strong></p><ul><li><code>BitsPerSymbol</code> for compatibility with existing <code>BioSequence</code>s</li><li><code>AsciiAlphabet</code> for increased printing/writing efficiency</li><li><code>tryencode</code> for fallible encoding.</li></ul></div><a 
class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L10-L42">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AsciiAlphabet" href="#BioSequences.AsciiAlphabet"><code>BioSequences.AsciiAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">AsciiAlphabet</code></pre><p>Trait for alphabet using ASCII characters as String representation. Define <code>codetype(A) = AsciiAlphabet()</code> for a user-defined <code>Alphabet</code> A to gain speed. Methods needed: <code>BioSymbols.stringbyte(::eltype(A))</code> and <code>ascii_encode(A, ::UInt8)</code>.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L270-L276">source</a></section></article><h1 id="Concrete-types"><a class="docs-heading-anchor" href="#Concrete-types">Concrete types</a><a id="Concrete-types-1"></a><a class="docs-heading-anchor-permalink" href="#Concrete-types" title="Permalink"></a></h1><h2 id="Implemented-alphabets"><a class="docs-heading-anchor" href="#Implemented-alphabets">Implemented alphabets</a><a id="Implemented-alphabets-1"></a><a class="docs-heading-anchor-permalink" href="#Implemented-alphabets" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.DNAAlphabet" href="#BioSequences.DNAAlphabet"><code>BioSequences.DNAAlphabet</code></a> — <span 
class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>DNA nucleotide alphabet.</p><p><code>DNAAlphabet</code> has a parameter <code>N</code> which is a number that determines the <code>BitsPerSymbol</code> trait. Currently supported values of <code>N</code> are 2 and 4.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L162-L167">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.RNAAlphabet" href="#BioSequences.RNAAlphabet"><code>BioSequences.RNAAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>RNA nucleotide alphabet.</p><p><code>RNAAlphabet</code> has a parameter <code>N</code> which is a number that determines the <code>BitsPerSymbol</code> trait. 
Currently supported values of <code>N</code> are 2 and 4.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L171-L176">source</a></section></article><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.AminoAcidAlphabet" href="#BioSequences.AminoAcidAlphabet"><code>BioSequences.AminoAcidAlphabet</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><p>Amino acid alphabet.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/alphabet.jl#L239-L241">source</a></section></article><h2 id="Long-Sequences"><a class="docs-heading-anchor" href="#Long-Sequences">Long Sequences</a><a id="Long-Sequences-1"></a><a class="docs-heading-anchor-permalink" href="#Long-Sequences" title="Permalink"></a></h2><article class="docstring"><header><a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring"></a><a class="docstring-binding" id="BioSequences.LongSequence" href="#BioSequences.LongSequence"><code>BioSequences.LongSequence</code></a> — <span class="docstring-category">Type</span><span class="is-flex-grow-1 docstring-article-toggle-button" title="Collapse docstring"></span></header><section><div><pre><code class="language-julia hljs">LongSequence{A &lt;: Alphabet}</code></pre><p>General-purpose <code>BioSequence</code>. 
This type is mutable and variable-length, and should be preferred for most use cases.</p><p><strong>Extended help</strong></p><p><code>LongSequence{A&lt;:Alphabet} &lt;: BioSequence{A}</code> is parameterized by a concrete <code>Alphabet</code> type <code>A</code> that defines the domain (or set) of biological symbols permitted.</p><p>As the <a href="#BioSequences.BioSequence"><code>BioSequence</code></a> interface definition implies, <code>LongSequence</code>s store the biological symbol elements that they contain in a succinct encoded form that permits many operations to be done in an efficient bit-parallel manner. As per the interface of <a href="#BioSequences.BioSequence"><code>BioSequence</code></a>, the <a href="#BioSequences.Alphabet"><code>Alphabet</code></a> determines how an element is encoded or decoded when it is inserted or extracted from the sequence.</p><p>For example, <a href="#BioSequences.AminoAcidAlphabet"><code>AminoAcidAlphabet</code></a> is associated with <code>AminoAcid</code> and hence an object of the <code>LongSequence{AminoAcidAlphabet}</code> type represents a sequence of amino acids.</p><p>Symbols from multiple alphabets can&#39;t be intermixed in one sequence type.</p><p>The following table summarizes common LongSequence types that have been given aliases for convenience.</p><table><tr><th style="text-align: left">Type</th><th style="text-align: left">Symbol type</th><th style="text-align: left">Type alias</th></tr><tr><td style="text-align: left"><code>LongSequence{DNAAlphabet{N}}</code></td><td style="text-align: left"><code>DNA</code></td><td style="text-align: left"><code>LongDNA{N}</code></td></tr><tr><td style="text-align: left"><code>LongSequence{RNAAlphabet{N}}</code></td><td style="text-align: left"><code>RNA</code></td><td style="text-align: left"><code>LongRNA{N}</code></td></tr><tr><td style="text-align: left"><code>LongSequence{AminoAcidAlphabet}</code></td><td style="text-align: left"><code>AminoAcid</code></td><td 
style="text-align: left"><code>LongAA</code></td></tr></table><p>The <code>LongDNA</code> and <code>LongRNA</code> aliases use a DNAAlphabet{4}.</p><p><code>DNAAlphabet{4}</code> permits ambiguous nucleotides, and a sequence must use at least 4 bits to internally store each element (and indeed <code>LongSequence</code> does).</p><p>If you are sure that you are working with sequences with no ambiguous nucleotides, you can use <code>LongSequence</code>s parameterised with <code>DNAAlphabet{2}</code> instead.</p><p><code>DNAAlphabet{2}</code> is an alphabet that uses two bits per base and is limited to the unambiguous nucleotide symbols (A, C, G, T).</p><p>Changing this single parameter is all you need to do in order to benefit from memory savings. Some computations that use bitwise operations will also be dramatically faster.</p><p>The same applies to <code>LongSequence{RNAAlphabet{4}}</code>: simply replace the alphabet parameter with <code>RNAAlphabet{2}</code> in order to benefit.</p></div><a class="docs-sourcelink" target="_blank" href="https://github.com/BioJulia/BioSequences.jl/blob/295ba89d4b8c57979ef38bc94ff44e5973b741d1/src/longsequences/longsequence.jl#L36-L85">source</a></section></article><h2 id="Sequence-views"><a class="docs-heading-anchor" href="#Sequence-views">Sequence views</a><a id="Sequence-views-1"></a><a class="docs-heading-anchor-permalink" href="#Sequence-views" title="Permalink"></a></h2><p>Similar to how Base Julia offers views of array objects, BioSequences offers views of <code>LongSequence</code>s: the <code>LongSubSeq{A&lt;:Alphabet}</code>.</p><p>Conceptually, a <code>LongSubSeq{A}</code> is similar to a <code>LongSequence{A}</code>, but instead of storing its own data, it refers to the data of a <code>LongSequence</code>. Modifying the <code>LongSequence</code> will be reflected in the view, and vice versa. If the underlying <code>LongSequence</code> is truncated, the behaviour of a view is undefined. 
For the same reason, some operations are not supported for views, such as resizing.</p><p>The purpose of <code>LongSubSeq</code> is that, since they only contain a pointer to the underlying array, an offset and a length, they are much lighter than <code>LongSequences</code>, and will be stack allocated on Julia 1.5 and newer. Thus, the user may construct millions of views without major performance implications.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../symbols/">« Biological Symbols</a><a class="docs-footer-nextpage" href="../construction/">Constructing sequences »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.7.0 on <span class="colophon-date" title="Friday 25 October 2024 10:21">Friday 25 October 2024</span>. Using Julia version 1.11.1.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>

Deprecated function	Instead use
`n_gaps`	`count(isgap, seq)`
`n_certain`	`count(iscertain, seq)`
`n_ambiguous`	`count(isambiguous, seq)`