docs: add CI, fix existing warnings, add other improvements #1095

Merged
merged 36 commits into from
Mar 21, 2023
dd5e66d
Use sphinx variables from the environment if present
victorlin Nov 8, 2022
b62ecb5
Update CI workflow to catch warnings in docs
victorlin Nov 8, 2022
9227345
docs/faq/metadata: Convert page to rST
victorlin Jan 18, 2023
ce920ad
Update references to zika tutorial
victorlin Nov 8, 2022
2c8fdaa
Remove "Tests" section heading in docstring
victorlin Nov 8, 2022
3aa9665
Use consistent table of contents
victorlin Nov 8, 2022
f331415
distance: Use JSON syntax highlighting
victorlin Nov 8, 2022
cbed8ef
Fix augur.io docs warning
victorlin Jan 18, 2023
844c10c
Replace section heading in docstring with bold text
victorlin Jan 18, 2023
a310587
docs: Build locally with nitpicky mode
victorlin Jan 18, 2023
14bf620
Set nitpick_ignore to suppress warnings of valid numpydoc
victorlin Jan 19, 2023
97e082d
fix: Add augur.types prefix for proper linking
victorlin Mar 9, 2023
eb65cbb
fix: Use Python type hints in doc generation
victorlin Jan 19, 2023
bf2d133
fix: Add external intersphinx mappings
victorlin Jan 19, 2023
862f749
fix: Add doc pages to resolve references
victorlin Jan 19, 2023
9b896c2
fix: Ignore JSONEncoder/JSONDecodeError
victorlin Jan 19, 2023
445d73f
fix: Correct type of pivots
victorlin Jan 19, 2023
ee58907
fix numpydoc: Address obvious style syntax issues
victorlin Jan 18, 2023
3071e37
fix numpydoc: Add an extra line between numpydoc and doctest
victorlin Jan 18, 2023
cc77b51
fix numpydoc: Add Examples section for doctests
victorlin Jan 18, 2023
cc52176
fix numpydoc: Use numpydoc type hints
victorlin Jan 18, 2023
9ac2386
fix numpydoc: Correct type of label
victorlin Jan 19, 2023
3a9c6ff
fix numpydoc: Remove trailing colon from exception classes
victorlin Jan 19, 2023
2f87db7
fix numpydoc: Replace "Path-like" with os.PathLike
victorlin Jan 19, 2023
3ad5369
fix numpydoc: Replace "IO buffer" with io.StringIO
victorlin Jan 19, 2023
5b37985
fix numpydoc: Remove docstring from traits.register_parser
victorlin Jan 19, 2023
8865478
fix numpydoc: Remove "TYPE" placeholders
victorlin Jan 19, 2023
a8b1c5f
fix numpydoc: Remove unused placeholders in titer_model
victorlin Jan 19, 2023
cf1d935
fix numpydoc: Properly reference Bio.Align.MultipleSeqAlignment
victorlin Jan 19, 2023
f6435c6
fix numpydoc: Properly reference Bio.Phylo.BaseTree.Tree
victorlin Jan 19, 2023
64d8bc5
fix numpydoc: Properly reference Bio.Phylo.BaseTree.Clade
victorlin Jan 19, 2023
2c7c6ed
fix numpydoc: Properly reference Bio.SeqRecord.SeqRecord
victorlin Jan 19, 2023
30781e1
fix numpydoc: Properly reference numpy.ndarray
victorlin Jan 19, 2023
ef70f10
fix numpydoc: Properly reference treetime classes
victorlin Jan 19, 2023
2ba5428
fix numpydoc: Update reference of pandas TextFileReader
victorlin Jan 19, 2023
8e3705d
Update changelog
victorlin Mar 8, 2023
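Several of the commits above concern Sphinx configuration: nitpicky mode, `nitpick_ignore`, and intersphinx mappings. As a rough illustration, a `conf.py` excerpt along these lines would enable them. `nitpicky`, `nitpick_ignore`, and `intersphinx_mapping` are standard Sphinx options; the specific ignore entries and inventory URLs are assumptions made for this sketch and are not copied from the PR.

```python
# Hypothetical excerpt of a Sphinx conf.py; all values are illustrative assumptions.

extensions = [
    "sphinx.ext.intersphinx",  # resolve references against other projects' docs
]

# Emit a warning for every cross-reference that cannot be resolved.
nitpicky = True

# (role, target) pairs that are exempt from nitpicky warnings.
nitpick_ignore = [
    ("py:class", "json.decoder.JSONDecodeError"),
    ("py:class", "json.encoder.JSONEncoder"),
]

# External inventories used to link names such as numpy.ndarray.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "numpy": ("https://numpy.org/doc/stable", None),
    "pandas": ("https://pandas.pydata.org/docs", None),
}
```

With nitpicky mode on, a docstring type such as `MultipleSeqAlign` that matches no known object produces a warning, which is presumably why many commits in this PR replace informal type names with fully qualified ones.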
6 changes: 6 additions & 0 deletions .github/workflows/ci.yaml
@@ -93,3 +93,9 @@ jobs:
      - uses: codecov/codecov-action@v3
        with:
          fail_ci_if_error: true

  build-docs:
    uses: nextstrain/.github/.github/workflows/docs-ci.yaml@master
    with:
      docs-directory: docs/
      pip-install-target: .[dev]
2 changes: 2 additions & 0 deletions CHANGES.md
@@ -20,8 +20,10 @@
* translate: Fix error handling when features cannot be read from reference sequence file. [#1168][] (@victorlin)
* translate: Remove an unnecessary check which allowed for inaccurate error messages to be shown. [#1169][] (@victorlin)
* frequencies: Previously, monthly pivot points calculated from the end of a month may have been shifted by 1-3 days. This is now fixed. [#1150][] (@victorlin)
* docs: Fix minor formatting issues. [#1095][] (@victorlin)
* Update development status on PyPI from "3 - Alpha" to "5 - Production/Stable". This should have been done since the beginning of this changelog, but now it is official. [#1160][] (@corneliusroemer)

[#1095]: https://github.com/nextstrain/augur/pull/1095
[#1150]: https://github.com/nextstrain/augur/pull/1150
[#1160]: https://github.com/nextstrain/augur/pull/1160
[#1168]: https://github.com/nextstrain/augur/pull/1168
24 changes: 11 additions & 13 deletions augur/align.py
@@ -54,7 +54,7 @@ def prepare(sequences, existing_aln_fname, output, ref_name, ref_seq_fname):

Parameters
----------
sequences : list[str]
sequences : list of str
List of paths to FASTA-formatted sequences to align.
existing_aln_fname : str
Path of an existing alignment to use, or None
@@ -67,7 +67,8 @@ def prepare(sequences, existing_aln_fname, output, ref_name, ref_seq_fname):

Returns
-------
tuple: The existing alignment filename, the new sequences filename, and the name of the reference sequence.
tuple of str
The existing alignment filename, the new sequences filename, and the name of the reference sequence.
"""
seqs = read_sequences(*sequences)
seqs_to_align_fname = output + ".to_align.fasta"
@@ -104,7 +105,7 @@ def run(args):
'''
Parameters
----------
args : namespace
args : argparse.Namespace
arguments passed in via the command-line from augur

Returns
@@ -152,6 +153,8 @@ def run(args):
def postprocess(output_file, ref_name, keep_reference, fill_gaps):
"""Postprocessing of the combined alignment file.

The modified alignment is written directly to output_file.

Parameters
----------
output_file: str
@@ -162,10 +165,6 @@ def postprocess(output_file, ref_name, keep_reference, fill_gaps):
If the reference was provided, whether it should be kept in the alignment
fill_gaps: bool
Replace all gaps in the alignment with "N" to indicate ambiguous sites.

Returns
-------
None - the modified alignment is written directly to output_file
"""
# -- ref_name --
# reads the new alignment
@@ -270,7 +269,7 @@ def strip_non_reference(aln, reference, insertion_csv=None):

Parameters
----------
aln : MultipleSeqAlign
aln : Bio.Align.MultipleSeqAlignment
Biopython Alignment
reference : str
name of reference sequence, assumed to be part of the alignment
@@ -280,9 +279,8 @@ def strip_non_reference(aln, reference, insertion_csv=None):
list
list of trimmed sequences, effectively a multiple alignment


Tests
-----
Examples
--------
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps")]
Trimmed gaps in with_gaps from the alignment
['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']
@@ -384,7 +382,7 @@ def prettify_alignment(aln):

Parameters
----------
aln : MultipleSeqAlign
aln : Bio.Align.MultipleSeqAlignment
Biopython Alignment
'''
for seq in aln:
@@ -407,7 +405,7 @@ def make_gaps_ambiguous(aln):

Parameters
----------
aln : MultipleSeqAlign
aln : Bio.Align.MultipleSeqAlignment
Biopython Alignment
'''
for seq in aln:
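
The docstring changes in this file follow a common pattern: types are spelled out in numpydoc form (`list of str`, `argparse.Namespace`), the return value gets its own typed entry, and doctests move under an `Examples` heading. A minimal sketch of that layout follows; `count_gaps` is a hypothetical helper invented for illustration and is not part of augur.

```python
def count_gaps(sequence):
    """Count gap characters in an aligned sequence.

    Hypothetical helper used only to illustrate numpydoc layout.

    Parameters
    ----------
    sequence : str
        Aligned sequence in which "-" marks a gap.

    Returns
    -------
    int
        Number of gap characters found.

    Examples
    --------
    >>> count_gaps("AC-G-")
    2
    """
    return sequence.count("-")


if __name__ == "__main__":
    # Run the doctest embedded in the Examples section.
    import doctest
    doctest.testmod()
```

Keeping doctests under `Examples` lets numpydoc-aware tooling parse the section, whereas an unrecognized heading such as `Tests` can trigger a docs warning.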
6 changes: 3 additions & 3 deletions augur/ancestral.py
@@ -28,7 +28,7 @@ def ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True,

Parameters
----------
tree : Bio.Phylo tree or str
tree : Bio.Phylo.BaseTree.Tree or str
tree or filename of tree
aln : Bio.Align.MultipleSeqAlignment or str
alignment or filename of alignment
@@ -49,7 +49,7 @@ def ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True,

Returns
-------
TreeAnc
treetime.TreeAnc
treetime.TreeAnc instance
"""

@@ -78,7 +78,7 @@ def collect_mutations_and_sequences(tt, infer_tips=False, full_sequences=False,

Parameters
----------
tt : treetime
tt : treetime.TreeTime
instance of treetime with valid ancestral reconstruction
infer_tips : bool, optional
if true, request the reconstructed tip sequences from treetime, otherwise retain input ambiguities
8 changes: 4 additions & 4 deletions augur/clades.py
@@ -124,9 +124,9 @@ def is_node_in_clade(clade_alleles, node, ref):
----------
clade_alleles : list
list of clade defining alleles
node : Phylo.Node
node : Bio.Phylo.BaseTree.Clade
node to check, assuming sequences (as mutations) are attached to node
ref : str/list
ref : str or list
positions

Returns
@@ -162,9 +162,9 @@ def assign_clades(clade_designations, all_muts, tree, ref=None):
clade definitions as :code:`{clade_name:[(gene, site, allele),...]}`
all_muts : dict
mutations in each node
tree : Phylo.Tree
tree : Bio.Phylo.BaseTree.Tree
phylogenetic tree to process
ref : str/list, optional
ref : str or list, optional
reference sequence to look up state when not mutated

Returns
2 changes: 2 additions & 0 deletions augur/dates/__init__.py
@@ -25,6 +25,8 @@ def numeric_date(date):
2. A string in the YYYY-MM-DD (ISO 8601) syntax
3. A string representing a relative date (duration before datetime.date.today())

Examples
--------
>>> numeric_date("2020.42")
2020.42
>>> numeric_date("2020-06-04")
42 changes: 27 additions & 15 deletions augur/distance.py
@@ -4,8 +4,7 @@
which sequences to compare) and a distance map (to determine the weight of a
mismatch between any two sequences).

Comparison methods
==================
**Comparison methods**

Comparison methods include:

@@ -32,14 +31,15 @@
parameters allow users to specify a fixed time interval for pairwise
calculations, limiting the computational complexity of the comparisons.

Distance maps
=============
**Distance maps**

Distance maps are defined in JSON format with two required top-level keys.
The `default` key specifies the numeric (floating point) value to assign to all mismatches by default.
The `map` key specifies a dictionary of weights to use for distance calculations.
These weights are indexed hierarchically by gene name and one-based gene coordinate and are assigned in either a sequence-independent or sequence-dependent manner.
The simplest possible distance map calculates Hamming distance between sequences without any site-specific weights, as shown below::
The simplest possible distance map calculates Hamming distance between sequences without any site-specific weights, as shown below:

.. code-block:: json

{
"name": "Hamming distance",
@@ -48,7 +48,9 @@
}

By default, distances are floating point values whose precision can be controlled with the `precision` key that defines the number of decimal places to retain for each distance.
The following example shows how to specify a precision of two decimal places in the final output.::
The following example shows how to specify a precision of two decimal places in the final output:

.. code-block:: json

{
"name": "Hamming distance",
@@ -57,7 +59,9 @@
"precision": 2
}

Distances can be reported as integer values by specifying an `output_type` as `integer` or `int` as follows.::
Distances can be reported as integer values by specifying an `output_type` as `integer` or `int` as follows:

.. code-block:: json

{
"name": "Hamming distance",
Expand All @@ -70,7 +74,9 @@
value of the same type as the default value (integer or float). The following
example is a distance map for antigenic amino acid substitutions near influenza
A/H3N2 HA's receptor binding sites. This map calculates the Hamming distance
between amino acid sequences only at seven positions in the HA1 gene::
between amino acid sequences only at seven positions in the HA1 gene:

.. code-block:: json

{
"name": "Koel epitope sites",
Expand All @@ -92,7 +98,9 @@
where the `from` sequence in each pair is interpreted as the ancestral state and
the `to` sequence as the derived state. The following example is a distance map
that assigns asymmetric weights to specific amino acid substitutions at a
specific position in the influenza gene HA1::
specific position in the influenza gene HA1:

.. code-block:: json

{
"default": 0.0,
@@ -119,7 +127,9 @@
the JSON includes a `params` field that describes the mapping of attribute names
to requested comparisons and distance maps and any date parameters specified by
the user. The following example JSON shows a sample output when the distance
command is run with multiple comparisons and distance maps::
command is run with multiple comparisons and distance maps:

.. code-block:: json

{
"params": {
@@ -177,7 +187,8 @@ def read_distance_map(map_file):
dict :
Python representation of the distance map JSON


Examples
--------
>>> sorted(read_distance_map("tests/data/distance_map_weight_per_site.json").items())
[('default', 0), ('map', {'HA1': {144: 1}})]
>>> sorted(read_distance_map("tests/data/distance_map_weight_per_site_and_sequence.json").items())
@@ -237,7 +248,8 @@ def get_distance_between_nodes(node_a_sequences, node_b_sequences, distance_map,
float :
distance between node sequences based on the given map


Examples
--------
>>> node_a_sequences = {"gene": "ACTG"}
>>> node_b_sequences = {"gene": "ACGG"}
>>> distance_map = {"default": 0, "map": {}}
@@ -465,7 +477,7 @@ def get_distances_to_root(tree, sequences_by_node_and_gene, distance_map):

Parameters
----------
tree : Bio.Phylo
tree : Bio.Phylo.BaseTree.Tree
a rooted tree whose node names match the given dictionary of sequences
by node and gene

@@ -505,7 +517,7 @@ def get_distances_to_last_ancestor(tree, sequences_by_node_and_gene, distance_ma

Parameters
----------
tree : Bio.Phylo
tree : Bio.Phylo.BaseTree.Tree
a rooted tree whose node names match the given dictionary of sequences
by node and gene

@@ -565,7 +577,7 @@ def get_distances_to_all_pairs(tree, sequences_by_node_and_gene, distance_map, e

Parameters
----------
tree : Bio.Phylo
tree : Bio.Phylo.BaseTree.Tree
a rooted tree whose node names match the given dictionary of sequences
by node and gene

5 changes: 4 additions & 1 deletion augur/export_v2.py
@@ -584,7 +584,8 @@ def set_data_provenance(data_json, config):
config : dict
config JSON with an expected ``data_provenance`` key


Examples
--------
>>> config = {"data_provenance": [{"name": "GISAID"}, {"name": "INSDC"}]}
>>> data_json = {"meta": {}}
>>> set_data_provenance(data_json, config)
@@ -600,6 +601,8 @@ def counter_to_disambiguation_suffix(count):
"""Given a numeric count of author papers, return a distinct alphabetical
disambiguation suffix.

Examples
--------
>>> counter_to_disambiguation_suffix(0)
'A'
>>> counter_to_disambiguation_suffix(25)