Simplify Python doc configuration (#13826)
This PR is a follow-up to #13789 that adds specified lists of methods/attributes to some classes; removes redundancy in the autosummary templates we are using and adds documentation explaining how they work; and removes various pieces of outdated code in our conf.py to make it easier to maintain going forward.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #13826
vyasr authored Aug 8, 2023
1 parent cd3ddca commit 9b80bfd
Showing 13 changed files with 259 additions and 88 deletions.
@@ -4,3 +4,6 @@
.. currentmodule:: {{ module }}

.. autoclass:: {{ objname }}

..
   Don't include the methods or attributes sections; numpydoc adds them for us instead.
33 changes: 0 additions & 33 deletions docs/cudf/source/_templates/autosummary/class_with_autosummary.rst

This file was deleted.

1 change: 0 additions & 1 deletion docs/cudf/source/api_docs/dataframe.rst
@@ -7,7 +7,6 @@ Constructor
~~~~~~~~~~~
.. autosummary::
:toctree: api/
:template: autosummary/class_with_autosummary.rst

DataFrame

6 changes: 0 additions & 6 deletions docs/cudf/source/api_docs/extension_dtypes.rst
@@ -10,7 +10,6 @@ cudf.CategoricalDtype
=====================
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

CategoricalDtype

@@ -41,7 +40,6 @@ cudf.Decimal32Dtype
===================
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

Decimal32Dtype

@@ -70,7 +68,6 @@ cudf.Decimal64Dtype
===================
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

Decimal64Dtype

@@ -99,7 +96,6 @@ cudf.Decimal128Dtype
====================
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

Decimal128Dtype

@@ -128,7 +124,6 @@ cudf.ListDtype
==============
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

ListDtype

@@ -154,7 +149,6 @@ cudf.StructDtype
================
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

StructDtype

14 changes: 7 additions & 7 deletions docs/cudf/source/api_docs/index_objects.rst
@@ -12,7 +12,6 @@ used before calling these methods directly.**

.. autosummary::
:toctree: api/
:template: autosummary/class_with_autosummary.rst

Index

@@ -162,9 +161,13 @@ Numeric Index
-------------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

RangeIndex
RangeIndex.start
RangeIndex.stop
RangeIndex.step
RangeIndex.to_numpy
RangeIndex.to_arrow
Int64Index
UInt64Index
Float64Index
@@ -175,7 +178,6 @@ CategoricalIndex
----------------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

CategoricalIndex

@@ -200,7 +202,6 @@ IntervalIndex
-------------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

IntervalIndex

@@ -219,7 +220,6 @@ MultiIndex
----------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

MultiIndex

@@ -250,6 +250,7 @@ MultiIndex components

MultiIndex.to_frame
MultiIndex.droplevel
MultiIndex.swaplevel

MultiIndex selecting
~~~~~~~~~~~~~~~~~~~~
@@ -265,7 +266,6 @@ DatetimeIndex
-------------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

DatetimeIndex

@@ -299,6 +299,7 @@ Time-specific operations
DatetimeIndex.round
DatetimeIndex.ceil
DatetimeIndex.floor
DatetimeIndex.tz_convert
DatetimeIndex.tz_localize

Conversion
@@ -313,7 +314,6 @@ TimedeltaIndex
--------------
.. autosummary::
:toctree: api/
:template: autosummary/class_without_autosummary.rst

TimedeltaIndex

2 changes: 0 additions & 2 deletions docs/cudf/source/api_docs/io.rst
@@ -36,8 +36,6 @@ Parquet
read_parquet
DataFrame.to_parquet
cudf.io.parquet.read_parquet_metadata
:template: autosummary/class_with_autosummary.rst

cudf.io.parquet.ParquetDatasetWriter
cudf.io.parquet.ParquetDatasetWriter.close
cudf.io.parquet.ParquetDatasetWriter.write_table
1 change: 0 additions & 1 deletion docs/cudf/source/api_docs/series.rst
@@ -7,7 +7,6 @@ Constructor
-----------
.. autosummary::
:toctree: api/
:template: autosummary/class_with_autosummary.rst

Series

1 change: 0 additions & 1 deletion docs/cudf/source/api_docs/subword_tokenize.rst
@@ -7,7 +7,6 @@ Constructor
~~~~~~~~~~~
.. autosummary::
:toctree: api/
:template: autosummary/class_with_autosummary.rst

SubwordTokenizer
SubwordTokenizer.__call__
41 changes: 6 additions & 35 deletions docs/cudf/source/conf.py
@@ -22,6 +22,7 @@
from docutils.nodes import Text
from sphinx.addnodes import pending_xref

# -- Custom Extensions ----------------------------------------------------
sys.path.append(os.path.abspath("./_ext"))

# -- General configuration ------------------------------------------------
@@ -52,9 +53,6 @@

copybutton_prompt_text = ">>> "
autosummary_generate = True
ipython_mplbackend = "str"

html_use_modindex = True

# Enable automatic generation of systematic, namespaced labels for sections
myst_heading_anchors = 2
@@ -100,9 +98,6 @@
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False

html_theme_options = {
"external_links": [],
# https://github.com/pydata/pydata-sphinx-theme/issues/1220
@@ -209,14 +204,12 @@

# Config numpydoc
numpydoc_show_inherited_class_members = {
    "cudf.core.dtypes.CategoricalDtype": False,
    "cudf.core.dtypes.Decimal32Dtype": False,
    "cudf.core.dtypes.Decimal64Dtype": False,
    "cudf.core.dtypes.Decimal128Dtype": False,
    "cudf.core.dtypes.ListDtype": False,
    "cudf.core.dtypes.StructDtype": False,
    # option_context inherits undocumented members from the parent class
    "cudf.option_context": False,
}

# Rely on toctrees generated from autosummary on each of the pages we define
# rather than the autosummaries on the numpydoc auto-generated class pages.
numpydoc_class_members_toctree = False
numpydoc_attributes_as_param_list = False

@@ -229,8 +222,6 @@
"cupy.core.core.ndarray": ("cupy.ndarray", "cupy.ndarray"),
}

_internal_names_to_ignore = {"cudf.core.column.string.StringColumn"}


def resolve_aliases(app, doctree):
    pending_xrefs = doctree.traverse(condition=pending_xref)
@@ -254,26 +245,7 @@ def ignore_internal_references(app, env, node, contnode):
        # use `cudf.Index`
        node["reftarget"] = "cudf.Index"
        return contnode
    elif name is not None and name in _internal_names_to_ignore:
        node["reftarget"] = ""
        return contnode


def process_class_docstrings(app, what, name, obj, options, lines):
    """
    For those classes for which we use ::
    :template: autosummary/class_without_autosummary.rst
    the documented attributes/methods have to be listed in the class
    docstring. However, if one of those lists is empty, we use 'None',
    which then generates warnings in sphinx / ugly html output.
    This "autodoc-process-docstring" event connector removes that part
    from the processed docstring.
    """
    if what == "class":
        if name in {"cudf.RangeIndex", "cudf.Int64Index", "cudf.UInt64Index", "cudf.Float64Index", "cudf.CategoricalIndex", "cudf.IntervalIndex", "cudf.MultiIndex", "cudf.DatetimeIndex", "cudf.TimedeltaIndex", "cudf.TimedeltaIndex"}:

            cut_index = lines.index('.. rubric:: Attributes')
            lines[:] = lines[:cut_index]
return None


nitpick_ignore = [
@@ -289,4 +261,3 @@ def setup(app):
    app.add_js_file("https://docs.rapids.ai/assets/js/custom.js", loading_method="defer")
    app.connect("doctree-read", resolve_aliases)
    app.connect("missing-reference", ignore_internal_references)
    app.connect("autodoc-process-docstring", process_class_docstrings)
29 changes: 29 additions & 0 deletions docs/cudf/source/developer_guide/documentation.md
@@ -121,6 +121,35 @@ while still matching the pandas layout as closely as possible.
When adding a new API, developers simply have to add the API to the appropriate page.
Adding the name of the function to the appropriate autosummary list is sufficient for it to be documented.

### Documenting classes

Python classes and the Sphinx plugins used in RAPIDS interact in nontrivial ways.
The default page that `autosummary` generates for a class uses [`autodoc`](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) to automatically detect and document all methods of that class.
That means that in addition to the manually created `autosummary` pages where class methods are grouped into sections of related features, there is another page for each class where all the methods of that class are automatically summarized in a table for quick access.
However, we also use the [`numpydoc`](https://numpydoc.readthedocs.io/) extension, which offers the same feature.
We use both in order to match the contents and style of the pandas documentation as closely as possible.
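
As a rough illustration of how the two extensions can coexist, a Sphinx `conf.py` might contain something like the sketch below. The option names are real Sphinx and `numpydoc` settings, but the extension list and the `templates_path` value are assumptions made for illustration, not a copy of cuDF's configuration.

```python
# Hypothetical conf.py fragment; the values are illustrative assumptions.
extensions = [
    "sphinx.ext.autodoc",      # pulls documentation out of docstrings
    "sphinx.ext.autosummary",  # builds summary tables and stub pages
    "numpydoc",                # parses numpydoc-style docstring sections
]

# Generate a stub page for every entry in an autosummary list.
autosummary_generate = True

# Directory searched for template overrides (assumed location).
templates_path = ["_templates"]

# Rely on the hand-written autosummary toctrees rather than the ones
# numpydoc would otherwise add to each auto-generated class page.
numpydoc_class_members_toctree = False
```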

pandas is also particular about what information is included in a class's documentation.
While the documentation pages for the major user-facing classes like `DataFrame`, `Series`, and `Index` contain all APIs, less visible classes or subclasses (such as subclasses of `Index`) only include the methods that are specific to those subclasses.
For example, {py:class}`cudf.CategoricalIndex` only includes `codes` and `categories` on its page, not the entire set of `Index` functionality.
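
The distinction between "all APIs" and "APIs specific to a subclass" can be sketched in plain Python. The classes below are stand-ins invented for this example (they are not the cuDF classes), and the two helper functions are hypothetical. In practice the per-class lists are written by hand, so this only illustrates the idea.

```python
# Stand-in classes for illustration only; these are not the cuDF classes.
class Index:
    def take(self, indices):
        raise NotImplementedError

    def sort_values(self):
        raise NotImplementedError


class CategoricalIndex(Index):
    @property
    def codes(self):
        raise NotImplementedError

    @property
    def categories(self):
        raise NotImplementedError


def own_public_members(cls):
    """Return the public names defined directly on ``cls``, ignoring bases."""
    return sorted(name for name in vars(cls) if not name.startswith("_"))


def all_public_members(cls):
    """Return every public name reachable on ``cls``, including inherited ones."""
    return sorted(name for name in dir(cls) if not name.startswith("_"))
```

Here `own_public_members(CategoricalIndex)` yields only `categories` and `codes`, while `all_public_members(CategoricalIndex)` also picks up `sort_values` and `take` from the parent class.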

To accommodate these requirements, we take the following approach:
1. The default `autosummary` template for classes is overridden with a [simpler template that does not generate method or attribute documentation](https://github.com/rapidsai/cudf/blob/main/docs/cudf/source/_templates/autosummary/class.rst). In other words, we disable `autosummary`'s generation of Methods and Attributes lists.
2. We rely on `numpydoc` entirely for the classes that need their entire APIs listed (`DataFrame`, `Series`, etc.). `numpydoc` will automatically populate the Methods and Attributes sections if (and only if) they are not already defined in the class's docstring.
3. For classes that should only include a subset of APIs, we include those explicitly in the class's documentation. When those lists exist, `numpydoc` will not override them. If either the Methods or Attributes section should be empty, that section must still be included but should simply contain "None". For example, the class documentation for `CategoricalIndex` could include something like the following:

```
Attributes
----------
codes
categories

Methods
-------
None
```
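
To make the third point concrete, here is a minimal sketch of a class whose docstring defines both sections explicitly, together with a small helper that reports which section titles a docstring contains. The class name and the `defined_sections` helper are hypothetical and exist only for this example. The helper is a simplified check, not `numpydoc`'s own parser.

```python
import re


class CategoricalIndexSketch:
    """
    A hypothetical class used only to illustrate the docstring layout.

    Attributes
    ----------
    codes
    categories

    Methods
    -------
    None
    """


def defined_sections(docstring):
    """Return the numpydoc-style section titles found in ``docstring``.

    A title is a non-empty line immediately followed by a line of dashes
    of the same length.
    """
    lines = [line.strip() for line in (docstring or "").splitlines()]
    titles = []
    for title, underline in zip(lines, lines[1:]):
        if title and re.fullmatch(r"-+", underline) and len(underline) == len(title):
            titles.append(title)
    return titles
```

Calling `defined_sections(CategoricalIndexSketch.__doc__)` reports both `Attributes` and `Methods`. Because both sections are already present, `numpydoc` should leave them as written rather than populating them automatically.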

## Comparing to pandas

cuDF aims to provide a pandas-like experience.
41 changes: 41 additions & 0 deletions python/cudf/cudf/core/dtypes.py
@@ -144,6 +144,16 @@ class CategoricalDtype(_BaseDtype):
when used in operations that combine categoricals, e.g. astype, and
will resolve to False if there is no existing ordered to maintain.

Attributes
----------
categories
ordered

Methods
-------
from_pandas
to_pandas

Examples
--------
>>> import cudf
@@ -320,6 +330,16 @@ class ListDtype(_BaseDtype):
element_type : object
A dtype which represents the element types in the list.

Attributes
----------
element_type
leaf_type

Methods
-------
from_arrow
to_arrow

Examples
--------
>>> import cudf
@@ -496,6 +516,16 @@ class StructDtype(_BaseDtype):
A mapping of field names to dtypes, the dtypes can themselves
be of ``StructDtype`` too.

Attributes
----------
fields
itemsize

Methods
-------
from_arrow
to_arrow

Examples
--------
>>> import cudf
@@ -649,6 +679,17 @@ def itemsize(self):
scale : int, optional
The scale of the dtype. See Notes below.

Attributes
----------
precision
scale
itemsize

Methods
-------
to_arrow
from_arrow

Notes
-----
When the scale is positive: