Add pdocs for python documentation #60

Merged
merged 8 commits into from
Nov 14, 2024
65 changes: 65 additions & 0 deletions .github/workflows/python-docs.yml
@@ -0,0 +1,65 @@
name: python-docs

on:
  push:
    branches: ["main"]
    paths:
      - "py-denormalized/**"
  pull_request:
    branches: ["main"]
    paths:
      - "py-denormalized/**"

# security: restrict permissions for CI jobs.
permissions:
  contents: read

jobs:
  # Build the documentation and upload the static HTML files as an artifact.
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
          version: "0.5.1"
          enable-cache: true
          cache-dependency-glob: "py-denormalized/uv.lock"

      - name: "Set up Python"
        uses: actions/setup-python@v5
        with:
          python-version-file: "py-denormalized/pyproject.toml"

      - name: Install the project
        working-directory: ./py-denormalized
        run: uv sync --no-dev --group docs --extra feast

      - name: Build the docs
        working-directory: ./py-denormalized
        run: |
          source .venv/bin/activate
          pdoc -t pdocs/ python/denormalized/ -o pdocs/_build

      - uses: actions/upload-pages-artifact@v3
        with:
          path: py-denormalized/pdocs/_build

  # Deploy the artifact to GitHub pages.
  # This is a separate job so that only actions/deploy-pages has the necessary permissions.
  deploy:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'

    needs: build
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
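
The "Build the docs" step can be reproduced locally before pushing. The following is a minimal sketch, assuming pdoc is already installed in the active environment (for example via `uv sync --group docs`) and that the script is run from the repository root. `PDOC_COMMAND` and `build_docs` are hypothetical names used only for illustration; the pdoc arguments are copied from the workflow.

```python
import subprocess
from pathlib import Path

# Same arguments as the workflow's "Build the docs" step.
PDOC_COMMAND = ["pdoc", "-t", "pdocs/", "python/denormalized/", "-o", "pdocs/_build"]


def build_docs(project_dir: Path, runner=subprocess.run) -> Path:
    """Run pdoc inside project_dir and return the directory holding the HTML.

    `runner` defaults to subprocess.run; it is a parameter only so the
    helper can be exercised without actually invoking pdoc.
    """
    # cwd mirrors `working-directory: ./py-denormalized` in the workflow.
    runner(PDOC_COMMAND, cwd=project_dir, check=True)
    return project_dir / "pdocs" / "_build"


# Typical local use (requires pdoc on PATH):
#   build_docs(Path("py-denormalized"))
```

The returned path corresponds to the directory that `actions/upload-pages-artifact` uploads in the workflow.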
1 change: 1 addition & 0 deletions py-denormalized/.gitignore
@@ -61,6 +61,7 @@ coverage.xml

# Sphinx documentation
docs/_build/
pdocs/_build/

# PyCharm
.idea/
9 changes: 4 additions & 5 deletions py-denormalized/README.md
@@ -1,5 +1,4 @@
denormalized-python
===
## Denormalized Python

Python bindings for [denormalized](https://github.com/probably-nothing-labs/denormalized)

@@ -13,15 +12,15 @@ Denormalized is a single node stream processing engine written in Rust. This dir

This script will connect to the kafka instance running in docker and aggregate the metrics in realtime.

There are several other examples in the [examples/ folder](python/examples/) that demonstrate other capabilities including stream joins and UDAFs.
There are several other examples in the [examples folder](python/examples/) that demonstrate other capabilities including stream joins and UDAFs.


## Development

Make sure you're in the `py-denormalized/` directory.

We currently use [rye](https://rye.astral.sh/) to manage python dependencies.
`rye sync` to create/update the virtual environment
We use [uv](https://docs.astral.sh/uv/) to manage python dependencies.
`uv sync` to create/update the virtual environment

We use [maturin](https://www.maturin.rs/) for developing and building:
- `maturin develop` - build and install the python bindings into the current venv
7 changes: 7 additions & 0 deletions py-denormalized/pdocs/module.html.jinja2
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{% extends "default/module.html.jinja2" %}

{% block nav_title %}
<a href="https://github.com/probably-nothing-labs/denormalized">
<img src="https://raw.githubusercontent.com/probably-nothing-labs/denormalized/refs/heads/main/docs/images/denormalized_logo.png" alt="Denormalized Logo" class="logo">
</a>
{% endblock %}
29 changes: 18 additions & 11 deletions py-denormalized/pyproject.toml
@@ -8,28 +8,34 @@ requires-python = ">=3.12"
classifiers = []
dynamic = ["version"] # Version specified in py-denormalized/Cargo.toml
description = "Embeddable stream processing engine"
dependencies = ["pyarrow>=17.0.0", "datafusion>=40.1.0"]
dependencies = [
"pyarrow>=17.0.0",
"datafusion>=40.1.0",
]

[project.optional-dependencies]
tests = ["pytest"]
feast = ["feast"]
dev = []

[tool.maturin]
python-source = "python"
features = ["pyo3/extension-module"]
module-name = "denormalized._d_internal"

[tool.rye]
dev-dependencies = [
"pip>=24.2",
[dependency-groups]
dev = [
"pdoc>=15.0.0",
"ipython>=8.26.0",
"pytest>=8.3.2",
"maturin>=1.7.4",
"pyarrow-stubs>=17.11",
"pandas>=2.2.3",
"jupyterlab>=4.3.0",
"pdoc>=15.0.0",
"pip>=24.3.1",
]
docs = [
"pdoc>=15.0.0",
]

[tool.maturin]
python-source = "python"
features = ["pyo3/extension-module"]
module-name = "denormalized._d_internal"

# Enable docstring linting using the google style guide
[tool.ruff.lint]
@@ -46,3 +52,4 @@ max-doc-length = 88
include = ["python"]
exclude = ["src"]
typeCheckingMode = "standard"
reportMissingImports = false
19 changes: 19 additions & 0 deletions py-denormalized/python/denormalized/__init__.py
@@ -1,11 +1,30 @@
"""
.. include:: ../../README.md
   :start-line: 1
   :end-before: Development
"""

from .context import Context
from .data_stream import DataStream
from .datafusion import col, column, lit, literal, udf, udaf
from .datafusion.expr import Expr
from .datafusion import functions as Functions

__all__ = [
    "Context",
    "DataStream",
    "col",
    "column",
    "Expr",
    "Functions",
    "lit",
    "literal",
    "udaf",
    "udf",
]

__docformat__ = "google"

try:
    from .feast_data_stream import FeastDataStream

40 changes: 35 additions & 5 deletions py-denormalized/python/denormalized/context.py
@@ -1,19 +1,35 @@
from denormalized._d_internal import PyContext

from .data_stream import DataStream


class Context:
    """Context."""
    """A context manager for handling data stream operations.

    This class provides an interface for creating and managing data streams,
    particularly for working with Kafka topics and stream processing.

    Attributes:
        ctx: Internal PyContext instance managing Rust-side operations
    """

    def __init__(self) -> None:
        """__init__."""
        """Initialize a new Context instance."""
        self.ctx = PyContext()

    def __repr__(self):
        """Return a string representation of the Context object.

        Returns:
            str: A detailed string representation of the context
        """
        return self.ctx.__repr__()

    def __str__(self):
        """Return a readable string description of the Context object.

        Returns:
            str: A human-readable string description
        """
        return self.ctx.__str__()

    def from_topic(
@@ -24,7 +40,22 @@ def from_topic(
        timestamp_column: str,
        group_id: str = "default_group",
    ) -> DataStream:
        """Create a new context from a topic."""
        """Create a new DataStream from a Kafka topic.

        Args:
            topic: Name of the Kafka topic to consume from
            sample_json: Sample JSON string representing the expected message format
            bootstrap_servers: Comma-separated list of Kafka broker addresses
            timestamp_column: Column name containing event timestamps
            group_id: Kafka consumer group ID (defaults to "default_group")

        Returns:
            DataStream: A new DataStream instance configured for the specified topic

        Raises:
            ValueError: If the topic name is empty or invalid
            ConnectionError: If unable to connect to Kafka brokers
        """
        py_ds = self.ctx.from_topic(
            topic,
            sample_json,
@@ -33,5 +64,4 @@ def from_topic(
            group_id,
        )
        ds = DataStream(py_ds)

        return ds
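
The new `from_topic` docstring describes `sample_json` as a JSON string that represents the expected message format. A minimal sketch of preparing that argument from a Python dict follows. `open_topic_stream` is a hypothetical helper, the keyword names are taken from the docstring's `Args` section, and the topic, broker address, and column name in the usage comment are invented values.

```python
import json


def open_topic_stream(
    ctx,
    topic: str,
    example_message: dict,
    bootstrap_servers: str,
    timestamp_column: str,
    group_id: str = "default_group",
):
    """Serialize an example message and pass it to ctx.from_topic."""
    return ctx.from_topic(
        topic=topic,
        sample_json=json.dumps(example_message),  # sample message as a JSON string
        bootstrap_servers=bootstrap_servers,
        timestamp_column=timestamp_column,
        group_id=group_id,
    )


# Usage, assuming the package is installed and a Kafka broker is reachable:
#   from denormalized import Context
#   ds = open_topic_stream(
#       Context(),
#       "temperature",
#       {"occurred_at_ms": 100, "sensor_name": "foo", "reading": 0.0},
#       "localhost:9092",
#       "occurred_at_ms",
#   )
```

Building the sample with `json.dumps` keeps the example message and the `timestamp_column` name in one place, which makes a mismatch between the two easier to spot.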