release 0.3.1: documentation and docstrings
mahiki committed Feb 8, 2024
1 parent 132a226 commit 4904372
Showing 7 changed files with 64 additions and 17 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "PrefectInterfaces"
uuid = "25d49962-0f22-42a0-bb44-b427e1ded1d4"
authors = ["mahiki <[email protected]>"]
version = "0.3.0"
version = "0.3.1"

[deps]
AWS = "fbe9abb3-538b-5e4e-ba9e-bc94f4f92ebc"
3 changes: 2 additions & 1 deletion docs/src/developers.md
@@ -1,5 +1,6 @@
# Developers
* Develop and test with/without `just` taskrunner.
* (Optional) `just` taskrunner, see [Justfile](@ref), install as a dev tool as a convenience.
* From repo root type '`just info`' for hints.
* Documenter.jl `doctest()` included in `runtests.jl`

## Test, Build Docs with Justfile
6 changes: 3 additions & 3 deletions docs/src/usage-and-explanation.md
@@ -1,5 +1,5 @@
# Usage and Design Explanation
>A Data Scientist or Analyst User Story
>A data scientist or analyst with orchestrated jobs and productionized reports.
The problem is to manage routine data ETL or pipeline processing with Prefect and the Python API, while calling Julia functions for expressive dataframe transformations or niche high-performance custom code. Prefect doesn't provide a Julia SDK (yet), so this package provides components for Julia operations that are called from a Prefect orchestration environment.

@@ -53,5 +53,5 @@ The julia environment does not need to be aware of project environment, because

**Managing dev/prod environment with dev/main git branches:** When both main/dev are local, there will be two local Prefect DBs, each with a different PREFECT_API_URL defined by the Prefect `profiles.toml` profile. The Python side of the application will need to distinguish the dev/prod PREFECT_HOME environment variables to define different locations for the Prefect DB (which is just a SQLite file). I prefer to do this in a task runner outside of the Python application, something like GitHub Actions, Make, or `just`.

## Justfile
I've found when managing a Prefect orchestrator it is helpful to have a taskrunner program that documents development tasks and executes them for you as well. I use [`just`](https://just.systems/) to launch `dev/main` Prefect DB local servers and manage tasks like Prefect deployment builds ßand running tests before merging and deploying. If you, like most data scientists, like to develop and test on the main branch please ignore this part of the package.
## Why Just Taskrunner
I've found when managing a Prefect orchestrator it's best to have a taskrunner program to codify and smooth out repetitive tasks. I use [`just`](https://just.systems/) to launch `dev/main` Prefect DB local servers and manage tasks like Prefect deployment builds and running tests before merging and deploying. The justfile provides self-documentation as the workflow evolves.
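
The dev/main split described in this file could be driven from a task runner with something like the shell sketch below. It is only an illustration, not the package's actual justfile: the function name `prefect_env_for_branch`, the two `PREFECT_HOME` directories, and port 4201 for the dev server are all hypothetical placeholders. `PREFECT_HOME` and `PREFECT_API_URL` are the settings named above.

```shell
#!/bin/sh
# Sketch: choose Prefect settings from a git branch name.
# ASSUMPTION: the directories and the dev port are placeholders for illustration only.

prefect_env_for_branch() {
    branch="$1"
    case "$branch" in
        main)
            # production-like local server and its own sqlite DB location
            PREFECT_HOME="$HOME/.prefect-main"
            PREFECT_API_URL="http://127.0.0.1:4200/api"
            ;;
        *)
            # every other branch is treated as dev
            PREFECT_HOME="$HOME/.prefect-dev"
            PREFECT_API_URL="http://127.0.0.1:4201/api"
            ;;
    esac
    export PREFECT_HOME PREFECT_API_URL
}

# Usage: take the branch from git, falling back to "dev" when git reports an error
branch="$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo dev)"
prefect_env_for_branch "$branch"
echo "branch=$branch PREFECT_HOME=$PREFECT_HOME PREFECT_API_URL=$PREFECT_API_URL"
```

Keeping the mapping in one function means a `just` recipe, a Makefile target, or a CI step can call it before launching the server or running tests, so the Python and Julia code never need to know which environment they are in.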
36 changes: 25 additions & 11 deletions justfile
@@ -10,19 +10,11 @@ default:

# info for developing/testing this package
info:
@echo "Optional on setup:"
@echo "Setting up Prefect Demo [Optional]:"
@echo " cd prefect/; just init"
@echo " * this intalls poetry package and get prefect local server running"
@echo " * this installs poetry package and gets prefect local server running"
@echo " * see docs 'Prefect Installation' section"
@echo
@echo "Typical dev workflow:"
@echo " git checkout -b issue-3/s3-read-write"
@echo " just repl; ] instantiate; add PKGS # as neeeded"
@echo " * code, write/edit tests *"
@echo " just build - this runs the server, tests, doctest, builds docs"
@echo " * now debug until its clean *"
@echo " git commit 'closes #3: s3 read/write'"
@echo " ... git merge"
@echo " vim Project.toml -> bump version number, commit."

# pass thru command
run *args:
@@ -52,3 +44,25 @@ kill:

# full cycle of launch server, test, docs, kill server
build: launch test docs kill

# dev workflow steps, a reminder
workflow:
@echo "Dev workflow:"
@echo " git checkout -b issue-3/s3-read-write"
@echo " just repl; ] instantiate; add PKGS # as needed"
@echo " code, write/edit tests"
@echo " 'just build' - this runs the server, tests, doctest, builds docs"
@echo " debug"
@echo " vim Project.toml -> bump version number"
@echo " git commit 'closes #3: s3 read/write'"
@echo " => pull request"
@echo " => git merge; git push"
@echo
@echo " Registrator & Tagbot on merge commit"
@echo " add comment to commit to get release as follows:"
@echo " @JuliaRegistrator register"
@echo
@echo " Release Notes:"
@echo
@echo " # Markdown Notes Here"
@echo " - blah blah"
5 changes: 4 additions & 1 deletion src/Datasets/Datasets.jl
@@ -85,7 +85,8 @@ end
read(ds::Dataset)
Returns a `DataFrame` by calling `CSV.read` on a filepath defined by the Dataset type.
*NOTE:* A prefect server must be available.
*NOTE:* A prefect server must be available to use Dataset read function.
# Examples
```julia
@@ -115,6 +116,8 @@ end
write(ds::Dataset, df::DataFrame)
Writes a `DataFrame` via `CSV.write` to a filepath defined by the `Dataset` type.
*NOTE:* A prefect server must be available to use Dataset write function.
"""
function write(
ds::Dataset
2 changes: 2 additions & 0 deletions src/config.jl
@@ -1,5 +1,7 @@
"""
PrefectAPI(url::String, key::SecretString) <:AbstractPrefectInterface
PrefectAPI(url::String)
PrefectAPI()
Mutable struct that stores the Prefect server API endpoint. All `PrefectInterface` operations depend on connecting to a running Prefect server to pull block information. Constructor with no arguments assigns env variables `PREFECT_API_URL`, `PREFECT_API_KEY`.
27 changes: 27 additions & 0 deletions src/prefectblock/prefectblocktypes.jl
@@ -127,6 +127,33 @@ struct CredentialPairBlock <: AbstractPrefectBlock
)
end

"""
S3BucketBlock(
blockname, blocktype, bucket_name, bucket_folder
, region_name, aws_access_key_id, aws_secret_access_key)
Corresponds with the Prefect S3Bucket block in the prefect-aws integration. Attached functions:
read_path("path/to/object.csv")
write_path("path/to/object.csv", df::AbstractDataFrame)
Returns or writes a DataFrame csv object at a relative key from the
block-defined `s3://bucket_name/bucket_folder/path/to/object.csv`.
# Examples:
```julia
# pull hypothetical existing block from Prefect DB server
julia> s3block = PrefectBlock("s3-bucket/willowdata")
S3BucketBlock("s3-bucket/willowdata", "s3-bucket", "willowdata", "data-folder/dev", "us-west-2"
, "AKIAEXAMPLEXXX", ####Secret####, ...)
julia> df = s3block.block.read_path("extracts/csv/dataset=test_table/rundate=2023-05-25/data.csv");
julia> s3block.block.write_path("testfolder/xanadu-test.csv", df)
p"s3://willowdata/data-folder/dev/testfolder/xanadu-test.csv"
```
"""
struct S3BucketBlock <: AbstractPrefectBlock
blockname::String
blocktype::String

2 comments on commit 4904372

@mahiki mahiki commented on 4904372 Feb 8, 2024

@JuliaRegistrator register

Release notes:

Documentation

  • justfile updates add workflow reminders
  • add docstrings and small documentation fixes

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/100497

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.3.1 -m "<description of version>" 49043722a006fc0ef8484b12a4ed4650427a3342
git push origin v0.3.1
