Prepare OpenMod PUDL Demo #2922

Closed
6 of 22 tasks
zaneselvans opened this issue Oct 8, 2023 · 1 comment

@zaneselvans
Member

zaneselvans commented Oct 8, 2023

Description

Preparations for our recurring 7-15 minute demonstration of PUDL at OpenMod US 2023. We want to show folks how to easily access and work with the data we publish using Jupyter notebooks, Datasette, nightly build outputs, etc.

Motivation

  • Get people aware of and excited about working with the open data we publish.
  • Give people enough of an intro that they feel able to play with the data on their own after the conference.
  • Target audience is folks that already have some domain knowledge (OpenMod attendees) but may have a variety of different technical backgrounds / familiarity with different sets of tools.

Scope

  • The PUDL Dataset on Kaggle is well documented (Usability of 9+ out of 10?)
  • The PUDL Dataset on Kaggle is being automatically updated based on nightly builds.
  • The notebooks associated with the PUDL Dataset on Kaggle are being automatically tested as the data evolves.
  • Our Datasette deployment is working and can handle a bit of a spike in new usage.
  • We are able to capture and analyze PUDL usage that results from this outreach.
    • AWS downloads
    • Datasette traffic
    • RTD traffic
    • Kaggle dataset views & downloads & notebook clones
    • Zenodo downloads
  • We have a 7-15 minute demonstration that we can run through with a new user which covers:
    • Interactive access & computation via Jupyter notebooks on Kaggle
    • Browsing and querying of data on Datasette
    • Bulk data download from the AWS Open Data Registry for local usage
    • Bulk data download from a versioned Zenodo archive.
    • Data Dictionaries that annotate the data on Read the Docs.

Out of Scope

  • Introducing users to PUDL development environment setup.
  • Introducing users to running the back-end / Dagster.

Comanche Notebook Outline:

  • Given narrative context around the plant, how do we find it in the data?
  • Create a table with some basic summary information about CO coal-fired generators.
  • Make a map of CO coal plants in 2010 vs 2022
    • Group generators by plant and primary fuel type, sum capacity
  • Now we know EIA plant ID is 470, generators are 1, 2, 3. Dig in there.
  • Using monthly EIA-923 data show:
    • total net generation in MWh
    • total fuel consumption in MMBTU
    • heat rate (thermal efficiency) in MMBTU / MWh
    • fuel costs in $/MWh
    • capacity factor
  • Using annual FERC Form 1 data show:
    • annually averaged non-fuel operating costs in $/MWh
    • annually averaged CapEx in $/MW of capacity
    • Note that fuel consumption, fuel cost, and net generation are also available in FERC 1, but are not as granular or reliable as in EIA-923.
    • Highlight existence of multiple ownership slices and complicated reporting if it shows up.
  • Using EPA CEMS:
    • Compare CEMS-derived monthly net generation, fuel consumption, capacity factors, and implied heat rates with those we got from the EIA-923.
    • Using hourly data, look at the structure of outages / operational loads.
    • Highlight frequent outages for unit 3. Low capacity factor isn't because of ramping. It's either on or off.
    • Calculate emissions.
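
The monthly EIA-923 metrics listed in the outline (heat rate, fuel cost per MWh, capacity factor) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the notebook's actual code: the `MonthlyRecord` class, its field names, and the example numbers are all hypothetical and do not correspond to real PUDL column names or to real Comanche data. It assumes capacity factor is net generation divided by nameplate capacity times the number of hours in the calendar month.

```python
import calendar
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonthlyRecord:
    """One generator-month of data. Field names are hypothetical, not PUDL columns."""
    year: int
    month: int
    net_generation_mwh: float
    fuel_consumed_mmbtu: float
    fuel_cost_usd: float
    capacity_mw: float


def hours_in_month(year: int, month: int) -> int:
    """Number of hours in a calendar month (ignores daylight saving shifts)."""
    return calendar.monthrange(year, month)[1] * 24


def monthly_metrics(rec: MonthlyRecord) -> dict[str, Optional[float]]:
    """Derive heat rate, fuel cost per MWh, and capacity factor for one month.

    Ratios that would divide by zero are returned as None.
    """
    gen = rec.net_generation_mwh
    max_possible_mwh = rec.capacity_mw * hours_in_month(rec.year, rec.month)
    return {
        # MMBTU of fuel burned per MWh generated (lower is more efficient)
        "heat_rate_mmbtu_per_mwh": rec.fuel_consumed_mmbtu / gen if gen > 0 else None,
        # Dollars of fuel per MWh generated
        "fuel_cost_usd_per_mwh": rec.fuel_cost_usd / gen if gen > 0 else None,
        # Share of the maximum possible output that was actually generated
        "capacity_factor": gen / max_possible_mwh if max_possible_mwh > 0 else None,
    }


# Made-up example values, chosen only to make the arithmetic easy to follow.
example = MonthlyRecord(
    year=2022,
    month=1,
    net_generation_mwh=37_200.0,
    fuel_consumed_mmbtu=372_000.0,
    fuel_cost_usd=744_000.0,
    capacity_mw=100.0,
)
print(monthly_metrics(example))
```

In a notebook the same ratios would more likely be computed column-wise on a dataframe; the per-record form here is only meant to make the definitions explicit.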

Minimum Requirements

Example Notebooks

Stretch Goals

@jdangerx
Member

jdangerx commented Nov 9, 2023

Minimal to-do list to get something we can demo, if @zaneselvans is indisposed:

  • work on intro
    • what is PUDL?
      • we connect EIA/EPA/FERC data so that we can tell better stories about the energy system
    • for this demo, we'll do an example: let's tell a story about comanche! it has problems and is closing
  • look at CEMS data
  • filter to comanche using plants_eia data
  • plot runtime % of comanche vs. other plants (what even are other similar plants? big coal plants?)
  • plot operations + maintenance expenses from FERC, for comanche - compare to the runtime % and also to other, stabler plants
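
One possible way to define the "runtime %" mentioned above, sketched in plain Python. This is an assumption about what the metric could mean, not a definition from the issue: an hour counts as "running" when the reported hourly load is greater than zero, and missing hours (`None`) are treated as not running. The function names and the sample values are hypothetical.

```python
from typing import Iterable, Optional


def runtime_fraction(hourly_load_mw: Iterable[Optional[float]]) -> Optional[float]:
    """Fraction of hours in which the unit reported a positive load.

    Missing values (None) count as not running, which is an assumption.
    Returns None when there are no hours at all.
    """
    loads = list(hourly_load_mw)
    if not loads:
        return None
    running_hours = sum(1 for mw in loads if mw is not None and mw > 0)
    return running_hours / len(loads)


def mean_load_when_running(hourly_load_mw: Iterable[Optional[float]]) -> Optional[float]:
    """Average load over only the hours in which the unit was running."""
    running = [mw for mw in hourly_load_mw if mw is not None and mw > 0]
    return sum(running) / len(running) if running else None


# Made-up hourly loads for a unit that is either fully on or fully off.
sample_loads = [0.0, 0.0, 500.0, 500.0, None, 500.0, 0.0, 500.0]
print(runtime_fraction(sample_loads))
print(mean_load_when_running(sample_loads))
```

Comparing the two numbers helps separate a unit that runs rarely but at full output from one that runs often at partial load, since both can have the same capacity factor.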

@jdangerx jdangerx moved this from New to In progress in Catalyst Megaproject Nov 11, 2023
@zaneselvans zaneselvans moved this from In progress to Done in Catalyst Megaproject Nov 14, 2023