One of the manual data sources we use to augment core_epa__assn_eia_epacamd in OGE is eGRID_crosswalk_of_EIA_ID_to_EPA_ID, which is copied from the appendix of the eGRID technical documentation. It contains some new crosswalks from the most recent eGRID years that are not in the current pudl version of the table.
Overview
In OGE, we are currently attempting to eliminate our dependency on the pudl codebase, and instead entirely rely on the pudl output tables. However, over the past few years, our pipeline for generating subplant_ids has diverged from the one used in PUDL, and so to be able to rely directly on pudl's subplant_id outputs, I'm creating this issue to begin thinking through how we can incorporate the updates we've made into pudl. @catalyst-cooperative/com-dev
In OGE, our subplant identification code lives in this file: https://github.com/singularity-energy/open-grid-emissions/blob/development/src/oge/subplant_identification.py. This module has 4 main functions:
- generate_subplant_ids: this is our version of core_epa__assn_eia_epacamd_subplant_ids()
- manually_update_subplant_id: this is identical to pudl.etl.glue_assets.manually_update_subplant_id()
- update_subplant_ids: similar to the version that exists in pudl, but we fixed various bugs in this function in singularity-energy/open-grid-emissions#353 ("Enforce static subplant_id across years") that should be integrated into pudl
- connect_ids: similar to the version that exists in pudl, but we fixed various bugs in this function in singularity-energy/open-grid-emissions#353 ("Enforce static subplant_id across years") that should be integrated into pudl

Our most recent major changes to our subplant_id pipeline are contained in singularity-energy/open-grid-emissions#353, which attempted to keep the subplant_ids as static as possible. This had previously been an issue for us because we were creating these IDs independently for a single year at a time, but you may have had less of an issue with that, since you were likely already generating these IDs based on all years of data.
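To illustrate why generating IDs one year at a time can make them unstable, here is a minimal sketch. It is not the OGE or pudl implementation: it assumes a subplant is simply a connected group of units and generators, and the helper names (`connected_groups`, `assign_ids`) and the example links are hypothetical.

```python
from collections import defaultdict


def connected_groups(links):
    """Group nodes joined by any link, using a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Sort so that the ID assignment below is deterministic.
    return sorted(sorted(group) for group in groups.values())


def assign_ids(links):
    """Give every node the index of its connected group (hypothetical ID scheme)."""
    return {
        node: group_id
        for group_id, group in enumerate(connected_groups(links))
        for node in group
    }


# Hypothetical unit-to-generator links reported in two different years.
links_by_year = {
    2020: [("unit_1", "gen_A"), ("unit_2", "gen_B")],
    2021: [("unit_1", "gen_A"), ("unit_2", "gen_A"), ("unit_2", "gen_B")],
}

# IDs built independently per year: unit_2 changes ID between 2020 and 2021.
per_year = {year: assign_ids(links) for year, links in links_by_year.items()}

# IDs built once from the links of all years: one assignment shared by every year.
all_links = [link for links in links_by_year.values() for link in links]
static = assign_ids(all_links)
```

In this toy example the per-year approach splits the plant into two groups in 2020 and merges them in 2021, while the all-years approach yields a single grouping that can be reused for every year.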
Areas to investigate for harmonization
core_epa__assn_eia_epacamd
In oge.load_data.load_epa_eia_crosswalk(), we augment this table using two manual data sources:
We also augment this table to add all plant, generator pairs before passing it into pudl.analysis.epacamd_eia.filter_crosswalk(). Looking through your code, it looks like you may no longer be using the filter_crosswalk() function anywhere?
Success Criteria
We can stop importing pudl.analysis.epacamd_eia and pudl.etl.glue_assets into OGE, and the two subplant tables are consistent.
Next steps
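One possible early step is a check for the consistency criterion. The sketch below is only an illustration under stated assumptions: it treats "consistent" as meaning that both tables group the same records into the same subplants, even if the subplant_id labels differ. The function name `partitions_match`, the simplified (plant_id_eia, generator_id) keys, and the example values are all hypothetical.

```python
def partitions_match(a, b):
    """Return True if two {key: subplant_id} mappings group keys identically.

    The labels themselves may differ; only the grouping has to agree.
    """
    if a.keys() != b.keys():
        return False
    a_to_b, b_to_a = {}, {}
    for key in a:
        id_a, id_b = a[key], b[key]
        # Each label in one table must map to exactly one label in the other.
        if a_to_b.setdefault(id_a, id_b) != id_b:
            return False
        if b_to_a.setdefault(id_b, id_a) != id_a:
            return False
    return True


# Hypothetical records keyed by (plant_id_eia, generator_id).
oge_ids = {(3, "1"): 0, (3, "2"): 0, (3, "3"): 1}
pudl_ids_relabelled = {(3, "1"): 1, (3, "2"): 1, (3, "3"): 2}
pudl_ids_split = {(3, "1"): 1, (3, "2"): 2, (3, "3"): 3}

same_grouping = partitions_match(oge_ids, pudl_ids_relabelled)
different_grouping = partitions_match(oge_ids, pudl_ids_split)
```

Here `same_grouping` is True because the two tables differ only in labels, and `different_grouping` is False because the second table splits generators 1 and 2 into separate subplants. If the goal is instead identical subplant_id values, a plain equality comparison of the two mappings would be the stricter check.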