Some FERC filings share a ReportDate or CertifyingOfficialDate, but were published at different times. #2822
zaneselvans added the ferc1 (Anything having to do with FERC Form 1) and xbrl (Related to the FERC XBRL transition) labels on Sep 20, 2023
This was referenced Sep 25, 2023
This causes data to get dropped when we are reading data from the |
jdangerx added a commit that referenced this issue on Oct 6, 2023
* Update to use new version of ferc-xbrl-extractor
* Fix issues arising from stricter typing used in pandas 2.1
* Use integer transmission circuits.
* Remove obsolete references to ferc1_schema tests.
* Make new extractor compatible with 2021 data

  The new extractor added some data to the 2021 XBRL archives. This caused some integration and validation test fails. I added some plants to the pudl_id mapping spreadsheet, all of which are considered totals. I.e., not real plants, but we're mapping them for the sake of giving them an ID (they are not connected to EIA records). Because this is how we treat other total records reported to FERC1.

  This also updates the way that values were assigned to a slice of the ferc1_eia_train output spreadsheets. NA values were causing an issue, so I had to change how the values were being converted.

  This also updates the test_minmax_rows test to reflect the new rows in the 2021 data.

* Add a few plants to pudl_id_mapping

  Totally new:
  * 18012: pjm interconnection, llc / total
  * 18013: new york state electric & gas corporation / see footnote
  * 18014: southwest power pool, inc. / total
  * 18015: public service company of colorado / community solar gardens
  * 18016: the empire district electric company / n/a each & 73 units at 2.52 mw each)
  * 18017: wisconsin electric power company / see footnote
  * 18018: upper michigan energy resources company (pudl determined) / total
  * 18019: new york transco, llc / total
  * 18020: wilderness line holdings, llc / total
  * 18021: mt. carmel public utility co / total

  Mapped to existing PUDL ID:
  * 8671: pacific gas & electric company, small hydroelectric generating plants
  * 15000: idaho power company / hydro
  * 15001: idaho power company / internal combustion
  * 15068: public service company of colorado / conventional hydro
  * 12926: midamerican energy company / ida grove ii wind farm (8 units at 2.3 mw
  * 1287: alaska electric light and power company / salmon creek hyrdo

  Note the misspelling of the plant name in 1287.

  Changed:
  * 15031: mt. carmel public utility co / not applicable -> ameren illinois company / not applicable

  This one had a mismatch between utility_id_ferc 222, which corresponds to Ameren, not Mt. Carmel (397).

* Update validation test expectations.

  There are some missing data due to messy deduplication: #2822
  But we'll do the deduplication better in here: #2899

---------

Co-authored-by: zschira <[email protected]>
Co-authored-by: Zane Selvans <[email protected]>
Co-authored-by: Austen Sharpe <[email protected]>
For FERC forms 1, 2, 6, and 60, we expect each filing to include a ReportDate fact, and for FERC 714 we expect each filing to include a CertifyingOfficialDate fact. We use these in the ferc-xbrl-extractor to order the filings by recency, so we can merge all the filings and use the most recent data we have for any given fact.
However, these only have day-level granularity, and often filings share the same date fact but are published at different times. We can tell because we can see the publish time with high granularity from the RSS feed metadata, which we already track.
To avoid ambiguity, we should use that RSS feed metadata instead of the report's self-reported date to determine which report should take precedence.
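A minimal sketch of the idea, not the ferc-xbrl-extractor's actual code: the Filing class, its field names, the merge_facts helper, and all sample values below are hypothetical and exist only to illustrate the tie on the day-level date and how a publish timestamp would break it.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Filing:
    """Hypothetical stand-in for one XBRL filing (not the extractor's real type)."""

    name: str
    report_date: date        # self-reported ReportDate fact, day-level only
    published_at: datetime   # publish time taken from the RSS feed metadata
    facts: dict[str, float]  # fact name -> reported value


def merge_facts(filings: list[Filing]) -> dict[str, float]:
    """Merge facts so that the most recently published filing wins."""
    merged: dict[str, float] = {}
    # Sort oldest to newest by publish time; later filings overwrite earlier ones.
    for filing in sorted(filings, key=lambda f: f.published_at):
        merged.update(filing.facts)
    return merged


# Made-up example: two filings with the same report date, published hours apart.
original = Filing(
    name="original",
    report_date=date(2022, 4, 18),
    published_at=datetime(2022, 4, 18, 9, 30),
    facts={"plant_cost": 100.0, "capacity_mw": 50.0},
)
resubmission = Filing(
    name="resubmission",
    report_date=date(2022, 4, 18),
    published_at=datetime(2022, 4, 18, 16, 5),
    facts={"plant_cost": 120.0},
)

# The report dates tie, so they cannot tell us which filing is newer.
assert original.report_date == resubmission.report_date

# Input order does not matter: the publish time decides precedence.
merged = merge_facts([resubmission, original])
print(merged)
```

With the publish time as the sort key, the resubmitted plant_cost value takes precedence while capacity_mw, which only the earlier filing reported, is retained.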