Indicator cdc vaccines in progress #1312
base: main
Co-authored-by: Katie Mazaitis <[email protected]>
Co-authored-by: Dmitry Shemetov <[email protected]>
Nits suggested; also when I run this I get the following exception:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/delphi_cdc_vaccines/__main__.py", line 12, in <module>
    run_module(read_params())  # pragma: no cover
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/delphi_cdc_vaccines/run.py", line 43, in run_module
    all_data = pull_cdcvacc_data(base_url, logger)
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/delphi_cdc_vaccines/pull.py", line 85, in pull_cdcvacc_data
    df.columns = ["fips",
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/env/lib/python3.8/site-packages/pandas/core/generic.py", line 5500, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/env/lib/python3.8/site-packages/pandas/core/generic.py", line 766, in _set_axis
    self._mgr.set_axis(axis, labels)
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/env/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 216, in set_axis
    self._validate_set_axis(axis, new_labels)
  File "/home/krivard/projects/covid/dev/covidcast-indicators/cdc_vaccines/env/lib/python3.8/site-packages/pandas/core/internals/base.py", line 57, in _validate_set_axis
    raise ValueError(
ValueError: Length mismatch: Expected axis has 11 elements, new values have 10 elements
Working on this now! It seems the CDC changed their base file recently.
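For reference, the length mismatch above comes from assigning a fixed list of names to `df.columns` by position, which fails as soon as the source file gains or loses a column. One way to make the pull step less sensitive to that is to select and rename columns by name. This is only a sketch: apart from `fips`, which appears in the traceback, the source and target column names in `RENAME` and the helper `select_and_rename` are hypothetical and not taken from `pull.py`.

```python
import pandas as pd

# Hypothetical mapping from source column names to internal names.
# Only "fips" is known from the traceback; the rest are placeholders.
RENAME = {
    "FIPS": "fips",
    "Date": "timestamp",
    "Series_Complete_Yes": "cumulative_counts_tot_vaccine",
}


def select_and_rename(df: pd.DataFrame, rename: dict) -> pd.DataFrame:
    """Keep only the mapped columns and rename them by name, not by position."""
    missing = [col for col in rename if col not in df.columns]
    if missing:
        # Fail with a message that names the columns the source no longer has.
        raise ValueError(f"Source file is missing expected columns: {missing}")
    # Extra columns added upstream are dropped instead of shifting positions.
    return df[list(rename)].rename(columns=rename)


# Example: an upstream file that has gained an unexpected extra column.
raw = pd.DataFrame({
    "FIPS": ["01001"],
    "Date": ["2021-08-10"],
    "Series_Complete_Yes": [5],
    "New_Upstream_Column": [1],
})
clean = select_and_rename(raw, RENAME)
```

With this approach a new upstream column is ignored, and a removed or renamed one produces an error that names the missing column rather than a bare length mismatch.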
As a note, we will probably need to update the source if we want to include data on booster shots: https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-Jurisdi/unsk-b7fc
Sorry for taking so long to get to this review, @Ananya-Joshi. I am getting a test error when I pull and test locally; see below.
test_run.py F... [100%]
=========================================================== FAILURES ===========================================================
_______________________________________________ TestRun.test_output_files_exist ________________________________________________

self = <test_run.TestRun object at 0x7f2b733ee0a0>

    def test_output_files_exist(self):
        """Test that the expected output files exist."""
        run_module(self.PARAMS)
        csv_files = [f for f in listdir("receiving") if f.endswith(".csv")]
        dates = [
            "20210810",
            "20210811",
            "20210812",
            "20210813",
            "20210814",
            "20210815",
            "20210816",
            "20210817",
        ]
        geos = ["state", "hrr", "hhs", "nation", "msa"]
        expected_files = []
        for metric in ["cumulative_counts_tot_vaccine",
                       "incidence_counts_tot_vaccine",
                       "cumulative_counts_tot_vaccine_12P",
                       "incidence_counts_tot_vaccine_12P",
                       "cumulative_counts_tot_vaccine_18P",
                       "incidence_counts_tot_vaccine_18P",
                       "cumulative_counts_tot_vaccine_65P",
                       "incidence_counts_tot_vaccine_65P",
                       "cumulative_counts_part_vaccine",
                       "incidence_counts_part_vaccine",
                       "cumulative_counts_part_vaccine_12P",
                       "incidence_counts_part_vaccine_12P",
                       "cumulative_counts_part_vaccine_18P",
                       "incidence_counts_part_vaccine_18P",
                       "cumulative_counts_part_vaccine_65P",
                       "incidence_counts_part_vaccine_65P"]:
            for date in dates:
                for geo in geos:
                    expected_files += [date + "_" + geo + "_" + metric + ".csv"]
                    if not("cumulative" in metric) and not (date in dates[:6]):
                        expected_files += [date + "_" + geo + "_" + metric + "_7dav.csv"]
        print(set(csv_files)-set(expected_files))
>       assert set(csv_files) == set(expected_files)
E       AssertionError: assert {'20210810_hh...12P.csv', ...} == {'20210810_hh...12P.csv', ...}
E         Extra items in the left set:
E         '20210818_state_incidence_counts_tot_vaccine_18P.csv'
E         '20210819_state_incidence_counts_tot_vaccine_65P_7dav.csv'
E         '20210819_nation_cumulative_counts_tot_vaccine_18P.csv'
E         '20210819_hhs_incidence_counts_tot_vaccine_18P.csv'
E         '20210818_hhs_incidence_counts_part_vaccine_12P.csv'
E         '20210819_hhs_cumulative_counts_part_vaccine.csv'...
E
E         ...Full output truncated (236 lines hidden), use '-vv' to show

test_run.py:71: AssertionError
Can you also please merge main into this when you can, so the diffs are a bit easier to read?
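The extra files shown in the assertion output are all dated 20210818 or 20210819, which are not in the test's `dates` list. To see both directions of the mismatch at once (extra and missing), the expected set can be built with `itertools.product` and compared explicitly. This is a sketch, not the test in the PR: the helper names `expected_filenames` and `diff_report` and the parameter `n_unsmoothed` are made up for illustration, and the naming rule is copied from the test body quoted above.

```python
from itertools import product


def expected_filenames(dates, geos, metrics, n_unsmoothed=6):
    """Build the expected CSV names using the same rule as the quoted test.

    Every (metric, date, geo) combination gets a raw file. Incidence metrics
    also get a 7-day-average file, except on the first n_unsmoothed dates.
    """
    expected = set()
    for metric, date, geo in product(metrics, dates, geos):
        expected.add(f"{date}_{geo}_{metric}.csv")
        if "cumulative" not in metric and date not in dates[:n_unsmoothed]:
            expected.add(f"{date}_{geo}_{metric}_7dav.csv")
    return expected


def diff_report(actual, expected):
    """Return files that are unexpectedly present and files that are absent."""
    actual, expected = set(actual), set(expected)
    return {
        "extra": sorted(actual - expected),
        "missing": sorted(expected - actual),
    }


# Small example with one geo and two metrics.
dates = ["20210810", "20210811", "20210812", "20210813",
         "20210814", "20210815", "20210816", "20210817"]
expected = expected_filenames(
    dates,
    geos=["state"],
    metrics=["cumulative_counts_tot_vaccine", "incidence_counts_tot_vaccine"],
)
actual = expected | {"20210818_state_incidence_counts_tot_vaccine.csv"}
report = diff_report(actual, expected)
```

Asserting on `report["extra"]` and `report["missing"]` separately would show whether the run exports dates beyond the expected range, omits some files, or both.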
I don't want to block this indicator any further. It seems fine to me; I ran through the pipeline pretty closely. Just a failing test, a merge, and a small suggestion, and we're good!
Pushing a small commit to make the diff code suggestion pass.
New PR for the new, fixed branch of the CDC indicator. Only the commits relevant to this indicator should be in this PR.