Config reader with expected schema validation #7
"fundamental" is quite vague
General comment: I think these are all character strings, but I find it helpful when documentation specifies the type explicitly.
What happens if `local_dest` doesn't exist?
My only slight hesitation with UUIDs for the job IDs is that, if we run multiple jobs, it would be a bit harder to know which job is which. That said, it would mean we never run into the annoying error "This job already exists" because we forgot to delete it.
What about, e.g., `Rt-estimation-2024-08-08T10:08:34` as the job name?
This assumes that you will pass in a UUID that is generated somewhere else, right?
Probably not for this PR, but I would add more metadata. For example, if this job ID is under the "EpiNow2" umbrella, I would want the job to be named after the package being used.
I like @natemcintosh's suggestion as long as (1) we store the timestamp inside the metadata, not just in the path name, and (2) there are no special-character concerns with using this as a path name 😬
I was testing out this naming scheme idea on something else, and discovered that Azure was not happy with `:`, so I replaced it with `-`. So this might be something more like `Rt-estimation-2024-08-08T10-08-34`.
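For illustration only, here is a minimal Python sketch of the naming scheme discussed in this thread. The helper name `make_job_name` and the choice of UTC are assumptions, not part of the PR; the only thing taken from the thread is the name format and the swap of `:` for `-`.

```python
from datetime import datetime, timezone
from typing import Optional


def make_job_name(prefix: str, when: Optional[datetime] = None) -> str:
    """Build a job name such as 'Rt-estimation-2024-08-08T10-08-34'.

    `make_job_name` is a hypothetical helper, not a function from the PR.
    """
    # Assumption: default to the current UTC time when no timestamp is given.
    when = when or datetime.now(timezone.utc)
    # Use '-' instead of ':' in the time part, since ':' was reported
    # in this thread as not accepted by Azure.
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S")
    return f"{prefix}-{stamp}"


print(make_job_name("Rt-estimation", datetime(2024, 8, 8, 10, 8, 34)))
# prints "Rt-estimation-2024-08-08T10-08-34"
```

The same `when` value could also be written into the job metadata, so the timestamp does not live only in the name.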
See comment below
Where do passed-in PMFs go in this workflow? Are they a part of parameters?
These are needed so that we can do things like:
What is the logical relationship between `as_of_date` here and `report_date` below inside the data block? Do we need to enforce / validate this relationship somewhere?
I'm sure this is explained later, but what does "data" mean?
Just want to make sure I understand: currently this example is for "EpiNow2", but you want to be able to swap this for another package name, right?
Each report date runs on all the reference dates?
Hmmmmm -- yeah this is a good flag. That's a bad assumption. Let me revisit.
Is this intended to be, at least for the EpiNow2 example, the vector of dates corresponding to the time series data passed in? E.g., the date of admissions.
And for EpiNow2, you would just have a single report date? So, for example, for this week, if I ran EpiNow2 today, assuming old NHSN data reporting, I'd have: `as_of_date = "2024-08-08"`, `report_date = "2024-08-07"`, and `reference_date` = a vector of dates going back some specified calibration period up until "2024-08-02" (last Friday).
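If the relationship between these dates does need to be enforced, one possible check is sketched below in Python. The ordering rule (every reference date on or before `report_date`, and `report_date` on or before `as_of_date`) is an assumption inferred from the example values in this comment; it is not confirmed anywhere in the thread, and `check_date_order` is a hypothetical name.

```python
from datetime import date
from typing import List


def check_date_order(as_of_date: str, report_date: str,
                     reference_dates: List[str]) -> None:
    """Raise ValueError if the dates violate the assumed ordering.

    Assumed rule (not confirmed by the PR):
    reference_date <= report_date <= as_of_date.
    """
    as_of = date.fromisoformat(as_of_date)
    report = date.fromisoformat(report_date)
    refs = [date.fromisoformat(d) for d in reference_dates]

    if report > as_of:
        raise ValueError(
            f"report_date {report} is after as_of_date {as_of}"
        )
    # Collect any reference dates that fall after the report date.
    late = [d for d in refs if d > report]
    if late:
        raise ValueError(
            f"{len(late)} reference date(s) fall after report_date {report}"
        )


# The example values from the comment above pass the check.
check_date_order("2024-08-08", "2024-08-07", ["2024-07-26", "2024-08-02"])
```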
What if we want to point to more than one container to pull in parameters?
Would you always need to pass in these arguments, or is this specific to EpiNow2? I think there might be other packages where you would handle specifying priors differently (e.g., in the ww package we're developing, priors are lumped in with parameters...)
My assumption is that you'd have to edit this in another version of the repo to enforce that schema, but I think the framework is adaptable!
Given that something like `fetch_config` is not `EpiNow2`-specific, there is some argument that it shouldn't be in this package. It seems like if we are to have another package, `cfa-newpackage-pipeline`, then it'll also need this function. So the schema is `EpiNow2`-specific, but the surrounding functions are not.
I think we want to add specification of the ls mean via gp_opts() https://epiforecasts.io/EpiNow2/reference/gp_opts.html
why integer instead of number?
Can we add the number of samples run here as a parameter? It seems like something we could want to change at runtime, and I think it's helpful to encode it explicitly instead of using the defaults.
As written, how would you point to multiple data sources when it seems that `data` just has one `path` option?
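One way this could be handled, sketched in Python under the assumption that the schema were relaxed so `path` accepts either a single string or a list of strings. That relaxation is not in the PR as written, and the helper name `data_paths` and the file names below are hypothetical.

```python
from typing import List


def data_paths(config: dict) -> List[str]:
    """Return the data paths as a list, whether one or many were given.

    Assumes a hypothetical schema where config["data"]["path"] may be
    either a string or a list of strings.
    """
    path = config["data"]["path"]
    if isinstance(path, str):
        # A single path is wrapped so callers can always iterate.
        return [path]
    return list(path)


single = {"data": {"path": "gold/2024-08-08.parquet"}}
multiple = {"data": {"path": ["gold/a.parquet", "gold/b.parquet"]}}

print(data_paths(single))
print(data_paths(multiple))
```

Downstream code can then loop over the result without caring which form the config used.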
Use more descriptive test name?
Same: use more descriptive test name?