Add trtAssignment as a "distribution" to defData to ease flow? #69
Comments
I see what you mean, makes sense. I will likely have time to look it over at the end of the week. |
Should be able to use trtAssign code, or some of it. I guess we would keep trtAssign as well. |
This just sparked an idea... we could possibly rework the data definitions / "dist" column to contain not only distributions but also "modifications" that are applied to the data, so: distributions, trtAssign, addMissing, user-defined functions (#71) .... This would definitely be quite some work but could be of value, and as we are considering breaking changes in several different places, it might be a good time to implement such sweeping changes (maybe as simstudy 1.0.0?) |
I see the appeal of that, though I do have to say I like the current flow of keeping the missing data process separate from the underlying (true) data generation process. They are two different processes, so I would like to keep them separate. As you know, though, I am very keen on being able to define the randomized treatment assignment in the data definition - that to me is a key part of the underlying data generation process. And the truncation, obviously. Maybe excluding the missing data from this will simplify things so that it is not as big a lift once the new |
Currently, the treatment assignment process using trtAssign breaks the flow of the creation of a data set. Usually there is an outcome variable that is a function of the treatment assignment - so that we need to add a column to the table after the treatment assignment is made. What if we added a "trtAssign" distribution to the data def table so that the treatment assignment can be part of a single data generation process? It would look like this:
The formula for trtAssign represents the treatment assignment ratio and defaults to "1;1", but it could be of any length - so, if it is "1;1;1;2", that would be four groups. The variance parameter represents the stratification. Multiple levels of stratification would be represented as "a;b;c", where a, b, and c are variable names (these really need to be categorical variables or factors). The functionality is exactly as it is in the function trtAssign.
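For reference, here is a minimal sketch of the three-step flow described above, using the existing simstudy functions defData, genData, trtAssign, defDataAdd, and addColumns. The variable names (male, rx, y) and the outcome formula are made up for illustration. The commented-out definition at the end is only my reading of the proposal (ratio in formula, strata in variance); it is an assumption about the proposed syntax, not something that exists as of this issue.

```r
library(simstudy)

set.seed(1)

# Baseline covariate (illustrative name and parameters)
def <- defData(varname = "male", dist = "binary", formula = 0.5, id = "id")

# The outcome depends on the treatment group "rx", so it has to live in a
# separate definition table that is applied after the assignment is made
defY <- defDataAdd(varname = "y", dist = "normal",
                   formula = "2 + 3 * rx", variance = 1)

dd <- genData(100, def)                            # step 1: baseline data
dd <- trtAssign(dd, nTrt = 2, balanced = TRUE,
                strata = "male", grpName = "rx")   # step 2: stratified assignment
dd <- addColumns(defY, dd)                         # step 3: outcome uses rx

# Hypothetical single-table version sketched from the description in this
# issue (NOT existing API at the time of writing):
# def <- defData(def, varname = "rx", dist = "trtAssign",
#                formula = "1;1", variance = "male")
```

With two arms, trtAssign codes the groups as 0 and 1, which is why rx can be used directly in the outcome formula.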