feat: use bhepop2 package for income assignment #243
describe attribute modalities when giving attribute selection
The municipality stage now returns a DataFrame containing income deciles per attribute in addition to the usual global deciles. Two columns, "attribute" and "modality", have been added to specify the related attribute and the value it takes (its modality). For global deciles, both attribute and modality are "all". A filter on the "all" attribute and modality has been added wherever data.income.municipality was used.
merge municipality_attributes.py into municipality.py
move compare_methods.py to analysis/methods/income/
created a utils.py module in synthesis/population/income/ to store common functions
added test dataset and tests
---------
Co-authored-by: leo-desbureaux-tellae <[email protected]>
Hi! Thanks a lot, I'm at hEART this week and a bit busy the week after, but I'll look at it asap!
Thanks, I made a couple of minor comments, but this looks very good :)
Thanks a lot, LGTM
Issues from your first review are fixed! Let me know if there is something more that needs changes.
Resolved conflict in changelog
Thanks, looks all good, you can merge!
Perfect, thank you very much @leo-desbureaux-tellae, thanks @sebhoerl for the review!! ⛵ 🚀
Introduction of the Bhepop2 package for income assignment
data.income.municipality
The distributions DataFrame returned by data.income.municipality is now tagged by attribute and modality.
An attribute describes a property of the population agents. A modality is a value taken by this attribute.
In Eqasim, we use two attributes:
The columns of the returned DataFrame are now ["commune_id", "q1", "q2", "q3", "q4", "q5", "q6", "q7", "q8", "q9", "attribute", "modality", "is_imputed", "is_missing", "reference_median"]
Global distributions (those that were returned in the previous version of municipality.py) are tagged with attribute and modality "all".
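As a minimal sketch of the filter mentioned above, the snippet below recovers the global distributions from a tagged DataFrame. The toy DataFrame only shows a subset of the columns and its values are invented for illustration; it stands in for the actual output of data.income.municipality.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by data.income.municipality.
# Only a subset of the columns is shown; all values are invented.
df = pd.DataFrame({
    "commune_id": ["01001", "01001", "01002"],
    "q5": [21000.0, 15000.0, 23000.0],
    "attribute": ["all", "family_comp", "all"],
    "modality": ["all", "Single_parent", "all"],
})

# Keep only the global distributions, i.e. the rows that the previous
# version of municipality.py would have returned.
df_global = df[(df["attribute"] == "all") & (df["modality"] == "all")]
```

Code that only needs the global deciles can apply this filter and otherwise keep working as before.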
synthesis.population.income
New config option "income_assignation_method" (should it be "income_assignment_method" ?). This option selects the method used to assign an income to population agents.
The former method is selected with the value "uniform" (what should be the default config ?).
A new assignation method has been added, selected with the value "bhepop2". This method uses the Bhepop2 package to match per-attribute distributions instead of matching only the global one.
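The option could be read and validated along these lines. This is a hypothetical sketch rather than the PR's code: the plain dict config and the helper name select_income_method are assumptions. Because the default value is still an open question, the sketch requires the option to be set explicitly.

```python
# Hypothetical sketch: the config is modelled as a plain dict and the helper
# name is a placeholder, not the actual pipeline code.
VALID_METHODS = ("uniform", "bhepop2")


def select_income_method(config):
    """Return the configured income assignation method after validating it."""
    if "income_assignation_method" not in config:
        # No default is assumed here, since the default is still undecided.
        raise KeyError("income_assignation_method is not set")
    method = config["income_assignation_method"]
    if method not in VALID_METHODS:
        raise ValueError(
            f"Unknown income assignation method {method!r}, "
            f"expected one of {VALID_METHODS}"
        )
    return method


chosen = select_income_method({"income_assignation_method": "bhepop2"})
```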
analysis.methods.income.compare_methods
A new analysis module has been added to compare the assignation methods. It can be run using the path analysis.methods.income.compare_methods. This module generates plots comparing income distributions of each assignation method and the source distribution (here, Filosofi data).
This comparison is done per attribute. For instance, we compare the income distribution of individuals with attribute "family_comp" equal to "Single_parent" for the two methods, and see which method best matches the source distribution.
Another output of this module is a table measuring the distance of each method to the source distribution, again per attribute. This allows a more quantitative comparison between assignation methods.
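To illustrate what such a distance table could look like, here is a small sketch. The metric (mean absolute relative gap between deciles), the method names and all numbers are assumptions made for the example; the actual module may use a different distance.

```python
def decile_distance(simulated, reference):
    """Mean absolute relative gap between two decile lists (assumed metric)."""
    if len(simulated) != len(reference):
        raise ValueError("Both distributions must have the same number of deciles")
    gaps = [abs(s - r) / r for s, r in zip(simulated, reference)]
    return sum(gaps) / len(gaps)


# Invented source deciles (q1..q9) for one attribute/modality pair.
reference = {
    ("family_comp", "Single_parent"): [
        10000, 12000, 14000, 16000, 18000, 20000, 22000, 24000, 26000,
    ],
}

# Invented deciles produced by two placeholder methods for the same pair.
simulated = {
    "method_a": {
        ("family_comp", "Single_parent"): [
            11000, 13200, 15400, 17600, 19800, 22000, 24200, 26400, 28600,
        ],
    },
    "method_b": {
        ("family_comp", "Single_parent"): [
            10000, 12000, 14000, 16000, 18000, 20000, 22000, 24000, 26000,
        ],
    },
}

# One entry per (method, attribute, modality): distance to the source deciles.
table = {
    (method, attribute, modality): decile_distance(deciles, reference[(attribute, modality)])
    for method, per_key in simulated.items()
    for (attribute, modality), deciles in per_key.items()
}
```

A lower value means that the method reproduces the source deciles more closely for that attribute and modality.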