Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

365 default imputation links #47

Merged
merged 10 commits into from
Jul 17, 2024
Merged

365 default imputation links #47

merged 10 commits into from
Jul 17, 2024

Conversation

Jday7879
Copy link
Collaborator

@Jday7879 Jday7879 commented Jul 10, 2024

Summary

Implemented method for calculating default imputation link values based on specifications

Checklists

This pull request meets the following requirements:

  • installable with all dependencies recorded
  • runs without error
  • follows PEP8 and project specific conventions
  • appropriate use of comments, for example no descriptive comments
  • functions documented using Numpy style docstings
  • assumptions and decisions log considered and updated if appropriate
  • unit tests have been updated to cover essential functionality for a reasonable range of inputs and conditions
  • other forms of testing such as end-to-end and user-interface testing have been considered and updated as required
  • tests suite passes (locally as a minimum)
  • peer reviewed with review recorded

If you feel some of these conditions do not apply for this pull request, please
add a comment to explain why.

@Jday7879 Jday7879 marked this pull request as ready for review July 10, 2024 16:08
@rowanhemsi rowanhemsi self-requested a review July 11, 2024 09:32
Copy link
Collaborator

@rowanhemsi rowanhemsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of suggestions but this seems to cover the edge cases we are worried about as far as I understand them :)
The spec says that the matched pairs should be either missing or zero to indicate what type of default link case has happened - is this covered in the tests?

mbs_results/calculate_imputation_link.py Outdated Show resolved Hide resolved
mbs_results/calculate_imputation_link.py Show resolved Hide resolved
Comment on lines +101 to +108
actual_output = actual_output.rename(
columns={
"default_link_b_match": "default_backward",
"default_link_f_match": "default_forward",
"default_link_flag_construction_matches": "default_construction",
"flag_construction_matches_pair_count": "flag_match_pair_count",
}
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these need to be renamed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test for ratio of means are taken from the Pyspark versions which use different column names. It might be worth to bulk change these at some point, but there are about 70 files which would need to be updated. Happy to add this onto the tech backlog

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I agree that it's worth changing them but it's definitely not the top priority

@Jday7879
Copy link
Collaborator Author

Left a couple of suggestions but this seems to cover the edge cases we are worried about as far as I understand them :) The spec says that the matched pairs should be either missing or zero to indicate what type of default link case has happened - is this covered in the tests?

Unit test data is taken from the Pyspark method and have been built to account for zeros and nulls present in the data (From my understanding)

Updated the code in line with previous comments

@Jday7879 Jday7879 requested a review from rowanhemsi July 16, 2024 10:01
Copy link
Collaborator

@rowanhemsi rowanhemsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for responding to my comments, happy to approve :)

@Jday7879 Jday7879 merged commit e84ad97 into main Jul 17, 2024
3 checks passed
@Jday7879 Jday7879 deleted the 365-default-imputation-links branch July 17, 2024 08:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants