fix: remove null URNs from census data #1759

PsypherPunk · 2025-01-10T15:09:37Z

Context

Minor follow-up to #1748 to tidy some inconsistencies seen in the output data.

Change proposed in this pull request

- explicitly drop any null entries from the URN data before setting the index
- simplify the joining of pupil and workforce data (`join` will default to using the index)

Guidance to review

`.dropna(subset=["URN"])`

The workforce data contains a row with a null URN for some of the earlier years. Previously, this row was dropped because the inner join found no match for it in the pupil data. Following #1688, however, the outer join retains it, resulting in an index that mixes integers (the URNs) and floats (the null is represented as a float).
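A minimal sketch of the behaviour described above, using invented data. The frames, column values and the `index_workforce` helper are hypothetical and are not the pipeline's actual code; the toy frames will also not reproduce the exact dtypes of the real census data, only the stray null index entry.

```python
import pandas as pd

# Hypothetical pupil data, indexed on URN.
pupils = pd.DataFrame(
    {"URN": [100001, 100002], "Pupils": [250, 310]}
).set_index("URN")

# Hypothetical workforce data containing one row with a null URN.
workforce = pd.DataFrame(
    {"URN": [100001, None, 100003], "Teachers": [12, 3, 20]}
)


def index_workforce(df: pd.DataFrame, drop_null_urns: bool) -> pd.DataFrame:
    """Set URN as the index, optionally dropping null URNs first."""
    if drop_null_urns:
        df = df.dropna(subset=["URN"])
    return df.set_index("URN")


# Inner join: the null URN has no match in the pupil data, so it disappears.
inner = pupils.join(index_workforce(workforce, False), how="inner")

# Outer join without the fix: the null URN survives in the index.
kept = pupils.join(index_workforce(workforce, False), how="outer")

# Outer join with the fix: null URNs are dropped before indexing.
cleaned = pupils.join(index_workforce(workforce, True), how="outer")

print(inner.index.hasnans, kept.index.hasnans, cleaned.index.hasnans)
```

Dropping the nulls before `set_index` means the null never reaches the index, regardless of which join type is used later.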

removal of `on="URN",`

`join()` defaults to joining on the index. We want to join on the index explicitly and, by this point, can guarantee that both datasets should be joined on their respective indexes.
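As an illustration of that default, the sketch below (invented data, hypothetical frame names) compares a plain `join` with an explicit index-on-index `merge`. It assumes both frames have already been indexed on URN, as the description states.

```python
import pandas as pd

# Hypothetical frames, both already indexed on URN.
pupils = pd.DataFrame(
    {"URN": [100001, 100002], "Pupils": [250, 310]}
).set_index("URN")
workforce = pd.DataFrame(
    {"URN": [100001, 100003], "Teachers": [12, 20]}
).set_index("URN")

# With no `on` argument, join matches the caller's index to the other's index.
joined = pupils.join(workforce, how="outer")

# The same operation spelled out as an index-on-index merge.
merged = pupils.merge(workforce, how="outer", left_index=True, right_index=True)

print(joined.equals(merged))
```

Because the two calls are equivalent here, leaving out `on="URN"` relies only on both frames being indexed on URN.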

Checklist (add/remove as appropriate)

- Work items have been linked (use AB#)
- Your code builds clean without any errors or warnings
- You have run all unit/integration tests and they pass
- Your branch has been rebased onto main
- You have tested by running locally
- ~~You have reviewed with UX/Design~~


Faizan-ahmad00

looks good to me.

fix: remove null URNs from census data

e9aad5c

- explicitly drop any null entries from the URN data before setting the index
- simplify the joining of pupil and workforce data (`join` will default to using the index)

PsypherPunk requested review from jrabbott, J05h-L and Faizan-ahmad00 January 10, 2025 15:09

Faizan-ahmad00 approved these changes Jan 10, 2025
