Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boundary preprocessing #52

Merged
merged 15 commits into from
Oct 7, 2024
Merged

boundary preprocessing #52

merged 15 commits into from
Oct 7, 2024

Conversation

Hussein-Mahfouz
Copy link
Collaborator

@Hussein-Mahfouz Hussein-Mahfouz commented Oct 2, 2024

This PR should allow users to prepare the boundary layer for the study area of choice.

  • It replaces hardcoded filtering to Leeds.
  • It allows us to use OAs or MSOAs for the analysis. It assumes that we have an OA layer for the whole of England (which we can provide as part of the pipeline). We cannot use a custom zoning system with the current logic

TODO before merging:

  • Move boundaries_geography to config
  • Move filter_boundaries arguments to config
  • Clarify in the config that travel_times shuld only be true if we have a travel time matrix at the resolution specified by boundary_geography (i.e. if we choose boundary_geography == "MSOA" and travel_times = True but our travel times matrix is at the OA level, the code will not work)

@Hussein-Mahfouz
Copy link
Collaborator Author

this works but we will run into issues if we don't replace hardcoded OA21CD values downstream. See #39 (comment) for all parts of the code with hardcoded OA21CD

@Hussein-Mahfouz
Copy link
Collaborator Author

@sgreenbury how do I add a parameter from the config to a src file? There seems to be two ways to do it:

  1. appraoch 1: importing load_config and acbm_cli, and then adding a script inside a main function
  2. approach 2: importing the Config class

Is approach 1 exclusively for scripts and approach 2 exclusively for src files? For context, I want to replace the hardcoded OA21CD value here with zone_id

@Hussein-Mahfouz
Copy link
Collaborator Author

@sgreenbury I've updated the filtering so that we filter the boundary to match the spc region:

logger.info("3. Filtering boundaries to specified study area")
# Step 1: Get zones from SPC (these will be 2011 MSOAs)
spc = Reader(spc_path, region, backend="pandas")
zones_in_region = list(spc.info_per_msoa.keys())
# Step 2: Filter boundaries to identified zones
# a) get MSOA11CD to MSOA21CD lookup
msoa_lookup = pd.read_csv(
acbm.root_path
/ "data/external/MSOA_2011_MSOA_2021_Lookup_for_England_and_Wales.csv"
)
# Filter msoa_lookup to include only rows where MSOA11CD is in zones_in_region
msoa_lookup_filtered = msoa_lookup[msoa_lookup["MSOA11CD"].isin(zones_in_region)]
# Extract the corresponding MSOA21CD values
msoa21cd_values = msoa_lookup_filtered["MSOA21CD"].tolist()
# b) filter boundaries to include only rows where MSOA21CD is in msoa21cd_values
boundaries_filtered = boundaries[boundaries["MSOA21CD"].isin(msoa21cd_values)]

I think we can merge this now

[zone_id, "MSOA21CD", "geometry"]
] # we always need MSOA21CD to filter to study area
print("keeping original OA boundaries")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should raise an exception here to handle any cases where geography provided is not "OA" or "MSOA"? E.g.:

    else:
        msg = f"Invalid geography: '{geography}'. Expected 'OA' or 'MSOA'."
        raise ValueError(msg)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, done now: 5e47d0c

@sgreenbury
Copy link
Collaborator

Are we able to now set commute_level from the new config.boundary_geography parameter? Or might it be helpful to able to run with a different commute_level to the boundary_geography config (in which case, it should be a different config parameter and we can implement that in a different PR)?

@sgreenbury
Copy link
Collaborator

This looks great @Hussein-Mahfouz with the boundaries configurable, derivable from the input population and handled first with additional preprocessing. Just a couple of comments/suggestions above and think it is good to merge.

@Hussein-Mahfouz
Copy link
Collaborator Author

Are we able to now set commute_level from the new config.boundary_geography parameter?

Yes we can do that. I'll make the edit

Or might it be helpful to able to run with a different commute_level to the boundary_geography config (in which case, it should be a different config parameter and we can implement that in a different PR)

I think it's definitely helpful, especially if you want to run the optimization problem on a lower resolution (e.g. MSOA) regardless of whether boundary_geography is OA. However, I prefer doing it in a seperate PR as it will take some work and is not currently essential

In terms of steps, it would require some edits to 3.2.2_assign_primary_zone_work.py:

  1. Edit possible_zones_work before it is used here. The zones all match the boundary_geography resolution. If boundary_geography is OA, we should replace each possible_zones to it's parent MSOA and use that as possible_zones_work

  2. After the assignment, each activity is mapped to an MSOA. If the boundary_geography is OA, we need to sample oan OA from the chosen MSOAs. This should be done before this step

@sgreenbury
Copy link
Collaborator

That sounds great, thanks for identifying the steps, I'll reference your comment in a new issue for this.

@Hussein-Mahfouz
Copy link
Collaborator Author

Ok great. The other 2 comments should be addressed in the last two commits

@sgreenbury
Copy link
Collaborator

That's great, looks good to me to merge!

@Hussein-Mahfouz Hussein-Mahfouz merged commit 0c6b63d into main Oct 7, 2024
4 checks passed
@Hussein-Mahfouz Hussein-Mahfouz deleted the preprocess_boundaries branch October 7, 2024 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants