Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add config across scripts and overall pipeline script #46

Merged
merged 25 commits into from
Sep 27, 2024

Conversation

sgreenbury
Copy link
Collaborator

Contributes towards #44.

@sgreenbury sgreenbury force-pushed the 44-config-and-script branch 2 times, most recently from 76f2728 to 12ccfcb Compare September 16, 2024 20:09
@Hussein-Mahfouz
Copy link
Collaborator

Hey @sgreenbury , I wanted to add some functionality in the config file but didn't want to conflict with your workflow / wanted to understand how the functions are working first.

When using travel_times matrix, I want to check if it exists, and if it doesn't create one. I've implemented the logic here:

# TODO: check config file to see if we have a travel time matrix. If not, create one
travel_times = pd.read_parquet(
acbm.root_path / "data/external/travel_times/oa/travel_time_matrix.parquet"
)
logger.info("Travel time matrix loaded")
# # Load the configuration file
# config_file_path = 'config.toml'
# travel_time_matrix_path = acbm.root_path / "data/external/travel_times/oa/travel_time_matrix.parquet"
# with open(config_file_path, 'r') as config_file:
# config = toml.load(config_file)
# # Check if travel_times is set to true and try to load the travel time matrix
# if config.get('travel_times') == True:
# try:
# travel_times = pd.read_parquet(travel_time_matrix_path)
# print("Travel time matrix loaded successfully.")
# except Exception as e:
# logger.info(f"Failed to load travel time matrix: {e}")
# logger.info("Creating a new travel time matrix.")
# # Create a new travel time matrix based on distances between zones
# travel_times = zones_to_time_matrix(zones = boundaries, id_col = "OA21CD")
# # TODO: save the travel time matrix
# # If travel_times is not true or loading failed, create a new travel time matrix
# if config.get('travel_times') != True:
# logger.info("No travel time matrix found. Creating a new travel time matrix.")
# # Create a new travel time matrix based on distances between zones
# travel_times = zones_to_time_matrix(zones = boundaries, id_col = "OA21CD")
# logger.info("Travel time estimates created")

Could we add that to the config? We can discuss Friday. Related to #48

@sgreenbury
Copy link
Collaborator Author

Could we add that to the config? We can discuss Friday. Related to #48

Looks good to me adding like that to the config and then config["parameters"]["travel_times"] to get the value if it is with the other config parameters we've added so far. I wonder if it needs to be configurable though since this block might always be worth running? Or would there be runs where we do not want to load the matrix if it exists?

@Hussein-Mahfouz
Copy link
Collaborator

I think the important thing is to put TRUE or FALSE in the config depending on whether we have a travel time matrix. That would then be used here. The block you refer to just handles the case where travel_times is accidentally set to TRUE but there is no actual travel_times parquet file. Maybe not necessary.

Somewhat related, I am thinking of using pandera to check the schemas of all dfs before applying functions.

@sgreenbury sgreenbury marked this pull request as ready for review September 23, 2024 09:25
@sgreenbury
Copy link
Collaborator Author

Following first full run for Leeds, as discussed with @Hussein-Mahfouz and @BZ-BowenZhang merging this PR.

@sgreenbury sgreenbury merged commit 4628798 into main Sep 27, 2024
4 checks passed
@sgreenbury sgreenbury deleted the 44-config-and-script branch September 27, 2024 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants