Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First draft of migration to OTP2 #176

Merged
merged 5 commits into from
Jan 24, 2024
Merged

First draft of migration to OTP2 #176

merged 5 commits into from
Jan 24, 2024

Conversation

leonardehrenfried
Copy link
Contributor

@leonardehrenfried leonardehrenfried commented Jan 23, 2024

This is my first attempt at moving this repo to OTP2.4. To be precise it is the snapshot version 2.5.0_2024-01-19T14-50.

It introduces a few changes to how things were done previously and I want to get feedback if this is the correct direction or if you want me to stay compatible with how things were done beforehand.

Graph build & deployment

Previously, the graphs were built in the OTP container itself. After having done this for a few years I recommend to new deployments the following:

  • use a CI tool like Github Actions to schedule regular graph builds (perhaps daily)
  • combine the graph and the configuration files into a container image
  • upload image to repository and tag it something like "latest"
  • configure the deployment server to regularly pull new images and restart itself

OSM extraction

Furthermore, rather than using a complicated query to get the correct OSM area, I've switched to the following strategy:

This is probably also faster (but I haven't tested it).

Feedback

I would like to hear if plan is ok or if you have thought about how you would like building and deployment to work.

Further work

If you tell me that my plan is ok, I'm going to improve and clean up the documentation which at the moment is not very up to date.

@rcavaliere
Copy link
Member

@leonardehrenfried fine for me, also for the work on the documentation. For all deployment aspects, I would however ask a feedback to @dulvui

@dulvui
Copy link
Contributor

dulvui commented Jan 24, 2024

@leonardehrenfried Great work, looks fine to me!

We had a lot of issues with the graph build in the past and currently the graph build works event driven like this:

  1. the otp container checks for changes of the graph provide by sta
  2. if the file changed, the container triggers a github action trough a webhook
  3. the github action starts a separate EC2 machine, that builds the graph
  4. the ec2 instance loads the new graph to a shared folder of the otp container
    Note: The ec2 instance is only used, because previously the graph build was really inefficient, so a big machine was needed. Now we solved the performance problem, so a separate machine is not needed anymore.

This approach is quite complex and often breaks, so we would highly prefer your approach. The only thing we noticed is, that this graph files change only every few months, so a daily builds would be too much. Keeping an event driven approach or at least moving the "if the gtfs file changed" check to the github action.
What do you think about this?

@rcavaliere I will deploy the new versions directly to https://journey.opendatahub.testingmachine.eu, since I don't think that we will make changes to the old version. Or do you prefer a third, separate endpoint?

@leonardehrenfried
Copy link
Contributor Author

I think it would not be hard to add a step to Github CI to compare the age of the container image with the age of the STA feed. If STA is newer a graph is built. If the container is newer nothing happens.

@rcavaliere
Copy link
Member

@rcavaliere I will deploy the new versions directly to https://journey.opendatahub.testingmachine.eu, since I don't think that we will make changes to the old version. Or do you prefer a third, separate endpoint?

@dulvui yes, no problem on this! Reuse the existing end-point

@leonardehrenfried
Copy link
Contributor Author

@dulvui BTW, how do you determine the creation date of the STA feed?

It doesn't have a last-modified header:

curl -I https://gtfs.api.opendatahub.com/v1/dataset/sta-time-tables/raw
HTTP/2 200 
access-control-allow-origin: *
content-disposition: attachment;filename=sta-time-tables.zip
content-type: application/zip
date: Wed, 24 Jan 2024 11:57:28 GMT
server: Caddy
x-powered-by: Express
content-length: 48425508

And https://gtfs.api.opendatahub.com/v1/dataset/sta-time-tables/ also doesn't contain any kind of date.

@dulvui
Copy link
Contributor

dulvui commented Jan 24, 2024

@leonardehrenfried Great then lets add this additional step to the CI and see if it works. I will try to move all the calculation logic and process to the Github Action before the next sprint.

Changes in the file get detected by downloading the file, creating the hash of it and then periodically comparing the file with the old hash.
You can find part of the logic here

elif [ "${SHA}" = "${OLD}" ]; then

But I saw that it is a quite big and complex bash script, so I might refactor that too.

@leonardehrenfried
Copy link
Contributor Author

I'm still a bit unsure which parts of the repo are my responsibility and which ones are @dulvui's so I'd be happy if you could guide me towards the parts you expect me to take over.

Basically, other than documentation, what else should I do during this sprint?

@dulvui
Copy link
Contributor

dulvui commented Jan 24, 2024

@leonardehrenfried I think everything that has to do with deployment and graph building would be my part, the rest is yours.
I'll merge your first draft now and adapt the github actions, if needed. I think by the end of the week your changes might be already be online on our testingmachine.
Would that work for you? So I'll merge now

@leonardehrenfried
Copy link
Contributor Author

Yes that sounds good.

@dulvui dulvui merged commit e81633f into main Jan 24, 2024
8 checks passed
@dulvui dulvui deleted the otp2 branch January 24, 2024 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants