Implement a lighter format for wikipedia importance tables #3424

Merged: 3 commits, May 29, 2024

Conversation

lonvia (Member) commented May 15, 2024

Adds support for the new, simpler CSV format for Wikipedia importance values. This also comes with a much simplified table structure: redirects and articles are now in the same table, and all unnecessary information has been dropped, leaving only the Wikipedia article, Wikidata ID and importance.

Support for the old-style Wikipedia importance dumps remains in place for now. There will be official CSV dumps once we have removed the last obstacles in the generation process in https://github.com/osm-search/wikipedia-wikidata.
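
To illustrate the simplified structure described above, here is a minimal sketch of reading such a file into uniform rows. It is not the code from this PR: the column names (`language`, `title`, `importance`, `wikidata_id`), the comma delimiter and the sample values are all assumptions made for the example, and `read_importance_csv` is a hypothetical helper.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional, TextIO


class ImportanceRow(NamedTuple):
    """One entry of the single importance table (articles and redirects alike)."""
    language: str               # assumed column: Wikipedia language code
    title: str                  # assumed column: Wikipedia article title
    importance: float           # importance value
    wikidata_id: Optional[str]  # Wikidata ID, None when the field is empty


def read_importance_csv(stream: TextIO) -> Iterator[ImportanceRow]:
    """Hypothetical helper: yield one ImportanceRow per line of a headered CSV."""
    for row in csv.DictReader(stream):
        yield ImportanceRow(
            language=row['language'],
            title=row['title'],
            importance=float(row['importance']),
            wikidata_id=row['wikidata_id'] or None,
        )


# Invented sample data, only to show the shape of the rows.
sample = io.StringIO(
    "language,title,importance,wikidata_id\n"
    "en,Berlin,0.85,Q64\n"
    "de,Berlin,0.85,Q64\n"
    "en,Some redirect,0.25,\n"
)
rows = list(read_importance_csv(sample))
```

Because every row has the same shape, articles and redirects can go into one table without a separate lookup step. How the real import handles this is defined by the commits in this PR, not by this sketch.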

mtmail (Collaborator) commented May 15, 2024

I think nominatim/tools/check_database.py needs to check for the new database table, too.

lonvia (Member, Author) commented May 16, 2024

The latest commit should implement that. Or do you have a different place in mind?

mtmail (Collaborator) commented May 16, 2024

ah I see it now

lonvia force-pushed the importance-csc-import branch from 536e2b6 to 90eea6b on May 16, 2024 13:24
lonvia merged commit ad95ff1 into osm-search:master on May 29, 2024
11 checks passed
lonvia deleted the importance-csc-import branch on May 29, 2024 16:06