FileNameNormalizer/URLNormalizer does not support files with long file extensions #65

Open
pgrunewald opened this issue Jun 24, 2024 · 1 comment

Comments

@pgrunewald

The current implementation does not properly normalize filenames with long file extensions (i.e. extensions with more than 4 characters), for example base.geojson.

When I upload this file, the generated ID for the filename is base-geojson instead of base.geojson. When downloading, however, the filename of the download is the correct base.geojson. Shouldn't the FileNameNormalizer also produce the dash here?

The proposed solution would be to update the FILENAME_REGEX from:

FILENAME_REGEX = re.compile(r"^(.+)\.(\w{,4})$")

to:

FILENAME_REGEX = re.compile(r"^(.+)\.(\w+)$")

Additionally, we should check whether FileNameNormalizer is actually used, since it should have been applied to the filename for downloads.
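
To illustrate the regex change proposed above, here is a minimal sketch that only compares how the two patterns match. It uses just the standard `re` module; the names `OLD_FILENAME_REGEX`, `NEW_FILENAME_REGEX` and `split_extension` are hypothetical helpers for this comparison and are not part of the package, and the normalizer's other steps are not reproduced.

```python
import re

# Current pattern: the extension may be at most 4 word characters.
OLD_FILENAME_REGEX = re.compile(r"^(.+)\.(\w{,4})$")
# Proposed pattern: the extension may be any number of word characters.
NEW_FILENAME_REGEX = re.compile(r"^(.+)\.(\w+)$")


def split_extension(pattern, filename):
    """Return (base, extension) if the pattern matches, else None.

    Hypothetical helper for illustration only.
    """
    match = pattern.match(filename)
    return match.groups() if match else None


for name in ("photo.jpeg", "base.geojson"):
    print(
        name,
        split_extension(OLD_FILENAME_REGEX, name),
        split_extension(NEW_FILENAME_REGEX, name),
    )
# photo.jpeg ('photo', 'jpeg') ('photo', 'jpeg')
# base.geojson None ('base', 'geojson')
```

With the current pattern, base.geojson does not match at all, so presumably the whole name is treated as having no extension and the dot gets normalized along with everything else. With the proposed pattern the extension is captured separately.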

@davisagli
Member

Makes sense to me.
