Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure makefsdata.py generates valid variable names #1841

Merged
merged 3 commits into from
Aug 20, 2024

Conversation

ndabas
Copy link
Contributor

@ndabas ndabas commented Aug 17, 2024

This PR:

  • Updates makefsdata.py to strip out all characters that would be invalid in a C variable name.
  • Adds a minor fix when getting the MIME type from the map to throw the exception we intend to, instead of KeyError.
  • Fixes a minor indentation inconsistency which would bug me forever if not fixed.

There are a couple more issues in the script which can potentially be fixed, please let me know if you think it's worthwhile to do so and I will add these changes too:

  • We could end up with duplicate variable names in some cases, if two or more file names differ only in special characters.
  • It might be a good idea to use Python's built-in mimetypes module to figure out the MIME type rather than our own map.

Fixes #1793.

Copy link
Contributor

@lurch lurch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lurch
Copy link
Contributor

lurch commented Aug 18, 2024

  • We could end up with duplicate variable names in some cases, if two or more file names differ only in special characters.
  • It might be a good idea to use Python's built-in mimetypes module to figure out the MIME type rather than our own map.

Sounds sensible, if you're happy to put in the additional effort. The documentation for the mimetypes module suggests it relies on an external system-installed mime.types file - are there any situations in which that file might not exist?

@ndabas
Copy link
Contributor Author

ndabas commented Aug 18, 2024

Sure, would be happy to add in those bits.

The mimetypes module has a default, hard-coded map as well, and the system-installed files are only used to augment the map:

mimetypes.init(files=None)
Initialize the internal data structures. If given, files must be a sequence of file names which should be used to augment the default type map.

So we should be fine even if no system-wide mime.types files are found.

I wrote a quick little script to check that all existing mappings exist in the defaults:

import mimetypes

# Copied in from makefsdata.py
file_types = {
    "html": "text/html",
    "htm":  "text/html",
    "shtml": "text/html",
    "shtm": "text/html",
    "ssi":  "text/html",
    "gif":  "image/gif",
    "png":  "image/png",
    "jpg":  "image/jpeg",
    "bmp":  "image/bmp",
    "ico":  "image/x-icon",
    "class": "application/octet-stream",
    "cls":  "application/octet-stream",
    "js":   "application/javascript",
    "ram":  "application/javascript",
    "css":  "text/css",
    "swf":  "application/x-shockwave-flash",
    "xml":  "text/xml",
    "xsl":  "application/pdf",
    "pdf":  "text/xml",
    "json": "application/json",
    "svg":  "image/svg+xml"
}

mimetypes.init(files=None)

for ext, mime in file_types.items():
    guessed, _ = mimetypes.guess_type("abc." + ext)
    if  guessed != mime:
        print(f"Diff: {ext}, expected: {mime}, got: {guessed}")

I got this:

Diff: shtml, expected: text/html, got: None
Diff: shtm, expected: text/html, got: None
Diff: ssi, expected: text/html, got: None
Diff: class, expected: application/octet-stream, got: None
Diff: cls, expected: application/octet-stream, got: None
Diff: js, expected: application/javascript, got: text/javascript
Diff: ram, expected: application/javascript, got: application/x-pn-realaudio
Diff: xsl, expected: application/pdf, got: text/xml
Diff: pdf, expected: text/xml, got: application/pdf

We can add in just these missing ones, but what's up with the current mappings for .ram, .xsl, and .pdf? I guess the xsl and pdf ones got swapped but what was ram supposed to be?

@ndabas
Copy link
Contributor Author

ndabas commented Aug 18, 2024

I have modified the code to use the mimetypes library and fixed another bug in the course of things. I don't know how important backwards compatibility is going to be for this, but here's how I've reconciled the differences between the hard-coded mimetypes and the library call:

  • shtml, shtm, ssi: added a static mapping to support these extensions.
  • class, cls: added a default of application/octet-stream when the mapping isn't found, which is how most web servers behave.
  • js: was application/javascript, but will now return text/javascript, which I think is fine.
  • ram was application/javascript, but will now be application/x-pn-realaudio -- hopefully this doesn't break anybody's workflow!
  • xsl and pdf were swapped, and will now be fixed.

peterharperuk
peterharperuk previously approved these changes Aug 19, 2024
Copy link
Contributor

@peterharperuk peterharperuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. httpd example still works. Thanks!

ndabas added a commit to ndabas/pico-setup-windows that referenced this pull request Aug 19, 2024
@ndabas
Copy link
Contributor Author

ndabas commented Aug 19, 2024

Thanks, I've verified that this fixes the issue on Windows too.

@peterharperuk
Copy link
Contributor

@kilograham Ok to merge this?

@lurch
Copy link
Contributor

lurch commented Aug 19, 2024

@ndabas Were you still planning to add the "We could end up with duplicate variable names in some cases, if two or more file names differ only in special characters." feature?

@ndabas
Copy link
Contributor Author

ndabas commented Aug 19, 2024

@lurch I wasn't planning to but I can if you think it's significant. Probably just a couple of extra lines of code?

@ndabas
Copy link
Contributor Author

ndabas commented Aug 19, 2024

@lurch I gave it a shot and it was easier than expected. So that takes care of all of the issues I mentioned.

@kilograham kilograham merged commit 789ea75 into raspberrypi:develop Aug 20, 2024
1 check passed
@kilograham kilograham added this to the 2.0.1 milestone Aug 20, 2024
@ndabas ndabas deleted the fix-makefsdata branch August 20, 2024 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants