Remove UTF-7 support for email bodies #1618

Open
pdg137 opened this issue Sep 24, 2024 · 0 comments

In the last year or two I have started to receive a few legitimate emails encoded as UTF-7 (maybe 1 or 2 per 100,000 emails). I noticed that despite attempting to support UTF-7 email bodies, the Mail gem does not decode these emails properly.

Note that standard UTF-7 uses a + character to initiate a special sequence (see Wikipedia). The version using &, which the Mail gem supports, appears to be the "modified UTF-7" described in RFC 9051, which is intended only for IMAP mailbox names. It is certainly not a valid encoding for email bodies.
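
To illustrate the mismatch, here is a minimal sketch of a modified UTF-7 decoder in plain Ruby. The method name `decode_modified_utf7` is hypothetical, and this is not the actual code of the Mail gem or Net::IMAP; it only assumes the "&" shift character, the "," substitution for "/", and the unpadded Base64 of UTF-16BE that modified UTF-7 uses.

```ruby
# Hypothetical helper for illustration only; not the Mail gem's or
# Net::IMAP's implementation.
def decode_modified_utf7(text)
  # A shifted run starts with "&" and always ends with "-".
  text.gsub(/&([A-Za-z0-9+,]*)-/) do
    payload = Regexp.last_match(1)
    next "&" if payload.empty? # "&-" is a literal ampersand

    # Modified UTF-7 writes "," where Base64 uses "/" and omits padding,
    # so restore "/" and append "=" padding before decoding.
    utf16 = (payload.tr(",", "/") + "===").unpack1("m")
    utf16.encode(Encoding::UTF_8, Encoding::UTF_16BE)
  end
end

decode_modified_utf7("Hi&ACE-") # => "Hi!"
decode_modified_utf7("Hi+ACE-") # => "Hi+ACE-" (standard UTF-7 is left untouched)
```

A decoder of this shape never looks at "+" sequences, which would explain why a body encoded as standard UTF-7 comes out undecoded.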

Also note that:

  • Ruby does not support conversion to/from UTF-7:
    irb(main):021> "Hi!".encode(Encoding::UTF_7)
    (irb):21:in `encode': code converter not found (UTF-8 to UTF-7) (Encoding::ConverterNotFoundError)
            from (irb):21:in `<main>'
    
    irb(main):022> "Hi+ACE-".force_encoding(Encoding::UTF_7).encode(Encoding::UTF_8)
    (irb):22:in `encode': code converter not found (UTF-7 to UTF-8) (Encoding::ConverterNotFoundError)
            from (irb):22:in `<main>'
  • The Net::IMAP package only supports the "modified UTF-7" version in its decode_utf7 method.
  • Microsoft recommends against using UTF-7.
  • The HTML5 standard disallows UTF-7 support.
  • Attempting the decoding already causes problems, such as issue #1404 (Invalid utf-7 data in body causes exceptions).
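
For comparison, decoding standard UTF-7 would need something like the sketch below, since Ruby has no built-in converter. The method name `decode_standard_utf7` is hypothetical and this is not a proposal for the gem; it assumes ASCII-only input and well-formed shift sequences, and does nothing to recover from invalid data.

```ruby
# Hypothetical helper for illustration only; assumes well-formed input.
def decode_standard_utf7(text)
  # A shifted run starts with "+", holds unpadded Base64, and may end
  # with an optional "-" that is consumed along with the run.
  text.gsub(/\+([A-Za-z0-9+\/]*)-?/) do
    payload = Regexp.last_match(1)
    next "+" if payload.empty? # "+-" is a literal plus sign

    # The payload is Base64 of UTF-16BE code units without padding.
    utf16 = (payload + "===").unpack1("m")
    # Raises an encoding error if the payload is not valid UTF-16BE.
    utf16.encode(Encoding::UTF_8, Encoding::UTF_16BE)
  end
end

decode_standard_utf7("Hi+ACE-") # => "Hi!"
decode_standard_utf7("1 +- 1")  # => "1 + 1"
```

Even this small sketch has to decide what to do with malformed payloads, which is the kind of failure reported in #1404.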

Given that we do not currently support the correct version of UTF-7, that Ruby itself does not support it, and that many standards recommend against it, I think we should completely remove support for modified UTF-7 in email bodies.
