Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Encoding::CompatibilityError when nested message/rfc822 parts contain non-ASCII header bytes #569

Merged
merged 4 commits into from
Jul 12, 2020

Conversation

danc86
Copy link
Contributor

@danc86 danc86 commented Jul 12, 2020

Fixes the remaining crash in #205.

This patch changes DummySource to follow a similar pattern to the
Maildir source, which works by opening the files directly for parsing.
Previously DummySource was parsing strings fed through StringIO,
which was introducing extra (or at least, different) complications with
encoding. See for example the comments in issue sup-heliotrope#205.
In the code for handling message/rfc822 MIME parts, message.rb line 498,
we were calling the #normalize_whitespace method on the body string
before it was decoded.

I'm not too sure if messing with whitespace is the right thing to do
there, but that aside, that method was then also calling #fix_encoding!
which would forcibly transcode the raw body to UTF-8. Instead, we want to
keep the body as ASCII-8BIT at that point, and let it be decoded using
all the normal message decoding mechanisms.

The only other calls to #normalize_whitespace are in the UI, and in the
code path which handles body text of messages, message.rb line 592,
where the body text has already been decoded. So it seems like we can
safely make #normalize_whitespace just mess with whitespace and leave
the string encoding alone.

Fixes sup-heliotrope#205.
Copy link
Member

@IPv2 IPv2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @danc86. I was able to reproduce the crash, and the fix resolves. Code looks good to me. Nicely done!

While reproducing, I spotted two preexisting cosmetic-only issues related to enclosed messages, but these aren't blockers because they aren't introduced by this PR - indeed, they are even present in Sup 0.22.1. I'll raise new issues for these.

@danc86
Copy link
Contributor Author

danc86 commented Jul 12, 2020

Indeed, I noticed the mess with EnclosedMessage as well, that was going to be next on my list to fix, alongside the many other string encoding problems I keep finding :-)

@danc86 danc86 merged commit d3fbac1 into sup-heliotrope:develop Jul 12, 2020
IPv2 added a commit to IPv2/sup that referenced this pull request Jul 13, 2020
In class EnclosedMessage (within lib/sup/message_chunks.rb):

  - Ternary operator logic was backwards, e.g., setting @from to
    "unknown sender" if and only if "from" was truthy(!)

  - @lines was constructed using local variables instead of instance
    variables, which (happily?) masked the effects of the above
    backwards ternary logic. But this also created issue sup-heliotrope#571:
    Redwood::Person object representations were shown directly to the
    user, rather than the human-friendly parsed strings intended to be
    shown by Redwood::Person#full_address.

  - @lines was constructed as a single string (with literal line
    breaks), instead of following the normal convention of an array of
    lines. This hindered testability.

In this commit, we:

(1) Fix the above three issues in EnclosedMessage.

(2) Uncomment and expand upon the "TODO" tests for EnclosedMessage
    headers. These placeholders were added in PR sup-heliotrope#569, credit @danc86.

(3) Modify test/fixtures/non-ascii-header-in-nested-message.eml to
    follow best practices for a testing email address, i.e., use
    reserved ".invalid" domain instead of "b.c" - to ensure that the
    testing email address remains invalid forever.
IPv2 added a commit that referenced this pull request Jul 15, 2020
In class EnclosedMessage (within lib/sup/message_chunks.rb):

  - Ternary operator logic was backwards, e.g., setting @from to
    "unknown sender" if and only if "from" was truthy(!)

  - @lines was constructed using local variables instead of instance
    variables, which (happily?) masked the effects of the above
    backwards ternary logic. But this also created issue #571:
    Redwood::Person object representations were shown directly to the
    user, rather than the human-friendly parsed strings intended to be
    shown by Redwood::Person#full_address.

  - @lines was constructed as a single string (with literal line
    breaks), instead of following the normal convention of an array of
    lines. This hindered testability.

In this commit, we:

(1) Fix the above three issues in EnclosedMessage.

(2) Uncomment and expand upon the "TODO" tests for EnclosedMessage
    headers. These placeholders were added in PR #569, credit @danc86.

(3) Modify test/fixtures/non-ascii-header-in-nested-message.eml to
    follow best practices for a testing email address, i.e., use
    reserved ".invalid" domain instead of "b.c" - to ensure that the
    testing email address remains invalid forever.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants