Incorrect Parsing of Quoted Strings #95

Open
ChaosData opened this issue Feb 7, 2018 · 0 comments

Hi,

There are a few issues I've observed related to the handling of quoted strings.
It appears that during local-part parsing, the local-part is first split on its . characters unconditionally, and each resulting piece is then parsed individually using something similar to what should be applied to the local-part as a whole. In particular, while . is within the %d35-91 subrange of qtext as defined in RFC 5322 (Section 3.2.4, "Quoted Strings"), the validator does not allow it within a quoted string, because the split is performed first, leaving two "dangling" double quote characters. Furthermore, this leads the validator to accept certain strings that are invalid under the non-obsolete grammar. So, while the valid "a.b"@example.tld is rejected, the invalid "a"."b"@example.tld is accepted.
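
To make the suspected behavior concrete, here is a minimal Python sketch of a split-first check. It is a guess at the shape of the logic based on the observed results, not the library's actual code; the names naive_local_part_ok, ATOM_RE and QUOTED_NO_WS_RE are hypothetical.

```python
import re

# atext as defined in RFC 5322, Section 3.2.3
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
ATOM_RE = re.compile(rf"{ATEXT}+")

# Assumption: a quoted piece is accepted only without bare whitespace,
# mirroring the behavior described in this report.
QUOTED_NO_WS_RE = re.compile(r'"(?:[\x21\x23-\x5b\x5d-\x7e]|\\[\x21-\x7e ])*"')


def naive_local_part_ok(local_part: str) -> bool:
    """Hypothetical split-first validation: split on '.', then check each piece."""
    return all(
        ATOM_RE.fullmatch(piece) or QUOTED_NO_WS_RE.fullmatch(piece)
        for piece in local_part.split(".")
    )


# The valid quoted string is torn into '"a' and 'b"', so it is rejected,
# while the invalid sequence of two quoted strings passes piece by piece.
print(naive_local_part_ok('"a.b"'))    # False
print(naive_local_part_ok('"a"."b"'))  # True
```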

Additionally, neither comments nor folding whitespace (Section 3.2.2, "Folding White Space and Comments") appear to be properly handled. While I'm not too keen on the whole nested ((comment)) comment structure, one interesting consequence is that whitespace within quoted strings is not accepted, so space characters have to be escaped using the backslash-prepended quoted-pair syntax (Section 3.2.1, "Quoted characters"). Even disregarding the obs-qtext subrange of qtext, whitespace is still supported within quoted strings through the definition of quoted-string:

[CFWS]
DQUOTE *([FWS] qcontent) [FWS] DQUOTE
[CFWS]

This results in the rejection of valid addresses such as "Fred Bloggs"@example.com (sourced from RFC 3696, Section 3, "Restrictions on email addresses"). The same RFC also provides examples of similarly rejected (but still valid) addresses, such as Fred\ Bloggs@example.com, but my understanding is that quoted-pair sequences outside of quotes are only allowed in the obs-* obsolete formats, so that may not be a big issue.

From my understanding of the spec, assuming comments are not supported, the parser should first check whether the local-part starts with a double quote character, and use that to decide whether to parse it as a quoted string or as a .-internally-delimited dot-atom.
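
The check suggested above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: comments and the obs-* forms are not supported, folding whitespace is simplified to runs of spaces and tabs (no line folding), and the names local_part_ok, DOT_ATOM_RE and QUOTED_STRING_RE are hypothetical rather than taken from the library.

```python
import re

# atext as defined in RFC 5322, Section 3.2.3
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom-text: 1*atext *("." 1*atext)
DOT_ATOM_RE = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")

# qcontent: qtext (%d33 / %d35-91 / %d93-126) or quoted-pair ("\" (VCHAR / WSP))
QCONTENT = r"(?:[\x21\x23-\x5b\x5d-\x7e]|\\[\x21-\x7e \t])"
# quoted-string without CFWS: DQUOTE *([FWS] qcontent) [FWS] DQUOTE
# Assumption: FWS is reduced to spaces and tabs here.
QUOTED_STRING_RE = re.compile(rf'"(?:[ \t]*{QCONTENT})*[ \t]*"')


def local_part_ok(local_part: str) -> bool:
    """Decide on the first character, then match the whole local-part at once."""
    if local_part.startswith('"'):
        return QUOTED_STRING_RE.fullmatch(local_part) is not None
    return DOT_ATOM_RE.fullmatch(local_part) is not None


print(local_part_ok('"a.b"'))          # True
print(local_part_ok('"a"."b"'))        # False
print(local_part_ok('"Fred Bloggs"'))  # True
print(local_part_ok('Fred\\ Bloggs'))  # False (quoted-pair outside quotes is obs-* only)
```

Because the whole local-part is matched in one pass, a . inside the quotes is treated as ordinary qtext and never splits the string.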
