Update get_payload to decode the Content-Transfer-Encoding header #4

Open
wants to merge 1 commit into
base: master

noahbass
Contributor

Addresses some issues in #2.

This decodes the message payload when the Content-Transfer-Encoding header is base64 or quoted-printable.

So with these changes, this message with a quoted-printable Content-Transfer-Encoding header:

Hello world, =0A=0AIf Lorem ipsum dolor sit amet, consectetur=
adipiscing elit. Donec vel ex egestas ante scelerisque vulputate eget et me=
us. =0A=0A In non sollicitudin nulla,=
accumsan vehicula velit=

Will now be output as this:

Hello world, Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec vel ex egestas ante scelerisque vulputate eget et metus. In non sollicitudin nulla, accumsan vehicula velit.

See get_payload documentation for details on using `decode=True`:
https://docs.python.org/2/library/email.message.html#email.message.Message.get_payload
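
To illustrate the behaviour described above, here is a minimal sketch using only the standard library `email` package. The sample message, addresses, and variable names are made up for the example and are not taken from this project. Note that the linked documentation is for Python 2, where `decode=True` returns a `str`; under Python 3 the same call returns `bytes`.

```python
import email

# Hypothetical sample message; the body is quoted-printable encoded.
raw = (
    "From: a@example.com\n"
    "Subject: demo\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Hello world,=0A=0ALorem ipsum dolor sit amet, consectetur =\n"
    "adipiscing elit. Donec vel ex egestas ante scelerisque vulputate eget et me=\n"
    "tus.\n"
)

msg = email.message_from_string(raw)

# Without decode, the payload is returned exactly as transmitted.
encoded = msg.get_payload()

# With decode=True, the Content-Transfer-Encoding is undone:
# "=0A" becomes a newline and a trailing "=" joins the wrapped lines.
decoded = msg.get_payload(decode=True)

print(encoded)
print(decoded)
```

On Python 3 the decoded bytes still need `.decode(...)` with the part's charset before they can be treated as text.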
@isiah-lloyd
Owner

I don't know how many emails it would affect, but it seems that setting decode=True causes get_payload() to return None if the email is a multipart message. Should we check the content type first and set decode=True only if the message is not multipart?
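
A quick way to confirm this concern, sketched with a throwaway two-part message built with the standard library (the message contents are invented for the example):

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical two-part message built only for illustration.
msg = MIMEMultipart("alternative")
msg.attach(MIMEText("plain body", "plain"))
msg.attach(MIMEText("<p>html body</p>", "html"))

# The container itself has no single body to decode.
print(msg.is_multipart())            # True
print(msg.get_payload(decode=True))  # None for a multipart container

# Without decode, the payload is a list of sub-messages, not text.
print(len(msg.get_payload()))        # 2
```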

@noahbass
Contributor Author

Maybe this block would work better? (Untested)

msg_body = ""
for part in msg.walk():
    if part.get_content_maintype() == 'multipart':
        continue
    if part.get_content_type() == 'text/plain':
        msg_body = part.get_payload(decode=True)

A part can itself be multipart, so if it is, just skip over it?

@noahbass
Contributor Author

Well, actually, I've thought about it: the loop already walks through each part of the message (which could itself be a multipart message), so the explicit multipart check isn't needed:

msg_body = ""
for part in msg.walk():
    if part.get_content_type() == 'text/plain':
        msg_body = part.get_payload(decode=True)
        break

By the time it reaches get_payload(decode=True), part can never have a multipart content type, because its content type has just been checked to be text/plain, so get_payload(decode=True) will always work.
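
As a sanity check of that reasoning, here is a small sketch showing what `walk()` yields for a made-up multipart message: the container comes first, then each leaf part, and only the `text/plain` leaf reaches `get_payload(decode=True)`. The sample message is an assumption for illustration only.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical message with a plain-text and an HTML alternative.
msg = MIMEMultipart("alternative")
msg.attach(MIMEText("plain body", "plain"))
msg.attach(MIMEText("<p>html body</p>", "html"))

# walk() visits the container itself, then every sub-part in order.
seen = [part.get_content_type() for part in msg.walk()]
print(seen)

msg_body = ""
for part in msg.walk():
    # The multipart container never matches text/plain, so it is skipped here.
    if part.get_content_type() == "text/plain":
        msg_body = part.get_payload(decode=True)
        break

print(msg_body)
```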

@isiah-lloyd
Owner

isiah-lloyd commented Oct 21, 2017

I think that looks good.

EDIT: Actually, what about an else clause that calls get_payload() with decode=False, so that a message without a text/plain part is still posted?
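
One possible shape for that fallback, as an untested-in-this-project sketch: a `for ... else` where the `else` branch runs only if the loop never hit `break`. The helper name `extract_body` and the sample messages are hypothetical. Keep in mind that for a multipart message the undecoded payload is a list of sub-messages rather than a string, so the caller would still need to handle that case.

```python
import email


def extract_body(msg):
    """Return the decoded text/plain body, or the undecoded payload as a fallback."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload(decode=True)
            break
    else:
        # No text/plain part was found: fall back to the raw payload.
        # For a multipart message this is a list of sub-messages.
        body = msg.get_payload()
    return body


# Hypothetical messages used only to exercise both branches.
plain_msg = email.message_from_string(
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)
html_msg = email.message_from_string(
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
)

print(extract_body(plain_msg))  # decoded base64 body
print(extract_body(html_msg))   # undecoded HTML body from the fallback
```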
