This repository has been archived by the owner on May 16, 2018. It is now read-only.

Reading mail fails if it has attachments with UTF-8 characters in filenames #676

Open
acid24 opened this issue Mar 10, 2016 · 4 comments

Comments

@acid24

acid24 commented Mar 10, 2016

This is the same issue as reported for zf2 #7533.

I tracked this problem to the Zend_Mail_Part::_cacheContent() method, which calls Zend_Mime_Decode::splitMessageStruct(), which in turn calls Zend_Mime_Decode::splitMessage(), where the headers are decoded.

When iterating over an email's parts, each part is represented as a Zend_Mail_Part instance, which receives an array of decoded headers. These headers are validated in the constructor to make sure they conform to RFC 2822 (only ASCII characters allowed).

So the issue, as I understand it, is that the headers are validated after they have been decoded. They should be validated while still encoded and decoded only once validation has passed.
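The ordering described above can be sketched in plain PHP. This is a minimal illustration under stated assumptions, not the Zend Framework code: isAsciiHeaderValue() and validateThenDecode() are hypothetical helpers, the ASCII check is a simplified stand-in for the real header validation, and the decoder is passed in as a callable (here the built-in iconv_mime_decode(), which requires the iconv extension).

```php
<?php
// Hypothetical stand-in for the header validation: accept only
// printable ASCII plus tab, CR and LF.
function isAsciiHeaderValue(string $value): bool
{
    return preg_match('/^[\x20-\x7E\t\r\n]*$/', $value) === 1;
}

// Validate the raw (still encoded) value first, decode afterwards.
function validateThenDecode(string $raw, callable $decode): string
{
    if (!isAsciiHeaderValue($raw)) {
        throw new InvalidArgumentException('Invalid header value detected');
    }
    return $decode($raw);
}

// The encoded form as it appears on the wire is pure ASCII.
$raw = 'text/plain; charset=UTF-8; name="=?UTF-8?Q?test=C3=A4.txt?="';
// The same value after decoding contains the UTF-8 bytes of "ä".
$decoded = "text/plain; charset=UTF-8; name=\"test\xC3\xA4.txt\"";

var_dump(isAsciiHeaderValue($raw));     // bool(true)
var_dump(isAsciiHeaderValue($decoded)); // bool(false)

// Decoding happens only after the raw value has passed validation.
$value = validateThenDecode($raw, function ($v) {
    return iconv_mime_decode($v, 0, 'UTF-8');
});
```

Validating the decoded value, as the second var_dump() shows, is what rejects an otherwise well-formed message.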

@acid24
Author

acid24 commented Mar 11, 2016

If it helps anyone, what I did to work around this bug was to extend 3 classes:

class ImapStorage extends Zend_Mail_Storage_Imap
{

    public function __construct($params)
    {
        parent::__construct($params);
        $this->_messageClass = 'MailMessage';
    }
}
class MailMessage extends Zend_Mail_Message
{

    public function __construct(array $params)
    {
        parent::__construct($params);
        $this->_partClass = 'MailPart';
    }

    protected function _cacheContent()
    {
        // caching content if we can't fetch parts
        if ($this->_content === null && $this->_mail) {
            $this->_content = $this->_mail->getRawContent($this->_messageNum);
        }

        if (!$this->isMultipart()) {
            return;
        }

        // split content in parts
        $boundary = $this->getHeaderField('content-type', 'boundary');
        if (!$boundary) {
            /**
             * @see Zend_Mail_Exception
             */
            require_once 'Zend/Mail/Exception.php';
            throw new Zend_Mail_Exception('no boundary found in content type to split message');
        }
        $parts = Zend_Mime_Decode::splitMessageStruct($this->_content, $boundary);
        if ($parts === null) {
            return;
        }
        $partClass = $this->getPartClass();
        $counter = 1;
        foreach ($parts as $part) {
            $encodedHeaders = $decodedHeaders = $part['header'];
            if (isset($encodedHeaders['content-type'])) {
                $name = Zend_Mime_Decode::splitContentType($encodedHeaders['content-type'], 'name');
                if (null !== $name) {
                    $name = Zend_Mime::encodeQuotedPrintableHeader($name, 'UTF-8');
                    $pattern = '!(name=).*?(;|$)!';
                    // the pattern has only two capture groups, so the trailing delimiter is $2
                    $replacement = sprintf('$1"%s"$2', $name);

                    $encodedHeaders['content-type'] = preg_replace($pattern, $replacement, $encodedHeaders['content-type']);
                }
            }
            if (isset($encodedHeaders['content-disposition'])) {
                $filename = Zend_Mime_Decode::splitHeaderField($encodedHeaders['content-disposition'], 'filename');
                if (null !== $filename) {
                    $filename = Zend_Mime::encodeQuotedPrintableHeader($filename, 'UTF-8');
                    $pattern = '!(filename=).*?(;|$)!';
                    // the pattern has only two capture groups, so the trailing delimiter is $2
                    $replacement = sprintf('$1"%s"$2', $filename);

                    $encodedHeaders['content-disposition'] = preg_replace($pattern, $replacement, $encodedHeaders['content-disposition']);
                }
            }

            // we want to validate encoded headers to avoid validation errors because of UTF-8 characters
            $mailPart = new $partClass(array('headers' => $encodedHeaders, 'content' => $part['body']));
            // then if the validation passed set the decoded headers
            $mailPart->setHeaders($decodedHeaders);

            $this->_parts[$counter++] = $mailPart;
        }
    }

    private function encodeHeader($header)
    {
        if (is_array($header)) {
            return array_map(array($this, 'encodeHeader'), $header);
        }
        return Zend_Mime::encodeQuotedPrintableHeader($header, 'UTF-8');
    }
}
class MailPart extends Zend_Mail_Part
{

    public function setHeaders(array $headers)
    {
        $this->_headers = $headers;
    }
}
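
The header rewriting step inside the loop above can also be tried in isolation, without Zend Framework. The following is a minimal sketch, not part of the workaround itself: qEncodeWord() and reencodeParam() are hypothetical helpers, the Q-encoder is deliberately crude (it encodes every non-alphanumeric byte), and it assumes the parameter value is UTF-8 and contains no quotes or semicolons.

```php
<?php
// Hypothetical helper: wrap a UTF-8 string in an RFC 2047 Q-encoded word.
// Every byte that is not an ASCII letter or digit becomes =XX.
function qEncodeWord(string $text): string
{
    $encoded = preg_replace_callback('/[^A-Za-z0-9]/', function ($m) {
        return sprintf('=%02X', ord($m[0]));
    }, $text);
    return '=?UTF-8?Q?' . $encoded . '?=';
}

// Hypothetical helper: replace the value of one parameter (e.g. "name"
// or "filename") with its Q-encoded form, keeping the trailing delimiter.
function reencodeParam(string $header, string $param): string
{
    $pattern = '/\b(' . preg_quote($param, '/') . '=)"?([^";]*)"?(;|$)/';
    return preg_replace_callback($pattern, function ($m) {
        return $m[1] . '"' . qEncodeWord($m[2]) . '"' . $m[3];
    }, $header);
}

$header = "text/plain; charset=UTF-8; name=\"test\xC3\xA4.txt\"";
echo reencodeParam($header, 'name'), "\n";
// prints: text/plain; charset=UTF-8; name="=?UTF-8?Q?test=C3=A4=2Etxt?="
```

Using preg_replace_callback() instead of a replacement string avoids having to count capture groups or escape "$" in the encoded value.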

If anyone finds a cleaner way to circumvent this bug please post it here. Thanks

@pillex

pillex commented Apr 15, 2016

Can confirm this bug. Please fix it. We cannot parse many of our clients' emails for automatic processing. Thanks.

Receiving emails with attachments containing UTF-8 characters in the filenames is currently impossible.

Using ZF 1.12.17 on PHP 7.

For me, the exception is thrown in

foreach( new RecursiveIteratorIterator($message) as $part )

The error is:
Invalid header value detected

The exception happens in
/Mail/Header/HeaderValue.php

public static function isValid($value)
returns false for filenames with UTF-8 characters.

Example $value that throws:
text/plain; charset=UTF-8; name="testä.txt"

Here is a complete test email that throws the exception. The email was created in Thunderbird.
The filename of the attachment is testä.txt


To: [email protected]
From: "[email protected]" <[email protected]>
Subject: test
Date: Fri, 13 Apr 2016 19:29:37 +0200
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------050706090708030904000509"

This is a multi-part message in MIME format.
--------------050706090708030904000509
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Just some test text body

--------------050706090708030904000509
Content-Type: text/plain; charset=UTF-8;
 name="=?UTF-8?Q?test=c3=a4.txt?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*=utf-8''%74%65%73%74%C3%A4%2E%74%78%74

YXR0YWNobWVudCB0ZXN0
--------------050706090708030904000509--
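
As an aside, the test message above carries the attachment name in two different encodings: an RFC 2047 Q-encoded word in the Content-Type name parameter and an RFC 2231 extended value in filename*. A minimal sketch of decoding both with PHP built-ins follows; decodeQEncodedWord() and decodeExtendedParam() are hypothetical helpers, and both assume the declared charset is UTF-8 and do no charset conversion.

```php
<?php
// Hypothetical helper: decode a single RFC 2047 Q-encoded word.
// Returns null if the input is not of the form =?charset?Q?text?=
function decodeQEncodedWord(string $word): ?string
{
    if (!preg_match('/^=\?[^?]+\?Q\?([^?]*)\?=$/i', $word, $m)) {
        return null;
    }
    // In Q-encoding an underscore stands for a space.
    return quoted_printable_decode(str_replace('_', ' ', $m[1]));
}

// Hypothetical helper: decode an RFC 2231 extended parameter value
// of the form charset'language'percent-encoded-text.
function decodeExtendedParam(string $value): ?string
{
    $pieces = explode("'", $value, 3);
    if (count($pieces) !== 3) {
        return null;
    }
    return rawurldecode($pieces[2]);
}

$name     = decodeQEncodedWord('=?UTF-8?Q?test=c3=a4.txt?=');
$filename = decodeExtendedParam("utf-8''%74%65%73%74%C3%A4%2E%74%78%74");

var_dump($name === "test\xC3\xA4.txt");     // bool(true)
var_dump($filename === "test\xC3\xA4.txt"); // bool(true)
```

Both raw forms are pure ASCII, so they only stop being ASCII once they have been decoded.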

@mvanbaak

Another 'I can confirm' this.

As I do not have much time to fix it correctly I simply removed the body of the isValid method and made it return 'true'.
A proper fix is needed.

@pillex

pillex commented Apr 25, 2016

@mvanbaak

My "fix" is probably almost as bad. I added this as the first line of the method:

public static function isValid($value)
    {
     $value = iconv('ASCII', 'UTF-8//IGNORE', $value);

Why doesn't this even have the "bug" tag?
