A converted message has broken encoding in the message body (cp1251 in the .msg) #49
Comments
Can someone look at this, @mvz, please? I've been using your package since version 0.903, but since it was changed from utf8 to raw binary in version 0.919 (by Andreas Pflug), it has been useless for our Latin characters. The same happens with multiple code pages. I still have to use Perl v5.26.1 from Ubuntu 18 and the old version 0.903 with |
I'm guessing this means the codepage is set in a property that I'm currently skipping. @czende, if you have a message with this problem that you can share, sending it to me would really help with debugging.
Hi, I'm a co-worker of @czende. Unfortunately, we can't share the problematic email because it contains sensitive data, and we couldn't reproduce the problem any other way. I tested your PR locally on our problematic email, but the result was the same: the output .eml had a broken message body, so your modification did not help in our case. However, I found that in our case, "Latin Extended" characters like "ěščřž...." in the headers are probably responsible for the problem. Based on this, I changed the I'm not a Perl developer, so I'm not sure about this modification, but even according to the documentation, the |
Input: a .msg file saved by Outlook in cp1251
Output:
-headers: good
-attachments: good
-message body: encoding is broken
The converted file itself seems to be in UTF-8.
The message body currently looks like this:
--16855398770.0877C.31779
Content-Type: text/plain; charset="UTF-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Óâàæàåìûé ...!
Îçíàêîìüòåñü ñ ... ... ...
It should look like this (taken from the same message re-exported as Unicode):
--16855410730.aC1d0Ed.5308
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline
Уважаемый ...!
Ознакомьтесь с ... ... ...
According to this decoder (https://2cyr.com/decode/?lang=en), the mismatch is:
source encoding:
WINDOWS-1251
displayed as:
WINDOWS-1252
Email::Outlook::Message version: 0.921
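
The mismatch reported by the decoder (WINDOWS-1251 bytes displayed as WINDOWS-1252) can be reproduced in a few lines. This is only a sketch in Python, not the Perl code of Email::Outlook::Message. It assumes the body bytes from the .msg are being interpreted as Windows-1252 before being written out as UTF-8, which matches the symptom but has not been confirmed against the module's source. The names `raw_body`, `broken_text` and `repair` are made up for this illustration.

```python
# Illustration only: a Python stand-in for the suspected behaviour,
# not the actual Email::Outlook::Message implementation.

ORIGINAL = "Уважаемый"  # the text as the sender typed it

# Body bytes as they would be stored in a non-Unicode .msg
# saved with code page 1251.
raw_body = ORIGINAL.encode("cp1251")

# Suspected bug (assumption): the bytes are read as Windows-1252,
# then the resulting characters are written out as UTF-8.
broken_text = raw_body.decode("cp1252")
broken_utf8 = broken_text.encode("utf-8")

# Expected behaviour: decode with the message's own code page first,
# then encode as UTF-8 for the .eml body.
correct_utf8 = raw_body.decode("cp1251").encode("utf-8")


def repair(mojibake: str) -> str:
    """Undo the cp1251-read-as-cp1252 mix-up for already converted text.

    Only works while every character still maps back to a single
    Windows-1252 byte, which holds for the sample text used here.
    """
    return mojibake.encode("cp1252").decode("cp1251")


print(broken_text)                   # Óâàæàåìûé
print(correct_utf8.decode("utf-8"))  # Уважаемый
print(repair(broken_text))           # Уважаемый
```

The first printed line matches the broken body shown above and the last two match the Unicode re-export, so the output is consistent with the code page of the .msg being ignored somewhere during conversion.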