Missing Text Content #20

Open
stevepop opened this issue Dec 9, 2015 · 4 comments

Comments

@stevepop

stevepop commented Dec 9, 2015

I am using this to pull emails from an IMAP server. While it seems to be indexing all emails, a proportion of them have their contents missing, i.e. textContent and htmlContent are empty in Elasticsearch. Unfortunately this happens randomly, so I have no idea what the problem could be.

I also did not see any error in the logs that could give me an idea of why these contents are not being indexed.
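To put a number on "a proportion of those emails", the indexed documents could be checked with a small helper like the one below. This is only a sketch: it assumes the documents have already been fetched from Elasticsearch as plain dictionaries, the function names `has_missing_content` and `missing_ratio` are hypothetical, and the second sample document is invented for illustration. Only the field names `textContent` and `htmlContent` and the first subject line come from this issue.

```python
from typing import Any, Iterable, Mapping


def has_missing_content(doc: Mapping[str, Any]) -> bool:
    """Return True when both body fields are absent, null, or whitespace-only."""
    text = doc.get("textContent") or ""
    html = doc.get("htmlContent") or ""
    return not text.strip() and not html.strip()


def missing_ratio(docs: Iterable[Mapping[str, Any]]) -> float:
    """Fraction of documents whose text and HTML bodies are both empty."""
    docs = list(docs)
    if not docs:
        return 0.0
    return sum(has_missing_content(d) for d in docs) / len(docs)


# Hypothetical sample: the first entry uses field values reported in this issue,
# the second is an invented "healthy" document for comparison.
sample = [
    {"subject": "Re: Newsletter: 9th December 2015", "textContent": "", "htmlContent": None},
    {"subject": "hypothetical healthy mail", "textContent": "Hello", "htmlContent": "<p>Hello</p>"},
]

if __name__ == "__main__":
    print(missing_ratio(sample))
```

Grouping the affected documents by sender or mail client afterwards may help show whether the failures are really random.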

See the example extract from Sense below:

    "mailboxType": "IMAP",
    "popId": null,
    "receivedDate": 1449630321000,
    "sentDate": 1449630310000,
    "size": 8455,
    "subject": "Re: Newsletter: 9th December 2015",
    "textContent": "",
    "htmlContent": null

@salyh
Owner

salyh commented Dec 10, 2015

This can happen if the content type of the mail is invalid. If you can send me such a failing e-mail (or post it here), I will have a look.
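As a rough illustration of how a content type problem can leave both fields empty, here is a minimal sketch using Python's standard `email` package. It is not the plugin's actual extraction code; the function name `extract_bodies` and both sample messages are hypothetical. The idea is that an extractor which only accepts `text/plain` and `text/html` parts returns nothing when a part declares any other type.

```python
from email import message_from_string
from email.policy import default


def extract_bodies(raw: str) -> dict:
    """Collect the first text/plain and text/html parts of a raw message."""
    msg = message_from_string(raw, policy=default)
    bodies = {"textContent": "", "htmlContent": None}
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body of their own
        ctype = part.get_content_type()
        if ctype == "text/plain" and not bodies["textContent"]:
            bodies["textContent"] = part.get_content()
        elif ctype == "text/html" and bodies["htmlContent"] is None:
            bodies["htmlContent"] = part.get_content()
    return bodies


# Hypothetical messages: identical bodies, different declared content types.
good = (
    "Subject: ok\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Hello\n"
)
odd = (
    "Subject: odd\n"
    "MIME-Version: 1.0\n"
    "Content-Type: application/x-unknown\n"
    "\n"
    "Hello\n"
)

if __name__ == "__main__":
    print(extract_bodies(good))
    print(extract_bodies(odd))
```

Running the same kind of check against the raw source of a failing mail would show which content types its parts actually declare.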

@stevepop
Author

Hi @salyh, thanks for your response. I would prefer to send the failing emails to you directly. Can you tell me where to send them? Also, let me know exactly what you want me to send (i.e. the mail including headers, etc.).

Further investigation shows that most of the emails with missing message contents were sent from Microsoft Outlook and Outlook Web App. See an extract of one example below:

Subject: Test Mail 1 14/12/2015 _ 0958
Thread-Topic: Test Mail 1 14/12/2015 _ 0958
Thread-Index: AdE2VkynlG/aqZyHTDKBjR4vUcA3ww==
Date: Mon, 14 Dec 2015 04:01:19 -0600
Message-ID: <[email protected]>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator: <[email protected]>
MIME-Version: 1.0
X-MS-Exchange-Transport-FromEntityHeader: Hosted
X-MS-Exchange-Organization-Network-Message-Id: f26cf0bd-af6e-4535-2399-08d3046d8451
X-MS-Exchange-Organization-AVStamp-Mailbox: SMEXw]nP;1220900;0;This mail has
 been scanned by Trend Micro ScanMail for Microsoft Exchange;
X-MS-Exchange-Organization-SCL: 0
X-MS-Exchange-Organization-AuthSource: MBX11D-ORD1.mex06.mlsrvr.com
X-MS-Exchange-Organization-AuthAs: Anonymous
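A header extract like this one can be summarised programmatically to see whether a `Content-Type` header is present and which Exchange-specific headers accompany it. The sketch below uses Python's standard `email` package. The function name `summarize_headers` is hypothetical, the sample is a shortened copy of the extract with a placeholder address because the real one is redacted, and treating `X-MS-TNEF-Correlator` as a hint towards Exchange/TNEF handling is an assumption, not something established in this thread.

```python
from email import message_from_string


def summarize_headers(raw_headers: str) -> dict:
    """Report Content-Type presence and list the X-MS-* headers in a header block."""
    msg = message_from_string(raw_headers)
    ms_headers = sorted(
        name for name in msg.keys() if name.lower().startswith("x-ms-")
    )
    return {
        "has_content_type": "Content-Type" in msg,
        "has_tnef_correlator": "X-MS-TNEF-Correlator" in msg,
        "ms_headers": ms_headers,
    }


# Shortened copy of the extract; the address is a placeholder, not the real value.
sample = (
    "Subject: Test Mail 1 14/12/2015 _ 0958\n"
    "MIME-Version: 1.0\n"
    "X-MS-Has-Attach:\n"
    "X-MS-TNEF-Correlator: <placeholder@example.invalid>\n"
    "X-MS-Exchange-Organization-SCL: 0\n"
    "\n"
)

if __name__ == "__main__":
    print(summarize_headers(sample))
```

Since the extract is partial, a missing `Content-Type` in the summary only means the header was not part of what was pasted; the full raw message would be needed to confirm it.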

Thanks

@salyh
Owner

salyh commented Dec 14, 2015

For my email address, see https://github.com/salyh (left side). If you want to encrypt your mails with PGP, please find my key here: https://pgp.mit.edu/pks/lookup?op=get&search=0x7903F81190910A83

@stevepop
Author

Thanks Hendrik, email sent!
