Updating status of parcel tracking #1
Oops, I just found out you are actually already using Trackerific for this. But it requires credentials (and I had not configured any ^^). So the above solution might still be useful when the tracking URL is public. :)
Not sure why I'm not getting notifications when these come up, but I just saw this. Trackerific is doing a good job so far, but it would be cool to have some way to do this without requiring credentials. There are also some areas where I'm having issues, such as Amazon shipping emails; Weboob could help with that. Also on my list of things to add is support for flight tracking. There aren't great APIs for that, and definitely no free ones that I could find. Another possibly good use case. Thanks for sharing!
I do support it through microdata email markup in my fork. Sadly, this seemed to be widely used a few years ago, but it no longer is (at least by the European companies I travel with). :/ For now, I have turned mostly to adding extra crawlers that fetch from my online accounts directly, either using Weboob (crawler is here) or CozyCloud connectors (crawler is WIP). This has the extra advantage of reducing false positives (I got some with the tracking parsers). It might be worth noting that I'm indexing schema.org schemas, for standardization across parsers and crawlers. Feel free to take whatever is interesting for you from my fork. We could even consider a merge back if you find it interesting.
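To give an idea of what I mean by standardizing on schema.org, here is a simplified sketch (not the exact structure my fork uses; field names follow https://schema.org/ParcelDelivery, and the status is kept as a plain string instead of a full DeliveryEvent):

```python
# Simplified sketch: normalize whatever a parser or crawler returns into a
# schema.org-style ParcelDelivery record, so every source produces the same
# shape. The real fork may use a richer structure.
def to_parcel_delivery(tracking_number, carrier, status, expected_arrival=None):
    return {
        "@context": "https://schema.org",
        "@type": "ParcelDelivery",
        "trackingNumber": tracking_number,
        "provider": {"@type": "Organization", "name": carrier},
        "deliveryStatus": status,                   # e.g. "InTransit", "Delivered"
        "expectedArrivalUntil": expected_arrival,   # ISO 8601 string or None
    }

record = to_parcel_delivery("1Z999AA10123456784", "UPS", "InTransit", "2019-02-14")
```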
Indeed, but even more generally, I did not find many great APIs (even paid ones) to handle emails and extract meaningful info from them (using machine learning techniques). Inbox from Google is doing it nicely, but there seems to be no such reusable API :/
Yea, I was trying to figure out how Google extracts data for Gmail, but I gave up.
The idea of adding additional crawlers is cool! I hadn't even considered crawling account pages in addition to my IMAP inbox.
What I did find was APIs for returning flight tracking info, but they were something like half a cent per request. I'd have to extract tracking numbers from emails using regexes of sorts and then hit the API to see if they're valid. I'm definitely interested in contributions. If you've got parts you think may be useful to send back as patches, that'd be great. If you would like to send PRs, it'd be great to split them up and contribute the pieces incrementally; it'll be easier for me to test and verify the merges. Since this is a small project with only a tiny community (just us two so far), I'd like to try to accommodate contributions and prevent a hard fork that would split our work.
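For the regex part, a first pass could look something like this (an untested sketch; the patterns are only illustrative, and the tracking API would still be the judge of what's actually valid):

```python
import re

# Rough sketch of the "regexes of sorts" idea: scan an email body for strings
# that look like tracking numbers, then let the real API decide whether they
# are valid. Patterns below are illustrative, not exhaustive.
CANDIDATE_PATTERNS = [
    re.compile(r"\b1Z[0-9A-Z]{16}\b"),        # UPS ("1Z" + 16 alphanumerics)
    re.compile(r"\b\d{12}\b|\b\d{15}\b"),     # FedEx (common 12/15-digit forms)
    re.compile(r"\b\d{20,22}\b"),             # USPS (long numeric forms)
]

def candidate_tracking_numbers(email_body):
    found = set()
    for pattern in CANDIDATE_PATTERNS:
        found.update(pattern.findall(email_body))
    return sorted(found)

# Each candidate would then be checked against the tracking API; anything
# the API rejects gets discarded as a false positive.
```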
It seems to me they started with parsing microdata from the emails, but they are no longer relying on this. Not sure if they have a bunch of ad-hoc scrapers for the emails or if they are simply doing machine-learning stuff on them, similar to what https://developer.edison.tech/sift claims to do (but I gave this API a try and it returned terrible results).
Getting the flight tracking info might not be a huge problem. For a personal use case, I think scraping data from flight tracking websites might be more than enough, so this part (flight tracking / checking) could be handled quite easily for free. The difficult part in my opinion is extracting the flight number and details from the email.
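Even that could start with a naive heuristic, though. For example, something like this untested sketch (an IATA-style flight designator is a two-character airline code plus 1–4 digits, so this will over-match and the results would still need confirmation against a flight-status source):

```python
import re

# Pull candidate flight numbers out of an email body. Matches designators
# like "AF1234" or "U2 457"; deliberately loose, so confirmation is needed.
FLIGHT_RE = re.compile(r"\b([A-Z][A-Z0-9]|[A-Z0-9][A-Z])\s?(\d{1,4})\b")

def candidate_flights(email_body):
    return ["%s%s" % (code, number) for code, number in FLIGHT_RE.findall(email_body)]

print(candidate_flights("Your flight AF 1680 departs CDG at 10:05"))  # ['AF1680']
```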
Sure, I'll try to clean up and PR my system. Feel free to discard anything that might not be interesting to you. Is it fine if I open these PRs on GitHub (or do you prefer to use your own hosted repo)? I was also thinking about the organization of the repo and the microservices. Maybe it would make sense to have a GitHub organization hosting all the microservices in dedicated repos, plus a main entry point with the instructions and a global Docker Compose file? This might make it easier to support extra community-driven microservices, a bit like YunoHost does for their official and community apps. This might be a bit overkill at the moment though :)
Hi,
It seems from the web UI (viewer) that you planned at some point to automatically fetch the latest status for all shipping numbers found in the emails.
Not sure if it might help, but I know about Weboob (Web Outside Of Browsers), which is a collection of Python modules to fetch (scrape) data from websites. For instance, they already have a module for UPS (https://git.weboob.org/weboob/devel/tree/master/modules/ups) and DHL (https://git.weboob.org/weboob/devel/tree/master/modules/dhl).
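For illustration, driving those modules from Python could look roughly like this (untested; the CapParcel capability and the get_parcel_tracking() call should be double-checked against the weboob version you actually install):

```python
# Rough sketch, assuming configured weboob backends (e.g. ups, dhl) and that
# the parcel capability exposes get_parcel_tracking(); check the installed
# weboob version, as the API may differ.
from weboob.core import Weboob
from weboob.capabilities.parcel import CapParcel

def fetch_status(tracking_number):
    w = Weboob()
    w.load_backends(CapParcel)  # load the configured parcel backends
    # w.do() dispatches the call to every loaded backend and yields results
    for parcel in w.do('get_parcel_tracking', tracking_number):
        return parcel.status, parcel.info
    return None

print(fetch_status("1Z999AA10123456784"))
```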
I might actually try to bind the two of them on my setup at some point, let me know if you are interested :)