-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handles broken RSS feeds that may not include an <rss> declaration. #179
base: master
Are you sure you want to change the base?
Conversation
Wow. I'm glad you were able figure out what was broken about that feed. But I'm not sure it's reasonable to expect a parser library to handle a feed that is so fundamentally broken. No complaints at all about how you implemented this. And thank you for the PR. I'm going to close this for now, but I'm open to discussion. |
What about an option to allow a user to define the feed type if none are found? Or an option to allow "cleaning up" a feed before it's read, which would detect this sort of error? |
That sounds doable.
This is probably already doable by hooking into the SAX events emitted, but I've been thinking about adding more hooks for precisely this kind of thing. Something like:
That's rather painfully explicit (and I probably wouldn't send plain text but rather a parsed node), but you get the picture. |
I'd prefer to see an option to provide "defaults" such as the default parser type, but anything that works would be fine. For now I'm just using my Fork of Feedparser, since my goal of using this library really was to support any feed, no matter how terribly broken it may be. I deal with almost 20k feeds, most of which are pretty fundamentally broken. |
0146a96
to
6444e61
Compare
Before this fix, if the summary tag appeared BEFORE the description tag, it was accepted as the description (rather than the full-text which may have appeared below the summary tag in the description tag)
Some international feeds use <articles> and <article> rather than <items> and <item> to separate their articles
* master: Support "article" as a tag for feeds
* 'master' of github.com:danmactough/node-feedparser: 2.2.10 Update mri Update readable-stream Update npm audit fixes Remove unused Makefile Update examples to use node-fetch in place of request Update mocha v7 Update eslint v6 Replace iconv with iconv-lite Update travis config; drop support for unmaintained node versions
* master: 2.2.10 Update mri Update readable-stream Update npm audit fixes Remove unused Makefile Update examples to use node-fetch in place of request Update mocha v7 Update eslint v6 Replace iconv with iconv-lite Update travis config; drop support for unmaintained node versions
An example feed is: http://stocknewsnow.com/feed/. The feed is a semi-valid
RSS feed, it's just missing the global declaration.