configure text element #28

Open
edsu opened this issue Mar 11, 2017 · 0 comments

It might be useful to be able to configure a feed with a CSS selector that specifies which element readability should extract text from. For example, the Washington Post currently uses

<article itemprop="articleBody">...</article>

to enclose the text of the article, using https://schema.org/NewsArticle microdata. Perhaps the config could look like:

- name: Washington Post - Politics
  url: http://feeds.washingtonpost.com/rss/politics
  css_selector: article[itemprop="articleBody"]
  twitter:
    access_token: foo
    access_token_secret: bar

I guess the downside is that sites change: unless you are watching a feed, you may not notice when the site's markup changes, and your diffengine instance would quietly stop working.
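
To make the idea and its failure mode concrete, here is a rough sketch of how a configured selector could be applied, with a warning when it stops matching. This is not diffengine's code: `extract_text`, `SelectorTextParser`, and the logger name are hypothetical, and the sketch uses only the Python standard library, so it understands just the single `tag[attr="value"]` form shown in the proposed `css_selector` setting. A real implementation would more likely lean on a proper CSS selector library.

```python
import logging
import re
from html.parser import HTMLParser

log = logging.getLogger("diffengine.sketch")  # hypothetical logger name

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Only selectors of the form tag[attr="value"] are supported in this sketch.
SELECTOR_RE = re.compile(r'^(\w+)\[([\w-]+)="([^"]*)"\]$')


class SelectorTextParser(HTMLParser):
    """Collect the text inside the first element matching tag[attr="value"]."""

    def __init__(self, tag, attr, value):
        super().__init__()
        self.tag, self.attr, self.value = tag, attr, value
        self.depth = 0         # > 0 while inside the matched element
        self.matched = False   # True once the selector has matched
        self.done = False      # True once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif tag == self.tag and dict(attrs).get(self.attr) == self.value:
            self.matched = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, css_selector):
    """Return the selected element's text, or None if nothing matched."""
    m = SELECTOR_RE.match(css_selector)
    if m is None:
        raise ValueError(f"unsupported selector: {css_selector!r}")
    parser = SelectorTextParser(*m.groups())
    parser.feed(html)
    parser.close()
    if not parser.matched:
        # Make the "site changed its markup" case loud instead of silent.
        log.warning("selector %r matched nothing; markup may have changed",
                    css_selector)
        return None
    # Collapse runs of whitespace so diffs are not triggered by formatting.
    return " ".join("".join(parser.chunks).split())
```

A caller could treat a `None` result as a signal to fall back to the default readability extraction rather than recording an empty version. Note that the depth counting assumes well-formed markup with explicit closing tags, and that text inside nested `script` or `style` elements is not filtered out.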
