Extract publication date from crawled pages #164

Open
2 tasks
aecio opened this issue May 31, 2018 · 0 comments

Comments

Member

aecio commented May 31, 2018

Try to extract the actual publication date of each crawled page from HTML meta tags and other heuristics, such as:

  • URL
  • HTML meta tags
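
The two heuristics listed above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the function names, the list of meta keys in `DATE_META_KEYS`, and the URL pattern are all assumptions chosen for the example, and only ISO 8601 style date values are handled.

```python
import re
from datetime import date
from html.parser import HTMLParser
from typing import Optional

# Assumed, non-exhaustive list of meta keys that often carry a publication date.
# Order expresses preference when several are present.
DATE_META_KEYS = (
    "article:published_time",
    "og:published_time",
    "datepublished",
    "dc.date",
    "pubdate",
    "date",
)

# Matches path segments such as /2018/05/31/ or /2018-05-31 (assumed URL layout).
URL_DATE_RE = re.compile(r"/((?:19|20)\d{2})[/-](\d{1,2})[/-](\d{1,2})(?!\d)")

# Matches the leading YYYY-MM-DD part of an ISO 8601 value.
ISO_DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def _to_date(year: str, month: str, day: str) -> Optional[date]:
    """Build a date, returning None for impossible values such as month 13."""
    try:
        return date(int(year), int(month), int(day))
    except ValueError:
        return None


def date_from_url(url: str) -> Optional[date]:
    """Heuristic 1: look for a year/month/day pattern in the URL path."""
    match = URL_DATE_RE.search(url)
    return _to_date(*match.groups()) if match else None


class _MetaDateParser(HTMLParser):
    """Collects the content of <meta> tags whose key is in DATE_META_KEYS."""

    def __init__(self) -> None:
        super().__init__()
        self.candidates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = (
            attributes.get("property")
            or attributes.get("name")
            or attributes.get("itemprop")
            or ""
        ).lower()
        content = attributes.get("content")
        if key in DATE_META_KEYS and content:
            # Keep the first value seen for each key.
            self.candidates.setdefault(key, content.strip())


def date_from_meta(html: str) -> Optional[date]:
    """Heuristic 2: read the publication date from HTML meta tags."""
    parser = _MetaDateParser()
    parser.feed(html)
    for key in DATE_META_KEYS:
        value = parser.candidates.get(key)
        if value:
            match = ISO_DATE_RE.match(value)
            parsed = _to_date(*match.groups()) if match else None
            if parsed:
                return parsed
    return None


def extract_publication_date(url: str, html: str) -> Optional[date]:
    """Prefer meta tags and fall back to the URL when they yield nothing."""
    return date_from_meta(html) or date_from_url(url)
```

In this sketch the meta tags take priority because they are stated explicitly by the publisher, while a date in the URL is treated as a fallback. A caller would pass the page URL and its raw HTML to `extract_publication_date` and receive either a `datetime.date` or `None`.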
aecio changed the title from "Extract page publication date from crawled pages" to "Extract publication date from crawled pages" on May 31, 2018