By default, dates before 1995 are considered implausible. However, lowering the minimum date does not fix the issue: the 1991 publication date is still rejected.
CLI:
htmldate -u "https://web.archive.org/web/20201205182452/https://www.lesechos.fr/1991/01/saddam-hussein-menace-larabie-saoudite-939083" -vv -min "1990-01-01"
Python:
Here is the debug output without min_date:
DEBUG:htmldate.core:examining meta property: <meta data-rh="true" property="article:published_time" content="1991-01-02T01:01:00+01:00">
DEBUG:htmldate.core:examining meta property: <meta data-rh="true" property="article:modified_time" content="1991-01-02T01:01:00+01:00">
DEBUG:htmldate.core:analyzing (HTML): <footer class="sc-1lhe64-3 kPYMmr"><div class="sc-123ocby-3 fjTtGI"><div class="sc-aamjrj-0 sc-15kkm
DEBUG:htmldate.extractors:found partial date in URL: /1991/01//01
DEBUG:htmldate.core:extensive search started
DEBUG:htmldate.core:looking for copyright/footer information
DEBUG:htmldate.core:3 components
DEBUG:htmldate.validators:no potential year: 1991-01-02
DEBUG:htmldate.validators:no potential year: 1991-01-31
DEBUG:htmldate.core:firstselect: [('2022-07-26', 22), ('2022-07-25', 6), ('2020-01-29', 2), ('2022-06-28', 2)]
DEBUG:htmldate.core:bestones: [('2022-07-26', 22), ('2022-07-25', 6)]
DEBUG:htmldate.validators:date found for pattern "re.compile('\\D([0-9]{4}[/.-][0-9]{2}[/.-][0-9]{2})\\D')": 2022-07-26
'2022-07-26 00:00:00'
With min_date set to "1990-01-01":
DEBUG:htmldate.core:examining meta property: <meta data-rh="true" property="article:published_time" content="1991-01-02T01:01:00+01:00">
DEBUG:htmldate.extractors:custom parse test: 1991-01-02T01:01:00+01:00
DEBUG:htmldate.validators:date not valid: 1991-01-02 01:01:00+01:00
DEBUG:htmldate.validators:date not valid: 1991-01-02 00:00:00
DEBUG:htmldate.validators:date not valid: 1991-01-01 00:00:00
DEBUG:htmldate.core:examining meta property: <meta data-rh="true" property="article:modified_time" content="1991-01-02T01:01:00+01:00">
DEBUG:htmldate.validators:date not valid: 1991-01-02
DEBUG:htmldate.core:analyzing (HTML): <footer class="sc-1lhe64-3 kPYMmr"><div class="sc-123ocby-3 fjTtGI"><div class="sc-aamjrj-0 sc-15kkm
DEBUG:htmldate.extractors:custom parse test: Les Echos1991Janvier 1991
DEBUG:htmldate.extractors:send to external parser: Les Echos1991Janvier 1991
DEBUG:htmldate.extractors:found partial date in URL: /1991/01//01
DEBUG:htmldate.validators:date not valid: 1991-01-01 00:00:00
DEBUG:htmldate.core:extensive search started
DEBUG:htmldate.extractors:custom parse test: Publié le 2 janv. 1991 à 1:01
DEBUG:htmldate.extractors:send to external parser: Publié le 2 janv. 1991 à 1:01
DEBUG:htmldate.core:looking for copyright/footer information
DEBUG:htmldate.core:3 components
DEBUG:htmldate.validators:no potential year: 1991-01-02
DEBUG:htmldate.validators:no potential year: 1991-01-31
DEBUG:htmldate.core:firstselect: [('2022-07-26', 22), ('2022-07-25', 6), ('2020-01-29', 2), ('2022-06-28', 2)]
DEBUG:htmldate.core:bestones: [('2022-07-26', 22), ('2022-07-25', 6)]
DEBUG:htmldate.validators:date found for pattern "re.compile('\\D([0-9]{4}[/.-][0-9]{2}[/.-][0-9]{2})\\D')": 2022-07-26
'2022-07-26 00:00:00'
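For reference, the behaviour one would expect from a user-supplied minimum date can be sketched with the standard library alone. This is not htmldate's actual validation code: the helper `is_plausible` and the constant `DEFAULT_MIN_DATE` are hypothetical names used only for illustration, and the 1995 default is taken from the description above. The sketch shows that a candidate of 1991-01-02 should be rejected under the default bound but accepted once the bound is lowered to 1990-01-01.

```python
from datetime import datetime

# Assumption: default lower bound of 1995-01-01, as stated in this report.
DEFAULT_MIN_DATE = datetime(1995, 1, 1)


def is_plausible(candidate, min_date=None, max_date=None):
    """Hypothetical helper: check a YYYY-MM-DD string against date bounds.

    A user-supplied min_date or max_date overrides the corresponding default.
    """
    earliest = datetime.strptime(min_date, "%Y-%m-%d") if min_date else DEFAULT_MIN_DATE
    latest = datetime.strptime(max_date, "%Y-%m-%d") if max_date else datetime.now()
    try:
        parsed = datetime.strptime(candidate, "%Y-%m-%d")
    except ValueError:
        # Anything that is not a well-formed date is rejected outright.
        return False
    return earliest <= parsed <= latest


published = "1991-01-02"  # value of article:published_time in the logs above
default_result = is_plausible(published)
custom_result = is_plausible(published, min_date="1990-01-01")
print(default_result, custom_result)
```

In the second log, the meta date is still reported as "date not valid" even though the minimum date was lowered, which is the mismatch with this expected behaviour.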
Bug originally posted by @kinoute in #8 (comment)
fix: properly set args for date validation (#62)
1aceb8a
adequate use of plausible_year_filter() (#62)
d0afa53