Add rss url checker #170

Merged
merged 10 commits into from
Jul 13, 2023
Conversation

16arpi (Contributor) commented Jul 10, 2023

This pull request adds a function to Ural that returns the probability that a URL points to an RSS feed.
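To illustrate the general idea of judging a URL from its text alone, here is a minimal sketch. It is not the code from this pull request: the name `looks_like_rss_url`, the hint pattern, and the extension list are all hypothetical, and it returns a boolean rather than a probability.

```python
import re
from urllib.parse import urlsplit

# Hypothetical hints, chosen for illustration only; these are NOT the
# patterns used in Ural. A hint must be delimited on both sides so that
# words such as "feedback" do not match.
RSS_HINT_RE = re.compile(
    r"(?:^|[/._=\-])(?:rss|feed|feeds|atom)(?:$|[/._=\-&?])", re.I
)
RSS_EXTENSIONS = (".rss", ".atom")


def looks_like_rss_url(url: str) -> bool:
    """Guess, from the URL text alone, whether it may point to an RSS feed."""
    parts = urlsplit(url)
    path = parts.path.lower()

    # Feed-specific file extensions are a strong signal.
    if path.endswith(RSS_EXTENSIONS):
        return True

    # Otherwise look for a delimited hint in the path or the query string.
    target = path + ("?" + parts.query if parts.query else "")
    return RSS_HINT_RE.search(target) is not None
```

A purely lexical check like this never fetches the URL, so it is cheap but can only ever be a guess; the real function in `ural/should_be_rss_url.py` may use different and more extensive rules.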

The script is 99.24% accurate on the RSS feed dataset used for this blog post, and 99.78% accurate on a random list of URLs taken from a Twitter collection (assuming that none, or only a tiny fraction, of those URLs point to RSS feeds).
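The two figures above come from two kinds of measurement: the share of known feeds that get flagged, and the share of presumably non-feed URLs that get flagged by mistake. A minimal sketch of that evaluation follows. The helper `fraction_flagged`, the stand-in `naive_predicate`, and the sample URLs are all hypothetical placeholders, not the datasets or the checker from this pull request.

```python
from typing import Callable, Iterable


def fraction_flagged(urls: Iterable[str], predicate: Callable[[str], bool]) -> float:
    """Return the share of `urls` for which `predicate` returns True."""
    urls = list(urls)
    if not urls:
        raise ValueError("cannot evaluate on an empty URL list")
    return sum(1 for url in urls if predicate(url)) / len(urls)


def naive_predicate(url: str) -> bool:
    # Deliberately crude stand-in for the real checker (assumption).
    lowered = url.lower()
    return "rss" in lowered or "feed" in lowered


# Placeholder data, invented for this example.
known_feeds = [
    "https://example.com/feed",
    "https://example.org/rss.xml",
    "https://example.net/news",
]
generic_urls = [
    "https://example.com/about",
    "https://example.com/feedback",
    "https://example.org/",
    "https://example.net/contact",
]

# Share of known feeds detected (higher is better).
detection_rate = fraction_flagged(known_feeds, naive_predicate)

# Share of generic URLs flagged; this is the false-positive rate only
# under the assumption that none of them actually is a feed.
false_positive_rate = fraction_flagged(generic_urls, naive_predicate)
```

Here the crude predicate flags `https://example.com/feedback`, which shows why a generic URL list is useful for spotting false positives.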

16arpi linked an issue Jul 10, 2023 that may be closed by this pull request
16arpi force-pushed the add_rss_url_checker branch 2 times, most recently from 2bfe09b to 825a995, July 10, 2023 15:58
Yomguithereal (Member) commented Jul 11, 2023

  • Test for false positives on a generic set of URLs
  • Add platform-specific heuristics
  • Documentation

ural/should_be_rss_url.py: 9 review threads (outdated, resolved)
ural/should_be_rss_url.pyi: 1 review thread (outdated, resolved)
Yomguithereal merged commit 80aff3e into master Jul 13, 2023
Successfully merging this pull request may close these issues.

RSS url checker