Currently, our spider uses the JSON-LD article tag to find the full text of a media article web page. The problem is that few media websites support this tag. Consequently, our volunteers have to manually copy and paste the text from the web page into our input field.
Any method (scraping, tags we do not use yet) that helps to read the full text automatically is welcome.
As roaddanger.org is multilingual, it would be nice if the full-text extractor supported multiple languages.
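
To make the request more concrete, here is a minimal sketch of one possible approach, not the spider's actual code. It first looks for an `articleBody` value in any JSON-LD block and, if none is found, falls back to collecting the longer `<p>` paragraphs on the page. The names `ArticleTextParser`, `find_article_body`, `extract_full_text` and the `min_paragraph_chars` threshold are hypothetical and chosen only for illustration. The fallback relies on paragraph length rather than language-specific words, so it should behave similarly across languages, although the threshold is a guess that would need tuning.

```python
import json
from html.parser import HTMLParser


class ArticleTextParser(HTMLParser):
    """Collects raw JSON-LD script contents and the text of <p> elements."""

    def __init__(self):
        super().__init__()
        self.json_ld_blocks = []  # contents of <script type="application/ld+json">
        self.paragraphs = []      # whitespace-normalised text of each <p>
        self._in_json_ld = False
        self._in_paragraph = False
        self._buffer = []

    def _flush_paragraph(self):
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._in_paragraph = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "p":
            if self._in_paragraph:  # previous <p> was never closed
                self._flush_paragraph()
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.json_ld_blocks.append("".join(self._buffer))
            self._in_json_ld = False
            self._buffer = []
        elif tag == "p" and self._in_paragraph:
            self._flush_paragraph()

    def handle_data(self, data):
        if self._in_json_ld or self._in_paragraph:
            self._buffer.append(data)


def find_article_body(node):
    """Recursively search decoded JSON-LD for a non-empty 'articleBody' string."""
    if isinstance(node, dict):
        body = node.get("articleBody")
        if isinstance(body, str) and body.strip():
            return body.strip()
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_article_body(child)
        if found:
            return found
    return None


def extract_full_text(html, min_paragraph_chars=40):
    """Return the article text, or None if nothing usable was found."""
    parser = ArticleTextParser()
    parser.feed(html)
    parser.close()

    # 1. Preferred source: JSON-LD articleBody.
    for raw in parser.json_ld_blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is skipped, not fatal
        body = find_article_body(data)
        if body:
            return body

    # 2. Fallback (assumption): keep only paragraphs long enough to be body text.
    long_paragraphs = [p for p in parser.paragraphs if len(p) >= min_paragraph_chars]
    return "\n\n".join(long_paragraphs) or None
```

Calling `extract_full_text(page_html)` on a downloaded page returns a string, or `None` when neither source yields text, in which case the volunteer would still paste the text by hand. The paragraph fallback will pick up cookie banners and teasers on some sites, so it is a starting point rather than a complete extractor.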
digitaldutch changed the title from "Automatically extract the full article text from a media page" to "Automatically extract the full article text from a media webpage" on Sep 9, 2024