Is there a way to specify which language a news article is in, so that it is fetched correctly?
I used the library on a news article in Portuguese, but it converted accented ("special") letters to plain ones.
This seriously compromises NLP procedures that deal with syntax, context, etc.
Example: "àáéóíúâôêãõç" is converted to "aaeoiuaoeaoc".
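For illustration only, the loss described above resembles what you get from Unicode decomposition followed by dropping the combining marks. This is an assumption about the kind of transformation involved, not a claim about how newsfetch or Newspaper3k implements it; `strip_accents` is a hypothetical helper written for this sketch.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove diacritics by decomposing characters and dropping combining marks."""
    # NFKD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


sample = "àáéóíúâôêãõç"
print(strip_accents(sample))  # only the unaccented base letters remain
```

Once the text has gone through a step like this, the original accents cannot be recovered from the output alone, which is why it matters for downstream NLP.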
from newsfetch.news import newspaper
news = newspaper('https://g1.globo.com/sc/santa-catarina/noticia/2021/01/20/greve-na-comcap-coleta-feita-por-empresa-privada-em-florianopolis-vai-abranger-35percent-do-roteiro-diz-prefeitura.ghtml')
I saw that the class uses the Newspaper3k scraper internally, and if I enforce the correct language there, it returns the correct text.
from newspaper import Article
article = Article(url, language='pt')
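A slightly fuller sketch of that workaround, assuming Newspaper3k is installed and the page is reachable. `fetch_text` and `contains_diacritics` are hypothetical helper names introduced here, not part of newsfetch or Newspaper3k; only `Article(url, language=...)`, `download()`, `parse()` and `.text` come from Newspaper3k.

```python
import unicodedata


def contains_diacritics(text: str) -> bool:
    """Return True if any character carries a combining mark (e.g. 'ç', 'ã')."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


def fetch_text(url: str, language: str = "pt") -> str:
    """Hypothetical helper: fetch article text with an explicit language."""
    # Imported lazily so the checker above works without newspaper3k installed.
    from newspaper import Article

    article = Article(url, language=language)
    article.download()  # fetch the raw HTML
    article.parse()     # extract the article body
    return article.text


# Usage (requires network access and newspaper3k):
# text = fetch_text(url, language="pt")
# print(contains_diacritics(text))
```

Checking the fetched text with `contains_diacritics` is a quick way to confirm whether the accents survived for a given language setting.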
Thank you.