Websites aren't scraped #4

Open
mitscheluk opened this issue Jul 30, 2023 · 2 comments

Comments

@mitscheluk

So everything seems to work (via Streamlit). However, Browserless isn't called at all for me, meaning the scraper isn't scraping anything. I do get SERPs, and the outputs I get are based only on those.

@mathematicsofpaul

scrape_website("Why did Elon rebrand twitter?", "https://www.axios.com/2023/08/03/twitters-x-rebrand-explained")

Gives:

Scraping website...
CONTENTTTTTT: Just a moment...www.axios.comChecking if the site connection is securewww.axios.com needs to review the security of your connection before proceeding.Connection is secureProceeding...Enable JavaScript and cookies to continueRay ID: 7f9b1f2eeb82cf6dPerformance & security by Cloudflare
'Just a moment...www.axios.comChecking if the site connection is securewww.axios.com needs to review the security of your connection before proceeding.Connection is secureProceeding...Enable JavaScript and cookies to continueRay ID: 7f9b1f2eeb82cf6dPerformance & security by Cloudflare'

That's because Browserless is not the best at bypassing Cloudflare's anti-bot browser check, so the scraper gets the challenge page back instead of the article.
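One way to at least notice when this happens is to check the scraped text for the challenge-page phrases seen in the output above before passing it on. The following is only a minimal sketch: `looks_like_cloudflare_challenge`, `CHALLENGE_MARKERS`, and the `min_hits` threshold are hypothetical names and values, not part of this project or of Browserless, and the phrase list is taken solely from the single output shown here, so other blocked pages may look different.

```python
import re

# Phrases copied from the challenge page text in the output above.
# Assumption: other Cloudflare interstitials may use different wording.
CHALLENGE_MARKERS = (
    "Just a moment...",
    "Checking if the site connection is secure",
    "Enable JavaScript and cookies to continue",
    "Performance & security by Cloudflare",
)


def looks_like_cloudflare_challenge(content: str, min_hits: int = 2) -> bool:
    """Return True if the scraped text contains at least `min_hits` marker phrases."""
    # Collapse whitespace and lowercase so line breaks or casing do not hide a phrase.
    normalized = re.sub(r"\s+", " ", content).lower()
    hits = sum(marker.lower() in normalized for marker in CHALLENGE_MARKERS)
    return hits >= min_hits


if __name__ == "__main__":
    blocked = (
        "Just a moment...www.axios.comChecking if the site connection is secure"
        "Enable JavaScript and cookies to continue"
        "Performance & security by Cloudflare"
    )
    print(looks_like_cloudflare_challenge(blocked))
```

Requiring two matching phrases rather than one is a guess at a reasonable trade-off: a real article could plausibly contain "Just a moment..." on its own. If the check returns True, the caller could skip the page and fall back to the SERP snippet instead of summarizing the challenge text.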

@chrisb1005

Any workaround here? I'm getting the same outcome.
