This repository contains a folder named "pulse_news_scrapper", which holds a news scraper written in Python that uses Beautiful Soup to scrape data from the Zerodha Pulse news website. The scraped news is then automatically saved to a Google Sheets spreadsheet through the Google Drive API. The whole process is built on the Flask microframework and is scheduled to run every 24 hours.
-> Scrapes news data from the Zerodha Pulse news website.
-> Utilizes Beautiful Soup for HTML parsing and data extraction.
-> Automatically saves the scraped news to a Google Sheets spreadsheet.
-> Uses the Google Drive API for interacting with Google Sheets.
-> Built on the Flask microframework for web application functionality.
-> Scheduled to run every 24 hours to ensure up-to-date news data.
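
As a rough illustration of the Beautiful Soup parsing step listed above, here is a minimal sketch. The HTML snippet, the CSS selectors, and the `parse_news` helper are assumptions made for this example; the real markup of the Pulse page and the code in this repository may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<ul id="news">
  <li class="item">
    <h2 class="title"><a href="https://example.com/a">Markets open higher</a></h2>
    <span class="date">2 hours ago</span>
  </li>
  <li class="item">
    <h2 class="title"><a href="https://example.com/b">Rupee steady against dollar</a></h2>
    <span class="date">5 hours ago</span>
  </li>
</ul>
"""


def parse_news(html):
    """Extract title, URL, and publication text from each news item."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for li in soup.select("li.item"):
        link = li.select_one("h2.title a")
        date = li.select_one("span.date")
        if link is None:
            # Skip entries that do not contain a headline link.
            continue
        items.append({
            "title": link.get_text(strip=True),
            "url": link.get("href", ""),
            "published": date.get_text(strip=True) if date else "",
        })
    return items


if __name__ == "__main__":
    for item in parse_news(SAMPLE_HTML):
        print(item["title"], "->", item["url"])
```

In a live run, the HTML would come from an HTTP request to the news site instead of the inline sample string.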
Before running the news scraper, ensure you have the following:
-> Python installed on your machine.
-> The required libraries and dependencies installed. You can find them in the requirements.txt file.
-> Your own Google Drive API credentials and the accompanying JSON file. This video shows how to get them: https://www.youtube.com/watch?v=OzscY5uDeK0&t=578s&ab_channel=AccidentalSoftwareTester

![image](https://github.com/YASHGUPTA2611/Financial_news_scrapper/assets/74678250/d0e978f7-899f-469c-9b64-506550c6cb4b)
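
Once the JSON file is available, writing rows to a spreadsheet could look roughly like the sketch below. It assumes the third-party `gspread` library, which this README does not name, so treat that choice as an assumption; the `news_to_rows` and `save_to_sheet` helpers, the key file path, and the sheet title are likewise hypothetical.

```python
def news_to_rows(items):
    """Convert scraped news dicts into sheet rows, with a header row first."""
    header = ["title", "url", "published"]
    rows = [header]
    for item in items:
        # Missing fields become empty cells so every row has the same width.
        rows.append([str(item.get(col, "")) for col in header])
    return rows


def save_to_sheet(rows, key_path, sheet_title):
    """Append rows to the first worksheet of an existing spreadsheet.

    Assumes gspread is installed and that the spreadsheet has been shared
    with the service account described by the JSON file at key_path.
    """
    # Imported here so news_to_rows works even without gspread installed.
    import gspread

    client = gspread.service_account(filename=key_path)
    worksheet = client.open(sheet_title).sheet1
    worksheet.append_rows(rows)


if __name__ == "__main__":
    sample = [{"title": "Markets open higher", "url": "https://example.com/a"}]
    print(news_to_rows(sample))
    # Hypothetical file name and sheet title; replace with your own:
    # save_to_sheet(news_to_rows(sample), "credentials.json", "Pulse News")
```

Keeping the row conversion separate from the upload makes it possible to check the data without touching the network or the credentials.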
This repository also includes the "live_mint_news_scrapper" folder, which contains a Google Colab notebook. The notebook uses Beautiful Soup to scrape data from the Mint news website (https://www.livemint.com/market/stock-market-news). The scraped news is stored in a DataFrame, which can be exported to a CSV file for convenient downloading.
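
The DataFrame-to-CSV step might be sketched as follows, assuming the DataFrame is a pandas one. The column names, sample rows, and the `news_to_csv` helper are made up for illustration and are not taken from the notebook.

```python
import pandas as pd


def news_to_csv(items, path=None):
    """Build a DataFrame from scraped items and export it as CSV.

    Returns the CSV text when path is None; otherwise writes the file
    at path and returns None (this mirrors DataFrame.to_csv).
    """
    df = pd.DataFrame(items, columns=["title", "url"])
    return df.to_csv(path, index=False)


if __name__ == "__main__":
    # Hypothetical sample rows standing in for scraped headlines.
    sample = [
        {"title": "Sensex gains", "url": "https://example.com/x"},
        {"title": "Nifty flat", "url": "https://example.com/y"},
    ]
    print(news_to_csv(sample))
    # To write a file instead (hypothetical file name):
    # news_to_csv(sample, "mint_news.csv")
```

In Colab, a file written this way appears in the session's file browser, from which it can be downloaded.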