Skip to content

Latest commit

 

History

History
74 lines (45 loc) · 2.52 KB

ABOUT_PROJECT.md

File metadata and controls

74 lines (45 loc) · 2.52 KB
# 📰 TechCrunchStartupBot

## 📖 Project Description

**TechCrunchStartupBot** is an automated web scraping tool designed to fetch the latest startup news from TechCrunch. The bot is built using TypeScript, JavaScript, and Puppeteer. It navigates to the TechCrunch Startups section, scrapes the latest articles, and generates a markdown file with clickable links to the articles along with their publication time.

## 🛠️ Technologies Used

- **TypeScript**: Type-safe scripting language that builds on JavaScript.
- **JavaScript**: Programming language for building the scraping logic.
- **Puppeteer**: Node.js library providing a high-level API to control Chrome or Chromium over the DevTools Protocol.

## 🌟 Features

- **Automated Web Scraping**: Navigates to the TechCrunch Startups section and scrapes the latest articles.
- **Markdown File Generation**: Creates a markdown file (`LATEST_TECH_NEWS.md`) with clickable links to the articles and their publication time.
- **Error Handling**: Handles potential navigation and scraping errors gracefully.

## 📦 Installation

To get started with TechCrunchStartupBot, follow these steps:

1. **Clone the repository**:
   ```bash
   git clone https://github.com/yourusername/TechCrunchStartupBot.git
   cd TechCrunchStartupBot
  1. Install dependencies:

    npm install
  2. Run the bot:

    node scrapeTechCrunch.js

🚀 Usage

The bot will navigate to TechCrunch, scrape the latest startup articles, and generate a LATEST_TECH_NEWS.md file in the project directory. This file will contain the titles of the articles as clickable links, along with the publication time.

⏲️ Scheduling

To ensure the bot fetches the most recent data, it is recommended to run the script hourly.

Setting Up a Cron Job (Unix-based Systems)

To run the script at a specified interval (e.g., hourly), you can set up a cron job. Open the cron table using the following command:

crontab -e

Add the following line to run the script hourly:

0 * * * * /usr/bin/node /path/to/your/scrapeTechCrunch.js

Replace /path/to/your/scrapeTechCrunch.js with the actual path to your script file.

🤝 Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes. Ensure your code follows the project's coding standards and includes appropriate tests.

🙏 Acknowledgments

  • The TechCrunch team for their continuous efforts in providing up-to-date tech news.