Python Web Scraping Pipeline Orchestrated With Airflow

This project builds a data pipeline that scrapes podcast data and loads it into a PostgreSQL database managed by Google Cloud SQL. The pipeline is orchestrated with Airflow and also uploads the audio file of each podcast episode to a Google Cloud Storage bucket.
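The flow described above can be sketched in plain Python. This is only an illustrative outline, not the project's actual code: it assumes the podcast data comes from an RSS-style feed with `<item>` and `<enclosure>` elements, and the names `Episode`, `parse_feed`, `audio_object_name`, `run_pipeline`, `save_rows`, and `upload_audio` are all hypothetical. In the real pipeline each step would presumably be an Airflow task, with the database write and the bucket upload handled by the PostgreSQL and Cloud Storage clients; here they are injected as callables so the sketch runs without any cloud access.

```python
import posixpath
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Iterable
from urllib.parse import urlparse


@dataclass(frozen=True)
class Episode:
    """Hypothetical record for one scraped podcast episode."""
    title: str
    published: str
    audio_url: str


def parse_feed(feed_xml: str) -> list[Episode]:
    """Extract episodes from an RSS-style feed (assumed input format)."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # skip items that carry no audio file
        episodes.append(
            Episode(
                title=(item.findtext("title") or "").strip(),
                published=(item.findtext("pubDate") or "").strip(),
                audio_url=enclosure.get("url", ""),
            )
        )
    return episodes


def audio_object_name(episode: Episode) -> str:
    """Derive a bucket object name from the audio URL's file name."""
    filename = posixpath.basename(urlparse(episode.audio_url).path)
    return f"audio/{filename}"


def run_pipeline(
    feed_xml: str,
    save_rows: Callable[[Iterable[Episode]], None],
    upload_audio: Callable[[str, str], None],
) -> list[Episode]:
    """Scrape, store metadata, then upload audio, in that order."""
    episodes = parse_feed(feed_xml)
    save_rows(episodes)  # stands in for the PostgreSQL insert
    for episode in episodes:
        # stands in for the Cloud Storage upload
        upload_audio(episode.audio_url, audio_object_name(episode))
    return episodes
```

Passing the storage and database steps in as callables keeps the scraping logic testable on its own; in an Airflow DAG the same three steps would typically become separate tasks chained in this order.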

GCP resources are provisioned using Terraform.