
Dav5T/ETL-Crowdfunding


ETL (Extract, Transform, and Load) Crowdfunding Mini Project

Background

This project uses 2 sets of crowdfunding data. First, we extract the data and import it into Pandas DataFrames. Next, we process and organize the data into multiple tables; the last part of this step uses regular expressions, with the option of using Python dictionary methods instead of regex. Once the tables are complete, we export them to csv files and load them into an SQL database.
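The two transform options mentioned above can be sketched side by side. This is a minimal illustration, not the project's actual code: the raw string layout, the field names (`contact_id`, `name`, `email`), and both helper functions are assumptions made for the example.

```python
import json
import re

# Hypothetical raw cell value; the real layout of the contacts data is an assumption.
raw = '{"contact_id": 1001, "name": "Ada Lovelace", "email": "ada.lovelace@example.com"}'


def parse_with_regex(text):
    """Pull each field out of the raw string with regular expressions."""
    contact_id = re.search(r'"contact_id":\s*(\d+)', text).group(1)
    name = re.search(r'"name":\s*"([^"]+)"', text).group(1)
    email = re.search(r'"email":\s*"([^"]+)"', text).group(1)
    first_name, last_name = name.split(" ", 1)
    return {
        "contact_id": int(contact_id),
        "first_name": first_name,
        "last_name": last_name,
        "email": email,
    }


def parse_with_dict(text):
    """Convert the raw string to a Python dictionary and read the fields by key."""
    record = json.loads(text)
    first_name, last_name = record["name"].split(" ", 1)
    return {
        "contact_id": record["contact_id"],
        "first_name": first_name,
        "last_name": last_name,
        "email": record["email"],
    }


# Both routes should produce the same row.
print(parse_with_regex(raw) == parse_with_dict(raw))
```

The dictionary route is usually shorter when every row is well-formed; the regex route can still pick out fields from rows that do not parse cleanly as a whole.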

Data Modeling:
Here is the entity relationship diagram of all the tables:
[image: entity relationship diagram]
For further detail, see https://app.quickdatabasediagrams.com/#/d/MDz0ma

Extract Data

Start by running ETL_Mini_project_Starter_Code.
You will import the 2 Excel files from the DataSource folder into Pandas DataFrames:
 1. crowdfunding.xlsx
 2. contacts.xls
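
As a rough sketch of the extract step, the two files listed above could be read as follows. The helper names `source_paths` and `extract` are hypothetical, and the folder location relative to the notebook is an assumption. Note that pandas relies on the openpyxl package to read .xlsx files and on xlrd to read .xls files.

```python
from pathlib import Path

import pandas as pd

# File names as listed above; the folder location is an assumption.
SOURCE_FILES = ["crowdfunding.xlsx", "contacts.xls"]


def source_paths(folder="DataSource"):
    """Build the full path of each source workbook."""
    return [Path(folder) / name for name in SOURCE_FILES]


def extract(folder="DataSource"):
    """Read each workbook into a DataFrame, keyed by file name without extension."""
    return {path.stem: pd.read_excel(path) for path in source_paths(folder)}
```

Calling `extract()` returns a dictionary, so `extract()["crowdfunding"].head()` previews the first rows of the crowdfunding data.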

Transform Data

After processing and transforming the data, export all the tables as csv files into the Resources folder.
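
A minimal sketch of the export step, assuming the finished tables are held in a dictionary that maps a table name to its DataFrame. The `export_tables` helper and the demo column names are hypothetical.

```python
from pathlib import Path

import pandas as pd


def export_tables(tables, folder="Resources"):
    """Write each DataFrame to <folder>/<name>.csv without the index column."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, frame in tables.items():
        path = out_dir / f"{name}.csv"
        # index=False keeps the pandas row index out of the file.
        frame.to_csv(path, index=False)
        written.append(path)
    return written
```

Leaving out the index matters for the load step: an extra unnamed column in the csv would not line up with the columns of the database table.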

Load Data

  1. First, create a database.
      - Right click on PostgreSQL
      - Click on Create --> Database
      - In the Database field, type crowdfunding_db
      - In the Owner field, keep postgres
      - Leave the Comment field blank
      - Click on Save

  2. Create the tables.
      - Right click on the database, crowdfunding_db
      - Click on Query Tool
      - Open crowdfunding_db_schema.sql. Copy, paste, and run each table creation statement.
      - After creating each table, import the matching csv file:
          * contacts --> contacts.csv
          * category --> category.csv
          * subcategory --> subcategory.csv
          * campaign --> campaign.csv
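
As a command-line alternative to the import dialog, the same table-to-file mapping can be turned into psql `\copy` commands. This sketch only builds the command text; it does not connect to a database. It assumes the csv files sit in a Resources folder relative to where psql is started and that each file has a header row. The `copy_commands` helper is hypothetical.

```python
# Table-to-file mapping taken from the list above, in the same order.
IMPORTS = [
    ("contacts", "contacts.csv"),
    ("category", "category.csv"),
    ("subcategory", "subcategory.csv"),
    ("campaign", "campaign.csv"),
]


def copy_commands(folder="Resources"):
    """Build one psql \\copy command per table for csv files with a header row."""
    return [
        f"\\copy {table} FROM '{folder}/{filename}' WITH (FORMAT csv, HEADER true)"
        for table, filename in IMPORTS
    ]


for command in copy_commands():
    print(command)
```

Each printed line can be pasted into a psql session connected to crowdfunding_db once the tables exist.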
