Ramil's Python Script Interacting with SQL Database

Overview

This project demonstrates how to interact with a SQL database using Python, involving a complex SQL query that includes joins, aggregation, and sorting. The repository covers the end-to-end process, from downloading CSV files, loading them into a database, and executing queries, all while following best practices with CI/CD integration.

Project Features

Complex SQL Query:
- The project contains a SQL query that joins multiple tables, performs aggregations, and sorts the results based on the specified criteria.
Python Integration:
- Python scripts are used to automate downloading CSV data from remote sources, load them into a SQL database, and execute SQL queries.
CI/CD Pipeline:
- The project includes continuous integration and deployment (CI/CD) through GitHub Actions, ensuring the project is automatically tested upon each commit.

SQL Query

The SQL query in this project involves:

Joins between tables containing football club and player data.
Aggregation to calculate statistics such as the average number of goals per club.
Sorting to rank clubs by total goals scored.

Example Query:

Select club.name  as club_name, 
SUM(player.market_value_in_eur) as total_market_value,
COUNT(*) as included_players

from ids706_data_engineering.default.rm564_football_clubs_players player
JOIN ids706_data_engineering.default.rm564_football_clubs club ON player.current_club_id = club.club_id
GROUP BY club_name
ORDER BY total_market_value DESC
limit 20 ;

Expected Results:

Club Name: The name of the football club.
Total Market Value: Total market value for each club.
Included Players: The number of players in each club.

Explanation:

This query joins the rm564_football_clubs and rm564_football_clubs_players tables on the club_id, aggregates player statistics for each club calculates the total market value and number of players included. The dataset sorted by total market value for the each club.

Getting Started

Prerequisites

Python 3.12
Databricks Connection (or alternative cloud SQL database)
GitHub Actions for CI/CD

Installation

Clone the repository:

git clone https://github.com/nogibjj/Ramil-Complex-SQL-Query-MySQL-Database.git
cd Ramil-Complex-SQL-Query-MySQL-Database

Install dependencies:

make install

Set up environment variables:

Create a .env file in the root directory with the following variables:

SERVER_HOSTNAME=your_databricks_server_hostname
HTTP_PATH=your_http_path
ACCESS_TOKEN=your_access_token

Running the Script

Extract CSV files and load data into the database:
```
python main.py
```
Execute queries:
- You can run specific SQL queries using the execute_query or fetchall_result functions.

Running Tests

To run the tests:

make test_file

To run the format:

make format

To run the lint:

make lint

CI/CD Pipeline

The repository uses GitHub Actions for continuous integration and deployment. Every commit triggers the CI pipeline, which:

Runs automated tests on the Python codebase.
Ensures the SQL query and database interactions work as expected.

Running the Pipeline Locally

You can use GitHub Actions for CI, and Docker to replicate the CI environment locally if desired.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.devcontainer		.devcontainer
.github/workflows		.github/workflows
__pycache__		__pycache__
data		data
sql		sql
src		src
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
main.py		main.py
pytest.ini		pytest.ini
requirements.txt		requirements.txt
test_lib.py		test_lib.py
test_main.py		test_main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Ramil's Python Script Interacting with SQL Database

Overview

Project Features

SQL Query

Expected Results:

Explanation:

Getting Started

Prerequisites

Installation

Running the Script

Running Tests

CI/CD Pipeline

Running the Pipeline Locally

About

Releases

Packages

Languages

Ramil-cyber/Ramil-Complex-SQL-Query-MySQL-Database

Folders and files

Latest commit

History

Repository files navigation

Ramil's Python Script Interacting with SQL Database

Overview

Project Features

SQL Query

Expected Results:

Explanation:

Getting Started

Prerequisites

Installation

Running the Script

Running Tests

CI/CD Pipeline

Running the Pipeline Locally

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages