
[WIP] Initial draft for Green Software Certification Lab #82

Draft · wants to merge 11 commits into `dev`
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "GSCL/kepler"]
path = GSCL/kepler
url = https://github.com/sustainable-computing-io/kepler
77 changes: 77 additions & 0 deletions GSCL/Example.md
@@ -0,0 +1,77 @@
# Example Specification: Measuring Carbon Efficiency of PostgreSQL on AWS m5.large

## Objective
To benchmark the carbon efficiency of PostgreSQL running on an AWS m5.large instance by measuring its energy consumption during a typical database workload.

## Test Environment

### 1. Hardware Specifications
- **Instance Type:** AWS m5.large
- **vCPU:** 2 vCPUs (Intel Xeon Platinum 8175M)
- **Memory:** 8 GiB RAM
- **Storage:** 100 GiB EBS General Purpose SSD (gp3)
- **Network:** Up to 10 Gbps network bandwidth
- **Operating System:** Ubuntu 22.04 LTS

### 2. Software Configuration
- **PostgreSQL Version:** 14.4
- **Database Configuration:**
- **Max Connections:** 100
- **Shared Buffers:** 2 GiB
- **Work Mem:** 16 MB
- **Maintenance Work Mem:** 64 MB
- **Workload Tool:** `pgbench` (the PostgreSQL benchmarking tool; an invocation sketch follows this list)
- **pgbench Parameters:**
- **Scale Factor:** 100 (creates a 1 GB database)
- **Number of Clients:** 10
- **Number of Transactions:** 100,000
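
For concreteness, a minimal sketch of driving `pgbench` with the parameters above from Python. The database name `gscl_bench` and the assumption that `pgbench` is on the `PATH` are illustrative, not part of this specification:

```python
import subprocess

DB_NAME = "gscl_bench"  # hypothetical database name, for illustration only

# Initialize the pgbench tables at scale factor 100.
subprocess.run(["pgbench", "-i", "-s", "100", DB_NAME], check=True)

# 10 clients x 10,000 transactions per client = 100,000 transactions in total.
result = subprocess.run(
    ["pgbench", "-c", "10", "-t", "10000", DB_NAME],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # pgbench reports TPS and latency statistics here
```

Note that `pgbench -t` counts transactions per client, so 10,000 per client across 10 clients matches the 100,000-transaction total specified above.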

## Benchmark Workload Definition

### 1. Unit of Software Function (USF)
- **Definition:** Execution of 100,000 transactions by 10 concurrent clients on the PostgreSQL instance. (Open question: should the workload use a TPC-C or TPC-H dataset rather than the default pgbench schema?)

### 2. Execution Frequency
- The USF will be executed 10 times to ensure consistency and account for any variability in performance.

## Measurement Metrics

### 1. Energy Consumption (kWh)
- **Tool:** Use AWS CloudWatch and Kepler to measure the energy consumption of the m5.large instance during the benchmark.
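
As one possible approach (a sketch, not part of the spec): if Kepler's exporter is scraped by a Prometheus server, its cumulative energy counters can be read over the Prometheus HTTP API. The Prometheus address and the exact metric name are assumptions here and should be verified against the deployed Kepler version:

```python
import requests

PROMETHEUS_URL = "http://localhost:9090"  # assumed Prometheus instance scraping Kepler

def query_total_joules(metric: str = "kepler_container_joules_total") -> float:
    """Sum a cumulative Kepler energy counter (joules) across all series.

    The metric name is an assumption; confirm it against your Kepler version's docs.
    """
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": f"sum({metric})"},
        timeout=10,
    )
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0
```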

### 2. Carbon Emission (CO2e)
- **Calculation:** Multiply the measured energy consumption (in kWh) by the region-specific carbon intensity factor (grams of CO2e per kWh) for the AWS data center region where the instance is located.
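
The calculation itself is simple arithmetic; a sketch, using an illustrative (not authoritative) carbon intensity value:

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ, for converting Kepler's joule counters

def co2e_grams(energy_joules: float, grid_intensity_gco2e_per_kwh: float) -> float:
    """CO2e in grams: energy converted to kWh, times the regional carbon intensity."""
    return (energy_joules / JOULES_PER_KWH) * grid_intensity_gco2e_per_kwh

# Example: 180,000 J (= 0.05 kWh) at an assumed intensity of 400 gCO2e/kWh.
print(co2e_grams(180_000, 400.0))  # -> 20.0 grams CO2e
```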

## Test Methodology

### 1. Setup Instructions
1. Launch an AWS m5.large instance in the desired region with Ubuntu 22.04 LTS.
2. Install PostgreSQL 14.4 using the official PostgreSQL repository.
3. Configure PostgreSQL with the parameters specified above.
4. Install `pgbench` on the instance.
5. Install `kepler` on the instance.
6. Create a test database and initialize `pgbench` with a scale factor of 100.

### 2. Execution Procedure
1. Start monitoring energy consumption using Kepler and AWS CloudWatch.
2. Run `pgbench` with 10 clients executing 100,000 transactions in total.
3. Repeat the benchmark 10 times, recording the energy consumption and transaction performance for each run (a loop sketch follows this list).
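
A minimal sketch of the repeat-and-record loop, reusing `DB_NAME` and `query_total_joules()` from the illustrative sketches above:

```python
import subprocess

# DB_NAME and query_total_joules() come from the earlier sketches (illustrative).
energy_per_run_joules = []
for i in range(10):
    before = query_total_joules()
    subprocess.run(["pgbench", "-c", "10", "-t", "10000", DB_NAME], check=True)
    after = query_total_joules()
    energy_per_run_joules.append(after - before)
    print(f"run {i + 1}: {energy_per_run_joules[-1]:.0f} J")
```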

### 3. Result Collection
- Collect energy consumption data from Kepler and the CloudWatch logs.
- Calculate the average energy consumed per USF across the 10 runs.
- Calculate the total CO2e emissions from the measured energy consumption and the regional carbon intensity factor (an aggregation sketch follows this list).
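
A sketch of the aggregation step, reusing `energy_per_run_joules` and `co2e_grams()` from the sketches above and an assumed carbon intensity:

```python
from statistics import mean, stdev

# energy_per_run_joules and co2e_grams() come from the earlier sketches.
avg_joules_per_usf = mean(energy_per_run_joules)
spread = stdev(energy_per_run_joules)

GRID_INTENSITY = 400.0  # assumed gCO2e/kWh; substitute the region-specific factor
avg_co2e = co2e_grams(avg_joules_per_usf, GRID_INTENSITY)
total_co2e = co2e_grams(sum(energy_per_run_joules), GRID_INTENSITY)

print(f"avg energy per USF: {avg_joules_per_usf:.0f} J (stdev {spread:.0f} J)")
print(f"avg CO2e per USF: {avg_co2e:.2f} g; total over 10 runs: {total_co2e:.2f} g")
```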

## Certification and Reporting

### 1. SCER (Software Carbon Efficiency Rating)
- Open question: the rating formula is still to be defined. The Initial Draft sketches SCER as measured CO2e per USF, normalized for hardware performance variation.

### 2. Transparency and Reproducibility
- Publish the benchmark methodology, configuration details, and results openly to allow third-party verification and reproducibility.

## Discussion Points
- Different stages of the certification process: Disclosure, Categorization, Benchmarking (this document), Rating.
- Feasibility of applying this benchmarking methodology across different cloud environments and regions.
- How to incentivize other database management systems to adopt a similar standardized benchmarking process.
- Potential challenges in maintaining the consistency of energy measurements across various AWS regions.
34 changes: 34 additions & 0 deletions GSCL/Initial Draft.md
@@ -0,0 +1,34 @@
## Objective
To establish a standardized methodology for measuring and comparing the carbon efficiency of software applications, ensuring consistent, reproducible results across different platforms and environments.

## Components

### Hardware Specifications
+ Define a baseline hardware setup that includes processor type, memory capacity, storage type, and network configuration. This baseline should represent mid-tier, widely accessible hardware to ensure broad applicability.
+ For cloud-based testing, specify a standard cloud instance type (e.g., AWS m5.large, Azure D4s_v3) with fixed resource allocation.

### Software Configuration
+ Operating System: Use a standardized, widely supported operating system (e.g., Ubuntu 22.04 LTS) with minimal additional services running to reduce background noise in energy consumption measurements.
+ Middleware and Dependencies: Specify versions and configurations for any middleware or software dependencies, ensuring consistency across all test environments.

### Benchmark Workload Definition

+ Unit of Software Function (USF): Define the core functionality to be measured (e.g., number of API requests processed, number of data sets analyzed).
+ Execution Frequency: Specify the number of times the USF should be executed during the benchmark to account for variability in short-term performance.

### Measurement Metrics

+ SCI (Software Carbon Intensity): which tool should be used to compute it? (Open question; the SCI formula is sketched after this list.)
+ Energy Consumption (kWh): Measure the total energy consumed during the benchmark using precise energy monitoring tools (e.g., Kepler, or direct power measurement at the hardware level).
+ Carbon Emission (CO2e): Calculate the carbon emissions based on the energy consumption data, using region-specific carbon intensity factors for electricity (e.g., grams of CO2e per kWh).
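
For reference, and hedged as my reading of the Green Software Foundation's SCI specification (which the SCI bullet above presumably refers to), the score is defined per functional unit as:

$$
\mathrm{SCI} = \frac{(E \times I) + M}{R}
$$

where $E$ is the energy consumed (kWh), $I$ the location-based carbon intensity (gCO2e/kWh), $M$ the embodied emissions, and $R$ the functional unit (here, the USF). Which tool should compute this remains the open question noted above.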

## Test Methodology

### Setup Instructions
+ Provide detailed, step-by-step instructions for setting up the software and test environment, including installation scripts and configuration files.

### Execution Procedure
+ Outline a sequential list of actions to perform during the benchmark, ensuring that each step is repeatable and documented.

### Result Collection
+ Use automated logging tools to collect performance data (e.g., energy consumption, processing time). Ensure data integrity by validating logs against expected outputs.

## Certification and Reporting

+ SCER (Software Carbon Efficiency Rating): Assign a carbon efficiency rating based on the measured CO2e per USF, normalized to account for any variations in hardware performance.
+ Certification Levels: Define certification levels (e.g., Platinum, Gold, Silver) based on SCER thresholds, providing a clear and intuitive indication of software carbon efficiency (a level-mapping sketch follows this list).
+ Transparency and Reproducibility: All benchmark results, methodologies, and environmental configurations should be published openly to allow independent verification and reproducibility of results by third parties.
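
To make the level mapping concrete, a minimal sketch of the assignment logic; the threshold values are entirely hypothetical placeholders, since SCER thresholds have not yet been defined:

```python
def certification_level(scer_gco2e_per_usf: float) -> str:
    """Map a SCER value (gCO2e per USF, lower is better) to a certification level.

    The thresholds below are hypothetical placeholders, not proposed values.
    """
    if scer_gco2e_per_usf < 1.0:
        return "Platinum"
    if scer_gco2e_per_usf < 5.0:
        return "Gold"
    if scer_gco2e_per_usf < 20.0:
        return "Silver"
    return "Uncertified"

print(certification_level(3.2))  # -> "Gold"
```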
7 changes: 7 additions & 0 deletions GSCL/Questions.md
@@ -0,0 +1,7 @@
1. Should we define a universal hardware baseline, or allow for flexibility based on regional or organizational resources?

2. How can we incentivize major LLM providers to contribute their internal workflows to an open-source standard?

3. What conformance programs or validation processes should be put in place to maintain the integrity of the certification?

4. What role should community-driven initiatives, like crowd-sourced testing, play in the GSCL ecosystem?
1 change: 1 addition & 0 deletions GSCL/kepler
Submodule kepler added at 73f543