Shale Gas Well Productions Prediction

The Korea National Oil Corporation was interested in purchasing shale gas wells in the United States and wanted to predict their production in order to select the wells that maximize profit.

A combination of LightGBM regression and exponential smoothing is used to predict production, and 0-1 integer programming with Gurobi is used to select the wells that maximize profit. Performance is evaluated with sMAPE (symmetric Mean Absolute Percentage Error). Our team achieved an sMAPE of 25.54%, among the best results; the top result was 19.49%.

Problem Description

Data

The train and exam datasets are confidential, so they are not included in this repository.

trainSet.csv - Data of 280 shale gas wells for training models
examSet.csv - Data of 44 shale gas wells for prediction

Predicting Gas Production

The task is to predict the monthly average gas production of each of the 44 shale gas wells in examSet.csv over the next 6 months.

Performance is evaluated with sMAPE (symmetric Mean Absolute Percentage Error), which is defined in terms of the following quantities:

F_i - predicted monthly average gas production of i^th gas well over the next 6 months
A_i - actual monthly average gas production of i^th gas well over the next 6 months
n - number of gas wells (44 in this problem)
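
As an illustration, one widely used form of sMAPE is (100% / n) * sum over i of |F_i - A_i| / ((|A_i| + |F_i|) / 2). The sketch below assumes that form; the exact variant used for scoring is not stated here, and the function name `smape` and the zero-denominator convention are choices made for this example.

```python
def smape(forecast, actual):
    """Return sMAPE in percent for paired forecasts F_i and actuals A_i.

    Assumes the variant whose denominator is (|A_i| + |F_i|) / 2.
    """
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    if len(forecast) == 0:
        raise ValueError("at least one well is required")

    total = 0.0
    for f, a in zip(forecast, actual):
        denom = (abs(a) + abs(f)) / 2
        # Convention chosen for this sketch: a well with F_i = A_i = 0
        # contributes zero error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100.0 * total / len(forecast)
```

With this variant the score ranges from 0% (perfect prediction) to 200%.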

Investment Decision

A budget of $15,000,000 is allocated. After the monthly average gas production of each well has been predicted, the task is to select the wells among the 44 that maximize profit. The profit is defined in terms of the following quantities:

A_i - actual monthly average gas production of i^th gas well over the next 6 months
P_i - price of i^th gas well
P_s - shale gas price ($5 per 1 Mcf)
C_i - monthly operation cost of i^th gas well
X_i - decision variable to purchase i^th gas well (if purchasing i^th gas well: X_i = 1, else: X_i = 0)

Solution Approach

The wells are divided into new wells and old wells. New wells do not have data on gas production, non-gas production and hours operated per month. This data is available for old wells.

Therefore, regression is used to predict the monthly average production of new wells over their first 6 months, and exponential smoothing is used to forecast the monthly average production of old wells over the next 6 months from their production history.

New Wells

After EDA (Exploratory Data Analysis) and feature engineering, the following tree-based ensemble regressors are tested:

BaggingRegressor
- n_estimators=50
RandomForestRegressor
- n_estimators=50
XGBRegressor
- max_depth=5
- objective='reg:squarederror'
LGBMRegressor
VotingRegressor
- estimators=[bagging, random_forest, xgb, lgbm]
- n_jobs=-1

Data split: train_test_split(test_size=0.2, random_state=42)

LGBMRegressor performs best, with the lowest sMAPE.

LGBMRegressor hyperparameters after tuning with Ray Tune grid search:

boosting_type='gbdt'
learning_rate=0.1
max_bin=250
max_depth=-1
min_data_in_leaf=20
num_iterations=100
num_leaves=20

Training is run on a GPU.
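
The tuned values listed above can be collected into a single parameter dictionary and passed to the regressor. This is a minimal sketch, not the repository's code: the names `TUNED_PARAMS` and `build_model` are hypothetical, and no GPU setting is included because the text does not say how the GPU was enabled.

```python
# Tuned LightGBM settings as listed in this README.
TUNED_PARAMS = {
    "boosting_type": "gbdt",
    "learning_rate": 0.1,
    "max_bin": 250,
    "max_depth": -1,          # -1 means no depth limit
    "min_data_in_leaf": 20,
    "num_iterations": 100,
    "num_leaves": 20,
}


def build_model():
    """Return an unfitted LGBMRegressor configured with TUNED_PARAMS."""
    # Imported here so the parameters can be inspected without
    # lightgbm being installed.
    from lightgbm import LGBMRegressor

    # LGBMRegressor forwards extra keyword arguments such as
    # min_data_in_leaf and num_iterations to the underlying booster.
    return LGBMRegressor(**TUNED_PARAMS)
```

The returned model can then be fitted on the training part of the split and scored with sMAPE on the held-out 20%.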

Old Wells

The following exponential smoothing models are tested:

SimpleExpSmoothing
- smoothing_level=0.2
- smoothing_level=0.6
- optimized smoothing level
Holt
- Additive model
- Multiplicative model
- Damped additive model
- Damped multiplicative model
ExponentialSmoothing
- use_boxcox=True
  - Additive model
  - Damped additive model

For each well, the model with the minimum SSE (Sum of Squared Errors) is selected, so different wells may be forecast with different models.
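
The per-well selection by SSE can be sketched as follows. To keep the example dependency-free, a hand-written simple exponential smoothing stands in for the statsmodels classes named above, only the two fixed smoothing levels (0.2 and 0.6) are compared, and the Holt and Box-Cox variants are left out. The function names and the choice to initialise the level with the first observation are assumptions of this sketch.

```python
def ses_fit(series, alpha):
    """Fit simple exponential smoothing; return (sse, final_level)."""
    level = series[0]              # initialise the level with the first value
    sse = 0.0
    for y in series[1:]:
        error = y - level          # one-step-ahead forecast is the current level
        sse += error * error
        level = alpha * y + (1 - alpha) * level
    return sse, level


def best_ses_forecast(series, alphas=(0.2, 0.6), horizon=6):
    """Pick the smoothing level with the lowest SSE and forecast `horizon` months."""
    if len(series) < 2:
        raise ValueError("need at least two observations")

    best_alpha, best_sse, best_level = None, float("inf"), None
    for alpha in alphas:
        sse, level = ses_fit(series, alpha)
        if sse < best_sse:         # ties keep the earlier candidate
            best_alpha, best_sse, best_level = alpha, sse, level

    # Simple exponential smoothing produces a flat forecast at the final level.
    return best_alpha, [best_level] * horizon
```

Calling `best_ses_forecast` once per old well gives each well its own winning model, and the mean of the 6 forecast values is the predicted monthly average.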

Investment Decision

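A 0-1 integer programming model, solved with Gurobi, selects the wells. Each binary variable X_i indicates whether the i^th well is purchased, the objective maximizes total profit, and the selection must stay within the $15,000,000 budget.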
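
As a rough illustration of such a model, the sketch below selects wells under two assumptions that are not confirmed by the text: the profit of the i^th well is 6 * (P_s * A_i - C_i) - P_i, and the budget limits the total purchase price. Exhaustive enumeration stands in for Gurobi so the example runs without a solver; it is only practical for a handful of wells, and the function name `select_wells` is hypothetical.

```python
from itertools import product


def select_wells(production, price, op_cost, budget, gas_price=5.0, months=6):
    """Return (x, total_profit) for the best 0-1 purchase decision.

    production: predicted monthly average production per well (Mcf)
    price:      purchase price P_i per well
    op_cost:    monthly operation cost C_i per well
    """
    n = len(production)
    # Assumed per-well profit: revenue minus operating cost over the
    # horizon, minus the purchase price.
    profit = [
        months * (gas_price * production[i] - op_cost[i]) - price[i]
        for i in range(n)
    ]

    best_x, best_profit = (0,) * n, 0.0   # buying nothing is always feasible
    for x in product((0, 1), repeat=n):   # every 0-1 assignment of X_i
        spend = sum(price[i] * x[i] for i in range(n))
        if spend > budget:                # budget constraint
            continue
        total = sum(profit[i] * x[i] for i in range(n))
        if total > best_profit:
            best_x, best_profit = x, total
    return list(best_x), best_profit
```

For the 44 wells of the actual problem, enumeration over 2^44 assignments is infeasible, which is why an integer programming solver is the appropriate tool there.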
