diff --git a/pricing_competing_products/price_optimization_with_competing_products.ipynb b/pricing_competing_products/price_optimization_with_competing_products.ipynb index d09d7ab..3b593aa 100644 --- a/pricing_competing_products/price_optimization_with_competing_products.ipynb +++ b/pricing_competing_products/price_optimization_with_competing_products.ipynb @@ -1,1725 +1,1293 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "7f7d69f9", - "metadata": { - "id": "7f7d69f9" - }, - "source": [ - "Copyright © 2024 Gurobi Optimization, LLC\n", - "\n", - "# Price Optimization with Competing Products\n", - "Finding the delicate balance between price and demand is a difficult problem to solve and one that is prevalent in a number of huge industries like retail, e-commerce, ticketing, and hospitality.\n", - "\n", - "It just so happens that this is a problem that data scientists have recently been getting better and better at addressing. Still, a key piece is missing: How can I make the right pricing decision given all the other constraints and business rules that exist for this problem? That's where optimization comes in. \n", - "\n", - "In this scenario, we have a product that has several categories, but we have a limited amount of \"space\" for our products. This could be shelf or warehouse space in retail, seats for events, or for airline ticketing, or rooms at a hotel. \n", - "\n", - "This problem considers two similar products that are offered where it's our job to use the data available with some optimization know-how to determine the optimal mix of products to offer that maximize revenue while also adhering to a few other business rules. First, we'll create a predictive model to forecast sales based on the prices of each product. Then we'll build an optimization model to find this optimal mix. 
Finally, we’ll also use the Gurobi-sponsored open-source package Gurobi Machine Learning to seamlessly combine the features of a machine learning model with a decision of an optimization model. \n", - "\n", - "Let’s get started!" - ] - }, - { - "cell_type": "markdown", - "id": "a5846a14", - "metadata": { - "id": "vGL8Rv-uRZ7j" - }, - "source": [ - "## Load required packages\n", - "If you have a Gurobi license you can skip the installation of `gurobipy`, but always make sure you have the [latest version](https://www.gurobi.com/downloads/gurobi-software) available. " - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "1c3276f0", - "metadata": {}, - "outputs": [ + "cells": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: gurobipy in /Users/yurchisin/opt/anaconda3/envs/gurobi_ml/lib/python3.11/site-packages (11.0.0)\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "%pip install gurobipy" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "3024295a", - "metadata": { - "id": "3024295a" - }, - "outputs": [], - "source": [ - "import gurobipy as gp\n", - "from gurobipy import GRB\n", - "\n", - "import numpy as np\n", - "import pandas as pd\n", - "import seaborn as sns\n", - "import matplotlib.pyplot as plt\n", - "import warnings\n", - "from sklearn.model_selection import train_test_split\n", - "from sklearn import tree\n", - "\n", - "warnings.filterwarnings(\"ignore\")" - ] - }, - { - "cell_type": "markdown", - "id": "d8c28cfd", - "metadata": { - "id": "d8c28cfd" - }, - "source": [ - "## Start with some data analysis\n", - "\n", - "This data contains prices and sales for two of our competing products and was generated using another script, which can be found [here](). Let's load the data and take a quick look. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 169, - "id": "eea57a4c", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 707 + "cell_type": "markdown", + "id": "7f7d69f9", + "metadata": { + "id": "7f7d69f9" + }, + "source": [ + "Copyright © 2024 Gurobi Optimization, LLC\n", + "\n", + "# Price Optimization with Competing Products\n", + "Finding the delicate balance between price and demand is a difficult problem to solve and one that is prevalent in a number of huge industries like retail, e-commerce, ticketing, and hospitality.\n", + "\n", + "It just so happens that this is a problem that data scientists have recently been getting better and better at addressing. Still, a key piece is missing: How can I make the right pricing decision given all the other constraints and business rules that exist for this problem? That's where optimization comes in.\n", + "\n", + "In this scenario, we have a product that has several categories, but we have a limited amount of \"space\" for our products. This could be shelf or warehouse space in retail, seats for events, or for airline ticketing, or rooms at a hotel.\n", + "\n", + "This problem considers two similar products that are offered where it's our job to use the data available with some optimization know-how to determine the optimal mix of products to offer that maximize revenue while also adhering to a few other business rules. First, we'll create a predictive model to forecast sales based on the prices of each product. Then we'll build an optimization model to find this optimal mix. Finally, we’ll also use the Gurobi-sponsored open-source package Gurobi Machine Learning to seamlessly combine the features of a machine learning model with a decision of an optimization model.\n", + "\n", + "Let’s get started!" 
+ ] + }, + { + "cell_type": "markdown", + "id": "a5846a14", + "metadata": { + "id": "a5846a14" + }, + "source": [ + "## Load required packages\n", + "If you have a Gurobi license you can skip the installation of `gurobipy`, but always make sure you have the [latest version](https://www.gurobi.com/downloads/gurobi-software) available." + ] }, - "id": "eea57a4c", - "outputId": "7412df74-2d7e-478a-f7fc-5008c21077a3" - }, - "outputs": [ { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
       p[1]    p[2]   n[1]
0    356.12  197.67  108.0
1    358.05  189.68   66.0
2    340.79  260.35  130.0
3    353.76  133.53   55.0
4    341.37  229.80   91.0
..      ...     ...    ...
995  357.63  241.54   68.0
996  352.58  212.95   87.0
997  355.28  189.50   94.0
998  369.75  166.33   51.0
999  349.31  222.07  114.0
\n", - "

1000 rows × 3 columns

\n", - "
" + "cell_type": "code", + "execution_count": 1, + "id": "1c3276f0", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "1c3276f0", + "outputId": "8b9efd0a-df20-42e8-99b7-28eb2ae2b3e5" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Requirement already satisfied: gurobipy in /usr/local/lib/python3.10/dist-packages (11.0.0)\n" + ] + } ], - "text/plain": [ - " p[1] p[2] n[1]\n", - "0 356.12 197.67 108.0\n", - "1 358.05 189.68 66.0\n", - "2 340.79 260.35 130.0\n", - "3 353.76 133.53 55.0\n", - "4 341.37 229.80 91.0\n", - ".. ... ... ...\n", - "995 357.63 241.54 68.0\n", - "996 352.58 212.95 87.0\n", - "997 355.28 189.50 94.0\n", - "998 369.75 166.33 51.0\n", - "999 349.31 222.07 114.0\n", - "\n", - "[1000 rows x 3 columns]" + "source": [ + "%pip install gurobipy" ] - }, - "execution_count": 169, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df = pd.read_csv('https://raw.githubusercontent.com/Gurobi/modeling-examples/master/pricing_competing_products/price_value_data.csv')\n", - "df" - ] - }, - { - "cell_type": "markdown", - "id": "b65707ac", - "metadata": {}, - "source": [ - "### What's in the data?\n", - "The data contains three columns:\n", - "1. `p[1]` is the price (in dollars) of the first category (let's call it Category 1).\n", - "2. `p[2]` is the price (in dollars) of the second category (Category 2).\n", - "3. `n[1]` is the number of the items sold that are of Category 1. \n", - "\n", - "We don't see a column for `n[2]`, which would be the number of items sold that are Category 2. Here is where we make a **pretty big assumption** that we will sell all of the items. This makes our decision to be how to divvy up the limited space we have in order to maximize our revenue. \n", - "The data was created to have a couple of key characteristics.\n", - "1. 
As the price of Category 1 goes up, the number sold should decrease, so `p[1]` and `n[1]` have a negative correlation.\n", - "2. As the price of Category 2 goes up, the number sold of Category 1 should increase, so `p[2]` and `n[1]` have a positive correlation.\n", - "\n", - "The correlation plot of the columns of the data is below. " - ] - }, - { - "cell_type": "code", - "execution_count": 170, - "id": "4cb22fe7", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 451 }, - "id": "4cb22fe7", - "outputId": "b6e2ddb2-81ca-485c-e5cc-04b65b292ad9" - }, - "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Warning: environment still referenced so free is deferred (Continue to use WLS)\n", - "Warning: environment still referenced so free is deferred (Continue to use WLS)\n", - "Warning: environment still referenced so free is deferred (Continue to use WLS)\n" - ] + "cell_type": "code", + "execution_count": 2, + "id": "3024295a", + "metadata": { + "id": "3024295a" + }, + "outputs": [], + "source": [ + "import gurobipy as gp\n", + "from gurobipy import GRB\n", + "\n", + "import numpy as np\n", + "import pandas as pd\n", + "import seaborn as sns\n", + "import matplotlib.pyplot as plt\n", + "import warnings\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn import tree\n", + "\n", + "warnings.filterwarnings(\"ignore\")" + ] }, { - "data": { - "image/png": "", - "text/plain": [ - "
" + "cell_type": "markdown", + "id": "d8c28cfd", + "metadata": { + "id": "d8c28cfd" + }, + "source": [ + "## Start with some data analysis\n", + "\n", + "This data contains prices and sales for two of our competing products and was generated using another script, which can be found [here](). Let's load the data and take a quick look." ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(15, 5))\n", - "sns.heatmap(df[['p[1]','p[2]','n[1]']].corr(),annot=True, center=0,ax=axes)\n", - "\n", - "axes.set_title('Correlations')\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "a4aa96d4", - "metadata": {}, - "source": [ - "In this problem, we've assumed that the amount of space we have available for the products is 200 units. In retail, this could be the amount of warehouse space, or for ticketing this could represent the number of seats available." - ] - }, - { - "cell_type": "markdown", - "id": "6e9c6157", - "metadata": { - "id": "6e9c6157" - }, - "source": [ - "### Building regressors to predict sales\n", - "\n", - "The prices for each category item will be used to predict the number of Category 1 items sold. Here we build a regression model to form this relationship which will later be used as part of the optimization model. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "id": "a70d5f2c", - "metadata": { - "id": "a70d5f2c" - }, - "outputs": [], - "source": [ - "from sklearn.compose import make_column_transformer\n", - "from sklearn.linear_model import LinearRegression\n", - "from sklearn.pipeline import make_pipeline\n", - "from sklearn.metrics import r2_score\n", - "from sklearn.model_selection import train_test_split #importing scikit-learn's function for data splitting\n", - "from sklearn.ensemble import GradientBoostingRegressor #importing scikit-learn's gradient booster regressor function\n", - "from sklearn.model_selection import cross_validate #improting scikit-learn's cross validation function" - ] - }, - { - "cell_type": "code", - "execution_count": 171, - "id": "3095da93", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" }, - "id": "3095da93", - "outputId": "e6415f1b-34b8-4c25-ebf5-f4d1fa2223e1" - }, - "outputs": [], - "source": [ - "X = df[[\"p[1]\",\"p[2]\"]]\n", - "y = df[\"n[1]\"]\n", - "# Split the data for training and testing\n", - "X_train, X_test, y_train, y_test = train_test_split(\n", - " X, y, train_size=0.75, random_state=1\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "f8d713c2", - "metadata": {}, - "source": [ - "First we'll start with a linear regression model. " - ] - }, - { - "cell_type": "code", - "execution_count": 172, - "id": "33f4eaec", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 92 + { + "cell_type": "code", + "execution_count": 3, + "id": "eea57a4c", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 424 + }, + "id": "eea57a4c", + "outputId": "28a70699-d016-4f52-96e3-bc9badf6e0b0" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " p[1] p[2] n[1]\n", + "0 356.12 197.67 108.0\n", + "1 358.05 189.68 66.0\n", + "2 340.79 260.35 130.0\n", + "3 353.76 133.53 55.0\n", + "4 341.37 229.80 91.0\n", + ".. 
... ... ...\n", + "995 357.63 241.54 68.0\n", + "996 352.58 212.95 87.0\n", + "997 355.28 189.50 94.0\n", + "998 369.75 166.33 51.0\n", + "999 349.31 222.07 114.0\n", + "\n", + "[1000 rows x 3 columns]" + ], + "text/html": [ + "\n", + "
\n", + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
       p[1]    p[2]   n[1]
0    356.12  197.67  108.0
1    358.05  189.68   66.0
2    340.79  260.35  130.0
3    353.76  133.53   55.0
4    341.37  229.80   91.0
..      ...     ...    ...
995  357.63  241.54   68.0
996  352.58  212.95   87.0
997  355.28  189.50   94.0
998  369.75  166.33   51.0
999  349.31  222.07  114.0
\n", + "

1000 rows × 3 columns

\n", + "
\n", + "
\n" + ], + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "dataframe", + "variable_name": "df", + "summary": "{\n \"name\": \"df\",\n \"rows\": 1000,\n \"fields\": [\n {\n \"column\": \"p[1]\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 14.409836071934338,\n \"min\": 300.0,\n \"max\": 400.0,\n \"num_unique_values\": 911,\n \"samples\": [\n 350.69,\n 359.04,\n 351.26\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"p[2]\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 29.093624863457553,\n \"min\": 100.0,\n \"max\": 300.0,\n \"num_unique_values\": 947,\n \"samples\": [\n 204.43,\n 211.29,\n 175.08\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"n[1]\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 29.65004858986544,\n \"min\": 0.0,\n \"max\": 200.0,\n \"num_unique_values\": 148,\n \"samples\": [\n 70.0,\n 105.0,\n 48.0\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}" + } + }, + "metadata": {}, + "execution_count": 3 + } + ], + "source": [ + "df = pd.read_csv('https://raw.githubusercontent.com/Gurobi/modeling-examples/master/pricing_competing_products/price_value_data.csv')\n", + "df" + ] }, - "id": "33f4eaec", - "outputId": "d696a20d-da2a-4f48-8224-9501c4328615" - }, - "outputs": [ { - "data": { - "text/plain": [ - "(array([0.80665368, 0.81004987, 0.7941077 , 0.79780172, 0.79495142]),\n", - " array([0.77650521, 0.74227784, 0.82105606, 0.81102818, 0.82166193]))" + "cell_type": "markdown", + "id": "b65707ac", + "metadata": { + "id": "b65707ac" + }, + "source": [ + "### What's in the data?\n", + "The data contains three columns:\n", + "1. `p[1]` is the price (in dollars) of the first category (let's call it Category 1).\n", + "2. `p[2]` is the price (in dollars) of the second category (Category 2).\n", + "3. 
`n[1]` is the number of the items sold that are of Category 1.\n", + "\n", + "We don't see a column for `n[2]`, which would be the number of items sold that are Category 2. Here is where we make a **pretty big assumption** that we will sell all of the items. This makes our decision to be how to divvy up the limited space we have in order to maximize our revenue.\n", + "The data was created to have a couple of key characteristics.\n", + "1. As the price of Category 1 goes up, the number sold should decrease, so `p[1]` and `n[1]` have a negative correlation.\n", + "2. As the price of Category 2 goes up, the number sold of Category 1 should increase, so `p[2]` and `n[1]` have a positive correlation.\n", + "\n", + "The correlation plot of the columns of the data is below." ] - }, - "execution_count": 172, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "linear_regressor = make_pipeline(LinearRegression())\n", - "linear_regressor.fit(X_train, y_train)\n", - "linear_regression_validation = cross_validate(linear_regressor, X_train, y_train, cv=5, return_train_score=True, return_estimator=True)\n", - "\n", - "linear_regression_validation['train_score'],linear_regression_validation['test_score']" - ] - }, - { - "cell_type": "markdown", - "id": "cc9dd59b", - "metadata": {}, - "source": [ - "Let's try a gradient boosting model as well. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 173, - "id": "9b8eb814", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" }, - "id": "9b8eb814", - "outputId": "de999b5f-807f-4ff6-8356-bbdb9f88db51" - }, - "outputs": [ { - "data": { - "text/plain": [ - "(array([0.83316977, 0.83568563, 0.82395805, 0.82872996, 0.82289456]),\n", - " array([0.75951446, 0.75090466, 0.79206779, 0.79360144, 0.80517282]))" + "cell_type": "code", + "execution_count": 4, + "id": "4cb22fe7", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 468 + }, + "id": "4cb22fe7", + "outputId": "37183ae6-b780-4225-cc3d-bbaa2f4c7924" + }, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "
" + ], + "image/png": "\n" + }, + "metadata": {} + } + ], + "source": [ + "fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(15, 5))\n", + "sns.heatmap(df[['p[1]','p[2]','n[1]']].corr(),annot=True, center=0,ax=axes)\n", + "\n", + "axes.set_title('Correlations')\n", + "plt.show()" ] - }, - "execution_count": 173, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from sklearn.ensemble import GradientBoostingRegressor\n", - "xgb_regressor = make_pipeline(GradientBoostingRegressor(n_estimators=25)) \n", - "xgb_regressor.fit(X_train, y_train) \n", - "xgb_regressor_validation = cross_validate(xgb_regressor, X_train, y_train, cv=5, return_train_score=True, return_estimator=True)\n", - "\n", - "xgb_regressor_validation['train_score'], xgb_regressor_validation['test_score']\n" - ] - }, - { - "cell_type": "markdown", - "id": "mTquaGiJF2pO", - "metadata": { - "id": "mTquaGiJF2pO" - }, - "source": [ - "## Price optimization model with competing products\n", - "\n", - "Our problem is to:\n", - "1.\tDetermine the number of each category of product to make available given the overall restriction of what we can offer. \n", - "2.\tWe are also instructed to make sure there are a minimum number of each category made available as well as a minimum and maximum price for each category.\n", - "3.\tLastly, the product categories should be decreasing in price, meaning Category 1 should be the most expensive, and so on. Specifically, we must make sure there is at least a $50 gap between categories, but no more than $100. \n", - " \n", - "With the predictive part in place, it's time to build the optimization model. The model is formulated (i.e. the mathematical representation) for an unspecified number of categories, but the code will reflect that we have two categories of products in this problem. We start by setting some parameter values (not to be confused with ML hyperparameters) and initialize the optimization model. 
\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "id": "95b966b2", - "metadata": {}, - "source": [ - "### Initialize model and set input parameters\n", - "- $C$: Number of product categories\n", - "- $N$: Total amount of space available\n", - "- $\\lambda$: Price control parameter\n", - "\n", - "Here is the first mention of a price control parameter. It is fairly common in optimization modeling to add penalty terms to try and prevent undesirable outcomes. This is akin to using penalty terms in machine learning and applied statistics to prevent overfitting, with [Lasso](https://en.wikipedia.org/wiki/Lasso_(statistics)) and [Ridge](https://en.wikipedia.org/wiki/Ridge_regression) regression as a couple of common examples. " - ] - }, - { - "cell_type": "code", - "execution_count": 174, - "id": "67db9804", - "metadata": {}, - "outputs": [ + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "Set parameter Username\n", - "Set parameter WLSAccessID\n", - "Set parameter WLSSecret\n", - "Set parameter LicenseID to value 874302\n", - "WLS license 874302 - registered to Gurobi Optimization LLC\n" - ] - } - ], - "source": [ - "#### Initialize the model\n", - "m = gp.Model(\"price optimization\")\n", - "\n", - "products = [1,2] #### Category 1 and Category 2\n", - "N = 200 #### limit on available space\n", - "l = 0 #### price control, we'll start this at 0" - ] - }, - { - "cell_type": "markdown", - "id": "ac5a875b", - "metadata": {}, - "source": [ - "### Decision variables\n", - "- $p_c$: price per item in category $c = 1,2,\\dots, C$\n", - "- $n_c$: number of items allocated to category, predicted using features $p_c$, $c = 1,2,\\dots, C$" - ] - }, - { - "cell_type": "code", - "execution_count": 175, - "id": "bbe94a3c", - "metadata": {}, - "outputs": [], - "source": [ - "p = m.addVars(products, name=\"p\") #### price decision variables\n", - "n = m.addVars(products, name=\"n\") #### decision variable for number of items in each category" - ] - }, - { - 
"cell_type": "markdown", - "id": "721614e0", - "metadata": {}, - "source": [ - "### Constraints" - ] - }, - { - "cell_type": "markdown", - "id": "8403f941", - "metadata": {}, - "source": [ - "We need to have a minimum number of each category available. \n", - "\\begin{align*}\n", - "n_c \\ge l_{n_c}\n", - "\\end{align*}\n", - "\n", - "We also set lower and upper bounds on the prices.\n", - "\\begin{align*}\n", - "l_{n_c} \\le p_c \\le u_{p_c}\n", - "\\end{align*}" - ] - }, - { - "cell_type": "code", - "execution_count": 176, - "id": "edec4432", - "metadata": {}, - "outputs": [], - "source": [ - "min_items = {1:50,2:50}\n", - "price_bounds = {1:[300,400], 2:[100,300]}\n", - "m.addConstrs(n[c] >= min_items[c] for c in products) #### we could hardcode 50 instead of min_items, but this is more flexible\n", - "m.addConstr(p[1] == [300,400]) #### this is a shorthand way to code 300 <= p[1] <= 400 \n", - "m.addConstr(p[2] == [100,300]);" - ] - }, - { - "cell_type": "markdown", - "id": "3bef9e0d", - "metadata": {}, - "source": [ - "Another note: each of the above constraints can be addressed when defining the decision variables. Here is an example for the decision variable $n$." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2065d83", - "metadata": {}, - "outputs": [], - "source": [ - "#price_lb = {1:300, 2:100}\n", - "#price_ub = {1:400, 2:300}\n", - "#p = m.addVars(products, lb = price_lb, ub = price_ub, name=\"p\") #### each price is now bounded\n", - "#n = m.addVars(products, lb = min_items, name=\"n\") " - ] - }, - { - "cell_type": "markdown", - "id": "0578e271", - "metadata": {}, - "source": [ - "In general, the number of items allocated must equal the total available space. \n", - "\\begin{equation*}\n", - "n_1 + n_2 + \\dots + n_C = \\sum_{c}n_c = N \\\\\n", - "\\end{equation*}\n", - "Note that this, along with the constraint on the minimum number available means we don't have to specify an upper bound for each $n_c$." 
- ] - }, - { - "cell_type": "code", - "execution_count": 177, - "id": "140e739f", - "metadata": {}, - "outputs": [], - "source": [ - "m.addConstr(n.sum() == N); #### remember we set N = 200 earlier" - ] - }, - { - "cell_type": "markdown", - "id": "b17a1e78", - "metadata": {}, - "source": [ - "The last set of constraints are for price ordering. This requires the subsequent category to be cost between $50 and $100 less than the previous. \n", - "\\begin{equation*}\n", - "50 \\le p_c - p_{c+1} \\le 100\n", - "\\end{equation*}" - ] - }, - { - "cell_type": "code", - "execution_count": 178, - "id": "1428be22", - "metadata": {}, - "outputs": [], - "source": [ - "m.addConstr(p[1]-p[2] == [50,100]);" - ] - }, - { - "cell_type": "markdown", - "id": "51ee2204", - "metadata": {}, - "source": [ - "### Objective function" - ] - }, - { - "cell_type": "markdown", - "id": "6eb69e54", - "metadata": {}, - "source": [ - "We want to maximize total revenue with the portion of total revenue coming from category $c$ being $p_cn_c$. This makes the total revenue $\\sum_{c} p_c n_c$. That is the first part of the objective. Earlier a price control parameter was introduced which is the second part of the objective. The lambda parameter captures the trade-off between the revenue and price-control pieces. This term penalizes the model from setting too high of prices since doing so could lose sales. Our model assumed we'll sell all of the items so having this penalty term can make this assumption more realistic. For reference, [here is a good source](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4565407).\n", - "\n", - "This term will be defined as $λ (\\sum_{c} p_c^2)$ for this problem. 
So, the complete objective is:\n", - "\\begin{equation*}\n", - "\\textrm{maximize} \\sum_{c} p_c n_c - λ (\\sum_{c} p_c^2)\n", - "\\end{equation*}" - ] - }, - { - "cell_type": "code", - "execution_count": 179, - "id": "a2cfcabc", - "metadata": {}, - "outputs": [], - "source": [ - "revenue = gp.quicksum(p[c]*n[c] for c in products) #### you could also use the more simple p.prod(n)\n", - "penalty = l*(p[1]**2+p[1]**2) #### we used l as the lambda parameter earlier\n", - "m.setObjective(revenue - penalty, sense = GRB.MAXIMIZE)" - ] - }, - { - "cell_type": "markdown", - "id": "c707af60", - "metadata": {}, - "source": [ - "### Integrate the ML model\n" - ] - }, - { - "cell_type": "markdown", - "id": "38d41039", - "metadata": {}, - "source": [ - "Right now, if we were to run the optimization, the solution would be to set the price for Category 1 to $400, Category 2 to $300, and sell 150 and 50 of each item, respectively. That's because we have yet to add in the relationship between price and demand that was derived from the ML model. To integrate the machine learning model into the optimization model, we'll use the Gurobi Machine Learning package. The magic happens using `add_predictor_constr` function. " - ] - }, - { - "cell_type": "code", - "execution_count": 180, - "id": "b113eda2", - "metadata": {}, - "outputs": [ + "cell_type": "markdown", + "id": "a4aa96d4", + "metadata": { + "id": "a4aa96d4" + }, + "source": [ + "In this problem, we've assumed that the amount of space we have available for the products is 200 units. In retail, this could be the amount of warehouse space, or for ticketing this could represent the number of seats available." 
+ ] + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: gurobi-machinelearning in /Users/yurchisin/opt/anaconda3/envs/gurobi_ml/lib/python3.11/site-packages (1.3.3)\n", - "Requirement already satisfied: numpy>=1.22.0 in /Users/yurchisin/opt/anaconda3/envs/gurobi_ml/lib/python3.11/site-packages (from gurobi-machinelearning) (1.26.2)\n", - "Requirement already satisfied: gurobipy>=10.0.0 in /Users/yurchisin/opt/anaconda3/envs/gurobi_ml/lib/python3.11/site-packages (from gurobi-machinelearning) (11.0.0)\n", - "Requirement already satisfied: scipy>=1.9.3 in /Users/yurchisin/opt/anaconda3/envs/gurobi_ml/lib/python3.11/site-packages (from gurobi-machinelearning) (1.11.4)\n", - "Note: you may need to restart the kernel to use updated packages.\n" - ] - } - ], - "source": [ - "#### install the package and load the required function\n", - "%pip install gurobi-machinelearning\n", - "from gurobi_ml import add_predictor_constr" - ] - }, - { - "cell_type": "markdown", - "id": "119a8f7e", - "metadata": {}, - "source": [ - "This additional package is useful when we have **decision variables** that are also **features** of a machine learning model. First, we need a data frame that contains these decision variables. It is important to make sure the indices of the data frame have the **same name** as the training data for the machine learning model. " - ] - }, - { - "cell_type": "code", - "execution_count": 182, - "id": "0b1ac8cc", - "metadata": {}, - "outputs": [], - "source": [ - "m_feats = pd.DataFrame({\"p[1]\":[p[1]],\"p[2]\":[p[2]]})" - ] - }, - { - "cell_type": "markdown", - "id": "3263a396", - "metadata": {}, - "source": [ - "Adding the predictive model to the optimization model requires specifying the model we want to use `(m)`, regression object `(xgb_regressor)`, feature data frame `(m_feats)`, and the output decision variable `(n[1])`. Remember `n[2]` is **NOT** the output of the regression. 
We can then print the number of variables and constraints added to the model using `print_stats`.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 186, - "id": "1fe6c59f", - "metadata": {}, - "outputs": [ + "cell_type": "markdown", + "id": "6e9c6157", + "metadata": { + "id": "6e9c6157" + }, + "source": [ + "### Building regressors to predict sales\n", + "\n", + "The prices for each category item will be used to predict the number of Category 1 items sold. Here we build a regression model to form this relationship which will later be used as part of the optimization model." + ] + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model for pipe0:\n", - "0 variables\n", - "1 constraints\n", - "Input has shape (1, 2)\n", - "Output has shape (1, 1)\n", - "\n", - "Pipeline has 1 steps:\n", - "\n", - "--------------------------------------------------------------------------------\n", - "Step Output Shape Variables Constraints \n", - " Linear Quadratic General\n", - "================================================================================\n", - "lin_reg (1, 1) 0 1 0 0\n", - "\n", - "--------------------------------------------------------------------------------\n" - ] - } - ], - "source": [ - "pred_constr = add_predictor_constr(m, linear_regressor, m_feats, n[1])\n", - "pred_constr.print_stats()" - ] - }, - { - "cell_type": "markdown", - "id": "53ce50f6", - "metadata": {}, - "source": [ - "### Solve the optimization and get the solution" - ] - }, - { - "cell_type": "markdown", - "id": "ec80341d", - "metadata": {}, - "source": [ - "Since this is a quadratic, non-convex problem we set the `NonConvex` parameter to 2. See the [documentation](https://www.gurobi.com/documentation/current/refman/nonconvex.html) for more information. We'll also print out the optimal solution. 
" - ] - }, - { - "cell_type": "code", - "execution_count": 187, - "id": "d04d5e5b", - "metadata": {}, - "outputs": [ + "cell_type": "code", + "execution_count": 5, + "id": "a70d5f2c", + "metadata": { + "id": "a70d5f2c" + }, + "outputs": [], + "source": [ + "from sklearn.compose import make_column_transformer\n", + "from sklearn.linear_model import LinearRegression\n", + "from sklearn.pipeline import make_pipeline\n", + "from sklearn.metrics import r2_score\n", + "from sklearn.model_selection import train_test_split #importing scikit-learn's function for data splitting\n", + "from sklearn.ensemble import GradientBoostingRegressor #importing scikit-learn's gradient booster regressor function\n", + "from sklearn.model_selection import cross_validate #improting scikit-learn's cross validation function" + ] + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (mac64[rosetta2] - Darwin 22.6.0 22G91)\n", - "\n", - "CPU model: Apple M1\n", - "Thread count: 8 physical cores, 8 logical processors, using up to 8 threads\n", - "\n", - "WLS license 874302 - registered to Gurobi Optimization LLC\n", - "Optimize a model with 33 rows, 232 columns and 240 nonzeros\n", - "Model fingerprint: 0x27842501\n", - "Model has 2 quadratic objective terms\n", - "Model has 648 general constraints\n", - "Variable types: 32 continuous, 200 integer (200 binary)\n", - "Coefficient statistics:\n", - " Matrix range [1e-01, 2e+00]\n", - " Objective range [0e+00, 0e+00]\n", - " QObjective range [2e+00, 2e+00]\n", - " Bounds range [1e+00, 2e+02]\n", - " RHS range [1e+00, 7e+02]\n", - " GenCon rhs range [3e-03, 4e+02]\n", - " GenCon coe range [1e+00, 1e+00]\n", - "\n", - "MIP start from previous solve produced solution with objective 62753.9 (0.04s)\n", - "Loaded MIP start from previous solve with objective 62753.9\n", - "\n", - "Presolve added 111 rows and 0 columns\n", - "Presolve removed 0 rows and 68 columns\n", - "Presolve time: 
0.03s\n", - "Presolved: 149 rows, 167 columns, 749 nonzeros\n", - "Presolved model has 2 bilinear constraint(s)\n", - "\n", - "Solving non-convex MIQCP\n", - "\n", - "Variable types: 33 continuous, 134 integer (134 binary)\n", - "\n", - "Root relaxation: objective 6.762746e+04, 191 iterations, 0.00 seconds (0.00 work units)\n", - "\n", - " Nodes | Current Node | Objective Bounds | Work\n", - " Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n", - "\n", - " 0 0 67627.4598 0 31 62753.8592 67627.4598 7.77% - 0s\n", - " 0 0 67627.4598 0 33 62753.8592 67627.4598 7.77% - 0s\n", - " 0 0 67627.4598 0 35 62753.8592 67627.4598 7.77% - 0s\n", - "H 0 0 63247.729112 67627.4598 6.92% - 0s\n", - " 0 0 66606.1147 0 17 63247.7291 66606.1147 5.31% - 0s\n", - " 0 0 66561.7393 0 21 63247.7291 66561.7393 5.24% - 0s\n", - " 0 0 66219.9473 0 45 63247.7291 66219.9473 4.70% - 0s\n", - " 0 0 65902.8627 0 38 63247.7291 65902.8627 4.20% - 0s\n", - " 0 0 65755.2174 0 23 63247.7291 65755.2174 3.96% - 0s\n", - " 0 0 65563.5628 0 11 63247.7291 65563.5628 3.66% - 0s\n", - " 0 0 65456.9640 0 7 63247.7291 65456.9640 3.49% - 0s\n", - "H 0 0 65292.830679 65456.9640 0.25% - 0s\n", - " 0 0 65456.9640 0 7 65292.8307 65456.9640 0.25% - 0s\n", - "\n", - "Cutting planes:\n", - " Learned: 1\n", - " Cover: 4\n", - " Implied bound: 1\n", - " Clique: 35\n", - " MIR: 3\n", - " Flow cover: 4\n", - " Relax-and-lift: 14\n", - "\n", - "Explored 1 nodes (728 simplex iterations) in 0.21 seconds (0.06 work units)\n", - "Thread count was 8 (of 8 available processors)\n", - "\n", - "Solution count 3: 65292.8 63247.7 62753.9 \n", - "\n", - "Optimal solution found (tolerance 1.00e-04)\n", - "Best objective 6.529283067918e+04, best bound 6.529283067918e+04, gap 0.0000%\n" - ] - } - ], - "source": [ - "m.Params.NonConvex = 2\n", - "m.optimize()" - ] - }, - { - "cell_type": "code", - "execution_count": 188, - "id": "ec56f1da", - "metadata": {}, - "outputs": [ + "cell_type": "code", + "execution_count": 6, 
+ "id": "3095da93", + "metadata": { + "id": "3095da93" + }, + "outputs": [], + "source": [ + "X = df[[\"p[1]\",\"p[2]\"]]\n", + "y = df[\"n[1]\"]\n", + "# Split the data for training and testing\n", + "X_train, X_test, y_train, y_test = train_test_split(\n", + " X, y, train_size=0.75, random_state=1\n", + ")" + ] + }, { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "Optimal price for the two categories:\n", - " 378.77 300.0\n", - "\n", - "Optimal number of space assigned to the two categories:\n", - " 67 133\n", - "\n", - "Total revenue:\n", - " 65292.83\n" - ] - } - ], - "source": [ - "print(\"\\nOptimal price for the two categories:\\n\",round(p[1].X,2),round(p[2].X,2))\n", - "print(\"\\nOptimal number of space assigned to the two categories:\\n\",round(n[1].X), round(n[2].X))\n", - "print(\"\\nTotal revenue:\\n\",round(revenue.getValue(),2))" - ] - }, - { - "cell_type": "markdown", - "id": "5e9e2bbd", - "metadata": {}, - "source": [ - "## Build an interactive model" - ] - }, - { - "cell_type": "markdown", - "id": "e9e00776", - "metadata": {}, - "source": [ - "The penalty portion of the objective can have a big impact on the solution. We can combine the parts of the model into a function to give us the ability to see what different values of $\\lambda$ do to the solution. 
\n" - ] - }, - { - "cell_type": "code", - "execution_count": 111, - "id": "c0c84944", - "metadata": {}, - "outputs": [], - "source": [ - "from ipywidgets import interact, interactive, fixed, interact_manual\n", - "import ipywidgets as widgets\n", - "\n", - "def solve(x = 0):\n", - "\n", - " model = gp.Model(\"Price optimization\")\n", - " model.Params.OutputFlag = 0\n", - "\n", - " l = x\n", - " N = 200\n", - " price_lb = {1:300, 2:100}\n", - " price_ub = {1:400, 2:300}\n", - " min_items = {1:50,2:50}\n", - " price_bounds = {1:[300,400], 2:[100,300]}\n", - " p = model.addVars(products, lb = price_lb, ub = price_ub, name=\"p\") \n", - " n = model.addVars(products, lb = min_items, name=\"n\") \n", - " \n", - " model.addConstrs(n[c] >= min_items[c] for c in products) \n", - " model.addConstr(p[1] == [300,400]) \n", - " model.addConstr(p[2] == [100,300])\n", - "\n", - " revenue = gp.quicksum(p[c]*n[c] for c in products) \n", - " penalty = l*(p[1]**2+p[1]**2) \n", - " model.setObjective(revenue - penalty, sense = GRB.MAXIMIZE)\n", - "\n", - " model.addConstr(n.sum() == N)\n", - "\n", - " model.addConstr(p[1]-p[2] == [50,100])\n", - "\n", - " m_feats = pd.DataFrame({\"p[1]\":[p[1]],\"p[2]\":[p[2]]})\n", - "\n", - " pred_constr = add_predictor_constr(m, xgb_regressor, m_feats, n[1])\n", - "\n", - " model.Params.NonConvex = 2\n", - " model.optimize()\n", - " print(\"\\nOptimal price for the two categories:\\n\",round(p[1].X,2),round(p[2].X,2))\n", - " print(\"\\nOptimal number of space assigned to the two categories:\\n\",round(n[1].X), round(n[2].X))\n", - " print(\"\\nTotal revenue:\\n\",round(revenue.getValue(),2))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e3b012c2", - "metadata": {}, - "outputs": [], - "source": [ - "upper_bound_for_lamda = 0.25\n", - "print(upper_bound_for_lamda)\n", - "print(\"Select a value for regularization parameter lambda:\\n\")\n", - "interact(solve, x=(0,upper_bound_for_lamda,0.01))" - ] - }, - { - "cell_type": 
"markdown", - "id": "e0888fb6", - "metadata": { - "id": "e0888fb6" - }, - "source": [ - "Copyright © 2024 Gurobi Optimization, LLC" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - }, - "widgets": { - "application/vnd.jupyter.widget-state+json": { - "04c449bbca144b7db6821cf019ebb48b": { - "model_module": "@jupyter-widgets/output", - "model_module_version": "1.0.0", - "model_name": "OutputModel", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/output", - "_model_module_version": "1.0.0", - "_model_name": "OutputModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/output", - "_view_module_version": "1.0.0", - "_view_name": "OutputView", - "layout": "IPY_MODEL_c18fa29a67c643e79a1f0dffb1b591b6", - "msg_id": "", + "cell_type": "markdown", + "id": "f8d713c2", + "metadata": { + "id": "f8d713c2" + }, + "source": [ + "First we'll start with a linear regression model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "33f4eaec", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "33f4eaec", + "outputId": "d02c9203-4bef-4a4e-cb16-bb948a8fca00" + }, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model for pipe:\n", - "180 variables\n", - "21 constraints\n", - "559 general constraints\n", - "Input has shape (1, 2)\n", - "Output has shape (1, 1)\n", - "\n", - "Pipeline has 1 steps:\n", - "\n", - "--------------------------------------------------------------------------------\n", - "Step Output Shape Variables Constraints \n", - " Linear Quadratic General\n", - "================================================================================\n", - "gbtree_reg (1, 1) 180 21 0 559\n", - "\n", - "--------------------------------------------------------------------------------\n", - "Set parameter NonConvex to value 2\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Gurobi Optimizer version 10.0.2 build v10.0.2rc0 (linux64)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Thread count: 1 physical cores, 2 logical processors, using up to 2 threads\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Optimize a model with 30 rows, 187 columns and 193 nonzeros\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model fingerprint: 0xb29f40d6\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model has 3 quadratic objective terms\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model has 2 
quadratic constraints\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Model has 559 general constraints\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Variable types: 27 continuous, 160 integer (160 binary)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Coefficient statistics:\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Matrix range [1e-01, 1e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " QMatrix range [1e+00, 1e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " QLMatrix range [1e+00, 1e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Objective range [0e+00, 0e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " QObjective range [3e-01, 2e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Bounds range [6e-02, 1e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " RHS range [5e-01, 4e+02]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " GenCon rhs range [5e-04, 4e+02]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " GenCon coe range [1e+00, 1e+00]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Presolve added 113 rows and 0 columns\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Presolve time: 0.06s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Presolved: 156 rows, 191 columns, 1024 nonzeros\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Presolved model has 1 quadratic constraint(s)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Presolved model has 4 bilinear constraint(s)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", 
- "text": [ - "Variable types: 31 continuous, 160 integer (160 binary)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Root relaxation: objective 4.668508e+04, 188 iterations, 0.00 seconds (0.00 work units)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Nodes | Current Node | Objective Bounds | Work\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 46685.0841 0 43 - 46685.0841 - - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "H 0 0 25589.338929 46685.0841 82.4% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 40215.3725 0 21 25589.3389 40215.3725 57.2% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "H 0 0 37190.120990 40215.3725 8.13% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 40207.4274 0 19 37190.1210 40207.4274 8.11% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39746.2851 0 29 37190.1210 39746.2851 6.87% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "H 0 0 37399.288369 39746.2851 6.28% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39724.9188 0 27 37399.2884 39724.9188 6.22% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39531.4442 0 37 37399.2884 39531.4442 5.70% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39516.6123 0 50 37399.2884 
39516.6123 5.66% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39436.2486 0 43 37399.2884 39436.2486 5.45% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39433.6216 0 50 37399.2884 39433.6216 5.44% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39423.4882 0 39 37399.2884 39423.4882 5.41% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39405.9128 0 53 37399.2884 39405.9128 5.37% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39401.2897 0 50 37399.2884 39401.2897 5.35% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 0 39033.2506 0 21 37399.2884 39033.2506 4.37% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " 0 1 39033.0955 0 21 37399.2884 39033.0955 4.37% - 0s\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Cutting planes:\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Learned: 13\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Cover: 12\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Clique: 33\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " MIR: 6\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " RLT: 3\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Relax-and-lift: 15\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Explored 22 nodes (590 simplex iterations) in 0.43 seconds (0.10 work units)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Thread count was 2 (of 2 
available processors)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Solution count 3: 37399.3 37190.1 25589.3 \n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Optimal solution found (tolerance 1.00e-04)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Best objective 3.739928836871e+04, best bound 3.739928836871e+04, gap 0.0000%\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "Optimal price for the two categories:\n", - " 325.04908752441406 275.04908752441406\n", - "\n", - "Optimal number of seats assigned to the two categories:\n", - " 150.00000000000006 49.99999999999994\n" - ] - } + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(array([0.80665368, 0.81004987, 0.7941077 , 0.79780172, 0.79495142]),\n", + " array([0.77650521, 0.74227784, 0.82105606, 0.81102818, 0.82166193]))" + ] + }, + "metadata": {}, + "execution_count": 7 + } + ], + "source": [ + "linear_regressor = make_pipeline(LinearRegression())\n", + "linear_regressor.fit(X_train, y_train)\n", + "linear_regression_validation = cross_validate(linear_regressor, X_train, y_train, cv=5, return_train_score=True, return_estimator=True)\n", + "\n", + "linear_regression_validation['train_score'],linear_regression_validation['test_score']" + ] + }, + { + "cell_type": "markdown", + "id": "cc9dd59b", + "metadata": { + "id": "cc9dd59b" + }, + "source": [ + "Let's try a gradient boosting model as well." 
] - } }, - "0653e902f546425e9506d8340450dbd2": { - "model_module": "@jupyter-widgets/controls", - "model_module_version": "1.5.0", - "model_name": "VBoxModel", - "state": { - "_dom_classes": [ - "widget-interact" + { + "cell_type": "code", + "execution_count": 8, + "id": "9b8eb814", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "9b8eb814", + "outputId": "ca16f36f-2399-4e29-c3b3-9646cdb22a38" + }, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "(array([0.71068886, 0.71262322, 0.70264476, 0.70564959, 0.7008412 ]),\n", + " array([0.65706085, 0.69348053, 0.66690978, 0.67664668, 0.67992755]))" + ] + }, + "metadata": {}, + "execution_count": 8 + } ], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "VBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "VBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_e212c9758a0e4460950cb22f911671e1", - "IPY_MODEL_04c449bbca144b7db6821cf019ebb48b" + "source": [ + "from sklearn.ensemble import GradientBoostingRegressor\n", + "xgb_regressor = make_pipeline(GradientBoostingRegressor(n_estimators=10))\n", + "xgb_regressor.fit(X_train, y_train)\n", + "xgb_regressor_validation = cross_validate(xgb_regressor, X_train, y_train, cv=5, return_train_score=True, return_estimator=True)\n", + "\n", + "xgb_regressor_validation['train_score'], xgb_regressor_validation['test_score']\n" + ] + }, + { + "cell_type": "markdown", + "id": "mTquaGiJF2pO", + "metadata": { + "id": "mTquaGiJF2pO" + }, + "source": [ + "## Price optimization model with competing products\n", + "\n", + "Our problem is to:\n", + "1.\tDetermine the number of each category of product to make available given the overall restriction of what we can offer.\n", + "2.\tWe are also instructed to make sure there are a minimum number of each category made available as well as a 
minimum and maximum price for each category.\n", + "3.\tLastly, the product categories should be decreasing in price, meaning Category 1 should be the most expensive, and so on. Specifically, we must make sure there is at least a $50 gap between categories, but no more than $100.\n", + "\n", + "With the predictive part in place, it's time to build the optimization model. The model is formulated (i.e. the mathematical representation) for an unspecified number of categories, but the code will reflect that we have two categories of products in this problem. We start by setting some parameter values (not to be confused with ML hyperparameters) and initialize the optimization model.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "95b966b2", + "metadata": { + "id": "95b966b2" + }, + "source": [ + "### Initialize model and set input parameters\n", + "- $C$: Number of product categories\n", + "- $N$: Total amount of space available\n", + "- $\\lambda$: Price control parameter\n", + "\n", + "Here is the first mention of a price control parameter. It is fairly common in optimization modeling to add penalty terms to try and prevent undesirable outcomes. This is akin to using penalty terms in machine learning and applied statistics to prevent overfitting, with [Lasso](https://en.wikipedia.org/wiki/Lasso_(statistics)) and [Ridge](https://en.wikipedia.org/wiki/Ridge_regression) regression as a couple of common examples." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "67db9804", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "67db9804", + "outputId": "2a1edc1c-0b23-4649-88da-53d34a5f2b03" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Restricted license - for non-production use only - expires 2025-11-24\n" + ] + } ], - "layout": "IPY_MODEL_c2a976389eb046728c1cf683b4920a85" - } + "source": [ + "#### Initialize the model\n", + "m = gp.Model(\"price optimization\")\n", + "\n", + "products = [1,2] #### Category 1 and Category 2\n", + "N = 200 #### limit on available space\n", + "l = 0 #### price control, we'll start this at 0" + ] + }, + { + "cell_type": "markdown", + "id": "ac5a875b", + "metadata": { + "id": "ac5a875b" + }, + "source": [ + "### Decision variables\n", + "- $p_c$: price per item in category $c = 1,2,\\dots, C$\n", + "- $n_c$: number of items allocated to category, predicted using features $p_c$, $c = 1,2,\\dots, C$" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "bbe94a3c", + "metadata": { + "id": "bbe94a3c" + }, + "outputs": [], + "source": [ + "p = m.addVars(products, name=\"p\") #### price decision variables\n", + "n = m.addVars(products, name=\"n\") #### decision variable for number of items in each category" + ] + }, + { + "cell_type": "markdown", + "id": "721614e0", + "metadata": { + "id": "721614e0" + }, + "source": [ + "### Constraints" + ] + }, + { + "cell_type": "markdown", + "id": "8403f941", + "metadata": { + "id": "8403f941" + }, + "source": [ + "We need to have a minimum number of each category available.\n", + "\\begin{align*}\n", + "n_c \\ge l_{n_c}\n", + "\\end{align*}\n", + "\n", + "We also set lower and upper bounds on the prices.\n", + "\\begin{align*}\n", + "l_{p_c} \\le p_c \\le u_{p_c}\n", + "\\end{align*}" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "edec4432", + "metadata": { + "id": "edec4432" + }, + 
"outputs": [], + "source": [ + "min_items = {1:50,2:50}\n", + "price_bounds = {1:[300,400], 2:[100,300]}\n", + "m.addConstrs(n[c] >= min_items[c] for c in products) #### we could hardcode 50 instead of min_items, but this is more flexible\n", + "m.addConstr(p[1] == [300,400]) #### this is a shorthand way to code 300 <= p[1] <= 400\n", + "m.addConstr(p[2] == [100,300]);" + ] + }, + { + "cell_type": "markdown", + "id": "3bef9e0d", + "metadata": { + "id": "3bef9e0d" + }, + "source": [ + "Another note: each of the above constraints can be addressed when defining the decision variables. Here is an example for the decision variable $n$." + ] }, - "39b9c940f508451b9292f29639f5d128": { - "model_module": "@jupyter-widgets/controls", - "model_module_version": "1.5.0", - "model_name": "SliderStyleModel", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "SliderStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "", - "handle_color": null - } + { + "cell_type": "code", + "execution_count": 12, + "id": "e2065d83", + "metadata": { + "id": "e2065d83" + }, + "outputs": [], + "source": [ + "#price_lb = {1:300, 2:100}\n", + "#price_ub = {1:400, 2:300}\n", + "#p = m.addVars(products, lb = price_lb, ub = price_ub, name=\"p\") #### each price is now bounded\n", + "#n = m.addVars(products, lb = min_items, name=\"n\")" + ] }, - "c18fa29a67c643e79a1f0dffb1b591b6": { - "model_module": "@jupyter-widgets/base", - "model_module_version": "1.2.0", - "model_name": "LayoutModel", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - 
"bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } + { + "cell_type": "markdown", + "id": "0578e271", + "metadata": { + "id": "0578e271" + }, + "source": [ + "In general, the number of items allocated must equal the total available space.\n", + "\\begin{equation*}\n", + "n_1 + n_2 + \\dots + n_C = \\sum_{c}n_c = N \\\\\n", + "\\end{equation*}\n", + "Note that this, along with the constraint on the minimum number available means we don't have to specify an upper bound for each $n_c$." 
+ ] }, - "c2a976389eb046728c1cf683b4920a85": { - "model_module": "@jupyter-widgets/base", - "model_module_version": "1.2.0", - "model_name": "LayoutModel", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } + { + "cell_type": "code", + "execution_count": 13, + "id": "140e739f", + "metadata": { + "id": "140e739f" + }, + "outputs": [], + "source": [ + "m.addConstr(n.sum() == N); #### remember we set N = 200 earlier" + ] }, - "e212c9758a0e4460950cb22f911671e1": { - "model_module": "@jupyter-widgets/controls", - "model_module_version": "1.5.0", - "model_name": "FloatSliderModel", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatSliderModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "FloatSliderView", - "continuous_update": true, - "description": "x", - "description_tooltip": null, 
- "disabled": false, - "layout": "IPY_MODEL_e58bd5527b16464da224d7e8125e2065", - "max": 0.08, - "min": 0, - "orientation": "horizontal", - "readout": true, - "readout_format": ".2f", - "step": 0.01, - "style": "IPY_MODEL_39b9c940f508451b9292f29639f5d128", - "value": 0.08 - } + { + "cell_type": "markdown", + "id": "b17a1e78", + "metadata": { + "id": "b17a1e78" + }, + "source": [ + "The last set of constraints enforces price ordering: each category must be priced between $50 and $100 below the previous one.\n", + "\\begin{equation*}\n", + "50 \\le p_c - p_{c+1} \\le 100\n", + "\\end{equation*}" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "1428be22", + "metadata": { + "id": "1428be22" + }, + "outputs": [], + "source": [ + "m.addConstr(p[1]-p[2] == [50,100]);" + ] }, - "e58bd5527b16464da224d7e8125e2065": { - "model_module": "@jupyter-widgets/base", - "model_module_version": "1.2.0", - "model_name": "LayoutModel", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, 
- "top": null, - "visibility": null, - "width": null - } + { + "cell_type": "markdown", + "id": "51ee2204", + "metadata": { + "id": "51ee2204" + }, + "source": [ + "### Objective function" + ] + }, + { + "cell_type": "markdown", + "id": "6eb69e54", + "metadata": { + "id": "6eb69e54" + }, + "source": [ + "We want to maximize total revenue. The revenue from category $c$ is $p_c n_c$, so the total revenue is $\\sum_{c} p_c n_c$. That is the first part of the objective. The second part is the price control term introduced earlier, weighted by the lambda parameter, which captures the trade-off between the revenue and price-control pieces. This term penalizes the model for setting prices too high, since doing so could lose sales. Our model assumes we'll sell all of the items, so having this penalty term can make this assumption more realistic. For reference, [here is a good source](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4565407).\n", + "\n", + "This term will be defined as $λ (\\sum_{c} p_c^2)$ for this problem. 
So, the complete objective is:\n", + "\\begin{equation*}\n", + "\\textrm{maximize} \\sum_{c} p_c n_c - λ (\\sum_{c} p_c^2)\n", + "\\end{equation*}" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "a2cfcabc", + "metadata": { + "id": "a2cfcabc" + }, + "outputs": [], + "source": [ + "revenue = gp.quicksum(p[c]*n[c] for c in products) #### you could also use the simpler p.prod(n)\n", + "penalty = l*(p[1]**2+p[2]**2) #### we used l as the lambda parameter earlier\n", + "m.setObjective(revenue - penalty, sense = GRB.MAXIMIZE)" + ] + }, + { + "cell_type": "markdown", + "id": "c707af60", + "metadata": { + "id": "c707af60" + }, + "source": [ + "### Integrate the ML model\n" + ] + }, + { + "cell_type": "markdown", + "id": "38d41039", + "metadata": { + "id": "38d41039" + }, + "source": [ + "Right now, if we were to run the optimization, the solution would be to set the price for Category 1 to $400 and Category 2 to $300, and to sell 150 Category 1 items and 50 Category 2 items. That's because we have yet to add in the relationship between price and demand that was derived from the ML model. To integrate the machine learning model into the optimization model, we'll use the Gurobi Machine Learning package. The magic happens using the `add_predictor_constr` function."
+ ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "b113eda2", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "b113eda2", + "outputId": "584d9f0e-ea69-41c9-e481-a45f0bc36dd6" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Requirement already satisfied: gurobi-machinelearning in /usr/local/lib/python3.10/dist-packages (1.4.0)\n", + "Requirement already satisfied: numpy>=1.22.0 in /usr/local/lib/python3.10/dist-packages (from gurobi-machinelearning) (1.25.2)\n", + "Requirement already satisfied: gurobipy>=10.0.0 in /usr/local/lib/python3.10/dist-packages (from gurobi-machinelearning) (11.0.0)\n", + "Requirement already satisfied: scipy>=1.9.3 in /usr/local/lib/python3.10/dist-packages (from gurobi-machinelearning) (1.11.4)\n" + ] + } + ], + "source": [ + "#### install the package and load the required function\n", + "%pip install gurobi-machinelearning\n", + "from gurobi_ml import add_predictor_constr" + ] + }, + { + "cell_type": "markdown", + "id": "119a8f7e", + "metadata": { + "id": "119a8f7e" + }, + "source": [ + "This additional package is useful when we have **decision variables** that are also **features** of a machine learning model. First, we need a data frame that contains these decision variables. It is important to make sure the columns of the data frame have the **same names** as the feature columns in the training data for the machine learning model."
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "0b1ac8cc", + "metadata": { + "id": "0b1ac8cc" + }, + "outputs": [], + "source": [ + "m_feats = pd.DataFrame({\"p[1]\":[p[1]],\"p[2]\":[p[2]]})" + ] + }, + { + "cell_type": "markdown", + "id": "3263a396", + "metadata": { + "id": "3263a396" + }, + "source": [ + "Adding the predictive model to the optimization model requires specifying the model we want to use `(m)`, regression object `(xgb_regressor)`, feature data frame `(m_feats)`, and the output decision variable `(n[1])`. Remember `n[2]` is **NOT** the output of the regression. We can then print the number of variables and constraints added to the model using `print_stats`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "1fe6c59f", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "1fe6c59f", + "outputId": "5aa94f80-e974-4e5b-de19-7c82525e5f75" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model for pipe:\n", + "90 variables\n", + "11 constraints\n", + "246 general constraints\n", + "Input has shape (1, 2)\n", + "Output has shape (1, 1)\n", + "\n", + "Pipeline has 1 steps:\n", + "\n", + "--------------------------------------------------------------------------------\n", + "Step Output Shape Variables Constraints \n", + " Linear Quadratic General\n", + "================================================================================\n", + "gbtree_reg (1, 1) 90 11 0 246\n", + "\n", + "--------------------------------------------------------------------------------\n" + ] + } + ], + "source": [ + "pred_constr = add_predictor_constr(m, xgb_regressor, m_feats, n[1])\n", + "pred_constr.print_stats()" + ] + }, + { + "cell_type": "markdown", + "id": "53ce50f6", + "metadata": { + "id": "53ce50f6" + }, + "source": [ + "### Solve the optimization and get the solution" + ] + }, + { + "cell_type": "markdown", + "id": "ec80341d", + "metadata": { + "id": 
"ec80341d" + }, + "source": [ + "Since this is a quadratic, non-convex problem we set the `NonConvex` parameter to 2. See the [documentation](https://www.gurobi.com/documentation/current/refman/nonconvex.html) for more information. We'll also print out the optimal solution." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "d04d5e5b", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "d04d5e5b", + "outputId": "7cbae0da-917f-45ec-9fe1-3af04c942d08" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Set parameter NonConvex to value 2\n", + "Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (linux64 - \"Ubuntu 22.04.3 LTS\")\n", + "\n", + "CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n", + "Thread count: 1 physical cores, 2 logical processors, using up to 2 threads\n", + "\n", + "Optimize a model with 17 rows, 97 columns and 102 nonzeros\n", + "Model fingerprint: 0x2255329b\n", + "Model has 2 quadratic objective terms\n", + "Model has 246 general constraints\n", + "Variable types: 17 continuous, 80 integer (80 binary)\n", + "Coefficient statistics:\n", + " Matrix range [1e-01, 1e+00]\n", + " Objective range [0e+00, 0e+00]\n", + " QObjective range [2e+00, 2e+00]\n", + " Bounds range [1e+00, 2e+02]\n", + " RHS range [1e+00, 4e+02]\n", + " GenCon rhs range [3e-01, 4e+02]\n", + " GenCon coe range [1e+00, 1e+00]\n", + "Presolve added 44 rows and 0 columns\n", + "Presolve removed 0 rows and 15 columns\n", + "Presolve time: 0.10s\n", + "Presolved: 66 rows, 85 columns, 383 nonzeros\n", + "Presolved model has 2 bilinear constraint(s)\n", + "\n", + "Solving non-convex MIQCP\n", + "\n", + "Variable types: 18 continuous, 67 integer (67 binary)\n", + "Found heuristic solution: objective 57353.074762\n", + "\n", + "Root relaxation: objective 6.842235e+04, 81 iterations, 0.00 seconds (0.00 work units)\n", + "\n", + " Nodes | Current Node | Objective Bounds | Work\n", + 
" Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n", + "\n", + " 0 0 68422.3454 0 21 57353.0748 68422.3454 19.3% - 0s\n", + " 0 0 68380.4264 0 21 57353.0748 68380.4264 19.2% - 0s\n", + "H 0 0 65051.531333 68380.4264 5.12% - 0s\n", + "H 0 0 65717.039440 68380.4264 4.05% - 0s\n", + " 0 0 67521.7136 0 30 65717.0394 67521.7136 2.75% - 0s\n", + " 0 0 67440.6730 0 34 65717.0394 67440.6730 2.62% - 0s\n", + " 0 0 67415.9059 0 33 65717.0394 67415.9059 2.59% - 0s\n", + " 0 0 67407.6626 0 32 65717.0394 67407.6626 2.57% - 0s\n", + " 0 0 67397.1289 0 32 65717.0394 67397.1289 2.56% - 0s\n", + " 0 0 67367.1673 0 26 65717.0394 67367.1673 2.51% - 0s\n", + "H 0 0 65919.660132 67367.1673 2.20% - 0s\n", + " 0 0 67325.6782 0 24 65919.6601 67325.6782 2.13% - 0s\n", + " 0 0 67027.1365 0 27 65919.6601 67027.1365 1.68% - 0s\n", + " 0 0 66962.1686 0 21 65919.6601 66962.1686 1.58% - 0s\n", + " 0 0 66754.1535 0 21 65919.6601 66754.1535 1.27% - 0s\n", + " 0 0 66724.4366 0 21 65919.6601 66724.4366 1.22% - 0s\n", + " 0 0 66357.6817 0 22 65919.6601 66357.6817 0.66% - 0s\n", + " 0 0 66314.3792 0 17 65919.6601 66314.3792 0.60% - 0s\n", + " 0 0 66212.7588 0 15 65919.6601 66212.7588 0.44% - 0s\n", + " 0 0 66096.5119 0 21 65919.6601 66096.5119 0.27% - 0s\n", + " 0 0 66096.5119 0 22 65919.6601 66096.5119 0.27% - 0s\n", + " 0 0 66096.5119 0 21 65919.6601 66096.5119 0.27% - 0s\n", + " 0 0 66096.5119 0 23 65919.6601 66096.5119 0.27% - 0s\n", + " 0 0 66092.0244 0 21 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 23 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 22 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 21 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 21 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 23 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66092.0244 0 19 65919.6601 66092.0244 0.26% - 0s\n", + " 0 0 66087.8039 0 21 65919.6601 66087.8039 0.26% - 0s\n", + " 0 0 66078.8922 0 17 65919.6601 66078.8922 0.24% - 0s\n", + " 
0 0 66074.6109 0 20 65919.6601 66074.6109 0.24% - 0s\n", + " 0 0 65978.7306 0 15 65919.6601 65978.7306 0.09% - 0s\n", + " 0 0 65955.6109 0 23 65919.6601 65955.6109 0.05% - 0s\n", + " 0 0 65941.1212 0 23 65919.6601 65941.1212 0.03% - 0s\n", + " 0 0 65933.9387 0 23 65919.6601 65933.9387 0.02% - 0s\n", + " 0 0 65933.9387 0 22 65919.6601 65933.9387 0.02% - 0s\n", + " 0 0 65931.8301 0 1 65919.6601 65931.8301 0.02% - 0s\n", + "H 0 0 65931.830113 65931.8301 0.00% - 0s\n", + " 0 0 65931.8301 0 1 65931.8301 65931.8301 0.00% - 0s\n", + "\n", + "Cutting planes:\n", + " Cover: 3\n", + " Implied bound: 2\n", + " Clique: 6\n", + " MIR: 2\n", + " GUB cover: 1\n", + " Relax-and-lift: 4\n", + "\n", + "Explored 1 nodes (377 simplex iterations) in 0.62 seconds (0.05 work units)\n", + "Thread count was 2 (of 2 available processors)\n", + "\n", + "Solution count 5: 65931.8 65919.7 65717 ... 57353.1\n", + "\n", + "Optimal solution found (tolerance 1.00e-04)\n", + "Best objective 6.593183011274e+04, best bound 6.593183011274e+04, gap 0.0000%\n" + ] + } + ], + "source": [ + "m.Params.NonConvex = 2\n", + "m.optimize()" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "ec56f1da", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ec56f1da", + "outputId": "231a9f5b-2d76-48b2-899b-8f9db2232b94" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\n", + "Optimal price for the two categories:\n", + " 364.97 300.0\n", + "\n", + "Optimal number of space assigned to the two categories:\n", + " 91 109\n", + "\n", + "Total revenue:\n", + " 65931.83\n" + ] + } + ], + "source": [ + "print(\"\\nOptimal price for the two categories:\\n\",round(p[1].X,2),round(p[2].X,2))\n", + "print(\"\\nOptimal number of space assigned to the two categories:\\n\",round(n[1].X), round(n[2].X))\n", + "print(\"\\nTotal revenue:\\n\",round(revenue.getValue(),2))" + ] + }, + { + "cell_type": "markdown", + "id": "5e9e2bbd", + "metadata": { 
+ "id": "5e9e2bbd" + }, + "source": [ + "## Follow-up questions\n", + "At this point the λ value is set to zero, meaning we are not penalizing the price. One way to see the sensitivity of the prices and the number of items assigned to each category is to solve the model for multiple values of λ. This can be done fairly easily in `gurobipy`. Check out the documentation for how Gurobi can handle [multiple scenarios](https://www.gurobi.com/documentation/current/refman/multiple_scenarios.html).\n", + "\n", + "We also only used the trained `xgb_regressor` for the optimization to show the number of variables and constraints added to the model. Test the `linear_regressor` that was trained as well. You can do that by re-running the cells or by adding code to [remove the previous predictor constraints](https://gurobi-machinelearning.readthedocs.io/en/stable/auto_generated/gurobi_ml.sklearn.linear_regression.LinearRegressionConstr.html#gurobi_ml.sklearn.linear_regression.LinearRegressionConstr.remove) and add new ones." + ] + }, + { + "cell_type": "markdown", + "id": "e9e00776", + "metadata": { + "id": "e9e00776" + }, + "source": [ + "The penalty portion of the objective can have a big impact on the solution. 
We can combine the parts of the model into a function to give us the ability to see what different values of $\\lambda$ do to the solution.\n" + ] + }, + { + "cell_type": "markdown", + "id": "e0888fb6", + "metadata": { + "id": "e0888fb6" + }, + "source": [ + "Copyright © 2024 Gurobi Optimization, LLC" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" } - } - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file