From 4e737e0c248ca66a57bc463bfc1f6329b547e443 Mon Sep 17 00:00:00 2001 From: Alexandr Katrutsa Date: Sat, 28 Jul 2018 12:04:20 +0300 Subject: [PATCH] Add en version of seminar 20, #15 --- 20-InteriorPoint/Seminar20en.ipynb | 1043 ++++++++++++++++++++++++++++ 1 file changed, 1043 insertions(+) create mode 100644 20-InteriorPoint/Seminar20en.ipynb diff --git a/20-InteriorPoint/Seminar20en.ipynb b/20-InteriorPoint/Seminar20en.ipynb new file mode 100644 index 0000000..279bc91 --- /dev/null +++ b/20-InteriorPoint/Seminar20en.ipynb @@ -0,0 +1,1043 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Seminar 20\n", + "# Interior point method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Reminder\n", + "\n", + "- Projected gradient descent\n", + "- Frank-Wolfe method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Convex optimization problem with equality constraints\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "&\\min f(x) \\\\ \n", + "\\text{s.t. 
} & Ax = b,\n", + "\\end{split}\n", + "\\end{equation*}\n", + "where $f$ is convex and twice differentiable, $A \\in \\mathbb{R}^{p \\times n}$ and $\\mathrm{rank} \\; A = p < n$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Dual problem\n", + "Relation between dual and conjugate functions \n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "g(\\mu) & = -b^{\\top}\\mu + \\inf_x(f(x) + \\mu^{\\top}Ax) \\\\\n", + "& = -b^{\\top}\\mu - \\sup_x((-A^{\\top}\\mu)^{\\top}x -f(x)) \\\\\n", + "& = -b^{\\top}\\mu - f^*(-A^{\\top}\\mu)\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "Dual problem\n", + "\n", + "$$\n", + "\\max_\\mu -b^{\\top}\\mu - f^*(-A^{\\top}\\mu)\n", + "$$\n", + "\n", + "**Approach 1**: find the conjugate function and \n", + "\n", + "solve the unconstrained dual problem" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "**Issues**\n", + "- it may not be easy to recover the solution of the primal problem from the dual one\n", + "- conjugate function $f^*$ has to be twice differentiable to solve the dual problem fast, but this does not always hold" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Optimality conditions\n", + "\n", + "- $Ax^* = b$\n", + "- $f'(x^*) + A^{\\top}\\mu^* = 0$\n", + "\n", + "or\n", + "\n", + "$$ \\begin{bmatrix} f' & A^{\\top} \\\\ A & 0 \\end{bmatrix} \\begin{bmatrix} x^{\\\\*} \\\\ \\mu^{\\\\*} \\end{bmatrix} = \\begin{bmatrix} 0 \\\\ b \\end{bmatrix} $$\n", + "\n", + "**Approach 2**: solve this system, which is non-linear in general, with Newton method.\n", + "\n", + "**Q**: in what case does the system become linear?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Newton method for convex optimization problem with equality constraints\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min_v f(x) + f'(x)^{\\top}v + \\frac{1}{2}v^{\\top}f''(x)v\\\\\n", + "\\text{s.t. } & A(x + v) = b\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "From the optimality condition follows \n", + "\n", + "$$ \\begin{bmatrix} f''(x) & A^{\\top} \\\\ A & 0 \\end{bmatrix} \\begin{bmatrix} v \\\\ w \\end{bmatrix} = \\begin{bmatrix} -f'(x) \\\\ 0 \\end{bmatrix} $$\n", + "\n", + "**Newton direction $v$ is defined only for non-singular matrix!**\n", + "\n", + "**Q:** how direction $w$ can be interpreted?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "**Exercise**. \n", + "\n", + "Estimate number of iterations required for convergence of \n", + "\n", + "Newton method for quadratic objective and linear equality constraints." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Linearization of optimality conditions\n", + "\n", + "- $A(x + v) = b \\rightarrow Av = 0$\n", + "- $f'(x + v) + A^{\\top}w \\approx f'(x) + f''(x)v + A^{\\top}w = 0$\n", + "\n", + "or\n", + "\n", + "- $f''(x)v + A^{\\top}w = -f'(x)$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Pseudocode\n", + "**Important note:** the initial point has to lie inside the feasible set!\n", + "\n", + "```python\n", + "def NewtonEqualityFeasible(f, gradf, hessf, A, b, \n", + " \n", + " stop_crit, line_search, x0, \n", + " \n", + " tol, **kwargs):\n", + " \n", + " x = x0\n", + " \n", + " n = x.shape[0]\n", + " \n", + " while True:\n", + " \n", + " newton_matrix = [[hessf(x), A.T], [A, 0]]\n", + " \n", + " rhs = [-gradf(x), 0]\n", + " \n", + " w = solve_lin_sys(newton_matrix, rhs)\n", + " \n", + " h = w[:n]\n", + " \n", + " if stop_crit(x, h, gradf(x), **kwargs) < tol:\n", + " \n", + " break\n", + " \n", + " alpha = line_search(x, h, f, gradf(x), **kwargs)\n", + " \n", + " x = x + alpha * h\n", + " \n", + " return x\n", + "\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Stopping criterion\n", + "\n", + "Estimate the following difference\n", + "\n", + "$$\n", + "f(x) - \\inf_v(\\hat{f}(x + v) \\; | \\; A(x+v) = b),\n", + "$$\n", + "\n", + "where $\\hat{f}$ is the quadratic approximation of function $f$ and $h$ denotes the Newton direction.\n", + "\n", + "To do this, multiply both sides by $h^{\\top}$ from the left \n", + "\n", + "$$\n", + "\\langle h^{\\top} \\rvert \\cdot \\quad f''(x)h + A^{\\top}w = -f'(x)\n", + "$$\n", + "\n", + "and use constraint $Ah = 0$\n", + "\n", + "$$\n", + "h^{\\top}f''(x)h = -f'(x)^{\\top}h\n", + "$$\n", + "\n", + "Then \n", + "\n", + "$$\n", + "\\inf_v(\\hat{f}(x + v) \\; | \\; A(x+v) = b) = f(x) - 
\\frac{1}{2}h^{\\top}f''(x)h\n", + "$$\n", + "\n", + "**Summary:** value of $h^{\\top}f''(x)h$ is the most natural stopping criterion of Newton method." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Convergence theorem\n", + "\n", + "Convergence of the equality constrained Newton method is equivalent \n", + "\n", + "to convergence of the classical Newton method for unconstrained optimization problem.\n", + "\n", + "**Theorem**\n", + "Assume the following conditions hold\n", + "- level set $S = \\{ x \\; | \\; x \\in D(f), \\; f(x) \\leq f(x_0), \\; Ax = b \\}$ is closed and $x_0 \\in D(f), \\; Ax_0 = b$\n", + "- hessian $f''$ is Lipschitz on $S$, i.e. $\\|f''(x) - f''(\\tilde{x})\\|_2 \\leq L\\|x - \\tilde{x}\\|_2$ for any $x \\in S$ and $\\tilde{x} \\in S$\n", + "- in the set $S$ $\\|f''(x)\\|_2 \\leq M $ and norm of the inverse matrix of the KKT system is bounded above\n", + "\n", + "Then, Newton method converges to the pair $(x^*, \\mu^*)$ \n", + "\n", + "- linearly when iterates are far from the solution\n", + "- quadratically in a sufficiently small neighbourhood of the solution" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Infeasible starting point\n", + "\n", + "- Newton method requires that the starting point is feasible\n", + "- But what to do if this requirement is violated? 
An example of a hard case is a problem in which the domain of the objective function is not $\\mathbb{R}^n$.\n", + "- If the starting point is infeasible, the linearized KKT conditions can be written as\n", + "\n", + "$$\n", + "\\begin{bmatrix}\n", + "f''(x) & A^{\\top}\\\\\n", + "A & 0\n", + "\\end{bmatrix}\n", + "\\begin{bmatrix}\n", + "v\\\\\n", + "w\n", + "\\end{bmatrix}\n", + " = -\n", + "\\begin{bmatrix}\n", + "f'(x)\\\\\n", + "{\\color{red}{Ax - b}}\n", + "\\end{bmatrix}\n", + "$$\n", + "\n", + "- If $x$ is feasible, then the system is equivalent to the system for Newton method with feasible starting point" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Primal-dual interpretation\n", + "\n", + "- A method is called *primal-dual* if every iteration updates both primal and dual variables\n", + "- In particular, re-write optimality condition in the following form\n", + "\n", + "$$\n", + "r(x^*, \\mu^*) = (r_p(x^*, \\mu^*), r_d(x^*, \\mu^*)) = 0,\n", + "$$\n", + "\n", + "where $r_p(x, \\mu) = Ax - b$ and $r_d(x, \\mu) = f'(x) + A^{\\top}\\mu$\n", + "- Solve this system with Newton method, where $y = (x, \\mu)$ and $z = (z_p, z_d)$:\n", + "\n", + "$$\n", + "r(y + z) \\approx r(y) + Dr(y)z = 0\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "- Primal-dual direction in Newton method is defined as the solution of the following system\n", + "\n", + "$$\n", + "Dr(y)z = -r(y)\n", + "$$\n", + "\n", + "or in more detail\n", + "\n", + "$$\n", + "\\begin{bmatrix}\n", + "f''(x) & A^{\\top}\\\\\n", + "A & 0\n", + "\\end{bmatrix}\n", + "\\begin{bmatrix}\n", + "z_p\\\\\n", + "z_d\n", + "\\end{bmatrix}\n", + " = -\n", + "\\begin{bmatrix}\n", + "r_d(x, \\mu)\\\\\n", + "r_p(x, \\mu)\n", + "\\end{bmatrix}\n", + "= - \n", + "\\begin{bmatrix}\n", + "f'(x) + A^{\\top}\\mu\\\\\n", + "Ax - b\n", + "\\end{bmatrix}\n", + "$$\n", + "\n", + "- Substitute $z_d^+ = \\mu + z_d$ and obtain\n", + "\n", 
+ "$$\n", + "\\begin{bmatrix}\n", + "f''(x) & A^{\\top}\\\\\n", + "A & 0\n", + "\\end{bmatrix}\n", + "\\begin{bmatrix}\n", + "z_p\\\\\n", + "z_d^+\n", + "\\end{bmatrix}\n", + "= - \n", + "\\begin{bmatrix}\n", + "f'(x)\\\\\n", + "Ax - b\n", + "\\end{bmatrix}\n", + "$$\n", + "\n", + "- This system is equivalent to the previous one in the following notation\n", + "\n", + "$$\n", + "v = z_p \\qquad w = z_d^+ = \\mu + z_d \n", + "$$\n", + "\n", + "- Newton method gives the direction for updating the primal variable and the updated value of the dual variable" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Pseudocode\n", + "```python\n", + "def NewtonEqualityInfeasible(f, grad, hessf, A, b, \n", + " \n", + " stop_crit, line_search, x0, \n", + " \n", + " mu0, tol, **kwargs):\n", + " \n", + " x = x0\n", + " \n", + " mu = mu0\n", + " \n", + " n = x.shape[0]\n", + " \n", + " while True:\n", + " \n", + " z_p, z_d = ComputeNewtonStep(hessf(x), grad(x), A, b, x, mu)\n", + " \n", + " if stop_crit(x, z_p, z_d, grad(x), **kwargs) < tol:\n", + " \n", + " break\n", + " \n", + " alpha = line_search(x, z_p, z_d, \n", + " \n", + " f, grad(x), **kwargs)\n", + " \n", + " x = x + alpha * z_p\n", + " \n", + " mu = mu + alpha * z_d\n", + " \n", + " return x\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Stopping criterion and backtracking\n", + "\n", + "- Update of $r_p$ after step $\\alpha z_p$\n", + "\n", + "$$\n", + "A(x + \\alpha z_p) - b = [A(x + z_p) = b] = Ax + \\alpha(b - Ax) - b = (1 - \\alpha)(Ax - b)\n", + "$$\n", + "\n", + "- Total update of the primal residual after $k$ steps\n", + "\n", + "$$\n", + "r^{(k)} = \\prod_{i=0}^{k-1}(1 - \\alpha^{(i)})r^{(0)}\n", + "$$\n", + "\n", + "- Stopping criterion: $Ax = b$ and $\\|r(x, \\mu)\\|_2 \\leq \\varepsilon$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "- 
Backtracking: $c \\in (0, 1/2)$, $\\beta \\in (0, 1)$\n", + "```python\n", + "def linesearch(r, x, mu, z_p, z_d, c, beta):\n", + " alpha = 1\n", + " while (norm(r(x + alpha * z_p, mu + alpha * z_d)) >= \n", + " (1 - c * alpha) * norm(r(x, mu))): \n", + " alpha *= beta \n", + " return alpha\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Convergence theorem\n", + "\n", + "The result is similar to the case of a feasible starting point\n", + "\n", + "**Theorem.** Assume that\n", + "- sublevel set $S = \\{(x, \\mu) \\; | \\; x \\in D(f), \\; \\| r(x, \\mu) \\|_2 \\leq \\| r(x_0, \\mu_0) \\|_2 \\}$ is closed\n", + "- in the set $S$ norm of the inverse KKT matrix is bounded\n", + "- hessian is Lipschitz in $S$.\n", + "\n", + "Then convergence is \n", + "- linear far from the solution and\n", + "- quadratic in a sufficiently small neighbourhood of the solution." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## General convex optimization problem\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min_{x \\in \\mathbb{R}^n} f_0(x)\\\\\n", + "\\text{s.t. } & f_i (x) \\leq 0 \\qquad i=1,\\ldots,m\\\\\n", + "& Ax = b,\n", + "\\end{split}\n", + "\\end{equation*}\n", + "where $f_i$ are convex and twice continuously differentiable, $A \\in \\mathbb{R}^{p \\times n}$ and $\\mathrm{rank} \\; A = p < n$. \n", + "\n", + "Assume that the problem is strictly feasible, i.e. Slater's condition is satisfied." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Optimality conditions\n", + "\n", + "- Primal feasibility\n", + "\n", + "$$\n", + "Ax^* = b, \\; f_i(x^*) \\leq 0, \\; i = 1,\\ldots,m\n", + "$$\n", + "\n", + "- Dual feasibility\n", + "\n", + "$$\n", + "\\lambda^* \\geq 0\n", + "$$\n", + "\n", + "- Lagrangian stationarity\n", + "\n", + "$$\n", + "f'_0(x^*) + \\sum_{i=1}^m \\lambda^*_if'_i(x^*) + A^{\\top}\\mu^* = 0\n", + "$$\n", + "\n", + "- Complementary slackness condition\n", + "\n", + "$$\n", + "\\lambda^*_i f_i(x^*) = 0, \\qquad i = 1,\\ldots, m\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Idea\n", + "\n", + "- Reduce the problem with **inequality** constraints to sequence of **equality** constrained problems\n", + "- Use methods for solving equality constrained problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min f_0(x) + \\sum_{i=1}^m I_-(f_i(x))\\\\\n", + "\\text{s.t. } & Ax = b,\n", + "\\end{split}\n", + "\\end{equation*}\n", + "where $I_-$ is an indicator function\n", + "\n", + "$$\n", + "I_-(u) = \n", + "\\begin{cases}\n", + "0, & u \\leq 0\\\\\n", + "\\infty, & u > 0\n", + "\\end{cases}\n", + "$$\n", + "\n", + "**Issue.** Now objective function **is not differentiable**." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Logarithmic barrier\n", + "\n", + "**Idea.** Approximate function $I_-(u)$ with function\n", + "\n", + "$$\n", + "\\hat{I}_-(u) = -t\\log(-u),\n", + "$$\n", + "\n", + "where $t > 0$ is a fixed parameter.\n", + "\n", + "- Functions $I_-(u)$ and $\\hat{I}_-(u)$ are convex and non-decreasing\n", + "- But $\\hat{I}_-(u)$ is **differentiable** and approximates $I_-(u)$ as $t \\to 0$" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%matplotlib inline\n", + "import matplotlib.pyplot as plt\n", + "plt.rc(\"text\", usetex=True)\n", + "import numpy as np\n", + "\n", + "x = np.linspace(-2, 0, 100000, endpoint=False)\n", + "plt.figure(figsize=(10, 6))\n", + "for t in [0.1, 0.5, 1, 1.5, 2]:\n", + " plt.plot(x, -t * np.log(-x), label=r\"$t = \" + str(t) + \"$\")\n", + "plt.legend(fontsize=20)\n", + "plt.xticks(fontsize=20)\n", + "plt.yticks(fontsize=20)\n", + "_ = plt.xlabel(\"$u$\", fontsize=26)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### \"Constrained\" problem\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min f_0(x) + \\sum_{i=1}^m -t \\log(-f_i(x))\\\\\n", + "\\text{s.t. } & Ax = b,\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "- The problem is still **convex**\n", + "- Function \n", + "\n", + "$$\n", + "\\phi(x) = -\\sum\\limits_{i=1}^m \\log(-f_i(x))\n", + "$$ \n", + "\n", + "is called *logarithmic barrier*. 
\n", + "\n", + "Its domain is the set of points at which the inequality constraints are strictly satisfied.\n", + "\n", + "**Exercise.** Find gradient and hessian of $\\phi(x)$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Central path\n", + "\n", + "For every $t > 0$ \"constrained\" problem has unique solution $x^*(t)$.\n", + "\n", + "**Definition.** The set of points $x^*(t)$ for $t > 0$ forms the *central path*." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Optimality conditions for \"constrained\" problem\n", + "\n", + "- Primal feasibility\n", + "\n", + "$$\n", + "Ax^*(t) = b, \\; f_i(x^*(t)) < 0, \\; i = 1,\\ldots,m\n", + "$$\n", + "\n", + "- Lagrangian stationarity\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& f'_0(x^*(t)) + t\\phi'(x^*(t)) + A^{\\top}\\hat{\\mu} = \\\\\n", + "& = f'_0(x^*(t)) - t\\sum_{i=1}^m \\frac{f_i'(x^*(t))}{f_i(x^*(t))} + A^{\\top}\\hat{\\mu} = 0\n", + "\\end{split}\n", + "\\end{equation*}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "- Introduce the following notation\n", + "\n", + "$$\n", + "\\lambda^*_i(t) = -\\frac{t}{f_i(x^*(t))} \\; i=1,\\ldots,m \\text{ and } \\mu^* = \\hat{\\mu}\n", + "$$\n", + "\n", + "- Then optimality condition can be re-written as\n", + "\n", + "$$\n", + "f'_0(x^*(t)) + \\sum_{i=1}^m \\lambda^*_i(t)f_i'(x^*(t)) + A^{\\top}\\mu^* = 0\n", + "$$\n", + "\n", + "- Then $x^*(t)$ is minimizer of the following Lagrangian\n", + "\n", + "$$\n", + "L = f_0(x) + \\sum_{i=1}^m \\lambda_if_i(x) + \\mu^{\\top}(Ax - b)\n", + "$$\n", + "\n", + "where $\\lambda = \\lambda^*(t)$ and $\\mu = \\mu^*$." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Duality gap\n", + "\n", + "- Dual function $g(\\lambda^*(t), \\mu^*)$ is finite and is represented in the following way\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "g(\\lambda^*(t), \\mu^*) & = f_0(x^*(t)) + \\sum_{i=1}^m \\lambda^*_i(t)f_i(x^*(t)) + (\\mu^*)^{\\top}(Ax^*(t) - b)\\\\\n", + "& = f_0(x^*(t)) - mt\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "- Duality gap\n", + "\n", + "$$\n", + "f_0(x^*(t)) - p^* \\leq mt\n", + "$$\n", + "\n", + "- As $t \\to 0$ the duality gap tends to 0 and the central path converges to the solution of the original problem" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## KKT interpretation\n", + "\n", + "Optimality conditions for \"constrained\" problem are equivalent to optimality conditions for original problem if the complementary slackness condition is relaxed\n", + "\n", + "$$\n", + "-\\lambda_i f_i(x) = 0 \\Rightarrow - \\lambda_i f_i(x) = t \\quad i = 1,\\ldots, m\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Physical interpretation\n", + "\n", + "- Assume that we do not have equality constraints \n", + "- Consider a classical particle in a field of forces\n", + "- Every constraint $f_i(x) \\leq 0$ corresponds to the following force\n", + "\n", + "$$\n", + "F_i(x) = -\\nabla(-\\log(-f_i(x))) = \\frac{f'_i(x)}{f_i(x)}\n", + "$$\n", + "\n", + "- Objective function corresponds to some force, too\n", + "\n", + "$$\n", + "F_0(x) = -\\frac{f'_0(x)}{t}\n", + "$$\n", + "\n", + "- Every point from the central path $x^*(t)$ is an equilibrium state of the particle where the sum of forces is zero\n", + "- As $t$ decreases, force $F_0(x)$ dominates forces $F_i(x)$ and the particle tends to the state corresponding to the optimal value of the objective\n", + "- Since forces $F_i(x)$ go to 
infinity when the particle is close to the boundary, the particle never leaves the feasible region" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Barrier method\n", + "\n", + "- $x_0$ has to be feasible\n", + "- $t_0 > 0$ is initial value of parameter\n", + "- $\\alpha \\in (0, 1)$ is multiplier for decreasing $t$\n", + "\n", + "```python\n", + "def BarrierMethod(f, x0, t0, m, tol, alpha, **kwargs):\n", + " \n", + " x = x0\n", + " \n", + " t = t0\n", + " \n", + " while True:\n", + " \n", + " x = SolveBarrierProblem(f, t, x, **kwargs)\n", + " \n", + " if m * t < tol:\n", + " \n", + " break\n", + " \n", + " t *= alpha\n", + " \n", + " return x\n", + "\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Parameters selection\n", + "\n", + "- Multiplier $\\alpha$\n", + " - If $\\alpha \\sim 1$, every \"constrained\" problem is solved in a **small** number of iterations, but the central path consists of a **large** number of points\n", + " - If $\\alpha \\sim 10^{-5}$, the situation is the opposite: a **large** number of iterations for solving every \"constrained\" problem, but a **small** number of points on the central path\n", + "- Initialization $t_0$\n", + " - The trade-off is similar to the one for parameter $\\alpha$\n", + " - Parameter $t_0$ determines the first point on the central path" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Almost convergence theorem\n", + "\n", + "- As shown above, as $t \\to 0$ the barrier method converges to the solution of the original problem\n", + "- Convergence speed is directly affected by parameters $\\alpha$ and $t_0$\n", + "- The main difficulty is solving the auxiliary problems fast with Newton method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + 
"slide_type": "slide" + } + }, + "source": [ + "## Problem of finding feasible starting point\n", + "\n", + "- Barier method requires feasible starting point\n", + "- Method is splitted in two phases\n", + " - The first phase gives feasible starting point\n", + " - The second phase uses this starting point to run barier method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### The first phase method\n", + "\n", + "Simple method to find strictly feasible point\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min s\\\\\n", + "\\text{s.t. } & f_i(x) \\leq s\\\\\n", + "& Ax = b\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "- this problem always has strictly feasible starting point\n", + "- if $s^* < 0$, then $x^*$ is strictly feasible and can be used in barier method\n", + "- if $s^* > 0$, then original problem is infeasible and feasible set is empty" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Sum of inconsistencies\n", + "\n", + "\\begin{equation*}\n", + "\\begin{split}\n", + "& \\min s_1 + \\ldots + s_m\\\\\n", + "\\text{s.t. } & f_i(x) \\leq s_i\\\\\n", + "& Ax = b\\\\\n", + "& s \\geq 0\n", + "\\end{split}\n", + "\\end{equation*}\n", + "\n", + "- optimal objective value is 0 and it attains in the case of consistency of inequality constraints\n", + "- if the problem is infeasible, it is possible to identify what constraints are violated, i.e. 
$s_i > 0$ " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Second phase\n", + "\n", + "- After starting point $x_0$ is found, one can run the barrier method, solving every \"constrained\" problem with the standard Newton method for equality constrained problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Primal-dual method\n", + "\n", + "It is similar to the barrier method, but\n", + "- every iteration updates both primal and dual variables\n", + "- Newton direction is taken from the modified KKT system\n", + "- iterates in primal-dual method may be infeasible\n", + "- it works even if the problem is not strictly feasible" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Recap\n", + "\n", + "- Newton method for convex optimization problem with equality constraint\n", + "- The case of infeasible initial point\n", + "- Primal barrier method\n", + "- Primal-dual method" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 3 (cvxpy)", + "language": "python", + "name": "cvxpy" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.4" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}