Predictor-Corrector Primal-Dual Interior-Point Method for QP
From textbook to solver: reproducing the CVXGEN paper.
To set the stage, we recall the predictor–corrector viewpoint from §11.8 of Boyd & Vandenberghe's Convex Optimization.
Consider a family of barrier problems parameterized by $t>0$, \(x^{\star}(t) = \arg\min_x \; t f_0(x) + \phi(x),\) whose central path is characterized by the first-order condition \(F(x,t) := t\nabla f_0(x) + \nabla\phi(x) = 0,\quad F(x^{\star}(t),t)=0.\) Differentiating this identity with respect to $t$ and using the implicit function theorem gives \[\frac{dx^{\star}(t)}{dt} = -\bigl(t\nabla^2 f_0(x^{\star}(t)) + \nabla^2\phi(x^{\star}(t))\bigr)^{-1}\nabla f_0(x^{\star}(t)).\]
A first-order predictor of the point $x^{\star}(\mu t)$ on the central path is then
\[\hat x = x^{\star}(t) + \frac{dx^{\star}(t)}{dt}\,(\mu t - t),\]and a subsequent Newton step from $\hat x$ plays the role of the corrector.
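As a concrete check of the predictor, consider the scalar problem $\min_x x^2$ over $x>0$ with barrier $\phi(x) = -\log x$: here the central path has the closed form $x^\star(t) = 1/\sqrt{2t}$, so the tangent predictor can be compared against the exact path. A minimal NumPy sketch (the helper names `x_star` and `dx_dt` are ours, not from the project):

```python
import numpy as np

# Central path of: minimize x^2 over x > 0, barrier phi(x) = -log x.
# Optimality condition 2 t x - 1/x = 0 gives x*(t) = 1/sqrt(2 t).
def x_star(t):
    return 1.0 / np.sqrt(2.0 * t)

# Implicit-function-theorem derivative along the path:
# dx*/dt = -grad f0 / (t hess f0 + hess phi), evaluated at x*(t).
def dx_dt(t):
    x = x_star(t)
    return -2.0 * x / (2.0 * t + 1.0 / x**2)

t, mu = 1.0, 1.2
x_hat = x_star(t) + dx_dt(t) * (mu * t - t)  # first-order predictor of x*(mu*t)
```

For $t=1$, $\mu=1.2$ the predictor lands within about 2% of the true point $x^{\star}(\mu t)$; a Newton (corrector) step would then close the remaining gap.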
The primal–dual method implemented in the CVXGEN paper applies the same predictor–corrector idea to the KKT residual, working on the full primal–dual central path.
Problem formulation
This project solves QPs of the form:
\begin{align} \underset{x}{\textrm{min}} \,\, & \frac{1}{2}x^TQx + q^Tx \nonumber\\ \textrm{s.t.} \,\,& Ax=b \nonumber\\ & Gx \leq h \nonumber
\end{align}
where $x \in \mathbb{R}^n$, $Q \in \mathbf{S}^n_+$, $q \in \mathbb{R}^n$, $A \in \mathbb{R}^{n_\text{eq} \times n}$, $b \in \mathbb{R}^{n_\text{eq}}$, $G \in \mathbb{R}^{n_\text{ineq} \times n}$ and $h \in \mathbb{R}^{n_\text{ineq}}$. Writing $p = n_{\text{ineq}}$, the slack variable and inequality multiplier introduced below satisfy $s,\lambda \in \mathbb{R}^p$.
We consider the barrier subproblem for the QP with slack $s$:
\[\begin{aligned} \min_{x,s}\quad & \frac12 x^\top Q x + q^\top x - \frac{1}{t}\sum_{i=1}^p \log s_i\\ \text{s.t.}\quad & Gx + s = h,\\ & Ax = b, \end{aligned}\]The modified KKT conditions can be written as the residual vector
\[r_t(x,s,\lambda,\nu) = \begin{bmatrix} r_{\mathrm{dual}}\\ r_{\mathrm{cent}}\\ r_{\mathrm{pri,ineq}}\\ r_{\mathrm{pri,eq}} \end{bmatrix} := \begin{bmatrix} Qx + q + G^\top\lambda + A^\top\nu\\ \operatorname{diag}(s)\lambda - \tfrac{1}{t}\mathbf 1\\ Gx + s - h\\ Ax - b \end{bmatrix}.\]
Linearization
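The residual blocks translate directly into NumPy. The following sketch is illustrative only; `kkt_residual` is a hypothetical helper name, not code from the project:

```python
import numpy as np

# Stacked residual r_t at a point with s > 0, lam > 0;
# the four blocks mirror the equation above.
def kkt_residual(Q, q, A, b, G, h, x, s, lam, nu, t):
    r_dual = Q @ x + q + G.T @ lam + A.T @ nu
    r_cent = s * lam - 1.0 / t          # elementwise s_i * lam_i - 1/t
    r_pri_ineq = G @ x + s - h
    r_pri_eq = A @ x - b
    return np.concatenate([r_dual, r_cent, r_pri_ineq, r_pri_eq])
```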
Let the stacked variable be $ w := [\,x,\; s,\; \lambda,\; \nu\,]^{\top},\quad \Delta w := [\,\Delta x,\; \Delta s,\; \Delta\lambda,\; \Delta\nu\,]^{\top} $.
A Newton step at $(x,s,\lambda,\nu)$ solves $ D r_t(w)\,\Delta w = - r_t(w).$
Computing the Jacobian of each block gives the linear system
\[\begin{bmatrix} Q & 0 & G^\top & A^\top\\ 0 & \operatorname{diag}(\lambda) & \operatorname{diag}(s) & 0\\ G & I & 0 & 0\\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta s\\ \Delta\lambda\\ \Delta\nu \end{bmatrix} = - \begin{bmatrix} r_{\mathrm{dual}}\\ r_{\mathrm{cent}}\\ r_{\mathrm{pri,ineq}}\\ r_{\mathrm{pri,eq}} \end{bmatrix}.\]This is the basic Newton system used by the primal–dual method; the different “steps” come only from choosing different right–hand sides for the centering block.
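The block Jacobian can be assembled densely with `np.block`; a real solver (such as the code CVXGEN generates) would exploit sparsity instead, so this helper, with our hypothetical name `kkt_jacobian`, is purely illustrative:

```python
import numpy as np

# Dense assembly of the Newton-system Jacobian; it depends on the
# iterate only through s and lam.
def kkt_jacobian(Q, A, G, s, lam):
    n, p, m = Q.shape[0], G.shape[0], A.shape[0]
    return np.block([
        [Q,                np.zeros((n, p)), G.T,              A.T],
        [np.zeros((p, n)), np.diag(lam),     np.diag(s),       np.zeros((p, m))],
        [G,                np.eye(p),        np.zeros((p, p)), np.zeros((p, m))],
        [A,                np.zeros((m, p)), np.zeros((m, p)), np.zeros((m, m))],
    ])
```

A Newton step is then `np.linalg.solve(K, -r)` for whichever right-hand side `r` the current step requires.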
Affine step
The affine direction ignores the $\tfrac{1}{t}$ term and pretends the complementarity target is $s_i\lambda_i \approx 0$.
Equivalently, the centering residual is replaced by $r_{\mathrm{cent}}^{\mathrm{aff}} := \operatorname{diag}(s)\lambda.$
The affine step $\Delta w_{\mathrm{aff}}$ solves
\[\begin{bmatrix} Q & 0 & G^\top & A^\top\\ 0 & \operatorname{diag}(\lambda) & \operatorname{diag}(s) & 0\\ G & I & 0 & 0\\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta x_{\mathrm{aff}}\\ \Delta s_{\mathrm{aff}}\\ \Delta\lambda_{\mathrm{aff}}\\ \Delta\nu_{\mathrm{aff}} \end{bmatrix} = - \begin{bmatrix} r_{\mathrm{dual}}\\ r_{\mathrm{cent}}^{\mathrm{aff}}\\ r_{\mathrm{pri,ineq}}\\ r_{\mathrm{pri,eq}} \end{bmatrix} = - \begin{bmatrix} r_{\mathrm{dual}}\\ \operatorname{diag}(s)\lambda\\ r_{\mathrm{pri,ineq}}\\ r_{\mathrm{pri,eq}} \end{bmatrix}.\]From this predictor one computes the affine complementarity
\[\mu_{\mathrm{aff}} := \frac{1}{p}\,(s + \alpha_{\mathrm{aff}}\Delta s_{\mathrm{aff}})^\top (\lambda + \alpha_{\mathrm{aff}}\Delta\lambda_{\mathrm{aff}}),\]where $\alpha_{\mathrm{aff}}$ is the largest step in $(0,1]$ keeping $s + \alpha\Delta s_{\mathrm{aff}} \ge 0$ and $\lambda + \alpha\Delta\lambda_{\mathrm{aff}} \ge 0$. One then chooses a centering parameter $\sigma\in(0,1)$ (e.g. $\sigma = (\mu_{\mathrm{aff}}/\mu)^3$, with $\mu = \tfrac{1}{p}s^\top\lambda$) and forms the centering–plus–corrector step.
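The fraction-to-the-boundary computation and the resulting $\sigma$ can be sketched as follows (toy numbers; `step_to_boundary` is our name for the helper):

```python
import numpy as np

def step_to_boundary(v, dv):
    # Largest alpha in (0, 1] with v + alpha*dv >= 0 componentwise.
    neg = dv < 0
    return min(1.0, float((-v[neg] / dv[neg]).min())) if neg.any() else 1.0

# Toy iterate and affine direction (illustrative values only).
s, lam = np.array([1.0, 2.0]), np.array([1.0, 0.5])
ds_aff, dl_aff = np.array([-0.9, 0.1]), np.array([-0.5, -0.2])

a_aff = min(step_to_boundary(s, ds_aff), step_to_boundary(lam, dl_aff))
p = len(s)
mu = s @ lam / p
mu_aff = (s + a_aff * ds_aff) @ (lam + a_aff * dl_aff) / p
sigma = (mu_aff / mu) ** 3   # Mehrotra's heuristic
```

When the affine step makes good progress ($\mu_{\mathrm{aff}} \ll \mu$), $\sigma$ is small and the next step is mostly a predictor; otherwise $\sigma$ is close to 1 and the step mostly re-centers.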
Centering–plus–corrector step
We would like the updated iterates to satisfy the approximate complementarity
\[(s_i + \Delta s_i)(\lambda_i + \Delta\lambda_i) \approx \sigma\mu,\qquad i=1,\dots,p.\]Expanding and using the affine step to approximate the quadratic term, $\Delta s_i\Delta\lambda_i \approx \Delta s_{i,\mathrm{aff}}\Delta\lambda_{i,\mathrm{aff}}$, gives
\[\operatorname{diag}(\lambda)\Delta s + \operatorname{diag}(s)\Delta\lambda = \sigma\mu\,\mathbf 1 - \operatorname{diag}(s)\lambda - \operatorname{diag}(\Delta s_{\mathrm{aff}})\,\Delta\lambda_{\mathrm{aff}}.\]To encode this in the same Newton system, we keep the Jacobian unchanged and simply replace the centering residual in the right–hand side by
\[r_{\mathrm{cent}}^{\mathrm{cc}} := \operatorname{diag}(s)\lambda - \sigma\mu\,\mathbf 1 + \operatorname{diag}(\Delta s_{\mathrm{aff}})\,\Delta\lambda_{\mathrm{aff}}.\]Thus the centering–plus–corrector step $\Delta w$ is obtained by solving
\[\begin{bmatrix} Q & 0 & G^\top & A^\top\\ 0 & \operatorname{diag}(\lambda) & \operatorname{diag}(s) & 0\\ G & I & 0 & 0\\ A & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta s\\ \Delta\lambda\\ \Delta\nu \end{bmatrix} = - \begin{bmatrix} r_{\mathrm{dual}}\\ r_{\mathrm{cent}}^{\mathrm{cc}}\\ r_{\mathrm{pri,ineq}}\\ r_{\mathrm{pri,eq}} \end{bmatrix}.\]Both the affine step and the cc step use the same Jacobian of the modified KKT conditions; only the right-hand side of the centering block changes, reflecting a pure predictor target ($s_i\lambda_i\approx 0$) in the affine step and a centered predictor–corrector target ($s_i\lambda_i\approx\sigma\mu$, with the quadratic correction term) in the cc step.
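A quick numeric sanity check (toy data, our variable names) confirms what the cc right-hand side buys: any $(\Delta s, \Delta\lambda)$ satisfying the linearized centering equation misses the target $\sigma\mu$ by exactly $\Delta s_i\Delta\lambda_i - \Delta s_{i,\mathrm{aff}}\Delta\lambda_{i,\mathrm{aff}}$, i.e. only by the error of the quadratic-term approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
s = rng.uniform(0.5, 2.0, p)
lam = rng.uniform(0.5, 2.0, p)
ds_aff = rng.normal(0.0, 0.05, p)   # stand-in affine direction
dl_aff = rng.normal(0.0, 0.05, p)
mu, sigma = s @ lam / p, 0.2

# Pick an arbitrary ds, then solve the linearized centering equation
#   lam*ds + s*dl = sigma*mu - s*lam - ds_aff*dl_aff   for dl.
ds = rng.normal(0.0, 0.05, p)
dl = (sigma * mu - s * lam - ds_aff * dl_aff - lam * ds) / s

gap = (s + ds) * (lam + dl) - sigma * mu   # deviation from the target
second_order = ds * dl - ds_aff * dl_aff   # exactly the neglected term
assert np.allclose(gap, second_order)
```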
Line search
The line search must explicitly ensure strict positivity, \(s^+ = s + \alpha \Delta s \succ 0, \qquad \lambda^+ = \lambda + \alpha \Delta\lambda \succ 0,\) and sufficiently decrease the residual norm $\|r_t(x,s,\lambda,\nu)\|$. The three textbook criteria reduce to
\[\alpha_{\max} = \sup\{\alpha \ge 0 \mid s + \alpha\Delta s \ge 0,\; \lambda + \alpha\Delta\lambda \ge 0\}, \qquad \alpha = \min\{1,\; 0.99\,\alpha_{\max}\}.\]After initialization, the code therefore runs the following loop:
1. Evaluate the stopping criteria.
2. Compute the affine (predictor) step.
3. Compute the centering-plus-corrector step, combining the centering target with the correction from the affine step.
4. Line search.
5. Update the iterate.
In code, the centering parameter is $\sigma = \left( \frac{(\bar{s} + \alpha_{\mathrm{aff}} \Delta s^{\mathrm{aff}} )^\top (\bar{\lambda} + \alpha_{\mathrm{aff}} \Delta \lambda^{\mathrm{aff}} )}{\bar{s}^\top\bar{\lambda}} \right)^{3}$, which equals $(\mu_{\mathrm{aff}}/\mu)^3$ since the factors of $1/p$ cancel. A detailed explanation of the implementation can be found in the CVXGEN paper.
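Putting the pieces together, the whole loop fits in a short NumPy sketch. This is an illustrative dense implementation of the algorithm described above (names such as `solve_qp` are ours), not the generated CVXGEN code, and the residual-decrease backtracking is omitted for brevity:

```python
import numpy as np

def solve_qp(Q, q, A, b, G, h, max_iter=50, tol=1e-8):
    """Predictor-corrector primal-dual IPM sketch for
    minimize (1/2) x'Qx + q'x  s.t.  Ax = b,  Gx <= h."""
    n, p, m = Q.shape[0], G.shape[0], A.shape[0]
    x, nu = np.zeros(n), np.zeros(m)
    s, lam = np.ones(p), np.ones(p)               # strictly positive start

    def residual(r_cent):
        # Stacked KKT residual with a given centering block.
        return np.concatenate([Q @ x + q + G.T @ lam + A.T @ nu,
                               r_cent, G @ x + s - h, A @ x - b])

    def step_to_boundary(v, dv):
        # Largest alpha in (0, 1] with v + alpha*dv >= 0 componentwise.
        neg = dv < 0
        return min(1.0, float((-v[neg] / dv[neg]).min())) if neg.any() else 1.0

    for _ in range(max_iter):
        # 1. Stopping criterion: full residual with target s_i*lam_i = 0.
        if np.linalg.norm(residual(s * lam)) < tol:
            break
        mu = s @ lam / p
        K = np.block([                            # shared Jacobian
            [Q,                np.zeros((n, p)), G.T,              A.T],
            [np.zeros((p, n)), np.diag(lam),     np.diag(s),       np.zeros((p, m))],
            [G,                np.eye(p),        np.zeros((p, p)), np.zeros((p, m))],
            [A,                np.zeros((m, p)), np.zeros((m, p)), np.zeros((m, m))]])
        # 2. Affine (predictor) step.
        d = np.linalg.solve(K, -residual(s * lam))
        ds_a, dl_a = d[n:n+p], d[n+p:n+2*p]
        a_aff = min(step_to_boundary(s, ds_a), step_to_boundary(lam, dl_a))
        mu_aff = (s + a_aff * ds_a) @ (lam + a_aff * dl_a) / p
        sigma = (mu_aff / mu) ** 3
        # 3. Centering-plus-corrector step: same K, new centering RHS.
        d = np.linalg.solve(K, -residual(s * lam - sigma * mu + ds_a * dl_a))
        dx, ds, dl, dn = d[:n], d[n:n+p], d[n+p:n+2*p], d[n+2*p:]
        # 4. Line search: fraction-to-the-boundary rule.
        alpha = 0.99 * min(step_to_boundary(s, ds), step_to_boundary(lam, dl))
        # 5. Update.
        x, s, lam, nu = x + alpha*dx, s + alpha*ds, lam + alpha*dl, nu + alpha*dn
    return x
```

On the toy QP $\min \tfrac12\|x\|^2$ s.t. $x_1+x_2=1$, $x_1\le 0.3$, this sketch converges to $x\approx(0.3,\,0.7)$, the solution given by the KKT conditions.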
Results for a simple QP
References
- Boyd, S. & Vandenberghe, L. Convex Optimization. Cambridge University Press, 2004.
- Mattingley, J. & Boyd, S. CVXGEN: a code generator for embedded convex optimization. Optimization and Engineering, vol. 13, no. 1, pp. 1–27, 2012.