
Is there a fresher reference for how HSD implementation currently works? #148

Closed
pratyai opened this issue Aug 13, 2024 · 2 comments

@pratyai
Contributor

pratyai commented Aug 13, 2024

Currently, the reference paper for Tulip is this, AFAIK. However, I see some differences from the actual implementation, as follows:

  • The number of times the function KKT.solve!() is called differs significantly from the paper. There is a "preliminary" solve even before the predictor step, which computes the quantity h0 (this is not explained anywhere, and I'm trying to work out its contribution).
  • Also, KKT.solve!() is called only once in each predictor or corrector step, whereas the paper says the augmented system needs to be solved twice for each Newton step (predictor or corrector).

I looked a bit through the commit history, and the commit messages are not particularly insightful on this matter either. I imagine these differences were introduced organically during the development of Tulip.

Is there any paper, documentation, or publicly accessible communication that explains these differences, or how the current implementation is formulated?

Thanks!

@mtanneau
Member

Thanks for using Tulip and for reaching out about this :)

In what follows, all equation numbers refer to the arxiv preprint.

  • At the mathematical level, the implementation follows the formulation in the paper (Eq. (1)), except that the standard form used in the code considers explicit lower and upper bounds on variables (see this part of the docs).
  • The overall HSD algorithm is presented in Section 3.3, and the computation of the search direction is in Section 3.3.2.
    It consists of solving several Newton systems, which all have the same left-hand side matrix:
    • affine scaling direction: (28)-(32)
    • centrality correction: (35)-(39)
    • additional centrality corrections: (45)-(49)
  • The discrepancy in the number of calls to KKT.solve! you're seeing comes from how the Newton systems are solved under the hood. This part is detailed in Section 4. Namely, we reduce the Newton system to a so-called augmented system (63).
    • To solve system (63), we solve two smaller systems (64) and (65).
    • As is mentioned in the paper just after Eq. (68), the system (64) is the same for all Newton systems, hence it only needs to be solved once per IPM iteration. This is what happens here.
      The quantities hx, hy in the code correspond to p, q in the paper's Eq. (64), and h0 corresponds to the denominator of Eq. (66).
    • Afterwards, we compute the affine-scaling direction here, the centrality correction here, and additional centrality corrections here.
      Each of these calls solve_newton_system! once, and you'll note that this function is passed hx, hy, h0 as arguments, which were obtained from the solution of (64).
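The solve-count accounting above can be illustrated with a minimal numerical sketch (this is not Tulip's actual code; the matrix `K`, the function `kkt_solve`, and the combination step are hypothetical stand-ins): the system analogous to (64) has a right-hand side that is fixed for the whole IPM iteration, so it is solved once up front, and each search direction then costs only a single additional solve.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)  # stand-in for the shared augmented-system matrix

solve_calls = 0

def kkt_solve(rhs):
    """Stand-in for KKT.solve!: one linear solve with the shared matrix K."""
    global solve_calls
    solve_calls += 1
    return np.linalg.solve(K, rhs)

# "Preliminary" solve: the right-hand side of the (64)-analogue is fixed for
# the whole IPM iteration, so its solution (the analogue of hx, hy, h0) is
# computed once and cached.
b_fixed = rng.standard_normal(n)
h = kkt_solve(b_fixed)

# The affine-scaling direction, centrality correction, and additional
# corrections each need only ONE further solve; they reuse the cached h
# instead of re-solving the fixed system.
directions = []
for _ in range(3):
    rhs = rng.standard_normal(n)
    v = kkt_solve(rhs)
    directions.append(v + 0.5 * h)  # placeholder combination; the real one is Eqs. (65)-(66)

# 1 preliminary solve + 3 direction solves, rather than 2 solves per direction.
assert solve_calls == 4
```

The point of the sketch is only the call count: with the fixed system cached, an iteration with k direction computations performs k + 1 solves instead of 2k.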

The rationale behind most of this was that 1) the code was structured around a KKT module that solves augmented systems, and 2) I tried to minimize the number of redundant linear system solves.

@pratyai
Contributor Author

pratyai commented Aug 13, 2024

Thank you so much!
Indeed, the number of calls to KKT.solve! is quite clear after your explanation. I had missed the fact that (64) is fixed across all the steps, and that the preliminary solve computes exactly that. The hx, hy, h0 variables are also clear now.

@pratyai pratyai closed this as completed Aug 13, 2024