Description
This PR adds an implementation of Global and Local Optimization Policies (GLOP), together with an implementation of the Shortest Hamiltonian Path Problem (SHPP) environment.
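For reviewers less familiar with SHPP, the snippet below is a minimal sketch of the objective, not the environment code in this PR. It assumes 2D Euclidean coordinates and that the path must start at the first node and end at the last node; the helper names `path_length` and `brute_force_shpp` are hypothetical and exist only for illustration.

```python
import math
from itertools import permutations


def path_length(coords, order):
    """Length of the open path visiting `coords` in `order` (no edge back to the start)."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(order, order[1:]))


def brute_force_shpp(coords):
    """Exhaustively search for the shortest Hamiltonian path.

    Assumption for this sketch: the path is pinned to start at node 0
    and end at node n - 1; only the inner nodes are permuted.
    """
    n = len(coords)
    candidates = ((0, *inner, n - 1) for inner in permutations(range(1, n - 1)))
    best = min(candidates, key=lambda order: path_length(coords, order))
    return list(best), path_length(coords, best)


coords = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
order, length = brute_force_shpp(coords)
```

Unlike a TSP tour, the cost here omits the closing edge, so a learned policy is compared against an open-path length. Brute force is only feasible for a handful of nodes and is meant as a reference check on tiny instances.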
Motivation and Context
GLOP is an important non-autoregressive (NAR) model for routing problems. For more details, please refer to the original paper.
Types of changes
Checklist
The current implementation of GLOP runs, but it does not learn yet.
I added a test notebook at examples/other/3-glop.ipynb. It includes a test of the SHPP environment, a greedy rollout of the untrained GLOP policy (with visualization for easier understanding), and a training launch for GLOP. Please play with it and have a look.
The following components of the original GLOP are not implemented yet:
I will add these missing parts soon. Here are some ideas that may help reproduce the results:
@henry-yeh @Furffico, if you have time, could you take a look at the implementation? We need to closely reproduce GLOP's results soon.