ESPL

PyTorch implementation of Efficient Symbolic Policy Learning with Differentiable Symbolic Expression [paper]

CSP/

The source code of contextual symbolic policy.

Train the contextual symbolic policy with:

python main.py --config configs/xxx.config --spls 0.25 --target_ratio 0.002 --arch_index 0 --hard_epoch 25 --seed 0
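To repeat the run over several seeds, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_csp_command` is hypothetical, the flags and default values are copied from the command above, and `configs/xxx.config` remains a placeholder for your own config file.

```python
import shlex


def build_csp_command(config_path, seed, spls=0.25, target_ratio=0.002,
                      arch_index=0, hard_epoch=25):
    """Return the CSP training command as an argument list.

    Hypothetical helper: flag names and defaults mirror the README command.
    """
    return [
        "python", "main.py",
        "--config", config_path,
        "--spls", str(spls),
        "--target_ratio", str(target_ratio),
        "--arch_index", str(arch_index),
        "--hard_epoch", str(hard_epoch),
        "--seed", str(seed),
    ]


# Print one command per seed; "configs/xxx.config" is the README placeholder.
for seed in range(3):
    print(shlex.join(build_csp_command("configs/xxx.config", seed)))
```

Each printed line can be pasted into a shell from inside CSP/, or the argument list can be passed to `subprocess.run` to launch the runs one after another.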

You can tune the hyper-parameters by editing main.py or by passing command-line arguments (parsed with argparse).

You can also compute the average count of all selected paths, and of paths selected by at least ninety percent, with CSP/mask_matrix.py.
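As a rough illustration of the two statistics, here is a minimal sketch. It does not reproduce CSP/mask_matrix.py. It assumes, purely for illustration, that each sampled context yields a flat list of 0/1 flags (one per path) and that "ninety percent" means a path is selected in at least 90% of the samples. The function name `path_statistics` and the input layout are hypothetical.

```python
def path_statistics(masks, threshold=0.9):
    """Summarise binary path-selection masks.

    masks: list of samples, each a list of 0/1 flags with one entry per path
           (assumed layout, not the repository's actual data format).
    Returns (average number of selected paths per sample,
             number of paths selected in at least `threshold` of the samples).
    """
    n_samples = len(masks)
    n_paths = len(masks[0])
    # Average count of selected paths across samples.
    avg_selected = sum(sum(m) for m in masks) / n_samples
    # Selection frequency of each path across samples.
    freq = [sum(m[j] for m in masks) / n_samples for j in range(n_paths)]
    frequent = sum(1 for f in freq if f >= threshold)
    return avg_selected, frequent


# Toy example with three samples and four paths.
example = [[1, 0, 1, 0],
           [1, 0, 0, 0],
           [1, 1, 1, 0]]
print(path_statistics(example))
```

Under these assumptions, a small average count suggests the network is already sparse, which is the condition required before extracting the symbolic policy.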

Then you can extract the discovered symbolic policy with CSP/sym.py. Make sure the symbolic network is sparse enough before doing so.

ESPL/

The source code for single-task symbolic policy learning.

Train the symbolic policy with:

python -u sac_symbolic_v1.py --env lunar_lander

You can tune the hyper-parameters by editing sac_symbolic_v1.py or by passing command-line arguments (parsed with argparse).

Then you can extract the discovered symbolic policy with ESPL/sym.py. Make sure the symbolic network is sparse enough before doing so.
