cfdeinza/reinforcement-learning-rendezvous

Information:

This repository contains the MSc thesis work of Carlos F. De Inza Niemeijer for the degree in Aerospace Engineering at TU Delft. The goal of the project is to use reinforcement learning to train a neural-network policy that can perform an autonomous rendezvous with a rotating target.

A custom environment, located in rendezvous_env.py, simulates the rendezvous scenario. It follows the OpenAI Gym interface. The controller is trained with the Proximal Policy Optimization (PPO) algorithm as implemented in Stable-Baselines3.
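The actual dynamics live in rendezvous_env.py; the sketch below only illustrates the Gym-style reset()/step() interface such an environment exposes. The class name, state variables, action semantics, and reward are all assumptions for illustration, not the repository's real model.

```python
import math
import random

class RendezvousSketchEnv:
    """Illustrative stand-in for the custom environment: follows the
    Gym-style reset()/step() contract without importing gym.
    Dynamics and reward here are assumed, not taken from the repo."""

    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        # Chaser starts some distance away; the target spins.
        self.distance = random.uniform(50.0, 100.0)
        self.target_angle = random.uniform(0.0, 2.0 * math.pi)
        self.steps = 0
        return self._obs()

    def _obs(self):
        # Observation: range to target plus the target's attitude.
        return [self.distance,
                math.cos(self.target_angle),
                math.sin(self.target_angle)]

    def step(self, action):
        # action: commanded closing speed, clipped to [0, 1] per step (assumed).
        closing = min(max(action, 0.0), 1.0)
        self.distance = max(self.distance - closing, 0.0)
        self.target_angle = (self.target_angle + 0.05) % (2.0 * math.pi)
        self.steps += 1
        reward = -self.distance  # encourage approach
        done = self.distance == 0.0 or self.steps >= self.max_steps
        return self._obs(), reward, done, {}
```

The four-tuple returned by step() (observation, reward, done, info) is what lets an off-the-shelf algorithm like PPO interact with the environment without knowing anything about the dynamics inside it.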

Training:

Run main.py to train a feedforward policy, or main_rnn.py to train a recurrent policy. The trained policy is saved as a .zip file. If Weights & Biases is enabled, the callback defined in custom/custom_callbacks.py logs training progress.
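The real callback in custom/custom_callbacks.py logs to Weights & Biases; the sketch below only shows the general shape of a Stable-Baselines3-style step callback that accumulates per-episode returns. The class name and the (reward, done) calling convention are simplifying assumptions to keep the example self-contained.

```python
class EpisodeReturnLogger:
    """Minimal stand-in for a training callback. Stable-Baselines3 invokes
    a callback once per environment step; here we accumulate the running
    return and record it when an episode ends."""

    def __init__(self):
        self.current_return = 0.0
        self.episode_returns = []

    def _on_step(self, reward, done):
        # SB3 exposes step data through the callback's internals; passing
        # (reward, done) directly is an assumption made for this sketch.
        self.current_return += reward
        if done:
            self.episode_returns.append(self.current_return)
            self.current_return = 0.0
        return True  # in SB3, returning False stops training early
```

A logging backend such as Weights & Biases would simply be called inside _on_step with the accumulated metrics.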

Evaluation:

Run the monte_carlo.py script to evaluate the trajectories generated by a trained policy over a set of initial conditions. The initial conditions for the Monte Carlo trajectories can be generated as a .csv file by running verification/get_initial_conditions.py. Trajectories can be plotted in various formats with the plot_*.py scripts.
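The evaluation loop above amounts to: read one initial condition per CSV row, roll out the policy from it, and collect a result per trajectory. A minimal sketch of that pattern, with hypothetical column names and a stand-in rollout function (the real column layout comes from verification/get_initial_conditions.py):

```python
import csv
import io

def run_monte_carlo(csv_text, rollout):
    """Evaluate one rollout per initial condition in a CSV.
    `rollout` maps an initial state (dict of floats) to a scalar result,
    e.g. a final miss distance; both are illustrative assumptions."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        state = {name: float(value) for name, value in row.items()}
        results.append(rollout(state))
    return results

# Hypothetical initial conditions with assumed column names.
conditions = "x0,y0,z0\n50,0,0\n0,80,0\n"
finals = run_monte_carlo(conditions, rollout=lambda s: s["x0"])
```

In the real script the rollout would step the trained policy through the environment until termination; here the lambda stands in for that loop.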

Other:

  • The sensitivity_analysis.py and tune_*.py scripts perform a sensitivity analysis and tune the different components of the model.
  • The trained policies can be found in the models directory.
  • All of the results from the tuning, training, and evaluation experiments can be found in the results directory.
