Reinforcement learning (RL) environment that uses SAC or PPO to tune the gains of a feedback controller for Comma AI's controls challenge.
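As a rough illustration of the idea, the sketch below frames gain tuning as a one-step episodic environment: the action is a set of controller gains, the environment rolls out a closed-loop simulation with those gains, and the reward is the negative tracking cost. Everything here is an assumption made for illustration and is not taken from this repository: the PID form of the controller, the first-order toy plant (a stand-in for the challenge's simulator), the names `GainTuningEnv` and `random_search`, and the cost function. A seeded random search stands in for the SAC or PPO agent, which would propose actions in its place.

```python
import random


class GainTuningEnv:
    """Hypothetical one-step environment: one action = one set of PID gains.

    The plant is a toy first-order lag, NOT the challenge's simulator.
    """

    def __init__(self, steps=200, dt=0.1, tau=0.5, target=1.0):
        self.steps = steps      # simulation steps per rollout
        self.dt = dt            # integration time step
        self.tau = tau          # plant time constant (assumed)
        self.target = target    # constant setpoint to track

    def reset(self):
        # Constant placeholder observation: each episode is a single decision.
        return [0.0]

    def step(self, action):
        kp, ki, kd = action
        y = 0.0                 # plant output
        integral = 0.0          # accumulated error for the I term
        prev_err = self.target  # makes the first derivative term zero
        cost = 0.0
        for _ in range(self.steps):
            err = self.target - y
            integral += err * self.dt
            deriv = (err - prev_err) / self.dt
            u = kp * err + ki * integral + kd * deriv
            u = max(-5.0, min(5.0, u))            # actuator saturation (assumed)
            y += self.dt * (u - y) / self.tau     # explicit Euler step of the plant
            prev_err = err
            cost += err * err * self.dt           # integrated squared error
        # Gym-style return: observation, reward, done flag, info dict.
        return [0.0], -cost, True, {}


def random_search(env, trials=20, seed=0):
    """Stand-in for an RL agent: sample gains at random and keep the best."""
    rng = random.Random(seed)
    best_gains, best_reward = None, float("-inf")
    for _ in range(trials):
        gains = (rng.uniform(0.0, 5.0), rng.uniform(0.0, 2.0), rng.uniform(0.0, 0.2))
        env.reset()
        _, reward, _, _ = env.step(gains)
        if reward > best_reward:
            best_gains, best_reward = gains, reward
    return best_gains, best_reward


if __name__ == "__main__":
    env = GainTuningEnv()
    gains, reward = random_search(env)
    print("best gains:", gains, "reward:", reward)
```

Treating the whole rollout as a single step keeps the sketch short. An environment that lets the agent adjust gains at every control step would expose the tracking error as the observation instead of a placeholder.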