This research project aims to develop an autonomous racing vehicle that navigates complex racetracks using deep reinforcement learning. The primary objectives are to maintain optimal track positioning, stay within the track boundaries, and maximize performance metrics.
The project employs Proximal Policy Optimization (PPO), a widely used deep reinforcement learning algorithm, to train the autonomous agent. PPO optimizes a surrogate objective function while constraining each policy update so that the new policy does not move too far from the previous one.
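For reference, the most common form of PPO's constrained surrogate is the clipped objective, written out below. This is the standard textbook formulation, given here on the assumption that the project uses the clipped variant rather than the KL-penalty variant. The symbols (probability ratio r_t, advantage estimate \hat{A}_t, clip range \epsilon) are the conventional ones and are not taken from this project's implementation.

```latex
\[
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\!\left[
      \min\!\Big(
        r_t(\theta)\,\hat{A}_t,\;
        \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t
      \Big)
    \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
\]
```

Clipping the ratio r_t to the interval [1 - \epsilon, 1 + \epsilon] removes the incentive to move the policy far from the one that collected the data, which is the "constrained update" described above.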
REWARD FUNCTION DESIGN
The reward function, a critical component in reinforcement learning, has been carefully designed to encourage desired behaviors in the autonomous vehicle. Three variants of the reward function have been implemented and evaluated:
Proximity-Based Reward: This function incentivizes the vehicle to maintain proximity to the track centerline, promoting precise navigation.
Progress-Based Reward: This function rewards the vehicle for forward progression along the track, encouraging efficient completion of laps.
Combined Reward: This function integrates both proximity and progress metrics to optimize the balance between precise positioning and forward momentum.
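The three variants can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual reward code: the VehicleState fields, the linear falloff from the centerline, the off-track floor value, and the equal weights in the combined reward are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    """Hypothetical per-step observation; all field names are assumptions."""
    distance_from_center: float  # metres from the centerline, >= 0
    track_width: float           # full track width in metres
    progress: float              # fraction of the lap completed, 0.0 to 1.0
    prev_progress: float         # lap fraction at the previous step
    on_track: bool               # False once the vehicle leaves the track


# Small floor reward when off track; the value is an arbitrary illustrative choice.
OFF_TRACK_REWARD = 1e-3


def proximity_reward(s: VehicleState) -> float:
    """1.0 on the centerline, falling linearly to 0.0 at the track edge."""
    if not s.on_track:
        return OFF_TRACK_REWARD
    half_width = s.track_width / 2.0
    return max(0.0, 1.0 - s.distance_from_center / half_width)


def progress_reward(s: VehicleState) -> float:
    """Lap fraction gained since the previous step (never negative)."""
    if not s.on_track:
        return OFF_TRACK_REWARD
    return max(0.0, s.progress - s.prev_progress)


def combined_reward(s: VehicleState, w_prox: float = 0.5, w_prog: float = 0.5) -> float:
    """Weighted sum of the proximity and progress terms; weights are illustrative."""
    if not s.on_track:
        return OFF_TRACK_REWARD
    return w_prox * proximity_reward(s) + w_prog * progress_reward(s)


if __name__ == "__main__":
    state = VehicleState(distance_from_center=0.5, track_width=4.0,
                         progress=0.10, prev_progress=0.08, on_track=True)
    print(proximity_reward(state), progress_reward(state), combined_reward(state))
```

Note that in this sketch the per-step progress term is much smaller in magnitude than the proximity term, so the weights would need tuning (or the progress term rescaling) before the combined reward balances the two behaviors as intended.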