Ghulam Murtaza
This project implements a multiagent reinforcement learning system in a transport world, where three agents navigate a grid environment to reach a target location while avoiding obstacles. The agents are trained with Q-learning and epsilon-greedy exploration. Four experiments evaluate agent performance under different learning rates, exploration strategies, and reward structures.
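As a rough illustration of the training loop described above, here is a minimal sketch of tabular Q-learning with epsilon-greedy action selection for a single agent on a small grid. It is not the project's actual code: the function names (`epsilon_greedy`, `q_update`, `step`, `train`), the 4x4 grid, the obstacle and goal positions, and the reward values are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def epsilon_greedy(q, state, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))  # ties resolve to the first best action


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def step(state, action, size, obstacles, goal):
    """Hypothetical grid dynamics: blocked or out-of-bounds moves leave the agent in place."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    if not (0 <= r < size and 0 <= c < size) or (r, c) in obstacles:
        return state, -1.0, False      # assumed penalty for hitting a wall or obstacle
    if (r, c) == goal:
        return (r, c), 10.0, True      # assumed reward for reaching the target
    return (r, c), -0.1, False         # assumed small cost per move


def train(episodes=500, size=4, alpha=0.3, gamma=0.9, epsilon=0.2, seed=0):
    """Train one agent; layout and hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    obstacles, goal = {(1, 1), (2, 2)}, (3, 3)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))  # Q-table: state -> action values
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap on episode length
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action, size, obstacles, goal)
            q_update(q, state, action, reward, next_state, alpha, gamma)
            state = next_state
            if done:
                break
    return q
```

Calling `train()` returns the learned Q-table, and varying `alpha`, `epsilon`, or the rewards in `step` mirrors the kind of comparison the experiments make. One possible way to extend the sketch to three agents would be to give each agent its own Q-table and step them in turn, though the project may organize this differently.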
Q-learning and epsilon-greedy exploration
Multiagent learning in a grid environment
Obstacle avoidance and target location reaching
Four experiments to evaluate agent performance
Visualization of agent paths and Q-tables
Python
Reinforcement learning
Q-learning
Multiagent systems
MIT License