
Some thoughts on the wolf_sheep_petting_zoo example #1

Open
EwoutH opened this issue Mar 18, 2024 · 1 comment

Comments


EwoutH commented Mar 18, 2024

This extends from the discussion in projectmesa/mesa#2070.

As requested, some thoughts on the wolf_sheep_petting_zoo example.

Big disclaimer: I'm no expert on reinforcement learning (or machine learning) at all, so I'm going to leave some notes from my experience as a maintainer, developer and modeller.

self.observation_spaces = {a: self.observation_space(a) for a in self.possible_agents}
self.action_spaces = {a: self.action_space(a) for a in self.possible_agents}

Conceptually, I like this. It clearly separates what information each agent has and what each agent can do.

Implementation-wise, I'm curious if we could implement it in the agent. In Mesa we try to keep as much of the agent state and behavior in the agents themselves, so they can be accessed like agent.variable. So maybe there should be a subclass of the Mesa Agent called RLAgent, which has some additional attributes, like an observation_space and an action_space.
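
A minimal sketch of what such a subclass could look like, assuming gymnasium-style spaces and the Mesa 2.x Agent constructor; the RLAgent name and its fields are just a suggestion, not existing API:

import mesa
import gymnasium as gym

class RLAgent(mesa.Agent):
    # Hypothetical base class: each agent carries its own RL-related state.
    def __init__(self, unique_id, model, observation_space: gym.Space, action_space: gym.Space):
        super().__init__(unique_id, model)
        self.observation_space = observation_space  # what this agent can observe
        self.action_space = action_space            # what this agent can do
        self.action = None                          # set by the env before each step
        self.reward = 0                             # accumulated while stepping

The env could then build its per-agent dicts by asking the agents themselves, e.g. {a.unique_id: a.observation_space for a in self.schedule.agents if isinstance(a, RLAgent)}.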

rewards = {a.unique_id: 0 for a in self.schedule.agents if isinstance(a, (Sheep, Wolf))}

Likewise, it might be interesting to move this into the agent.

agent.step(action_dict[agent.unique_id])

In that case you could write agent.step(agent.actions) here. Or maybe you can even move agent.actions inside the agent, so it can use self.actions.

So technically, my main recommendation is to see what happens if you move some of this into the agent, and maybe create an RLAgent subclass.
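
A sketch of what the env's step could then look like, reusing the hypothetical RLAgent fields from the snippet above (the names are assumptions, not the current code):

def step(self, action_dict):
    # Hand each agent its chosen action and let it act on that stored action.
    for agent in self.schedule.agents:
        if isinstance(agent, (Sheep, Wolf)):
            agent.reward = 0
            agent.action = action_dict[agent.unique_id]
            agent.step()

    # Read the rewards back from the agents instead of keeping a model-level dict.
    rewards = {a.unique_id: a.reward for a in self.schedule.agents if isinstance(a, (Sheep, Wolf))}
    return rewards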


Another note: normally we would use either a scheduler like RandomActivation here, or the new AgentSet functionality (see also …).

for agent in self.schedule.agents:
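
With the AgentSet functionality that loop could look roughly like this (assuming a Mesa version where model.agents is available; the filter is just illustrative):

# Activate all Sheep and Wolf agents in random order via the AgentSet API.
self.agents.select(lambda a: isinstance(a, (Sheep, Wolf))).shuffle().do("step")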


Seeing an example really helps! I don't know exactly where the learned behavior of the agent is currently updated and stored. Could you point me to that?

Anyway, it looks like this can integrate quite well into Mesa! In your proposal, think about what constructs you want to offer users to make using PettingZoo with Mesa easier. These can be methods, functions or classes, but also documentation or tutorials.

@harshmahesheka (Owner) commented

Thanks for the reviews. We can surely move some attributes and functionality inside the agent; creating a new agent class seems like a nice way of doing it. For the learning, you can run run.py inside the wolf_sheep folder. The reward and observation space are the same, so the policy we previously trained for the RLlib example should also work with this. As for integration with PettingZoo, a middle path makes sense to me, where we create a few functions and classes that could speed up the task, and then provide ample documentation and tutorials for everyone to get started.
