Presenter Information

Charlotte M. Morrison

Faculty Sponsor

Dr. Ayan Dutta

Faculty Sponsor College

College of Computing, Engineering & Construction

Faculty Sponsor Department

Computing

Location

SOARS Virtual Conference

Presentation Website

https://unfsoars.domains.unf.edu/game-theory-based-distributed-coordination-with-multi-agent-reinforcement-learning/

Keywords

SOARS (Conference) (2020 : University of North Florida) -- Posters; University of North Florida. Office of Undergraduate Research; University of North Florida. Graduate School; College students -- Research -- Florida -- Jacksonville -- Posters; University of North Florida -- Graduate students -- Research -- Posters; University of North Florida. School of Computing -- Research -- Posters; Engineering, Math, and Computer Sciences -- Research -- Posters

Abstract

We study automated object manipulation using the two arms of a Baxter robot. The robot uses a novel multi-agent reinforcement learning strategy to learn how to complete a task without any prior experience. It stores its interactions with the environment and uses these experiences to build a policy that guides future actions. Each of Baxter’s arms is modeled as an independent agent that can move and learn separately from the other. Each arm learns its own policy (i.e., a mapping from environment state to robot action) for how best to move in order to complete a collaborative, two-arm task (e.g., pushing an item or pick-and-place). The individual agents are trained with the standard TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm, which learns from stored experiences that record how well the agent’s past actions moved it toward completing the task. TD3 uses an actor-critic architecture: the actor network takes the state (joint angles) as input and outputs an action (joint movements), and the twin critic networks evaluate the quality of that action. The agents’ actions are coordinated through a game theory-based distributed coordination strategy. This coordination learning framework yields a policy that selects a good set of actions for each arm to execute. Finally, Baxter uses this policy to complete tasks with both arms in collaboration. To the best of our knowledge, this work is the first to use a game theory-based strategy for dual-arm manipulation learning.
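The abstract does not say how payoffs are defined or which equilibrium concept is used, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It shows two ideas from the text: the conservative value estimate TD3 obtains by taking the minimum of its twin critics, and one possible way to coordinate two agents by searching for a pure-strategy Nash equilibrium over a small set of candidate actions. All function names, the payoff tables, and the tie-breaking rule (highest summed payoff) are hypothetical; TD3's target policy smoothing and delayed actor updates are omitted for brevity.

```python
from itertools import product


def clipped_double_q(q1, q2, state, action):
    """TD3-style conservative estimate: the minimum of the twin critics."""
    return min(q1(state, action), q2(state, action))


def td3_target(reward, done, next_value, gamma=0.99):
    """Bootstrapped critic target y = r + gamma * (1 - done) * min(Q1', Q2')."""
    return reward + gamma * (1.0 - done) * next_value


def pure_nash_equilibria(payoff_left, payoff_right):
    """Return every (i, j) at which neither arm gains by deviating alone.

    payoff_left[i][j] and payoff_right[i][j] are the (assumed) payoffs when
    the left arm plays candidate action i and the right arm plays j.
    """
    rows, cols = len(payoff_left), len(payoff_left[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        left_best = all(payoff_left[i][j] >= payoff_left[k][j] for k in range(rows))
        right_best = all(payoff_right[i][j] >= payoff_right[i][k] for k in range(cols))
        if left_best and right_best:
            equilibria.append((i, j))
    return equilibria


def coordinate(payoff_left, payoff_right):
    """Pick the equilibrium with the highest summed payoff, or None if none exists."""
    equilibria = pure_nash_equilibria(payoff_left, payoff_right)
    if not equilibria:
        return None
    return max(
        equilibria,
        key=lambda ij: payoff_left[ij[0]][ij[1]] + payoff_right[ij[0]][ij[1]],
    )


# Toy example: each arm chooses between "push" (index 0) and "hold" (index 1).
# In a real system the entries might come from each arm's critic; here they are made up.
toy_left = [[3.0, 0.0], [0.0, 1.0]]
toy_right = [[3.0, 0.0], [0.0, 1.0]]
joint_action = coordinate(toy_left, toy_right)
```

In this toy game both (push, push) and (hold, hold) are equilibria, and the sketch selects (push, push) because it has the larger combined payoff. A pure-strategy equilibrium is not guaranteed to exist in general, which is why `coordinate` may return `None`.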

Apr 8th, 12:00 AM

Game Theory Based Distributed Coordination with Multi-Agent Reinforcement Learning

https://digitalcommons.unf.edu/soars/2020/spring_2020/117