Faculty Sponsor
Dr. Ayan Dutta
Faculty Sponsor College
College of Computing, Engineering & Construction
Faculty Sponsor Department
Computing
Location
SOARS Virtual Conference
Presentation Website
https://unfsoars.domains.unf.edu/game-theory-based-distributed-coordination-with-multi-agent-reinforcement-learning/
Keywords
SOARS (Conference) (2020 : University of North Florida) -- Posters; University of North Florida. Office of Undergraduate Research; University of North Florida. Graduate School; College students -- Research -- Florida -- Jacksonville -- Posters; University of North Florida -- Graduate students -- Research -- Posters; University of North Florida. School of Computing -- Research -- Posters; Engineering, Math, and Computer Sciences -- Research -- Posters
Abstract
We study the problem of automated object manipulation using the two arms of a Baxter robot. The robot uses a novel multi-agent reinforcement learning strategy to learn how to complete a task without any prior experience. It learns which actions to take by storing its interactions with the environment and using these experiences to build a policy that guides future actions. Each of Baxter’s arms is modeled as an independent agent that can move and learn separately from the other. Each arm learns its own policy (i.e., a mapping from environment state to robot action) for how best to move in order to complete a collaborative, two-arm task (e.g., pushing an item or pick-and-place). The individual agents are trained with the standard Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which learns from stored experiences that record how well the agent’s past actions moved it towards completing the task. TD3 uses two kinds of neural networks: an actor network that takes the state (joint angles) as input and outputs an action (joint movements), and twin critic networks that evaluate the quality of that action. The agents’ actions are coordinated through a game theory-based distributed coordination strategy. This coordination learning framework yields a policy that selects a good set of actions for each arm to execute. Finally, Baxter uses this policy to complete tasks with both arms in collaboration. To the best of our knowledge, this work is the first to use a game theory-based strategy for dual-arm manipulation learning.
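As a rough illustration of the two ingredients named in the abstract, the Python sketch below shows (1) the clipped double-Q target that TD3's twin critics are trained towards and (2) one simple way two agents could agree on a joint action, by searching a payoff table for pure-strategy Nash equilibria. The abstract does not specify the coordination game actually used, so the equilibrium search, the function names, and the payoff values are all assumptions made for illustration, and the sketch omits TD3's target policy smoothing and delayed actor updates.

```python
from itertools import product


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target used to train TD3's twin critics.

    q1_next and q2_next are the two target critics' value estimates for the
    next state-action pair; taking the smaller one limits overestimation.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def pure_nash_equilibria(payoff_left, payoff_right):
    """Return every (i, j) where neither arm gains by deviating on its own.

    payoff_left[i][j] and payoff_right[i][j] are hypothetical payoffs to the
    left and right arm when the left arm picks candidate action i and the
    right arm picks candidate action j.
    """
    n_left, n_right = len(payoff_left), len(payoff_left[0])
    equilibria = []
    for i, j in product(range(n_left), range(n_right)):
        # Left arm cannot do better by switching rows while j stays fixed.
        left_ok = all(payoff_left[i][j] >= payoff_left[k][j] for k in range(n_left))
        # Right arm cannot do better by switching columns while i stays fixed.
        right_ok = all(payoff_right[i][j] >= payoff_right[i][k] for k in range(n_right))
        if left_ok and right_ok:
            equilibria.append((i, j))
    return equilibria


if __name__ == "__main__":
    # Illustrative numbers only: a non-terminal transition with reward 1.0.
    print(td3_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=4.0, gamma=0.9))

    # Assumed payoffs (e.g., critic scores) for two candidate actions per arm;
    # both arms are rewarded only when they choose matching actions.
    left = [[2.0, 0.0], [0.0, 1.0]]
    right = [[2.0, 0.0], [0.0, 1.0]]
    print(pure_nash_equilibria(left, right))
```

In this toy payoff table both matching action pairs are equilibria, which is why a coordination rule is needed at all: each arm acting greedily on its own critic would not by itself guarantee that the two arms settle on the same pair.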
Game Theory Based Distributed Coordination with Multi-Agent Reinforcement Learning
https://digitalcommons.unf.edu/soars/2020/spring_2020/117