Lab 3: Dummy Q-learning (table) Reinforcement Learning with TensorFlow&OpenAI Gym Sung Kim

Learning Q(s, a): Table

Initial Q values are 0.

[Figure: grid of states with every Q(s, a) entry shown as 0]
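A minimal sketch of the all-zero table in plain Python. The sizes (16 states, 4 actions) are an assumption taken from the 16x4 table printed under "Code: result reporting"; the names `n_states` and `n_actions` are illustrative.

```python
# Assumed sizes: 16 states and 4 actions, matching the 16x4 table
# printed in the result-reporting part of this lab.
n_states, n_actions = 16, 4

# One row per state, one column per action; every entry starts at 0.
Q = [[0.0] * n_actions for _ in range(n_states)]

print(len(Q), len(Q[0]))   # table shape
print(Q[0])                # Q-values of the first state
```

With NumPy the same table is the `np.zeros([env.observation_space.n, env.action_space.n])` call that appears in the result-reporting code.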

Learning Q(s, a) Table (with many trials)

Initial Q values are 0.

Learning Q(s, a) Table: one success!

Initial Q values are 0.

[Figure: Q-table after one success; several entries are now shown as 1]
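To see how one success can end up marking a whole path with 1s, here is a sketch on a hypothetical four-state corridor (not the lab's environment). It assumes the undiscounted update Q(s, a) = r + max Q(s', a') and a reward of 1 only for the move that reaches the last state.

```python
# Hypothetical corridor: states 0..3, a single action ("move right"),
# reward 1 only when the move lands on the final state 3.
GOAL = 3
Q = [0.0] * GOAL          # Q[s] = value of moving right from state s

def run_episode(Q):
    """Walk right from state 0 to the goal, updating Q along the way."""
    for s in range(GOAL):
        s_next = s + 1
        reward = 1.0 if s_next == GOAL else 0.0
        # Best value available in the next state (0 at the goal itself).
        best_next = Q[s_next] if s_next < GOAL else 0.0
        Q[s] = reward + best_next   # assumed undiscounted update

history = []
for episode in range(3):
    run_episode(Q)
    history.append(list(Q))
    print(episode + 1, Q)
```

In this sketch the 1 moves back one state per episode, so after three passes every move on the path carries the value 1.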


Dummy Q-learning algorithm
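Written as an update rule, and assuming a deterministic world with no discount factor (consistent with the 0 and 1 values in the tables of this lab), the "dummy" step can be sketched as follows. Here s' is the state reached by taking action a in state s, and r is the immediate reward.

```latex
\hat{Q}(s, a) \leftarrow r + \max_{a'} \hat{Q}(s', a')
```

The loop around it would be: start with all entries at 0, observe s, pick and execute an action a, receive r and s', apply the update, then continue from s'.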


Code: setup

# https://gist.github.com/stober/1943451

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap
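Assuming the gist referenced above is used for an argmax that breaks ties at random (which matters here because every Q-value starts at 0, so a plain argmax would always pick action 0), a standard-library sketch looks like this. The name `rargmax` and the `rng` parameter are illustrative.

```python
import random

def rargmax(values, rng=random):
    """Return the index of a maximum entry, picking at random among ties."""
    best = max(values)
    candidates = [i for i, v in enumerate(values) if v == best]
    return rng.choice(candidates)

rng = random.Random(0)   # seeded only to make the demo repeatable

# On an all-zero row every index is a valid maximum.
picks = {rargmax([0.0, 0.0, 0.0, 0.0], rng) for _ in range(200)}
print(sorted(picks))

# With a unique maximum the choice is deterministic.
print(rargmax([0.0, 1.0, 0.0, 0.0], rng))
```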

Code: (dummy) Q-learning

https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.pjz9g59ap
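A self-contained sketch of the training loop, with a small hand-written grid standing in for the Gym environment so that it runs without extra packages. The grid layout, the `step` helper, the episode count, and the step cap are all assumptions for illustration, not the lab's actual code.

```python
import random

# Hypothetical stand-in for the Gym environment: a deterministic 4x4 grid
# (S start, F frozen, H hole, G goal). The layout is an assumption.
GRID = "SFFF" "FHFH" "FFFH" "HFFG"
SIZE = 4
LEFT, DOWN, RIGHT, UP = range(4)    # action order used in the printed table
MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}

def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[action]
    # Moves that would leave the grid keep the agent in place.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    next_state = row * SIZE + col
    cell = GRID[next_state]
    return next_state, (1.0 if cell == "G" else 0.0), cell in "HG"

def rargmax(values, rng):
    """Argmax with random tie-breaking."""
    best = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == best])

rng = random.Random(0)
Q = [[0.0] * 4 for _ in range(SIZE * SIZE)]
num_episodes = 2000                 # assumed episode count
max_steps = 100                     # assumed cap so every episode ends
rewards = []

for episode in range(num_episodes):
    state, total = 0, 0.0
    for t in range(max_steps):
        action = rargmax(Q[state], rng)
        next_state, reward, done = step(state, action)
        # Dummy update: no learning rate, no discount.
        Q[state][action] = reward + max(Q[next_state])
        total += reward
        state = next_state
        if done:
            break
    rewards.append(total)

success_rate = sum(rewards) / num_episodes
print("Success rate:", success_rate)
```

Early episodes are random walks because all values tie at 0. Once a value of 1 reaches the start state, the greedy choice follows the same path every time.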

Code: result reporting

Success rate: 0.95

Q = np.zeros([env.observation_space.n, env.action_space.n])
print(Q)

   LEFT DOWN RIGHT UP
[[ 0.   0.   1.    0.]
 [ 0.   0.   1.    0.]
 [ 0.   1.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   1.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   1.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   0.    0.]
 [ 0.   0.   1.    0.]
 [ 0.   0.   0.    0.]]
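One way to read the printed table is to follow the highest-valued action from each state. The sketch below copies the values from the result above and assumes the 16 states are numbered row by row on a 4x4 grid; `SIZE` and `OFFSET` are illustrative names.

```python
ACTIONS = ["LEFT", "DOWN", "RIGHT", "UP"]   # column order from the printout

# Q-table copied from the result above: one row per state, 16 states.
Q = [
    [0, 0, 1, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
    [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0],
]

SIZE = 4   # assumption: states are numbered row by row on a 4x4 grid
# Index change per move; grid edges are not handled because the
# learned path never pushes against them.
OFFSET = {"LEFT": -1, "DOWN": SIZE, "RIGHT": 1, "UP": -SIZE}

state, path = 0, [0]
while max(Q[state]) > 0:            # stop at a state with no learned action
    action = ACTIONS[Q[state].index(max(Q[state]))]
    state += OFFSET[action]
    path.append(state)
    print(path[-2], action, "->", state)
```

Under that numbering the walk ends in state 15, the bottom-right cell, and only the six states on this path have a nonzero entry.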

Next: Exploit & exploration and discounted future reward