[Question] A3C Actor Gradient
Board: Prob_Solve (Computational Mathematics, Problem Solving)
Author: longlyeagle (長鷹寶寶實驗室)
Time: 2017/10/08 10:31
Comments: 5, from 1 participant (thread 1/1)
I am working on A3C deep reinforcement learning.
Since I am too lazy to change the last layer of my NN to softmax,
I use a softmax filter so that the linear layer directly targets
the softmax output.
The algorithm works on my test cases so far,
but it might go wrong when the rewards are on a different scale.
Could anyone help me check whether my implementation is correct?
https://goo.gl/FV8sFu
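
For reference while checking, here is a minimal NumPy sketch of what the actor gradient at the linear layer's output would look like, assuming the usual policy-gradient actor loss -A * log pi(a|s) with pi = softmax(logits). This is not the linked implementation; the function names and the single-sample setup are assumptions made for illustration, and the entropy bonus commonly used with A3C is left out.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def actor_loss(logits, action, advantage):
    """Assumed actor loss for one sample: -advantage * log pi(action)."""
    return -advantage * np.log(softmax(logits)[action])

def actor_logit_gradient(logits, action, advantage):
    """Gradient of actor_loss with respect to the linear layer's outputs.

    For pi = softmax(z), d(-A * log pi_a) / dz_j = A * (pi_j - [j == a]).
    """
    probs = softmax(logits)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    return advantage * (probs - one_hot)

def numeric_gradient(logits, action, advantage, eps=1e-6):
    """Central finite-difference check of the analytic gradient."""
    grad = np.zeros_like(logits)
    for i in range(logits.size):
        step = np.zeros_like(logits)
        step[i] = eps
        grad[i] = (actor_loss(logits + step, action, advantage)
                   - actor_loss(logits - step, action, advantage)) / (2 * eps)
    return grad

if __name__ == "__main__":
    z = np.array([0.5, -1.0, 2.0])
    print(actor_logit_gradient(z, action=1, advantage=3.0))
    print(numeric_gradient(z, action=1, advantage=3.0))
```

Under these assumptions the gradient is linear in the advantage, so multiplying all rewards by a constant multiplies the update by the same constant. Comparing the analytic gradient against the finite-difference one at a few reward scales is one way to tell an implementation bug apart from a step-size problem.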
--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 114.35.245.133
※ Article URL: https://www.ptt.cc/bbs/Prob_Solve/M.1507429910.A.70E.html