Koha9
895cd5c118
On game over, add remaintime/15 to every step's reward to increase this round's training weight. Fix a bug where getting the target from states still used the one-hot decoder. |
||
---|---|---|
.. | ||
GAIL-Model | ||
Pytorch | ||
Tensorflow |