Commit Graph

8 Commits

Author SHA1 Message Date
895cd5c118 Add EndReward Broadcast function
When the game is over, add remaintime/15 to every step's reward to increase this round's training weight.
Fix bug where getting the target from states still used the one-hot decoder.
2022-12-03 03:58:19 +09:00
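
The end-reward broadcast described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `broadcast_end_reward`, its parameters, and the use of a plain list for rewards are all assumptions. Only the `remaintime/15` bonus comes from the commit message.

```python
def broadcast_end_reward(step_rewards, remain_time, divisor=15.0):
    """Return a new list with remain_time / divisor added to every step reward.

    Hypothetical helper: the commit only states that remaintime/15 is added
    to every step's reward once the game is over.
    """
    bonus = remain_time / divisor
    return [reward + bonus for reward in step_rewards]


# Example: an episode with three steps that ended with 30 time units left.
episode_rewards = [0.0, -0.5, 1.0]
boosted = broadcast_end_reward(episode_rewards, remain_time=30.0)
print(boosted)  # [2.0, 1.5, 3.0]
```

Because every step receives the same bonus, an episode that ends with more time remaining contributes larger rewards across its whole trajectory.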
3930bcd953 Add Multi-NN agent
Add multiple neural networks in the output layer.
Use a different NN when facing a different target.
2022-12-01 19:55:51 +09:00
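
One way to picture "a different NN per target" is a set of output heads indexed by target ID. The sketch below uses plain Python lists instead of a deep learning framework, and every name in it (`LinearHead`, `MultiHeadPolicy`, `target_id`) is hypothetical. How the repository actually builds or selects its networks is not stated in the commit.

```python
import random


class LinearHead:
    """A tiny linear layer (out = W x + b) stored as plain lists."""

    def __init__(self, in_dim, out_dim, rng):
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)
        ]
        self.bias = [0.0] * out_dim

    def forward(self, x):
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]


class MultiHeadPolicy:
    """Keeps one output head per target and routes features by target ID."""

    def __init__(self, feature_dim, action_dim, num_targets, seed=0):
        rng = random.Random(seed)
        self.heads = [
            LinearHead(feature_dim, action_dim, rng) for _ in range(num_targets)
        ]

    def forward(self, features, target_id):
        # Only the head that belongs to the current target is evaluated.
        return self.heads[target_id].forward(features)


policy = MultiHeadPolicy(feature_dim=4, action_dim=3, num_targets=2)
features = [0.5, -1.0, 0.25, 2.0]
logits_a = policy.forward(features, target_id=0)
logits_b = policy.forward(features, target_id=1)
```

The same features produce different outputs depending on the target, since each head has its own weights.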
5631569b31 Side Channel added
Add side channel to save the target win ratio.
Fix some bugs.
2022-11-30 06:45:07 +09:00
32d398dbef Change Learning timing
Change learning timing to the end of each episode.
2022-11-16 19:40:57 +09:00
a0895c7449 Add load & save function.
Add load & save function.
Add train flag to test model.
Add new action selection function for test mode.
Add decision period to skip steps.
2022-11-08 23:14:34 +09:00
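
A decision period that skips steps might look like the loop below: the policy is queried only every `decision_period` steps and the last action is repeated in between. This is an assumed interpretation of the commit message. The function `run_with_decision_period` and the toy `policy` and `env_step` callables are illustrative only.

```python
def run_with_decision_period(policy, env_step, first_obs, total_steps, decision_period):
    """Query the policy every `decision_period` steps and repeat the last action otherwise.

    Returns the number of times the policy was actually queried.
    """
    if decision_period < 1:
        raise ValueError("decision_period must be at least 1")
    obs = first_obs
    action = None
    decisions = 0
    for step in range(total_steps):
        if step % decision_period == 0:
            action = policy(obs)  # fresh decision on this step
            decisions += 1
        obs = env_step(action)  # skipped steps reuse the previous action
    return decisions


# Toy usage: 10 environment steps with a decision every 5 steps.
calls = run_with_decision_period(
    policy=lambda obs: obs % 2,
    env_step=lambda action: action + 1,
    first_obs=0,
    total_steps=10,
    decision_period=5,
)
print(calls)  # 2
```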
474032d1e8 hybrid dis-con action, save-load, convergence was observed
Add discrete and continuous actions in the same NN model.
Model save and load.
Reward is increasing; convergence was observed.

These two models seem good:
Aimbot_9331_1667423213_hybrid_train2
Aimbot_9331_1667389873_hybrid
2022-11-03 07:16:18 +09:00
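
A hybrid discrete-continuous action is commonly sampled by drawing the discrete part from a categorical distribution and the continuous part from independent Gaussians, then summing the log-probabilities. The sketch below shows that general technique under the assumption that the model outputs `logits`, `means`, and `stds`. These names and the function `sample_hybrid_action` are hypothetical and may differ from what the repository does.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, means, stds, rng):
    """Sample one discrete action and a continuous vector, with the joint log-probability."""
    probs = softmax(logits)
    discrete = rng.choices(list(range(len(probs))), weights=probs, k=1)[0]
    continuous = [rng.gauss(mu, sd) for mu, sd in zip(means, stds)]

    # Joint log-probability: categorical term plus one Gaussian term per dimension.
    log_prob = math.log(probs[discrete])
    for a, mu, sd in zip(continuous, means, stds):
        log_prob += (
            -0.5 * ((a - mu) / sd) ** 2
            - math.log(sd)
            - 0.5 * math.log(2.0 * math.pi)
        )
    return discrete, continuous, log_prob


rng = random.Random(0)
discrete, continuous, log_prob = sample_hybrid_action(
    logits=[0.1, 1.2, -0.3], means=[0.0, 0.5], stds=[1.0, 0.2], rng=rng
)
```

Treating the two parts as independent lets a PPO-style update use a single summed log-probability for the whole action.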
0dbe2013ae weight and bias sync added
weight and bias sync added
2022-11-01 19:11:45 +09:00
7497ffcb0f Parallel Environment Discrete PPO finished
Parallel Environment Discrete PPO finished. Runnable.
2022-10-30 04:13:14 +09:00