Commit Graph

12 Commits

Author SHA1 Message Date
0e0d98d8b1 Change Param based on a Paper
Change Param based on a Paper, and it works!
2022-12-17 09:59:44 +09:00
3116831ae6 change network and fix trainset bug
change network and fix trainset bug
2022-12-17 09:59:44 +09:00
bf77060456 Change Critic NN as Multi-NN
Change Critic NN as Multi-NN
wrong remain Time Fix

Fix wrong remain time, what a stupid mistake...
and fix the doubled WANDB writer
Deeper TargetNN

Deeper target NN, which gets the target state while receiving the hidden layer's output.
Change Middle input

Let everything except the raycast input go to the target network.
Change Activation function to Tanh

Change activation function to Tanh, and it works a little bit better than before.
2022-12-17 09:59:44 +09:00
cbc385ca10 Change training dataset storage method
Save the training dataset by its target type.
While training the NN, use a single-target training set for the backward pass.
This is at least 20 times faster than the last update!
2022-12-03 07:54:38 +09:00
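A minimal sketch of the per-target storage described in the commit above, assuming one buffer per target type. The class name `TargetDataset`, its methods, and the integer target types are hypothetical and are not taken from the repository.

```python
from collections import defaultdict


class TargetDataset:
    """Hypothetical buffer that keeps transitions in one list per target type."""

    def __init__(self):
        # Maps target type -> list of stored transitions.
        self._buffers = defaultdict(list)

    def add(self, target_type, transition):
        """Append a transition to the buffer of its target type."""
        self._buffers[target_type].append(transition)

    def batch_for(self, target_type):
        """Return a copy of all transitions stored for one target type."""
        return list(self._buffers.get(target_type, []))

    def clear(self, target_type):
        """Drop the transitions of one target type after it has been trained on."""
        self._buffers.pop(target_type, None)


dataset = TargetDataset()
dataset.add(0, ("state_a", "action_a", 1.0))
dataset.add(1, ("state_b", "action_b", 0.5))
dataset.add(0, ("state_c", "action_c", 0.0))

# Only target type 0 is used for this update step.
print(dataset.batch_for(0))
```

Because each update reads only one target type's buffer, no filtering over a mixed dataset is needed at training time. Whether this is the source of the reported speedup is not stated in the commit.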
895cd5c118 Add EndReward Broadcast function
When the game is over, add remaintime/15 to every step's reward to increase this round's training weight.
Fix bug where getting the target from states still used the one-hot decoder.
2022-12-03 03:58:19 +09:00
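The end-reward broadcast in the commit above can be sketched as follows. This is an illustration only: the function name `broadcast_end_reward` and its parameters are assumptions, and the divisor 15 is the only value taken from the commit message.

```python
def broadcast_end_reward(rewards, remain_time, divisor=15.0):
    """Return a new list with remain_time / divisor added to every step reward.

    `rewards` holds the per-step rewards of one finished episode.
    """
    bonus = remain_time / divisor
    return [reward + bonus for reward in rewards]


# With 30 time units remaining, every step gains a bonus of 30 / 15 = 2.0.
episode_rewards = [0.0, 0.5, 1.0]
print(broadcast_end_reward(episode_rewards, remain_time=30.0))
```

Adding the same bonus to every step means episodes that finish with more time left contribute larger rewards across the whole trajectory.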
3930bcd953 Add Multi-NN agent
Add multiple neural networks in the output layer.
Use a different NN when facing a different target.
2022-12-01 19:55:51 +09:00
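A toy sketch of the idea in the commit above: one output head per target, selected by a target index. It uses plain Python linear functions in place of real network layers. The names `make_linear_head`, `heads`, and `forward`, and all weights, are hypothetical.

```python
def make_linear_head(weights, bias):
    """Build a tiny linear output head: dot(weights, features) + bias."""

    def head(features):
        return sum(w * x for w, x in zip(weights, features)) + bias

    return head


# One head per target type; the index into this list is the target index.
heads = [
    make_linear_head([1.0, 0.0], 0.0),  # head used for target 0
    make_linear_head([0.0, 2.0], 1.0),  # head used for target 1
]


def forward(features, target_index):
    """Route the shared features through the head of the current target."""
    return heads[target_index](features)


shared_features = [3.0, 4.0]
print(forward(shared_features, 0))
print(forward(shared_features, 1))
```

In a real model the heads would be trainable layers on top of a shared body, but the routing by target index works the same way.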
5631569b31 Side Channel added
Add side channel to save target win ratio.
Fix some bugs
2022-11-30 06:45:07 +09:00
32d398dbef Change Learning timing
Change learning timing to the end of each episode.
2022-11-16 19:40:57 +09:00
a0895c7449 Add load & save function.
Add load & save function.
Add train flag to test model.
Add new action select function while in test mode.
Add decision period to skip step.
2022-11-08 23:14:34 +09:00
474032d1e8 hybrid dis-con action, save-load, convergence was observed
Add discrete and continuous actions in the same NN model.
Model save and load.
Reward is increasing; convergence was observed.

These two models seem good:
Aimbot_9331_1667423213_hybrid_train2
Aimbot_9331_1667389873_hybrid
2022-11-03 07:16:18 +09:00
0dbe2013ae weight and bias sync added
weight and bias sync added
2022-11-01 19:11:45 +09:00
7497ffcb0f Parallel Environment Discrete PPO finish
Parallel Environment Discrete PPO finish. Runnable.
2022-10-30 04:13:14 +09:00