
Creating a Loss Function with Extra Arguments in TensorFlow 2.8

Requirement

By default, a loss function in TensorFlow/Keras can only use y_true and y_pred to compute the loss value, where y_true is the ground-truth value and y_pred is the predicted value. When computing the actor network's loss in PPO, however, y_true and y_pred are not enough: the loss also needs the predictions from another network, the advantage, and so on.
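For reference, the standard pattern looks like the minimal sketch below (layer sizes are made up for illustration): Keras only ever calls the loss with the two tensors (y_true, y_pred), and compile offers no slot for anything else.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A standard custom loss: Keras will only ever call it with (y_true, y_pred)
def mseLoss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

stateInput = layers.Input(shape=(3,))
out = layers.Dense(1)(stateInput)
model = keras.Model(inputs=stateInput, outputs=out)
model.compile(optimizer='adam', loss=mseLoss)  # no way to hand extra tensors to mseLoss here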

Attempt

First I tried the approach from ProximalPolicyOptimizationContinuousKeras (linked at the end of this post): declare the tensors the loss function needs as extra Input layers of the model up front, and pass them into my loss function when building the model. Simplified code below:

class PPO(object):
    '''
    '''

    # Build Net
    def buildActorNet(self,inputSize,continuousActionRange):
        # buildActor NN
       
        stateInput = layers.Input(shape = (inputSize,),name ='stateInput')
        advantage = layers.Input(shape = (1,),name = 'advantage') # for loss function argument
        oldPiMuSig = layers.Input(shape = (2,),name = 'oldPiMuSig') # for loss function argument
        
        dense1 = layers.Dense(100,activation='relu',name = 'dense1')(stateInput) # only stateInput is fed through the network layers
        dense2 = layers.Dense(50,activation='relu',name = 'dense2')(dense1)
        mu = continuousActionRange * layers.Dense(1,activation='tanh',name = 'muOut')(dense2)
        sigma = layers.Dense(1,activation='softplus',name = 'sigmaOut')(dense2)
        muSig = layers.concatenate([mu,sigma],name = 'muSigOut')
        
        actorOPT = optimizers.Adam(learning_rate = self.actorLR)
        model = keras.Model(inputs = [stateInput,advantage,oldPiMuSig],outputs = muSig)
        model.compile(optimizer = actorOPT,loss = self.aLoss(advantage,oldPiMuSig))
        return model

    def aLoss(self,advantage,oldPiMuSig):
        
        def distProb(mu,sig,x):
            dist = 1/(tf.sqrt(2*np.pi)*sig)
            prob = dist*tf.exp(-tf.square(x-mu)/(2*tf.square(sig)))
            return prob
            
        def loss(y_true,y_pred):
            # y_true: actions
            # y_pred: muSigma

            piProb = distProb(y_pred[:,0],y_pred[:,1],y_true)
            oldpiProb = distProb(oldPiMuSig[:,0],oldPiMuSig[:,1],y_true)
            ratio = piProb/(oldpiProb+1e-5)
            surr = ratio * advantage
            clipValue = tf.clip_by_value(ratio,1. - self.EPSILON,1. + self.EPSILON) * advantage # clipped surrogate

            loss = -tf.reduce_mean(tf.minimum(surr,clipValue))
            return loss
        return loss
    
    # Return all-zero dummy inputs for the loss-function arguments (used at predict time)
    def getDummyADV(self,size):
        return np.zeros((size,1))
    def getDummyOldMuSig(self,size):
        return np.zeros((size,2))

    def trainActor(self,states,actions,discountedR,epochs):
        # Train Actor
        # states: Buffer States
        # actions: Buffer Actions
        # discountedR: Discounted Rewards
        # criticV: criticNN predict result
        # Epochs: just Epochs
        
        states = np.asarray(states)
        actions = np.asarray(actions)
        
        dummySize = len(states)
        criticV = self.critic.predict(states)
        advantage = copy.deepcopy(discountedR - criticV)
        # at predict time, the loss-function argument inputs are replaced with all-zero dummies
        oldPiMuSig = copy.deepcopy(self.actor.predict([states,self.getDummyADV(dummySize),self.getDummyOldMuSig(dummySize)]))
        # at fit time, the arguments the loss function needs are passed in through x
        self.actor.fit(x = [states,advantage,oldPiMuSig],y = actions,epochs = epochs)

Result

The result is exactly what the title image of this post shows: an error saying a KerasTensor cannot be passed. The full error message:

TypeError: You are passing KerasTensor(type_spec=TensorSpec(shape=(), dtype=tf.float32, name=None), name='Placeholder:0', description="created by layer 'tf.cast_4'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. Keras Functional model construction only supports TF API calls that *do* support dispatching, such as `tf.math.add` or `tf.reshape`. Other APIs cannot be called directly on symbolic Keras inputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer `call` and calling that layer on this symbolic input/output.

Then, with some print debugging, I found that the data coming from the Input layers are KerasTensor objects, not the plain TensorFlow Tensor objects the loss function needs. The same problem shows up in the TensorFlow issue thread "Custom loss function is not working #43650": the type of the data coming from Input apparently differs between TensorFlow versions, as the code from that thread shows:

# TensorFlow 2.3.0:
def ppo_loss(oldpolicy_probs, advantage, reward, value):
    def loss(y_true, y_pred):
        print(oldpolicy_probs) # Tensor("old_prediction_input:0", shape=(None, 2), dtype=float32)
        print(advantage)       # Tensor("advantage_input:0", shape=(None, 1), dtype=float32)
        print(reward)          # Tensor("reward_input:0", shape=(None, 1), dtype=float32)
        print(value)           # Tensor("value_input:0", shape=(None, 1), dtype=float32)

        print(y_true)     # Tensor("IteratorGetNext:5", shape=(128, 2), dtype=float32)
        print(y_pred)     # Tensor("functional_1/policy/Tanh:0", shape=(128, 2), dtype=float32)
        
        # ... Compute Loss ...
    return loss

# TensorFlow 2.5.0:
def ppo_loss(oldpolicy_probs, advantage, reward, value):
    def loss(y_true, y_pred):
        print(oldpolicy_probs) # KerasTensor(type_spec=TensorSpec(shape=(None, 2), dtype=tf.float32, name='old_prediction_input'), name='old_prediction_input', description="created by layer 'old_prediction_input'")
        print(advantage)       # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='advantage_input'), name='advantage_input', description="created by layer 'advantage_input'")
        print(reward)          # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='reward_input'), name='reward_input', description="created by layer 'reward_input'")
        print(value)           # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='value_input'), name='value_input', description="created by layer 'value_input'")

        print(y_true)     # Tensor("IteratorGetNext:5", shape=(128, 2), dtype=float32)
        print(y_pred)     # Tensor("model/policy/Tanh:0", shape=(128, 2), dtype=float32)
        
        # ... Compute Loss ...
    return loss

Solution

In the end I used the second method from the answers to "Custom loss function in Keras based on the input data" (linked below): pack the arguments the loss function needs into y_true and smuggle them into the loss function that way.

The original code from that answer:

    def custom_loss(data, y_pred):

        y_true = data[:, 0]
        i = data[:, 1]
        return K.mean(K.square(y_pred - y_true), axis=-1) + something with i...


    def baseline_model():
        # create model
        i = Input(shape=(5,))
        x = Dense(5, kernel_initializer='glorot_uniform', activation='linear')(i)
        o = Dense(1, kernel_initializer='normal', activation='linear')(x)
        model = Model(i, o)
        model.compile(loss=custom_loss, optimizer=Adam(lr=0.0005))
        return model


    model.fit(X, np.append(Y_true, X[:, 0], axis =1), batch_size = batch_size, epochs=90, shuffle=True, verbose=1)

My simplified code:

class PPO(object):
    '''
    '''

    # Build Net
    def buildActorNet(self,inputSize,continuousActionRange):
        # buildActor NN
        stateInput = layers.Input(shape = (inputSize,),name ='stateInput')
        
        dense1 = layers.Dense(100,activation='relu',name = 'dense1')(stateInput)
        dense2 = layers.Dense(50,activation='relu',name = 'dense2')(dense1)
        mu = continuousActionRange * layers.Dense(1,activation='tanh',name = 'muOut')(dense2) 
        sigma = layers.Dense(1,activation='softplus',name = 'sigmaOut')(dense2)
        muSig = layers.concatenate([mu,sigma],name = 'muSigOut')
        model = keras.Model(inputs = stateInput,outputs = muSig)
        actorOPT = optimizers.Adam(learning_rate = self.actorLR)
        model.compile(optimizer = actorOPT,loss = self.aLoss())
        return model

    def aLoss(self):
        
        def distProb(mu,sig,x):
            dist = 1/(tf.sqrt(2*np.pi)*sig)
            prob = dist*tf.exp(-tf.square(x-mu)/(2*tf.square(sig)))
            return prob
            
        def loss(y_true,y_pred):
            # y_true packs [actions, oldPiMu, oldPiSig, advantage]
            # y_pred: muSigma = self.actor(state)
            # unpack y_true
            actions = y_true[:,0]
            advantage = y_true[:,-1]
            oldPiSig = y_true[:,-2]
            oldPiMu = y_true[:,-3]

            piProb = distProb(y_pred[:,0],y_pred[:,1],actions)
            oldpiProb = distProb(oldPiMu,oldPiSig,actions)
            ratio = piProb/(oldpiProb+1e-5)
            surr = ratio * advantage
            clipValue = tf.clip_by_value(ratio,1. - self.EPSILON,1. + self.EPSILON) * advantage # clipped surrogate
            
            loss = -tf.reduce_mean(tf.minimum(surr,clipValue))
            return loss
        return loss

    def trainActor(self,states,actions,discountedR,epochs):
        # Train Actor
        # states: Buffer States
        # actions: Buffer Actions
        # discountedR: Discounted Rewards
        # Epochs: just Epochs
        
        states = np.asarray(states)
        actions = np.asarray(actions)

        criticV = self.critic.predict(states)
        advantage = copy.deepcopy(discountedR - criticV)
        oldPiMuSig = copy.deepcopy(self.actor.predict(states))
        y_true = np.append(actions,oldPiMuSig,axis=1) 
        y_true = np.append(y_true,advantage,axis = 1) # y_true is packed here into shape (32,4)
        
        self.actor.fit(x = states,y = y_true,epochs = epochs,verbose = 0)
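As a sanity check on the packing order, here is a minimal standalone sketch (plain NumPy, with a made-up batch size of 32) showing how the columns of y_true line up with the slicing done inside loss():

import numpy as np

batchSize = 32
actions = np.random.randn(batchSize, 1)      # shape (32, 1)
oldPiMuSig = np.random.randn(batchSize, 2)   # columns: [oldPiMu, oldPiSig]
advantage = np.random.randn(batchSize, 1)    # shape (32, 1)

y_true = np.append(actions, oldPiMuSig, axis=1)
y_true = np.append(y_true, advantage, axis=1)

# Column layout matches the slicing in loss():
# y_true[:, 0]  -> actions
# y_true[:, -3] -> oldPiMu
# y_true[:, -2] -> oldPiSig
# y_true[:, -1] -> advantage
print(y_true.shape)  # (32, 4)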

Links used in this article:

ProximalPolicyOptimizationContinuousKeras
Custom loss function is not working #43650
Custom loss function in Keras based on the input data

Links that were not used this time but may still be useful as references:

TypeError: Cannot convert a symbolic Keras input/output to a numpy array. #47311
Use layer output in keras custom loss
How to Convert Keras Tensor to TensorFlow Tensor?
Keras custom loss function: Accessing current input pattern
Passing additional arguments to objective function #2121
How to write a custom loss function with additional arguments in Keras
Recieve list of all outputs as input to a custom loss function. #14140

Licensed under CC BY-NC-SA 4.0