Categorical Crossentropy Source Code Analysis
Source:
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
print("--------output-----------")
target = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 1.], shape=[3,3])
print("target: \n",target.eval())
output = tf.constant([.9, .05, .05, .05, .89, .06, .05, .01, .94], shape=[3,3])
print("output:\n ",output.eval())
loss = tf.keras.backend.categorical_crossentropy(target, output)
print("loss: \n",loss.eval()) # Output: [0.10536 0.11653 0.06188]
Output:
--------output-----------
target:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
output:
[[0.9 0.05 0.05]
[0.05 0.89 0.06]
[0.05 0.01 0.94]]
loss:
[0.10536055 0.11653383 0.06187541]
Question: in the last line of the output, where does the first loss value, 0.10536055, come from?
The short answer:
1. Rescale output so that each row sums to 1, giving a new output;
2. Take the elementwise natural log log_e(x) of the new output; call the result log_output;
3. Multiply target and log_output elementwise, sum each row, and negate the result.
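The three steps above can be reproduced with plain NumPy on the article's data (a sketch that mirrors the description, not TF's actual code):

```python
import numpy as np

target = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
output = np.array([[.9, .05, .05], [.05, .89, .06], [.05, .01, .94]])

# 1. Rescale so each row sums to 1 (here the rows already do).
output = output / output.sum(axis=-1, keepdims=True)
# 2. Elementwise natural log.
log_output = np.log(output)
# 3. Multiply by target, sum each row, negate.
loss = -(target * log_output).sum(axis=-1)
print(loss)  # ≈ [0.10536052 0.11653382 0.0618754 ]
```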
Still unclear? Let's walk through it line by line.
Line-by-Line Source Analysis
The source of categorical_crossentropy is as follows:
def categorical_crossentropy(target, output, from_logits=False, axis=-1):
# ...
    # (the from_logits=True branch is omitted here)
# ...
# scale preds so that the class probas of each sample sum to 1
output = output / math_ops.reduce_sum(output, axis, True)
# Compute cross entropy from probabilities.
epsilon_ = _constant_to_tensor(epsilon(), output.dtype.base_dtype)
output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
return -math_ops.reduce_sum(target * math_ops.log(output), axis)
Now let's analyze it line by line:
1. First, the line output = output / math_ops.reduce_sum(output, axis, True)
rescales output so that each row sums to 1: every element is divided by the sum of the row (along axis) it belongs to.
axis = -1
from tensorflow.python.ops import math_ops,clip_ops
from tensorflow.python.framework import constant_op
reduce_rs = math_ops.reduce_sum(output, axis, True)
print("math_ops.reduce_sum:\n",reduce_rs.eval()) # each innermost row already sums to 1
output = output / reduce_rs
print("rescale output:\n",output.eval()) # dividing each element by 1 leaves the values unchanged
Output:
math_ops.reduce_sum:
[[1.]
[1.]
[1.]]
rescale output:
[[0.9 0.05 0.05]
[0.05 0.89 0.06]
[0.05 0.01 0.94]]
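In this example the rows already sum to 1, so the rescaling is a no-op. A minimal NumPy sketch (not from the original post) with an unnormalized row shows it doing real work:

```python
import numpy as np

# Unnormalized scores: this row sums to 2.0, not 1.0.
output = np.array([[0.6, 0.8, 0.6]])
rescaled = output / output.sum(axis=-1, keepdims=True)
print(rescaled)        # [[0.3 0.4 0.3]]
print(rescaled.sum())  # 1.0
```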
2. epsilon_ = _constant_to_tensor(epsilon(), output.dtype.base_dtype)
This line returns a very small constant; in the source it is 0.0000001, i.e. 1e-7.
The next line, output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_),
clips output: values smaller than epsilon_ are set to epsilon_, and values larger than 1 - epsilon_ are set to 1 - epsilon_.
epsilon_ = 1e-7
output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
print("clip_ops.clip_by_value: \n",output.eval())
Since our data contains no value smaller than 1e-7 and none larger than 1 - 1e-7, the output is unchanged:
clip_ops.clip_by_value:
[[0.9 0.05 0.05]
[0.05 0.89 0.06]
[0.05 0.01 0.94]]
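Here the clipping changes nothing, but it is what keeps the loss finite when a predicted probability is exactly 0, since log(0) is -inf. A small sketch using np.clip as a stand-in for clip_by_value (an assumption for illustration, not TF's code):

```python
import numpy as np

eps = 1e-7
# A predicted probability of exactly 0 would give log(0) = -inf.
output = np.array([1.0, 0.0])
clipped = np.clip(output, eps, 1.0 - eps)
print(clipped)          # [1.0 - 1e-7, 1e-7]
print(np.log(clipped))  # finite values, roughly [-1e-7, -16.118]
```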
3. The last line, -math_ops.reduce_sum(target * math_ops.log(output), axis),
takes the elementwise log of output, multiplies it by target, sums along the last axis, and returns the negated result.
It breaks down into the following steps:
# 1. Compute log_e(x) for each element
math_log = tf.math.log(output) # log_e(x); math_ops.log is the same as tf.math.log, see https://www.tensorflow.org/api_docs/python/tf/math/log
print("tf.math.log:\n",math_log.eval())
# 2.与target相乘
target_log = target * math_log
print("tar_log_out:\n",target_log.eval())
# 3. Sum along the last axis and negate
return_rs = - math_ops.reduce_sum(target_log, axis)
print("return_rs:\n",return_rs.eval())
Output:
tf.math.log:
[[-0.10536055 -2.9957323 -2.9957323 ] # note the first value: log_e(0.9) = -0.10536051565783
[-2.9957323 -0.11653383 -2.8134108 ]
[-2.9957323 -4.6051702 -0.06187541]]
tar_log_out:
[[-0.10536055 -0. -0. ]
[-0. -0.11653383 -0. ]
[-0. -0. -0.06187541]]
return_rs:
[0.10536055 0.11653383 0.06187541]
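Because target is one-hot, the product target * log_output keeps only the log-probability of the true class in each row, so each row's loss collapses to -log_e of the probability assigned to the correct class. A quick stdlib check (the three probabilities are taken from the example above):

```python
import math

# Probability assigned to the correct class in each row of output.
probs_true_class = [0.9, 0.89, 0.94]
losses = [-math.log(p) for p in probs_true_class]
print(losses)  # ≈ [0.10536, 0.11653, 0.06188], matching the TF output
```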
Appendix: the full code.
Environment versions:
python version: 3.7.4
numpy version: 1.19.4
tensorflow version: 1.14.0
keras version: 2.1.5
Full code:
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
print("--------output-----------")
target = tf.constant([1., 0., 0., 0., 1., 0., 0., 0., 1.], shape=[3,3])
print("target: \n",target.eval())
output = tf.constant([.9, .05, .05, .05, .89, .06, .05, .01, .94], shape=[3,3])
print("output:\n ",output.eval())
loss = tf.keras.backend.categorical_crossentropy(target, output)
print("loss: \n",loss.eval()) # Output: [0.10536 0.11653 0.06188]
# --------categorical_crossentropy line-by-line analysis-------------
axis = -1
from tensorflow.python.ops import math_ops,clip_ops
from tensorflow.python.framework import constant_op
reduce_rs = math_ops.reduce_sum(output, axis, True)
print("math_ops.reduce_sum:\n",reduce_rs.eval()) # each innermost row already sums to 1
output = output / reduce_rs
print("rescale output:\n",output.eval()) # dividing each element by 1 leaves the values unchanged
epsilon_ = 1e-7
output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
print("clip_ops.clip_by_value: \n",output.eval())
# -math_ops.reduce_sum(target * math_ops.log(output), axis)
# 1. Compute log_e(x) for each element
math_log = tf.math.log(output) # log_e(x); math_ops.log is the same as tf.math.log, see https://www.tensorflow.org/api_docs/python/tf/math/log
print("tf.math.log:\n",math_log.eval())
# 2. Multiply by target elementwise
target_log = target * math_log
print("tar_log_out:\n",target_log.eval())
# 3. Sum along the last axis and negate
return_rs = - math_ops.reduce_sum(target_log, axis)
print("return_rs:\n",return_rs.eval())
References:
1. Official categorical_crossentropy documentation
2. TensorFlow categorical_crossentropy source on GitHub