深海游弋的鱼 – 默默的点滴

Same Convolution Padding

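When I took Andrew Ng's course, I learned that same padding means padding the input with zeros so that the output of the convolution has the same size as the input. In TensorFlow, however, same padding does not work quite that way.

To understand how same padding works in TensorFlow, first consider the one-dimensional case.

Let ni and no denote the input and output sizes, k the kernel size, and s the stride. Under same padding, no is determined by ni and s alone: no = ceil(ni / s)

For example, if ni is 11 and s is 2, then no is 6. If s is 1, the input and output sizes are equal.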
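The output-size rule can be sketched in a few lines of Python. The function name `same_output_size` is made up for illustration and is not part of TensorFlow.

```python
import math


def same_output_size(n_in: int, stride: int) -> int:
    """Output length under same padding: ceil(n_in / stride)."""
    return math.ceil(n_in / stride)


print(same_output_size(11, 2))  # 6
print(same_output_size(11, 1))  # 11
```

Note that the kernel size does not appear anywhere: under same padding it only affects how much padding is added, not the output size.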

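With the output size no fixed, the next step is to work out how much padding the input ni needs in order to produce it. That is, we need a total padding pi that satisfies:

no = ceil((ni + pi - k + 1) / s)

The goal is to find the smallest such pi, since more than one value of pi satisfies the formula above.

In general, ceil(x / a) = b with a > 0 means b - 1 < x / a <= b. The smallest integer x that satisfies this is x = a(b - 1) + 1: since a(b - 1) < x <= ab and a and b are integers, the smallest possible x is a(b - 1) + 1.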
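The claim about the smallest x can be checked by brute force. This is only a sanity check over a small range of integers, not a proof, and the helper name `smallest_x` is an assumption of this sketch.

```python
import math


def smallest_x(a: int, b: int) -> int:
    """Smallest integer x with ceil(x / a) == b, found by linear search."""
    x = a * (b - 2)  # ceil(x / a) is b - 2 here, safely below the target
    while math.ceil(x / a) != b:
        x += 1
    return x


for a in range(1, 6):
    for b in range(1, 6):
        assert smallest_x(a, b) == a * (b - 1) + 1

print(smallest_x(2, 6))  # 11
```

Because ceil(x / a) never decreases as x grows, the first x the search hits is the smallest one.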

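The same holds for pi, taking x = ni + pi - k + 1, a = s and b = no: ni + pi - k + 1 = s(no - 1) + 1.

Therefore pi = s(no - 1) + k - ni.

For the example above, ni = 11, s = 2 and no = 6, so pi = k - 1. You can work through the convolution by hand for various values of k to check this.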
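Instead of working through it by hand, the example can be checked with a short script. It counts kernel positions on the padded input with the same ceil((n - k + 1) / s) expression used in the derivation; the helper name `conv_output_size` is hypothetical.

```python
import math


def conv_output_size(n_padded: int, k: int, s: int) -> int:
    """Number of kernel positions on an already padded input."""
    return math.ceil((n_padded - k + 1) / s)


n_in, s, n_out = 11, 2, 6
for k in range(1, 8):
    p = s * (n_out - 1) + k - n_in
    assert p == k - 1
    # With p elements of padding the output has the target size ...
    assert conv_output_size(n_in + p, k, s) == n_out
    # ... and with one element fewer it falls short, so p is minimal.
    assert conv_output_size(n_in + p - 1, k, s) == n_out - 1

print("pi = k - 1 holds for k = 1..7")
```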

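However, this formula can yield a negative pi. For example, with ni = 10, s = 2 and k = 1, no padding is needed to get no = 5, yet the formula gives pi = -1. Dropping one element from the input would produce the same output, but doing so is pointless. So the formula for pi is amended to:

pi = max(s(no - 1) + k - ni, 0)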

This formula splits into two cases: ni divisible by s, and ni not divisible by s.

When mod(ni, s) = 0

Here ni / s = no, so pi = max(k - s, 0).

When mod(ni, s) != 0

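Here ni can be written as follows (in effect splitting ni into a part that is divisible by s and a remainder):

ni = s * ceil(ni / s) - s * (ceil(ni / s) - floor(ni / s)) + (ni mod s)

Because mod(ni, s) != 0, the middle bracket equals 1, so this simplifies to:

ni = s * ceil(ni / s) - s + (ni mod s)

Rearranging gives no = ceil(ni / s) = (ni + s - (ni mod s)) / s. Substituting this into pi gives:

pi = max(k - (ni mod s), 0)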

Summarizing the two cases, the total same padding is:

pi = max(k - s, 0)             if ni mod s = 0
pi = max(k - (ni mod s), 0)    otherwise
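As a rough check, the two-case rule can be compared against the general clamped expression max(s(no - 1) + k - ni, 0) over a range of small sizes. Both function names below are invented for this sketch.

```python
import math


def same_pad_total(n_in: int, k: int, s: int) -> int:
    """Total same padding along one dimension, using the two-case rule."""
    if n_in % s == 0:
        return max(k - s, 0)
    return max(k - (n_in % s), 0)


def same_pad_general(n_in: int, k: int, s: int) -> int:
    """Total same padding from the clamped general formula."""
    n_out = math.ceil(n_in / s)
    return max(s * (n_out - 1) + k - n_in, 0)


for n_in in range(1, 30):
    for k in range(1, 8):
        for s in range(1, 6):
            assert same_pad_total(n_in, k, s) == same_pad_general(n_in, k, s)

print(same_pad_total(10, 1, 2))  # 0
print(same_pad_total(11, 3, 2))  # 2
```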

With the one-dimensional case clear, here is the code for the two-dimensional case:

from math import ceil

# strides is indexed as [batch, height, width, channels],
# so strides[1] and strides[2] are the vertical and horizontal strides.

# First determine the output dimensions; note the ceiling.
out_height = ceil(float(in_height) / float(strides[1]))
out_width  = ceil(float(in_width) / float(strides[2]))

# Total padding along each dimension, using the two-case formula above.
if (in_height % strides[1] == 0):
  pad_along_height = max(filter_height - strides[1], 0)
else:
  pad_along_height = max(filter_height - (in_height % strides[1]), 0)
if (in_width % strides[2] == 0):
  pad_along_width = max(filter_width - strides[2], 0)
else:
  pad_along_width = max(filter_width - (in_width % strides[2]), 0)

# The padding is split between top/bottom and left/right,
# so an odd total cannot be divided evenly.
# When the total is odd, the bottom gets one more than the top
# and the right gets one more than the left.
# Note that this is different from existing libraries such as cuDNN and Caffe, which explicitly specify the number of padded pixels and always pad the same number of pixels on both sides.
pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left
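The snippet above assumes that in_height, strides and the other variables already exist. A self-contained version might look like the following; the function name `same_padding_2d` and its flat argument list are assumptions made for this sketch, not a TensorFlow API.

```python
def same_padding_2d(in_height, in_width, filter_height, filter_width,
                    stride_h, stride_w):
    """Return (pad_top, pad_bottom, pad_left, pad_right) for same padding."""

    def total_pad(n_in, k, s):
        # Two-case rule from the one-dimensional derivation.
        remainder = n_in % s
        return max(k - (s if remainder == 0 else remainder), 0)

    pad_along_height = total_pad(in_height, filter_height, stride_h)
    pad_along_width = total_pad(in_width, filter_width, stride_w)

    # An odd total puts the extra element on the bottom / right.
    pad_top = pad_along_height // 2
    pad_left = pad_along_width // 2
    return (pad_top, pad_along_height - pad_top,
            pad_left, pad_along_width - pad_left)


print(same_padding_2d(11, 10, 4, 3, 2, 2))  # (1, 2, 0, 1)
```

In this example the height needs 3 elements of padding, split 1 on top and 2 on the bottom, while the width needs 1 element, which goes entirely to the right.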

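In practice, when building a network you do not need to care exactly how the padding is applied. What matters is knowing that under same padding no = ceil(ni / s), so that you can work out the output dimensions.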

Valid Convolution Padding

This one is simple: with valid padding, the input is not padded at all before the convolution. The output size is given by:

no = ceil( (ni - k + 1) / s )

For the two-dimensional case:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
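The valid-padding formula can be compared with a direct count of kernel positions. Both helper names are hypothetical, and the sketch assumes the input is at least as long as the kernel.

```python
import math


def valid_output_size(n_in: int, k: int, s: int) -> int:
    """Output length under valid padding: ceil((n_in - k + 1) / s)."""
    return math.ceil((n_in - k + 1) / s)


def count_windows(n_in: int, k: int, s: int) -> int:
    """Count start positions where the kernel fits entirely in the input."""
    return len(range(0, n_in - k + 1, s))


for n_in in range(1, 30):
    for k in range(1, n_in + 1):  # assumes k <= n_in
        for s in range(1, 6):
            assert valid_output_size(n_in, k, s) == count_windows(n_in, k, s)

print(valid_output_size(11, 3, 2))  # 5
```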

Reference

Tensorflow中same padding和valid padding (Same padding and valid padding in TensorFlow)

Posted by

默默
