Some notes on a convolution course

What is padding?

Padding means adding extra pixels around the border of the original image. For example, a 6 x 6 image becomes an 8 x 8 image if we add one pixel on every side.
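The 6 x 6 to 8 x 8 example can be checked with a small sketch. It assumes NumPy and zero padding (the notes do not say which value the added pixels take), and the `image` values are made up for illustration.

```python
import numpy as np

# A hypothetical 6 x 6 image filled with the values 1..36.
image = np.arange(1, 37).reshape(6, 6)

# Add one pixel of zeros on every side (padding p = 1).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(image.shape)   # (6, 6)
print(padded.shape)  # (8, 8)
```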

Valid convolution vs. same convolution

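Valid convolution uses no padding, so the output image contains only the pixels we actually get when we convolve the original image with the filter. With stride 1, an n x n image convolved with an f x f filter gives an (n - f + 1) x (n - f + 1) output.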

Same convolution means adding padding so that the output image has the same size as its input image.
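As a rough sketch of the difference, the two helpers below compute the valid output size and the padding a same convolution needs. They assume stride 1 and an odd filter size; the function names are hypothetical and chosen only for this example.

```python
def valid_output_size(n, f):
    """Output size of a valid convolution (no padding, stride 1)."""
    return n - f + 1


def same_padding(f):
    """Padding per side that keeps the output size equal to the input size.

    Assumes stride 1 and an odd filter size f.
    """
    if f % 2 == 0:
        raise ValueError("this sketch assumes an odd filter size")
    return (f - 1) // 2


n, f = 6, 3
p = same_padding(f)

print(valid_output_size(n, f))  # 4
print(p)                        # 1
print(n + 2 * p - f + 1)        # 6, same as the input size
```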

What is strided convolution?

When we convolve an image with a filter matrix, we select a rectangular area of the image with the same size as the filter matrix.

We multiply this area by the filter matrix element by element and sum the products to get one output pixel.

The stride is the number of pixels we move the rectangular area to the right or down at each step. The default stride is 1, which means the area moves one pixel at a time.

output size = floor((n + 2p - f) / s) + 1, where n is the input size, p is the padding, f is the filter size, and s is the stride.
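A minimal sketch of a strided convolution, assuming NumPy, a square filter, and zero padding. Like most deep learning code, it does not flip the filter (strictly speaking this is cross-correlation). The names `conv_output_size` and `conv2d` are hypothetical helpers for this example.

```python
import numpy as np


def conv_output_size(n, f, p=0, s=1):
    """floor((n + 2p - f) / s) + 1"""
    return (n + 2 * p - f) // s + 1


def conv2d(image, kernel, padding=0, stride=1):
    """Slide a square kernel over a 2-D image with the given padding and stride."""
    f = kernel.shape[0]
    padded = np.pad(image, padding, mode="constant")
    out_h = conv_output_size(image.shape[0], f, padding, stride)
    out_w = conv_output_size(image.shape[1], f, padding, stride)
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Top-left corner of the current rectangular area.
            r, c = i * stride, j * stride
            window = padded[r:r + f, c:c + f]
            # Element-wise multiply, then sum to get one output pixel.
            out[i, j] = np.sum(window * kernel)
    return out


image = np.arange(49, dtype=float).reshape(7, 7)
kernel = np.ones((3, 3))
result = conv2d(image, kernel, padding=0, stride=2)

print(result.shape)  # (3, 3), since floor((7 + 0 - 3) / 2) + 1 = 3
```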

How to convolve over volumes?

An RGB image has three layers: red, green, and blue.

A volume is a stack of 2-D matrices, so it looks like a cube. Both the RGB image and the filter are volumes: a filter in the first layer has three walls, one for each of the R, G, and B channels.

Let's say the first layer has only one filter of size 3 x 3, and this filter has three walls, which correspond to R, G, and B.

To convolve an RGB image with this filter, we multiply each of the three filter walls element by element with the matching channel of the image: the first wall with the red channel, the second wall with the green channel, and the third wall with the blue channel. The first number of the output matrix is the sum of all these products.

We calculate the next output number in the same way, and so on until we reach the last number.
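The steps above can be sketched as follows. This assumes NumPy, a channels-last layout (height, width, channel), no padding, and stride 1; `conv_volume` is a hypothetical name, and the all-ones image and filter are made-up data.

```python
import numpy as np


def conv_volume(image, kernel):
    """Convolve an (H, W, C) image with one (f, f, C) filter.

    The result is a single 2-D matrix: the products from all
    channels are summed into one number per position.
    """
    h, w, c = image.shape
    f = kernel.shape[0]
    assert kernel.shape[2] == c, "filter needs one wall per image channel"
    out = np.zeros((h - f + 1, w - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = image[i:i + f, j:j + f, :]
            # Multiply each wall with its channel, then sum everything.
            out[i, j] = np.sum(window * kernel)
    return out


rgb = np.ones((6, 6, 3))     # hypothetical RGB image, all ones
kernel = np.ones((3, 3, 3))  # one 3 x 3 filter with three walls
out = conv_volume(rgb, kernel)

print(out.shape)  # (4, 4)
print(out[0, 0])  # 27.0, i.e. 3 * 3 * 3 products of 1 * 1
```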

Why are filters usually odd-sized?

For an odd-sized filter, the previous-layer pixels are arranged symmetrically around the output pixel, so the filter has a central pixel. Without this symmetry, we have to account for distortions across the layers, which happens when using an even-sized kernel. Therefore, even-sized kernels are mostly skipped to keep the implementation simple.

What is max pooling and how does it work?

A pool is a square area with a given size, such as 2 x 2 or 4 x 4.

Max pooling builds a smaller matrix from the source matrix by splitting the source matrix into small pool-sized areas, selecting the largest number from each area, and arranging these largest numbers into a new matrix in the original order.
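A small sketch of this splitting and selecting, assuming NumPy, non-overlapping pools (stride equal to the pool size), and that any leftover rows or columns that do not fill a whole pool are dropped. `max_pool` is a hypothetical helper and the matrix values are made up.

```python
import numpy as np


def max_pool(matrix, size=2):
    """Split a 2-D matrix into size x size areas and keep the max of each."""
    out_h = matrix.shape[0] // size
    out_w = matrix.shape[1] // size
    out = np.zeros((out_h, out_w), dtype=matrix.dtype)
    for i in range(out_h):
        for j in range(out_w):
            area = matrix[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = area.max()
    return out


m = np.array([[1, 3, 2, 1],
              [4, 6, 5, 7],
              [8, 2, 0, 1],
              [3, 4, 9, 6]])

pooled = max_pool(m, size=2)
print(pooled)
# [[6 7]
#  [8 9]]
```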

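Max pooling is done in part to help reduce over-fitting by providing an abstracted form of the representation. It also reduces the computational cost: the pooling step itself has no parameters to learn, but it shrinks the feature maps, so later layers have less to compute and fewer parameters to learn.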
