Convolutional 2D Layer
Implementing a simple 2D convolution layer with padding and stride, including output shape formulas.
Problem
Implement a 2D convolutional layer that takes an input matrix and applies a kernel (filter) to produce an output feature map. The operation should support configurable padding and stride parameters to control the spatial dimensions of the output.
Code
import numpy as np

def simple_conv2d(input_matrix: np.ndarray, kernel: np.ndarray, padding: int, stride: int):
    input_height, input_width = input_matrix.shape
    kernel_height, kernel_width = kernel.shape
    # Zero-pad all four sides; dimensions become (h + 2p) x (w + 2p)
    padded_input = np.pad(input_matrix, ((padding, padding), (padding, padding)), mode='constant')
    input_height_padded, input_width_padded = padded_input.shape
    # Number of kernel positions that fit along each axis
    output_height = (input_height_padded - kernel_height) // stride + 1
    output_width = (input_width_padded - kernel_width) // stride + 1
    output_matrix = np.zeros((output_height, output_width))
    for i in range(output_height):
        for j in range(output_width):
            # Window whose top-left corner is at (i*stride, j*stride)
            region = padded_input[i*stride:i*stride+kernel_height, j*stride:j*stride+kernel_width]
            # Element-wise product with the kernel, summed to one output value
            output_matrix[i, j] = np.sum(region * kernel)
    return output_matrix

Key Parameters
Image Credits: https://www.geeksforgeeks.org/machine-learning/cnn-introduction-to-padding/
- Padding (p)
  - Adds p rows and columns of zeros on each side of the input
  - Larger padding yields a larger output feature map
- Stride (s)
  - Step size, in elements, for sliding the kernel
  - Larger stride means smaller output dimensions
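To make the roles of padding and stride concrete, the sketch below lists where the kernel window starts along a single axis of the padded input. The helper name `window_starts` is hypothetical and not part of the implementation above; it only mirrors the `i*stride` indexing used there.

```python
def window_starts(size: int, kernel: int, padding: int, stride: int) -> list[int]:
    """Start indices of each kernel window along one axis of the padded input.

    Hypothetical helper for illustration; indices refer to the padded axis.
    """
    padded = size + 2 * padding      # zeros are added on both sides
    last_start = padded - kernel     # last position where the kernel still fits
    return list(range(0, last_start + 1, stride))


# Axis of length 5, kernel of length 3
print(window_starts(5, 3, padding=0, stride=1))  # [0, 1, 2]
print(window_starts(5, 3, padding=1, stride=1))  # [0, 1, 2, 3, 4]
print(window_starts(5, 3, padding=1, stride=2))  # [0, 2, 4]
```

The number of start positions is the output size along that axis: padding adds positions, and a larger stride skips some of them.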
Formulas
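Given:
- Input dimensions: $h \times w$ (height × width)
- Kernel dimensions: $k_h \times k_w$ (kernel height × width)
- Padding: $p$
- Stride: $s$
- Padded input dimensions: $(h + 2p) \times (w + 2p)$
- Output dimensions: $\left(\left\lfloor \frac{h + 2p - k_h}{s} \right\rfloor + 1\right) \times \left(\left\lfloor \frac{w + 2p - k_w}{s} \right\rfloor + 1\right)$
Where $\lfloor \cdot \rfloor$ denotes the floor function (integer division).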
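As a quick check of the output-dimension formula, here is a minimal sketch that evaluates it with Python integer division. The function name `conv2d_output_shape` and the argument names `k_h` and `k_w` (kernel height and width) are assumptions chosen for this example; the arithmetic matches the `output_height` and `output_width` lines in the code above.

```python
def conv2d_output_shape(h, w, k_h, k_w, p, s):
    """Output (height, width) of a 2D convolution, using the floor formula.

    Illustrative helper: h, w are input sizes, k_h, k_w kernel sizes,
    p the padding on each side, s the stride.
    """
    out_h = (h + 2 * p - k_h) // s + 1
    out_w = (w + 2 * p - k_w) // s + 1
    return out_h, out_w


print(conv2d_output_shape(4, 4, 3, 3, p=1, s=1))  # (4, 4)
print(conv2d_output_shape(4, 4, 3, 3, p=1, s=2))  # (2, 2)
print(conv2d_output_shape(5, 7, 3, 3, p=0, s=2))  # (2, 3)
```

In the first call, a 3×3 kernel with padding 1 and stride 1 leaves the spatial size unchanged. In the second, (4 + 2 - 3) / 2 = 1.5 is floored to 1 before adding 1, giving 2 along each axis.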