Convolutional 2D Layer

27-01-2026 · convolution · deep-learning

Implementing a simple 2D convolution layer with padding and stride, including output shape formulas.

Problem

Implement a 2D convolutional layer that takes an input matrix and applies a kernel (filter) to produce an output feature map. The operation should support configurable padding and stride parameters to control the spatial dimensions of the output.

Code

import numpy as np
 
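def simple_conv2d(input_matrix: np.ndarray, kernel: np.ndarray, padding: int, stride: int) -> np.ndarray:
    """Slide `kernel` over a zero-padded 2D `input_matrix` and return the output feature map."""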
    input_height, input_width = input_matrix.shape
    kernel_height, kernel_width = kernel.shape
 
    # Zero-pad all four sides; the padded shape is (h + 2p) x (w + 2p)
    padded_input = np.pad(input_matrix, ((padding, padding), (padding, padding)), mode='constant')
 
    input_height_padded, input_width_padded = padded_input.shape
 
    output_height = (input_height_padded - kernel_height) // stride + 1
    output_width = (input_width_padded - kernel_width) // stride + 1
    output_matrix = np.zeros((output_height, output_width))
 
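    for i in range(output_height):
        for j in range(output_width):
            # Window of the padded input whose top-left corner is (i*stride, j*stride)
            region = padded_input[i*stride:i*stride+kernel_height, j*stride:j*stride+kernel_width]
            # Element-wise product summed over the window. The kernel is not flipped,
            # so this is cross-correlation, the usual deep-learning convention.
            output_matrix[i, j] = np.sum(region * kernel)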
    
    return output_matrix
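As a cross-check on the loop version above, the same sliding-window sum can be sketched without explicit Python loops. This is only an illustrative alternative, not part of the original solution: the name `conv2d_vectorized` is hypothetical, and it assumes NumPy 1.20 or newer for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_vectorized(input_matrix, kernel, padding=0, stride=1):
    """Hypothetical vectorized counterpart of the loop-based layer."""
    # Zero-pad every side of the 2D input by `padding`
    padded = np.pad(input_matrix, padding, mode="constant")
    # All stride-1 windows: shape (H_p - K_h + 1, W_p - K_w + 1, K_h, K_w)
    windows = sliding_window_view(padded, kernel.shape)
    # Keep every `stride`-th window along both spatial axes
    windows = windows[::stride, ::stride]
    # Multiply each window by the kernel and sum over the kernel axes
    return np.einsum("ijkl,kl->ij", windows, kernel)


x = np.arange(1, 17, dtype=float).reshape(4, 4)
k = np.array([[1.0, 0.0], [0.0, -1.0]])

out = conv2d_vectorized(x, k, padding=0, stride=1)
print(out.shape)  # (3, 3)
print(out)        # every entry is -5.0, since x[i, j] - x[i+1, j+1] = -5
```

Taking every `stride`-th stride-1 window yields the same number of positions as the floor formula, so the output shape should agree with the loop implementation for any valid padding and stride.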

Key Parameters

Figure: CNN (image credits: https://www.geeksforgeeks.org/machine-learning/cnn-introduction-to-padding/)

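  1. Padding (p)
  • Adds p rows and columns of zeros on every side of the input
  • Increases the size of the output feature map and lets the kernel cover border pixels
  2. Stride (s)
  • Step size, in rows and columns, by which the kernel slides
  • Larger stride means smaller output dimensions (roughly a factor of s per dimension)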
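To make the two parameters concrete, here is a small sketch with made-up values (the 2×2 matrix, the length 7, and the kernel size 3 are illustrative assumptions, not from the original post). It shows what zero padding does to a matrix and which window start positions a given stride produces along one axis.

```python
import numpy as np

# Padding: p = 1 adds one ring of zeros around a 2x2 input, giving 4x4
x = np.array([[1, 2], [3, 4]])
padded = np.pad(x, 1, mode="constant")
print(padded)
# [[0 0 0 0]
#  [0 1 2 0]
#  [0 3 4 0]
#  [0 0 0 0]]

# Stride: along an axis of length 7 with a kernel of size 3, the kernel's
# start offsets advance by s = 2 and must leave room for the whole kernel
length, kernel_size, stride = 7, 3, 2
starts = list(range(0, length - kernel_size + 1, stride))
print(starts)  # [0, 2, 4]
```

The number of start offsets (three here) is the output size along that axis.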

Formulas

Given:

  • Input dimensions: $H \times W$ (height × width)

  • Kernel dimensions: $K_h \times K_w$ (kernel height × width)

  • Padding: $p$

  • Stride: $s$

  • Padded input dimensions:

    $H_{\text{padded}} = H + 2p$

    $W_{\text{padded}} = W + 2p$

  • Output dimensions:

    $H_{\text{out}} = \left\lfloor \frac{H + 2p - K_h}{s} \right\rfloor + 1$

    $W_{\text{out}} = \left\lfloor \frac{W + 2p - K_w}{s} \right\rfloor + 1$

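Here $\lfloor \cdot \rfloor$ denotes the floor function (rounding down to the nearest integer). For the non-negative numerators that arise when the kernel fits inside the padded input, it matches the // integer division used in the code.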
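The output-size formulas can be checked numerically with a small helper. This is a minimal sketch: the function name `conv2d_output_shape` and the example sizes are assumptions chosen for illustration.

```python
def conv2d_output_shape(h, w, k_h, k_w, p, s):
    """Return (H_out, W_out) using the floor formulas above."""
    if s < 1:
        raise ValueError("stride must be a positive integer")
    h_out = (h + 2 * p - k_h) // s + 1
    w_out = (w + 2 * p - k_w) // s + 1
    return h_out, w_out


# 3x3 kernel with p = 1, s = 1 preserves a 5x5 input
print(conv2d_output_shape(5, 5, 3, 3, p=1, s=1))  # (5, 5)
# No padding and s = 2: floor((5 - 3) / 2) + 1 = 2
print(conv2d_output_shape(5, 5, 3, 3, p=0, s=2))  # (2, 2)
# 7x7 input, p = 1, s = 2: floor((7 + 2 - 3) / 2) + 1 = 4
print(conv2d_output_shape(7, 7, 3, 3, p=1, s=2))  # (4, 4)
```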