Image Compression with SVD

"Minimizing the number of singular values used to reconstruct an image"
05 December 2023


We are going to learn how to reduce the size of an image by using Singular Value Decomposition (SVD). With SVD, we are going to look for the smallest number of singular values that can still be used to reconstruct the image.

In this post, we are not going to write SVD from scratch. Instead, we are going to use the SVD function from the NumPy library.

Let's import the libraries that we are going to use.

from scipy import datasets
from numpy import linalg
import numpy as np
import matplotlib.pyplot as plt

Then we load the image of a raccoon from the SciPy library.

raccoon = datasets.face()

Note that a color image like this one consists of 3 layers, where each layer represents the red, green, or blue channel. To see that, let's separate the image into 3 layers.

raccoon_red_channel = raccoon[:, :, 0]
raccoon_green_channel = raccoon[:, :, 1]
raccoon_blue_channel = raccoon[:, :, 2]

Then, we create three empty arrays with the same shape and data type as the image, and we fill each of them with the values from a single channel: red, green, or blue.

# zeros_like keeps the uint8 dtype so imshow renders the 0-255 values correctly
raccoon_red_image = np.zeros_like(raccoon)
raccoon_red_image[:, :, 0] = raccoon_red_channel
raccoon_green_image = np.zeros_like(raccoon)
raccoon_green_image[:, :, 1] = raccoon_green_channel
raccoon_blue_image = np.zeros_like(raccoon)
raccoon_blue_image[:, :, 2] = raccoon_blue_channel
# plot the images
fig, ax = plt.subplots(1, 3, figsize=(15, 5))
ax[0].imshow(raccoon_red_image)
ax[0].set_title('Red channel')
ax[1].imshow(raccoon_green_image)
ax[1].set_title('Green channel')
ax[2].imshow(raccoon_blue_image)
ax[2].set_title('Blue channel')

Raccoon image in three different color channels

Gray Images

Applying SVD to a gray image is pretty straightforward since we only have to deal with one layer, a $768 \times 1024$ matrix where each value represents the intensity of a pixel. $0$ represents black and $255$ represents white.

First, we want to normalize the image so that the values are between $0$ and $1$ by dividing the image by $255$.

normalized_raccoon = raccoon / 255
print(normalized_raccoon.min(), normalized_raccoon.max()) # 0.0 1.0

Once the image is normalized, we multiply it by [0.2989, 0.5870, 0.1140] to convert it to grayscale. We are going to use the @ operator to do the multiplication.

gray_raccoon = normalized_raccoon @ [0.2989, 0.5870, 0.1140]
plt.imshow(gray_raccoon, cmap='gray')

Why [0.2989, 0.5870, 0.1140]?

These values are the weights used to combine the red, green, and blue channels, respectively. This specific set of weights comes from a luminance model (the coefficients match the ITU-R BT.601 luma weights), which takes into account the human eye's sensitivity to different colors. The human eye is most sensitive to green, followed by red, and then blue, which is why the green channel has the highest weight and the blue channel the lowest.
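To make the weighted sum concrete, here is a minimal sketch on a made-up 2×2 image (the array `tiny_rgb` is hypothetical and only stands in for the raccoon). It shows that `@` contracts the last axis, so a `(height, width, 3)` array becomes a `(height, width)` array, and that the three weights add up to almost exactly 1, which keeps white pixels close to 1.

```python
import numpy as np

# Weights quoted in the post, in red, green, blue order
weights = np.array([0.2989, 0.5870, 0.1140])

# Hypothetical 2x2 RGB image with values already scaled to [0, 1]
tiny_rgb = np.array([
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # pure red, pure green
    [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]],  # pure blue, white
])

# (2, 2, 3) @ (3,) sums over the color axis and returns a (2, 2) array
tiny_gray = tiny_rgb @ weights

print(tiny_gray.shape)  # (2, 2)
print(tiny_gray)        # each pure color maps to its own weight
print(weights.sum())    # very close to 1
```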

Then, we apply SVD to the gray image. We should get 3 arrays: the first is a $768 \times 768$ matrix, the second is a vector of $768$ singular values, and the third is a $1024 \times 1024$ matrix.

U, s, Vt = linalg.svd(gray_raccoon)
print(U.shape, s.shape, Vt.shape) # (768, 768) (768,) (1024, 1024)
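As a side note, `numpy.linalg.svd` also accepts `full_matrices=False`, which returns only the parts of the factors that the singular values actually touch. The sketch below uses a small random matrix (a hypothetical stand-in for the image) to compare the shapes of both variants; it is an illustration, not what the post itself uses.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.random((6, 8))  # hypothetical stand-in for the 768 x 1024 image

# Default: square U and Vt, as in the post
U_full, s_full, Vt_full = np.linalg.svd(small)

# Economy-size variant: Vt keeps only min(rows, cols) rows
U_thin, s_thin, Vt_thin = np.linalg.svd(small, full_matrices=False)

print(U_full.shape, s_full.shape, Vt_full.shape)  # (6, 6) (6,) (8, 8)
print(U_thin.shape, s_thin.shape, Vt_thin.shape)  # (6, 6) (6,) (6, 8)

# The singular values come back sorted from largest to smallest
print(np.all(np.diff(s_full) <= 0))
```

With the economy-size factors, `(U_thin * s_thin) @ Vt_thin` rebuilds the matrix without needing a separate rectangular sigma matrix.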

Once we have the sigma values s, we create another empty $768 \times 1024$ matrix and fill its diagonal with the sigma values.

sigma = np.zeros(gray_raccoon.shape)
np.fill_diagonal(sigma, s)
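A quick sanity check for this step: if all singular values are kept, multiplying the three factors back together should return the original matrix up to floating-point error. The sketch below assumes a small random matrix in place of the image.

```python
import numpy as np

rng = np.random.default_rng(1)
small = rng.random((4, 6))  # hypothetical stand-in for the gray image

U, s, Vt = np.linalg.svd(small)

# Rectangular sigma with the singular values on its main diagonal
sigma = np.zeros(small.shape)
np.fill_diagonal(sigma, s)

# With every singular value kept, the product reproduces the input
reconstructed = U @ sigma @ Vt
print(np.allclose(reconstructed, small))  # True
```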

In order to reconstruct the image, we need to pick the number of singular values that we want to use. Say that we want to use 100 singular values, then:

approximation = U @ sigma[:, :100] @ Vt[:100, :]
plt.imshow(approximation, cmap='gray')
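Picking 100 is a guess. One common way to choose the number of singular values, sketched here as an assumption rather than as the method of this post, is to keep the smallest count whose squared singular values cover a chosen fraction of the total. The helper names `rank_k_approximation` and `smallest_rank_for_energy`, the 0.99 threshold, and the random test matrix are all hypothetical.

```python
import numpy as np

def rank_k_approximation(U, s, Vt, k):
    """Rebuild a matrix from its k largest singular values."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def smallest_rank_for_energy(s, energy=0.99):
    """Smallest k whose squared singular values hold at least `energy` of the total."""
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))  # guard against rounding when energy is close to 1

rng = np.random.default_rng(2)
# Hypothetical test matrix: rank 3 plus a little noise
A = rng.random((20, 3)) @ rng.random((3, 30)) + 1e-3 * rng.random((20, 30))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = smallest_rank_for_energy(s, energy=0.99)
A_k = rank_k_approximation(U, s, Vt, k)

# The Frobenius norm of the residual equals the norm of the discarded singular values
error = np.linalg.norm(A - A_k)
predicted = np.sqrt(np.sum(s[k:] ** 2))
print(k, error, predicted)
```

The last two numbers agree, which means the error of a truncated reconstruction can be read directly from the singular values that were dropped, without rebuilding the image.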

Let's compare the original image with the reconstructed image with 100 singular values.

fig, ax = plt.subplots(1, 2, figsize=(15, 5))
ax[0].imshow(gray_raccoon, cmap='gray')
ax[0].set_title('Original image')
ax[1].imshow(approximation, cmap='gray')
ax[1].set_title('Approximation with 100 singular values')

Original image vs reconstructed image with 100 singular values

I am sure you would not be able to tell the difference between the original image and the reconstructed image unless you zoom in multiple times; then you will notice that the reconstructed image is a bit blurry. The only significant difference is the size of the two images: the original image is 802 KB and the reconstructed image is 505 KB.
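The file sizes above depend on how the images were saved. As a rough complement, the sketch below counts raw numbers instead: a full $768 \times 1024$ matrix versus the pieces needed for a truncated reconstruction (k columns of U, k singular values, and k rows of Vt). The helper `stored_values` is a hypothetical name used only for this illustration.

```python
def stored_values(rows, cols, k):
    """Numbers needed to keep k columns of U, k singular values, and k rows of Vt."""
    return k * (rows + cols + 1)

rows, cols, k = 768, 1024, 100

full = rows * cols                        # every pixel of the gray image
truncated = stored_values(rows, cols, k)  # truncated SVD factors

print(full)              # 786432
print(truncated)         # 179300
print(truncated / full)  # a bit under 0.23
```

By this count, keeping 100 singular values needs less than a quarter of the numbers in the original matrix, although the actual saving on disk depends on the file format.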

Colored Images

To be continued ...