A perspective transformation is the operation we use when we want to change the perspective from which an object in an image is viewed.

To put it simply, let’s say we have a sheet of paper on a table and we’re capturing it with a camera.

Sheet paper

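As you can see in the picture above, the part of the paper closer to the camera looks bigger than the part that is farther away.
That’s also how our eyes work: closer objects look bigger than ones that are farther away. A perspective transformation lets us undo this effect, so the sheet looks as if it were viewed from directly above.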

How do we do a perspective transformation?

Let’s now quickly go through the Python code that performs a perspective transformation.

First we need to load the image we want to transform, so let’s import the libraries and load the image.

import cv2
import numpy as np

img = cv2.imread("sheet_paper.JPEG")

We then need to select 4 points, in this order: top-left, top-right, bottom-left, bottom-right.

The four cv2.circle calls below just draw a filled red dot on each point, to show exactly which points we are taking.

After that we create an array with these 4 points, and we’ll use this array later to apply the transformation.

cv2.circle(img, (470, 206), 5, (0, 0, 255), -1)
cv2.circle(img, (1479, 198), 5, (0, 0, 255), -1)
cv2.circle(img, (32, 1122), 5, (0, 0, 255), -1)
cv2.circle(img, (1980, 1125), 5, (0, 0, 255), -1)

pts1 = np.float32([[470, 206], [1479, 198], [32, 1122], [1980, 1125]])
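
If the four points are not already in that order, they can be sorted before building the array. The following is a rough sketch, assuming the sheet is roughly upright in the picture so that the two points with the smallest y are the top ones. The name order_corners is a hypothetical helper, not an OpenCV function.

```python
def order_corners(points):
    """Return four (x, y) points as top-left, top-right, bottom-left, bottom-right.

    Hypothetical helper. Assumes the two points with the smallest y
    coordinate are the top corners (the sheet is not strongly rotated).
    """
    by_y = sorted(points, key=lambda p: p[1])       # image y grows downward
    top = sorted(by_y[:2], key=lambda p: p[0])      # top pair, left to right
    bottom = sorted(by_y[2:], key=lambda p: p[0])   # bottom pair, left to right
    return [top[0], top[1], bottom[0], bottom[1]]


corners = order_corners([(1980, 1125), (470, 206), (32, 1122), (1479, 198)])
print(corners)  # [(470, 206), (1479, 198), (32, 1122), (1980, 1125)]
```

The sorted list can then be passed to np.float32 in the same way as the hard-coded points.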

Next we create a second set of 4 points, in the same order as the first.
These 4 points are the corners of the new window in which we want to display the transformed image, here 500 pixels wide and 600 pixels high.

pts2 = np.float32([[0, 0], [500, 0], [0, 600], [500, 600]])
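
Since the same width and height appear again later as the output size, it can help to derive the destination points from those two values instead of typing them twice. This is only a sketch, and destination_corners is a hypothetical helper name.

```python
def destination_corners(width, height):
    """Corners of the output window: top-left, top-right, bottom-left, bottom-right.

    Hypothetical helper for illustration; the order matches the source points.
    """
    return [[0, 0], [width, 0], [0, height], [width, height]]


width, height = 500, 600
dst = destination_corners(width, height)
print(dst)  # [[0, 0], [500, 0], [0, 600], [500, 600]]
```

The result can be wrapped in np.float32 to get the same array as above, and (width, height) can be reused as the output size when warping.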

Then we call cv2.getPerspectiveTransform to compute the transformation matrix, and finally we warp the original image with cv2.warpPerspective, using the matrix just created and the size of the new window.

matrix = cv2.getPerspectiveTransform(pts1, pts2)
result = cv2.warpPerspective(img, matrix, (500, 600))
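
To get a feeling for what the matrix does: it is a 3x3 matrix that maps a point (x, y) to a new point by a multiplication in homogeneous coordinates, followed by a division by the third component. The sketch below shows that mapping in plain Python. The function apply_homography and the example matrix are assumptions made up for illustration, not the matrix computed from the sheet of paper.

```python
def apply_homography(matrix, x, y):
    """Map the point (x, y) through a 3x3 perspective matrix."""
    (a, b, c), (d, e, f), (g, h, i) = matrix
    w = g * x + h * y + i  # third homogeneous component
    # Dividing by w is what produces the perspective effect
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


# Hand-made example matrix (an assumption, chosen so the numbers are easy to follow)
example = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.5, 1.0]]

print(apply_homography(example, 10, 0))  # (10.0, 0.0)
print(apply_homography(example, 10, 2))  # (5.0, 1.0)
```

In this example, points with a larger y are divided by a larger value, so they are pulled closer together. That is the same kind of effect that makes the far side of the paper look smaller in the photo.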

Then we can show it on the screen:

cv2.imshow("Image", img)
cv2.imshow("Perspective transformation", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

This is what the final result looks like: