In this tutorial we are going to learn how to detect the gaze.

Why do we need to detect the gaze?

To answer this question, I need to explain how I plan to run the app.
The idea is to show a keyboard on the screen, light up each key every second or so, and when the key we’re interested in is lit, we simply close our eyes.

If we consider that the keyboard has the 26 letters of the alphabet, plus the space bar and a few other keys, just running through all the keys and lighting each one up for one second would take about half a minute each time.

The idea is to divide the keyboard into two parts. If we look to the left, only the left part of the keyboard will be activated, while if we look to the right, only the letters on the right part of the keyboard will light up.
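As a rough sketch of this idea (the key layout and function names here are hypothetical, not part of the final app), the detected gaze side simply restricts which half of the keys gets scanned, cutting the scan time roughly in half:

```python
# Hypothetical sketch: the gaze side ("left" or "right") selects which
# half of the keyboard is scanned and lit up key by key.
keys = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ_")  # 26 letters plus "_" for space

def candidate_keys(gaze_side):
    half = len(keys) // 2
    if gaze_side == "left":
        return keys[:half]   # only the left half lights up
    return keys[half:]       # only the right half lights up

print(candidate_keys("left"))   # A through M
print(candidate_keys("right"))  # N through Z plus "_"
```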

Detecting the gaze of the left eye

We need to detect the gaze of both eyes, but for the moment we will focus on only one eye, and later we will apply the same method to the second eye.

We can select the eye simply by taking the coordinates of its landmark points.

We know that the left eye region corresponds to the landmarks with indexes 36, 37, 38, 39, 40 and 41, so we take those.

# Gaze detection
left_eye_region = np.array([(landmarks.part(36).x, landmarks.part(36).y),
							(landmarks.part(37).x, landmarks.part(37).y),
							(landmarks.part(38).x, landmarks.part(38).y),
							(landmarks.part(39).x, landmarks.part(39).y),
							(landmarks.part(40).x, landmarks.part(40).y),
							(landmarks.part(41).x, landmarks.part(41).y)], np.int32)

Once we have the coordinates of the left eye, we can create a mask to extract exactly the inside of the left eye and exclude all the surroundings.

height, width, _ = frame.shape
mask = np.zeros((height, width), np.uint8)
cv2.polylines(mask, [left_eye_region], True, 255, 2)
cv2.fillPoly(mask, [left_eye_region], 255)
left_eye = cv2.bitwise_and(gray, gray, mask=mask)

We now extract the eye from the face and put it in its own window.
We just need to keep in mind that we can only cut rectangular shapes out of the image, so we take the extreme points of the eye (top left and bottom right) to get the rectangle.

We also compute the threshold that we will need to detect the gaze.
If you’re interested in learning in detail how we detect the gaze, you can check this other tutorial: Eye motion tracking.

min_x = np.min(left_eye_region[:, 0])
max_x = np.max(left_eye_region[:, 0])
min_y = np.min(left_eye_region[:, 1])
max_y = np.max(left_eye_region[:, 1])

gray_eye = left_eye[min_y: max_y, min_x: max_x]
_, threshold_eye = cv2.threshold(gray_eye, 70, 255, cv2.THRESH_BINARY)

And finally we display it on the screen.
I’m going to increase its size so we can see it better.

threshold_eye = cv2.resize(threshold_eye, None, fx=5, fy=5)
eye = cv2.resize(gray_eye, None, fx=5, fy=5)
cv2.imshow("Eye", eye)
cv2.imshow("Threshold", threshold_eye)
cv2.imshow("Left eye", left_eye)