Capture Screen from a window – Python Plays Tetris p. 1

This is the first tutorial of a new series where we are going to create a bot that plays the game of Tetris automatically.

The idea is that the bot completely takes the place of a human and plays the game to win it.

How does it work?

First we need to have the game of Tetris.
I found it on this website: https://tetris.com/play-tetris

Then to create the bot we need to focus on three main aspects:

  1. Capture the screen where the game is playing, using Python and OpenCV.
    We also want to be able to interact with the game directly from Python. For example, to play this game we need three keys (left, right and down) to move the piece.
    We should be able to press these keys from Python code.
  2. Video processing of the game. We are going to detect the board and the tetrominoes in the game using OpenCV.
  3. Build the algorithm, which is going to be the mind that understands and plays the game.
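
The three aspects above can be pictured as a single loop. The following is only a rough sketch of that idea, not the code of this series: every function name in it (capture_frame, detect_state, choose_move, press_key) is a hypothetical placeholder that the later parts would have to fill in.

```python
# Hypothetical skeleton of the bot loop; all function names are placeholders.

def run_bot(capture_frame, detect_state, choose_move, press_key, max_steps):
    """Run capture -> detect -> decide -> act for max_steps iterations."""
    moves = []
    for _ in range(max_steps):
        frame = capture_frame()        # aspect 1: grab the game screen
        state = detect_state(frame)    # aspect 2: find the board and tetrominoes
        move = choose_move(state)      # aspect 3: the "mind" picks a key, or None
        if move is not None:
            press_key(move)            # aspect 1 again: send the key to the game
            moves.append(move)
    return moves
```

Passing the four steps in as functions keeps each part independent, so the capture code of this article can be written and tried before the other two parts exist.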

In this first article we are going to focus on how to capture the screen.

1. Capture the screen of a Window

We open the game of Tetris in our browser. I’m using “Mozilla Firefox”, and I suggest you do the same if you want to try the code of this tutorial.

Open Mozilla Firefox and go to this website: https://tetris.com/play-tetris

1.1 Install and Import python libraries

Then let’s move to the Python editor and start importing the libraries. We need to use 4 Python libraries: Pillow, pywin32, OpenCV and NumPy.

In case you don’t have them installed, you can download the binaries here:
https://www.lfd.uci.edu/~gohlke/pythonlibs/ (follow my video tutorial to see how to install them).

import cv2
import numpy as np
from PIL import ImageGrab
import win32gui
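
Before going further, it can help to check which of the four libraries are actually importable. The snippet below is a small sketch that uses only the standard library. The mapping from import names to pip package names reflects the usual package names (opencv-python, numpy, pillow, pywin32), and the helper missing_modules is a name made up for this example. Note that pywin32 only works on Windows.

```python
import importlib.util

# Import name -> usual pip package name for the four libraries of this tutorial
REQUIRED = {
    "cv2": "opencv-python",
    "numpy": "numpy",
    "PIL": "pillow",
    "win32gui": "pywin32",
}

def missing_modules(import_names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]

# Print a hint for every library that is not installed
for name in missing_modules(REQUIRED):
    print(f"Missing {name}: try 'pip install {REQUIRED[name]}'")
```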

1.2 Find tetris game window

First we scan all the available windows using the win32gui library.
We create an empty list, windows_list, where we store the name of each window together with its handle.
hwnd is the window handle, an ID number associated with the window.
EnumWindows then goes through all the windows and calls enum_win for each one, which stores the hwnd and the window’s name in windows_list.

# Detect the window with Tetris game
windows_list = []

def enum_win(hwnd, result):
    # Called once per top-level window; result is the list passed to EnumWindows
    win_text = win32gui.GetWindowText(hwnd)
    result.append((hwnd, win_text))

win32gui.EnumWindows(enum_win, windows_list)

Once we have the list of all the window handles and their names, we can select only the window we are interested in, the one with the game.
In my case, the name of the window is “Play Tetris | Free Online Games …”.

Choosing the window is simple: inside the loop we check whether “Play Tetris” appears in the window’s name, and if it does, that’s the window we want to focus on.

# Game handle
game_hwnd = 0
for (hwnd, win_text) in windows_list:
    if "Play Tetris" in win_text:
        game_hwnd = hwnd
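
The same selection can also be wrapped in a small function, which makes it easy to notice when the game window is not open at all. This is only a sketch: find_window is a hypothetical helper, the match is case-insensitive (unlike the loop above), it returns the first match rather than the last, and the handles and titles in sample_windows are invented stand-ins for the real windows_list.

```python
def find_window(windows, needle):
    """Return the handle of the first window whose title contains needle, else None."""
    needle = needle.lower()
    for hwnd, title in windows:
        if needle in title.lower():
            return hwnd
    return None

# Made-up handles and titles, standing in for the real windows_list
sample_windows = [
    (1001, ""),
    (1002, "Untitled - Notepad"),
    (1003, "Play Tetris | Free Online Games - Mozilla Firefox"),
]

game_hwnd = find_window(sample_windows, "play tetris")
if game_hwnd is None:
    print("Tetris window not found, is the browser open?")
```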

1.3 Grab screenshots in real time

Once we have the window handle, we can get the position of the window and take a screenshot of it.
A screenshot is only a single picture, and since we want to grab the images in real time, we put this function in a loop so that we get a continuous video stream.

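while True:
    position = win32gui.GetWindowRect(game_hwnd)

    # Take screenshot
    screenshot = ImageGrab.grab(position)
    screenshot = np.array(screenshot)
    screenshot = cv2.cvtColor(screenshot, cv2.COLOR_RGB2BGR)
    cv2.imshow("Screen", screenshot)

    # imshow only refreshes the window when waitKey runs; press Esc to quit
    key = cv2.waitKey(1)
    if key == 27:
        break

cv2.destroyAllWindows()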
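
GetWindowRect returns the window rectangle as (left, top, right, bottom) in screen coordinates, and ImageGrab.grab accepts a bounding box in the same order, which is why the tuple can be passed straight through. As a small sketch of how that rectangle could be checked before grabbing, the two helpers below (rect_size and is_capturable, both names invented for this example) compute the size of the region and reject empty ones.

```python
def rect_size(rect):
    """Convert a (left, top, right, bottom) rectangle to (width, height)."""
    left, top, right, bottom = rect
    return right - left, bottom - top

def is_capturable(rect):
    """True when the rectangle covers a region with positive width and height."""
    width, height = rect_size(rect)
    return width > 0 and height > 0

# Example rectangle, made up for illustration
example_rect = (100, 50, 900, 650)
width, height = rect_size(example_rect)
```

Inside the capture loop, a check such as is_capturable(position) could be used to skip a frame instead of passing an empty region to ImageGrab.grab.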

In case you would like to know more about the code written, watch the video tutorial where I code it all from scratch and I explain everything.
