Capture Screen from a window – Python Plays Tetris p. 1
This is the first tutorial of a new series in which we are going to create a bot that plays the game of Tetris automatically.
The idea is that the bot completely takes the place of a human and plays the game to win.
How does it work?
First we need to have the game of Tetris.
I found it on this website: https://tetris.com/play-tetris
Then to create the bot we need to focus on three main aspects:
- Capture the screen where the game is running, using Python and OpenCV.
We also want to be able to interact with the game directly from Python. For example, to play this game we need three keys (left, right and down) to move the piece.
We should be able to send these key presses to the game from Python code.
- Video processing of the game. We are going to detect the board and the tetrominoes in the game using OpenCV.
- Build the algorithm, which is going to be the mind that understands and plays the game.
In this first article we are going to focus on how to capture the screen.
1. Capture the screen of a Window
We open the game of Tetris in our browser. I’m using “Mozilla Firefox”, and I suggest you do the same if you want to try the code of this tutorial.
Open Mozilla Firefox and go to this website: https://tetris.com/play-tetris
1.1 Install and Import python libraries
Then let’s move to the Python editor and start importing the libraries. We need four Python libraries: pillow, pywin32, opencv and numpy.
In case you don’t have them installed, you can download the binaries here:
https://www.lfd.uci.edu/~gohlke/pythonlibs/ . Follow my video tutorial to see how to install them.
import cv2
import numpy as np
from PIL import ImageGrab
import win32gui
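Before going further, it can help to confirm that all four libraries are importable. The following is a minimal sketch that uses only the standard library. The helper missing_packages and the REQUIRED mapping are hypothetical names that are not part of the original script, and the names on the right of the mapping are the ones these libraries are published under on PyPI.

```python
import importlib.util

# Import name -> PyPI package name (mapping written for this sketch).
REQUIRED = {
    "cv2": "opencv-python",
    "numpy": "numpy",
    "PIL": "pillow",
    "win32gui": "pywin32",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import name cannot be found."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All libraries found.")
```

If the list comes back empty, the imports of this section should work as written.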
1.2 Find tetris game window
First we scan all the available windows using the win32gui library.
On line 7 we create an empty list (windows_list) where we store the name of every window together with its handle.
hwnd is the window handle, an ID number associated with the window.
On line 11 we loop through all the windows and store each hwnd and window name in windows_list.
# Detect the window with Tetris game
windows_list = []
toplist = []

def enum_win(hwnd, result):
    # Called once per top-level window: store its handle and title
    win_text = win32gui.GetWindowText(hwnd)
    windows_list.append((hwnd, win_text))

win32gui.EnumWindows(enum_win, toplist)
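The callback above ignores its second argument and appends to a global list instead. win32gui.EnumWindows passes that second argument back to the callback on every call, so the results can also be collected through it. Here is a hedged sketch of that variant. The names list_titled_windows and print_open_windows are hypothetical, and the two win32gui functions are passed in as parameters only so the helper can be tried without Windows.

```python
def list_titled_windows(enum_windows, get_window_text):
    """Return (hwnd, title) pairs for every window that has a title.

    enum_windows and get_window_text are passed in so the function can be
    tried without Windows; on Windows they would be win32gui.EnumWindows
    and win32gui.GetWindowText.
    """
    def on_window(hwnd, found):
        title = get_window_text(hwnd)
        if title:  # skip windows that have no title
            found.append((hwnd, title))

    found = []
    # EnumWindows hands its second argument back to the callback on each call.
    enum_windows(on_window, found)
    return found


def print_open_windows():
    """Windows only: print every titled window so the game title can be spotted."""
    import win32gui  # imported here so the helper above works on any OS

    for hwnd, title in list_titled_windows(win32gui.EnumWindows,
                                           win32gui.GetWindowText):
        print(hwnd, title)
```

Calling print_open_windows() while the game is open in the browser is a quick way to check the exact title of the window.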
Once we have the list of all the window handles and their names, we can select only the window we are interested in, the one with the game.
If you see the picture below, the name of the window is “Play Tetris | Free Online Games …”.
Choosing the window is simple: on line 17 we add a condition saying that if “Play Tetris” is in the window’s name, then that’s the window we want to focus on.
# Game handle
game_hwnd = 0
for (hwnd, win_text) in windows_list:
    if "Play Tetris" in win_text:
        game_hwnd = hwnd
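The loop above keeps the last window whose title contains “Play Tetris” and leaves game_hwnd at 0 when nothing matches. As a possible variation, the sketch below stops at the first match, ignores upper and lower case, and returns None when the game is not open. The function name find_window is hypothetical and not part of the original script.

```python
def find_window(windows, needle):
    """Return the handle of the first window whose title contains needle.

    windows is a list of (hwnd, title) pairs like windows_list.
    The comparison ignores case; None is returned when nothing matches.
    """
    needle = needle.lower()
    for hwnd, title in windows:
        if needle in title.lower():
            return hwnd
    return None


if __name__ == "__main__":
    # Made-up handles and titles, only to show the call.
    sample = [(101, "Untitled - Notepad"),
              (202, "Play Tetris | Free Online Games - Mozilla Firefox")]
    print(find_window(sample, "play tetris"))
```

With the list built earlier, the call would be find_window(windows_list, "Play Tetris"), and a None result means the browser tab with the game is not open yet.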
1.3 Grab screenshots in real time
Knowing the handle of the window, we can get its position on the screen and take a screenshot of that area.
A screenshot is only a single picture, and as we want to grab the images in real time, we put this function in a loop so that we capture the game as a continuous video stream.
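while True:
    # Window position as (left, top, right, bottom) in screen coordinates
    position = win32gui.GetWindowRect(game_hwnd)
    # Take screenshot
    screenshot = ImageGrab.grab(position)
    # Convert the PIL image to a NumPy array, then from RGB to OpenCV's BGR order
    screenshot = np.array(screenshot)
    screenshot = cv2.cvtColor(screenshot, cv2.COLOR_RGB2BGR)
    cv2.imshow("Screen", screenshot)
    # imshow only refreshes inside waitKey; Esc (key code 27) stops the loop
    if cv2.waitKey(1) == 27:
        break
cv2.destroyAllWindows()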
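win32gui.GetWindowRect returns the window position as a (left, top, right, bottom) tuple in screen coordinates, and ImageGrab.grab takes its bounding box in the same order, which is why the tuple can be passed straight through. The two helpers below are a small sketch that makes this explicit and rejects a rectangle with no area. Both names, window_size and window_bbox, are hypothetical and not part of the original script.

```python
def window_size(rect):
    """Return (width, height) of a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = rect
    return (right - left, bottom - top)


def window_bbox(rect):
    """Validate a (left, top, right, bottom) rectangle before grabbing it.

    Returns the rectangle unchanged when it has a positive width and
    height, otherwise None.
    """
    width, height = window_size(rect)
    if width <= 0 or height <= 0:
        return None
    return tuple(rect)


if __name__ == "__main__":
    # Made-up rectangle, only to show the call.
    rect = (100, 50, 900, 650)
    print(window_size(rect), window_bbox(rect))
```

Inside the loop, one option would be to call window_bbox(position) and skip the frame when it returns None, instead of passing an empty area to ImageGrab.grab.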
In case you would like to know more about the code, watch the video tutorial, where I write it all from scratch and explain everything.