Hello all, Ferro here!
This will be part 1 of a 2-part series, where I'll teach you how to create a bot that plays Tetris. I got the idea to create a bot whilst playing Tetris. The difficulty got so high that the limiting factor was my reaction time, as the blocks dropped almost instantly. Instead of getting better at the game, my solution was to create a bot and then claim the score as my own \ (^_^) /
This series of articles will take you through the development cycle of this bot. To be completely honest, I don't know what I'm doing, so this will be quite an ad-hoc approach. There might be an incredibly simple way to tackle some of the challenges I faced, and if anyone can offer any suggestions, please leave them. I need them.
To create the bot I decided to use the Python programming language as A) I was familiar with it and B) its libraries are really useful for interacting with the game. The version of Tetris that we'll be using can be found here. This version allows you to hold a block and also see the next three blocks. The game runs on Flash Player, so I found it difficult to interact with the code of the game itself. My solution to this problem was to take screenshots of the game and siphon the data from them. I won't be teaching you how to install Python or get the libraries that we'll be using, but a quick Google search should do the trick. The version of Python we'll be using is Python 3.6.3.
List of libraries we'll be using
- PIL (pillow)
- pyscreenshot
- pyautogui
The objective of the bot is to get the most points possible. The easiest way to maximise points is to clear 4 lines at once. This is called a Tetris. One approach is to build a well on the right-hand side and fill it with the line piece, similar to the image below.
PART 1 - Creating a hook into the game.
Before we get into creating the AI we have to construct an interface that the bot can use to interact with the environment. We can view this as reading from and writing to the game. This article will be split into two segments: the first covers reading data from the game and the second covers writing to it.
Reading the game data
The method we'll be using to read data from the game is to repeatedly take screenshots and read the colour values of specific pixels, recreating the game state in a form the bot can read. To do this we'll be using two libraries. The first is called pillow, or PIL for short. This library is fantastic at obtaining data from an image. The second library is called pyscreenshot and, as the name suggests, it takes screenshots.
This code takes an initial screenshot of the Tetris game. You should get an image similar to this.
To save us headaches in the future we're going to reduce the screen capture size so that it's set around the game screen. To do this we need to define specific pixel coordinates. Originally the method I used to get the screen coordinates was to open the screenshot in Paint and manually find the corner coordinates. This proved to be inefficient, as different resolutions have different coordinates for the corners. Fortunately for us, the Tetris game screen is locked to a specific size. Thus if we find the coordinates of an object inside the game screen, we can use constant offsets to calculate the corners. This is explained by the diagram below.
To get the coordinates of the hold/next tabs we'll use a library called pyautogui. This library has a bunch of functionality that we'll be using to control the game. A handy function that it possesses is finding PNGs on the screen. Download the two images, naming them 'next.png' and 'hold.png'. Also download the code into the same directory.
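The arithmetic behind this is tiny. Here's a minimal sketch, assuming the anchor is the top-left corner of an object inside the game screen. The size and offset constants are made-up placeholders, not measurements from the real game, and `game_bbox` is just a name I picked for illustration.

```python
# Hypothetical constants: the fixed size of the game screen in pixels and how
# far the anchor object sits from the game's top-left corner. Measure your own.
GAME_WIDTH = 640
GAME_HEIGHT = 480
ANCHOR_OFFSET_X = 20   # anchor is 20 px right of the game's left edge
ANCHOR_OFFSET_Y = 50   # anchor is 50 px below the game's top edge


def game_bbox(anchor_x, anchor_y):
    """Return (left, top, right, bottom) of the game screen.

    anchor_x, anchor_y: screen position of the anchor's top-left corner.
    """
    left = anchor_x - ANCHOR_OFFSET_X
    top = anchor_y - ANCHOR_OFFSET_Y
    return (left, top, left + GAME_WIDTH, top + GAME_HEIGHT)


# Same constants, two different window positions:
print(game_bbox(120, 250))   # (100, 200, 740, 680)
print(game_bbox(520, 90))    # (500, 40, 1140, 520)
```

Because the game screen never changes size, only the anchor position differs from one setup to the next.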
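For reference, here's a rough sketch of what such a lookup can look like, built on `pyautogui.locateOnScreen`. The helper name `find_anchor` and its `locate` parameter are my own inventions for illustration (the parameter lets you try the function without a screen), so treat this as an approximation rather than the exact script.

```python
def find_anchor(image_path, locate=None):
    """Return (left, top) of image_path on screen, or None if not found.

    locate: function taking an image path and returning a
    (left, top, width, height) box. Defaults to pyautogui.locateOnScreen.
    """
    if locate is None:
        import pyautogui  # imported lazily because it needs a display to load
        locate = pyautogui.locateOnScreen
    try:
        box = locate(image_path)
    except Exception:
        # Some pyautogui versions raise when nothing matches
        # instead of returning None, so treat both the same way.
        box = None
    if box is None:
        return None
    left, top, _width, _height = box
    return (int(left), int(top))


# Usage, with the game open and the PNGs in the same directory:
# for name in ("hold.png", "next.png"):
#     print(name, find_anchor(name))
```

If a lookup returns None, the image most likely doesn't match what's on screen pixel for pixel, for example because the browser is zoomed.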
Run the code whilst the game is open. You should have an output similar to this.
Now that we have the coordinate values we can add them to the code shown below, under the section marked 'please change'. Save this as a new file called screenshot.py. We'll be using this chunk of code consistently in the future. Now we should be able to take a screenshot of just the game screen.
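A cropped capture along these lines could be sketched as follows, using pyscreenshot's `grab(bbox=...)`. The coordinates are placeholders, and the `grab_game` name and its `grab` parameter are assumptions I've added so the function can be exercised without a display.

```python
# PLEASE CHANGE: hypothetical corner coordinates of the game screen,
# given as (left, top, right, bottom) in screen pixels.
GAME_BBOX = (100, 200, 740, 680)


def grab_game(bbox=GAME_BBOX, grab=None):
    """Capture only the game screen and return the image.

    grab: function accepting a bbox keyword; defaults to pyscreenshot.grab.
    """
    if grab is None:
        import pyscreenshot  # imported lazily because it needs a display
        grab = pyscreenshot.grab
    return grab(bbox=bbox)


# Usage, with the game open:
# im = grab_game()
# im.save("game.png")
```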
Reading Game Grid
Now that we have a refined image we can start to siphon the data. Our first task will be to create a virtual copy of the game grid. We will record the data in a two-dimensional array, storing a 1 where a block is present and a 0 for a blank space. Our bot will see the grid like the image below.
By loading the im variable we are able to find the RGB value of specific pixels in the image. This code gets the RGB value of the centre pixel of every cell, checks whether it's a block colour, and sets the matching entry of the two-dimensional array to 1 if it is. In the code, make sure to change the coordinate values depending on your resolution.
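To make the centre-pixel idea concrete, here's a small sketch. It assumes a 10x20 grid, an RGB image, and that empty cells are near-black. The cell size, grid origin and brightness threshold are placeholders I made up, so swap in your own measurements.

```python
ROWS, COLS = 20, 10          # assumed grid size (standard Tetris well)
GRID_LEFT, GRID_TOP = 0, 0   # hypothetical: top-left pixel of the grid in the image
CELL = 18                    # hypothetical: width and height of one cell in pixels
EMPTY_MAX = 40               # assumption: empty cells are darker than this on every channel


def read_grid(px, left=GRID_LEFT, top=GRID_TOP, cell=CELL):
    """Build a ROWS x COLS list of 0s and 1s from a pixel accessor.

    px: anything indexable as px[x, y] -> (r, g, b, ...), such as the object
    returned by PIL's Image.load() on an RGB image.
    """
    grid = []
    for row in range(ROWS):
        line = []
        for col in range(COLS):
            # Centre pixel of the cell at (row, col).
            x = left + col * cell + cell // 2
            y = top + row * cell + cell // 2
            r, g, b = px[x, y][:3]
            # Any bright channel means a block is sitting in this cell.
            line.append(1 if max(r, g, b) > EMPTY_MAX else 0)
        grid.append(line)
    return grid


# Usage with a saved screenshot of the game screen:
# from PIL import Image
# grid = read_grid(Image.open("game.png").convert("RGB").load())
# for line in grid:
#     print(line)
```

Sampling one pixel per cell keeps each read cheap, which matters when you're taking screenshots in a loop.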
Reading Current, Holding and Next Blocks
For all three of these variables we'll use the same method to extract the data. The method that I found most effective is to detect the colour of the block to figure out what shape it is. For example, a line block is light blue. However, the blocks aren't a solid colour and have many different shades. To work out the type we need to check whether a colour falls within a certain domain.
This is where things start to get tricky. It's very difficult to create a domain for different shades of a certain colour using RGB values. This is because RGB values don't increase linearly along the colour spectrum. The graph explains it best. The top colour spectrum is partitioned using the HSV colour scheme and the bottom uses the RGB colour scheme. As you can see, even a minor difference in hue on the HSV scale can produce wildly different RGB values, making it very difficult to create a domain.
As the Python library we're using only gives us RGB values, we need to convert these to HSV and then check them against the domain of each colour. Luckily there's a mathematical equation to convert these values. Even luckier for us, someone else has already written it in Python, so time to copy and paste.
Now to create the domains! I did this by grabbing random points inside each of the blocks, finding the lightest and darkest hue, and setting the bounds of the domain using those points. I've constructed a little diagram to illustrate this. Look at us playing with colour wheels to make a Tetris bot, the rabbit hole has no bounds ／(^ x ^)＼
The code for this section is much longer than the previous one, so I won't show a screenshot, as most likely everyone's going to copy and paste it. In this code I have implemented the auto-calibration, so it should run out of the box. Make sure that 'hold.png' and 'next.png' are in the directory with the code, as well as this image named 'playButton.png'. Below is a diagram of all the points we'll be capturing; if the calibration worked they should already be scaled to your screen, so you don't have to worry about it. If you run into a problem you'll get a NoneType error. If numerous people get this I'll update the article to include the manual procedure, as it's quite lengthy and tedious.
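If you'd rather not paste in a conversion function, Python's standard library ships one as `colorsys.rgb_to_hsv`. It works on floats between 0 and 1, so the sketch below wraps it to accept 0-255 values. The wrapper name `rgb_to_hsv_degrees` is mine, and this is a stand-in for the copied function, not the same code.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Convert 0-255 RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (round(h * 360), round(s * 100), round(v * 100))


# Two shades of the same light blue: the RGB values differ on all three
# channels, but the hue comes out identical.
print(rgb_to_hsv_degrees(0, 255, 255))    # (180, 100, 100)
print(rgb_to_hsv_degrees(64, 128, 128))   # (180, 50, 50)
```

This is exactly why HSV is handy here: shading changes the saturation and value, while the hue stays put.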
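Here's a minimal sketch of a domain check. The hue ranges are guesses based on the usual Tetris colours, not values sampled from this game, so replace them with your own lightest and darkest points. The letter names for the pieces and the 0.3 cut-offs are also my assumptions.

```python
import colorsys

# Hypothetical hue domains in degrees (low, high). Sample your own game!
HUE_DOMAINS = {
    "I": (170, 200),  # light blue (the line block)
    "J": (210, 250),  # dark blue
    "T": (270, 310),  # purple
    "S": (90, 150),   # green
    "O": (50, 65),    # yellow
    "L": (20, 45),    # orange
    "Z": (345, 15),   # red, which wraps around 0 on the colour wheel
}


def classify_block(r, g, b):
    """Return the block letter whose hue domain contains this pixel, or None."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < 0.3 or v < 0.3:
        return None  # too grey or too dark to be a block (assumed cut-offs)
    hue = h * 360
    for name, (low, high) in HUE_DOMAINS.items():
        if low <= high:
            inside = low <= hue <= high
        else:
            # The domain crosses 360 degrees, so check both ends.
            inside = hue >= low or hue <= high
        if inside:
            return name
    return None


print(classify_block(0, 255, 255))   # I
print(classify_block(255, 0, 0))     # Z
print(classify_block(10, 10, 10))    # None
```

Red is the awkward one because its hue sits on both sides of 0 degrees, hence the wrap-around branch.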
Writing data to tetris
The code that writes data to the game is quite small, as it's heavily intertwined with the AI's logic. It will be expanded in the second part of this series. We'll be using pyautogui to virtually press the appropriate keys. To test that the code is correctly implemented we'll write a quick procedure that moves the block to the left-hand side and instantly drops it, followed by holding the next block. Here is the code, and the result should look similar to this.
Congratulations!! You've taken the first steps to create a bot. A good analogy to summarise this article is that we gave our bot senses. Although these functions seem isolated, once we've developed the AI architecture we can glue them all together. In the next tutorial we'll give our bot a brain (arguably much more exciting). If I skimmed over something or plainly left it out, please tell me. I'll be more than happy to explain things in greater depth if people are interested. Hope you guys enjoyed my first tutorial.
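A procedure like that can be sketched with `pyautogui.press`. The key bindings below are assumptions, so check the controls of your version of the game. The function name `left_drop_hold` and its `press` parameter are made up for illustration, and the parameter lets you dry-run the sequence without touching the keyboard.

```python
# Assumed key bindings. Check the controls of your version of the game.
KEY_LEFT = "left"
KEY_DROP = "space"   # instant drop
KEY_HOLD = "c"       # hold the current block


def left_drop_hold(press=None, shifts=5):
    """Move the block left `shifts` times, drop it, then hold the next block.

    press: function that presses one key by name; defaults to pyautogui.press.
    """
    if press is None:
        import pyautogui  # imported lazily because it needs a display to load
        press = pyautogui.press
    for _ in range(shifts):
        press(KEY_LEFT)
    press(KEY_DROP)
    press(KEY_HOLD)


# Dry run: collect the key names instead of pressing them.
sent = []
left_drop_hold(press=sent.append, shifts=2)
print(sent)   # ['left', 'left', 'space', 'c']
```

Calling `left_drop_hold()` with no arguments sends real key presses, so click on the game first so that it has focus.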