GKMinmaxStrategist: What does it take to build a TicTacToe AI?

Excellent question, glad you asked. :)

In this tutorial you’ll build a TicTacToe game with Swift in which two AI players battle it out. The code is available on GitHub and contains additional comments and logging that you can uncomment to learn more about how the code works. You’ll need Xcode 7 and OS X 10.11 to run the code as-is, or you can make a few simple changes so it runs on iOS 9.

While working on this project I created a sandbox app to help me experiment with and debug GKMinmaxStrategist. I’ve recorded a video to show what you can do with it – watch closely, because it’s super-helpful:

I’m thinking about polishing this GKMinmaxStrategist sandbox app and selling it (with source code) for a couple of bucks, because it proved really useful in debugging the AI. Please holler (leave a comment or, better, post in the forum) if you would like this app, and perhaps even let me know how much it would be worth to you. Thanks!

GKMinmaxStrategist Orientation Course

When I started looking into GKMinmaxStrategist, I found it difficult at first to get a feeling for how things are connected, how the pieces fit together, and what the code execution flow would be.

Therefore I think the best way to learn is an orientation course that follows the same order in which the code executes. Hopefully this helps you gain an understanding of how one thing leads to another.

I recommend creating a new SpriteKit project using the “Game” template in Xcode, with Swift as the main language. That gives you a GameScene class; open it and remove all the template code from its functions so you can start from scratch. You may also remove the update function, as you won’t need it.

First, throw in a bunch of variables at the top of the class:

 
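The exact declarations are in the GitHub project; here’s a sketch based on the description below (the implicitly unwrapped optionals are an assumption, chosen because everything is created in didMoveToView):

```swift
import SpriteKit
import GameplayKit

class GameScene: SKScene {
    // The two AI players, created in didMoveToView
    var playerX: TheGameModelPlayer!
    var playerO: TheGameModelPlayer!
    // The model holding the board state and the game logic
    var gameModel: TheGameModel!
    // The GameplayKit strategist that picks each move
    var strategist: GKMinmaxStrategist!
    // Used to end the game in a draw once all 9 moves have been made
    var moveCount = 0

    // didMoveToView, switchActivePlayer, performAIPlayerMove and
    // setBoardPieceForPlayer follow below
}
```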

I’ll go over the classes hinted at here in detail in due time. Note the additional imports here: SpriteKit and GameplayKit. They are required by this example project.

The moveCount exists to end the game once the maximum number of moves has been performed. In that case the game is assumed to be a draw, since in TicTacToe there are at most 9 possible moves (5 for X, 4 for O – meaning X always has the advantage).

Next up, add the code below, replacing the existing didMoveToView function:

 
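A sketch of what this block plausibly looks like; the initializer labels of TheGameModelPlayer and TheGameModel are assumptions based on the description (compare with the GitHub project):

```swift
override func didMoveToView(view: SKView) {
    // Each player gets a unique playerId; the name is used for logging only
    playerX = TheGameModelPlayer(playerId: 1, name: "X")
    playerO = TheGameModelPlayer(playerId: 2, name: "O")

    // The model receives the playfield size and the participating players
    gameModel = TheGameModel(columns: 3, rows: 3, players: [playerX, playerO])

    // Create the strategist and assign it the game model
    strategist = GKMinmaxStrategist()
    strategist.gameModel = gameModel
    strategist.maxLookAheadDepth = 1 // start without look-ahead
    strategist.randomSource = nil    // deterministic move selection

    // Commence the game with X's first move
    switchActivePlayer()
}
```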

You’ll need two players, X and O. So for each you create a player instance and assign it a unique identifier. Here I’ve chosen to give X the playerId 1 while O gets playerId 2.

The choice of playerId is arbitrary; you could also use 0 and 1, or 80085 and 1337. I don’t care. The GKMinmaxStrategist won’t care. The only important bits are that each playerId must be unique, and that the player class must implement the GKGameModelPlayer protocol. The player’s name is only used for logging and could be omitted, but it serves as an example of how you can add extra properties to the player class.
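For reference, a minimal sketch of such a player class (the init labels are assumptions; GKGameModelPlayer itself only demands the playerId property):

```swift
class TheGameModelPlayer: NSObject, GKGameModelPlayer {
    let playerId: Int // must be unique for each player
    let name: String  // extra property, used for logging only

    init(playerId: Int, name: String) {
        self.playerId = playerId
        self.name = name
    }
}
```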

Next you have to create a class implementing the GKGameModel protocol. This one uses a custom initializer so you can give the virtual playfield a size (number of columns and rows) as well as a list (array) of the participating players you created earlier.

Lastly, you create the GKMinmaxStrategist instance and assign it the game model instance. I’ve purposefully chosen a maxLookAheadDepth of 1 (the default) because this is what you should start with in a new AI. Make it work well enough without looking ahead, then increase the maxLookAheadDepth and fix the issues the AI has when looking ahead 2, 3 or more moves.

Also note that increasing maxLookAheadDepth doesn’t necessarily make the AI stronger. In the case of TicTacToe the AI will begin trading less favorable positions in the current move for an advantage in a future move – however, the game is so simple and straightforward that this kind of thinking almost always leads to a loss. There is no recovering from a disadvantageous position in TicTacToe, assuming the other player plays perfectly. At best you can play for a draw.

I also set the randomSource to nil (the default). Again, when creating a new AI you may want to exercise it first with preset situations from which the AI will always conclude the same outcome, even if multiple possible moves are ranked at the same score (the value property in the GKGameModelUpdate protocol). That makes it easier to debug problems as they arise, rather than having to restart the strategist over and over until it randomly makes the move you want to debug.

Commencing the game of course requires implementing the switchActivePlayer function. Add the following code below didMoveToView:

 
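A sketch of that function, reconstructed from the description that follows:

```swift
func switchActivePlayer() {
    // X starts the game (activePlayer still nil) and also follows O
    if gameModel.activePlayer == nil || gameModel.activePlayer?.playerId == 2 {
        gameModel.activePlayer = playerX
    } else {
        gameModel.activePlayer = playerO
    }

    if moveCount >= 9 {
        // The board is full: passing nil declares a draw
        declareWinner(nil)
    } else {
        moveCount += 1
        performAIPlayerMove()
    }
}
```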

Here you check what gameModel.activePlayer is: if it’s nil or the player with playerId 2, you make playerX (playerId: 1) the new activePlayer. Otherwise it’ll be playerO (playerId: 2).

Next you perform a moveCount check. If 9 moves have already been made in total, you call declareWinner with nil as the winning player, which indicates a draw. I won’t go into the declareWinner method as it merely draws the final board on screen and prints to the debug console. You can copy the declareWinner function from the GitHub repo along with its companion function drawBoard, or just implement it yourself in whatever way you see fit.

If there are still possible moves, the moveCount is increased and performAIPlayerMove calculates the next move, which is implemented as follows:

 
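A sketch of performAIPlayerMove; the win check uses the isWinForPlayer analysis helper discussed in the scoring section later on (the exact code is in the GitHub project):

```swift
func performAIPlayerMove() {
    // Cast to the concrete class to get access to the name property
    guard let activePlayer = gameModel.activePlayer as? TheGameModelPlayer else {
        return
    }

    // bestMoveForPlayer does all the grunt work; it returns nil
    // only if there are no moves left
    if let move = strategist.bestMoveForPlayer(activePlayer) as? TheGameModelUpdate {
        setBoardPieceForPlayer(activePlayer, move: move)

        // Did this move win the game for the active player?
        if gameModel.isWinForPlayer(activePlayer.playerId) == .WinThisMove {
            declareWinner(activePlayer)
        } else {
            switchActivePlayer()
        }
    }
}
```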

Here the activePlayer is cast to the TheGameModelPlayer class so that you can use its implementation-specific details, such as the name property. With the activePlayer, the strategist is asked to return the bestMoveForPlayer. The bestMoveForPlayer method does all the grunt work in the GKGameModel instance, which in turn makes use of the GKGameModelUpdate and GKGameModelPlayer protocol classes. I’ll get to those shortly; hold on to that thought.

For now it’s only important to know that this returns a move instance. The move may be nil if there aren’t any moves left, but we’ve already got that covered via the moveCount variable.

Note that the alternative method randomMoveForPlayer(_:fromNumberOfBestMoves:) does the same work as bestMoveForPlayer, except that it randomly picks one of the given number of highest-ranked moves. This is mainly useful to create an AI that is artificially dumbed down, one that makes the occasional mistake. In essence, randomizing moves can make the AI both easier to beat and less predictable (aka more like a human player). However, this would be overkill in a simplistic game like TicTacToe.
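For illustration, a dumbed-down AI could be set up like this inside performAIPlayerMove (hypothetical usage; a randomSource should be set so the choice is actually random):

```swift
// Give the strategist a source of randomness, then let it pick one
// of the 3 highest-ranked moves instead of always the single best one:
strategist.randomSource = GKARC4RandomSource()
let move = strategist.randomMoveForPlayer(activePlayer, fromNumberOfBestMoves: 3)
```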

Assuming you got the best possible move returned, setBoardPieceForPlayer takes the activePlayer and the move and updates the board. Following that is a check to see whether the activePlayer has now won the game given the new board state, and if so, that player is declared the winner. Otherwise the game continues with the other player’s turn.

Let’s have a closer look at the setBoardPieceForPlayer method which you should implement as follows:

 
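A sketch of setBoardPieceForPlayer; the setBoardPiece helper on the model is an assumed name (see the GitHub project):

```swift
func setBoardPieceForPlayer(player: TheGameModelPlayer, move: TheGameModelUpdate) {
    // The model owns the board, so the actual work happens there
    gameModel.setBoardPiece(player.playerId, x: move.x, y: move.y)

    // Run the scoring function again to log the new board state's score
    let score = gameModel.scoreForPlayer(player)
    print("\(player.name) placed at \(move.x),\(move.y) - score: \(score)")
}
```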

That’s interesting. Setting the board piece is forwarded to the gameModel instance. The function also runs the gameModel.scoreForPlayer method again to get the score for the current board situation (aka game state).

The whole logic, including the board itself, is stored in the TheGameModel class, and you can peruse this information for your own purposes; it needn’t be used exclusively by the GKMinmaxStrategist.

Modelling the Strategist AI

Which brings me to the strategist itself – more accurately, the GKGameModel* protocol implementations. You should create a new Swift file and name it GameModel.swift, just so you can separate the model code from the scene.

Then import the GameplayKit module at the top. However, you must not import SpriteKit. The model classes should only ever contain data and logic. Once you start polluting the model with SpriteKit code you violate the separation of concerns, just like you would in a Model-View-Controller app if you were to place view code in the model. Here, SpriteKit is the view code that should never appear in the model code.

That said, you can start implementing the TheGameModel class as follows:

 
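A sketch of the class skeleton; property and initializer names are assumptions based on the surrounding text, and the protocol methods follow further below (compare with the GitHub project):

```swift
import GameplayKit

class TheGameModel: NSObject, GKGameModel, NSCopying {
    // 0 = empty field, otherwise the playerId of the piece's owner
    var board: [Int]
    let boardWidth: Int
    let boardHeight: Int

    // "Backing ivars" for the GKGameModel properties implemented later
    var thePlayers: [GKGameModelPlayer]
    var theActivePlayer: GKGameModelPlayer?

    // required, so copyWithZone can call self.dynamicType.init below
    required init(columns: Int, rows: Int, players: [GKGameModelPlayer]) {
        boardWidth = columns
        boardHeight = rows
        board = [Int](count: columns * rows, repeatedValue: 0)
        thePlayers = players
        super.init()
    }

    // Helper that places a piece; also called from GameScene (name assumed)
    func setBoardPiece(playerId: Int, x: Int, y: Int) {
        board[y * boardWidth + x] = playerId
    }
}
```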

This game model class has a board property, which is just an array of Int. The boardWidth and boardHeight properties are handed over in the initializer, which doesn’t perform any safety checks – for tutorial brevity only.

Notice that I explicitly included the NSCopying protocol. The GKMinmaxStrategist expects it, but it’s not included in the GKGameModel protocol. While it suffices to implement the copyWithZone method, I find it better to also list the NSCopying protocol to indicate that the class actually conforms to said protocol.

Speaking of which, here’s the copyWithZone method you should add to TheGameModel class now:
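(A sketch, reconstructed from the description that follows; the exact code is in the GitHub project.)

```swift
func copyWithZone(zone: NSZone) -> AnyObject {
    // self.dynamicType ensures subclasses create copies of their own type
    let copy = self.dynamicType.init(columns: boardWidth, rows: boardHeight,
                                     players: thePlayers)
    copy.setGameModel(self)
    return copy
}
```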

This creates a regular copy of a Swift class, using self.dynamicType – which is shorthand for “the type of this instance” (similar to Objective-C’s [self class]). Next, the copy calls the setGameModel function, which is actually part of the GKGameModel protocol and is discussed next.

Add the setGameModel function below to your TheGameModel class:
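(Again a sketch based on the following description.)

```swift
func setGameModel(gameModel: GKGameModel) {
    // Copy everything that changes (or might change) after a move is made
    if let inputModel = gameModel as? TheGameModel {
        board = inputModel.board // Swift arrays have value semantics
        theActivePlayer = inputModel.theActivePlayer
    }
}
```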

The purpose of setGameModel is to take an existing game model object and to copy the GKGameModel data that’s relevant for the game – specifically, any data that changes (or might change) after making a move. Here: the board and the activePlayer.

Notice how we can just assign inputModel.board to self.board without needing to make a copy – Swift’s Array type has value semantics, so the assignment itself copies. If you were to use an NSArray instead, you’d have to call its copy function. Whatever data you have in your game model, you should make a (deep) copy of it in setGameModel – merely assigning references would cause future moves to modify the original GKGameModel instance, which would lead to unpredictable results (i.e. possible moves being skipped and whatnot).

So in summary: the only job of setGameModel is to copy the data from the game model that’s passed into the function to the game model represented by self.

So, what next? Well, there are two properties defined by the GKGameModel protocol that need to be implemented. Add these to your TheGameModel class:

 
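A sketch of the two properties, backed by the ivars from the class skeleton (the setter on activePlayer is an assumption, so that GameScene can assign to it):

```swift
var players: [GKGameModelPlayer]? {
    return thePlayers
}

var activePlayer: GKGameModelPlayer? {
    get { return theActivePlayer }
    set { theActivePlayer = newValue }
}
```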

I decided to simply back these properties with “backing ivars”, so to speak. The players property simply returns the same players array that was passed into TheGameModel’s init function. And activePlayer is the player you set as activePlayer in GameScene. But the activePlayer will also change while strategizing, to simulate the other player’s moves. You’ll see this shortly.

Moving on to the next GKGameModel function: gameModelUpdatesForPlayer. Implement it as follows:

 
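A sketch of the function, together with the move class it instantiates (TheGameModelUpdate is discussed below; the x/y property names are assumptions):

```swift
// The move class: GKGameModelUpdate only demands the value property,
// which the strategist uses to store the move's rating.
class TheGameModelUpdate: NSObject, GKGameModelUpdate {
    var value = 0
    let x: Int
    let y: Int

    init(x: Int, y: Int) {
        self.x = x
        self.y = y
    }
}

// Inside TheGameModel:
func gameModelUpdatesForPlayer(player: GKGameModelPlayer) -> [GKGameModelUpdate]? {
    var moves = [GKGameModelUpdate]()

    // Every field whose board value is 0 (empty) is a valid move
    for y in 0 ..< boardHeight {
        for x in 0 ..< boardWidth {
            if board[y * boardWidth + x] == 0 {
                moves.append(TheGameModelUpdate(x: x, y: y))
            }
        }
    }

    // Returning nil (rather than an empty array) signals "no moves left"
    return moves.isEmpty ? nil : moves
}
```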

The gameModelUpdatesForPlayer function only needs to collect all possible moves for the current game state (the board) and for the active player. In the case of the TicTacToe game, any field that doesn’t yet have a piece in it is considered a valid move. Any board coordinate whose value is 0 is considered empty. Board values of 1 and 2 would be pieces already placed by the respective players.

If the for loops find an x,y coordinate whose board value is 0, a new TheGameModelUpdate instance is created with that coordinate. The GKGameModelUpdate protocol is adopted by the class that encodes all data required to make a move – in most board games a simple x,y coordinate suffices because you only need to know where to set the piece. The update (or move) class needn’t consider whose move it is – that’s given by the activePlayer property of the GKGameModel class.

Note that it is not the point of the gameModelUpdatesForPlayer function to rate the moves. It should blindly return all possible moves, no matter how foolish or unnecessary. Therefore, do not add code to gameModelUpdatesForPlayer that attempts to decide whether a given move is meaningful for the player or whether it is dangerous and should not be considered.

How a move is rated is determined by the remaining two GKGameModel functions: applyGameModelUpdate and scoreForPlayer. Let’s look at the first one, applyGameModelUpdate. Implement it as follows in your TheGameModel class:

 
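A sketch matching the description that follows:

```swift
func applyGameModelUpdate(gameModelUpdate: GKGameModelUpdate) {
    guard let move = gameModelUpdate as? TheGameModelUpdate else { return }
    guard let player = theActivePlayer else { return }

    // Place the active player's piece on this model copy's board
    setBoardPiece(player.playerId, x: move.x, y: move.y)

    // Look-ahead simulates alternating turns: the other player is active
    // next (this simple switch only works for exactly two players)
    theActivePlayer = thePlayers[0].playerId == player.playerId
        ? thePlayers[1] : thePlayers[0]
}
```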

This method simply takes a GKGameModelUpdate instance (aka “a move”) and sets a board piece for the currently active player at the move’s board position. This changes the model’s board, and if the AI needs to look further ahead, the strategist will make a copy of the current game model and then re-run this function. This happens recursively up to maxLookAheadDepth times or until no valid moves are available.

This explains why increasing maxLookAheadDepth causes the number of evaluated moves to grow exponentially. This cannot be stressed enough: even for a TicTacToe game on a 3×3 board, with a maxLookAheadDepth of 7 the strategist runs through 12,494 move iterations for the very first move! This number then falls dramatically with every move: 4,834 for the second move, 2,547 for the third, 570 for the fourth. If you changed the board size from 3×3 to 4×4, you could easily get into the hundreds of thousands (if not millions) of move iterations for the first move!

In summary: a maxLookAheadDepth of 5-7 could easily be prohibitively slow to compute for more complex board games!

Of course this looking ahead only makes sense if each new copy of the game model assumes the other player as the active player. After all, one player has just made a move and now it’s the other player’s turn. Hence the code that goes through the players list and, if a player isn’t the same as the current one, assigns it as the new activePlayer. This works as long as you only have two players – if you need more players, you’ll obviously have to change this code.

Hint: you could refactor the player switching into a separate function in TheGameModel class, and then also call it from the switchActivePlayer method in the GameScene. That way you would encapsulate the players and how they take turns inside the game model.
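Such a refactoring might look like this (a sketch; switchToNextPlayer is a hypothetical name):

```swift
// In TheGameModel – both applyGameModelUpdate and the scene's
// switchActivePlayer could call this instead of switching themselves:
func switchToNextPlayer() {
    guard let player = theActivePlayer else {
        theActivePlayer = thePlayers[0] // X begins
        return
    }
    theActivePlayer = thePlayers[0].playerId == player.playerId
        ? thePlayers[1] : thePlayers[0]
}
```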

It’s all about the Score!

Finally, you get to the heart of the strategist: the scoring function, known as scoreForPlayer in the GKGameModel protocol. This is also called the heuristic, since it returns a value that rates how desirable a given board state is from the perspective of a given player.

There’s one thing of note: the evaluation results in a score based on the current board state, not the move that was made! Scoring (typically) does not consider the move that led to the current state. This influences the algorithm that determines the score. Allow me to illustrate this with an example:

[Image: X made a “good” move but it only scores a 3]

Here it is (was) X’s turn, and the most recent move made by X is the one highlighted in yellow. However, the scoring function doesn’t know that this was the most recent move; it just looks at the board as it is. It will therefore give the board a relatively neutral rating (score = 3) because clearly no one is going to win or lose in the next move.

At first sight that might seem counterintuitive, because clearly the AI should make this move! In fact, it’s the only valid option X has if it doesn’t want to lose!

However, let’s have a look at an alternative move – a total of 5 moves will be evaluated for this state. What if the X were placed in the lower-right corner? The scoring algorithm determines that X does have a chance to win on its next turn; however, it also notices that O has a chance to win on its very next turn:

[Image: While putting X in a good position, this move is almost certainly a loss, hence the negative score.]

The result is a score of -10: a combination of +40 for the possible win on X’s next turn and a -50 deduction because player O has the chance to win on its next turn. Making that move is therefore even less valuable for X than the previous move. This also means there is no need to explicitly determine whether a move prevented a win for O. It suffices that any move which leaves O a possible win scores negative; any move that blocks O’s sequence then ranks higher (even a score of 0 would do), which leads to the desired blocking behavior.

So within the scoring function you need to find the right balance between board states that strengthen the active player and those that leave the active player at a disadvantage.

What’s more, you have to look out for scores being added up when maxLookAheadDepth is set to 2 or higher. Say maxLookAheadDepth were 3 in the above example, and the AI for X expects a definite win to be possible 2 turns from now, rating it +100. This would then outweigh the near-term loss of -10. So the AI might decide to make the lower-right corner move despite its -10 score because it seems advantageous in the long term – the -10 is offset by the +100, outranking the score of 3 for the move that blocks O’s win.

In this case the AI would have failed to consider a guaranteed near-term loss. Hence you have two options: reduce the maxLookAheadDepth to avoid such a situation, or rate possible losses not as -50 but as -500 to guarantee that they are most definitely avoided.

Scoring is everything in making a good AI! That means balancing the score values not only for a given move, but also for the sum of scores of future moves (which probably won’t occur as planned). You might want to consistently downrate the score the further in the future a given board state lies. You could do so by tracking how far into the future a move is: initially count the number of free slots on the board and record that number in a static var in the game model class.
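A rough sketch of that idea (entirely hypothetical, not part of the GitHub project):

```swift
// Hypothetical additions to TheGameModel. A static var survives the
// model copies made during look-ahead; set it before bestMoveForPlayer.
static var freeSlotsAtTurnStart = 9

// How many moves this (copied) model state lies ahead of the real board:
func lookAheadDepth() -> Int {
    let freeSlots = board.filter { $0 == 0 }.count
    return TheGameModel.freeSlotsAtTurnStart - freeSlots
}

// Example: halve a score's weight for every move further in the future.
func downratedScore(score: Int) -> Int {
    return score / (1 << lookAheadDepth())
}
```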

That said, I think you’re eager to implement scoreForPlayer now:

 
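A sketch of the scoring function, using illustrative values taken from the examples above (+40 for a possible win, -50 for the opponent’s possible win, 3 as the neutral score); the remaining values, the enum name and the isWinForPlayer signature are assumptions – the full analysis method is on GitHub:

```swift
// The analysis result returned by isWinForPlayer (implementation on GitHub)
enum WinAnalysis {
    case WontWinThisMove
    case WinThisMove
    case PossibleWinNextMove
    case MultiplePossibleWinsNextMove
}

// Inside TheGameModel:
func scoreForPlayer(player: GKGameModelPlayer) -> Int {
    // Rate the board state from the active player's perspective
    let winScore: Int
    switch isWinForPlayer(player.playerId) {
    case .WinThisMove:                  winScore = 100
    case .MultiplePossibleWinsNextMove: winScore = 50
    case .PossibleWinNextMove:          winScore = 40
    case .WontWinThisMove:              winScore = 3
    }

    // Rate it from the opponent's perspective. A board state that lets
    // the opponent win next move must rank below any safe move. A win
    // for the opponent needn't be considered: it hasn't moved yet.
    let opponentId = player.playerId == 1 ? 2 : 1
    let dontLoseScore: Int
    switch isWinForPlayer(opponentId) {
    case .MultiplePossibleWinsNextMove: dontLoseScore = -60
    case .PossibleWinNextMove:          dontLoseScore = -50
    default:                            dontLoseScore = 0
    }

    return winScore + dontLoseScore
}
```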

This function first calls isWinForPlayer, passing in the active player’s playerId. Note that even though you switch players in the applyGameModelUpdate function, the scoreForPlayer function will always be called with the player that was initially set as the activePlayer when you called bestMoveForPlayer.

The isWinForPlayer method goes over each row and column, and also the diagonals, to find 2 or 3 connected board pieces of the same type. It then returns an analysis result, an enum that summarizes the board state: WontWinThisMove, WinThisMove, PossibleWinNextMove and MultiplePossibleWinsNextMove. Note that the latter is almost superfluous, because in TicTacToe you can hardly maneuver yourself into a fork where the other player can only block one of your two winning lines. But still, it’s there for reference.

I will not reprint the isWinForPlayer method here as it’s rather lengthy and tedious. You can look it up on GitHub though.

Given the result from the active player’s perspective, a winScore can be assigned. These are arbitrary values that you need to balance against the dontLoseScore, as well as against the sum of scores that GKMinmaxStrategist calculates internally by adding up each move’s score when looking multiple moves ahead.

Hence my recommendation to start with a maxLookAheadDepth of 1 – that makes testing the initial scoring function (heuristic) much easier. Then you can concentrate on getting the balance right for a depth of 2, and so on, until you reach the highest maxLookAheadDepth you want your AI to support. This also depends on the type of game: a chess strategist will need a decently high look-ahead, somewhere in the range a normal human being is capable of processing (i.e. 3–5 moves ahead), where near-term losses can indeed be good trade-offs in the long term. This clearly isn’t the case for TicTacToe, Connect Four or similarly simple games.

Next, the scoring function looks at the score from the other player’s perspective. This is important because you don’t want the other player to gain an advantage just because you only considered your own position – there are always two players playing. But since the opponent didn’t make a move, there’s no need to consider a win for the opponent; that’s just not going to happen.

But you do want to prevent a possible loss on the opponent’s next move, which is why the opponent’s possible wins deduct more (-50) than the active player’s own possible wins add (+40) – a win that would only come on the move after the opponent has moved.

Summary

This scoring function creates a decent TicTacToe AI, but not a perfect one. There are more cases to consider, for instance which move to choose if neither player will win or lose in the next turn. In theory, it’s possible to create a TicTacToe AI that, when playing against itself, has every game end in a draw – even with a maxLookAheadDepth of 1, although a depth of 2 or 3 should make the scoring function easier to write and still lead to perfect play.

If you want to become familiar with GKMinmaxStrategist, I encourage you to try and improve the scoring function so that you do get perfect play. Finding, understanding and fixing the flaws of this TicTacToe AI will teach you important lessons for any AI.

Despite TicTacToe’s apparent simplicity, there are many intricate details to consider. For instance, X should prefer the center position for its first move: since both diagonals run through the center, this prevents the O player from ever winning diagonally, severely reducing O’s chances of winning just by making the ideal opening move. That’s also why this game becomes rather dull once you’ve played it a couple of times. It’s easy for a human being to process; implementing these thoughts as code requires a lot more work.

The complexity of the AI only goes up the more complex the game is. Making a good chess AI certainly requires a large database of commonly played opening moves, as well as a database of mid- and endgame situations and moves, typically obtained from professional chess matches played around the world. Without a database of opening moves, a chess AI would often make really dumb moves right at the beginning.

On the other hand, who needs a perfect AI in a game that’s supposed to be fun? Having an AI that, on average, makes more sound decisions than dumb ones is often good enough – especially if even the dumb moves can prove surprising to the human player, perhaps giving the impression of an AI that is merely masking its strategy but really is up to something. Well, yes, you can actually turn the human’s own way of thinking against them. 😉

Further Reading

Lastly, here’s a summary of what we have so far in terms of GKMinmaxStrategist documentation and examples.


 

Have a thought or comment about this article? Please leave a comment below or post in the forum. Thank you!
