TicTacToe - MiniMax Without Recursion

Christ Kennedy

4.63/5 (4 votes)

Dec 12, 2018

CPOL

8 min read

16744

407

An exhaustive search of all game-states used to trace Mini-Max decisions from End game up through to Start game

Download TicTacToe_2.0 - 102.9 KB (recursive code)
Download TicTacToe - 2.1 MB (exhaustive search code)

Introduction

I came up with the idea of making an exhaustive search of all viable tic-tac-toe game states to build an AI-like game, thinking "that's not really A.I. but it might be a fun project".

Background

After about ten days of messing around debugging all the issues with resolving which game states are viable ancestors to the current game state and sending up information like TurnsToExWin and TurnsToOhWin and Win_Lose ratio and a multiple of other dumpster diving dead-end data attempts, I got the thing to do what I wanted - generate all states and send up the information but the results were, to put it kindly, lacklustre. Then I read this article, How to make your Tic Tac Toe game unbeatable by using the minimax algorithm. And the Mini-Max thing came alive. I slept on it relieved to have been given the answer on a platter, then implemented the Recursive Mini-Max tic-tac-toe game in a couple of hours and decided I knew what I was doing.

Using Mini-Max with the core of what I had written already wasn't so bad since all the ancestor/descendant problems were already resolved. A few mistakes later actually rethinking the results of recursion from a different perspective and here it is. An algorithm that generates all possible game states and steps back through them all systematically in order to reproduce the results of a recursive Mini-Max algorithm so that now it simply goes straight to the right record on file, looks up the Moves and picks one appropriate for its IQ setting.

Using the Code

The project is fairly friendly enough for your sisters' kids to play it. Developers might be interested in some of the components available in the CK_objects such as the HSlider which depicts a graphical slider along a path that lets the user set an integer value, here the Computer IQ.

Where the recursion Mini-Max looks like this:

int evaluateMove(classGameState cGameState, enuSquareStates eStatePlayer)
{
  intRecursionCounter++;
  int intMyID = intRecursionCounter ;
  enuBoard_State eBoardState = cGameState.getState(ePlayerTurn);
  enuSquareStates eOpponent = getOpponent(eStatePlayer);
  switch (eBoardState)
  {
    case enuBoard_State.Draw:
      return 0;
    case enuBoard_State.Lose:
      return -10;
    case enuBoard_State.Win:
      return 10;
    default:
      int[,] intMoves = new int[3, 3];
      List<classPointListElement> lstMoves = new List<classPointListElement>();
      for (int intY = 0; intY < 3; intY++)
        for (int intX = 0; intX < 3; intX++)
        {
          if (cGameState.board[intX, intY] == enuSquareStates.blank)
          {
            Point pt = new Point(intX, intY);
            classGameState cNextGameState = new classGameState(cGameState.board, pt, eOpponent);
            enuBoard_State eNextBoardState = cNextGameState.getState(ePlayerTurn);

            intMoves[intX, intY] = evaluateMove(cNextGameState, eOpponent);
            classPointListElement cPLE = new classPointListElement(pt, intMoves[intX, intY]);
            lstMoves.Add(cPLE);
          }
        }
      IEnumerable<classPointListElement> query;
      if (ePlayerTurn == eStatePlayer)
        query= lstMoves.OrderBy(pointListElement => pointListElement.value);
      else
        query = lstMoves.OrderByDescending(pointListElement => pointListElement.value);
      lstMoves = query.ToList<classPointListElement>();
      return lstMoves[0].value;
  }
}

and calls itself (recurses) for every turn the computer makes. The Exhaustive search PlayAi() function needs only decide what choice to make based on its IQ setting.

void PlayAI()
{
  if (cGame == null) cGame = classBinTree.instance.BinRec_Load(0).Record;

  List<classPointListElement> lstMoves = new List<classPointListElement>();
  classEnumeratedTypes.enuPlayer eOpponent = classTicTacToe_Record.NextPlayer(cGame.ePlayer);
  for (int intY = 0; intY < 3; intY++)
    for (int intX = 0; intX < 3; intX++)
      if (cGame.Descendants[intX, intY] > 0)
        lstMoves.Add(new classPointListElement
        (new Point(intX, intY), cGame.Moves[(int)cGame.ePlayer, intX, intY]));
  IEnumerable<classPointListElement> query = 
              lstMoves.OrderByDescending(move => move.value);
  lstMoves = query.ToList<classPointListElement>(); // lstMoves[0] 
                                   // has the best choice to play next

 // choose a path appropriate for this IQ
      int intRnd = 0;
  if (ghs_IQ[(int)cGame.ePlayer].Value > 90)
  {
    List<classPointListElement> lstTopMoves = new List<classPointListElement>();
    for (int iCounter = 0; iCounter < lstMoves.Count; iCounter++)
      if (lstMoves[iCounter].value == lstMoves[0].value)
        lstTopMoves.Add(lstMoves[iCounter]); intRnd = cRnd.getInt(0, lstTopMoves.Count - 1);
    cGame = classBinTree.instance.BinRec_Load
    (cGame.Descendants[lstTopMoves[intRnd].pt.X, lstTopMoves[intRnd].pt.Y]).Record;
  }
  else
  {
    int intLogBase = 6; // the higher the logbase,
                        // the less biased the result will be towards Minimum
    intRnd = cRnd.getBiasedInt(0, lstMoves.Count - 1, 
             ghs_IQ[(int)cGame.ePlayer].Value, intLogBase);
    cGame = classBinTree.instance.BinRec_Load
                 (cGame.Descendants[lstMoves[intRnd].pt.X, lstMoves[intRnd].pt.Y]).Record;
  }
}

It reads the [3,3] array of moves for the current player, copies them all into a list retaining their cartesian indices along with their play-value, then reorder the list in Descending order so that the best choice for a next move is top of the list. After this is done, it randomly chooses one of the permissible moves using a biased random number generator which favours low range values for higher Grade input parameter (here IQ) so that the end result is the higher the IQ of the computer player, the more likely it will choose the best path. Here, if the computer has an IQ higher than 90, it will select from the equally valid best paths only.

The getBiasedInt() was an after-thought and gets decent results. It chooses a random number between 0 and MaxRange, then resets the MaxRange to the previous random value and loops n times where n is a direct function of the ComputerPlayer's IQ. The greater the IQ, the more times it loops and the more likely a low number will result and the higher-IQ computer will choose the better paths to play. It's not perfect but it's pretty good and that's not what this article is about, so I kept it anyway.

Points of Interest

The Mini-Max article listed above is very clear and well written. I wouldn't have wasted a week trying to get the right data to my PlayAI() function if I had read it earlier and I refer you to it for any questions about Mini-Max. Things are a lot easier to do when the algorithm is handed to you on a plate like that one is!

What this non-recursive algorithm does is it removes the need for the AI to think once the data is built.

FindEndGames

To build the data-file, what the algorithm does is it first makes a breadth-first-search of all possible game states. Essentially, it lists all possible moves from a start position and places them in a Queue. Then, it picks the first one off the Queue and processes it by merely repeating what it just did -> it too places all possible moves from that game state into the same queue, then picks the next game-state off the queue and proceeds until there are no more game states to be found.

In order to retrieve the game states once they're in the file, each game state's file record is a binary tree leaf where the leaf's search key is its unique game-state ID. The unique Game-State ID is an integer value that describes the game board using a base-3 9 'decimal-place' numbering system where the top-left (0,0) square on the board is the least significant (3⁰), the next one over (1,0) is the next least significant (3¹) and so on until you get to the bottom-right (2,2) which is the most significant (3⁸). Letting a blank space = 0, Ex =1 and Oh =2 we multiply the contents of the squares by 3ⁿ and get a unique ID that describes the game state and can be used as a search key for the binary tree in which the game-states are stored.

Alternately, bypassing the binary-tree altogether would accelerate the game-state record retrieval if the uniqueID is used as an index in the file. Doing so, however, would significantly increase the size of the file but since Harddrive storage devices are so abundant and cheap, that may be a more suitable solution if speed is your game.

All game states are sorted into 9 separate files that each contain a list of indices to the records in the binary-tree compiled during the FindEndGames phase of the data rebuild. These indices point directly to the record in the binary-tree file and spares us the need to perform a search for the UniqueID of the record we're looking for. These indices are used throughout the rebuild stage as well as the playAI() stage such as this line here seen above which uses the classBinTree's load function to retrieve a record using a random index of the lstTopMoves[].

 cGame = classBinTree.instance.BinRec_Load
 (cGame.Descendants[lstTopMoves[intRnd].pt.X, lstTopMoves[intRnd].pt.Y]).Record;   }

TraceEndGames

At this point, all the game states have been defined and are stored in the binary-tree file on the harddive. Each one points to its own descendants (game states that can be achieved from current game state). The Descendants[3,3] array of each record holds the indices of each of its viable descendants but none of these records know how to point to their own Ancestors yet.

The algorithm scans each list of game states that were sorted during the FindEndGames phase of the data rebuild, starting with the most crowded game-states that either result in Draw or ExWin end games (Ex has the last and first turn placing 5 Exes on the board while Oh only has 4). For each record, it determines what kind of game state it is. For end-games, a value is assigned of either -10, 0 or +10 depending on the game's end and the player in question. Both Ex and Oh hold Mini-Max information in each record that allows the AI to then choose a path in exactly the same way as it would if it had gotten the results from a recursive search of each available square on the board. If the game-state is not an end-game, then it needs to determine what value to 'return' up to its ancestors in a way that mimics the recursive search results. It checks the contents of all its own Moves[,] array elements which were set during the previous scan-file level (the scan of the file holding all the game-states that were one game-square apart from the current scan file's game states) then sends that information up to its own ancestors and stores it in their Moves[] array.

The tricky part about passing the information up to the ancestors is that in Tic-Tac-Toe each player takes a turn so for non-end-game game-states you can remove any one of the opposing player's pieces and generate a viable ancestor. However, end-games that result in a win for the opposing player (current player inherits the end game after their opponent has played) the only viable ancestors are the three squares that hold the three winning pieces. The exception to this rule is when the game ends with two sets of three of the opponents pieces making a row or column.

public enum enuTypeWin { H1=0, H2, H3, V1, V2, V3, RS, BS, Multiple, _numTypeWin };

In those cases, here referred to as 'multiple', there is only one viable ancestor and that is the one square that is common to both winning lines of three of the opponent's pieces.

Once the ancestors have been determined, each one is provided with the Mini-Max information for the Ex and Oh respectively. This information is stored in the Ancestor record's Moves[,] array corresponding to the square that leads to the game-state currently being processed. Since the files are scanned from the most crowded game-states to the least crowded in sequence, the Mini-Max information is passed along upwards until it reaches the starting (empty) game state at record index 0 (and root node) of the binary-tree file.

History

History? I dunno. I looked up wikipedia for some history on Mini-Max and thought I'd find something clever to punctuate this article, with but it ain't happening.

Future? Well, now. Tic-tac-toe adds pieces per turn and is fairly predictable so it was a relatively easy project to complete in that sense so I'm thinking maybe checkers? Not sure. Ultimately, Chess would be nice.