By Matt Corrente, Paul Miller, Luigi Sanchez, and Austin Suarez
The Mancala AI is a project that I got to work on with some very talented programmers (Matt Corrente, Paul Miller, and Luigi Sanchez). Our project was to use artificial intelligence in a unique way. Although AI for games has been done we thought it would be interesting to do a study on how each AI algorithm can compete against one another. We built out it out three different AI algorithms ( Min-max with Alpha beta, Monte Carlo, and Evolutionary) to run on a Flask app.
“It’s an ancient family of board games, and there are numerous variants. Many different forms of Mancala are played across Africa, though the game may originally have come from ancient Sumeria.”
Mancala is a two player turn-based board game.The goal is to capture the most beans and stones into your side.
Since Mancala is a game with a clear set of rules it allows for each AI to calculate a path to victory fairly. By fairly, we mean that some variations of this game are not perfect and allow for an extremely calculate the first move that is capable of ending the game.
- The Mancala board is made up of two rows of six holes, or pits, each.
- Four pieces—marbles, chips, or stones—are placed in each of the 12 holes.
- Each player has a store (called a Mancala) to the right side of the Mancala board.
- The game begins with one player picking up all of the pieces in any one of the holes on his side.
- Moving counter-clockwise, the player deposits one of the stones in each hole until the stones run out.
- If you run into your own store, deposit one piece in it. If you run into your opponent’s store, skip it.
- If the last piece you drop is in your own store, you get a free turn.
- If the last piece you drop is in an empty hole on your side, you capture that piece and any pieces in the hole directly opposite.
- Always place all captured pieces in your store.
- The game ends when all six spaces on one side of the Mancala board are empty.
- The player who still has pieces on his side of the board when the game ends capture all of those pieces.
- Count all the pieces in each store. The winner is the player with the most pieces. 
Mancala represents a decision problem since each player must decide on a single move to make. Each of our AI agents makes use of varying algorithms for decision making.
- Minimax with Alpha-Beta Pruning
- Monte-Carlo Tree Search
- Evolutionary Algorithm with binary encoded moves
The algorithms we used vary in the method of deciding. For example, our Minimax, Monte-Carlo and Evolutionary algorithms make use of traversing the current game tree. However, both searching algorithms use different heuristics to limit the state space being search. The state space for an entire game is very large, and running an exhaustive search on the entire game tree will run in about O(6^n) time. Where n is a number of total turns, where each player has at most 6 choices. A naive exhaustive searching algorithm for this is completely inefficient, so adding a maximum lookahead depth allows us to bound the search space for our algorithms making them run in a reasonable enough time to play against.
All our algorithms make use of a variable heuristic function. This function analyzes the current state of the board to help the AI make the best possible decision. This heuristic can be used to:
- Maximize AI’s score
- Maximize the difference between the AI and players score
- Maximize the number of pebbles on the AI side
- Maximize the difference between the number of pebbles on AI side from player’s side
Or a weighted polynomial of the above (x = score, p = pebble diff = x+p*0.3)
Modifying the evaluation heuristic also changes the difficulty of the AI. For instance, in the Minimax algorithm, evaluating the score performs much better than evaluating a number of pebbles on each side. Also, measuring the relative score between the player and the AI performs better than just maximizing the AI. The reasoning behind this would be that the AI not only maximizes his own score but also makes sure that the next best move for the player will be far lower than the AI’s score. So, it would preference 9 vs 3 over 10 vs 8.
Each of our algorithms performs differently. The Minimax algorithm with Alpha-Beta pruning is a more exhaustive searching making it much slower but is also much more difficult to beat. With a turn lookahead depth of around 6-8, the AI is quick enough to play and very difficult to beat. However, if the lookahead is not set and the AI searches the entire game tree, the AI will be unbeatable, however, due to the time complexity of the game tree, the AI takes far too long to decide on a move even with alpha-beta pruning. So, both Monte-Carlo and Evolutionary Algorithm decision-making approaches are much more user-friendly. The Monte-Carlo algorithm makes uses of a probabilistic adversarial decision making, which chooses the other players most likely move. This allows the algorithm to be far less exhaustive making it capable of making more human-like decisions, since the enemy may not always choose the predicted move. The Evolutionary algorithm makes use of binary programs that encode possible moves for the AI. The populations are initialized randomly and evolved using probabilistic mutation and crossover. Each population is evaluated by testing the heuristic for each next potential move, and the moves with the highest scores are more likely to be chosen for reproduction.
This was a very interesting project. Since the game tree is simple enough to visualize, yet results in an exponential explosion of possible game states. Each AI plays the game differently thus it is very interesting to see the result from the AI vs AI.