Settlers of Catan is a fun but strategic board game that I have played extensively. For those who haven't played, just know that the game board is made of hexagonal tiles of various colors representing different resources. The game box comes with a selection of suggested tile layouts, which gives the game a lot of replayability. However, given how much I have played the game over the years, even these many layouts can start to feel played out. While there are a variety of apps and online tools for generating new boards, I have long wanted to try my own approach to the problem using entropy as a basis for board generation.
As a bit of a legend for what's to follow, here are the various types of tiles in Catan as well as the colors used to represent them for this project:
Hills produce brick and are represented here with RGB values of (180, 60, 30) (called "brick" in the code)
Forests produce wood and are represented here with RGB values of (10, 80, 10) (called "wood" in the code)
Pastures produce wool and are represented here with RGB values of (20, 200, 50) (called "sheep" in the code)
Fields produce wheat and are represented here with RGB values of (250, 230, 20) (called "wheat" in the code)
Mountains produce ore and are represented here with RGB values of (127, 127, 127) (called "stone" in the code)
Gold fields produce anything and are represented here with RGB values of (160, 140, 50) (called "gold" in the code)
Deserts produce nothing and are represented here with RGB values of (230, 210, 150) (called "desert" in the code)
Seas are areas for ships and are represented here with RGB values of (80, 210, 240) (called "water" in the code)
Suppose X is a discrete probability distribution with n possible outcomes x1 through xn. Mathematically we can write this in the following way:
The notion of Shannon entropy defines the amount of "information" gained by an event as the logarithm of the reciprocal of the probability of that event occurring. For example, when an event that is guaranteed to happen occurs, we gain no information; this is reflected by the fact that the logarithm of 1 is equal to 0. It is well-known that there are a variety of characterizations of Shannon's "information function", but fundamentally all we care about here is that we have our function I defined in the following way:
From here, one simply defines the Shannon entropy H(X) of the distribution X to be the expected value of this information function:
For the sake of this project, my interest largely involves the "efficiency" of X rather than entropy itself. It is well-known that the discrete distribution of maximum entropy is the uniform distribution, and additionally it is possible to show that said maximum value of entropy on n possible outcomes is exactly log(n). Normalizing our entropy value with respect to maximum entropy yields a value between 0 and 1 called the efficiency 𝜂(X) of X:
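The entropy and efficiency definitions above can be sketched in a few lines of Python. This is a minimal illustration of the math, not the project's actual implementation; the function name `efficiency` is my own.

```python
import math

def efficiency(probs):
    """Efficiency (normalized Shannon entropy) of a discrete distribution.

    H(X) = -sum(p * log(p)) over outcomes with p > 0, and the efficiency
    is eta(X) = H(X) / log(n), where n is the number of possible outcomes.
    Zero-probability outcomes still count toward n but contribute 0 to H.
    """
    n = len(probs)
    if n <= 1:
        return 0.0  # a single guaranteed outcome carries no information
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)
```

As expected, the uniform distribution achieves efficiency 1 and a degenerate (certain) distribution achieves efficiency 0.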
The process used here for taking a hexagonal tiling and converting it to a set of probability distributions is described below. Depending on the Catan expansion used, the first step is to define Rtypes as the set of possible resource tile types and to denote the size of this set as ntypes:
Next, the total number of tiles needed for the board depends on both the board size and the expansion used for the game; as such we will denote that number as ntiles. From here we define an adjacency matrix A where the entry in row i and column j is 1 if tiles i and j are adjacent and 0 if they are not:
Finally, we will use the vector vtypes of dimension ntiles to represent the choice of resource type associated with each tile on the board:
For convenience, it will be helpful to define two different families of tile subsets. The first family, denoted using the letter I, extracts the indices of tiles of a given resource type. The second family, denoted using the letter N, provides the neighboring tile indices for each space on the board.
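As a sketch of how these two families might be built in Python (the helper names `type_indices` and `neighbor_indices` are mine and not necessarily those used in the project code, and the adjacency matrix is taken to be a plain list of 0/1 rows):

```python
def type_indices(v_types, r_types):
    """The family I: for each resource type, the indices of tiles of that type."""
    return {t: [i for i, vt in enumerate(v_types) if vt == t] for t in r_types}

def neighbor_indices(A):
    """The family N: for each tile index, the indices of its adjacent tiles,
    read off from the rows of the 0/1 adjacency matrix A."""
    return {i: [j for j, a in enumerate(row) if a == 1] for i, row in enumerate(A)}
```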
At long last it is now possible to concisely define the probability distributions used for the algorithm. Generally speaking, each resource type has a probability distribution which is derived from the empirical frequency at which the various pairs of resource types are adjacent to each other:
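My reading of this construction can be sketched as follows: for a given resource type, count how often each resource type appears among the neighbors of tiles of that type, then normalize the counts into probabilities. Details such as how water borders or absent types are handled are my assumptions, not necessarily those of the project code.

```python
from collections import Counter

def adjacency_distribution(t, v_types, A, r_types):
    """Empirical distribution of neighbor types around tiles of type t."""
    counts = Counter()
    for i, vt in enumerate(v_types):
        if vt != t:
            continue
        for j, a in enumerate(A[i]):
            if a == 1:
                counts[v_types[j]] += 1  # tally the neighbor's resource type
    total = sum(counts.values())
    if total == 0:
        return {r: 0.0 for r in r_types}  # type t absent or isolated
    return {r: counts[r] / total for r in r_types}
```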
Now that we have our probability distributions, the focus of this project turns to how we can use them to create interesting Catan boards. A preliminary step here is to consider what the efficiency values of these distributions tend to look like for the officially released tilings provided in the game booklet. Broadly speaking the trends for these values can be summarized as follows:
Brick, wood, sheep, wheat and stone distributions tend to have an efficiency of approximately 0.85 regardless of the board size or expansion used
The desert distribution takes efficiency values of 0.8 and 0.5 for Original and Seafarers expansions respectively
Gold and water tiles only appear in the Seafarers expansion but their distributions tend to have efficiencies of 0.35 and 0.8 respectively
The overall goal of the board generation algorithm is to take a randomly generated tiling and to iteratively make adjustments that move the observed efficiency values closer to the "target" efficiency values shown above. The one main caveat here is that a lower target efficiency for water tends to lead to better board layouts. As such, we will end up using a target water efficiency of 0.6 for all Seafarers boards except for the 6-Wide boards, where 0.4 is chosen instead. Regardless, the following notation is used to denote the target efficiencies:
The general approach to this algorithm is to randomly select two tiles on the board and swap the resources between them. While uniformly randomly selecting tiles is one possible approach here, a strategy that is a bit more sophisticated is to favor swapping tiles whose efficiencies are too high with those that are too low. To that end, we can define the "raw error" of a tile type to be the difference between the observed and target efficiency values:
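In code, the raw error is just the componentwise difference between observed and target efficiencies; a minimal sketch, with the dictionary-based representation being my own convention:

```python
def raw_errors(observed_eff, target_eff):
    """Raw error per resource type: observed minus target efficiency.

    Both efficiencies lie in [0, 1], so each raw error lies in [-1, 1].
    """
    return {t: observed_eff[t] - target_eff[t] for t in observed_eff}
```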
One can show that these raw error values will always fall in the interval [-1,1] because the efficiency values themselves are between 0 and 1. The ultimate goal here is to use error values as pseudo-probabilities for a distribution, but before doing so we must address the requirement that pseudo-probabilities be non-negative. There are several possible solutions to this problem, but the one selected here is to linearly map the raw error interval onto the range [0,1].
This algorithm considers two possible versions of this linear mapping; as will become clear later, each method has its pros and cons. The first of the methods is to simply move the entire [-1,1] range onto [0,1] by dividing by 2 and adding 0.5. The "static" nature of always using the entire range leads to the main advantage of this method, namely its simplicity. However, assuming that raw errors tend to go to 0, this means that all normalized errors will end up converging to 0.5 over time; this isn't inherently bad but as we will see this implies that the distributions from which tile types are sampled will always converge toward uniformity. To combat this, we could instead "dynamically" determine the smallest 0-centered interval containing the current raw error values and map that onto [0,1]. The equation for doing so is very similar to the static case and using this dynamic normalization leads to interesting behavior in some cases:
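The two normalization methods described above can be sketched as follows (function names are mine; the zero-error fallback in the dynamic case is an assumption on my part):

```python
def normalize_static(raw_errors):
    """Static method: map the full [-1, 1] raw-error range onto [0, 1]."""
    return {t: e / 2 + 0.5 for t, e in raw_errors.items()}

def normalize_dynamic(raw_errors):
    """Dynamic method: map the smallest zero-centered interval [-m, m]
    containing all current raw errors onto [0, 1]."""
    m = max(abs(e) for e in raw_errors.values())
    if m == 0:
        return {t: 0.5 for t in raw_errors}  # all errors zero: map to midpoint
    return {t: e / (2 * m) + 0.5 for t, e in raw_errors.items()}
```

Note how the dynamic method stretches whatever range the errors currently occupy onto [0,1], so the normalized errors keep their spread even as the raw errors shrink toward 0.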
With this in place, we can now test the hypothesis that swapping positive and negative raw error tiles is a good strategy. The idea of Figures 1 and 2 is as follows: (1) create a randomly generated Seafarers 8-Wide board, (2) uniformly randomly select two distinct tile types, (3) uniformly randomly select one tile on the board from each of those types, (4) record the change in mean squared raw error (MSE) upon swapping the positions of the two selected tiles, (5) repeat the swap process for 1000 swaps per board, (6) and repeat the process for a total of 20 boards. We can do this in a variety of ways but the clearest narrative appears when using the static normalization method:
For this figure all swaps are accepted, regardless of their effect on MSE
The clusters of observed normalized error pairs form a shape that is roughly speaking a "square" for both plots
Notice how the swaps which lower the MSE by the largest amount tend to come from the top and right sides of this square
To describe this phenomenon quantitatively, the correlation coefficient between distance from the diagonal and decrease in MSE is equal to 0.272; not the strongest trend but a trend nonetheless
For this figure only swaps which lower the MSE are accepted; swaps which raise the MSE are reverted and not analyzed in these plots (meaning that there are fewer points in this set of figures)
Notice how the top 25% of swaps form a much larger cluster than the bottom 25%; we can't really say if normalized error pairs near (0.5,0.5) will be good or bad but we can say that pairs far from (0.5,0.5) will tend to be good for lowering MSE
As such, we have that the correlation coefficient between distance from the diagonal and decrease in MSE is 0.567; this is a significantly stronger trend than in the case where all swaps are accepted
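For reference, the MSE tracked in these experiments can be computed directly from the raw errors; a minimal sketch, with the function name being my own:

```python
def mean_squared_error(raw_errors):
    """Mean squared raw error across all resource types on the board."""
    errs = list(raw_errors.values())
    return sum(e * e for e in errs) / len(errs)
```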
The above figures demonstrate that swapping high and low normalized error tiles tends to lead to larger decreases in MSE. In terms of implementing this algorithmically, the general idea is to use the normalized errors as pseudo-probabilities when selecting the types of tiles that will be swapped. Once the tile types are selected, the actual tiles being swapped are uniformly randomly selected from within the chosen subsets of tiles.
To lay out the math needed for the above, the following equations are used in order to get the actual distributions from the normalized errors:
There are two main points to make regarding these equations. The first point is the "skew power" parameter denoted by s. The idea here is that the skew power lets us control how close to uniform we want our distributions to be: if s is 0 then we have uniformity, but if s is infinite (which is well-defined in the limit) then we are simply forcing the highest and lowest normalized error types to be selected. The second point is how the two above distributions are related. One distribution uses powers of the normalized errors as pseudo-probabilities while the other uses powers of the complement normalized errors. The result of this construction is that tile types that are likely in one distribution tend to be unlikely in the other.
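A sketch of this pair of distributions in Python follows. The infinite-skew-power branch and its tie-breaking behavior are my assumptions about how the limit would be implemented, not necessarily what the project code does:

```python
def swap_distributions(norm_errors, s):
    """Build the two tile-type sampling distributions from normalized errors.

    One distribution uses eta**s as pseudo-probabilities (favoring types
    whose efficiency is too high), the other uses (1 - eta)**s (favoring
    types whose efficiency is too low). s = 0 gives uniform distributions;
    s = inf puts all mass on the extreme types.
    """
    types = list(norm_errors)
    if s == float('inf'):
        # In the limit, deterministically pick the max- and min-error types.
        hi = max(types, key=lambda t: norm_errors[t])
        lo = min(types, key=lambda t: norm_errors[t])
        return ({t: 1.0 if t == hi else 0.0 for t in types},
                {t: 1.0 if t == lo else 0.0 for t in types})
    w_hi = {t: norm_errors[t] ** s for t in types}
    w_lo = {t: (1 - norm_errors[t]) ** s for t in types}
    z_hi, z_lo = sum(w_hi.values()), sum(w_lo.values())
    return ({t: w / z_hi for t, w in w_hi.items()},
            {t: w / z_lo for t, w in w_lo.items()})
```

A swap step would then sample one type from each distribution and exchange a uniformly chosen tile of the first type with one of the second.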
For clarity on the equations described above, please refer to this interactive Desmos notebook to see how the two tile type distributions are affected by things like the normalization method and the skew power.
The first important step in testing this algorithm is to see how the choices of normalization method and skew power affect the reduction in MSE as successive swaps are applied. The following method was used to generate the curves seen below for each choice of normalization method and skew power: (1) create a randomly generated Seafarers 8-Wide board, (2) perform 5000 random swaps on the board where swaps that increase MSE are rejected and reverted, (3) repeat this process for a total of 100 boards. The MSE values are tracked over time and averaged to create the needed curves. Note that skew powers of 0, 0.5, 1, 2, 4 and infinity were tested and that the same 100 random seeds were used for all method/power pairs.
Notice how an infinite skew power tends to decrease MSE the fastest in the early stages for both statically and dynamically normalized error methods
Ultimately other skew powers catch up but the infinite skew power is best for a larger number of steps in the static case than it is in the dynamic one
For finite skew powers, it is clear that there is not a significant difference between the curves when using static normalization
However, it is also clear that there is a significant difference between skew powers when using dynamic normalization, i.e. higher skew powers reduce MSE more quickly early on but fall behind as the number of steps increases
Note how an infinite skew power quickly stops reducing MSE in both normalizations
Now that we have an understanding of the pros and cons of the algorithm's parameters, we can finally look at some example Settlers of Catan boards as generated by this algorithm. The code in the catan_boards folder is capable of handling both Original Catan and the Seafarers expansion. However, the Seafarers boards with the additional tile types of water and gold are more visually interesting so here are a couple examples:
(Left) Seafarers 6-Wide randomly generated board before applying the entropy-based algorithm.
(Right) The same Seafarers 6-Wide board after applying the algorithm with a skew power of 1 for 3000 steps and static normalization.
(Left) Seafarers 10-Wide randomly generated board before applying the entropy-based algorithm.
(Right) The same Seafarers 10-Wide board after applying the algorithm with a skew power of 0.5 for 8000 steps and dynamic normalization.
Catan specific scripts from the catan_boards folder of the active_projects repository
Using entropy to generate Catan boards from catan_board_generator.py
Testing high/low error swap hypothesis from catan_create_efficiency_database.py
Comparing skew powers from catan_find_best_skew_power.py
Computing target efficiency values from catan_find_efficiencies_of_official_boards.py
Board rendering capabilities from the board_games folder of the infrastructure repository
Hexagonal polygons from Polygon.py
Entire board layout from Board.py
Additional basic functionality from the common_needs folder of the infrastructure repository
Custom RGB spectrum from color_helper.py
Attribute privacy from privacy_helper.py
SQL functionality from sqlite3_helper.py
File dialogs from tkinter_helper.py
Object type checks from type_helper.py