Introduction
When ordering old files, I found a lot of code. This is a program to find the solution to a jigsaw puzzle. I don't remember why I wrote it, it was a challenge I found interesting.
The Problem
There is a jigsaw. You have to put all the pieces in a square rectangle. You can flip and rotate pieces.
The solution
I wrote it first with an array of boards with pieces on it. Each piece that did fit without holes would be a new board. It was slow and it did use a lot of resources (for allocating arrays).
So I rewrote it in a way that the algorithm goes forward and backwards on the same board, adding and removing pieces.
The idea is pretty simple. You put the first piece (iterating an array of pieces). If it is OK, then you get to put the next one. If you can't, then you go backwards and remove it from the array. The only thing you have to remember is the order of the pieces so that when you go backwards, you can continue with the next piece.
The code - Loading the pieces
There are three classes for handling the pieces:
1111
0010
The class CPiezas
also has an array (map) with all the pieces (CPiezaInfo
) that it loads from the XML.
CPiezaInfo
: Contains information about the pieces and all its possible rotations and inversions. It contains a vector of CPieza
. Each CPieza
of the vector represents a possible rotation or inversion of the piece.CPieza
: Each item of this class represents a piece in a rotated or inverted state. It has the colour and the methods for rotation and inversion. As there are 8 possible combinations of rotation and inversion (4 for rotation and 2 for inversion), the ID holds both the piece ID and the state.For example, the ID 26 represents the piece ID:2, State:6. All possible states are defined in an enum TipoPieza
. This class inherits from CFillRectangle
which contains the general rectangle operations.
The code - Rectangle operations
The class CFillRectangle
is responsible for marking squares inside a rectangle. It was written with efficiency in mind (considering large vectors of boards). So, it uses binary logic to assign the squares. So, in a byte, 8 squares are used and marked as 1 or 0. This class is used for both boards and pieces.
The code - The board
The class CBoard
which inherits from CFillRectangle
represents the place to put the pieces. The variables are:
Fila
: Represents the current row to put a piece.Columna
: Represents the current column to put a piece.nodos
: An array of 12 items of type CNodo
. Each CNodo
represents a piece in a location of the board. Each CNodo
also has the index variable to move to the next piece in an array of pieces.inodo
: Represents the current position being processed in the array nodos
(0-11).nodeseval
: A counter for each piece that is being tried.mappiezas
: A map that links each piece ID with the CPieza
object.vecpiezas
: A vector to iterate pieces with the index variable of CNodo
.mappiezasdisp
: A map that represents if the piece is in the board or not (the piece is available or not).piezas
: A pointer to the pieces.
The methods are:
TieneHuecos
: Gets if the board has any unoccupied cells. This method is not used and could be removed.MarcarCasillas
: Given a piece (a parameter of type CPieza
), it marks or unmarks all cells of the board that correspond to the piece in the current location of the board (variables Fila
and Columna
).AsignarProxColumnaFila
: It assigns the board variables Fila
and Columna
to the first unmarked cell of the board from bottom to top, left to right.CorrespondePieza
: Checks if placing a piece in the current board cell (Fila
and Columna
) can be done.InitPiezasDisp
: Loads all the piece arrays (vecpiezas
, mappiezasdisp
, mappiezas
).Init
: Loads the 12 nodes of nodos
and sets them to null.MarcarPiezaInfo
: Given a piece, it marks or unmarks all of them in the map (in all positions) as available or not.GetIndexFirstAvailable
: Gets the next piece in the vector vecpiezas
that is available (checks mappiezasdisp
).
The code - The solver
The class CPuzzleSolver
is where the algorithm takes place. The variable "soluciones
" is used to store solutions (array of 12 CNodo
). The variable solman
is an object of type CSolutionMan
used to filter duplicated solutions. The method SolvePuzzle
inits the board and the arrays, and executes the method HacerMovida
until user cancels or the algorithm finishes. A description of HacerMovida
:
if (Current piece index in current node is last + 1, go backwards)
{
Put null into current node.
Move backwards (go to the node before, index of nodes -1)
if (There are no more nodes before)
End Process.
Position the board in the coordinates of the current node.
Remove the piece of the current node from
the board (we need to move to the next piece).
Add the piece of current node to the list of available pieces.
Move to the next available piece in the current node.
}
else
{
if (Add the piece of the current node to the board)
{
Add the piece to the board (mark cells).
Remove the piece from the list of available pieces.
Assign the current node the coordinates of the board.
if (not find next free cell in the board, the board is full, solution)
{
Add Solution
Remove piece from board
Add piece to list of available pieces
Move to the next available piece in the current node.
}
else
{
Move to the next node (which is null).
Assign the node the first available piece from the list.
}
}
else
Move to the next available piece in the current node.
}
The code - Deleting duplicates
The class CSolutionMan
uses an array of type CSolPuntos
to delete duplicated solutions. The class CSolPuntos
, unlike the board (CFillRectangle
), contains an ID of piece instead of a boolean (1 or 0).
The idea is that, to check for duplicates, it is way easier to rotate or invert points in a board than pieces.
The algorithm is:
- Iterate solutions (i).
- For each solution, rotate, invert, rotate, and invert (result: 3 new solutions).
- Compare each remaining solution (n-i sols) to the three solutions of the previous step.
- If equal, remove solution.
Algorithm X
There is a faster algorithm to solve this problem called Algorithm X developed by Donald Knuth. The most efficient implementation of this algorithm is dancing links.
The problem this algorithm resolves is "Exact cover". Basically, what it does is, given a matrix of 1s and 0s, find all rows that cover exactly once all columns with 1s:
| A | B | C | D |
P | 0 | 1 | 0 | 1 |
Q | 1 | 0 | 0 | 0 |
R | 1 | 1 | 0 | 0 |
S | 0 | 0 | 1 | 0 |
T | 0 | 0 | 1 | 1 |
In that example, the rows P, Q, and S and the rows R and T are the solutions for the given matrix. It seems simple, but for a larger matrix, finding all the solutions takes time.
The algorithm X consist of 4 steps that are executed until the matrix is null (a solution is found) or a column has all 0s (unsuccessful). Each iteration reduces the matrix size.
- Find the column with fewer 1s (if there is more than one, choose the first). For example, column A.
- For that column, choose a row where the intersection of the row and column is 1. Remember the row. For example, Row Q.
- For that row, choose the columns where the row value is 1 and for those columns, remove all rows where the value of the column is 1 (in this example, the only 1 of row Q is in column A, so remove rows Q and R).
- Remove the columns selected previously (A in this example).
After executing the steps, the matrix has column A removed and rows Q and R removed:
Next we choose column B and from that column, Row P. We remember that row. From row P, we select columns B and D (where row P value is 1). Columns B and D have 1s in row T and P (of course). So, rows P and T go, and so columns B and D:
Here we select column C (we have little choice) and select Row S. Row S and column C go and we end with an empty matrix (success). The selected rows are Q, P, S. This is one solution.
To continue, we select the next 1 row of the column with fewer 1s (column A) in the original matrix. That row is Row R. Row R has 1s in columns A and B, so we select those columns. Those columns have 1s in rows P, Q, and R. So, rows P, Q, and R go, and so columns A and B:
Now the column with fewer 1s is column D. The first row with 1 is T. So we select Row T. Row T has 1s in columns C and D, which have 1s in all rows, so we end with an empty matrix again (success). The solution are rows R and T.
Implementation of Algorithm X, dancing links
For a large matrix, removing and restoring rows and columns are time expensive operations. Besides, in most problems, these matrixes are "sparse" which means that there are a lot of 0s and a few 1s. So, it is a good idea to only represent the 1s.
First we introduce the concept of double linked circular list:
Each item has two pointers, one to the left and one to the right. The final item points to the first.
To iterate the list moving to the right, starting with Item2
:
For (i = Item1.Right; i != Item1.Right; i = i.Right)
To remove an item (Item2
for example) using only Item2
:
Item2.Right.Left = Item2.Left
Here, we point Item3.Left
(Item2.Right.Left
) to Item2.Left
(Item1
) so that it "skips" Item2
. Item2.Right
is Item3
. Item2.Left
is Item1
.
Item2.Left.Right = Item2.Right
Here, we point Item1.Left
(Item2.Left.Right
) to Item2.Right
(Item3
) so that it "skips" Item2
. Item2.Left
is Item1
. Item2.Right
is Item3
.
To add an item (Item23
for example) at the right of Item2
(between Item2
and Item3
) using only Item2
:
Item2.Right.Left = Item23
Item2.Right = Item23
Here we point Item3.Left
to Item23
and Item2.Right
to Item23
. Item2.Right
is Item3
.
Item23.Right = Item2.Right
Item23.Left = Item2
Here we point Item23.Right
to Item3
(Item2.Right
) and Item23.Left
to Item2
.
Now, the problem of algorithm X can be represented with each 1 as a node and circular double links lists interlaced in a matrix with nodes that point up, down, left, and right. Representing the sample problem:
In this image, the node at Row S is alone in the row, so it points both left and right to itself. Each node also has a pointer to its column header (in the algorithm).
Representing the pentominoes problem as exact cover
We can represent the board as 60 "cells" to be filled. Each piece with its orientation, at a certain position in the board "covers" certain "cells". For example, the T piece, starting in the position 1 covers cells 1, 2, 3, 8, and 14.
So, we can make a grid with 60 columns, and represent every position (in the board) of each piece at all orientations. It is a pretty big matrix. But each piece can only be put in the board once, so we need to make sure it doesn't repeat. To do that, we add 12 columns (one for each piece) and we put a 1 in the corresponding piece. If the T piece is piece 3, the representation of the three boards filled with T piece starting at 1, 2, and 3 would be:
Of course, there are 60 board columns and I only represented 17, the rest of the columns have 0 values for this example. For this problem, it is a sparse matrix. Of 72 values for each row, only 6 will have 1 value (8%). The matrix has 1864 rows if we only consider a piece with 2 of 8 orientations to remove trivial solutions.
Trivial results
To avoid rotated or inverted solutions, you can use only 2 orientations out of a piece which allows 8 orientations.
Dancing links, the code
The two most important methods are the cover and uncover (or remove
and restore
). Given a col, they remove the col and all the rows where the col has an item. Visually, it's like removing a TV antenna:
void CDLinks::Remove(CColumnHeader* hcol)
{
hcol->right->left = hcol->left;
hcol->left->right = hcol->right;
for (CNode* row = hcol->down; row != hcol; row = row->down)
{
for (CNode* col = row->right; col != row; col = col->right)
{
col->down->up = col->up;
col->up->down = col->down;
col->header->len--;
}
}
}
void CDLinks::Restore(CColumnHeader* hcol)
{
for (CNode* row = hcol->up; row != hcol; row = row->up)
{
for (CNode* col = row->left; col != row; col = col->left)
{
col->header->len++;
col->down->up = col;
col->up->down = col;
}
}
hcol->right->left = hcol;
hcol->left->right = hcol;
}
Results
In my laptop (slow), in 11 min 13 seconds, the program found 9356 solutions. After deleting duplicates, 2339 solutions remained (which is logically correct, 9356/4=2339). It checks for 288 949 150 nodes (try to put pieces in the board).
Using dancing links and avoiding trivial solutions, the program takes 14 seconds to find all the solutions.
This shows that dancing links is a much faster algorithm to calculate solutions for this kind of problems.
To improve speed, all the nodes are stored in a vector, with the reserve()
method. That way, there is only one allocation instead of one for each node being created.
Libraries
I migrated the solution to VS 2010. It uses MFC. The code compiles to both 32 and 64 bits.
Acknowledgements