Sunday, December 22, 2013

Sudoku

Remarkably the evolutionary algorithm I developed for Eight Queens problem worked without modification for sudoku problem. The only differences were array size (DNA) and the fitness function.

This is perhaps not so surprising considering how evolution creates vastly different species (solutions) for different environmental conditions (problems).

Here is the pseudo-code of Evolutionary Algorithm:


Here is a diagrammatic representation:


It is simple really. You have a pool of population which is made of many solutions. At each iteration select fit parents, recombine their genes and allow mutation to make 2 distinct children. Sort the population by fitness score. Let the worst 2 die, and continue iteration until the best fit is found.

In the Eight Queens problem, from the chess-board, the array size is 8x8 = 64, whereas in Sudoku it is 9x9 = 81.

The fitness function calculates a positive number which increases with the level of fitness.

Usually the worst fit solution is calculable. In the case of Sudoku this is one of the worst solutions:

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
 
This is not even a legal acceptable final solution is Sudoku. But any solution in evolutionary landscape is legally born. When an individual solution's fitness is calculated it may or may not survive the next run (generation).

A perfectly fit Sudoku solution is achieved when all 1-9 digits present in 3x3 matrices along with no-clashing digits present across rows or columns.

Therefore in Sudoku the unfitness score is a number calculated by counting number of times a given digit is repeated across a row or a column and also adding the missing set of digits in all 3x3 sub-matrices.

The resultant fitness score is normalised form of unfitness score, i.e. fitness score is calculated by subtracting the calculated unfitness score from the maximum unfitness score (378).

Two runs from a typical output are given below:

-----------------------------------------
Run: 1
Best score 378
Found best '1' in 3540 iterations.
[4, 1, 3, 9, 8, 6, 2, 5, 7]
[7, 2, 9, 5, 3, 4, 8, 1, 6]
[6, 5, 8, 2, 1, 7, 3, 9, 4]
[2, 3, 7, 4, 5, 8, 9, 6, 1]
[8, 6, 4, 1, 2, 9, 5, 7, 3]
[1, 9, 5, 6, 7, 3, 4, 2, 8]
[9, 4, 2, 8, 6, 1, 7, 3, 5]
[3, 8, 6, 7, 9, 5, 1, 4, 2]
[5, 7, 1, 3, 4, 2, 6, 8, 9]
-----------------------------------------
Run: 2
Best score 378
Found best '1' in 8636 iterations.
[2, 7, 5, 3, 9, 4, 8, 6, 1]
[8, 1, 3, 5, 2, 6, 7, 9, 4]
[9, 6, 4, 8, 1, 7, 2, 5, 3]
[4, 2, 6, 9, 8, 3, 1, 7, 5]
[1, 8, 7, 2, 4, 5, 9, 3, 6]
[3, 5, 9, 6, 7, 1, 4, 2, 8]
[5, 4, 1, 7, 3, 2, 6, 8, 9]
[6, 9, 2, 1, 5, 8, 3, 4, 7]
[7, 3, 8, 4, 6, 9, 5, 1, 2]
...

I still have to tune the program for better performance. I may try different mutation or recombination strategies, play with population sizes and selection strategies to shorten the time required to create a sudoku solution.

The prototype was written in Python. Eventually I would like to port the algorithm to IOS 7 for creating the sexiest sudoku app ever.

The Evolutionary Algorithm I used was from the book Introduction to Evolutionary Computing1.

References:
1. Introduction to Evolutionary Computing, Agoston E. Eiben, J.E. Smith, Springer, 01/01/2003
2. Sudoku - Wikipedia
3. Eight Queens - Wikipedia

No comments: