The Sudoku


When I was in Australia , some of my flatemates spent a lot of time in trying to resolve sudoku grids. I’ve never been very attracted by the game. But I wanted to make a joke: shamming to resolve manually the sudoku very quickly to see their faces ;-). Actually I wanted to create a program to resolve the problem and wrote the solution to the real grid. I wasn’t an expert in Sudoku. I thought there were only few logical rules to solve it. So I quickly created an algorithm and didn’t spend much time to study the problem. In the afternoon the program was coded and the problem apparently solved: I was able to resolve the sudoku grids (and the Samurai Sudoku) from the Sydney Morning Herald newspaper. Mission accomplished. I stoped to work on this problem. Few months after, I was just curious to know what kind of algorithm used the other people and began to study more precisely the sudoku. I discovered my algorithm couldn’t resolve all the sudokus: there are many other logical rules to successfully solve hard sudokus and sometimes it’s apparently impossible to resolve logically some sudokus!. That’s why I wrote this article, summary of my study.

The sudoku puzzle

The Sudoku is a popular puzzle, printed daily in newspapers in Japan, the United Kingdom, and the USA, in which the aim is to fill a 9×9 matrix of cells with digits from 1 through 9. Each puzzle consists of a grid in which some digits have already been filled in, and the goal is to fill in the remaining cells so that each digit appears once in each row, once in each column, and once in each of nine 3×3 squares into which the grid has been subdivided. Several (difficult) examples of such puzzles are shown in the figures of this section.

original Sudoku

Samurai Sudoku

Killer Sudoku

Mathematical point of view

Sudoku is a constraint satisfaction problem (CSP).The type of constraint used in this puzzle is called “all different constraint” [9]. For a set of variables noted V and a set of values D such that the cardinality |V|=|D|, each variable v must have a value d, different from other variables. It relates to problems in Graph Theory, job assignment (or Marriage Problem), and more recently, processor scheduling for massively parallel computer systems. Algorithms for solving the Marriage Problem are also used in Linear Algebra to reduce matrices to block diagonal form. Although Sudoku is NP-complete [4], it is not difficult for a computer program to solve most Sudoku puzzles quickly by a brute force backtracking search.

“proper” and “improper” sudoku puzzle

I read several forums and found that apparently all sudokus are not solvable by logical rules. in [2] , D. Eppstein define 2 types of sudoku:

A proper Sudoku puzzle must have a unique solution, and it should be possible to reach that solution by a sequence of logical deductions without trial and error methods. we have to distinguish “proper” puzzles that allow human step-by-step solution from “improper” puzzles that seemingly require trial and error for their solution.

Well.. ok , but how can we know, in the case a logical solver fails to solve a puzzle, if it’s because the puzzle is “improper” or because the solver doesn’t cover all the logical rules? Are we able to demonstrate that a sudoku is proper or improper? At this question I have no answer and didn’t find a more mathematical definition of “proper” or “improper”. Another mystery: What is the minimal number of initial values allowing us to solve a sudoku?, this answer is still unknown. According to Wikipedia [1] the winner is a japanese who generated a sudoku with only 17 initial values, but without proof of the optimality.

Try and error methods are not fun!

A competition [3] of sudoku organized for the undergraduate students in the oxford university in 2005 shows programs that solve the sudoku mainly by try and error methods. For the best programs, the solution is done in about < 0,5s with a source code smaller than 300 lines.
However, it is more interesting to instead attempt to mimic human Sudoku solvers, and derive rules that solve Sudoku puzzles without backtracking, first for the standard AI reason that such attempts can teach us much about the power of human and machine reasoning, and second because human-like problem solving capabilities allow us to automatically estimate the difficulty of Sudoku puzzles for human solvers by examining the rules necessary to solve each puzzle. In fact computer solvers can estimate the difficulty for a human to find the solution, based on the complexity of the solving techniques required. This estimation allows publishers to adapt their Sudoku puzzles to audiences of varied solving experience.

What are these logical rules?

At the beginning I thought that only 2 or 3 basic rules should be sufficient to solve a sudoku.But I was suprised by the number of existing rules. There are a lot of different rules to resolve deductively a sudoku grid, from the easiest which is just to fill a cell if it has only one value as possibility, to much more complicated rules needing a deeper analysis of the puzzle. You can read the most common rules explained in this ‘sudoku hints’ page [7]. But for the fiendship sudokus there are also other rules more difficult to code, see “Single-digit Sudoku rules” [5] for an example. In these cases, it’s a real temptation to no attempt to a backtracking search method;-)

My Approach

Try to resolve the sudoku with a “divide to conquer” approach.

I started with a basic idea of a cellular automaton representation of the problem: The neighbours of the cells wouldn’t be its physic neighbours on the grid but the cells linked to it by a constraint. So each cell would have 3 neighbourhoods representing the 3 types of constraints (Block, vertical, horizontal). For instance, The cell (0,0) has the 3×3 block neighbourhood of cells {(0,1),(0,2),(1,0)..(2,2)}, the horizontal neighbourhood {(0,1),…,(0,8)} and the vertical neighbourhood {(1,0),..(8,0)}. So the cells share a set of constraints neighbourhoods. They can deduce their states (i.e. their values) from the state of their neighbours by a (logical) rule, like cellular automaton.

But behind this original idea is the idea of decomposing the problem in several levels of sub problems and locally try to resolve the problem. So we can see the grid as a set of several entities that are the constraints (so there are 9 3x3blocks + 9 horizontal + 9 vertical constraints). These entities are composed of several sub-entities that are the cells included in each constraint ( so each constraints have 9 cells). An entities can be included in several parent entities. Example: A cell can be included in several constraints (actually 3), and a constraint in several sudokus (see the puzzle “Samurai sudoku”,a 5 interconnected sudokus puzzle).

Each entity try to resolve its own problem which is to associate each element of its variable domain V (for a cell it’s only 1 variable, for a constraint it’s the 9 variables of the 9 associated cells V={x1,x2,…,x9} ) to a element of its value domain D with all the possible values (at the beginning [1,9]) until to have an empty no resolved value domain.

This Entity/Object-oriented approach allow us to:

  • clearly separate the different level of analysis: each entity has its own logical rules.
  • allow to very easily implement sudoku’s variants like Sumarai (5 inter-connected sudokus) or Killer sudoku (no initial value, but a new kind of constraint: the sum of several cells must be equal to a given value) by adding only a new kind of entity.

Killer Sudoku representing with this approach

Try to generalize the rules

Sometimes, in the common rules presented in [7], in wikipedia or in other places, we can generalize some rules that are actually specific cases of a more global rule. Even some serious articles like [2] present 10 local rules to solve a sudoku, which is not really interesting to understand the general logic of the resolution of a sudoku.

Cell rule:

if I have only 1 possible value in D, it’s the solution

Constraint rules: the 3 generalized rules
  • 1) Divide to conquer rule:

Sometimes, in a constraint, we have a subset “V” of values needing to be placed in a subset “C” of cells even if we don’t exactly know which cell is for which value. If the cardinalities |V|= |C|, it’s a alldifferent sub-constraint, included in the original constraint. So We can split the original constraint in two new sub-constraints, and remove the values and the cells from the original constraint, belonging to the hidden constraint.

For instance, we have the constraint C where D={1,..,9} and V={cell1,..cell9} and the value domain “Dcelli” of the cells:
Dcell1={2,3,4,5,6}, Dcell2={3,4}, Dcell3= {2,4,6}, Dcell4= {5,6,8}, Dcell5= {7,6,8}, Dcell6={5,8,9}, DCell7={5,7} , Dcell8={8,7,6,1}, Dcell9={1}. We can observe that:

– The Cell9 is already found. so it’s like we have a AllDiff sub-constraint C1 with D1={1} and V1={cell9}. The original constraint C can be split in two subconstraints: C1 and C’ with: D’ =D intersect D1={1,..,8} and V’ = V intersect V1={cell1,..cell8}. So the domains Dcelli of the cells of C’ are: Dcell1={2,3,4,5,6}, Dcell2={3,4}, Dcell3= {2,4,6}, Dcell4= {5,6,8}, Dcell5= {7,6,8}, Dcell6={5,8,9}, DCell7={5,7} , Dcell8={8,7,6}.

– But if we observe C’, we can see that it’s possible to divide it again. In fact we have a sub-constraint C2 with D2={2,3,4} and V2={Cell1, cell2, cell3}. So we can split C’ to have now the new constraint C” with : D”=D’ n (i.e intersect) D2={1,5,6,..,8} and V”=V’nV2={cell4,…cell8}.

So finally from 1 Alldiff constraint, we have 3 Alldiff sub-constraints:
C1: Dcell9={1}
C2: Dcell1={2,3,4}, Dcell2={3,4}, Dcell3= {2,4}
C”: Dcell4= {5,6,8}, Dcell5= {7,6,8}, Dcell6={5,8,9}, DCell7={5,7} , Dcell8={8,7,6}

This rule generalizes the Hidden Single/Pair/Tripe rules and Naked pair/tripe rules: all of theses rules had finally a common logic.

After finding this rule, surprised that everybody talks about decade of rules but not this kind of generalized rule, I decided to search more in the web and finally came across a guy who found this rule and 2 others [6].

  • 2) Xor Reduction:

Take any Row or Column denoted P and a Block denoted B where C = P is included in B and C is not empty. Any candidate values not represented within the set P-C can be removed from all items in the set B-C. Likewise, any values not represented within the set B-C can be removed from all items in the set P-C.

This generalized rule covers the popular rules: Single Box, Intersection Removal, Pointing Pairs

  • 3) Symbol Reduction:

For each possible value V, if there are K rows which together contain exactly K possible columns for the value V, then any other rows with candidates of value V in those columns can have the candidates removed. This argument can be applied to Columns as well. Symbol Reduction is only valid when applied directly after Splitting the constraint. This generalized rule covers the popular rules: X-Wing, Swordfish, Jellyfish, Squirmbag, 6-Gronk, 7-Gronk, N-Fish

Grid rules:

see the paper Nonrepetitive Paths and Cycles in Graphs with Application to Sudoku [2]

Artificial Stupidity vs Artificial Intelligence: Backtracking method are not fun but useful

Well, can we finally resolve all the sudokus with these rules? According to [6] , the 3 rules doesn’t covered these rules: “Coloring, Remote Pairs, XY-Chains, Forcing Chains, Nishio, XY/XYZ-Wing, Aligned Pairs, Single Chains, Y-Wing ChainsNo”. Moreover, in the article [2], they randomly generate a set of 33302 Sudoku puzzles. They use their solver (a lot of local rules and a grid rule) to test the difficulty of these randomly generated puzzles. Among them, 1460 puzzles (4.4%) were unsolvable with their current rule set without backtracking.

What about the Dancing Links Algorithm and others methods

Although for standard Sudoku problems highly-optimized and sophisticated backtracking programs are the most efficient, another popular way of solving such constraint problems is Donald Knuth’s Dancing Links Algorithm [8] for solving the exact matrix cover problem, of which the Sudoku problems are a special case.