OpenCV-Python | category: sudoku solver



OpenCV Python Tutorials

Sudoku Solver - Part 2


This is the continuation of the article : Sudoku Solver - Part 1

So we start implementing here.

Load the image :

Below is the image I used to work with.

Sudoku Solver - Part 2
Original  Image
So, first we import necessary libraries.

import cv2
import numpy as np

Then we load the image, and convert to grayscale.

img =  cv2.imread('sudoku.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

Image Pre-processing :

I have done just noise removal and thresholding. And it is working. So I haven't done anything extra.

gray = cv2.GaussianBlur(gray,(5,5),0)
thresh = cv2.adaptiveThreshold(gray,255,1,1,11,2)

Below is the result :

Sudoku Solver - Part 2
Result of adaptive thresholding
Now two questions may arise :

1) What is the need of smoothing here?
2) Why Adaptive Thresholding ? Why not normal Thresholding using cv2.threshold()  ? 

Find the answers here : Some Common Questions

Find Sudoku Square and Corners :

Now we find the sudoku border. For that, we are taking a practical assumption : The biggest square in the image should be Sudoku Square. In short, image should be taken close to Sudoku, as you can see in the input image of demo.

So a lot of things are clear from this : Image should have only one square, Sudoku Square, or not, Sudoku Square must be the biggest. If this condition is not true, method fails.

It is because, we find the sudoku square by finding the biggest blob ( an independant particle) in the image. So if biggest blob is something other than Sudoku, that blob is processed. So, I think you will keep an eye on it.

We start by finding contours in the thresholded image:

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Now we find the biggest blob, ie blob with max. area.

For this, first we find area of each blob. Then we filter them by area. We consider a blob for next processing only if its area is greater than a particular value (here, it is 100). If so, we approximate the contours. It removes unwanted coordinate values in the contour and keep only the corners. So if number of corners equal to four, that is a square (actually, a rectangle). If it has the maximum area among all detected squares, it is out Sudoku square.

biggest = None
max_area = 0
for i in contours:
area = cv2.contourArea(i)
if area > 100:
peri = cv2.arcLength(i,True)
approx = cv2.approxPolyDP(i,0.02*peri,True)
if area > max_area and len(approx)==4:
biggest = approx
max_area = area

For you to understand between original contour and approximated contour, I have drawn it on the image (using cv2.drawContours() function). Red line is the original contour, Green line is the approximated contour and corners marked in blue color circles.

Sudoku Solver - Part 2
Border and corners detected
Look at the top edge of sudoku. Original contour ( Red line) grazes on the edge of square and it is curved. Approximated contour ( Green line) just made it into a straight line.

Now, a simple question may arise. What is the benefit of filtering contours with respect to area? What is the need of removing them ? In simple words, it is done for speed up of the program. Although it may give you a little performance ( in the range of few milliseconds), even that will be good for those who want to implement it in real time. For more explanation, visit : Some Common Questions

Summary :

So, in this section, we have found the boundary of sudoku. Next part is the image transformation. I will explain it in next post.

Until then, I would like to know your feedback, doubts etc.

With Regards

Sudoku Solver - Some Common Questions


This is a post to answer some common questions that can arise while dealing with the Sudoku Solver.

Question 1 : What is the need of Smoothing?

Answer : You will understand its need if you see the result without applying Smoothing. Below is the result of Adaptive Threshold without Smoothing.

Sudoku Solver - Some Common Questions
Result of adaptive noise without smoothing
You can see the same result after applying a smoothing:

Sudoku Solver - Some Common Questions
After smoothing
Compare the results. There are lot of noises in the first case. So we have to remove them in the next step which is an extra task.

I just compared number of independent objects found (ie contours ) in both the cases. Below is the result:

First without smoothing:
>>> len(contours)

Next after smoothing:
>>> len(contours)

See the difference. Without smoothing, we are dealing with 7 times the number of objects than those found after smoothing. So which one is good?

To know different Smoothing Techniques : Smoothing Techniques in OpenCV

Question 2 : Why adaptive thresholding ? Why not normal thresholding ?

AnswerReason, You will understand when we compare the results of them. 

Below is the result, I got using Adaptive Threshold :

Sudoku Solver - Some Common Questions
Result of Adaptive Threshold
Now we apply normal thresholding for a value of 96 ( 96 is the auto threshold value generated by GIMP):

Sudoku Solver - Some Common Questions
Normal thresholding for value = 96
Now see the difference. It is because normal thresholding thresholds the image taken as a whole, while adaptive threshold thresholds the image taking an optimum value for a local neighbourhood. 

To know more about thresholding techniques :

Question 3 What is the benefit of filtering contours with respect to area? 

Answer : 1) To avoid small noises which has an area less than prescribed value and we are sure it can't be the square

2) It also improves the speed a little bit.

I will show you some performance comparisons below:

A)  We have already calculated number of objects (contours) found, which is 450. Without having any area filter, it process all the 450 contours. For that, you can just change the code as below:

for i in contours:
if area > min_size:
peri = cv2.arcLength(i,True)
approx = cv2.approxPolyDP(i,0.02*peri,True)
if area > max_area and len(approx)==4:
biggest = approx
max_area = area

It checks all the 450 contours for maximum area and it takes an average of 30 ms.

B)  Now we implement a filter for area of 100, as explained in the original code. Then it takes checks only 100 contours and takes only an average of 15 ms. So we get 2X performance.

C)  Now change the value from 100 to 1/4 of the image size. Check the code below:

min_size = thresh.size/4
for i in contours:
if area > min_size:
peri = cv2.arcLength(i,True)
approx = cv2.approxPolyDP(i,0.02*peri,True)
if area > max_area and len(approx)==4:
biggest = approx
max_area = area

Now it checks only one contour,our square, and takes only an average of 3 ms. ie, 10X performance.

Now, although time difference is only 27 ms, it will be highly useful if we implement it in real time.

So, it all depends on how you use it.
Sudoku Solver - Part 2Sudoku Solver - Some Common Questions

Report "OpenCV-Python"

Are you sure you want to report this post for ?