# OpenCV-Python

### OpenCV Python Tutorials

opencvpython.blogspot.com

## Histograms - 3 : 2D Histograms

Hi friends,

In the first article, we calculated and plotted a one-dimensional histogram. It is called one-dimensional because we take only one feature into consideration, i.e. the grayscale intensity value of the pixel. In a two-dimensional histogram, we consider two features. It is normally used for color histograms, where the two features are the Hue and Saturation values of every pixel.

There is a python sample in the official samples already for finding color histograms. We will try to understand how to create such a color histogram, and it will be useful in understanding further topics like Histogram Back-Projection.

2D Histogram in OpenCV

It is quite simple and calculated using the same function, cv2.calcHist(). For color histogram, we need to convert the image from BGR to HSV. (Remember, for 1D histogram, we converted from BGR to Grayscale). While calling calcHist(), parameters are :

channels = [0,1] # because we need to process both H and S plane.
bins = [180,256] # 180 for H plane and 256 for S plane
range = [0,180,0,256] # Hue value lies between 0 and 180 & Saturation lies between 0 and 256

```python
import cv2
import numpy as np

img = cv2.imread('home.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
```

That's it.

2D Histogram in Numpy

Numpy also provides a specific function for this : np.histogram2d(). (Remember, for 1D histogram we used np.histogram() ).

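```python
import cv2
import numpy as np

img = cv2.imread('home.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Take the H and S planes out of the HSV image
# (this step was missing, so 'h' and 's' were undefined)
h, s = hsv[:, :, 0], hsv[:, :, 1]

hist, xbins, ybins = np.histogram2d(h.ravel(), s.ravel(), [180, 256], [[0, 180], [0, 256]])
```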

First argument is H plane, second one is the S plane, third is number of bins for each and fourth is their range.
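As a quick check of the argument order, here is a minimal sketch that runs np.histogram2d() on synthetic data instead of a real image. The arrays `h` and `s` are made-up stand-ins for the H and S planes, so the counts themselves mean nothing; only the shapes and totals are of interest.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the H and S planes of a 100x120 image (assumed data)
h = rng.integers(0, 180, size=(100, 120))   # Hue values in 0..179
s = rng.integers(0, 256, size=(100, 120))   # Saturation values in 0..255

hist, xbins, ybins = np.histogram2d(h.ravel(), s.ravel(),
                                    [180, 256], [[0, 180], [0, 256]])

print(hist.shape)        # one row per Hue bin, one column per Saturation bin
print(int(hist.sum()))   # every pixel falls into exactly one bin
```

The returned `xbins` and `ybins` are the bin edges, so they are one element longer than the number of bins along each axis.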

Now we can check how to plot this color histogram

Plotting 2D Histogram

Method - 1 : Using cv2.imshow()
The result we get is a two dimensional array of size 180x256. So we can show them as we do normally, using cv2.imshow() function. It will be a grayscale image and it won't give much idea what colors are there, unless you know the Hue values of different colors.

Method - 2 : Using matplotlib
We can use the matplotlib.pyplot.imshow() function to plot the 2D histogram with different color maps. It gives a much better idea of the pixel density. But this also doesn't tell us at first glance what color is there, unless you know the Hue values of different colors. Still, I prefer this method. It is simple and better.

NB : While using this function, remember, interpolation flag should be 'nearest' for better results.

```python
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('home.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

plt.imshow(hist, interpolation='nearest')
plt.show()
```

Below is the input image and its color histogram plot. X axis shows S values and Y axis shows Hue.

Figure: 2D Histogram in matplotlib with 'heat' color map

In histogram, you can see some high values near H = 100 and S = 200. It corresponds to blue of sky. Similarly another peak can be seen near H = 25 and S = 100. It corresponds to yellow of the palace. You can verify it with any image editing tools like GIMP.
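If you would rather read the strongest peak from the array than from the plot, a small sketch like the one below can do it. The histogram here is synthetic, with two made-up peaks placed at the positions mentioned above, so treat it only as an illustration of the indexing.

```python
import numpy as np

# Synthetic 180x256 histogram with two made-up peaks (rows = Hue, columns = Saturation)
hist = np.zeros((180, 256), dtype=np.float32)
hist[100, 200] = 900.0   # stands in for the "blue sky" peak
hist[25, 100] = 400.0    # stands in for the "yellow palace" peak

# argmax() works on the flattened array; unravel_index() converts it back to (H, S)
h_peak, s_peak = np.unravel_index(hist.argmax(), hist.shape)
print(h_peak, s_peak)
```

The first index is the Hue bin and the second is the Saturation bin, matching the row and column order of the histogram array.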

Method 3 : OpenCV sample style !!
There is a sample code for color_histogram in OpenCV-Python2 samples. If you run the code, you can see the histogram shows even the corresponding color. Or simply it outputs a color coded histogram. Its result is very good (although you need to add extra bunch of lines).

In that code, the author created a color map in HSV. Then converted it into BGR. The resulting histogram image is multiplied with this color map. He also uses some preprocessing steps to remove small isolated pixels, resulting in a good histogram.

I leave it to the readers to run the code, analyze it and have your own hack arounds. Below is the output of that code for the same image as above:

Figure: OpenCV-Python sample color_histogram.py output

You can clearly see in the histogram what colors are present: blue is there, yellow is there, and some white due to the chessboard (it is part of that sample code) is there. Nice !!!
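To get a feel for the "color map multiplied by histogram" idea, here is a rough sketch. It is not the code of the OpenCV sample: the function names are my own, it uses Python's colorsys module instead of cv2.cvtColor(), it produces RGB rather than BGR, and it skips the preprocessing steps. The histogram is synthetic.

```python
import colorsys
import numpy as np

def hsv_color_map(h_bins=180, s_bins=256):
    """Return an (h_bins, s_bins, 3) array of RGB colors in [0, 1], with V fixed at 1."""
    cmap = np.zeros((h_bins, s_bins, 3), dtype=np.float64)
    for h in range(h_bins):
        for s in range(s_bins):
            cmap[h, s] = colorsys.hsv_to_rgb(h / h_bins, s / (s_bins - 1), 1.0)
    return cmap

def color_coded_histogram(hist):
    """Scale every cell of the color map by the normalised bin count."""
    peak = hist.max()
    weights = hist / peak if peak > 0 else hist
    return hsv_color_map(*hist.shape) * weights[:, :, np.newaxis]

# Synthetic histogram with a single peak at H = 100, S = 200
hist = np.zeros((180, 256))
hist[100, 200] = 50.0

vis = color_coded_histogram(hist)
print(vis.shape)
print(vis[100, 200])   # RGB color shown at the peak
```

Empty bins stay black, and a bin with many pixels is drawn in the color that its own Hue and Saturation describe, which is why the result can be read without knowing the Hue values.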

Summary :

So we have looked at what a 2D histogram is, the functions available in OpenCV and Numpy, how to plot it, etc.

So this is it for today !!!

Regards,
Abid Rahman K.

## K-Means Clustering - 2 : Working with Scipy

Hi,

In the previous article, 'K-Means Clustering - 1 : Basic Understanding', we saw what K-Means clustering is, how it works, etc. In this article, we will use the k-means functionality in Scipy for data clustering. OpenCV will be covered in another article.

Scipy's cluster module provides routines for clustering. The vq module in it provides k-means functionality. You will need Scipy version 0.11 to get this feature.

We also use Matplotlib to visualize the data.

Note : All the data arrays used in this article are stored in github repo for you to check. It would be nice to check it for a better understanding. It is optional. Or you can create your own data and check it.

So we start by importing all the necessary libraries.

```python
>>> import numpy as np
>>> from scipy.cluster import vq
>>> from matplotlib import pyplot as plt
```

Here I would like to show three examples.

1 - Data with only one feature :

Consider that you have a set of data with only one feature, i.e. one-dimensional data. For example, we can take our t-shirt problem, where you use only the height of people to decide the size of the t-shirt.

Or, from an image processing point of view, you have a grayscale image with pixel values ranging from 0 to 255. You need to group them into just two colors, maybe black and white only. (That is another version of thresholding. I don't think anyone will use k-means for thresholding, so just take this as a demo of k-means.)

So we start by creating data.

```python
>>> x = np.random.randint(25,100,25)
>>> y = np.random.randint(175,255,25)
>>> z = np.hstack((x,y))
>>> z = z.reshape((50,1))
```

So we have 'z', which is an array of size 50 with values ranging from 0 to 255. I have reshaped 'z' to a column vector. It is not necessary here, but it is good practice. I will explain the reason in the coming sections. Now we can plot this using Matplotlib's histogram plot.

`>>> plt.hist(z,256,[0,256]),plt.show()`

We get the following image:

Figure: Test Data

Now we use our k-means functions.

First function, vq.kmeans(), is used to cluster the data as per our requirements and it returns the centroids of the clusters. (Docs)

It takes our test data and the number of clusters we need as inputs. The other two inputs are optional and are not of big concern now.

```python
>>> centers,dist = vq.kmeans(z,2)
>>> centers
array([[207],
       [ 60]])
```

First output is 'centers', which are the centroids of clustered data. For our data, it is 60 and 207. Second output is the distortion between centroids and test data. We mark the centroids along with the inputs.

`>>> plt.hist(z,256,[0,256]),plt.hist(centers,32,[0,256]),plt.show()`

Below is the output we got. Those green bars are the centroids.

Figure: Green bars show centroids after clustering
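For intuition about what a k-means routine computes, here is a bare-bones sketch of the usual assign-and-update loop written with Numpy only. It is not Scipy's implementation: the function name `lloyd_kmeans`, the fixed data and the hand-picked starting centroids are all assumptions made so that the result can be checked by hand. The distortion is taken as the mean distance of each sample to its nearest centroid.

```python
import numpy as np

def lloyd_kmeans(data, init_centroids, n_iter=20):
    """Plain assign/update loop; returns (centroids, distortion)."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Distance from every sample to every centroid: shape (n_samples, k)
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = data[labels == j]
            if len(members):                  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break                             # centroids stopped moving
        centroids = new_centroids
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    distortion = dists.min(axis=1).mean()
    return centroids, distortion

# Two well separated, evenly spaced groups (fixed instead of random, for repeatability)
low = np.arange(25, 100, 3)      # 25 values: 25, 28, ..., 97
high = np.arange(175, 250, 3)    # 25 values: 175, 178, ..., 247
z = np.hstack((low, high)).reshape((50, 1))

centers, dist = lloyd_kmeans(z, init_centroids=z[[0, -1]])
print(centers.ravel(), dist)
```

With this data the two centroids settle at the means of the two groups, 61 and 211. Real data and random starting points will of course give less tidy numbers.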

Now we have found the centroids. From first article, you might have seen our next job is to label the data '0' and '1' according to distance to the centroids. We use vq.vq() function for this purpose.

vq.vq() takes our test data and the centroids as inputs and gives us the labelled data, called 'code', and the distance between each data point and its corresponding centroid.

`>>> code, distance = vq.vq(z,centers)`

If you compare the arrays 'code' and 'z' in the git repo, you can see that all values near the first centroid are labelled '0' and those near the second are labelled '1'.

Also check the distance array. For example, a value of 47 in 'z' is near 60, so it is labelled '1' in 'code'. The distance between them is 13, which is the matching entry in 'distance'. Similarly, you can check the other data.
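The labelling step itself is easy to reproduce by hand. The sketch below is a Numpy-only illustration of nearest-centroid assignment, not Scipy's code; the function name `assign_to_centroids` and the four sample values are made up, while the centroids 207 and 60 are the ones found above.

```python
import numpy as np

def assign_to_centroids(data, centroids):
    """Return (code, distance): index of and distance to the nearest centroid."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Pairwise Euclidean distances, shape (n_samples, n_centroids)
    diffs = data[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    code = dists.argmin(axis=1)
    distance = dists[np.arange(len(data)), code]
    return code, distance

z = np.array([[47], [200], [90], [250]])   # made-up samples
centers = np.array([[207], [60]])          # centroids from the example above

code, distance = assign_to_centroids(z, centers)
print(code)       # label of every sample
print(distance)   # distance of every sample to its own centroid
```

The value 47 gets label '1' and distance 13, the same as in the explanation above.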

Now that we have the labels of all the data, we can separate the data according to the labels.

```python
>>> a = z[code==0]
>>> b = z[code==1]
```

'a' corresponds to data with centroid = 207 and 'b' corresponds to remaining data. (Check git repo to see a&b).

Now we plot 'a' in red color, 'b' in blue color and 'centers' in yellow color as below:

```python
>>> plt.hist(a,256,[0,256],color = 'r')       # draw 'a' in red color
>>> plt.hist(b,256,[0,256],color = 'b')       # draw 'b' in blue color
>>> plt.hist(centers,32,[0,256],color = 'y')  # draw 'centers' in yellow color
>>> plt.show()
```

We get the output as follows, which is our clustered data:

Figure: Output of K-Means clustering

So, we have done a very simple and basic example of k-means clustering. In the next one, we will try with more than one feature.

2 - Data with more than one feature :

In previous example, we took only height for t-shirt problem. Here, we will take both height and weight, ie two features.

Remember, in the previous case, we made our data a single column vector. This is a good convention, normally followed by people from all fields: each feature is arranged in a column, while each row corresponds to an input sample.

For example, in this case, we set a test data of size 50x2, which are heights and weights of 50 people. First column corresponds to height of all the 50 people and second column corresponds to their weights. First row contains two elements where first one is the height of first person and second one his weight. Similarly remaining rows corresponds to heights and weights of other people. Check image below:
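A tiny sketch of this layout, with three made-up people instead of fifty, may make the row and column convention easier to see. The numbers are arbitrary.

```python
import numpy as np

heights = np.array([150, 160, 170])   # made-up values
weights = np.array([50, 60, 70])      # made-up values

# One row per person, one column per feature
z = np.column_stack((heights, weights))
print(z.shape)    # (samples, features)
print(z[0])       # first person: [height, weight]
print(z[:, 0])    # first column: heights of all people

# With a single feature, the same convention gives a column vector
single = heights.reshape((-1, 1))
print(single.shape)
```

This is the same reason the one-feature data in the first example was reshaped to 50x1.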

So now we can prepare the data.

```python
>>> x = np.random.randint(25,50,(25,2))
>>> y = np.random.randint(60,85,(25,2))
>>> z = np.vstack((x,y))
```

Now we have a 50x2 array. We plot it with 'Height' on the X-axis and 'Weight' on the Y-axis.

```python
>>> plt.scatter(z[:,0],z[:,1]),plt.xlabel('Height'),plt.ylabel('Weight')
>>> plt.show()
```

(Some data may seem ridiculous. Never mind it, it is just a demo.)

Figure: Test Data

Now we apply the k-means algorithm and label the data.

```python
>>> center,dist = vq.kmeans(z,2)
>>> code,distance = vq.vq(z,center)
```

This time, 'center' is a 2x2 array, first column corresponds to centroids of height, and second column corresponds to centroids of weight.(Check git repo data)

As usual, we extract the data with label '0' and mark it in blue, then the data with label '1' and mark it in red, mark the centroids in yellow, and check how it looks.

```python
>>> a = z[code==0]
>>> b = z[code==1]
>>> plt.scatter(a[:,0],a[:,1]),plt.xlabel('Height'),plt.ylabel('Weight')
>>> plt.scatter(b[:,0],b[:,1],c = 'r')
>>> plt.scatter(center[:,0],center[:,1],s = 80,c = 'y', marker = 's')
>>> plt.show()
```

This is the output we got:

Figure: Result of K-Means clustering

So this is how we apply k-means clustering with more than one feature.

Now we go for a simple application of k-means clustering, ie color quantization.

3 - Color Quantization :

Color Quantization is the process of reducing the number of colors in an image. One reason to do so is to reduce memory. Also, some devices can produce only a limited number of colors. In those cases too, color quantization is performed.

There are a lot of algorithms for color quantization. The Wikipedia page for color quantization gives a lot of details and references. Here we use k-means clustering for color quantization.

There is nothing new to be explained here. There are 3 features, say R, G, B. So we need to reshape the image to an array of size Mx3 (M is the number of pixels). After the clustering, we apply the centroid values (which are also R, G, B) to all pixels, so that the resulting image has the specified number of colors. Then we reshape it back to the shape of the original image. Below is the code:

```python
import cv2
import numpy as np
from scipy.cluster import vq

img = cv2.imread('home.jpg')
z = img.reshape((-1, 3))
z = z.astype(np.float64)        # cluster in floating point

k = 2                           # Number of clusters
center, dist = vq.kmeans(z, k)
code, distance = vq.vq(z, center)

# Replace every pixel by its centroid; convert back to uint8 so that
# cv2.imshow() displays the values as 0..255 intensities
res = np.uint8(center[code])
res2 = res.reshape(img.shape)

cv2.imshow('res2', res2)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

Change the value of 'k' to get a different number of colors. Below is the original image and the results I got for k = 2, 4, 8:

Figure: Color Quantization with K-Means clustering
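The reshape, look-up and reshape-back steps can be tried without an image file. In the sketch below, both the 2x2 "image" and the two centroids are made up by hand (no clustering is run), so it only illustrates how `center[code]` paints every pixel with its centroid color.

```python
import numpy as np

# Tiny synthetic 2x2 color image (made-up pixel values)
img = np.array([[[10, 10, 10], [250, 240, 245]],
                [[20, 0, 5], [200, 255, 230]]], dtype=np.uint8)

z = img.reshape((-1, 3)).astype(np.float64)       # M x 3, here M = 4 pixels

# Hypothetical centroids, as if returned by a k-means run with k = 2
center = np.array([[15, 5, 8], [225, 248, 238]], dtype=np.float64)

# Label every pixel with the index of its nearest centroid
dists = np.linalg.norm(z[:, None, :] - center[None, :, :], axis=2)
code = dists.argmin(axis=1)

res = np.uint8(center[code])       # every pixel replaced by its centroid color
res2 = res.reshape(img.shape)      # back to the original image shape

print(code)
print(res2[0, 0], res2[0, 1])
```

The output image has the same shape as the input but contains only as many distinct colors as there are centroids.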

So, that's it !!!

In this article, we have seen how to use the k-means algorithm with the help of Scipy functions. We also did 3 examples with a sufficient number of images and plots. There are two more functions related to it, but I will deal with them later.

In next article, we will deal with OpenCV k-means implementation.

I hope you enjoyed it...

Regards,

Abid Rahman K.

Ref :

1 - Scipy cluster module documentation

2 - Color Quantization

## Skeletonization using OpenCV-Python

I see people asking for a skeletonization algorithm very frequently. At first, I had no idea about it. But today, I saw a blog post which demonstrates a simple method to do this. The code was in C++, so I would like to convert it to Python here.

What is Skeletonization?

The answer is right there in the term. Simply put, it makes a thick blob very thin, maybe one pixel wide. Visit the Wikipedia page for more details : Topological Skeleton

Code :

```python
import cv2
import numpy as np

img = cv2.imread('sofsk.png', 0)
size = np.size(img)
skel = np.zeros(img.shape, np.uint8)

ret, img = cv2.threshold(img, 127, 255, 0)
element = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
done = False

while not done:
    eroded = cv2.erode(img, element)
    temp = cv2.dilate(eroded, element)
    temp = cv2.subtract(img, temp)
    skel = cv2.bitwise_or(skel, temp)
    img = eroded.copy()

    zeros = size - cv2.countNonZero(img)
    if zeros == size:
        done = True

cv2.imshow("skel", skel)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
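To see what the erode, dilate and subtract loop does on something small enough to check by hand, here is a Numpy-only sketch of the same idea on boolean arrays. The helper names are my own, the 3x3 cross is hard-coded, and the border is handled so that it never affects the result, which I assume matches the default behaviour of the OpenCV calls. It is an illustration, not a replacement for the code above.

```python
import numpy as np

def _neighbours(img, pad_value):
    """Stack the image with its up/down/left/right shifted copies (3x3 cross)."""
    p = np.pad(img, 1, mode='constant', constant_values=pad_value)
    return np.stack([p[1:-1, 1:-1],   # centre
                     p[:-2, 1:-1],    # up
                     p[2:, 1:-1],     # down
                     p[1:-1, :-2],    # left
                     p[1:-1, 2:]])    # right

def erode(img):
    return _neighbours(img, True).all(axis=0)

def dilate(img):
    return _neighbours(img, False).any(axis=0)

def skeletonize(img):
    img = img.astype(bool)
    skel = np.zeros_like(img)
    while img.any():
        eroded = erode(img)
        opened = dilate(eroded)
        skel |= img & ~opened     # pixels that the opening removes
        img = eroded
    return skel

# 5x9 rectangular blob on a 9x13 canvas
blob = np.zeros((9, 13), dtype=bool)
blob[2:7, 2:11] = True

skel = skeletonize(blob)
print(skel.astype(np.uint8))
```

On this blob the loop picks up the four corners of the 5x9 rectangle, then the four corners of the 3x7 rectangle left after one erosion, and finally the remaining 1x5 line, so 13 of the 45 pixels survive.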

Below is the result I got with the OpenCV code above:

References :

1) http://felix.abecassis.me/2011/09/opencv-morphological-skeleton/

## Drawing Histogram in OpenCV-Python

Hi Friends,

Do you want to draw a histogram for an image as below?

See the histogram for above image for RGB channels.

The code:

```python
import cv2
import numpy as np

img = cv2.imread('zzzyj.jpg')
h = np.zeros((300, 256, 3))

bins = np.arange(256).reshape(256, 1)
color = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

for ch, col in enumerate(color):
    # 256 bins per channel
    hist_item = cv2.calcHist([img], [ch], None, [256], [0, 256])
    cv2.normalize(hist_item, hist_item, 0, 255, cv2.NORM_MINMAX)
    hist = np.int32(np.around(hist_item))
    pts = np.column_stack((bins, hist))
    cv2.polylines(h, [pts], False, col)

h = np.flipud(h)

cv2.imshow('colorhist', h)
cv2.waitKey(0)
```
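The part of this code that usually needs a second look is how the histogram becomes a list of points for polylines(). The sketch below redoes that step with Numpy only, on a made-up six-pixel channel, with the min-max scaling written out by hand instead of calling cv2.normalize().

```python
import numpy as np

# Made-up single channel with just six pixels
channel = np.array([0, 0, 0, 5, 5, 255], dtype=np.uint8)

# 256-bin histogram of the channel
hist, _ = np.histogram(channel, 256, [0, 256])

# Min-max scaling to the range 0..255, written out by hand
hist = (hist - hist.min()) * 255.0 / (hist.max() - hist.min())
hist = np.int32(np.around(hist))

# One (x, y) point per bin: x = intensity, y = scaled count
bins = np.arange(256).reshape(256, 1)
pts = np.column_stack((bins, hist))

print(pts[0], pts[5], pts[255])
```

The tallest bin always ends up at 255, so the curve fits inside the 300-pixel-high canvas. The image is flipped afterwards because image rows count downwards, while a histogram is normally drawn with zero at the bottom.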

You can see the same code written using numpy histogram functions here : Drawing histogram in OpenCV-Python.

With Regards,
Abid Rahman K.

## Simple Digit Recognition OCR in OpenCV-Python

Hi Friends,

It has been a long time since I posted an article.

Now I present to you a simple digit recognition OCR using the kNearestNeighbour features in OpenCV-Python.

It demonstrates how to train on the data and recognize digits from previously trained data.

The code uses the new Python interface, cv2 (OpenCV v2.3+).

The code and explanation can be found here: Simple Digit Recognition OCR in OpenCV-Python

Test Image:

Result image:

Abid Rahman K.

## Contour features

For more details on contours, visit :

1) Contours - 1 : Getting Started

2) Contours - 2 : Brotherhood

```python
''' filename : contourfeatures.py

This sample calculates some useful parameters of a contour. This is an OpenCV
implementation of the regionprops function in Matlab, with some additional features.

Benefit : Learn to find different parameters of a contour region.
          Get familiar with different contour functions in OpenCV.

Level : Beginner or Intermediate

Usage : python contourfeatures.py <image_file>

Abid Rahman 3/25/12 '''

import cv2
import numpy as np

class Contour:
    ''' Provides detailed parameter information about a contour

        Create a Contour instance as follows: c = Contour(src_img, contour)
        where src_img should be a grayscale image.

        Attributes:

        c.area -- gives the area of the region
        c.perimeter -- gives the perimeter of the region
        c.moments -- gives all values of moments as a dict
        c.centroid -- gives the centroid of the region as a tuple (x,y)
        c.bounding_box -- gives the bounding box parameters as a tuple => (x,y,width,height)
        c.bx,c.by,c.bw,c.bh -- corresponds to (x,y,width,height) of the bounding box
        c.aspect_ratio -- aspect ratio is the ratio of width to height
        c.equi_diameter -- equivalent diameter of the circle with the same area as that of the region
        c.extent -- extent = contour area/bounding box area
        c.convex_hull -- gives the convex hull of the region
        c.convex_area -- gives the area of the convex hull
        c.solidity -- solidity = contour area / convex hull area
        c.center -- gives the center of the ellipse
        c.majoraxis_length -- gives the length of major axis
        c.minoraxis_length -- gives the length of minor axis
        c.orientation -- gives the orientation of ellipse
        c.eccentricity -- gives the eccentricity of ellipse
        c.filledImage -- returns the image where region is white and others are black
        c.filledArea -- finds the number of white pixels in filledImage
        c.convexImage -- returns the image where convex hull region is white and others are black
        c.pixelList -- array of indices of on-pixels in filledImage
        c.maxval -- corresponds to max intensity in the contour region
        c.maxloc -- location of the max intensity pixel
        c.minval -- corresponds to min intensity in the contour region
        c.minloc -- location of the min intensity pixel
        c.meanval -- finds mean intensity in the contour region
        c.leftmost -- leftmost point of the contour
        c.rightmost -- rightmost point of the contour
        c.topmost -- topmost point of the contour
        c.bottommost -- bottommost point of the contour
        c.distance_image((x,y)) -- return the distance of (x,y) from the contour.
        c.distance_image() -- return the distance image where distance to all points on image are calculated
    '''
    def __init__(self, img, cnt):
        self.img = img
        self.cnt = cnt
        self.size = len(cnt)

        # MAIN PARAMETERS

        # Contour.area - Area bounded by the contour region
        self.area = cv2.contourArea(self.cnt)

        # contour perimeter
        self.perimeter = cv2.arcLength(cnt, True)

        # centroid
        self.moments = cv2.moments(cnt)
        if self.moments['m00'] != 0.0:
            self.cx = self.moments['m10']/self.moments['m00']
            self.cy = self.moments['m01']/self.moments['m00']
            self.centroid = (self.cx, self.cy)
        else:
            self.centroid = "Region has zero area"

        # bounding box
        self.bounding_box = cv2.boundingRect(cnt)
        (self.bx, self.by, self.bw, self.bh) = self.bounding_box

        # aspect ratio
        self.aspect_ratio = self.bw/float(self.bh)

        # equivalent diameter
        self.equi_diameter = np.sqrt(4*self.area/np.pi)

        # extent = contour area/boundingrect area
        self.extent = self.area/(self.bw*self.bh)

        ### CONVEX HULL ###

        # convex hull
        self.convex_hull = cv2.convexHull(cnt)

        # convex hull area
        self.convex_area = cv2.contourArea(self.convex_hull)

        # solidity = contour area / convex hull area
        self.solidity = self.area/float(self.convex_area)

        ### ELLIPSE  ###

        self.ellipse = cv2.fitEllipse(cnt)

        # center, axis_length and orientation of ellipse
        (self.center, self.axes, self.orientation) = self.ellipse

        # length of MAJOR and minor axis
        self.majoraxis_length = max(self.axes)
        self.minoraxis_length = min(self.axes)

        # eccentricity = sqrt( 1 - (ma/MA)^2) --- ma= minor axis --- MA= major axis
        self.eccentricity = np.sqrt(1-(self.minoraxis_length/self.majoraxis_length)**2)

        ### CONTOUR APPROXIMATION ###

        self.approx = cv2.approxPolyDP(cnt, 0.02*self.perimeter, True)

        ### EXTRA IMAGES ###

        # filled image :- binary image with contour region white and others black
        self.filledImage = np.zeros(self.img.shape[0:2], np.uint8)
        cv2.drawContours(self.filledImage, [self.cnt], 0, 255, -1)

        # area of filled image
        self.filledArea = cv2.countNonZero(self.filledImage)

        # pixelList - array of indices of contour region
        self.pixelList = np.transpose(np.nonzero(self.filledImage))

        # convex image :- binary image with convex hull region white and others black
        self.convexImage = np.zeros(self.img.shape[0:2], np.uint8)
        cv2.drawContours(self.convexImage, [self.convex_hull], 0, 255, -1)

        ### PIXEL PARAMETERS

        # mean value, minvalue, maxvalue
        # (the calls were missing from this listing; the two lines below are an
        # assumed reconstruction that matches the attributes in the docstring)
        self.minval, self.maxval, self.minloc, self.maxloc = cv2.minMaxLoc(self.img, mask=self.filledImage)
        self.meanval = cv2.mean(self.img, mask=self.filledImage)

        ### EXTREME POINTS ###

        # Finds the leftmost, rightmost, topmost and bottommost points
        self.leftmost = tuple(self.cnt[self.cnt[:,:,0].argmin()][0])
        self.rightmost = tuple(self.cnt[self.cnt[:,:,0].argmax()][0])
        self.topmost = tuple(self.cnt[self.cnt[:,:,1].argmin()][0])
        self.bottommost = tuple(self.cnt[self.cnt[:,:,1].argmax()][0])
        self.extreme = (self.leftmost, self.rightmost, self.topmost, self.bottommost)

    ### DISTANCE CALCULATION

    def distance_image(self, point=None):

        '''find the distance between a point and adjacent point on contour specified. Point should be a tuple or list (x,y)
            If no point is given, distance to all point is calculated and distance image is returned'''
        if type(point) == tuple:
            if len(point) == 2:
                self.dist = cv2.pointPolygonTest(self.cnt, point, True)
                return self.dist
        else:
            dst = np.empty(self.img.shape)
            for i in range(self.img.shape[0]):
                for j in range(self.img.shape[1]):
                    dst[i, j] = cv2.pointPolygonTest(self.cnt, (j, i), True)

            dst = dst+127
            dst = np.uint8(np.clip(dst, 0, 255))

            # plotting using palette method in numpy
            palette = []
            for i in range(256):
                if i < 127:
                    palette.append([2*i, 0, 0])
                elif i == 127:
                    palette.append([255, 255, 255])
                elif i > 127:
                    l = i-128
                    palette.append([0, 0, 255-2*l])
            palette = np.array(palette, np.uint8)
            self.h2 = palette[dst]
            return self.h2

#### DEMO ######
if __name__ == '__main__':

    import sys
    if len(sys.argv) > 1:
        image = sys.argv[1]
    else:
        image = 'new.bmp'
        print("Usage : python contourfeatures.py <image_file>")

    # The image loading and thresholding lines were missing from this listing.
    # The two lines marked 'assumed' are stand-ins, and the threshold value
    # of 127 is a guess.
    im = cv2.imread(image)                               # assumed
    imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(imgray, 127, 255, 0)     # assumed
    # [-2:] keeps (contours, hierarchy) whether findContours returns 2 or 3 values
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    k = 1000
    for cnt in contours:

        # first shows the original image
        im2 = im.copy()
        c = Contour(imgray, cnt)
        print(c.leftmost, c.rightmost)
        cv2.putText(im2, 'original image', (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))
        cv2.imshow('image', im2)
        if cv2.waitKey(k) == 27:
            break

        im2 = im.copy()

        # Now shows original contours, approximated contours, convex hull
        cv2.drawContours(im2, [cnt], 0, (0,255,0), 4)
        string1 = 'green : original contour'
        cv2.putText(im2, string1, (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))
        cv2.imshow('image', im2)
        if cv2.waitKey(k) == 27:
            break

        approx = c.approx
        cv2.drawContours(im2, [approx], 0, (255,0,0), 2)
        string2 = 'blue : approximated contours'
        cv2.putText(im2, string2, (20,40), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))
        cv2.imshow('image', im2)
        if cv2.waitKey(k) == 27:
            break

        hull = c.convex_hull
        cv2.drawContours(im2, [hull], 0, (0,0,255), 2)
        string3 = 'red : convex hull'
        cv2.putText(im2, string3, (20,60), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))
        cv2.imshow('image', im2)
        if cv2.waitKey(k) == 27:
            break

        im2 = im.copy()

        # Now mark centroid and bounding box on image
        (cx, cy) = c.centroid
        cv2.circle(im2, (int(cx), int(cy)), 5, (0,255,0), -1)
        cv2.putText(im2, 'green : centroid', (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))

        (x, y, w, h) = c.bounding_box
        cv2.rectangle(im2, (x,y), (x+w,y+h), (0,0,255))
        cv2.putText(im2, 'red : bounding rectangle', (20,40), cv2.FONT_HERSHEY_PLAIN, 1.0, (0,255,0))

        (center, axis, angle) = c.ellipse
        cx, cy = int(center[0]), int(center[1])
        ax1, ax2 = int(axis[0]), int(axis[1])
        orientation = int(angle)
        cv2.ellipse(im2, (cx,cy), (ax1,ax2), orientation, 0, 360, (255,255,255), 3)
        cv2.putText(im2, 'white : fitting ellipse', (20,60), cv2.FONT_HERSHEY_PLAIN, 1.0, (255,255,255))

        cv2.circle(im2, c.leftmost, 5, (0,255,0), -1)
        cv2.circle(im2, c.rightmost, 5, (0,255,0))
        cv2.circle(im2, c.topmost, 5, (0,0,255), -1)
        cv2.circle(im2, c.bottommost, 5, (0,0,255))
        cv2.imshow('image', im2)
        if cv2.waitKey(k) == 27:
            break

        # Now shows the filled image, convex image, and distance image
        filledimage = c.filledImage
        cv2.putText(filledimage, 'filledImage', (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, 255)
        cv2.imshow('image', filledimage)
        if cv2.waitKey(k) == 27:
            break

        conveximage = c.convexImage
        cv2.putText(conveximage, 'convexImage', (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, 255)
        cv2.imshow('image', conveximage)
        if cv2.waitKey(k) == 27:
            break

        distance_image = c.distance_image()
        # draw the label before showing the image, so that it is visible
        cv2.putText(distance_image, 'distance_image', (20,20), cv2.FONT_HERSHEY_PLAIN, 1.0, (255,255,255))
        cv2.imshow('image', distance_image)
        if cv2.waitKey(k) == 27:
            break

    cv2.destroyAllWindows()
```
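The ratio-type parameters in the class are plain arithmetic once the areas and lengths are known, so they can be checked without OpenCV. Below is a small sketch that applies the same formulas to hand-picked numbers. The function name `shape_descriptors` and all the measurements are hypothetical; they roughly describe a right triangle with both legs 10 units long, and the ellipse axes are simply invented.

```python
import math

def shape_descriptors(area, bbox_w, bbox_h, hull_area, major_axis, minor_axis):
    """Apply the formulas used in the Contour class to already measured values."""
    return {
        'aspect_ratio': bbox_w / float(bbox_h),
        'extent': area / float(bbox_w * bbox_h),
        'solidity': area / float(hull_area),
        'equi_diameter': math.sqrt(4 * area / math.pi),
        'eccentricity': math.sqrt(1 - (minor_axis / float(major_axis)) ** 2),
    }

# Hypothetical measurements: triangle area 50, bounding box 10x10,
# convex hull equal to the triangle itself, invented ellipse axes 10 and 6
d = shape_descriptors(area=50.0, bbox_w=10, bbox_h=10,
                      hull_area=50.0, major_axis=10.0, minor_axis=6.0)

for name in sorted(d):
    print(name, round(d[name], 4))
```

The triangle fills half of its bounding box, so the extent is 0.5, and because it is already convex the solidity is 1. A shape with dents would keep the same hull but lose area, so its solidity would drop below 1.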
