Activity #7 – Image Segmentation

Digital image processing is concerned not only with image enhancement and analysis, but also with morphological operations. These operations allow the removal of imperfections in binary images, such as those produced when simple thresholding was applied to the grayscale version of my fingerprint in Activity #6. Moreover, the conversion of grayscale images into binary images is known as Image Segmentation, a necessary step before morphological operations. In this activity, I am going to do image segmentation not only on grayscale images, but also on colored images. Some of the images here are resized to fit the blog, but you can click an image to see it in full size.

I. Grayscale Image Segmentation

In image processing, we are usually concerned with only some parts of an image. Image segmentation allows a region of interest (ROI) to be isolated from the whole image so that further processing can be done on it [1]. This is particularly easy to do with grayscale images, such as the given image below (sorry for the profanity).

Figure 1. Grayscale image of a check from [1].

To segment this image, one can simply threshold the grayscale image to turn it into a binary image, where the 0 and 1 values depend on the ROI. If we want to process the handwritten and printed parts of the image, which are all darker than the background, we can set a threshold that omits the higher intensities. A Scilab code for grayscale image segmentation is shown below.

Code for grayscale image segmentation [1].

In the code shown, the histogram of the image, which serves as a reference for the pixel values, is first calculated and plotted; it is shown below. The imhist() function takes two inputs: the first is the image, and the second is the number of bins. For a grayscale image, the pixel values can be any integer in the range 0–255, which is why the specified number of bins is 256.

Figure 2. Histogram of the grayscale image in Fig. 1.

The histogram relates each grayscale value to the number of pixels having that value in the image. The sharp peak signifies that most of the pixels have grayscale values in the range 190–200; these correspond to the paper of the check. To segment the handwritten and printed parts, I set the threshold to a gray value of 125. The resulting image is shown below.

Figure 3. Segmented image of the check in Fig. 1 for a threshold value of 125.

As you can see, almost all of the handwritten and printed elements in the check are highlighted. This is because the code picks out all pixels with gray values less than 125 and sets them to 1. If I reverse the condition (I>125), the result is the inverted version of Fig. 3, shown below.

Figure 4. Inverted segmented image of Fig. 3, in which I>125 from the code given.
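The thresholding described above can be sketched in Python with NumPy (the original Scilab code is embedded only as an image, so the function names and the toy "check" array here are illustrative, not the activity's actual code):

```python
import numpy as np

def segment_grayscale(img, threshold=125):
    """Binarize a grayscale image: pixels darker than the threshold
    (the ink) become 1, the brighter paper becomes 0."""
    return (img < threshold).astype(np.uint8)

# Toy 8-bit "check": bright paper (value 195) with two dark ink pixels (value 40)
check = np.full((4, 4), 195, dtype=np.uint8)
check[1, 1:3] = 40

# Analogue of imhist(I, 256): one bin per possible gray value 0-255
hist, _ = np.histogram(check, bins=256, range=(0, 256))

binary = segment_grayscale(check, threshold=125)
```

Reversing the comparison (`img > threshold`) gives the inverted segmentation, exactly as with the I>125 condition above.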

One can then specify the thresholding condition and values in order to pick out the ROI. Below are the resulting images when the threshold value in the code shown was set to 50, 100, and 175.

Figure 5. Segmentation image of the check when threshold value is set to 50 (I<50).
Figure 6. Segmentation image of the check when threshold value is set to 100 (I<100).
Figure 7. Segmentation image of the check when threshold value is set to 175 (I<175).

The differences are observable in the amount of imperfections at each threshold. One can properly set the condition by referring to the histogram in Fig. 2; in this case, the value that best highlights all the handwritten and printed text is 125.

II. Colored Image Segmentation

Not all images can be segmented by converting them into grayscale images and setting up threshold values, especially when the image is colored. Take, for example, the image shown below, together with its grayscale equivalent.

Figure 8. a) Colored image of a rusty box from [1], together with its b) grayscale equivalent.

As can be observed, the rusty box in Fig. 8b has grayscale pixel values similar to those of the background elements. Thus, grayscale image segmentation is useless here. To remedy the situation, we resort to colored image segmentation.

In reality, objects with a solid, monochromatic color show shading variations depending on the orientation of the observer, the intensity of the light source, the geometry of the object, etc. These variations can be taken into account in image segmentation by representing the image in a color space that separates brightness from chromaticity (pure color) information. We now utilize the Normalized Chromaticity Coordinates (NCC) [1].

Basically, colored images have red, green and blue channels, whose matrices are designated R, G and B. The NCC is then given by

r = R / (R + G + B),   g = G / (R + G + B),   b = B / (R + G + B)

where the sum of r, g and b is equal to 1, and the coordinates can take values from 0 to 1 [1]. Moreover, b can be expressed in terms of r and g as b = 1 − (r + g). This means that the NCC can be represented in the two-dimensional space of r and g, and the values of r and g determine the color of the pixel. This 2D space is shown below.

Figure 9. Normalized Chromaticity Coordinate space for coordinate axes r and g. Image retrieved from: https://upload.wikimedia.org/wikipedia/en/7/7b/Rg_normalized_color_coordinates.png
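The RGB-to-NCC conversion above can be sketched in Python/NumPy (the function name `to_ncc` is my own, not from the activity's Scilab code):

```python
import numpy as np

def to_ncc(rgb):
    """Map an H x W x 3 RGB image to normalized chromaticity coordinates
    r = R/(R+G+B) and g = G/(R+G+B); b = 1 - r - g is implied."""
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0  # guard against division by zero on black pixels
    return rgb[:, :, 0] / total, rgb[:, :, 1] / total

# A pure red pixel lands at (r, g) = (1, 0); a gray pixel at (1/3, 1/3)
img = np.array([[[255, 0, 0], [128, 128, 128]]], dtype=np.uint8)
r, g = to_ncc(img)
```

Note how the gray pixel maps to the white center (1/3, 1/3) of the NCC space: brightness is divided out, leaving only chromaticity.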

The primary colors are obtained when exactly one of these coordinates equals 1. Any color can then be represented in this color space, which we will use for colored image segmentation.

In this activity, we are concerned not only with image segmentation itself, but also with a comparison of the techniques involved. There are two main techniques in colored image segmentation: Parametric and Non-Parametric Probability Distribution Estimation.

A. Parametric Probability Distribution Estimation

This first technique involves parameters associated with probability distributions, namely the mean and the standard deviation. Basically, we can segment colored images by determining the probability that a pixel belongs to the color distribution of interest. Similar to grayscale image segmentation, one first picks the ROI, then obtains its probability distribution function by normalizing its image histogram, which is done for each NCC channel. Then, one must determine whether a pixel from the image belongs to the probability distribution of the ROI. The probability that a pixel with chromaticity r belongs to the ROI can be assumed to be Gaussian, and is given by

p(r) = 1 / (σr √(2π)) · exp[ −(r − µr)² / 2σr² ]

where µr and σr are the mean and standard deviation of the ROI's values in chromaticity space r [1]. The same holds for g and b. Since we are only concerned with the 2D NCC space, the membership probability can be approximated by the joint probability, given by the product of p(r) and p(g) [1].
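This joint-Gaussian membership test can be sketched as follows in Python/NumPy (the ROI chromaticities below are made up purely for illustration):

```python
import numpy as np

def parametric_segment(r, g, roi_r, roi_g):
    """Gaussian membership: fit the mean and standard deviation of the
    ROI's r and g chromaticities, then score each pixel by the joint
    probability p(r) * p(g)."""
    mu_r, sd_r = roi_r.mean(), roi_r.std()
    mu_g, sd_g = roi_g.mean(), roi_g.std()
    p_r = np.exp(-(r - mu_r) ** 2 / (2 * sd_r ** 2)) / (sd_r * np.sqrt(2 * np.pi))
    p_g = np.exp(-(g - mu_g) ** 2 / (2 * sd_g ** 2)) / (sd_g * np.sqrt(2 * np.pi))
    return p_r * p_g

# Hypothetical pinkish ROI chromaticities with a little spread
roi_r = np.array([0.48, 0.50, 0.52])
roi_g = np.array([0.23, 0.25, 0.27])
# Score two pixels: one near the ROI color, one far from it
score = parametric_segment(np.array([0.50, 0.90]),
                           np.array([0.25, 0.05]), roi_r, roi_g)
```

The pixel whose (r, g) lies near the ROI mean gets a high joint probability; the far pixel scores essentially zero, so thresholding the score map yields the segmentation.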

Let’s try this technique for a pattern shown below.

Figure 10. Image of a pattern obtained from http://www.desktopaper.com/wp-content/uploads/cool-psychedlic-pattern-wallpaper-by-grebenru.jpg

For my first ROI, I wanted to segment all the pink colors outlining each teardrop-like pattern. The image of the ROI and the Scilab code used are shown below.

Figure 11. Image of the ROI used for parametric image segmentation. (Image here is enlarged for a better view.)
Code for Parametric Colored Image Segmentation.

The necessary parameters were calculated and used for the joint probability distribution. The resulting image is shown below.

Figure 12. Segmented image of Fig. 10 for the ROI in Fig. 11 using parametric image segmentation.

What can I say? The technique works like a charm. The outlines of the teardrop patterns were highlighted, together with the small circles having the same color as the ROI. Since the teardrop patterns are almost identical, high correlations were observed for every segmented element.

Now, let’s move on to another technique.

B. Non-Parametric Probability Distribution Estimation: Histogram Backprojection

We now move on to non-parametric image segmentation. As the name implies, this technique does not use the probability distribution parameters of an ROI. One non-parametric technique is called Histogram Backprojection. According to OpenCV, backprojection is a way of recording how well the pixels of a given image fit the distribution of pixels in a histogram model of an ROI. Basically, we just replace every pixel in a given image by its probability of occurring in the ROI. This removes the need to calculate means and standard deviations, and drops the assumption that the probability is Gaussian. We are merely looking up histogram values in NCC space.

To implement this technique, one must be able to produce a 2D histogram of an image. Our professor, Ma'am Jing, provided us a Scilab code that creates a 2D histogram. For the image in Fig. 10 and the ROI in Fig. 11, the Scilab code is shown below.

Code for Non-parametric Colored Image Segmentation.

Let me first discuss the given code. It is divided into two parts: the calculation of the 2D histogram of the ROI, and the backprojection. As before, the ROI image was mapped into NCC space. The number of bins, which divides the entire range of values into intervals, was set to 32 (indices 0 to 31). The resulting 2D histogram is a 32×32 image, in which the grayscale intensity gives the probability in NCC space, with r and g along the x and y axes, respectively. For the ROI in Fig. 11, the 2D histogram is shown below, together with the NCC space diagram.

Figure 13. Image of a) the 2D histogram (rotated by 90 degrees counter-clockwise to conform with r and g dimensions) and b) the NCC space.
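The 2D-histogram step can be sketched in Python/NumPy (32 bins as in the provided Scilab code; the exact binning arithmetic here is my own assumption):

```python
import numpy as np

BINS = 32  # same bin count as the Scilab code described above

def ncc_histogram_2d(roi_r, roi_g, bins=BINS):
    """Bin the ROI's (r, g) chromaticities into a bins x bins histogram,
    normalized so its entries can be read as probabilities."""
    ri = np.clip((roi_r * (bins - 1)).round().astype(int), 0, bins - 1)
    gi = np.clip((roi_g * (bins - 1)).round().astype(int), 0, bins - 1)
    hist = np.zeros((bins, bins))
    np.add.at(hist, (ri, gi), 1)  # accumulate counts, handling repeated bins
    return hist / hist.sum()

# Two sample chromaticities land in opposite corners of the 32 x 32 histogram
hist = ncc_histogram_2d(np.array([0.0, 1.0]), np.array([0.0, 1.0]))
```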

If one properly maps the 2D histogram onto the NCC space, one can verify that the high intensities lie near the pinkish region of the graph. In the next part of the code, the original image in Fig. 10 is mapped into NCC space, as in the parametric technique. Only this time, we build a projection array, in which each pixel is given a value based on its value in the histogram of the ROI. The result is shown below.

Figure 14. Segmented image of Fig. 10 for the ROI in Fig. 11 using non-parametric image segmentation.
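The backprojection step can be sketched in Python/NumPy, mirroring the double for loop of the provided Scilab code (the one-bin ROI histogram below is hypothetical, chosen so the expected lookup is obvious):

```python
import numpy as np

def backproject_loop(r, g, roi_hist, bins=32):
    """Backprojection with an explicit double loop, as in the Scilab code:
    each output pixel is the ROI histogram value at that pixel's (r, g) bin."""
    h, w = r.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            ri = min(int(round(r[i, j] * (bins - 1))), bins - 1)
            gi = min(int(round(g[i, j] * (bins - 1))), bins - 1)
            out[i, j] = roi_hist[ri, gi]
    return out

# Hypothetical ROI histogram concentrated in a single bin
roi_hist = np.zeros((32, 32))
roi_hist[16, 8] = 1.0
# One pixel falls in that bin (high membership), the other does not
r = np.array([[16 / 31, 0.0]])
g = np.array([[8 / 31, 0.0]])
proj = backproject_loop(r, g, roi_hist)
```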

I obtained almost similar results to the parametric approach. To provide a better comparison, I placed the figures side by side, as shown in Fig. 15.

Figure 15. Images of the segmented image of Fig. 10 for an ROI in Fig. 11 using a) Parametric and b) Non-parametric image segmentation.

So, what's the difference? Based on what I can see, the parametric technique gave a much smoother and cleaner segmentation than the non-parametric technique. Here is the mechanism behind each. The parametric technique uses a Gaussian probability distribution, and the joint probability determines pixel membership. Gaussian bell curves have significant width, so a Gaussian fit admits more pixels into the segmentation at significant intensity. The histogram backprojection technique, on the other hand, relates pixel membership directly to the histogram of the ROI, so the segmentation will only be as good as the ROI extracted from the original image. One can thus improve the non-parametric result by providing a very accurate ROI. Remember that the ROI here is riddled with imperfections, such as darker shades of pink, which directly affect the backprojection. For me, if we want to maximize color extraction, the parametric technique is the one to use; but if we are concerned with the exact color pattern of the ROI and want to accurately obtain the regions with the same hues, we need the non-parametric technique.

In addition to the previous results, the method used also affects the total running time of the segmentation. Theoretically, the parametric technique should be slower, since it requires calculating not only the mean and standard deviation in NCC space but also the probability distribution for each pixel's membership, while backprojection only looks up a histogram value for each pixel in the original image. However, the backprojection code took a few seconds longer than the parametric approach, simply because its double for loop scales with the size of the original image, while Scilab is optimized for element-per-element matrix operations, which the parametric approach exploits.
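For comparison, the per-pixel histogram lookup can also be written as a single vectorized indexing operation, which is exactly the kind of whole-matrix operation that array languages like Scilab (and NumPy) execute quickly; the function name and one-bin test histogram here are illustrative:

```python
import numpy as np

def backproject_vectorized(r, g, roi_hist, bins=32):
    """The same histogram lookup as the double for loop, done in one
    fancy-indexing operation over the whole (r, g) chromaticity arrays."""
    ri = np.clip((r * (bins - 1)).round().astype(int), 0, bins - 1)
    gi = np.clip((g * (bins - 1)).round().astype(int), 0, bins - 1)
    return roi_hist[ri, gi]

# Hypothetical ROI histogram with a single occupied bin
roi_hist = np.zeros((32, 32))
roi_hist[16, 8] = 1.0
proj = backproject_vectorized(np.array([[16 / 31, 0.0]]),
                              np.array([[8 / 31, 0.0]]), roi_hist)
```

Rewriting the loop this way would likely close the timing gap noted above, since the lookup then runs at the same element-per-element speed as the parametric code.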

Since I have so much time left in this activity, it is high time to test the methods on different photos and ROI's. Using the same pattern in Fig. 10, I extracted a different ROI: the pattern in the middle of the teardrops, which has both yellow and green hues. The ROI is shown below.

Figure 16. Different ROI for the pattern in Fig. 10. (Photo is enlarged.)

And here are the results.

Figure 17. Images for the segmentation of Fig. 10 using Fig. 16 as ROI through a) parametric and b) non-parametric approach.

In this case, even though the ROI has a larger area of green hue, the segmentations highlighted not only the center pattern but also the whole teardrop, since the ROI also contains yellow hues. Even though the ROI is not highly monochromatic, the method still accepts pixels that don't share the ROI's pattern but do share colors found in it. Comparing the two methods: as with the pink ROI segmentation, the parametric method showed a smoother and brighter segmentation than the non-parametric method. Again, for the non-parametric approach, the segmentation will only be as good as the ROI image. I guess the Gaussian fit really does allow a lot more pixels into the segmentation.

Now, not all images are lit with the same brightness. Real-life photos, such as those of human skin and landscapes, are affected by the intensity and color of the light source, the orientation of the camera, etc. I have here a photo of my three minions, with shadows on different parts of their bodies, together with their segmented images.

Figure 18. Images of a) the three minions, together with the ROI (bottom right corner) extracted from the left minion's forehead and their segmented images using b) parametric and c) non-parametric approach.

Yet again, the parametric approach showed brighter intensities in the segmented regions. The shadows across their bodies affected the segmentations, which is most apparent on the body of the left minion.

How about we try segmenting a single body and color, only with different brightness levels throughout? Let's try segmenting a photo of a shiny Lamborghini from weknowyourdreams.com. (Yes, you know my dream.) We start with an ROI that has a smaller color range.

Figure 19. Images of a) an orange car, together with the ROI (bottom right corner) and its segmented images using b) parametric and c) non-parametric approach.

In this case, the non-parametric method captured more pixels than the parametric method. Moreover, the ROI has a low color range, meaning the orange color had almost a single brightness, so the resulting segmentations did not capture the whole car. To remedy this, I extracted a larger ROI than the one previously used. The images are shown below in Figs. 20 and 21.

Figure 20. Images of a) an orange car, together with a larger ROI (bottom right corner) than in Fig. 19 and its segmented images using b) parametric and c) non-parametric approach.
Figure 21. Images of a) an orange car, together with an ROI (bottom right corner) that has the widest color range among the ROI's so far and its segmented images using b) parametric and c) non-parametric approach.

In Fig. 20, we can see the difference between the parametric and non-parametric approaches: the parametric approach gave a "cleaner" segmentation. In Fig. 21, the difference lies in the covered pixels; the non-parametric approach covered more car parts than the parametric method. Again, the backprojection is only as good as the extracted ROI. To show that the ROI's do have varying color ranges, I present their 2D histograms below; using the NCC space, one can confirm that they are indeed in the red-orange region. One must also be careful in extracting ROI's, since an ROI containing a specular (white) spot from the car will also segment the background of the image.

Figure 22. 2D histograms for the car ROI's in Figs. a) 19a, b) 20a, and c) 21a.

Before I end this blog post, let me add one more thing: segmenting my own face. Human skin can be troublesome to segment because of its color variety and brightness gradients. Here is a selfie of mine.

Figure 23. My own selfie.

I will now try to segment my face using three particular regions: my nose, right cheek and forehead, to see which facial feature segments my face in this photo the best. Here are the said ROI's and their 2D histograms.

Figure 24. ROI's (top) and their 2D histograms (bottom), extracted from a) the nose, b) the right cheek and c) the forehead.

The nose ROI encompasses a lot of space in the histogram, which is affected not only by the nostrils but also by the specular spots. In our AP 187 class with Ma'am Jing, we discussed different types of color locus, that is, a set of characteristic points for a color. One type is the skin locus, which consists of the color pigments of various skin colors. The 2D histograms shown in Fig. 24 lie within the range of the presented locus, although the nose ROI encompasses a lot of points. Below are the results of the segmentation.

Figure 25. a) Parametric and b) Non-parametric segmentation of my face using the nose ROI.
Figure 26. a) Parametric and b) Non-parametric segmentation of my face using the right cheek ROI.
Figure 27. a) Parametric and b) Non-parametric segmentation of my face using the forehead ROI.

One criterion for determining the best segmentation is how well it differentiates facial features like the mouth, teeth, and eyes from the facial skin. Referring to Fig. 24a, the nose ROI encompassed a greater space in the histogram, which resulted in more facial skin being segmented in Fig. 25 but less differentiation between the facial features. Moreover, the parametric segmentation in Fig. 25a included stray pixels, again because its color range includes the white center of the NCC space. Both the cheek and forehead segmentations provided good results, owing to the lower color range in their 2D histograms. For me, Figure 26a had the best facial segmentation, since the other segmentations produced imperfections in the skin segments. For the best possible results, the extracted ROI must include only the segments necessary for skin detection, and the image to be segmented should have nearly uniform brightness throughout.

Image segmentation is indeed useful for extracting parts of an image. I may not fully appreciate it right now, since I only applied it to random images. From what I have read on the internet, image segmentation allows for easier image manipulation and analysis. One can segment an MRI image to differentiate brain tumor tissue from healthy cells. Weather forecasting applies segmentation to satellite images, in which the segmented regions locate areas of high and low atmospheric activity. Segmentation is also a vital element in topography. There are other methods of image segmentation, such as the Level Set method by Osher and Sethian and the Fast Marching method (from Wikipedia).

Judging from the way I handled the activity, all images seem to be intact and analyzed properly. And yes, I had fun segmenting my own face, even though some of the results were kinda creepy. The coding part was fairly straightforward, since we were provided a very helpful code by our dearest professor, Ma'am Jing. I was also able to peer into the advantages and disadvantages of the parametric and non-parametric approaches. I would give myself a 12/10 for face value. I would like to thank Shia LaBeouf for this highly motivational GIF.

“Don’t let your dreams be dreams.”

References:

[1] Soriano, M., “Image Segmentation,” 2014.
