Activity #5 – Fourier Transform Model of Image Formation

Digital image manipulation and image processing would never be complete without the famous Fourier Transform. Developed by Joseph Fourier (1768-1830), the Fourier Transform (FT) has not only led to advances in mathematics, such as solving differential equations, but has also been used in optics, sound and acoustics, signal processing (extraction of signal frequencies), and, in this case, image processing. In image processing, the FT is used for image analysis, compression, filtering, cleaning, and reconstruction.

The Fourier Transform basically transforms a signal with physical dimension X into the dimension 1/X. Thus, the FT converts a time-dependent signal into frequency space, and a spatial signal into spatial frequency space. Mathematically, the FT of an image (a 2-D signal) is given by the equation [1]

F(f_x, f_y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{-i 2\pi (f_x x + f_y y)}\, dx\, dy

where f_x and f_y are the spatial frequencies of the function f with respect to x and y. Basically, when one uses the FT, one is manipulating the function in its frequency domain.

Since image processing is digital work, we need an FT implementation that works on discrete signal values. Luckily, the good partners Cooley and Tukey developed one: the Fast Fourier Transform (FFT) algorithm. The algorithm is fast, powerful, and efficient at handling discrete signals with many data points. It has a few notable properties: the FFT is fastest when the sample size is a power of 2 (an inherent property of the algorithm), the FFT of an image has its quadrants diagonally swapped, and the inverse FFT comes out inverted [1].

In this activity, we will observe the FFT of some synthetic images or apertures. In optics, taking the FT of an aperture image is analogous to passing light through a lens system. Figure 1 presents a simple lens system equivalent to the FT of an aperture. By geometric optics, rays coming from a point source located at the focus of a lens emerge parallel to the optical axis, while rays entering the lens parallel to the optical axis emerge toward the focus of the lens [2]. The image produced is also inverted. Thus, the lens serves as a "Fourier Transformer" of the actual image of an aperture. This will be discussed and verified in this activity.

Figure 1. Lens system equivalent to the FT of an aperture. Image from: An Intuitive Explanation of Fourier Theory by S. Lehar [2].

With that in mind, let’s move on with the exploration of FFT using Scilab.

I. Familiarization with Discrete FFT: Lens as a Fourier Transformer

It is good to start with simple shapes and apertures as our 2-D signal. Take for example the circular aperture shown in Figure 2, which is generated using Scilab. If you want a tutorial on different synthetic images, visit Activity #3 (Scilab Basics). Take note that all images used in this activity are 128×128 pixels in size, since the FFT works fastest with sample sizes that are powers of 2.

A. Circular Aperture

Figure 2. Circular aperture of radius r = 0.3 (image size: 128×128).
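For reference, here is a minimal sketch of how such an aperture can be generated, in the spirit of Activity #3 (the grid range is an assumption; the radius follows the figure):

    x = linspace(-1, 1, 128);      // 128 samples, a power of 2
    [X, Y] = ndgrid(x, x);         // 2D coordinate grids
    r = sqrt(X.^2 + Y.^2);         // radial distance from the center
    circle = zeros(128, 128);
    circle(find(r < 0.3)) = 1;     // white disk of radius 0.3 on a black background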

We apply the FFT to this image. The Scilab code is shown below:

FFT code for the circular aperture.

Let me run through the code. The image is first converted to grayscale, then fft2() is applied. Since the FFT returns complex values, their magnitude is obtained using the abs() function. It is necessary to shift the values using fftshift() to correct the quadrant locations. The inverse FFT is then obtained by applying the same 2D FFT function a second time, and the images are analyzed. The real and imaginary parts of the complex-valued signal were also obtained. Here are the results of the code:
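For concreteness, a minimal sketch of that sequence (assuming the SIVP toolbox provides imread, rgb2gray, and imshow; the file name is hypothetical):

    img = double(rgb2gray(imread("circle.png")));  // load the aperture in grayscale
    FIg = fft2(img);                               // 2D FFT, complex-valued
    A = abs(fftshift(FIg));                        // modulus with quadrants corrected
    imshow(A / max(A));                            // FT modulus (Fig. 3a)
    B = abs(fft2(FIg));                            // fft2 applied twice: inverse FT, inverted
    imshow(B / max(B));                            // recovered aperture (Fig. 3b)
    imshow(real(fftshift(FIg)) / max(A));          // real part (negatives display as black)
    imshow(imag(fftshift(FIg)) / max(A));          // imaginary part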

Figure 3. a) FT of the circular aperture and b) its inverse FT.

As you can see, the FT of a circle is a set of concentric rings of varying intensity, with the highest intensity at the center. This is often referred to as the Airy pattern. The alternating dark and bright rings represent destructive and constructive interference in the equivalent lens system. The center is brightest since most of the light from the aperture converges to the focus, or in this case, the center. The inverse FT is the same circular aperture itself, although the inversion of the image is not apparent since the image is symmetric along the x and y axes. Next are the real and imaginary parts of the FT.

Figure 4. a) Real and b) Imaginary parts of the FT of a circular aperture.

The images here are a bit hard to decipher. Nevertheless, the important point is that the FT of a function is complex-valued, and both the real and imaginary parts contribute to it. Looking closely at the patterns, the real part is symmetric under reflection in the x and y axes, while the imaginary part is symmetric through the origin.
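This symmetry is no accident. For a real-valued image f(x, y), the FT is Hermitian,

F(-f_x, -f_y) = F^{*}(f_x, f_y),

so the real part is always even and the imaginary part always odd; only the detailed pattern changes from shape to shape.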

B. The Letter "A"

We now try the FT of a somewhat irregular image. Here it is the letter "A," which is also treated as an aperture. The same Scilab code as for the circular aperture is used.

Figure 5. Image of the letter "A" as an aperture.

Below are the FT results of the given figure:

Figure 6. a) FT of the letter "A" and b) its inverse FT. c) Real and d) Imaginary parts of the FT.

We now observe a distinct pattern in the FT of the letter "A." Symmetry with respect to the axes is observed, and an Airy-like pattern of alternating dark and bright fringes also appears. The symmetry is due to the shape of the image. Remember that in the FT we are viewing the image in spatial frequency, the 1/X dimension; thus the legs of the letter "A" spread horizontally along the x-axis. In Fig. 6b, we now observe the inversion of the image. The real and imaginary parts are likewise present and have definite grayscale values.

Now, we generate the FT of some aperture patterns we see in experimental optics.

C. Sinusoid along the x-axis (Corrugated Roof)

Presented here are the sinusoid image along the x-axis and the FT results.

Figure 7. Image of the corrugated roof aperture.
Figure 8. a) FT of the corrugated roof and b) its inverse FT. c) Real and d) Imaginary parts of the FT.

We observe dots along the middle of the FT image. All rays parallel to the optical axis converge toward the focus, and an Airy-like pattern appears in the dots displaced from the central dot, due to constructive and destructive interference. The inverse FT is inverted. In the real and imaginary images, there is a difference in contrast between the dots and the background, indicating that the central dot resides in the real part of the FT. More importantly, the FT of a sinusoid marks the locations of its spatial frequencies.

Before anything else, there is something to note about the images produced. In Activity #3, we plotted the corrugated roof over the range of values -1 to 1. However, image values are supposed to lie between 0 and 1, which means that in the corrugated roof image the negative values are clipped. We therefore have a different periodic signal in the displayed image, which adds a bias (the center dot) and extra dots in the FT. I will discuss this further along with the anamorphism and rotation properties of the FT in Activity #6.
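A minimal sketch of the effect (the frequency of 4 cycles across the image is an assumed value):

    x = linspace(-1, 1, 128);
    [X, Y] = ndgrid(x, x);
    z = sin(2 * %pi * 4 * X);    // sinusoid along one axis, values in [-1, 1]
    imshow(max(z, 0));           // clipped display: negatives go to 0, adding a bias
    imshow((z + 1) / 2);         // rescaled to [0, 1]: one sinusoid plus a constant offset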

D. Simulated Double Slit

We use a double slit similar to that in Young's Double Slit experiment.
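A minimal sketch of how such a slit pair can be synthesized (slit widths and spacing are assumed values):

    slits = zeros(128, 128);
    slits(:, 61:63) = 1;    // left slit, 3 pixels wide
    slits(:, 66:68) = 1;    // right slit, 3 pixels wide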

Figure 9. Simulated double slit aperture.
Figure 10. a) FT of the simulated double slit and b) its inverse FT. c) Real and d) Imaginary parts of the FT.

We observe the interference pattern seen experimentally in Young's Double Slit experiment. The Airy-like pattern appears not only in the horizontal intensities, but also in the regions above and below the bright fringes. Again, this is due to constructive and destructive interference at the aperture. This property will be discussed further in Activity #6.

E. Square Function (Centered Square Aperture)

A square function similar to that of a centered square aperture is used.
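A minimal sketch of the aperture (the half-width of 0.2 is an assumed value):

    x = linspace(-1, 1, 128);
    [X, Y] = ndgrid(x, x);
    sq = zeros(128, 128);
    sq(find(abs(X) < 0.2 & abs(Y) < 0.2)) = 1;   // centered white square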

Figure 11. Square function aperture.
Figure 12. a) FT of the square function and b) its inverse FT. c) Real and d) Imaginary parts of the FT.

We again observe an Airy-like pattern similar to that of the circular aperture. There are alternating fringes along the x-axis, and also along the y-axis. The fringes are rectangular, owing to the shape of the aperture. In the real and imaginary maps, a sort of checkered pattern is observed at the center of the real part, and along the x and y axes of the imaginary part. The bright and dark fringes in Fig. 12a may have been contributed mainly by the imaginary part of the FT.

F. 2D Gaussian Bell Curve

An aperture generating Gaussian light (σ = 0.9) along x and y axes is used.
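A minimal sketch of the aperture (sigma = 0.9 follows the text; the grid range is an assumption):

    x = linspace(-1, 1, 128);
    [X, Y] = ndgrid(x, x);
    g = exp(-(X.^2 + Y.^2) / (2 * 0.9^2));   // 2D Gaussian, already within [0, 1]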

Figure 13. 2D Gaussian bell curve image.
Figure 14. a) FT of the 2D Gaussian bell curve and b) its inverse FT. c) Real and d) Imaginary parts of the FT.

The Airy pattern is no longer visible. This could be due to the decline in intensity away from the center, where constructive and destructive interference are no longer prominent. Now, according to Wolfram MathWorld, the Fourier Transform of a Gaussian distribution is also a Gaussian. However, the FT we obtained is rather small, so the Gaussian shape is hard to see. Once I explain anamorphism in Activity #6, I will show that the FT of a Gaussian is indeed still Gaussian.

Now, let’s move on to some basic applications of FT.

II. Convolution

Convolution is, loosely speaking, the mathematical process of mixing two distinct signals together. This process is extremely important in signal processing, such as in windowing and impulse decomposition (convolution with impulse functions such as the Dirac delta function and the sinc function). The convolution of two 2D functions f and g is given by the equation [1]

h(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x', y')\, g(x - x', y - y')\, dx'\, dy'

where h is the convolution of the two functions.

The question now is, what in blazes is the FT's role in all of this? Well, as proven by our great mathematicians, the convolution of two functions is just the inverse FT of the product of the two functions' FTs. Pretty convenient, isn't it?
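In symbols, with \mathcal{F} denoting the Fourier Transform,

h = f * g = \mathcal{F}^{-1}\{ \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} \}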


To show the convenience of this relation, let's make use of our digital aperture imaging. In imaging, one function can be taken as the object, and the other as the transfer function, or in this case, the aperture of the imaging system. The convolution is then the image acquired by the imaging system, which is usually not perfectly identical to the object. This is an inherent property of most digital cameras, since most are fitted with lenses of finite size, which gather a limited amount of light from the scene [1].

I have here an image of the acronym “VIP,” which we will treat as an object.

Figure 15. Grayscale image of the acronym "VIP."

Now, let's perform an image convolution. The aperture used is circular, the same as in Part I.A. The code is given below:

Code for Convolution using FT.

Now, note that in the code I did not compute the FT of the apertures; the circular aperture is used directly in Fourier space, as if it were already the FT of the physical aperture. What we are convolving with the "VIP" image, then, is the inverse FT of the generated circular aperture. Thus, if the circular aperture used in the code has a larger radius, its inverse FT has a smaller radius, and is an Airy pattern.
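A minimal sketch of that procedure (the actual code repeats this for several radii; the radius of 0.5 and the file name are assumptions, with SIVP assumed as before):

    vip = double(rgb2gray(imread("vip.png")));   // 128x128 object image
    x = linspace(-1, 1, 128);
    [X, Y] = ndgrid(x, x);
    aper = zeros(128, 128);
    aper(find(sqrt(X.^2 + Y.^2) < 0.5)) = 1;     // circular aperture, radius 0.5
    FV = fft2(vip);                              // FT of the object (DC at the corner)
    Fr = fftshift(aper);                         // aperture used directly as an FT; shift to match
    img = abs(fft2(FV .* Fr));                   // second forward FFT: inverse FT, inverted
    imshow(img / max(img));                      // the image "seen" through the aperture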

As you can see, the code uses apertures of varying radius. This is to see the effect of a finite lens size on the image obtained by the detection system. The results are shown below.

Figure 16. The aperture image (top) and the corresponding image of Fig. 15 (bottom) for an aperture of radius = a) 0.05, b) 0.1, c) 0.3, d) 0.5, e) 0.7, and f) 0.9 units. The image of the object is in spatial space (inverse FT).

As seen from the images of the object, the size of the aperture has a big effect on the intensity and rendering of the image. In Fig. 16a, the "VIP" image can barely be seen. As the aperture size increases, the blurring diminishes and the "VIP" becomes legible. For the larger apertures in Figs. 16d-16f, the image rendering, that is, the sharpness of the edges and the intensity of the white parts, increases. Remember that we are not convolving the aperture shown in Fig. 16 with the "VIP" image; rather, we are convolving the inverse FT of the circular aperture in Fig. 16 with the "VIP" image. A smaller aperture radius in Fig. 16 produces an inverse FT in the shape of an Airy pattern, and convolving it with the "VIP" image yields a combination of the Airy pattern and the "VIP" image. As the aperture size in Fig. 16 increases, this Airy pattern shrinks, eventually into a small dot, which makes the convolution a clear image of the "VIP." Embedded programs may be used in most digital cameras to improve image rendering and acquisition, which is otherwise hindered by small lenses.

III. Correlation (Template Matching)

One of my favorite uses of the Fourier Transform is its ability to easily correlate a simple signal with a larger signal that contains components similar to it. Correlation measures how similar two functions are. The correlation of two 2D functions is given by the equation [1]

p(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f^{*}(x', y')\, g(x + x', y + y')\, dx'\, dy'

where p is the correlation of the signals f and g. Again, as shown by our great minds and geniuses, the correlation of two signals is just the inverse FT of the product of the FT of g and the conjugate of the FT of f.
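In symbols,

p = \mathcal{F}^{-1}\{ \mathcal{F}\{f\}^{*} \cdot \mathcal{F}\{g\} \}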

Now, what is the use of this correlation thing in image processing? As stated earlier, it measures the degree of similarity between two functions. Thus, the correlation value at a given point is larger where the functions are more alike. Template matching and pattern recognition are techniques that make use of correlation, which I will now demonstrate.

I have here an image of a sentence, which I will call the “Rain.” Sitting next to it is the image of the letter A, which has the same size, font and gray values as the “Rain.”

Figure 17. a) The "Rain" image and b) the letter "A" similar to that in the "Rain."

The Scilab code for image correlation is posted below.

Code for Correlation using FT.
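A minimal sketch of that correlation step (file names are hypothetical; depending on FFT conventions the dot positions may come out shifted or inverted):

    rain = double(rgb2gray(imread("rain.png")));        // the "Rain" sentence image
    templ = double(rgb2gray(imread("letter_a.png")));   // the letter "A" template
    corr = abs(fft2(fft2(templ) .* conj(fft2(rain))));  // inverse FT via a second forward FFT
    imshow(corr / max(corr));                           // bright dots mark strong matches
    imshow(1 * (corr > 0.8 * max(corr)));               // thresholded map (0.8, per Fig. 18)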

Now, let’s see the results.

Figure 18. a) Correlation of the two images in Fig. 17 in Fourier space, together with b) a map of high-intensity points. c) Images a and b overlaid to show the locations of the dots. The intensity threshold is set to 0.8 gray value.

The image in Fig. 18a shows the correlation of the two images, in which high-intensity dots represent regions of high correlation. The map is presented in Fig. 18b. The mapping indicates the positions of high correlation, consistent with Fig. 18c. This application is very useful in pattern recognition, such as in word-finding algorithms.

Now, we're not limited to word images. We can also do this with graphical images. I have here a picture that I obtained from A Hidden-picture Puzzles Generator by Yoon et al. Let's say I want to find the picture of this cute little girl. I tried the code with the images below.

Figure 19. a) Object image correlated with b) pattern image using FT. c) The correlation image shows a single dot (inside the red box) with a distinct intensity pointing near the pattern location in a).

Although there are high intensities throughout the correlation image in Fig. 19c, we can detect a single dot that is locally distinct. The high intensities across the image could be due to the pattern in Fig. 19b being prominent throughout, since the original image is mostly a bunch of lines. Nevertheless, improvements could be made to obtain a more precise result.

IV. Edge Detection using Fourier Transform

Remember the length and area estimation activity? Well, let's try to understand the principle behind edge detection using convolution and the FT. Edge detection can be treated as template matching of a pattern against an image. The pattern usually used is a 3×3 matrix whose elements sum to zero. Different patterns can be used as the edge-detecting matrix; the pattern of the 3×3 matrix is what gets detected in the image it is convolved with.

Let's try this technique on the "VIP" image in Fig. 15. To start things off, we designate a 3×3 horizontal matrix with an element sum of zero, shown below.

Horizontal Matrix.

To find the edges of the “VIP” image using this matrix, I used the Scilab code shown below.

Code for edge detection.

Basically, the code converts the 3×3 matrix into a 128×128 image in which the surrounding pixels are zero-valued, and this is then convolved with the "VIP" image via the FT.
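A minimal sketch of that procedure (the horizontal pattern shown is an assumed example of a zero-sum matrix; SIVP assumed as before):

    pat = [-1 -1 -1; 2 2 2; -1 -1 -1];                  // 3x3 horizontal pattern, elements sum to 0
    P = zeros(128, 128);
    P(64:66, 64:66) = pat;                              // embed the pattern at the image center
    vip = double(rgb2gray(imread("vip.png")));
    edges = abs(fft2(fft2(vip) .* fft2(fftshift(P))));  // convolution via FT
    imshow(edges / max(edges));

And here's the result.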

Figure 20. Edge detected image with horizontal matrix.

As you can see, the horizontally oriented edges obtained high values in the convolution; those are the pixel locations that match the pattern of the matrix. Now, let's try different patterns.

Figure 21. Edge detection matrix (top) and the corresponding edge-detected image (bottom) for a) vertical, b) diagonal, c) spot, d) cross, and e) corner matrix.

The edge detection worked like a charm. Vertical edges are picked out by the vertical matrix, and the diagonal edges, prominent in the "V," are picked out by the diagonal matrix. The spot and cross matrices gave almost the same results, outlining the whole "VIP" image. The corner matrix detected the corners of the "I" and of the straight part of the "P," and showed high intensities at the corners of the "V."

Before I finish, let me add a few more pictures. There are prominent edge-finding techniques such as Sobel and Prewitt. I looked up their matrix operators and tried them with my code. Here are some of my results.

Figure 22. Edge detection matrix (top) and the corresponding edge-detected image (bottom) using a) Sobel vertical derivative (Gx) and b) Sobel horizontal derivative (Gy).
Figure 23. Edge detection matrix (top) and the corresponding edge-detected image (bottom) using a) Prewitt vertical derivative (Gx) and b) Prewitt horizontal derivative (Gy).

Now, these operators are the main constituents of those used in practical edge-detection techniques in image processing. The elements of an operator's matrix, take for example the vertical mask of the Sobel operator, compute the difference between the pixel intensities to the right and left of a particular edge, thereby increasing the intensity for edge visibility. By applying more weight to the matrix, that is, increasing the element values while keeping the sum at zero, we obtain higher intensities at the detected edges [3].
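For reference, the standard Sobel and Prewitt masks (well-known values; the labels follow the captions above):

    Sx = [-1 0 1; -2 0 2; -1 0 1];    // Sobel Gx: strongest response at vertical edges
    Sy = [-1 -2 -1; 0 0 0; 1 2 1];    // Sobel Gy: strongest response at horizontal edges
    Px = [-1 0 1; -1 0 1; -1 0 1];    // Prewitt Gx
    Py = [-1 -1 -1; 0 0 0; 1 1 1];    // Prewitt Gy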

For my final images, I used masks of no particular pattern.

Figure 24. Edge detection matrix (top) and the corresponding edge-detected image (bottom) using a) low-weighted mask and b) high-weighted mask.

In Fig. 24, there are two different masks with random elements whose total sum is still zero; one has lower weights (smaller values) and the other higher weights. For the lower-weighted mask, the most prominent features are the vertical edges, suggesting that the mask is vertically oriented, that is, differences across columns produce higher intensities. The higher-weighted mask showed edges around the whole "VIP" image, with somewhat higher intensities in the horizontal orientation. One can thus mix different elements to obtain different kinds of edge-detecting masks, which is essential to digital image processing. As we did in Activity #4, edge detection is necessary for estimating area via Green's theorem.

Well, it's been worthwhile doing this activity, since it brought up the lifeblood of digital image processing. The beauty of it is not just in the process and in writing the code, but in seeing the resulting image. This is just the basics of image processing using the FT, and I've been spending so much time dilly-dallying on the basics. Moreover, this is the first time I've used the FT on images, which is what I've been waiting for since the start of AP 186. I've previously used the FT only for time-dependent signals obtained from signal generators, which was of course quite useful at the time. I'm not just happy with my results; I'm ecstatically looking forward to the FT's major capabilities, or should I say, its power. So with that, I'm going to rate myself 11/10, and end with an edge-detected version of my Facebook photo.

"YEEEEEEAAAAHHHHH!!"

Acknowledgements: I just want to thank Gio Jubilo for clarifying what a square function is; it's not a square wave!

References:

[1] Soriano, M., “Fourier Transform Model of Image Formation,” 2014.

[2] Lehar, S., “An Intuitive Explanation of Fourier Theory.” Retrieved from: <http://cns-alumni.bu.edu/~slehar/fourier/fourier.html>

[3] "Sobel Operator," Tutorialspoint. Retrieved from: <http://www.tutorialspoint.com/dip/Sobel_operator.htm>
