CSE/EE 486:  Fundamentals of Computer Vision


      Computer Project Report #1:

      Binary Image Processing and Recognition

      Group #11: Howard Elikan, Jim Geis and Anirudh Modi

      February 16, 1999


      1. Objectives:
        1. To design a simple machine-vision system to recognize unknown test objects
        2. To implement code that processes images and obtains image features and characteristics
        3. To build a model database of objects for comparison with test objects

      2. Methods:
        The images used to test the machine vision system were taken with the SunVideo camera in 101 Pond. Four objects were photographed individually. Then three of the objects were captured in a single image. The fourth object, not used in the previous image, was photographed with a fifth, unknown object. In total, six images were used for the project. We chose these objects because each has a uniform color and a simple shape.

        1. Individual images
            Figure 1 - a dark gray calculator
            Figure 2 - a square bright yellow Post-It notepad
            Figure 3 - a blue racquetball
            Figure 4 - a white soapdish

        2. Test images
            Figure 5 - the calculator, racquetball, and soapdish
            Figure 6 - the Post-It notepad with a black wallet

        The images were converted to PGM format using xv. Then the images could be read into our program for analysis. Our machine vision system is implemented entirely in object-oriented C++ code. The program is made up of several main functions:

        1. Grayscale image import - reads image data in one of several image formats and stores it in an array for processing.
        2. Histogram analysis - extracts the histogram from the grayscale image and smooths it using Bezier curves (Ref: Mathematical Elements for Computer Graphics, Rogers and Adams, pp. 295-298). The smoothed histogram is then examined to find its peaks (e.g., Figure 13).
        3. Thresholding - Using the peaks found in the histogram, several thresholds are applied to separate the objects from the background and generate additional images.
        4. Component labeling - Labels the components found in the image, while deleting small components to eliminate noise.
        5. Object feature extraction - After components are identified and labeled, features such as area, perimeter, minimum bounding rectangle, elongation, and compactness are extracted.
        6. Invariant moments calculation - All seven invariant moments are produced for each component. They are expressed as logarithms to the base 10.
        7. Object classifier - The features of the components are compared against the model database to differentiate the objects in Figure 5 and Figure 6. The classifier recognizes two objects as being the same if the average absolute error for the corresponding invariant moments is less than 5 percent, and none of the invariant moments are off by more than 10 percent.
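
        Steps 3 and 4 above can be sketched as follows. This is a minimal illustration, not the code from image.cc: the GrayImage type, the function names, the choice of 4-connectivity, and the breadth-first flood fill are all assumptions made for the example. The threshold band [lo, hi] stands in for a pair of thresholds derived from the histogram peaks, and minArea is a hypothetical noise cutoff.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical image type: row-major 8-bit grayscale pixels.
struct GrayImage {
    int width;
    int height;
    std::vector<unsigned char> pixels;
};

// Pixels with lo <= value <= hi become foreground (1); all others become 0.
std::vector<int> thresholdBand(const GrayImage& img, int lo, int hi) {
    std::vector<int> mask(img.pixels.size(), 0);
    for (std::size_t i = 0; i < img.pixels.size(); ++i) {
        const int v = img.pixels[i];
        mask[i] = (v >= lo && v <= hi) ? 1 : 0;
    }
    return mask;
}

// Labels 4-connected foreground regions 1, 2, 3, ... by breadth-first flood
// fill. Regions with fewer than minArea pixels are treated as noise and keep
// label 0. Returns the number of components that were kept.
int labelComponents(const std::vector<int>& mask, int width, int height,
                    std::size_t minArea, std::vector<int>& labels) {
    labels.assign(mask.size(), 0);
    std::vector<char> visited(mask.size(), 0);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int nextLabel = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t start = static_cast<std::size_t>(y) * width + x;
            if (!mask[start] || visited[start]) continue;
            // Collect every pixel reachable from this seed.
            std::vector<std::size_t> region;
            std::queue<std::size_t> frontier;
            frontier.push(start);
            visited[start] = 1;
            while (!frontier.empty()) {
                const std::size_t cur = frontier.front();
                frontier.pop();
                region.push_back(cur);
                const int cx = static_cast<int>(cur % width);
                const int cy = static_cast<int>(cur / width);
                for (int k = 0; k < 4; ++k) {
                    const int nx = cx + dx[k];
                    const int ny = cy + dy[k];
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    const std::size_t n = static_cast<std::size_t>(ny) * width + nx;
                    if (mask[n] && !visited[n]) {
                        visited[n] = 1;
                        frontier.push(n);
                    }
                }
            }
            if (region.size() < minArea) continue;  // too small: discard as noise
            ++nextLabel;
            for (std::size_t idx : region) labels[idx] = nextLabel;
        }
    }
    return nextLabel;
}
```

        Calling thresholdBand once per histogram-derived band and then labelComponents on each mask yields one label image per band; the area of a component is simply the number of pixels carrying its label.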

        To run the program, we made a script that compiles the code (if necessary) and executes the machine vision program on a specified image file. During execution, the program creates several temporary image files. When execution is complete, the script loads xv to view the output image files created by the program. When the user closes xv, the script deletes the temporary files and terminates. For example, to run the program on the image with three objects in it, the user would type: run 3things.pgm

        The program code and script code can be found in the Appendix.

      3. Results:
        Output thresholded images are created and then viewed in xv. The binary images produced by the computer vision system for each of the individual images are:

          Figure 7 - binary calculator image
          Figure 8 - binary Post-It notepad image
          Figure 9 - binary racquetball image
          Figure 10 - binary soapdish image

        Below is the object information generated by the program as it processes each of the individual images:

        Feature          Calculator   Post-It    Racquetball   Soapdish
        ------------------------------------------------------------------
        Area             12933        5174       2320          7210
        Perimeter        498          283        170           347
        Compactness      19.1761      15.479     12.4569       16.7003
        Elongation       2.1647       1.17647    1.4222        1.8000
        Moment phi(1)    7.6144       6.65767    5.9590        7.0227
        Moment phi(2)    14.9672      11.8834    10.9383       13.5485
        Moment phi(3)    15.9548      13.9503    12.2206       14.9118
        Moment phi(4)    14.3529      13.8495    12.4009       14.2102
        Moment phi(5)    29.2652      27.7216    24.5691       27.7059
        Moment phi(6)    21.5070      19.4761    17.5828       20.9607
        Moment phi(7)    29.5028      27.6203    24.5616       28.8328

        After extracting this information from the individual images, the computer vision system uses it to classify the three objects in Figure 5. Processing Figure 5, the program identified and labeled the three components by shading them with different gray levels, as shown in Figure 11. The program then generated the following data for each component:

        Feature          Component 1   Component 2   Component 3
        -----------------------------------------------------------
        Area             2373          12510         7982
        Perimeter        178           478           364
        Compactness      13.3519       18.2641       16.5993
        Elongation       1.4667        2.3038        1.8235
        Moment phi(1)    5.9800        7.5808        7.1139
        Moment phi(2)    11.0069       14.8911       13.7512
        Moment phi(3)    12.1741       16.3242       14.8077
        Moment phi(4)    12.5520       15.5116       13.9352
        Moment phi(5)    24.3962       31.2358       28.0073
        Moment phi(6)    17.6484       22.4332       20.8003
        Moment phi(7)    24.8781       31.5026       27.9832

        The computer vision system compares the data collected from the individual images with the data collected from Figure 5, the image containing three objects, to determine which components match the database objects. The method the classifier uses is straightforward. For each component in Figure 5, it calculates the percentage error between the component's invariant moments and those of each object in the model database. If the average absolute error over the seven invariant moments is less than 5 percent, and no single invariant moment is off by more than 10 percent, the pair is considered a match. When a match occurs, the matching components are stated and the feature vectors of both objects are printed. The feature vector consists of area, perimeter, compactness, elongation, and all seven moments. Below is a portion of the program output, showing the comparisons among the objects and the resulting matches.

        Avg error between component 1 and images/calculator.pgm = 19.20 %
        
        Avg error between component 1 and images/racquetball.pgm = 0.71 %
        Component 1 of "images/3things.pgm" <==> "images/racquetball.pgm"
        ------------------------------------------------------------
        Property   	  Image 1 	  Image 2 	    Error
        ------------------------------------------------------------
        Moment 1   	    5.9800	    5.9590	    0.35 %
        Moment 2   	   11.0069	   10.9383	    0.63 %
        Moment 3   	   12.1741	   12.2206	   -0.38 %
        Moment 4   	   12.5520	   12.4009	    1.22 %
        Moment 5   	   24.3962	   24.5691	   -0.70 %
        Moment 6   	   17.6484	   17.5828	    0.37 %
        Moment 7   	   24.8781	   24.5616	    1.29 %
        Area       	      2373	      2320	    2.28 %
        Perimeter  	       178	       170	    4.71 %
        Compactness	   13.3519	   12.4569	    7.18 %
        Elongation 	    1.4667	    1.4222	    3.12 %
        ------------------------------------------------------------
        
        Avg error between component 1 and images/soapdish.pgm = 15.01 %
        Avg error between component 1 and images/postit.pgm = 20.18 %
        
        Avg error between component 2 and images/calculator.pgm = 4.17 %
        Component 2 of "images/3things.pgm" <==> "images/calculator.pgm"
        ------------------------------------------------------------
        Property   	  Image 1 	  Image 2 	    Error
        ------------------------------------------------------------
        Moment 1   	    7.5808	    7.6144	   -0.44 %
        Moment 2   	   14.8911	   14.9672	   -0.51 %
        Moment 3   	   16.3242	   15.9548	    2.32 %
        Moment 4   	   15.5116	   14.3529	    8.07 %
        Moment 5   	   31.2358	   29.2652	    6.73 %
        Moment 6   	   22.4332	   21.5070	    4.31 %
        Moment 7   	   31.5026	   29.5028	    6.78 %
        Area       	     12510	     12933	   -3.27 %
        Perimeter  	       478	       498	   -4.02 %
        Compactness	   18.2641	   19.1761	   -4.76 %
        Elongation 	    2.3038	    2.1647	    6.43 %
        ------------------------------------------------------------
        
        Avg error between component 2 and images/racquetball.pgm = 29.29 %
        Avg error between component 2 and images/soapdish.pgm = 9.36 %
        Avg error between component 2 and images/postit.pgm = 25.68 %
        Avg error between component 3 and images/calculator.pgm = 5.36 %
        Avg error between component 3 and images/racquetball.pgm = 17.84 %
        
        Avg error between component 3 and images/soapdish.pgm = 1.46 %
        Component 3 of "images/3things.pgm" <==> "images/soapdish.pgm"
        ------------------------------------------------------------
        Property   	  Image 1 	  Image 2 	    Error
        ------------------------------------------------------------
        Moment 1   	    7.1139	    7.0227	    1.30 %
        Moment 2   	   13.7512	   13.5485	    1.50 %
        Moment 3   	   14.8077	   14.9118	   -0.70 %
        Moment 4   	   13.9352	   14.2102	   -1.93 %
        Moment 5   	   28.0073	   27.7059	    1.09 %
        Moment 6   	   20.8003	   20.9607	   -0.77 %
        Moment 7   	   27.9832	   28.8328	   -2.95 %
        Area       	      7982	      7210	   10.71 %
        Perimeter  	       364	       347	    4.90 %
        Compactness	   16.5993	   16.7003	   -0.60 %
        Elongation 	    1.8235	    1.8000	    1.31 %
        ------------------------------------------------------------
        
        Avg error between component 3 and images/postit.pgm = 15.46 %
        
        Thus, the program identifies component 1 as the racquetball, component 2 as the calculator, and component 3 as the soapdish.
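
        The matching rule can be sketched as below. This is an illustration rather than the code from main.cc: the Moments alias, the MatchResult struct, and the compareMoments function are hypothetical names. The error formula, (test - model) / model, and the averaging over the seven moments only are inferred from the program output above, which they reproduce.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// The seven invariant moments of one object, as log10 values.
using Moments = std::array<double, 7>;

// Hypothetical result record for one component-vs-model comparison.
struct MatchResult {
    double avgAbsErrorPct;  // mean of |error| over the seven moments
    double maxAbsErrorPct;  // largest single |error|
    bool isMatch;           // true if both limits are satisfied
};

// Compares a test component against a model object. The per-moment error is
// taken relative to the model value, as the printed tables suggest.
MatchResult compareMoments(const Moments& test, const Moments& model,
                           double avgLimitPct = 5.0,
                           double maxLimitPct = 10.0) {
    double sum = 0.0;
    double worst = 0.0;
    for (std::size_t i = 0; i < test.size(); ++i) {
        const double errPct =
            std::fabs((test[i] - model[i]) / model[i]) * 100.0;
        sum += errPct;
        if (errPct > worst) worst = errPct;
    }
    const double avg = sum / static_cast<double>(test.size());
    return {avg, worst, avg < avgLimitPct && worst < maxLimitPct};
}
```

        With the tabulated moments, component 1 against the racquetball gives an average error of about 0.71 percent (a match), while component 3 against the calculator gives about 5.36 percent (no match), in agreement with the output above.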


        The same test was then performed on Figure 6, an image containing one object known to the database and one unknown object. Figure 12 shows the component-labeled image, with the unknown object shown in blue. The data and output are shown below:

        Feature          Component 1   Component 2
        ---------------------------------------------
        Area             6917          5190
        Perimeter        359           289
        Compactness      18.6325       16.0927
        Elongation       1.6000        1.1765
        Moment phi(1)    6.9673        6.6611
        Moment phi(2)    13.3584       11.8809
        Moment phi(3)    13.8035       14.0154
        Moment phi(4)    14.6173       13.9223
        Moment phi(5)    28.6816       27.7294
        Moment phi(6)    20.6515       19.7001
        Moment phi(7)    28.0990       27.9599


        Avg error between component 1 and images/calculator.pgm = 16.47 %
        Avg error between component 1 and images/racquetball.pgm = 26.92 %
        
        Avg error between component 1 and images/soapdish.pgm = 2.86 %
        Component 1 of "images/wallet_postit.pgm" <==> "images/soapdish.pgm"
        ------------------------------------------------------------
        Property          Image 1         Image 2           Error
        ------------------------------------------------------------
        Moment 1            6.9673          7.0227         -0.79 %
        Moment 2           13.3584         13.5485         -1.40 %
        Moment 3           13.8035         14.9118         -7.43 %
        Moment 4           14.6173         14.2102          2.86 %
        Moment 5           28.6816         27.7059          3.52 %
        Moment 6           20.6515         20.9607         -1.48 %
        Moment 7           28.0990         28.8328         -2.54 %
        Area                  6917            7210         -4.06 %
        Perimeter              359             347          3.46 %
        Compactness        18.6325         16.7003         11.57 %
        Elongation          1.6000          1.8000        -11.11 %
        ------------------------------------------------------------
        
        Avg error between component 1 and images/postit.pgm = 14.94 %
        Avg error between component 2 and images/calculator.pgm = 19.60 %
        Avg error between component 2 and images/racquetball.pgm = 22.30 %
        Avg error between component 2 and images/soapdish.pgm = 14.95 %
        
        Avg error between component 2 and images/postit.pgm = 0.49 %
        Component 2 of "images/wallet_postit.pgm" <==> "images/postit.pgm"
        ------------------------------------------------------------
        Property          Image 1         Image 2           Error
        ------------------------------------------------------------
        Moment 1            6.6611          6.6544          0.10 %
        Moment 2           11.8809         11.8604          0.17 %
        Moment 3           14.0154         13.9487          0.48 %
        Moment 4           13.9223         13.8845          0.27 %
        Moment 5           27.7294         27.7733         -0.16 %
        Moment 6           19.7001         19.4679          1.19 %
        Moment 7           27.9599         27.6697          1.05 %
        Area                  5190            5155          0.68 %
        Perimeter              289             283          2.12 %
        Compactness        16.0927         15.5362          3.58 %
        Elongation          1.1765          1.1765          0.00 %
        ------------------------------------------------------------
        

        Thus, the program correctly identifies component 2 as the Post-It notepad, but it incorrectly identifies component 1 (the wallet) as the soapdish.

      4. Conclusions:
        The results of the machine vision system are roughly what we expected to see. The system correctly identified all three objects in Figure 5 and the known object in Figure 6. Unfortunately, it incorrectly identified the unknown object in Figure 6 as one of the database objects. This is not difficult to explain: comparing the binary image of the unknown object (the wallet) with that of the known object (the soapdish) shows how similar they are. Their dimensions are nearly identical. Based on the success rate of our classifier, we think that it was well chosen. Using the invariant moments in the feature vector made our tests independent of scale, position, and rotation. However, because the feature vector contains no measure of intensity, the black wallet was mistaken for the white soapdish.
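
        The invariance property mentioned above can be checked numerically with a sketch like the one below, which computes Hu's seven invariants from normalized central moments in their standard textbook form. It is not the code from image.cc: the BinaryMask type and the huMoments function are hypothetical, and the report does not state how the authors normalized their moments before taking base-10 logarithms, so the raw values returned here are not expected to equal the tabulated ones.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical binary mask: row-major, nonzero entries are object pixels.
struct BinaryMask {
    int width;
    int height;
    std::vector<int> pixels;
};

// Returns Hu's seven invariants computed from normalized central moments.
std::array<double, 7> huMoments(const BinaryMask& m) {
    std::array<double, 7> phi{};
    // Area and centroid from the zeroth- and first-order raw moments.
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (int y = 0; y < m.height; ++y)
        for (int x = 0; x < m.width; ++x)
            if (m.pixels[static_cast<std::size_t>(y) * m.width + x]) {
                m00 += 1.0;
                m10 += x;
                m01 += y;
            }
    if (m00 == 0.0) return phi;  // empty mask: all invariants are zero
    const double cx = m10 / m00, cy = m01 / m00;

    // Central moments mu[p][q] for p + q <= 3 (translation invariant).
    double mu[4][4] = {};
    for (int y = 0; y < m.height; ++y)
        for (int x = 0; x < m.width; ++x) {
            if (!m.pixels[static_cast<std::size_t>(y) * m.width + x]) continue;
            const double dx = x - cx, dy = y - cy;
            for (int p = 0; p <= 3; ++p)
                for (int q = 0; p + q <= 3; ++q)
                    mu[p][q] += std::pow(dx, p) * std::pow(dy, q);
        }

    // Normalized central moments eta_pq = mu_pq / mu_00^(1 + (p + q) / 2),
    // which removes the dependence on scale.
    auto eta = [&](int p, int q) {
        return mu[p][q] / std::pow(m00, 1.0 + (p + q) / 2.0);
    };
    const double n20 = eta(2, 0), n02 = eta(0, 2), n11 = eta(1, 1);
    const double n30 = eta(3, 0), n03 = eta(0, 3);
    const double n21 = eta(2, 1), n12 = eta(1, 2);

    // Sums and differences that recur in the third-order invariants.
    const double a = n30 + n12, b = n21 + n03;
    const double c = n30 - 3.0 * n12, d = 3.0 * n21 - n03;

    phi[0] = n20 + n02;
    phi[1] = (n20 - n02) * (n20 - n02) + 4.0 * n11 * n11;
    phi[2] = c * c + d * d;
    phi[3] = a * a + b * b;
    phi[4] = c * a * (a * a - 3.0 * b * b) + d * b * (3.0 * a * a - b * b);
    phi[5] = (n20 - n02) * (a * a - b * b) + 4.0 * n11 * a * b;
    phi[6] = d * a * (a * a - 3.0 * b * b) - c * b * (3.0 * a * a - b * b);
    return phi;
}
```

        Translating a shape or rotating it by 90 degrees on the pixel grid leaves all seven values unchanged up to rounding. Scaling and arbitrary rotations change them slightly because of resampling, which is one reason a percentage tolerance is a reasonable way to compare them.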

      5. Appendix:
        All images are 256 x 256.

        Figure 1: Model Database Image #1 (calculator.gif)
        Figure 7: Calculator object thresholded out from Figure 1 (binary_calculator.gif)
        Figure 2: Model Database Image #2 (postit.gif)
        Figure 8: Post-It notepad thresholded out from Figure 2 (binary_postit.gif)
        Figure 3: Model Database Image #3 (racquetball.gif)
        Figure 9: Racquetball object thresholded out from Figure 3 (binary_racquetball.gif)
        Figure 4: Model Database Image #4 (soapdish.gif)
        Figure 10: Soapdish object thresholded out from Figure 4 (binary_soapdish.gif)
        Figure 5: Test Image with three objects from the database (3things.gif)
        Figure 11: Three objects identified and labeled from Figure 5 (combined.gif)
        Figure 6: Test Image with one object from the database and one unknown object (wallet_postit.gif)
        Figure 12: The two objects identified and labeled from Figure 6 (both_things.gif)
        Figure 13: Histogram analysis of Figure 5

        Source code:

          The file image.h is the header file for the class of functions that we designed in C++.

          The file image.cc is the implementation file for the class of functions that we designed in C++.

          The file main.cc is the driver file for our program to compare images.

          The file main2.cc is another driver file for our program for analyzing individual images.

        Note: Special run scripts are necessary to execute the code. A complete copy of the code, scripts, and images is provided as project1.tar.gz. After you have downloaded the file:

        1. Type "gunzip -c project1.tar.gz | tar -xvf -"
        2. Type "./run" to execute the code