Limitations of CBIR - Appendix B

Limitations of Content-based Image Retrieval

Appendix B: Simulating What Is Happening in Current CBIR Systems

To investigate the performance of current approaches to CBIR, I wrote a simple program that uses only eight features and tested it on my personal collection of images. The query was a picture of a dog, and the hope was to find other pictures of the same dog in the collection.
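The text does not say which eight features the program used, so the following is only a minimal sketch of this kind of retrieval under an assumed feature set: the mean and standard deviation of the red, green, and blue channels and of a simple brightness value. The function names, the Euclidean distance, and the tiny made-up "images" are all hypothetical choices for illustration, not the program described above.

```python
import math

def channel_stats(values):
    """Return (mean, standard deviation) of a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)

def eight_features(pixels):
    """Map a list of (r, g, b) pixels to an 8-number feature vector.

    Assumed feature set (not from the original program): mean and
    standard deviation of R, G, B, and of brightness (r + g + b) / 3.
    """
    reds = [p[0] for p in pixels]
    greens = [p[1] for p in pixels]
    blues = [p[2] for p in pixels]
    brightness = [(r + g + b) / 3 for r, g, b in pixels]
    features = []
    for channel in (reds, greens, blues, brightness):
        features.extend(channel_stats(channel))
    return features

def distance(f1, f2):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def top_matches(query_pixels, database, k=5):
    """Return the k closest (distance, name) pairs, nearest first.

    `database` maps an image name to its list of (r, g, b) pixels.
    """
    q = eight_features(query_pixels)
    scored = [(distance(q, eight_features(px)), name)
              for name, px in database.items()]
    scored.sort()
    return scored[:k]

# Tiny made-up "images" (flat lists of RGB pixels), for illustration only.
brown = [(120, 80, 40)] * 16
darker_brown = [(110, 75, 35)] * 16
snow = [(240, 240, 250)] * 16
db = {"brown": brown, "darker_brown": darker_brown, "snow": snow}

for dist, name in top_matches(brown, db, k=2):
    print(name, round(dist, 1))
```

Note that nothing in such a feature vector records what the picture shows. Two images match whenever their color statistics happen to be close, whether or not they contain the same dog.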

Figure B1 shows screen dumps of attempts to match a given image against three databases of increasing size.


Figure B1: Results of CBIR on three different databases of size 7 (left), 307 (middle), and 1049 (right). (The gray areas in the image with distance 781 in the right screen dump were painted in after the match to obscure the names of people.)

The image with the yellow border is the given image and the rest are the top five matches. The first set contained only 7 images, so the results look "impressive." (This was, in effect, the design set.) The second set included additional images for a total of 307, and the results are shakier: a picture of a house in the winter has "sneaked" in amongst the dogs. The third set included all the images of the second plus additional ones for a total of 1049. The results are now even worse. When a test was run on a database containing none of the images of the design set but containing other images of the same dog, none of the latter were found. This is shown in Figure B2.


Figure B2: On the left are the results of CBIR on a database that contained none of the images of the design set. On the right are the images in the database that contain the same dog. Feature-based CBIR fails because the dog occupies only a small fraction of the images and the camera angle is different.

In the case of Figure B2 (right) one might think that the additional objects in the images distract from the matching, but this does not seem to be the case. A cropped version from this set, in which the dog occupies a larger fraction of the image than in the others and has a pose similar to that of the query, is found to be even more distant from the query than the others.

This example also illustrates the unreliability of evaluating methods on the basis of "experimental" results only. Looking only at the example of Figure B1 (middle), one might argue that the features I used are doing a good job of finding matches. Even Figure B1 (right) might support that claim. It is only the examples of Figure B2 that demonstrate that the "successful" results were really spurious.

At the very least, this implementation of the feature-based approach is not scalable.

It should be clear that this example is offered only as an illustration of the weakness of the feature-based approach in CBIR rather than as proof. The underlying reason for the failure is that feature-based CBIR projects its objects (images) from a very high-dimensional space (all possible pixel color combinations) to one of much lower dimensionality, that of the features. When the number of objects is small, their identity may be preserved in the lower-dimensional space, but it will be lost as the number of objects increases.
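The crowding effect can be imitated with a toy simulation. The sketch below is not a model of real images: it assumes that the feature vectors of unrelated images are uniform random points in an eight-dimensional unit cube and that the one true match lies at a fixed feature distance from the query. The function name, the distance of 0.6, and the random seed are arbitrary, hypothetical choices; the database sizes simply echo those of Figure B1.

```python
import math
import random

def rank_of_true_match(n_distractors, match_distance, dims=8, seed=0):
    """Rank (1 = best) of a true match lying `match_distance` from the query.

    Distractors are uniform random points in the unit cube [0, 1]^dims,
    an assumed stand-in for the feature vectors of unrelated images.
    """
    rng = random.Random(seed)
    query = [0.5] * dims  # query placed at the centre of the feature space
    closer = 0
    for _ in range(n_distractors):
        point = [rng.random() for _ in range(dims)]
        dist = math.sqrt(sum((q - p) ** 2 for q, p in zip(query, point)))
        if dist < match_distance:
            closer += 1  # an unrelated image outranks the true match
    return closer + 1

for n in (7, 307, 1049):
    print(n, rank_of_true_match(n, match_distance=0.6))
```

Because the same seed reproduces the same sequence of distractors, each larger database contains the smaller ones, as in the experiment above, and the rank of the true match can only stay the same or get worse as images are added.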

Latest update June 11, 2008