Limitations of Content-based Image Retrieval

© Copyright 2008 by T. Pavlidis

Appendix B: Simulating What is Happening in Current CBIR Systems

In order to investigate the performance of the current approaches in CBIR,
I wrote a simple program using only eight features and tested it on my
personal collection of images, using a picture of a dog as the query and
hoping to find other pictures of the same dog in the collection.

Figure B1 shows screen dumps of trying to match a given image in three
databases of increasing size. The image with the yellow border is the
given image and the rest are the top five matches. The first set contained
only 7 images so the results look "impressive." (This was, in effect, the
design set.) The second set included additional images for a total of 307
and the results are shakier. A picture of a house in the winter has
"sneaked" in amongst the dogs. The third set included all the images of
the second plus additional ones for a total of 1049. The results are now
even worse.

When a test was run on a database not containing any of the images of the
design set but containing other images of the same dog, none of the latter
were found. This is shown in Figure B2. In the case of Figure B2 (right)
one might think that the additional objects in the image distract from the
matching, but this does not seem to be the case. A cropped version from
this set, where the dog occupies a larger fraction of the image than in
the others and also has a pose similar to that of the query, is found even
more distant from the query than the others.

This example also illustrates the unreliability of evaluating methods on
the basis of "experimental" results only. Looking only at the example of
Figure B1 (center) one might argue that the features I used are doing a
good job of finding matches. Even Figure B1 (right) might support that
claim. It is only the examples of Figure B2 that demonstrate that the
"successful" results were really spurious.
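
For readers who want something concrete, the kind of program described at
the start of this appendix might look roughly like the sketch below. The
text does not say which eight features were used, so the feature set here
(colour means, colour spreads and two edge statistics) is purely an
assumption, as are the names eight_features and top_matches and the use of
random arrays in place of photographs.

```python
import numpy as np

def eight_features(img):
    """Map an H x W x 3 RGB array to 8 numbers (hypothetical feature set)."""
    img = np.asarray(img, dtype=float)
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)           # crude texture measure
    edges = np.hypot(gx, gy)
    return np.concatenate([
        img.mean(axis=(0, 1)),           # mean of R, G, B
        img.std(axis=(0, 1)),            # spread of R, G, B
        [edges.mean(), edges.std()],     # edge strength statistics
    ])

def top_matches(query, database, k=5):
    """Return indices of the k database images closest to the query."""
    feats = np.array([eight_features(im) for im in database])
    dist = np.linalg.norm(feats - eight_features(query), axis=1)
    return np.argsort(dist)[:k]

# Random arrays stand in for photographs (an assumption for illustration).
rng = np.random.default_rng(0)
database = [rng.integers(0, 256, size=(32, 32, 3)) for _ in range(20)]
query = database[3]
matches = top_matches(query, database)
```

Whatever the actual features are, the ranking depends only on the eight
numbers per image, so two images that differ everywhere except in those
eight numbers cannot be told apart.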
At least this implementation of the feature-based approach is not scalable.
It should be clear that this example is offered only as an illustration
of the weakness of the feature-based approach in CBIR rather than as proof.
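
The loss of scalability can be reproduced in a toy simulation. This is a
sketch under stated assumptions, not a model of the program used above:
"images" are random points in a 512-dimensional space, views of the same
subject are noisy copies of one point, and a random linear projection to 8
dimensions stands in for feature extraction. Only the database sizes 7,
307 and 1049 come from the text; every other value is an illustrative
choice.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, n_same, k = 512, 8, 5, 5       # all values are illustrative assumptions
sizes = (7, 307, 1049)               # database sizes quoted in the text

def hits(query, same, others, proj=None):
    """Count same-subject items among the k nearest neighbours of query."""
    db = np.vstack([same, others])
    if proj is not None:             # map everything to the feature space
        query, db = query @ proj, db @ proj
    nearest = np.argsort(np.linalg.norm(db - query, axis=1))[:k]
    return int(np.sum(nearest < len(same)))

def run(trials=20):
    full = {n: [] for n in sizes}
    low = {n: [] for n in sizes}
    for _ in range(trials):
        proj = rng.normal(size=(D, d))                # stand-in for feature extraction
        subject = rng.normal(size=D)
        query = subject + 0.5 * rng.normal(size=D)    # one view of the subject
        same = subject + 0.5 * rng.normal(size=(n_same, D))   # other views
        others = rng.normal(size=(max(sizes) - n_same, D))    # unrelated images
        for n in sizes:
            subset = others[: n - n_same]             # nested databases
            full[n].append(hits(query, same, subset))
            low[n].append(hits(query, same, subset, proj))
    return ({n: float(np.mean(v)) for n, v in full.items()},
            {n: float(np.mean(v)) for n, v in low.items()})

full_space, feature_space = run()
for n in sizes:
    print(n, full_space[n], feature_space[n])
```

Each printed row gives the database size, then the average number of
same-subject items among the top five in the full space and in the
8-dimensional space. In this toy setting the full-space count should stay
at five while the feature-space count drops as unrelated items are added.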
The underlying reason for the failure is that feature-based CBIR projects
its objects (images) from a very high-dimensional space (all possible
pixel color combinations) to one of much lower dimensionality, that of
the features. When the number of objects is small, their identity may be
preserved in the lower-dimensional space, but it will be lost as the number
of objects increases.

Latest update June 11, 2008