2004/08/23
Classifying Images on the web automatically
Rainer Lienhart and Alexander Hartmann
J Electronic Imaging 11 (4), 1-0 (Oct 2002)
Available: in paper only, LINC has some problems getting this through OpenURL
Relevant to: image classification, non-photographic image classification
Printed, highlighted, and filed.
The authors examine web image classification over a database of 300,000 images. They divide the non-photographic categories into presentation slides, comics/cartoons, and other. Our classification for Fei's project is somewhat more comprehensive but is not motivated by a corpus study.
The work is feature-oriented machine learning, achieving high accuracy using simple, image-only (raster-based) features. The colormap and the proportion of the picture with respect to the colormap seem to be among the most salient features.
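To make the colormap-style features concrete, here is a minimal sketch of how two of them (the number of distinct colors and the share of the image taken by the most frequent color) could be computed. The function name, the dictionary keys, and the input layout (a list of rows of RGB tuples) are my own assumptions for illustration, not the authors' implementation.

```python
from collections import Counter

def colormap_features(pixels):
    """Compute simple color-histogram features.

    pixels: list of rows, each row a list of (r, g, b) tuples.
    This input layout is an assumption made for the sketch.
    """
    flat = [p for row in pixels for p in row]
    counts = Counter(flat)
    # Most frequent color and how many pixels carry it.
    prevalent_color, prevalent_count = counts.most_common(1)[0]
    return {
        "total_colors": len(counts),
        "prevalent_color": prevalent_color,
        "prevalent_fraction": prevalent_count / len(flat),
    }

# A tiny 2x2 "image": three white pixels and one black pixel.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
features = colormap_features([[WHITE, WHITE], [WHITE, BLACK]])
```

A slide or cartoon would be expected to have few colors and a large prevalent fraction; a photograph, the opposite.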
For the top-level photo/non-photo classification, AdaBoost was used (similar to our work), and feature pruning is inherently done through decision stump feature selection. The highlighted features show that the four features selected for classification are: 1) the total number of colors, 2) the prevalent color, 3) the fraction of pixels with distance > 0 (f1), and 4) the ratio f1/f2, where f2 is similar to 3) but uses a high threshold rather than zero. Surprisingly, edge detection (an expensive feature) doesn't appear to be too useful. All selected features were based on the colormap and not on the locality or placement of the pixels in the image. Dimension features were not used.
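The point that decision stumps prune features implicitly can be illustrated with a small, generic AdaBoost sketch: each round picks the single feature and threshold with the lowest weighted error, so features that never get picked are effectively dropped. This is textbook discrete AdaBoost written for illustration, not the authors' code; all names and the toy data are hypothetical.

```python
import math

def stump_predict(row, j, t, polarity):
    # A stump looks at exactly one feature j and one threshold t.
    return polarity if row[j] > t else -polarity

def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost(X, y, rounds=5):
    """Labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, t, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, t, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1

def selected_features(ensemble):
    return sorted({j for _, j, _, _ in ensemble})

# Toy data: feature 0 (say, total colors) separates the classes; feature 1 is noise.
X = [[10, 5], [20, 1], [200, 4], [300, 2]]
y = [-1, -1, 1, 1]  # -1 = non-photo, +1 = photo (labels are illustrative)
model = adaboost(X, y)
```

On this toy data only feature 0 is ever chosen, which mirrors how a handful of colormap features could end up carrying the whole classifier.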
For the non-photo classification, text proves to be an important feature, and they capitalize on their group's previous research to detect text. Here, edge detection proves to be the second most useful feature after aspect ratio, which, according to Table 3, accounts for over 95% accuracy. This leads me to believe that a Hough transform optimized for vertical lines only could be used to lower the complexity of the feature extraction. Also, presentation slides exported from PowerPoint and similar tools might be detectable by their embedded metadata rather than by raster data properties.
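As a rough sketch of the vertical-lines-only idea: if the Hough transform is restricted to a single angle (vertical lines of the form x = c), the accumulator collapses to one vote count per image column, which is far cheaper than the full two-parameter transform. The function below assumes a precomputed binary edge map (list of rows of 0/1) and a hypothetical `min_fraction` parameter; it only finds lines spanning most of the image height, not short segments.

```python
def vertical_line_columns(edge_map, min_fraction=0.8):
    """Return x positions whose column holds enough edge pixels.

    edge_map: list of rows of 0/1 values (an assumed input format).
    min_fraction: share of the image height a column must cover
                  to count as a vertical line (illustrative default).
    """
    height = len(edge_map)
    width = len(edge_map[0])
    votes = [0] * width  # one accumulator cell per candidate line x = c
    for row in edge_map:
        for x, is_edge in enumerate(row):
            if is_edge:
                votes[x] += 1
    return [x for x, v in enumerate(votes) if v >= min_fraction * height]

# 5x4 edge map: column 2 is a full-height line, (0, 0) is a stray edge pixel.
edges = [
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
lines = vertical_line_columns(edges)
```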
Strengths:
- demonstrates that colormap features are a very strong key for non-photograph image classification.
- also does some error analysis that illustrates some borderline cases.
- uses only JPEG-compressed images for their study.
Weaknesses:
- no information about the kappa or percentage agreement between assessors. It is presumed that the task is easy and 100% doable.
- the 95%+ accuracy in non-photograph classification covers only a two-class split: comics vs. presentation slides.
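
For reference, the agreement statistics noted as missing in the first weakness are cheap to compute once two assessors' labels are available. The sketch below uses the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The labels are made-up examples, not data from the paper.

```python
from collections import Counter

def percent_agreement(labels_a, labels_b):
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two assessors labeling the same items."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: both assessors independently pick the same class.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical class throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two assessors on four images.
assessor_1 = ["photo", "photo", "slide", "comic"]
assessor_2 = ["photo", "slide", "slide", "comic"]
agreement = percent_agreement(assessor_1, assessor_2)
kappa = cohens_kappa(assessor_1, assessor_2)
```

Here the assessors agree on 3 of 4 items (0.75), but kappa is 7/11, about 0.64, once chance agreement is discounted.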
