Image Representation

How does your machine represent an image?

Alexander Freytag, Johannes Rühle, Paul Bodesheim, Erik Rodner, and Joachim Denzler


Motivation

Due to their simplicity and performance, many visual recognition systems use the popular bag-of-words approach for image classification. Inspired by text analysis and classification systems, the heart of the bag-of-words approach is the codebook, which captures frequently occurring low-level visual patterns, called visual "words". The codebook allows for an individual image description by mapping (quantizing) unseen image patterns to those prominent codebook words.
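The quantization step can be sketched as nearest-neighbor assignment of local descriptors to codebook words, followed by histogram counting. This is a minimal illustration only: the random vectors below stand in for real local features (e.g. SIFT or HOG descriptors) and for codebook words learned by k-means.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize each local descriptor to its nearest codebook word
    and return the normalized bag-of-words histogram."""
    # squared Euclidean distance between every descriptor and every word
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)                   # hard assignment
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                       # normalize to sum 1

# toy example: 100 random 128-d "descriptors", a 32-word "codebook"
rng = np.random.default_rng(0)
descriptors = rng.standard_normal((100, 128))
codebook = rng.standard_normal((32, 128))
hist = bow_histogram(descriptors, codebook)
print(hist.shape)  # → (32,)
```

The resulting histogram is the bag-of-words representation of the image; all spatial layout and within-cell appearance detail of the descriptors is discarded at this point.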

But how much information is lost during quantization, when image features are mapped to a fixed number of visual codebook words? To answer this question, we present an in-depth analysis of the effect of local feature quantization on human recognition performance, using images of the well-known 15 Scenes dataset. To this end, we employ a feature inversion technique that reveals to the human observer the visual information hidden in the bag-of-words histogram.
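One crude proxy for this information loss is the mean quantization error: the average distance between a descriptor and the codebook word it is mapped to, which shrinks as the codebook grows. The snippet below is a toy measurement, not the paper's experiment; random vectors stand in for real descriptors, and nested random subsets stand in for learned codebooks of different sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
descriptors = rng.standard_normal((500, 64))  # stand-in local features
perm = rng.permutation(len(descriptors))

errors = {}
for k in (8, 32, 128):
    codebook = descriptors[perm[:k]]          # nested random "codebooks"
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # mean squared distance to the assigned (nearest) word
    errors[k] = dists.min(axis=1).mean()
    print(k, errors[k])
```

Because the codebooks are nested, the error is guaranteed to decrease with codebook size here; the inversion experiments in the paper make this loss visible to the eye rather than summarizing it in a single number.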

 


Towards understanding quantization effects in feature extraction methods


[Figure: overview of the bag-of-words inversion pipeline]






Source code available

GitHub project page: https://github.com/cvjena/bowInversion
Download Tar-ball
   

Human classification results in comparison to machine learning (ML) performance

   

Overview of the images presented to human observers during our experiment (15 Scenes dataset)


Below the original 15 Scenes images, each row depicts examples of different inverted quantizations: HOG inversion, and inversions based on codebooks of size 2048, 512, 128, and 32 words (top to bottom).

Publications

[Freytag14:STB]

Alexander Freytag, Johannes Rühle, Paul Bodesheim, Erik Rodner, and Joachim Denzler. Seeing through bag-of-visual-word glasses: towards understanding quantization effects in feature extraction methods. ICPR Workshop on Features and Structures (FEAST), 2014. [pdf] [bib]




 
