Image Classification with Bag of Visual Words (BoW Model)
2016-01-08 17:15
You can use the Computer Vision System Toolbox™ functions for image category classification by creating a bag of visual words. The process generates a histogram of visual word occurrences that represents an image. These histograms are used to train an image category classifier. The steps below describe how to set up your images, create the bag of visual words, and then train and apply an image category classifier.
Step 1: Set Up Image Category Sets
Organize and partition the images into training and test subsets. Use the imageSet function to organize categories of images to use for training an image classifier. Organizing images into categories makes handling large sets of images much easier. You can use the imageSet.partition method to create subsets of representative images from each category.
Read the category images and create the image sets.
setDir = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
Separate the sets into training and test image subsets. In this example, 30% of the images are partitioned for training and the remainder for testing.
[trainingSets,testSets] = partition(imgSets,0.3,'randomize');
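To check the result of the split, you can inspect the properties of each image set. The following is a minimal sketch, assuming a MATLAB release in which imageSet is still available and the sample image sets that ship with the toolbox are installed. The variable names names, nTrain, and nTest are illustrative and do not come from the original example.

```matlab
% Build one imageSet per category folder, then split each category 30/70.
setDir  = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
[trainingSets,testSets] = partition(imgSets,0.3,'randomize');

% Each element of the array is one category. Description holds its label
% (the folder name when the sets are built recursively) and Count holds
% the number of images in the set.
names  = {imgSets.Description};
nTrain = [trainingSets.Count];
nTest  = [testSets.Count];
for k = 1:numel(imgSets)
    fprintf('%s: %d training, %d test\n', names{k}, nTrain(k), nTest(k));
end
```

If the categories hold very different numbers of images, the per-category counts printed here make that imbalance visible before any training starts.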
Step 2: Create Bag of Features
Create a visual vocabulary, or bag of features, by extracting feature descriptors from representative images of each category. The bagOfFeatures object defines the features, or visual words, by using the k-means clustering algorithm on the feature descriptors extracted from trainingSets. The algorithm iteratively groups the descriptors into k mutually exclusive clusters. The resulting clusters are compact and separated by similar characteristics. Each cluster center represents a feature, or visual word.
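The clustering idea can be illustrated without the toolbox. The sketch below runs plain Lloyd-style k-means on synthetic 2-D points that stand in for feature descriptors. It is an illustration of the technique under those assumptions, not the implementation that bagOfFeatures uses internally, and all names in it (X, C, idx, trueCenters) are hypothetical.

```matlab
% Toy "descriptors": 300 points in 2-D drawn around three centers.
% Real SURF descriptors are 64- or 128-dimensional.
rng(0);                                   % reproducible random numbers
trueCenters = [0 0; 5 5; 0 5];
X = [];
for c = 1:3
    X = [X; bsxfun(@plus, 0.5*randn(100,2), trueCenters(c,:))]; %#ok<AGROW>
end

k = 3;                                    % vocabulary size (number of visual words)
C = X(randperm(size(X,1), k), :);         % initialize centers from random samples

for iter = 1:50
    % Assignment step: squared distance from every descriptor to every center.
    D = bsxfun(@plus, sum(X.^2,2), sum(C.^2,2)') - 2*(X*C');
    [~, idx] = min(D, [], 2);             % nearest center for each descriptor

    % Update step: move each center to the mean of its assigned descriptors.
    Cnew = C;
    for j = 1:k
        if any(idx == j)
            Cnew(j,:) = mean(X(idx == j,:), 1);
        end
    end
    shift = max(abs(Cnew(:) - C(:)));
    C = Cnew;
    if shift < 1e-8                       % stop once the centers no longer move
        break;
    end
end
% Each row of C plays the role of one visual word, and idx assigns every
% descriptor to exactly one cluster (the clusters are mutually exclusive).
```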
You can extract features based on a feature detector, or you can define a grid to extract feature descriptors. The grid method may lose fine-grained scale information. Therefore, use the grid for images that do not contain distinct features, such as an image containing scenery, like the beach. Using the speeded-up robust features (SURF) detector provides greater scale invariance. By default, the algorithm runs the 'grid' method.
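The two extraction modes can be selected when the bag is created. The following is a minimal sketch, assuming the sample image sets that ship with the toolbox and a release in which imageSet is available. The variable names bagGrid and bagSurf and the vocabulary size of 250 are arbitrary choices for illustration.

```matlab
setDir  = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
trainingSets = partition(imgSets,0.3,'randomize');

% Default behavior: feature locations are sampled on a regular grid.
bagGrid = bagOfFeatures(trainingSets);

% Alternative: choose feature locations with the SURF detector and
% request a smaller vocabulary (250 is an arbitrary illustrative value).
bagSurf = bagOfFeatures(trainingSets, ...
    'PointSelection','Detector', ...
    'VocabularySize',250);
```

The grid is usually the safer starting point for texture-like scenes, while the detector is worth trying when objects appear at widely varying scales.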
This algorithm workflow analyzes images in their entirety. Images must have appropriate labels describing the class that they represent. For example, a set of car images could be labeled cars. The workflow does not rely on spatial information or on marking the particular objects in an image. The bag-of-visual-words technique relies on detection without localization.
Step 3: Train an Image Classifier With Bag of Visual Words
The trainImageCategoryClassifier function returns an image classifier. The function trains a multiclass classifier using the error-correcting output codes (ECOC) framework with binary support vector machine (SVM) classifiers. The trainImageCategoryClassifier function uses the bag of visual words returned by the bagOfFeatures object to encode each image in the image set into a histogram of visual words. These histograms are then used as the positive and negative samples to train the classifier.
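The training call itself is short. The following is a minimal sketch, assuming the same sample image sets and default options as in the earlier steps.

```matlab
setDir  = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
[trainingSets,testSets] = partition(imgSets,0.3,'randomize');

% Build the vocabulary, then train the classifier on the encoded
% training images. The category labels are taken from the image sets.
bag = bagOfFeatures(trainingSets);
categoryClassifier = trainImageCategoryClassifier(trainingSets, bag);

% Labels lists the category names that the classifier can output.
disp(categoryClassifier.Labels);
```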
1. Use the bagOfFeatures encode method to encode each image from the training set. This function detects and extracts features from the image and then uses the approximate nearest neighbor algorithm to construct a feature histogram for each image. The function then increments histogram bins based on the proximity of the descriptor to a particular cluster center. The histogram length corresponds to the number of visual words that the bagOfFeatures object constructed. The histogram becomes a feature vector for the image.
2. Repeat item 1 for each image in the training set to create the training data.
3. Evaluate the quality of the classifier. Use the imageCategoryClassifier evaluate method to test the classifier against the test image set. The output confusion matrix represents the analysis of the prediction. A perfect classification results in a normalized matrix containing 1s on the diagonal. An incorrect classification results in fractional values.
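The encoding and evaluation steps above can be sketched as follows, assuming the same sample image sets and default options as before. The variable names featureVector, confMatrix, and avgAccuracy are illustrative.

```matlab
setDir  = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
[trainingSets,testSets] = partition(imgSets,0.3,'randomize');
bag = bagOfFeatures(trainingSets);
categoryClassifier = trainImageCategoryClassifier(trainingSets, bag);

% Encode one training image as a histogram of visual words.
% The vector has one entry per visual word in the vocabulary.
img = read(trainingSets(1), 1);
featureVector = encode(bag, img);

% Evaluate on the held-out images. The confusion matrix has one row and
% one column per category, and its diagonal holds the per-category
% fraction of correct predictions.
confMatrix  = evaluate(categoryClassifier, testSets);
avgAccuracy = mean(diag(confMatrix));
```

Averaging the diagonal gives a single summary number, but the off-diagonal entries are often more informative because they show which categories are confused with each other.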
Step 4: Classify an Image or Image Set
Use the imageCategoryClassifier predict method on a new image to determine its category.
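A minimal sketch of the prediction step, assuming the same sample image sets as before. Here a test image stands in for the "new image", and the variable name predictedLabel is illustrative.

```matlab
setDir  = fullfile(toolboxdir('vision'),'visiondata','imageSets');
imgSets = imageSet(setDir,'recursive');
[trainingSets,testSets] = partition(imgSets,0.3,'randomize');
bag = bagOfFeatures(trainingSets);
categoryClassifier = trainImageCategoryClassifier(trainingSets, bag);

% Classify an image that was not used for training.
img = read(testSets(1), 1);
[labelIdx, scores] = predict(categoryClassifier, img);

% labelIdx is an index into the classifier's list of category names.
predictedLabel = categoryClassifier.Labels(labelIdx);
disp(predictedLabel);
```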
from: http://cn.mathworks.com/help/vision/ug/image-classification-with-bag-of-visual-words.html