Dense Scale Invariant Feature Transform (DSIFT)
2015-12-16 21:22
Table of Contents
Overview
Usage
Technical details
Dense descriptors
Sampling
Authors: Andrea Vedaldi, Brian Fulkerson
dsift.h implements a dense version of SIFT.
This is an object that can quickly compute descriptors for densely sampled keypoints with identical size and orientation. It can be reused for multiple images of the same size.
Overview
See also: The SIFT module, Technical details
This module implements a fast algorithm for computing a large number of SIFT descriptors of densely sampled features of the same scale and orientation. See the SIFT module for an overview of SIFT.
The feature frames (keypoints) are indirectly specified by the sampling steps (vl_dsift_set_steps) and the sampling bounds (vl_dsift_set_bounds). The descriptor geometry (number and size of the spatial bins and number of orientation bins) can be customized (vl_dsift_set_geometry, VlDsiftDescriptorGeometry).
[Figure: Dense SIFT descriptor geometry]
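As a rough illustration of how the geometry parameters determine the descriptor, the sketch below computes the number of values in one descriptor and the pixel span of its bin centers. GeometrySketch is a hypothetical stand-in written for this example, not the library's VlDsiftDescriptorGeometry type, and the helper names are assumptions.

```c
#include <assert.h>

/* Hypothetical stand-in for the geometry parameters described above
   (number and size of the spatial bins, number of orientation bins).
   This is NOT the library's own structure. */
typedef struct {
  int numBinT;   /* number of orientation bins */
  int numBinX;   /* number of spatial bins along x */
  int numBinY;   /* number of spatial bins along y */
  int binSizeX;  /* spatial bin size along x, in pixels */
  int binSizeY;  /* spatial bin size along y, in pixels */
} GeometrySketch;

/* One histogram value per (orientation bin, x bin, y bin) triple. */
static int descriptor_dimension(const GeometrySketch *g)
{
  assert(g->numBinT > 0 && g->numBinX > 0 && g->numBinY > 0);
  return g->numBinT * g->numBinX * g->numBinY;
}

/* Distance in pixels between the first and last bin centers along x. */
static int bin_center_span_x(const GeometrySketch *g)
{
  assert(g->numBinX > 0 && g->binSizeX > 0);
  return (g->numBinX - 1) * g->binSizeX;
}
```

With 8 orientation bins and a 4 x 4 spatial grid, descriptor_dimension gives 128, the familiar SIFT descriptor length.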
By default, SIFT uses a Gaussian windowing function that discounts contributions of gradients further away from the descriptor centers. This function can be changed to a flat window by invoking vl_dsift_set_flat_window. In this case, gradients are accumulated using only bilinear interpolation, but instead of being reweighted by a Gaussian window, they are all weighted equally. However, after gradients have been accumulated into a spatial bin, the whole bin is reweighted by the average of the Gaussian window over the spatial support of that bin. This “approximation” substantially improves speed with little or no loss of performance in applications.
Keypoints are sampled in such a way that the centers of the spatial bins are at integer coordinates within the image boundaries. For instance, the top-left bin of the top-left descriptor is centered on the pixel (0,0). The bin immediately to the right is centered at (binSizeX,0), where binSizeX is a parameter in the VlDsiftDescriptorGeometry structure. vl_dsift_set_bounds can be used to further restrict sampling to keypoints within a rectangular region of the image.
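The placement of bin centers described above can be sketched in a few lines. The function and type names here are hypothetical and written only for illustration. The sketch assumes the upper-left bin center of a descriptor is given in integer pixel coordinates.

```c
#include <assert.h>

/* Hypothetical helper type for this sketch: an integer pixel position. */
typedef struct { int x; int y; } PixelSketch;

/* Center of spatial bin (i, j) of a descriptor whose upper-left bin is
   centered at (originX, originY). Because the origin and the bin sizes
   are integers, every bin center falls on a pixel. */
static PixelSketch bin_center(int originX, int originY,
                              int binSizeX, int binSizeY,
                              int i, int j)
{
  assert(binSizeX > 0 && binSizeY > 0 && i >= 0 && j >= 0);
  PixelSketch p = { originX + i * binSizeX, originY + j * binSizeY };
  return p;
}
```

For the top-left descriptor (origin (0,0)), bin (1,0) lands on (binSizeX,0), matching the example in the text.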
Usage
DSIFT is implemented by a VlDsiftFilter object that can be used to process a sequence of images of a given geometry. To use the DSIFT filter:
Initialize a new DSIFT filter object by vl_dsift_new (or the simplified vl_dsift_new_basic).
Customize the descriptor parameters by vl_dsift_set_steps, vl_dsift_set_geometry, etc.
Process an image by vl_dsift_process.
Retrieve the number of keypoints (vl_dsift_get_keypoint_num), the keypoints (vl_dsift_get_keypoints), and their descriptors (vl_dsift_get_descriptors).
Optionally repeat for more images.
Delete the DSIFT filter by vl_dsift_delete.
Technical details
This section extends the SIFT descriptor section and specializes it to the case of dense keypoints.
Dense descriptors
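When computing descriptors for many keypoints differing only by their position (and with null rotation), further simplifications are possible. In this case, in fact,

$$
\begin{aligned}
\mathbf{x} &= m\sigma\hat{\mathbf{x}} + T, \\
h(t,i,j) &= m\sigma \int g_{\sigma_\mathrm{win}}(\mathbf{x} - T)\, w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\, w\!\left(\frac{x - T_x}{m\sigma} - \hat{x}_i\right) w\!\left(\frac{y - T_y}{m\sigma} - \hat{y}_j\right) |J(\mathbf{x})|\, d\mathbf{x}.
\end{aligned}
$$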
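Since many different values of $T$ are sampled, this is conveniently expressed as a separable convolution. First, we translate by $\mathbf{x}_{ij} = m\sigma(\hat{x}_i,\ \hat{y}_j)^\top$ and we use the symmetry of the various binning and windowing functions to write

$$
\begin{aligned}
h(t,i,j) &= m\sigma \int g_{\sigma_\mathrm{win}}(T' - \mathbf{x} - \mathbf{x}_{ij})\, w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\, w\!\left(\frac{T'_x - x}{m\sigma}\right) w\!\left(\frac{T'_y - y}{m\sigma}\right) |J(\mathbf{x})|\, d\mathbf{x}, \\
T' &= T + m\sigma \begin{bmatrix} x_i \\ y_j \end{bmatrix}.
\end{aligned}
$$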
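Then we define kernels

$$
\begin{aligned}
k_i(x) &= \frac{1}{\sqrt{2\pi}\,\sigma_\mathrm{win}} \exp\!\left(-\frac{1}{2}\frac{(x - x_i)^2}{\sigma_\mathrm{win}^2}\right) w\!\left(\frac{x}{m\sigma}\right), \\
k_j(y) &= \frac{1}{\sqrt{2\pi}\,\sigma_\mathrm{win}} \exp\!\left(-\frac{1}{2}\frac{(y - y_j)^2}{\sigma_\mathrm{win}^2}\right) w\!\left(\frac{y}{m\sigma}\right),
\end{aligned}
$$

and obtain

$$
\begin{aligned}
h(t,i,j) &= (k_i k_j * \bar{J}_t)\!\left(T + m\sigma \begin{bmatrix} x_i \\ y_j \end{bmatrix}\right), \\
\bar{J}_t(\mathbf{x}) &= w_\mathrm{ang}(\angle J(\mathbf{x}) - \theta_t)\, |J(\mathbf{x})|.
\end{aligned}
$$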
Furthermore, if we use a flat rather than Gaussian windowing function, the kernels do not depend on the bin, and we have

$$
\begin{aligned}
k(z) &= \frac{1}{\sigma_\mathrm{win}}\, w\!\left(\frac{z}{m\sigma}\right), \\
h(t,i,j) &= (k(x)k(y) * \bar{J}_t)\!\left(T + m\sigma \begin{bmatrix} x_i \\ y_j \end{bmatrix}\right),
\end{aligned}
$$

(here $\sigma_\mathrm{win}$ is the side of the flat window).
Note: In this case the binning functions $k(z)$ are triangular, and the convolution can be computed in time independent of the filter (i.e. descriptor bin) support size by using integral signals.
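One way to see why a triangular kernel admits a cost independent of its support is that a discrete triangle is the convolution of two box filters, and each box filter reduces to a difference of running sums (integral signals). The 1-D sketch below compares a direct triangular convolution with that running-sum version. It is an illustration under assumptions, not necessarily how the library does it: the triangle is taken as w(u) = max(0, 1 - |u|), the integer bin size is called b, the signal is zero outside its range, and the output is scaled by b (weights b - |z|). All function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Direct triangular convolution: cost grows with the bin size b.
   out[i] = sum over |z| < b of (b - |z|) * in[i + z], zero outside [0, n). */
static void tri_direct(const double *in, double *out, int n, int b)
{
  assert(n > 0 && b > 0);
  for (int i = 0; i < n; ++i) {
    double acc = 0.0;
    for (int z = -(b - 1); z <= b - 1; ++z) {
      int j = i + z;
      if (j < 0 || j >= n) continue;
      acc += (double)(b - abs(z)) * in[j];
    }
    out[i] = acc;
  }
}

/* Same result from two box filters, each evaluated as a difference of
   running sums, so the cost per sample does not depend on b.
   Returns 0 on success, -1 if memory allocation fails. */
static int tri_integral(const double *in, double *out, int n, int b)
{
  assert(n > 0 && b > 0);
  int m = n + b - 1;                      /* length of the first box output */
  double *P = calloc((size_t)n + 1, sizeof *P);
  double *Q = calloc((size_t)m + 1, sizeof *Q);
  if (P == NULL || Q == NULL) { free(P); free(Q); return -1; }

  /* P[k] = in[0] + ... + in[k-1] */
  for (int i = 0; i < n; ++i) P[i + 1] = P[i] + in[i];

  /* First box: sum of in over [s, s + b), for s = -(b-1), ..., n-1,
     accumulated directly into the second running sum Q. */
  for (int u = 0; u < m; ++u) {
    int s = u - (b - 1);
    int lo = s < 0 ? 0 : s;
    int hi = s + b > n ? n : s + b;
    Q[u + 1] = Q[u] + (P[hi] - P[lo]);
  }

  /* Second box: sum of b consecutive first-box outputs. */
  for (int i = 0; i < n; ++i) out[i] = Q[i + b] - Q[i];

  free(P);
  free(Q);
  return 0;
}
```

Dividing the output by b recovers the unit-height triangle; any further normalization (such as the 1/σwin factor in k(z)) is a constant scale applied afterwards.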
Sampling
To avoid resampling and dealing with special boundary conditions, we impose some mild restrictions on the geometry of the descriptors that can be computed. In particular, we impose that the bin centers $T + m\sigma(x_i,\ y_j)$ are always at integer coordinates within the image boundaries. This eliminates the need for costly interpolation. This condition amounts to (expressed in terms of the $x$ coordinate, and equally applicable to $y$)

$$
\{0,\dots,W-1\} \ni T_x + m\sigma x_i = T_x + m\sigma i - \frac{N_x - 1}{2} = \bar{T}_x + m\sigma i, \qquad i = 0,\dots,N_x-1.
$$
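Notice that for this condition to be satisfied, the descriptor center $T_x$ needs to be either fractional or integer depending on $N_x$ being even or odd. To eliminate this complication, it is simpler to use as a reference not the descriptor center $T$, but the coordinates of the upper-left bin $\bar{T}$. Thus we sample the latter on a regular (integer) grid

$$
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\le
\bar{T} = \begin{bmatrix} \bar{T}^{\min}_x + p\Delta_x \\ \bar{T}^{\min}_y + q\Delta_y \end{bmatrix}
\le
\begin{bmatrix} W - 1 - m\sigma N_x \\ H - 1 - m\sigma N_y \end{bmatrix},
\qquad
\bar{T} = \begin{bmatrix} T_x - \frac{N_x - 1}{2} \\ T_y - \frac{N_y - 1}{2} \end{bmatrix}
$$

and we impose that the bin size $m\sigma$ is integer as well.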
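To make the grid concrete, the sketch below counts the admissible positions of the upper-left bin along one axis and returns the coordinate of the p-th one. It follows the inequality exactly as written in the text. The bounds used by the library itself may differ. The function names are hypothetical; the parameters size, binSize, numBins, tMin and step stand for W, mσ, N_x, the minimum upper-left coordinate and Δ_x respectively.

```c
#include <assert.h>

/* Number of integers p >= 0 such that
     0 <= tMin + p * step <= size - 1 - binSize * numBins,
   i.e. the grid positions allowed along one axis by the inequality
   as stated in the text. */
static int num_grid_positions(int size, int binSize, int numBins,
                              int tMin, int step)
{
  assert(size > 0 && binSize > 0 && numBins > 0 && tMin >= 0 && step > 0);
  int tMax = size - 1 - binSize * numBins;
  if (tMax < tMin) return 0;            /* descriptor does not fit */
  return (tMax - tMin) / step + 1;
}

/* Coordinate of the p-th upper-left bin center along one axis. */
static int grid_position(int tMin, int step, int p)
{
  assert(tMin >= 0 && step > 0 && p >= 0);
  return tMin + p * step;
}
```

For a 100-pixel-wide image with 4 bins of 4 pixels and a step of 8 starting at 0, this yields 11 positions, at 0, 8, ..., 80. The full 2-D grid is the product of the per-axis grids.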
from: http://www.vlfeat.org/api/dsift.html