TCNN: Modeling and Propagating CNNs in a Tree Structure for Visual Tracking
2017-10-26 18:16
arXiv 2016. Hyeonseob Nam∗, Mooyeol Baek∗, Bohyung Han
A paper from Bohyung Han's group at POSTECH in Korea, the authors of MDNet and BranchOut.
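Motivation
The motivation of this paper is to use a tree-structured, multi-model CNN to handle hard cases in tracking such as occlusion, abrupt motion, and tracking failure, and thereby improve tracking accuracy. The biggest problem for CNN-based trackers is change in target appearance, and the usual remedy is to update the model online. Online update, however, rests on the assumption that the target changes slowly; when the target is occluded or lost, the tracker learns background features and fails. The paper therefore links multiple trackers into a tree: the leaf nodes hold the most recent appearance of the target, while the root node holds its historical appearance. The score of each candidate is a weighted average over these trackers, and the candidate with the highest score is taken as the target.
Algorithm Overview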
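In the overview figure (Figure 1), the thickness of the red box around each node indicates the reliability of that node, the thickness of the line between two nodes indicates their affinity, and the thickness of a black arrow indicates the weight that node receives when estimating the target.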
Proposed Algorithm
CNN Architecture
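The network is still based on VGG-M: three convolutional layers followed by fully connected layers, as shown in the paper's architecture figure. The output is a normalized two-dimensional vector $[\phi(x), 1-\phi(x)]^T$, whose entries are the scores of candidate $x$ belonging to the target and to the background, respectively.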
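One common way to obtain a normalized output of the form $[\phi(x), 1-\phi(x)]^T$ is a softmax over two logits. The sketch below assumes that form; the function name `target_score` and the logit arguments are illustrative and do not come from the paper.

```python
import math
from typing import Tuple

def target_score(logit_target: float, logit_background: float) -> Tuple[float, float]:
    """Two-way softmax: returns (phi, 1 - phi)."""
    # Subtract the larger logit before exponentiating for numerical stability.
    m = max(logit_target, logit_background)
    e_t = math.exp(logit_target - m)
    e_b = math.exp(logit_background - m)
    phi = e_t / (e_t + e_b)
    return phi, 1.0 - phi
```

Whatever the logits are, the two entries sum to one, so a single number $\phi(x)$ is enough to rank candidates.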
Tree Construction
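The tree consists of nodes and edges. A node $v$ holds a CNN together with the frames $F_v$ used to train that CNN. Each edge carries a score $s$ that measures the affinity between the two nodes it connects: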
$$s(u,v) = \frac{1}{|F_v|} \sum_{t \in F_v} \phi_u(x_t^*)$$
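where $x_t^*$ is the target estimated in frame $t$ of $F_v$, the training frames of node $v$, and $\phi_u$ is the target score given by the CNN of the preceding node $u$.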
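The edge score $s(u,v)$ is simply a mean of target scores, which a few lines of Python make concrete. This is a minimal sketch: `edge_affinity` is a hypothetical helper, and a lookup table stands in for the CNN of node $u$.

```python
from typing import Callable, Sequence

def edge_affinity(phi_u: Callable[[str], float], targets_v: Sequence[str]) -> float:
    """Mean score that node u's CNN gives to the estimated targets x_t^* of F_v."""
    if not targets_v:
        raise ValueError("F_v must contain at least one frame")
    return sum(phi_u(x) for x in targets_v) / len(targets_v)

# Toy stand-in for phi_u: fixed scores for three estimated targets in F_v.
toy_scores = {"x1": 0.9, "x2": 0.7, "x3": 0.8}
s_uv = edge_affinity(lambda x: toy_scores[x], ["x1", "x2", "x3"])
```

A high value means that the CNN of $u$ still recognizes the targets on which $v$ was trained, so the two nodes describe similar appearances.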
Target State Estimation using Multiple CNNs
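The target state is still estimated from candidates: many candidates are sampled at random around the target position in the previous frame, and the one with the highest score is selected. The score of each candidate is a weighted average over the CNNs of the active nodes in the tree. The weight of a node (the thickness of the black arrow in Figure 1) is the minimum of the node's affinity to the current frame and the node's reliability. Note that this affinity is a different concept from the affinity used earlier as the edge score between two nodes. A node's affinity to the current frame is the highest score it gives to any candidate, and its reliability is the smallest edge score along the path from the node to the root. Finally, the weights are normalized over the nodes.
Model Update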
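Model update selects, among all leaf nodes, the one that gives the new node the largest weight and makes it the parent of the new node. The parent's CNN is then fine-tuned to obtain the parameters of the new node's CNN, and the edge weight between the new node and its parent and the reliability of the new node are computed.
Bounding Box Regression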
The paper also applies bounding box regression.
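The node weighting from the Target State Estimation section and the parent selection from the Model Update section can be sketched together on a toy tree. Everything here is illustrative: the function names, the dictionary layout, and the numbers are assumptions, and `choose_parent` assumes that the "weight" of the new node means the minimum of its affinity to a candidate parent and that parent's reliability.

```python
from typing import Dict, List, Tuple

def path_reliability(node: str, parent: Dict[str, str],
                     edge: Dict[Tuple[str, str], float]) -> float:
    """Smallest edge score on the path from `node` up to the root."""
    rel = float("inf")  # the root has no edge above it, so it never limits a weight
    while node in parent:
        p = parent[node]
        rel = min(rel, edge[(p, node)])
        node = p
    return rel

def estimate_target(scores: Dict[str, List[float]],
                    reliability: Dict[str, float]) -> Tuple[int, Dict[str, float]]:
    """scores[v][i] is the target score that node v gives to candidate i."""
    # Node weight: min(affinity to the current frame, reliability), where the
    # affinity is the node's highest score over all candidates.
    raw = {v: min(max(s), reliability[v]) for v, s in scores.items()}
    total = sum(raw.values())
    weights = {v: w / total for v, w in raw.items()}  # normalize over nodes
    n = len(next(iter(scores.values())))
    combined = [sum(weights[v] * scores[v][i] for v in scores) for i in range(n)]
    best = max(range(n), key=lambda i: combined[i])
    return best, weights

def choose_parent(affinity_to_new: Dict[str, float],
                  reliability: Dict[str, float]) -> str:
    """Assumed rule: the leaf maximizing min(affinity to the new frames, reliability)."""
    return max(affinity_to_new,
               key=lambda v: min(affinity_to_new[v], reliability[v]))

# Toy tree: a is the root, b and d are children of a, c is a child of b.
parent = {"b": "a", "c": "b", "d": "a"}
edge = {("a", "b"): 0.9, ("b", "c"): 0.6, ("a", "d"): 0.75}
reliability = {v: path_reliability(v, parent, edge) for v in "abcd"}

# Made-up scores of three candidates under each node's CNN.
scores = {
    "a": [0.2, 0.8, 0.5],
    "b": [0.3, 0.7, 0.9],
    "c": [0.95, 0.4, 0.1],
    "d": [0.1, 0.6, 0.3],
}
best, weights = estimate_target(scores, reliability)

# The leaves are c and d; the affinities to the new frames are made up as well.
new_parent = choose_parent({"c": 0.8, "d": 0.7}, reliability)
```

In this toy case node c is very confident about candidate 0, but its path to the root passes through a weak edge, so its weight is capped by its reliability and the more reliable nodes decide the outcome. The same cap is why the new node attaches to d rather than c, even though c has the higher affinity to the new frames.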