Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network
2018-07-27 14:25:26
Paper: https://arxiv.org/pdf/1807.06233.pdf
Related Papers:
1. Infrared and visible image fusion methods and applications: A survey Paper
2. Chenglong Li, Xiao Wang, Lei Zhang, Jin Tang, Hejun Wu, and Liang Lin. WELD: Weighted Low-rank Decomposition for Robust Grayscale-Thermal Foreground Detection. IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 27(4): 725-738, 2017. [Project page with Dataset and Code]
3. Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, and Jin Tang. RGB-T Object Tracking: Benchmark and Baseline. [arXiv] [Dataset: Google drive, Baidu cloud] [Project page]
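This paper addresses multi-modal fusion and proposes a gate-based fusion strategy that adaptively fuses information from multiple modalities. The authors apply the method to object detection; the overall pipeline is shown in the figure below:
As the figure above shows, the authors use two network branches to extract features from the two modalities. Each branch consists of a standard VGG-16 plus 8 extra convolutional layers. The authors also propose a new GIF (Gated Information Fusion) network that fuses information across the modalities to get better results. The motivation is that the modalities carry complementary information, but not all of it is equally useful: some inputs help a lot, while others are of poor quality and contribute little. Fusing them adaptively therefore gives better results.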
Gated Information Fusion Network (GIF):
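As shown in the figure above:
The input to the GIF network is the CNN feature maps that have already been extracted, here $F_1$ and $F_2$. The two features are concatenated to give $F_G$. The network has two parts:
1. information fusion network (Figure 2, the part outside the dashed box);
2. weight generation network (WG Network; Figure 2, the part inside the dashed box);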
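The Weight Generation Network applies two 3×3×1 convolution kernels to the concatenated feature map $F_G$ and passes the results through a sigmoid function (the gate layer), which outputs the corresponding weights $w_1$ and $w_2$.
The information fusion network multiplies each original feature map element-wise by its weight to get the weighted feature maps, concatenates the two, and applies a 1×1×2k convolution kernel to obtain the final feature map.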
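The two steps above can be sketched in NumPy as follows. This is only an illustration under assumptions, not the authors' implementation: the function names (`conv3x3_single`, `gif_fuse`), the `(channels, height, width)` layout, zero "same" padding, the number of output channels of the final 1×1 convolution, and the random inputs are all chosen here for the example. Biases and any activation after the final convolution are left out because these notes do not mention them.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv3x3_single(x, kernel):
    """Zero-padded 3x3 convolution (cross-correlation) to ONE output map.

    x: (C, H, W) feature map, kernel: (C, 3, 3) -> returns (H, W).
    """
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad H and W with zeros
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            # Shifted window times one kernel tap, summed over channels.
            window = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("chw,c->hw", window, kernel[:, dy, dx])
    return out


def gif_fuse(f1, f2, k_w1, k_w2, k_out):
    """Gated fusion of two feature maps.

    f1, f2: (k, H, W); k_w1, k_w2: (2k, 3, 3); k_out: (out_ch, 2k).
    Returns the fused map (out_ch, H, W) and the two gate maps (H, W).
    """
    f_g = np.concatenate([f1, f2], axis=0)            # F_G: (2k, H, W)
    w1 = sigmoid(conv3x3_single(f_g, k_w1))           # gate for modality 1
    w2 = sigmoid(conv3x3_single(f_g, k_w2))           # gate for modality 2
    # Each gate map is broadcast over the channels of its feature map.
    f_f = np.concatenate([f1 * w1, f2 * w2], axis=0)  # (2k, H, W)
    fused = np.einsum("oc,chw->ohw", k_out, f_f)      # 1x1 convolution
    return fused, w1, w2


# Usage with random stand-ins for the extracted CNN features (k = 4).
rng = np.random.default_rng(0)
feat1 = rng.standard_normal((4, 5, 6))
feat2 = rng.standard_normal((4, 5, 6))
fused, gate1, gate2 = gif_fuse(
    feat1,
    feat2,
    rng.standard_normal((8, 3, 3)),
    rng.standard_normal((8, 3, 3)),
    rng.standard_normal((4, 8)),
)
print(fused.shape)  # (4, 5, 6)
```

Because each gate is a sigmoid output, every weight lies between 0 and 1 and varies per spatial location, so a modality can be down-weighted where its features are unreliable and kept where they are informative.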
To summarize, the whole process can be written as:
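Based on the steps described above, the process can be written roughly as follows. This is a reconstruction from the description in these notes, not necessarily the paper's exact notation: the kernel names $K_1$, $K_2$ (the two 3×3×1 kernels) and $K_J$ (the 1×1×2k kernel), as well as $F_F$ and $F_J$, are assumed here; $*$ denotes convolution, $\odot$ element-wise multiplication, and $\sigma$ the sigmoid. Bias terms are omitted.

```latex
\begin{aligned}
F_G &= \operatorname{concat}(F_1, F_2) \\
w_1 &= \sigma(K_1 * F_G), \qquad w_2 = \sigma(K_2 * F_G) \\
F_F &= \operatorname{concat}(w_1 \odot F_1,\; w_2 \odot F_2) \\
F_J &= K_J * F_F
\end{aligned}
```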
== Done !