Notes on "A Review of Unsupervised Feature Learning and Deep Learning for Time-Series Modeling"

2020-02-04 06:15

A Review of Unsupervised Feature Learning and Deep Learning for Time-Series Modeling

Abstract

What this paper does:

1. Outlines the challenges of processing time-series data
2. Covers time-series data and unsupervised feature learning

1. Introduction and Background

1. Traditional approaches

Autoregressive models (AM), Linear Dynamical Systems (LDS), and Hidden Markov Models (HMM) involve only a small number of non-linear operations.

However, more complex, high-dimensional, and noisy real-world time-series data cannot be described with analytical equations with parameters, so these approaches cannot handle such data effectively.
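To make "analytical equations with parameters" concrete, here is a minimal sketch of the simplest autoregressive model, AR(1), where x[t] = a * x[t-1] + noise and the single parameter a is estimated by least squares. The function names `fit_ar1` and `simulate_ar1` and the simulated series are illustrative assumptions, not taken from the paper.

```python
import random

def fit_ar1(series):
    """Least-squares estimate of a in x[t] = a * x[t-1] + noise."""
    num = sum(prev * cur for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

def simulate_ar1(a, n, noise_std=0.1, seed=0):
    """Generate n samples of an AR(1) process (hypothetical test data)."""
    rng = random.Random(seed)
    x = [1.0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, noise_std))
    return x

series = simulate_ar1(a=0.8, n=5000)
a_hat = fit_ar1(series)  # should land close to the true value 0.8
```

A model like this works when the data really follow such an equation; the notes' point is that noisy, high-dimensional real-world series usually do not.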

2. Unsupervised feature learning

benefit 1: features are learned from the data instead of being hand-crafted

benefit 2: layers of feature representations can be stacked to create deep networks, which are more capable of modeling complex structures in the data

2. Properties of time-series data

1. Contains much noise and has high dimensionality

2. It is uncertain whether the available information is sufficient to predict the future

3. Time-dependencies

4. Non-stationary (the mean, variance, and frequency change over time)

5. Most features used for time-series need to be invariant to translations in time

The key for any successful application lies in choosing the right representation
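As an illustration of the non-stationarity property listed above, the following sketch computes the mean and variance over consecutive windows; if these statistics drift from window to window, the series is not stationary. The function `windowed_stats` and the toy series are hypothetical, chosen only for this example.

```python
from statistics import mean, pvariance

def windowed_stats(series, window):
    """Mean and population variance over consecutive non-overlapping windows."""
    stats = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats

# Hypothetical series whose level and spread both change halfway through
series = [0.0, 1.0, 0.0, 1.0, 10.0, 12.0, 10.0, 12.0]
stats = windowed_stats(series, window=4)
# stats == [(0.5, 0.25), (11.0, 1.0)]
```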

3. Unsupervised feature learning and deep learning

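Unsupervised feature learning: can exploit large amounts of unlabeled data, and can learn more features than hand-specified ones.

Unsupervised feature learning methods: RBM, Conditional RBM, Auto-encoder, RNN

Auto-encoder: originally proposed as a dimensionality reduction algorithm. It consists of a visible layer, a hidden layer, and a reconstruction layer.

A 1-layer auto-encoder for static time-series input: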
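A rough sketch of such a network, assuming a sigmoid hidden layer, a linear reconstruction layer, squared-error loss, and plain SGD. The class `AutoEncoder`, the helper `make_windows`, the layer sizes, and the sine-wave input are illustrative assumptions and not the paper's setup.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_windows(series, width):
    """Slice a series into overlapping windows used as static inputs."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]

class AutoEncoder:
    def __init__(self, n_visible, n_hidden, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_visible)]
                   for _ in range(n_hidden)]   # visible -> hidden
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_visible)]  # hidden -> reconstruction
        self.b2 = [0.0] * n_visible

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sum(w * hj for w, hj in zip(row, h)) + b
                for row, b in zip(self.W2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error per window."""
        total = 0.0
        for x in data:
            r = self.decode(self.encode(x))
            total += sum((ri - xi) ** 2 for ri, xi in zip(r, x))
        return total / len(data)

    def train_step(self, x, lr):
        h = self.encode(x)
        r = self.decode(h)
        err = [ri - xi for ri, xi in zip(r, x)]
        # Gradient at the hidden pre-activations, computed before W2 changes
        dh = [sum(err[i] * self.W2[i][j] for i in range(len(x))) * h[j] * (1.0 - h[j])
              for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                self.W2[i][j] -= lr * err[i] * h[j]
            self.b2[i] -= lr * err[i]
        for j in range(len(h)):
            for i in range(len(x)):
                self.W1[j][i] -= lr * dh[j] * x[i]
            self.b1[j] -= lr * dh[j]

series = [math.sin(0.3 * t) for t in range(200)]
windows = make_windows(series, width=8)
ae = AutoEncoder(n_visible=8, n_hidden=3)
loss_before = ae.loss(windows)
for _ in range(50):
    for x in windows:
        ae.train_step(x, lr=0.05)
loss_after = ae.loss(windows)  # expected to be lower than loss_before
```

Each window is treated as an independent static vector, so the model sees no context beyond the window width.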

RNN (trained by backpropagation-through-time, BPTT)

Long short-term memory (LSTM) cell: better at finding long-term dependencies
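To show why a gated cell can hold on to information over many steps, here is a minimal sketch of one step of a scalar LSTM cell using the standard gate equations. The parameters are set by hand for illustration (not learned), and the name `lstm_step` and the parameter layout are assumptions of this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b
    i = sigmoid(pre("input"))    # how much new content to write
    f = sigmoid(pre("forget"))   # how much of the old cell state to keep
    o = sigmoid(pre("output"))   # how much of the cell state to expose
    g = math.tanh(pre("cand"))   # candidate content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hand-set parameters: forget gate almost fully open, input gate almost closed
params = {
    "input": (0.0, 0.0, -10.0),
    "forget": (0.0, 0.0, 10.0),
    "output": (0.0, 0.0, 10.0),
    "cand": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for x in [0.5, -0.3, 0.9] * 10:   # 30 time steps of unrelated input
    h, c = lstm_step(x, h, c, params)
# c stays close to its initial value 1.0 despite 30 intervening inputs
```

In a trained network the gates are functions of the input and hidden state, so the cell can decide when to keep and when to overwrite its memory.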

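3.6 Deep learning

Greedy layer-wise training: mitigates vanishing gradients, helps avoid poor local optima, and provides a better starting point than random initialization.

Unsupervised pre-training: train the first hidden layer of the network, then the second, and so on; finally, use the trained parameter values as the initial values for the parameters of the whole network.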
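The layer-by-layer procedure can be sketched as follows, assuming each layer is pre-trained as a small tied-weight auto-encoder with squared-error loss (a simplification chosen here; the paper also covers RBM-based pre-training). The names `train_autoencoder_layer` and `greedy_pretrain`, the layer sizes, and the toy data are all illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(layer, x):
    """Hidden activations of one layer; layer is (W, b)."""
    W, b = layer
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def train_autoencoder_layer(data, n_hidden, epochs=20, lr=0.05, seed=0):
    """Fit one tied-weight auto-encoder by SGD and return its encoder (W, b)."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden   # hidden biases
    c = [0.0] * n_in       # reconstruction biases
    for _ in range(epochs):
        for x in data:
            h = encode((W, b), x)
            r = [sum(W[j][i] * h[j] for j in range(n_hidden)) + c[i]
                 for i in range(n_in)]
            err = [ri - xi for ri, xi in zip(r, x)]
            dh = [sum(err[i] * W[j][i] for i in range(n_in)) * h[j] * (1.0 - h[j])
                  for j in range(n_hidden)]
            for j in range(n_hidden):
                for i in range(n_in):
                    # decoder-path gradient plus encoder-path gradient (tied weights)
                    W[j][i] -= lr * (err[i] * h[j] + dh[j] * x[i])
                b[j] -= lr * dh[j]
            for i in range(n_in):
                c[i] -= lr * err[i]
    return W, b

def greedy_pretrain(data, layer_sizes):
    """Train layers one at a time; each layer learns from the previous layer's features."""
    layers, current = [], data
    for k, n_hidden in enumerate(layer_sizes):
        layer = train_autoencoder_layer(current, n_hidden, seed=k)
        layers.append(layer)
        current = [encode(layer, x) for x in current]
    return layers

series = [math.sin(0.3 * t) for t in range(60)]
data = [series[i:i + 6] for i in range(len(series) - 5)]
layers = greedy_pretrain(data, layer_sizes=[4, 2])
# layers now holds initial weights for a 6-4-2 network, ready for fine-tuning
```

The returned weights would then serve as the initialization for training the whole network, which is the step the notes describe last.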

3.7 Convolution and pooling

A technique that is particularly interesting for high-dimensional data, such as images and time-series data, is convolution.

Key property: hidden units are not fully connected to the input layer but are instead locally connected, as shown in Figure 5.

Applications of convolution: applied to RBMs and auto-encoders, it yields convRBM and convAE. The Time-Delay Neural Network (TDNN) exploits the time structure of the input by performing convolutions on overlapping windows.
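A minimal sketch of convolution over overlapping windows followed by max pooling on a 1-D series. The same small kernel is applied at every position, so each output depends only on a local stretch of the input. The function names, the kernel, and the toy series are illustrative assumptions.

```python
def conv1d(series, kernel):
    """Valid 1-D convolution (cross-correlation form, as in most deep learning code)."""
    k = len(kernel)
    return [sum(kernel[j] * series[i + j] for j in range(k))
            for i in range(len(series) - k + 1)]

def max_pool(features, size):
    """Non-overlapping max pooling."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

series = [0, 0, 1, 0, 0, 0, 1, 0]
edges = conv1d(series, [-1, 1])   # responds to a rise between neighbouring samples
pooled = max_pool(edges, 2)
# edges  == [0, 1, -1, 0, 0, 1, -1]
# pooled == [1, 0, 1]
```

Pooling keeps the strongest response in each region, which makes the resulting features less sensitive to small shifts in time.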

3.8 Temporal coherence

3.9 Hidden Markov Model (HMM)

3.10. Summary

Temporal relation: whether the model can capture temporal relations

Memory: how many steps back in time an input has an influence on the current frame

Generative: indicates whether the model is generative

Questions to consider when choosing a model:

(1) Generative or discriminative? (A generative model is preferred if the trained model should be used for synthesizing new data or prediction tasks where partial input data (data at t + 1) need to be reconstructed; for classification tasks, a discriminative model is sufficient.)

(2) What are the properties of the data?

(3) How large is the input?

If the data has a temporal structure it is not recommended to treat the input data as a feature vector since this will discard the temporal information.

4. Classical time-series problems

4.1 Videos

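Video data are series of images over time (spatio-temporal data) and can therefore be viewed as high-dimensional time-series data.

The use of deep learning, feature learning, and convolution with pooling has driven progress in video processing. Because deep learning has proven successful at building useful features from static images, modeling video streams is a natural continuation for these algorithms.

Open problem: time-dependencies can only be modeled across a few frames.

Outlook: look at models that can learn longer time-dependencies.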

4.2 Stock market prediction

Characteristics of stock market data: nonlinear, uncertain, non-stationary

Efficient Market Hypothesis (EMH)

Under the EMH, prices follow a random walk pattern with the same probability of going up as going down (consequence: predictions cannot have more than 50% accuracy).
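The 50% ceiling can be illustrated with a small simulation, assuming price moves are independent fair coin flips (the idealized random walk, not real market data). The persistence predictor below, which guesses that the next move repeats the last one, is a hypothetical baseline chosen for this example.

```python
import random

def directional_accuracy(n_steps, seed=0):
    """Accuracy of 'next move equals last move' on a simulated random walk."""
    rng = random.Random(seed)
    moves = [rng.choice([-1, 1]) for _ in range(n_steps)]
    hits = sum(1 for prev, cur in zip(moves, moves[1:]) if prev == cur)
    return hits / (n_steps - 1)

acc = directional_accuracy(100_000)  # hovers around 0.5
```

Any predictor that only looks at past moves faces the same limit under this assumption, since the next move is independent of the history.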

4.3. Speech recognition

The raw input data is single channel and highly time and frequency dependent.

Sub-problems: speaker identification, gender identification, speech-to-text, acoustic modeling

Main methods: Hidden Markov Models (HMMs), RBM, Time-Delay Neural Network (TDNN, 1989), deep Long Short-Term Memory Recurrent Neural Network (LSTM, 2013)

4.4. Music recognition

The motivation for using deep networks is that music itself is structured hierarchically by a combination of chords, melodies and rhythms that creates motives, phrases, sections and finally entire pieces.

Even though convolutional networks have given good results on time-frequency representations of audio, there is room for discovering new and better models.

4.5. Motion capture data

A motivation for using deep learning algorithms for motion capture data is that it has been suggested that human motion is composed of elementary building blocks (motion templates) and any complex motion is constructed from a library of these previously learned motion templates. Deep networks can, in an unsupervised manner, learn these motion templates from raw data and use them to form complex human motions.

4.6. Electronic nose data

…

4.7. Physiological data

Examples include EEG (electroencephalography), MEG (magnetoencephalography), ECG (electrocardiography), and sensor data used for health monitoring.

The use of a feature learning algorithm is particularly beneficial in medical applications because acquiring a labeled medical data set is expensive since the data sets are often very large and require the labeling of an expert in the field.

Recent works show that DBNs can be applied to raw physiological data to effectively learn relevant features.

Independent Component Analysis (ICA) has proven to be a new tool to analyze time series.

Self-taught learning has been used with time-series data from wearable hand-motion sensors.

4.8. Summary

The fifth column (Common method) of the paper's summary table reports the most commonly used method(s), or current state-of-the-art, for each time-series problem.

Stock prediction: progress has stopped at classical neural networks. The current state-of-the-art augments the stock data with additional information (it has been shown that predicted stock prices can be improved if further information is extracted from online social media, such as Twitter feeds (Bollen et al., 2011) and online chat activity (Gruhl et al., 2005)).

High-dimensional temporal data: for high-dimensional temporal data such as video and music recognition, the convolutional version of the RBM has been successful.

Speech recognition: RBM, RNN (the current state-of-the-art is achieved with an RNN)

E-nose data: single layer neural networks with temporal capabilities have been used, and the use of deep networks is an interesting future direction for modeling e-nose data.

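5. Conclusion

Unsupervised feature learning and deep learning techniques have been successfully applied in many domains. Deep learning holds great promise for time-series data.

The problem with treating time-series data as static input is that the importance of time in the data is not recognized.
As with static data, processing time-series data faces many challenges, such as handling high-dimensional observations and non-linear relationships between variables.
Ignoring time and simply applying static-data models to time-series data discards much of the structural information. With a static model, the context of the current input is lost, and only time-dependencies within the input size can be captured.
The choice of model, and the form in which the data should be presented to it, depends heavily on the type of data.
For a given model, there are many possible choices of connectivity, architecture, and hyperparameters.
Once configured and trained, deep learning methods outperform shallow approaches on time-series data.
Another possible future direction: building models that can change their internal structure during learning, in order to capture both short-term and long-term time dependencies.
Further research: algorithms for time-series data that learn better features and are easier and faster to train.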

Posted by nero2018