TCN: TEMPORAL CONVOLUTIONAL NETWORKS


Reposted from: Raushan Roy, TEMPORAL CONVOLUTIONAL NETWORKS

Learning sequences efficiently and effectively

Until recently, the default choice for sequence modeling tasks was the RNN, thanks to its ability to capture temporal dependencies in sequential data. Variants of the RNN such as the LSTM and GRU can capture long-term dependencies in sequences and have achieved state-of-the-art performance on seq2seq modeling tasks.

Recently, deep learning practitioners have been using a variation of the convolutional neural network architecture for sequence modeling tasks: the Temporal Convolutional Network (TCN). This is a simple descriptive term that refers to a family of architectures.

Google DeepMind published the paper Pixel Recurrent Neural Networks, in which the authors introduced the idea of using a CNN as a sequence model with a fixed dependency range by means of masked convolutions: the network convolves only over elements from the current timestep or earlier in the previous layer. Building on this idea, Google DeepMind published another paper, Neural Machine Translation in Linear Time, which achieved state-of-the-art performance on an English-to-German translation task, surpassing recurrent networks with attentional pooling.

The distinguishing characteristics of TCNs are:

  • The architecture can take a sequence of any length and map it to an output sequence of the same length, just as with an RNN.
  • The convolutions in the architecture are causal, meaning that there is no information “leakage” from future to past.

To accomplish the first point, the TCN uses a 1D fully-convolutional network (FCN) architecture, in which each hidden layer is the same length as the input layer, and zero padding of length (kernel size − 1) is added to keep subsequent layers the same length as previous ones. To achieve the second point, the TCN uses causal convolutions: convolutions where the output at time t is computed only from elements at time t and earlier in the previous layer.

To put it simply: TCN = 1D FCN + causal convolutions
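
To make this concrete, here is a minimal sketch of a causal convolution layer in PyTorch. The module name CausalConv1d and all shapes are illustrative assumptions, not code from the paper; the only idea it demonstrates is that left-padding the time axis by (kernel size − 1) × dilation keeps the output the same length as the input while preventing any output at time t from seeing inputs after t.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1D convolution that only looks at the present and the past."""
    def __init__(self, in_channels, out_channels, kernel_size, dilation=1):
        super().__init__()
        # Left-pad only, so output[t] never depends on input[> t].
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels,
                              kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))  # pad on the left of the time axis
        return self.conv(x)

x = torch.randn(1, 16, 100)            # batch of 1, 16 channels, 100 time steps
y = CausalConv1d(16, 32, kernel_size=3)(x)
print(y.shape)                         # torch.Size([1, 32, 100]) -- same length
```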

This basic architecture can only look back at a history whose size is linear in the depth of the network and the filter size: with filter size k, each additional layer extends the receptive field by only k − 1 steps, so covering a history of 1,000 steps with k = 3 would require roughly 500 layers. Capturing long-term dependencies therefore becomes challenging. One simple solution to this problem is to use dilated convolutions.

Dilated Convolutions

For a 1-D sequence input x ∈ ℝⁿ and a filter f : {0, …, k−1} → ℝ, the dilated convolution operation F on element s of the sequence is defined as (formula from the paper):

$$F(s) = (x *_{d} f)(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i}$$

where d is the dilation factor and k is the filter size.
In the above formula, the causal-convolution criterion is also satisfied: since only non-negative values of i are used, only elements at time s or earlier are touched. It is also evident that an ordinary convolution is simply a dilated convolution with d = 1. For more on dilated convolutions, you can refer to this blog.
[Figure from the paper: a dilated causal convolution with dilation factors d = 1, 2, 4 and filter size k = 3]
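
As a sanity check on the formula above, the following sketch evaluates F(s) directly and compares it against PyTorch's built-in dilated convolution. The variable names are illustrative; note that conv1d actually computes a cross-correlation, so the filter must be flipped to match the convolution as written.

```python
import torch
import torch.nn.functional as F

k, d = 3, 4                       # filter size and dilation factor
x = torch.randn(100)              # 1-D input sequence
f = torch.randn(k)                # filter f: {0, ..., k-1} -> R

# Direct evaluation of F(s) = sum_{i=0}^{k-1} f(i) * x[s - d*i],
# valid for positions s where every index s - d*i is in range.
s = 50
direct = sum(f[i] * x[s - d * i] for i in range(k))

# The same value via conv1d: flip the filter because conv1d
# cross-correlates rather than convolves.
out = F.conv1d(x.view(1, 1, -1), f.flip(0).view(1, 1, -1), dilation=d)
print(torch.allclose(direct, out[0, 0, s - d * (k - 1)]))  # True
```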

Advantages of using TCNs for sequence modeling

  • Parallelism: Unlike in RNNs where the predictions for later time-steps must wait for their predecessors to complete, convolutions can be done in parallel since the same filter is used in each layer. Therefore, in both training and evaluation, a long input sequence can be processed as a whole in TCN, instead of sequentially as in RNN.
  • Flexible receptive field size: A TCN can change its receptive field size in multiple ways. For instance, stacking more dilated (causal) convolutional layers, using larger dilation factors, or increasing the filter size are all viable options (with possibly different interpretations). TCNs thus afford better control of the model’s memory size, and are easy to adapt to different domains (a worked receptive-field computation follows this list).
  • Low memory requirement for training: Especially in the case of a long input sequence, LSTMs and GRUs can easily use up a lot of memory to store the partial results for their multiple cell gates. However, in a TCN the filters are shared across a layer, with the back-propagation path depending only on network depth. Therefore in practice, gated RNNs are likely to use up to a multiplicative factor more memory than TCNs.
  • Capturing local information: The convolution operation captures local patterns alongside temporal information.
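
To make the receptive-field point concrete, here is the standard computation for a stack of n dilated causal layers with filter size k and per-layer dilation factors d_i (a routine calculation, not a formula from the paper):

$$R = 1 + \sum_{i=0}^{n-1} (k - 1)\, d_i, \qquad \text{and with } d_i = 2^{i}: \quad R = 1 + (k - 1)\,(2^{n} - 1)$$

For example, with k = 3 and n = 8 dilated layers this gives R = 1 + 2 · 255 = 511 time steps, versus only 1 + 8 · 2 = 17 when every d_i = 1.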

Parting Words:

Backpropagation through a dilated convolution is nothing but a transposed convolution operation.
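
This claim is easy to verify numerically: in PyTorch, the gradient of a dilated conv1d with respect to its input equals a conv_transpose1d of the upstream gradient with the same weights and dilation. The snippet below is a small illustrative check under those assumptions, not code from the paper.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 20, requires_grad=True)
w = torch.randn(1, 1, 3)

y = F.conv1d(x, w, dilation=2)        # forward pass: dilated convolution
g = torch.randn_like(y)               # some upstream gradient
y.backward(g)

# The input gradient is exactly a transposed convolution of g:
manual = F.conv_transpose1d(g, w, dilation=2)
print(torch.allclose(x.grad, manual))  # True
```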
Since a TCN’s receptive field depends on the network depth n as well as the filter size k and dilation factor d, stabilizing deeper and larger TCNs becomes important. The TCN therefore uses a generic residual module in place of a plain convolutional layer.
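
Below is a minimal sketch of such a residual block, following the structure described in Bai et al. (2018): two dilated causal convolutions with weight normalization, ReLU, and dropout, plus a 1×1 convolution on the identity path when the input and output channel counts differ. Module and variable names, and the left-padding detail, are illustrative assumptions rather than the paper’s reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import weight_norm

class TemporalBlock(nn.Module):
    """Residual block of two dilated causal convolutions."""
    def __init__(self, n_in, n_out, kernel_size, dilation, dropout=0.2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad to stay causal
        self.conv1 = weight_norm(
            nn.Conv1d(n_in, n_out, kernel_size, dilation=dilation))
        self.conv2 = weight_norm(
            nn.Conv1d(n_out, n_out, kernel_size, dilation=dilation))
        self.dropout = nn.Dropout(dropout)
        # 1x1 conv so the residual addition works when n_in != n_out
        self.downsample = nn.Conv1d(n_in, n_out, 1) if n_in != n_out else None

    def forward(self, x):
        # x: (batch, channels, time)
        y = self.dropout(F.relu(self.conv1(F.pad(x, (self.pad, 0)))))
        y = self.dropout(F.relu(self.conv2(F.pad(y, (self.pad, 0)))))
        res = x if self.downsample is None else self.downsample(x)
        return F.relu(y + res)
```

Stacking such blocks with exponentially increasing dilation factors (d = 1, 2, 4, …) then yields the full TCN.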

References:

  • Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
  • Lea, C., Vidal, R., Reiter, A., & Hager, G. D. (2016, October). Temporal convolutional networks: A unified approach to action segmentation. In European Conference on Computer Vision (pp. 47–54). Springer, Cham.
  • Kalchbrenner, N., Espeholt, L., Simonyan, K., Oord, A. V. D., Graves, A., & Kavukcuoglu, K. (2016). Neural machine translation in linear time. arXiv preprint arXiv:1610.10099.

Supplement:
https://blog.csdn.net/Kuo_Jun_Lin/article/details/80602776

