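Introduction
In the TimesNet paper covered in the previous post, Haixu Wu et al. converted the 1D structure of a time series into a 2D structure so that more information could be extracted. The paper covered here, LightTS, also converts the 1D series into a 2D structure, and does so very simply. In a time series competition I once saw a participant reshape a 1D series into 2D and then model it with convolution kernels, which is very close to this paper. The difference is that LightTS considers two sampling schemes for organizing the 2D data and uses MLPs to extract features. Recommendation: 4 stars. Corrections and criticism are welcome.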
Paper: 2022 | Less is more: Fast multivariate time series forecasting with light sampling-oriented mlp structures [1]
Authors: Tianping Zhang, Yizhuo Zhang, Wei Cao, Jiang Bian, Xiaohan Yi, Shun Zheng, and Jian Li
Affiliations: Tsinghua University, Microsoft Research Asia
Code: https://github.com/thuml/Time-Series-Library/blob/main/models/LightTS.py
Citations: 22
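First, assume the input series has shape [B, T, N]. The authors apply two kinds of sampling:
- Continuous sampling: focuses on capturing short-term local patterns.
- Interval sampling: focuses on capturing long-term dependencies.
As the figure below shows, this is easy to follow: the new data shape is [B, C, T/C, N], where N is the number of series and C is the length of each sub-sequence (chunk_size in the code).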
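The two sampling schemes can be sketched on a toy series. This is a minimal illustration rather than the authors' code: it uses NumPy instead of PyTorch to stay dependency-light, the sizes (T = 12, C = 4) are made up, and the reshape and transpose order follows the encoder code listed later in this post.

```python
import numpy as np

B, T, N = 1, 12, 1      # batch size, input length, number of series (illustrative)
C = 4                   # chunk size: points per sub-sequence (illustrative)
num_chunks = T // C     # T/C = 3 sub-sequences

x = np.arange(T, dtype=np.float32).reshape(B, T, N)  # time steps 0..11

# Continuous sampling: each sub-sequence holds C consecutive time steps.
# [B, T, N] -> [B, T/C, C, N] -> [B, N, C, T/C]
cont = x.reshape(B, num_chunks, C, N).transpose(0, 3, 2, 1)

# Interval sampling: each sub-sequence takes every (T/C)-th time step.
# [B, T, N] -> [B, C, T/C, N] -> [B, N, C, T/C]
intv = x.reshape(B, C, num_chunks, N).transpose(0, 3, 1, 2)

# List the sub-sequences (columns of the [C, T/C] matrix) for the single series.
cont_chunks = cont[0, 0].T.astype(int).tolist()
intv_chunks = intv[0, 0].T.astype(int).tolist()
print(cont_chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(intv_chunks)  # [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]
```

Both results have the same shape, so the same downstream block can consume either one. Only the choice of time steps that form a sub-sequence differs.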
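Note: in the paper, IEBlockC outputs the prediction directly. In the code, however, as the red annotation added to the figure above shows, there is also a highway prediction computed directly from the input, which is added to the IEBlockC output to give the final prediction.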
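The highway branch can be sketched as follows. This is a rough illustration under assumptions: the post does not show how `self.ar` is defined, so the sketch assumes a single linear map along the time axis shared by all series. The names `W`, `b` and `main_out` are hypothetical stand-ins, and NumPy is used in place of PyTorch.

```python
import numpy as np

rng = np.random.default_rng(0)
B, T, N, T_pred = 2, 12, 3, 6   # illustrative sizes

x = rng.normal(size=(B, T, N))
main_out = rng.normal(size=(B, T_pred, N))  # stand-in for the final IEBlock output

# Assumed highway parameters: one linear map T -> T_pred shared by all series.
W = rng.normal(size=(T_pred, T)) * 0.1
b = np.zeros(T_pred)

# [B, T, N] -> [B, N, T] -> [B, N, T_pred] -> [B, T_pred, N]
highway = (x.transpose(0, 2, 1) @ W.T + b).transpose(0, 2, 1)

# Final prediction: main branch plus highway branch, [B, T_pred, N]
out = main_out + highway
print(out.shape)  # (2, 6, 3)
```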
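The sampled data then passes through an Information Exchange Block (IEBlock). The block is simple: for each series [B, C, T/C, 1] it applies a temporal projection (an MLP over the C dimension) and a channel projection (a projection over the T/C dimension), adds the two [B, F', T/C] results, and feeds the sum into an output MLP to get the final output of shape [B, F, T/C, 1].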
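Note: in the paper, the temporal projection output is fed serially into the channel projection. In the code, the channel projection is also applied to the temporal projection output, but its result is added back to that output (a residual connection), and the sum is then fed into the output projection.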
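The data flow through one IEBlock can be traced with a small NumPy sketch that follows the `forward` method listed later in this post. It is meant for checking shapes only: the weights are random stand-ins, biases are left out, and names such as `W1`, `Wc` and `ie_block` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
BN, C, num_chunks = 6, 4, 3     # B*N series, chunk size C, T/C sub-sequences
hid, F_mid, F_out = 16, 4, 8    # hidden width, F' = hid // 4, output width F

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)

# Random stand-in weights (biases omitted for brevity).
W1 = rng.normal(size=(C, hid))
W2 = rng.normal(size=(hid, F_mid))
Wc = np.eye(num_chunks)         # channel projection initialised to the identity
Wo = rng.normal(size=(F_mid, F_out))

def ie_block(x):
    # Temporal projection over the C axis: [BN, C, T/C] -> [BN, T/C, F']
    h = leaky_relu(x.transpose(0, 2, 1) @ W1) @ W2
    h = h.transpose(0, 2, 1)            # [BN, F', T/C]
    # Channel projection over the T/C axis, added to its own input
    h = h + h @ Wc.T                    # [BN, F', T/C]
    # Output projection over the feature axis: [BN, T/C, F'] -> [BN, T/C, F]
    y = h.transpose(0, 2, 1) @ Wo
    return y.transpose(0, 2, 1)         # [BN, F, T/C]

x = rng.normal(size=(BN, C, num_chunks))
y = ie_block(x)
print(y.shape)  # (6, 8, 3)
```

The block keeps the T/C axis intact and only changes the feature axis from C to F, which is why a separate linear layer (`chunk_proj_1` and `chunk_proj_2` in the encoder) is needed afterwards to collapse T/C.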
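The two sampling results each pass through their own IEBlock and then through a linear layer. The results are merged and fed into a final IEBlock, which outputs the prediction.
Experimental results:
Judging from the results reported in the TimesNet paper, however, it still falls somewhat short of DLinear:
On to the code. I have written the shape changes in the comments, so reading it alongside the model figure above should make everything clear (I will try it in a competition later):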
def encoder(self, x):
    B, T, N = x.size()  # [B, T, N]

    # highway branch
    # [B, T, N] -> [B, T_pred, N]
    highway = self.ar(x.permute(0, 2, 1))
    highway = highway.permute(0, 2, 1)

    # continuous sampling
    # [B, T, N] -> [B, T/C, C, N]
    x1 = x.reshape(B, self.num_chunks, self.chunk_size, N)
    # [B, T/C, C, N] -> [B, N, C, T/C]
    x1 = x1.permute(0, 3, 2, 1)
    # [B, N, C, T/C] -> [B*N, C, T/C]
    x1 = x1.reshape(-1, self.chunk_size, self.num_chunks)
    # [B*N, C, T/C] -> [B*N, F, T/C]
    x1 = self.layer_1(x1)
    # [B*N, F, T/C] -> [B*N, F]
    x1 = self.chunk_proj_1(x1).squeeze(dim=-1)

    # interval sampling
    # [B, T, N] -> [B, C, T/C, N]
    x2 = x.reshape(B, self.chunk_size, self.num_chunks, N)
    # [B, C, T/C, N] -> [B, N, C, T/C]
    x2 = x2.permute(0, 3, 1, 2)
    # [B, N, C, T/C] -> [B*N, C, T/C]
    x2 = x2.reshape(-1, self.chunk_size, self.num_chunks)
    # [B*N, C, T/C] -> [B*N, F, T/C]
    x2 = self.layer_2(x2)
    # [B*N, F, T/C] -> [B*N, F]
    x2 = self.chunk_proj_2(x2).squeeze(dim=-1)

    # merge both branches and predict
    x3 = torch.cat([x1, x2], dim=-1)  # [B*N, 2*F]
    x3 = x3.reshape(B, N, -1)         # [B, N, 2*F]
    x3 = x3.permute(0, 2, 1)          # [B, 2*F, N]
    out = self.layer_3(x3)            # [B, T_pred, N]
    out = out + highway               # [B, T_pred, N]
    return out
IEBlock:
import torch
import torch.nn as nn


class IEBlock(nn.Module):
    def __init__(self, input_dim, hid_dim, output_dim, num_node):
        super(IEBlock, self).__init__()
        self.input_dim = input_dim    # C
        self.hid_dim = hid_dim
        self.output_dim = output_dim  # F
        self.num_node = num_node      # T/C
        self._build()

    def _build(self):
        # temporal projection over the C dimension: C -> F' (= hid_dim // 4)
        self.spatial_proj = nn.Sequential(
            nn.Linear(self.input_dim, self.hid_dim),
            nn.LeakyReLU(),
            nn.Linear(self.hid_dim, self.hid_dim // 4)
        )
        # channel projection over the T/C dimension, weights initialised to identity
        self.channel_proj = nn.Linear(self.num_node, self.num_node)
        torch.nn.init.eye_(self.channel_proj.weight)
        # output projection: F' -> F
        self.output_proj = nn.Linear(self.hid_dim // 4, self.output_dim)

    def forward(self, x):
        # [B*N, C, T/C] -> [B*N, T/C, F']
        x = self.spatial_proj(x.permute(0, 2, 1))
        # [B*N, F', T/C] + [B*N, F', T/C]
        x = x.permute(0, 2, 1) + self.channel_proj(x.permute(0, 2, 1))
        # [B*N, F', T/C] -> [B*N, T/C, F]
        x = self.output_proj(x.permute(0, 2, 1))
        # [B*N, T/C, F] -> [B*N, F, T/C]
        x = x.permute(0, 2, 1)
        return x
References
[1] Zhang, T., Zhang, Y., Cao, W., Bian, J., Yi, X., Zheng, S., & Li, J. (2022). Less is more: Fast multivariate time series forecasting with light sampling-oriented mlp structures. arXiv preprint arXiv:2207.01186.