Echo State Network (ESN)

ESN (echo state networks):
To address the difficulty of training recurrent neural networks and the problem of fading memory, Jaeger proposed a new type of recurrent neural network in 2001: the echo state network. The ESN quickly became a research hotspot and has been applied in many different fields, including dynamic pattern classification, robot control, object tracking and moving-target detection, and event monitoring, and especially to time series prediction.

ESN

ESN belongs to the RNN family and has short-term memory. This post examines how ESNs work on time series problems and runs an experiment.

Time series data are often noisy, stochastic, and nonlinear, and their modeling, analysis, and prediction have long been active research topics. Generally speaking, accurate time series prediction requires a model with both good nonlinear approximation ability and good memory, which poses a serious challenge to classical time series methods. To tackle nonlinear time series prediction, artificial-intelligence methods such as support vector machines and neural networks have been introduced into time series analysis.

ESN Structure

Unlike a traditional neural network, an ESN does not train the weights throughout the network; what it learns are only the connections to the output.
As a new type of recurrent neural network, the echo state network differs greatly from traditional recurrent networks in both its model and its learning algorithm.
Intuitively, the structure consists of an input layer, a large randomly connected reservoir, and an output (readout) layer.

Key features of an ESN:

  1. Its core component is a reservoir that is randomly generated and then kept fixed.
  2. The output weights are the only part of the network that is adjusted.
  3. Simple linear regression is enough to train the network (a minimal sketch follows this list).
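
A minimal MATLAB sketch of these three points (the reservoir size and the spectral-radius scaling are illustrative assumptions, not taken from the text above):

N = 100;                          % reservoir size (assumed for illustration)
W = rand(N,N) - 0.5;              % 1. random reservoir weights, never trained
W = W * (0.9 / max(abs(eig(W)))); % scale the spectral radius below 1 (common practice)
Win = rand(N,1) - 0.5;            % random input weights, also fixed
% ... drive the reservoir with the input sequence and collect states in X ...
% 2.-3. only the output weights are trained, by plain linear regression:
% Wout = (pinv(X) * T)';          % X: collected states, T: desired outputs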

ESN Mathematical Model

Suppose the system has M input units, N internal processing elements (PEs), i.e., N internal neurons, and L output units. The values of the input units, the internal state, and the output units at time n are, respectively,

$$u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$$
$$x\left( n \right) =\left[ x_1\left( n \right) ,…,x_N\left( n \right) \right] ^T$$
$$y\left( n \right) =\left[ y_1\left( n \right) ,…,y_L\left( n \right) \right] ^T$$
where $u\left( n \right) \in R^{M\times 1}$ , $x\left( n \right) \in R^{N\times 1}$ , $y\left( n \right) \in R^{L\times 1}$.

Structurally, an ESN replaces the hidden layer of a classical neural network with a large, randomly connected recurrent network (the reservoir), which greatly simplifies training. The state equations of the echo state network are:
$$x\left( n+1 \right) =f\left( Wx\left( n \right) +W^{in}u\left( n \right) +W^{back}y\left( n \right) \right) $$
$$y\left( n+1 \right) =f_{out}\left( W^{out}\left[ x\left( n+1 \right) ,u\left( n+1 \right) ,y\left( n \right) \right] +b^{out} \right)$$

where $W\in R^{N\times N}$ , $W^{in}\in R^{N\times M}$ , $W^{back}\in R^{N\times L}$ , $W^{out}\in R^{L\times \left( N+M+L \right)}$.

f is the activation function of the internal neurons, typically the hyperbolic tangent; $f_{out}$ is the output activation function.
Here W, $W^{in}$, and $W^{back}$ are the connection weight matrices from the state, the input, and the output to the state, respectively; $W^{out}$ is the weight matrix from the reservoir, input, and output to the output; and $b^{out}$ is the output bias term (it can also represent noise). During training, the matrices connected to the reservoir, W, $W^{in}$, and $W^{back}$, are generated randomly and then kept fixed. Only the output matrix $W^{out}$ has to be trained, and because the output depends linearly on the state, input, and previous output, these weights can be obtained by solving a simple linear regression problem.
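
As an example, one step of the state and output equations translates almost literally into MATLAB (a sketch assuming W, Win, Wback, Wout, bout, u, and y already exist with the dimensions above, and an identity output activation):

x = tanh(W*x + Win*u + Wback*y);  % state equation: x(n+1)
y = Wout*[x; u; y] + bout;        % output equation: y(n+1), with f_out = identity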

Training an ESN

Training an ESN means determining the output weight matrix $W^{out}$ from the given training samples $u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$ and the corresponding desired outputs.
The training consists of two stages: a sampling stage and a weight-computation stage.

Sampling stage

In the sampling stage, the network is first given an arbitrary initial state; usually the initial state is taken to be zero, i.e., $x(0)=0$. The training samples $u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$ are fed into the reservoir through the input weight matrix $W^{in}$, and the system state and output $y(n)$ are computed and collected step by step according to the state equations. To compute the output weights, the internal state is collected (sampled) from some time m onward: the state vectors $x\left( n \right) =\left[ x_1\left( n \right) ,…,x_N\left( n \right) \right] ^T$ are stacked as rows of a matrix $B\in R^{\left( p-m+1 \right) \times N}$, and the corresponding target outputs y(n) are collected into a matrix $T\in R^{\left( p-m+1 \right) \times L}$.
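
In code, the sampling stage is a single loop that runs the state equation with teacher forcing and stacks the states row by row (a sketch; p is the number of training samples, m the first sampled step, and u and ydata the training input and target sequences, all assumed given):

x = zeros(N,1);                               % initial state x(0) = 0
yprev = zeros(L,1);
B = zeros(p-m+1, N); T = zeros(p-m+1, L);     % state and target matrices
for n = 1:p
    x = tanh(W*x + Win*u(:,n) + Wback*yprev); % state equation
    yprev = ydata(:,n);                       % teacher forcing with the known target
    if n >= m
        B(n-m+1,:) = x';                      % collect the internal state as a row
        T(n-m+1,:) = ydata(:,n)';             % collect the matching target
    end
end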

Weight computation

Weight computation uses the state matrix and target data collected in the sampling stage to compute the output weights $W^{out}$. Because the relationship between the state $x(n)$ and the output $y(n)$ is linear, the goal is to make the actual network output approximate the desired output:
$$y\left( n \right) \approx \hat{y}\left( n \right) =\sum_{i=1}^N{w_{i}^{out}x_i\left( n \right)}$$

The loss function to be minimized is therefore:
$$\underset{w_{i}^{out}}{\min}\frac{1}{p-m+1}\sum_{n=m}^p{\left( y\left( n \right) -\sum_{i=1}^N{w_{i}^{out}x_i\left( n \right)} \right) ^2}$$

Training therefore reduces to solving this linear least-squares problem:
$$W^{out}=\left( B^{\dagger}T \right) ^T$$
where $B^{\dagger}$ is the Moore-Penrose pseudoinverse of B (B is generally not square, so an ordinary inverse does not exist).
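
In MATLAB this is a single line, which is exactly what esn_train does below:

Wout = (pinv(B) * T)';  % least-squares solution minimizing ||B*Wout' - T||^2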

Experiment

Experiment code: https://github.com/unsky/esn-rmlp

Main script

esn_main.m

clear;
clc;
% Generate ESN for training
net = esn_net(25, 600, 1);
% Generate training data
[I_data, T_data] = seq_gen_esn(26);
% Train ESN
net_trained = esn_train(net,I_data,T_data);
% Test ESN
[original_out,net_out,error] = esn_test(net_trained);

Training data preparation

seq_gen_esn generates the 3000-point training sequence and slices it into input/target subsets:

function [I_data, T_data] = seq_gen_esn(len_subset)
% Slice the training signal into overlapping subsets of length len_subset
t = 0 : 3000; % training sequence time interval
y = signal(t); % generate training sequence (signal.m defines the series)
len = length(y);
incom_data = (rand(1,len)>0.00); % incomplete-data mask (ratio 0 here: all data kept)
y = y.*incom_data;
num_subset = len - len_subset + 1; % number of subsets
fprintf('Training sequence generation is in process, please wait...\n')
%>>>>>>>>>>>>>>>>> Main Loop <<<<<<<<<<<<<<<<<<<<<
for i = (1:num_subset),
I_data(i,:) = y(i:len_subset-2+i); % input: the first len_subset-1 points
T_data(i,:) = y(len_subset-1+i); % target: the next point
end;

Network training

esn_train trains the network to obtain $W^{out}$:

function net_trained = esn_train(net, I_data, T_data)
%>>>>>>>>> Obtain parameters from the ESN net <<<<<<<<<<<<<<<
AUC = net.numAllUnits; % number of all units
IUC = net.numInputUnits; % number of input units
OUC = net.numOutputUnits; % number of output units
HUC = net.numHiddenLayer; % number of hidden units
drWeights = net.reservoirWeights; % dynamic reservoir weights matrix
inWeights = net.inputWeights; % input matrix
bkWeights = net.backWeights; % backward weights from output layer
ouWeights = net.outputWeights; % output weights matrix
bl_out = net.bl_out; % type of output neuron
int_bk = net.int_bk; % intensity of feedback
attenu = net.attenu; % attenuation ratio for the signal
%>>>>>>>>>>>>>>>>> Parameter Check <<<<<<<<<<<<<<<<<<<<<<<<<<
[inpSize, inpNum] = size(I_data');
[tarSize, tarNum] = size(T_data');
if inpSize ~= IUC,
error ('Number of input units and input pattern size do not match.');
end;
if tarSize ~= OUC,
error ('Number of output units and target pattern size do not match.');
end;
if inpNum ~= tarNum,
error ('Number of input and output patterns are different.');
end;
%>>>>>>>>>>> Initialization of Training <<<<<<<<<<<<<<<<<<<<
I_data = attenu*I_data;
T_data = attenu*T_data;
X(1,:) = zeros(1,HUC); % initial reservoir state
I1_data = [zeros(1,inpSize); I_data]; % add zero to initial input
T1_data = [zeros(1,tarSize); T_data]; % add zero to initial output
timeflag= cputime; % a timer to save the training time
wb = waitbar(0, 'Echo State Network Training in Progress...');
T0 = 1000; % washout time
fprintf('\nThe echo state network training is in process...\n');
%>>>>>>>>>>>>>>> Main Loop of ESN Training <<<<<<<<<<<<<<<<<<
for i = (1:inpNum),
waitbar(i/inpNum,wb)
set(wb,'name',['Progress = ' sprintf('%2.1f',i/inpNum*100) '%']);
X(i+1,:) = hyperb((inWeights*I1_data(i+1,:)' + drWeights*X(i,:)' + ...
int_bk*bkWeights*T1_data(i,:)' + 0.001*(rand(1,HUC)-0.5)')');
end;
close(wb);
fprintf('Please wait for another while...\n');
%>>>>>> Calculate output weights and update ESN <<<<<<<<<<<<<
if (bl_out == 1),
ouWeights = (pinv(X(T0+2:end,:))*(T_data(T0+1:end,:)))'; % linear output
else
ouWeights = (pinv(X(T0+2:end,:))*(inv_hyperb(T_data(T0+1:end,:))))';
end;
net.outputWeights = ouWeights;
net_trained = net;
timeflag = cputime - timeflag;
fprintf('Training accomplished! Total time is %2.2f hours.\n',timeflag/3600);
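
The helper functions hyperb and inv_hyperb are not part of the excerpt; judging from their names and how they are used, they are presumably the elementwise hyperbolic tangent and its inverse, along these lines:

function y = hyperb(x)
% elementwise hyperbolic tangent activation (assumed implementation)
y = tanh(x);

function x = inv_hyperb(y)
% elementwise inverse hyperbolic tangent (assumed implementation)
x = atanh(y);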

Testing

esn_test

function [original_out, net_out, error] = esn_test(net)
%>>>>>>>>> Obtain parameters from the trained ESN <<<<<<<<<<<
AUC = net.numAllUnits; % number of all units
IUC = net.numInputUnits; % number of input units
OUC = net.numOutputUnits; % number of output units
HUC = net.numHiddenLayer; % number of hidden units
drWeights = net.reservoirWeights; % dynamic reservoir weights matrix
inWeights = net.inputWeights; % input matrix
bkWeights = net.backWeights; % backward weights from output layer
ouWeights = net.outputWeights; % output weights matrix
len_subset= IUC + OUC; % subset length
bl_out = net.bl_out; % type of output neuron
int_bk = net.int_bk; % intensity of feedback
attenu = net.attenu; % attenuation ratio for the signal
%>>>>>>>>>>>> Testing Parameters Setting <<<<<<<<<<<<<<<<<<<
S_point = 4000; % starting point of testing data
testNum = 3000; % number of testing data
X(1,:) = zeros(1,HUC); % initial reservoir state
t = [S_point : S_point+len_subset-1];
y0 = rand(1,OUC)-0.5; % initial output
%>>>>>>>>>>>>>>>>>>>>>>>> Check parameter <<<<<<<<<<<<<<<<<<
if length(t) ~= len_subset,
error('Length of testing data subset and the network structure do not match');
end;
%>>>>>>>>>>>>>>>>> Testing Main Routine <<<<<<<<<<<<<<<<<<<<<
wb = waitbar(0, 'Echo State Network Testing in Progress...');
for i = (1:testNum),
waitbar(i/testNum,wb)
set(wb,'name',['Progress = ' sprintf('%2.1f',i/testNum*100) '%']);
y = attenu*signal(t); % generate testing data
X(i+1,:) = hyperb((inWeights*y(1:end-OUC)' + ...
drWeights*X(i,:)' + int_bk*bkWeights*y0')'); % update reservoir state
if (bl_out == 1),
Y(i+1,:) = ouWeights*X(i+1,:)'; % update output state - Linear output
else
Y(i+1,:) = hyperb(ouWeights*X(i+1,:)'); % update output state - nonlinear output
end;
% update state for next iteration and output
original_out(i) = (1/attenu)*y(end-OUC+1:end); % original output
net_out(i) = (1/attenu)*Y(i+1,:); % network output
error(i) = net_out(i) - original_out(i); % errors
y0 = Y(i+1,:); % store the output for next calculation
t = t + 1; % Move one-step forward
end;
close(wb);
%>>>>>>>>>>>>>>>>>> Plotting <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
subplot(211);
plot([S_point+1:S_point+testNum],original_out,'b',[S_point+1:S_point+testNum],net_out,'r');
hold on; grid on;
legend('Original sequence','Network output');
xlabel('time'); ylabel('Amplitude');
subplot(212);
plot([S_point+1:S_point+testNum],error,'b');
hold on; grid on;
xlabel('Time'); ylabel('Output error');
RMSE = sqrt(mean((net_out(1:end) - original_out(1:end)).^2))

Results


As the error plot shows, the prediction error of the ESN is acceptable.
