Echo State Network (ESN)

ESN (echo state networks):
To address the difficulty of training recurrent neural networks and the problem of fading memory, Jaeger proposed a new type of recurrent neural network in 2001: the echo state network. The ESN quickly became a research hotspot and has been applied in many different fields, including dynamic pattern classification, robot control, object tracking and moving-target detection, and event monitoring, and especially to time series prediction.

ESN

ESN belongs to the RNN family and has short-term memory. This post examines how ESNs work on time series problems and runs an experiment.

Time series data are often noisy, stochastic, and nonlinear, and their modeling, analysis, and prediction have long been active research topics. Generally speaking, accurate time series prediction requires a model with both good nonlinear approximation ability and good memory, which poses a serious challenge to classical time series methods. To tackle nonlinear time series prediction, artificial-intelligence methods such as support vector machines and neural networks have been introduced into time series analysis.

ESN Structure

Unlike a traditional neural network, an ESN does not train the weights throughout the network; what it learns are only the connections to the output.
As a new type of recurrent neural network, the echo state network differs greatly from traditional recurrent networks in both its model and its learning algorithm.
Intuitively, the structure consists of an input layer, a large randomly connected reservoir, and an output (readout) layer.

Key features of an ESN:

  1. Its core component is a reservoir that is randomly generated and then kept fixed.
  2. The output weights are the only part of the network that is adjusted.
  3. Simple linear regression is enough to train the network (a minimal sketch follows this list).
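
A minimal MATLAB sketch of these three points (the reservoir size and the spectral-radius scaling are illustrative assumptions, not taken from the text above):

N = 100;                          % reservoir size (assumed for illustration)
W = rand(N,N) - 0.5;              % 1. random reservoir weights, never trained
W = W * (0.9 / max(abs(eig(W)))); % scale the spectral radius below 1 (common practice)
Win = rand(N,1) - 0.5;            % random input weights, also fixed
% ... drive the reservoir with the input sequence and collect states in X ...
% 2.-3. only the output weights are trained, by plain linear regression:
% Wout = (pinv(X) * T)';          % X: collected states, T: desired outputs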

ESN Mathematical Model

Suppose the system has M input units, N internal processing elements (PEs), i.e., N internal neurons, and L output units. The values of the input units, the internal state, and the output units at time n are, respectively,

$$u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$$
$$x\left( n \right) =\left[ x_1\left( n \right) ,…,x_N\left( n \right) \right] ^T$$
$$y\left( n \right) =\left[ y_1\left( n \right) ,…,y_L\left( n \right) \right] ^T$$
where $u\left( n \right) \in R^{M\times 1}$ , $x\left( n \right) \in R^{N\times 1}$ , $y\left( n \right) \in R^{L\times 1}$.

Structurally, an ESN replaces the hidden layer of a classical neural network with a large, randomly connected recurrent network (the reservoir), which greatly simplifies training. The state equations of the echo state network are:
$$x\left( n+1 \right) =f\left( Wx\left( n \right) +W^{in}u\left( n \right) +W^{back}y\left( n \right) \right) $$
$$y\left( n+1 \right) =f_{out}\left( W^{out}\left[ x\left( n+1 \right) ,u\left( n+1 \right) ,y\left( n \right) \right] +b^{out} \right)$$

where $W\in R^{N\times N}$ , $W^{in}\in R^{N\times M}$ , $W^{back}\in R^{N\times L}$ , $W^{out}\in R^{L\times \left( N+M+L \right)}$.

f is the activation function of the internal neurons, typically the hyperbolic tangent; $f_{out}$ is the output activation function.
Here W, $W^{in}$, and $W^{back}$ are the connection weight matrices from the state, the input, and the output to the state, respectively; $W^{out}$ is the weight matrix from the reservoir, input, and output to the output; and $b^{out}$ is the output bias term (it can also represent noise). During training, the matrices connected to the reservoir, W, $W^{in}$, and $W^{back}$, are generated randomly and then kept fixed. Only the output matrix $W^{out}$ has to be trained, and because the output depends linearly on the state, input, and previous output, these weights can be obtained by solving a simple linear regression problem.
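
As an example, one step of the state and output equations translates almost literally into MATLAB (a sketch assuming W, Win, Wback, Wout, bout, u, and y already exist with the dimensions above, and an identity output activation):

x = tanh(W*x + Win*u + Wback*y);  % state equation: x(n+1)
y = Wout*[x; u; y] + bout;        % output equation: y(n+1), with f_out = identity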

Training an ESN

Training an ESN means determining the output weight matrix $W^{out}$ from the given training samples $u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$ and the corresponding desired outputs.
The training consists of two stages: a sampling stage and a weight-computation stage.

Sampling stage

In the sampling stage, the network is first given an arbitrary initial state; usually the initial state is taken to be zero, i.e., $x(0)=0$. The training samples $u\left( n \right) =\left[ u_1\left( n \right) ,…,u_M\left( n \right) \right] ^T$ are fed into the reservoir through the input weight matrix $W^{in}$, and the system state and output $y(n)$ are computed and collected step by step according to the state equations. To compute the output weights, the internal state is collected (sampled) from some time m onward: the state vectors $x\left( n \right) =\left[ x_1\left( n \right) ,…,x_N\left( n \right) \right] ^T$ are stacked as rows of a matrix $B\in R^{\left( p-m+1 \right) \times N}$, and the corresponding target outputs y(n) are collected into a matrix $T\in R^{\left( p-m+1 \right) \times L}$.
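
In code, the sampling stage is a single loop that runs the state equation with teacher forcing and stacks the states row by row (a sketch; p is the number of training samples, m the first sampled step, and u and ydata the training input and target sequences, all assumed given):

x = zeros(N,1);                               % initial state x(0) = 0
yprev = zeros(L,1);
B = zeros(p-m+1, N); T = zeros(p-m+1, L);     % state and target matrices
for n = 1:p
    x = tanh(W*x + Win*u(:,n) + Wback*yprev); % state equation
    yprev = ydata(:,n);                       % teacher forcing with the known target
    if n >= m
        B(n-m+1,:) = x';                      % collect the internal state as a row
        T(n-m+1,:) = ydata(:,n)';             % collect the matching target
    end
end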

Weight computation

Weight computation uses the state matrix and target data collected in the sampling stage to compute the output weights $W^{out}$. Because the relationship between the state $x(n)$ and the output $y(n)$ is linear, the goal is to make the actual network output approximate the desired output:
$$y\left( n \right) \approx \hat{y}\left( n \right) =\sum_{i=1}^N{w_{i}^{out}x_i\left( n \right)}$$

The loss function to be minimized is therefore:
$$\underset{w_{i}^{out}}{\min}\frac{1}{p-m+1}\sum_{n=m}^p{\left( y\left( n \right) -\sum_{i=1}^N{w_{i}^{out}x_i\left( n \right)} \right) ^2}$$

Training therefore reduces to solving this linear least-squares problem:
$$W^{out}=\left( B^{\dagger}T \right) ^T$$
where $B^{\dagger}$ is the Moore-Penrose pseudoinverse of B (B is generally not square, so an ordinary inverse does not exist).
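
In MATLAB this is a single line, which is exactly what esn_train does below:

Wout = (pinv(B) * T)';  % least-squares solution minimizing ||B*Wout' - T||^2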

Experiment

Experiment code: https://github.com/unsky/esn-rmlp

Main script

esn_main.m

clear;
clc;
% Generate ESN for training
net = esn_net(25, 600, 1);
% Generate training data
[I_data, T_data] = seq_gen_esn(26);
% Train ESN
net_trained = esn_train(net,I_data,T_data);
% Test ESN
[original_out,net_out,error] = esn_test(net_trained);

Training data preparation

seq_gen_esn generates the 3000-point training sequence and slices it into input/target subsets:

function [I_data, T_data] = seq_gen_esn(len_subset)
% Slice the training signal into overlapping subsets of length len_subset
t = 0 : 3000; % training sequence time interval
y = signal(t); % generate training sequence (signal.m defines the series)
len = length(y);
incom_data = (rand(1,len)>0.00); % incomplete-data mask (ratio 0 here: all data kept)
y = y.*incom_data;
num_subset = len - len_subset + 1; % number of subsets
fprintf('Training sequence generation is in process, please wait...\n')
%>>>>>>>>>>>>>>>>> Main Loop <<<<<<<<<<<<<<<<<<<<<
for i = (1:num_subset),
I_data(i,:) = y(i:len_subset-2+i); % input: the first len_subset-1 points
T_data(i,:) = y(len_subset-1+i); % target: the next point
end;

Network training

esn_train trains the network to obtain $W^{out}$:

function net_trained = esn_train(net, I_data, T_data)
%>>>>>>>>> Obtain parameters from the ESN net <<<<<<<<<<<<<<<
AUC = net.numAllUnits; % number of all units
IUC = net.numInputUnits; % number of input units
OUC = net.numOutputUnits; % number of output units
HUC = net.numHiddenLayer; % number of hidden units
drWeights = net.reservoirWeights; % dynamic reservoir weights matrix
inWeights = net.inputWeights; % input matrix
bkWeights = net.backWeights; % backward weights from output layer
ouWeights = net.outputWeights; % output weights matrix
bl_out = net.bl_out; % type of output neuron
int_bk = net.int_bk; % intensity of feedback
attenu = net.attenu; % attenuation ratio for the signal
%>>>>>>>>>>>>>>>>> Parameter Check <<<<<<<<<<<<<<<<<<<<<<<<<<
[inpSize, inpNum] = size(I_data');
[tarSize, tarNum] = size(T_data');
if inpSize ~= IUC,
error ('Number of input units and input pattern size do not match.');
end;
if tarSize ~= OUC,
error ('Number of output units and target pattern size do not match.');
end;
if inpNum ~= tarNum,
error ('Number of input and output patterns are different.');
end;
%>>>>>>>>>>> Initialization of Training <<<<<<<<<<<<<<<<<<<<
I_data = attenu*I_data;
T_data = attenu*T_data;
X(1,:) = zeros(1,HUC); % initial reservoir state
I1_data = [zeros(1,inpSize); I_data]; % add zero to initial input
T1_data = [zeros(1,tarSize); T_data]; % add zero to initial output
timeflag= cputime; % a timer to save the training time
wb = waitbar(0, 'Echo State Network Training in Progress...');
T0 = 1000; % washout time
fprintf('\nThe echo state network training is in process...\n');
%>>>>>>>>>>>>>>> Main Loop of ESN Training <<<<<<<<<<<<<<<<<<
for i = (1:inpNum),
waitbar(i/inpNum,wb)
set(wb,'name',['Progress = ' sprintf('%2.1f',i/inpNum*100) '%']);
X(i+1,:) = hyperb((inWeights*I1_data(i+1,:)' + drWeights*X(i,:)' + ...
int_bk*bkWeights*T1_data(i,:)' + 0.001*(rand(1,HUC)-0.5)')');
end;
close(wb);
fprintf('Please wait for another while...\n');
%>>>>>> Calculate output weights and update ESN <<<<<<<<<<<<<
if (bl_out == 1),
ouWeights = (pinv(X(T0+2:end,:))*(T_data(T0+1:end,:)))'; % linear output
else
ouWeights = (pinv(X(T0+2:end,:))*(inv_hyperb(T_data(T0+1:end,:))))';
end;
net.outputWeights = ouWeights;
net_trained = net;
timeflag = cputime - timeflag;
fprintf('Training accomplished! Total time is %2.2f hours.\n',timeflag/3600);
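
The helper functions hyperb and inv_hyperb are not part of the excerpt; judging from their names and how they are used, they are presumably the elementwise hyperbolic tangent and its inverse, along these lines:

function y = hyperb(x)
% elementwise hyperbolic tangent activation (assumed implementation)
y = tanh(x);

function x = inv_hyperb(y)
% elementwise inverse hyperbolic tangent (assumed implementation)
x = atanh(y);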

Testing

esn_test

function [original_out, net_out, error] = esn_test(net)
%>>>>>>>>> Obtain parameters from the trained ESN <<<<<<<<<<<
AUC = net.numAllUnits; % number of all units
IUC = net.numInputUnits; % number of input units
OUC = net.numOutputUnits; % number of output units
HUC = net.numHiddenLayer; % number of hidden units
drWeights = net.reservoirWeights; % dynamic reservoir weights matrix
inWeights = net.inputWeights; % input matrix
bkWeights = net.backWeights; % backward weights from output layer
ouWeights = net.outputWeights; % output weights matrix
len_subset= IUC + OUC; % subset length
bl_out = net.bl_out; % type of output neuron
int_bk = net.int_bk; % intensity of feedback
attenu = net.attenu; % attenuation ratio for the signal
%>>>>>>>>>>>> Testing Parameters Setting <<<<<<<<<<<<<<<<<<<
S_point = 4000; % starting point of testing data
testNum = 3000; % number of testing data
X(1,:) = zeros(1,HUC); % initial reservoir state
t = [S_point : S_point+len_subset-1];
y0 = rand(1,OUC)-0.5; % initial output
%>>>>>>>>>>>>>>>>>>>>>>>> Check parameter <<<<<<<<<<<<<<<<<<
if length(t) ~= len_subset,
error('Length of testing data subset and the network structure do not match');
end;
%>>>>>>>>>>>>>>>>> Testing Main Routine <<<<<<<<<<<<<<<<<<<<<
wb = waitbar(0, 'Echo State Network Testing in Progress...');
for i = (1:testNum),
waitbar(i/testNum,wb)
set(wb,'name',['Progress = ' sprintf('%2.1f',i/testNum*100) '%']);
y = attenu*signal(t); % generate testing data
X(i+1,:) = hyperb((inWeights*y(1:end-OUC)' + ...
drWeights*X(i,:)' + int_bk*bkWeights*y0')'); % update reservoir state
if (bl_out == 1),
Y(i+1,:) = ouWeights*X(i+1,:)'; % update output state - Linear output
else
Y(i+1,:) = hyperb(ouWeights*X(i+1,:)'); % update output state - nonlinear output
end;
% update state for next iteration and output
original_out(i) = (1/attenu)*y(end-OUC+1:end); % original output
net_out(i) = (1/attenu)*Y(i+1,:); % network output
error(i) = net_out(i) - original_out(i); % errors
y0 = Y(i+1,:); % store the output for next calculation
t = t + 1; % Move one-step forward
end;
close(wb);
%>>>>>>>>>>>>>>>>>> Plotting <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
subplot(211);
plot([S_point+1:S_point+testNum],original_out,'b',[S_point+1:S_point+testNum],net_out,'r');
hold on; grid on;
legend('Original sequence','Network output');
xlabel('time'); ylabel('Amplitude');
subplot(212);
plot([S_point+1:S_point+testNum],error,'b');
hold on; grid on;
xlabel('Time'); ylabel('Output error');
RMSE = sqrt(mean((net_out(1:end) - original_out(1:end)).^2))

Results


As the error plot shows, the prediction error of the ESN is acceptable.
