multiple training data for cascade-forward backpropagation network - matlab

I am training my neural network with data from 3 consecutive days and testing it with data from a 4th day. The values in this example are randomly chosen and bear no relation to reality. I want the neural network to learn the current as a function of the temperature and the solar radiation.
%% initialize data for training
Temperature_Day1 = [25 26 27 26 25];
Temperature_Day2 = [25 24 24 23 24];
Temperature_Day3 = [21 20 22 21 20];
SolarRadiation_Day1 = [990 944 970 999 962];
SolarRadiation_Day2 = [993 947 973 996 967];
SolarRadiation_Day3 = [993 948 973 998 965];
Current_Day1 = [0.11 0.44 0.44 0.45 0.56];
Current_Day2 = [0.41 0.34 0.43 0.55 0.75];
Current_Day3 = [0.34 0.98 0.34 0.76 0.71];
Day1 = [Temperature_Day1; SolarRadiation_Day1]; % 2-by-5
Day2 = [Temperature_Day2; SolarRadiation_Day2]; % 2-by-5
Day3 = [Temperature_Day3; SolarRadiation_Day3]; % 2-by-5
%% training input and training target
Training_Input = [Day1; Day2; Day3]; % 6-by-5
Training_Target = [Current_Day1; Current_Day2; Current_Day3]; % 3-by-5
%% training the network
hiddenLayers= 2;
net = newcf(Training_Input, Training_Target, hiddenLayers);
y = sim(net, Training_Input);
net.trainParam.epochs = 100;
net = train(net, Training_Input, Training_Target);
%% initialize data for prediction
Temperature_Day4 = [45 23 22 11 24];
SolarRadiation_Day4 = [960 984 980 993 967];
Current_Day4 = [0.14 0.48 0.37 0.46 0.77];
Day4 = [Temperature_Day4; SolarRadiation_Day4]; % 2-by-5
Test_Input = [Day4; Day4; Day4]; % same dimension as Training_Input; subject to question
%% prediction
Predicted_Target = sim(net, Test_Input); % yields 3-by-5
My question is: how do I train the network with the data of 3 days and then predict the target of the 4th day? Since training and testing inputs must have the same dimension, how do I test it for only one day? Here I work around this by concatenating three identical copies of the test input. However, this also yields 3 different predicted targets.
What is the right way to do this?
BTW: I have seen this type of question many times, but the answers are never satisfying because they always suggest changing the dimensions of the test input without considering the nature of the problem (which is that only one data set is available for testing). So please don't mark this as a duplicate.

The features that you have for your network are Temperature and SolarRadiation, each taken at specific times during the day. The day on which these readings are taken is irrelevant (otherwise you wouldn't be able to predict the outputs for day 4 given data for days 1-3).
This means that we can simply pass each observation separately by concatenating the days horizontally (and similarly for the target data):
Training_Input = [Day1, Day2, Day3]; % 2-by-15
Training_Target = [Current_Day1, Current_Day2, Current_Day3]; % 1-by-15
The resulting network will give you one output (Current) per observation in the test set, so you don't need to duplicate:
Day4 = [Temperature_Day4; SolarRadiation_Day4]; % 2-by-5
Test_Input = [Day4]; % 2-by-5
Predicted_Target will now be 1-by-5, showing the predicted Current for each of the test observations.
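As a rough sketch of how the shapes work out with horizontal concatenation, reusing the example values from the question: the training step is guarded because it needs the neural network toolbox, and `cascadeforwardnet` is used here as an assumption in place of the older `newcf` from the question.

```matlab
% Rows: Temperature, SolarRadiation. Columns: observations within one day.
Day1 = [25 26 27 26 25; 990 944 970 999 962];
Day2 = [25 24 24 23 24; 993 947 973 996 967];
Day3 = [21 20 22 21 20; 993 948 973 998 965];
Day4 = [45 23 22 11 24; 960 984 980 993 967];   % test day

Current_Day1 = [0.11 0.44 0.44 0.45 0.56];
Current_Day2 = [0.41 0.34 0.43 0.55 0.75];
Current_Day3 = [0.34 0.98 0.34 0.76 0.71];

% Each column is one observation, so the days are stacked side by side.
Training_Input  = [Day1, Day2, Day3];                          % 2-by-15
Training_Target = [Current_Day1, Current_Day2, Current_Day3];  % 1-by-15
Test_Input      = Day4;                                        % 2-by-5

disp(size(Training_Input));
disp(size(Training_Target));
disp(size(Test_Input));

% Optional: train and predict only if the toolbox function is available.
if exist('cascadeforwardnet', 'file')
    net = cascadeforwardnet(2);              % 2 hidden neurons, as in the question
    net.trainParam.epochs = 100;
    net.trainParam.showWindow = false;       % suppress the training GUI
    net = train(net, Training_Input, Training_Target);
    Predicted_Target = net(Test_Input);      % one prediction per test column
    disp(size(Predicted_Target));
end
```

The number of rows (features) is what must match between training and test inputs; the number of columns (observations) is free to differ.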
You might consider adding a third feature as input to your net representing the time at which each observation was taken. Assuming that you have t timeslots each day at which observations are taken (thus, length(Temperature) == length(SolarRadiation) == t for all days) and observation s is taken at the same time every day, you can add a feature called TimeSlot:
TimeSlot_Day1 = 1:numel(Temperature_Day1);
TimeSlot_Day2 = 1:numel(Temperature_Day2);
TimeSlot_Day3 = 1:numel(Temperature_Day3);
Day1 = [Temperature_Day1; SolarRadiation_Day1; TimeSlot_Day1]; % 3-by-5
Day2 = [Temperature_Day2; SolarRadiation_Day2; TimeSlot_Day2]; % 3-by-5
Day3 = [Temperature_Day3; SolarRadiation_Day3; TimeSlot_Day3]; % 3-by-5
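A minimal sketch of how the TimeSlot feature carries through to the training and test matrices, assuming (as stated above) that every day has the same number of slots. The test day then needs the same third row, which is not shown in the snippet above.

```matlab
Temperature_Day1 = [25 26 27 26 25];  SolarRadiation_Day1 = [990 944 970 999 962];
Temperature_Day2 = [25 24 24 23 24];  SolarRadiation_Day2 = [993 947 973 996 967];
Temperature_Day3 = [21 20 22 21 20];  SolarRadiation_Day3 = [993 948 973 998 965];
Temperature_Day4 = [45 23 22 11 24];  SolarRadiation_Day4 = [960 984 980 993 967];

% Same slot index for every day (assumption: slot s is the same clock time each day).
TimeSlot = 1:numel(Temperature_Day1);

Day1 = [Temperature_Day1; SolarRadiation_Day1; TimeSlot];   % 3-by-5
Day2 = [Temperature_Day2; SolarRadiation_Day2; TimeSlot];   % 3-by-5
Day3 = [Temperature_Day3; SolarRadiation_Day3; TimeSlot];   % 3-by-5
Day4 = [Temperature_Day4; SolarRadiation_Day4; TimeSlot];   % 3-by-5, test day

Training_Input = [Day1, Day2, Day3];   % 3-by-15
Test_Input     = Day4;                 % 3-by-5

disp(size(Training_Input));
disp(size(Test_Input));
```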

can I add other related inputs to Neural Net Time Series in Matlab?

I am trying to design a Neural Network to predict weekly peak load demand.
My input data are the peak load demand from the 2 previous years (it usually follows the same pattern), as well as the average temperature and humidity for the past 2 years and the predicted values for the coming year.
For example:
Let us say I'm predicting the weekly peak demand for 2022.
I have the weekly peak loads for 2021 and 2020, along with the corresponding weekly average temperature and humidity for 2020 and 2021.
I also have the forecasted average temperature and humidity for 2022.
I want the inputs to this neural network to be the historical load data for 2020 and 2021 and the historical temperature and humidity data, as well as the forecasted average temperature and humidity for 2022, in order to get a load prediction for 2022 as output.
Is there a way I can add this to the NARX model in MATLAB, or is there another model I should be using that better fits this application?
This is the code describing what I've done so far. I have used the 2020 and 2021 data to train the network and then tested it with the 2022 data, with a feedback delay of 1:2 so that the previous outputs are accepted as input.
Is there a way to make it accept 2 feedbacks as input (for example, one with a 1:2 delay and the other with 1:3), to have the last two outputs as feedback?
[WeekNo, Load2020, temp2020, humidity2020, Holiday2020, Population2020] = readvars('LoadData.xlsx','Sheet','2020A','Range','A2:F53');
[WeekNo, Load2021, temp2021, humidity2021, Holiday2021, Population2021] = readvars('LoadData.xlsx','Sheet','2021A','Range','A2:F53');
[WeekNo3, Load2022, temp2022, humidity2022, Holiday2022, Population2022] = readvars('LoadData.xlsx','Sheet','2022A','Range','A2:F49');
% training data
Week=[WeekNo; WeekNo];
WeeklyPeak=[Load2020;Load2021];
Avg_temp=[temp2020;temp2021];
Avg_humidity=[humidity2020;humidity2021];
Holiday=[Holiday2020;Holiday2021];
Population=[Population2020;Population2021];
xtrain=[Week Avg_temp Avg_humidity Holiday Population];
ytrain=[WeeklyPeak];
%testing data
xtest=[WeekNo3 temp2022 humidity2022 Holiday2022 Population2022];
ytest=[Load2022];
% xtrain - input time series.
% ytrain - feedback time series.
X = tonndata(xtrain,false,false);
T = tonndata(ytrain,false,false);
trainFcn = 'trainlm';
inputDelays = 1:1;
feedbackDelays = 1:2;
hiddenLayerSize = 3;
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open',trainFcn);
[x,xi,ai,t] = preparets(net,X,{},T);
[net,tr] = train(net,x,t,xi,ai);
% Test the Network
X_test = tonndata(xtest,false,false);
T_test = tonndata(ytest,false,false);
[xn,xin,ain,tn] = preparets(net,X_test,{},T_test);
yn = net(xn,xin,ain);
e = gsubtract(tn,yn);
performance = perform(net,tn,yn)
ynn=[cell2mat(xin(2,:))'; cell2mat(yn)'];
plot(WeekNo3,ytest,'B')
hold
plot(WeekNo3,ynn,'R')
I already have the actual load for 2022, but it is only for comparison purposes, to test the accuracy of the predictions I am going to generate (the last plot in my code).
From a quick look at my code, does setting the input delay for x to 1:1 and the feedback delay to 1:2 produce the network I've described, where x holds the current inputs and the feedback is the time-series-dependent part?

How to interpret dependency on order for 1/3-octave band filters in Matlab?

I am trying to obtain a third-octave band representation of the frequency content of an acoustic signal, obtained by numerical simulation. The signal is the following:
The code I use (you can reproduce it by replacing P and t with other signals):
P = PNL(:,3) ; % Acoustic pressure signal
t = timeWindow ; % Time array
T = t(end) - t(1) ; % Signal duration
fs = length(t)/T ; % = 98123 [Hz]
T0 = 1 ; % Reference duration of 1 [s]
filterOrder = 6 ;
allANSIFrequencies = getANSICenterFrequencies(octaveFilter('FilterOrder', filterOrder, 'Bandwidth', '1/3 octave', 'SampleRate', fs));
cf = allANSIFrequencies(19 : 50) ; % Restricting the frequency range to [15 Hz - 20000 Hz]
for k = 1 : length(cf)
octFilt = octaveFilter(cf(k), 'Bandwidth', '1/3 octave','SampleRate', fs, 'FilterOrder', filterOrder, 'Oversample', false);
y = octFilt(P);
Prms = sqrt((1/T0) * trapz(t, y.^2)) ; % Root mean square pressure level
LE(k) = 20 * log10(Prms/20e-6) ; % [dB] Sound Exposure Level
end
semilogx(cf, LE)
grid on
xlabel('Frequency (Hz)') ;
ylabel('L_{E} (dB)') ;
legend('Order 2', 'Order 4', 'Order 6', 'Order 12')
The frequency content I would expect based on experimental data is a mix of the results at different orders: up to 1000 [Hz] the levels from order 6 are correct, while from 1000 [Hz] onwards, it's the order 2 that describes best the frequency content.
I obtain very different results depending on the filter order I am using. To a certain degree that's normal. Higher orders should provide a sharper response of the filter, but I fail to interpret these results.
The resolution of the signal under investigation should be high enough (98123 [Hz]). Any idea of what the issue could be?
Thank you in advance, I'd appreciate any insight on this!

There is a bug in my code and I don't know where!!! [MATLAB]

I am trying to reproduce the results from an article. But so far I am not being successful. Here is the code I wrote
EDIT: Based on the initial comments of Zizy Archer, the code has been revised.
clear;
Nmax = 30; % number of rounds
M = 10000; % number of simulations
beta0 = 5*10^-6; % relative clock offset in micro seconds
alpha0 = 1.01; % relative clock skew
for simN = 1:M
for N = 1:Nmax
mean_dly = randi([20 50],N,1).*10^-6; % micro seconds
stdd_dly = randi([1 5],N,1).*10^-6; % micro seconds
XpropDly = normrnd(mean_dly,stdd_dly,N,1); % micro seconds
YpropDly = normrnd(mean_dly,stdd_dly,N,1); % micro seconds
prcssTme = randi([100 500],N,1).*10^-6; % micro seconds
T_1 = (1:N)'*10^-3; % milli seconds
T_2 = T_1 + XpropDly; % milli seconds
T_3 = T_2 + prcssTme; % milli seconds
T_4 = T_3 + YpropDly; % milli seconds
% actual time
T_2act = (T_1 + XpropDly).*alpha0 + beta0;
T_3act = (T_4 - YpropDly).*alpha0 + beta0;
% equation 13
A = sum(T_2act(1:N) + T_3act(1:N));
B = sum(T_1(1:N) + T_4(1:N));
C = sum((T_2act(1:N) + T_3act(1:N)).^2);
D = sum((T_2act(1:N) + T_3act(1:N)).*(T_1(1:N) + T_4(1:N)));
% equation 16
alpha0est(simN,N) = (A.^2-C.*Nmax)./(A.*B-D.*Nmax);
beta0est(simN,N) = (B.*C-A.*D)./(2.*(A.*B-D.*Nmax));
end
timestamps = [T_1 T_2 T_3 T_4];
clear T_*;
end
% equation 29 and 30
MSE_alpha = sum((alpha0est - alpha0).^2)/M;
MSE_beta = sum((beta0est - beta0).^2)/M;
figure %2(a)
semilogy(1:Nmax,MSE_beta.*10^12)
xlabel('N');ylabel('MSE of the estimated offset \beta_{0}')
figure %2(b)
semilogy(1:Nmax,MSE_alpha)
xlabel('N');ylabel('MSE of the estimated skew \alpha_{0}')
But this is what I get:
EDIT2: Snippets were removed.
Can anyone tell me what is wrong with my code?
Thanking you all in advance.
Next time at least try some rudimentary debugging yourself to figure out what could be wrong.
To debug, perhaps print out some variables or plot stuff. Put some conditionals to check if the values are somewhat expected or make no sense. It isn't that hard if you know something is wrong in such a small piece of code (however, if there was no paper you were trying to hit, this bug might have been lurking for a while).
Well, to the step-by-step solution in this case:
What you immediately notice is that if you plot alpha0est or beta0est, your estimate for alpha is systematically too high, at 1.015 instead of 1.01 for the single round case, similar for beta.
Now, what could it be? It obviously isn't the noise from the processing time or the delays: that shows up as the hairy stuff around the mean, and you can set all delays to 0 to verify this. So, it has to be something else.
Looking further, you can notice that this systematic error is decreasing when you increase number of rounds performed, and is gone for full 30 rounds.
So, it has to be something with the number of rounds you are doing. Now try setting N = 10 instead of 30, whoa now 10 round case is fine. And there you have your bug. Equation 13 from the paper - there you have N elements summed. Equation 16 similarly multiplies with N. This N obviously has to be the same number. But as it turns out, in your code it isn't. Equation 13 in your code sums ROUNDS cases. Could be 1, could be 30. Equation 16 multiplies with N (=30, always).
All this could be easily avoided if you used saner variable names (all-caps, really?). If you used N for number of rounds performed, and maxN as the limit how many rounds you can try doing at maximum, you would easily get it right.
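The diagnosis above can be checked with a small noise-free sketch. It assumes all delays are zero (so the linear relation between the timestamps is exact) and picks N = 10 rounds out of Nmax = 30 purely for illustration; the names alphaWithN and alphaWithNmax are hypothetical.

```matlab
alpha0 = 1.01;          % true relative clock skew
beta0  = 5e-6;          % true relative clock offset
Nmax   = 30;            % maximum number of rounds
N      = 10;            % rounds actually used for this estimate

T_1 = (1:N)' * 1e-3;
T_4 = T_1;                          % zero propagation and processing delay
T_2act = T_1 .* alpha0 + beta0;
T_3act = T_4 .* alpha0 + beta0;

% Equation 13: sums over the N rounds performed.
A = sum(T_2act + T_3act);
B = sum(T_1 + T_4);
C = sum((T_2act + T_3act).^2);
D = sum((T_2act + T_3act) .* (T_1 + T_4));

% Equation 16 with the same N as in the sums, and with Nmax as in the question.
alphaWithN    = (A^2 - C*N)    / (A*B - D*N);
alphaWithNmax = (A^2 - C*Nmax) / (A*B - D*Nmax);
betaWithN     = (B*C - A*D) / (2 * (A*B - D*N));

fprintf('alpha with N:    %.6f\n', alphaWithN);
fprintf('alpha with Nmax: %.6f\n', alphaWithNmax);
fprintf('beta with N:     %.3e\n', betaWithN);
```

With no noise, the estimate that multiplies by N recovers alpha0 up to rounding, while the one that multiplies by Nmax is offset from it. In the question's loop, the corresponding change would be to replace Nmax with N in the two lines under "equation 16".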

Convert milliseconds into hours and plot

I'm trying to plot an array of millisecond timestamps against its respective data, with the time shown in hours and minutes.
Millis = [60000 120000 180000 240000....]
Power = [ 12 14 12 13 14 ...]
I've set it up so the data records every minute, hence the 60000 millis (= 1 minute). I am trying to plot time on the x axis and power on the y axis. I would like to have the x axis displayed in hours and minutes, with each power value plotted at its respective time.
I've tried this
for i=2:length(Millis)
Conv2Min(i) = Millis(i) / 60000;
Time(i) = startTime + Conv2Min(i);
if (Time(i-1) > Time(i) + 60)
Time(i) + 100;
end
end
s = num2str(Time);
This is an attempt to turn the milliseconds into hours, starting at 08:00 and rolling over to 09:00 once 60 minutes have passed. The problem is plotting this: I get a gap between 08:59 and 09:00. I also cannot maintain the 0=initial 0.
In this scenario it is preferable to work with datenum values and then use datetick to set the format of the tick labels of your plot to 'HH:MM'.
Let's suppose that you started taking measurements at t_1 = [HH_1, MM_1] and stopped taking measurements at t_2 = [HH_2, MM_2].
A cool trick to generate the array of datenum values is to use the following expression:
time_datenums = HH_1/24 + MM_1/1440 : 1/1440 : HH_2/24 + MM_2/1440;
Explanation:
We are creating a regularly-spaced vector time_datenums = A:B:C using the colon (:) operator, where A is the starting datenum value, B is the increment between datenum values and C is the ending datenum value.
Since your measurements have been taken every minute (60000 milliseconds), then the increment between datenum values should be of 1 minute too. As a day has 24 hours, that makes 1440 minutes a day, so use B = 1/1440 as the increment between vector elements, to get 1 minute increments.
For A and C we simply need to divide the hour digits by 24 and the minute digits by 1440 and sum them up like this:
A = HH_1/24 + MM_1/1440
C = HH_2/24 + MM_2/1440
So for example, if t_1 = [08, 00], then A = 08/24 + 00/1440. As simple as that.
Notice that this procedure doesn't use the datenum function at all, and still, it manages to generate a valid array of datenum values only taking into consideration the time of the datenum, without needing to bother about the date of the datenum. You can learn more about this here and here.
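To make the arithmetic concrete, here is a small sketch for a start time of 08:00 and five one-minute steps. It uses an index-based form, A + (0:n)/1440, instead of the colon expression A:B:C; that is a variation introduced here so that the number of elements is known in advance, not something the answer prescribes.

```matlab
t_1 = [8, 0];                        % start time 08:00 as [HH, MM]
n   = 5;                             % number of one-minute steps after the start

A = t_1(1)/24 + t_1(2)/1440;         % fraction of a day at 08:00
time_datenums = A + (0:n)/1440;      % n+1 values, one per minute

% Convert back to whole minutes since midnight as a sanity check.
minutes_since_midnight = round(time_datenums * 1440);   % 480 481 482 483 484 485
disp(minutes_since_midnight);
```

Since 08:00 is 480 minutes after midnight, the check returns 480 through 485, one value per sample.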
Going back to your original problem, let's have a look at the code:
time_millisec = 0:60000:9e6; % Time array in milliseconds.
power = 10*rand(size(time_millisec)); % Random power data.
% Elapsed time in milliseconds.
elapsed_millisec = time_millisec(end) - time_millisec(1);
% Integer part of elapsed hours.
elapsed_hours_int = fix(elapsed_millisec/(1000*60*60));
% Fractional part of elapsed hours.
elapsed_hours_frac = (elapsed_millisec/(1000*60*60)) - elapsed_hours_int;
t_1 = [08, 00]; % Start time 08:00
t_2 = [t_1(1) + elapsed_hours_int, t_1(2) + elapsed_hours_frac*60]; % Compute End time.
HH_1 = t_1(1); % Hour digits of t_1
MM_1 = t_1(2); % Minute digits of t_1
HH_2 = t_2(1); % Hour digits of t_2
MM_2 = t_2(2); % Minute digits of t_2
time_datenums = HH_1/24+MM_1/1440:1/1440:HH_2/24+MM_2/1440; % Array of datenums.
plot(time_datenums, power); % Plot data.
datetick('x', 'HH:MM'); % Set 'HH:MM' datetick format for the x axis.
This is the output:
I would use datenums:
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
d = [2017 05 01 08 00 00]; %starting point (y,m,d,h,m,s)
d = repmat(d,[length(Millis),1]);
d(:,6)=Millis/1000; %add them as seconds
D=datenum(d); %convert to datenums
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
An even shorter approach would be (thanks to codeaviator for the idea):
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
D = 8/24+Millis/86400000; %24h / day, 86400000ms / day
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
I guess, there is an easier way using datetick and datenum, but I couldn't figure it out. This should solve your problem for now:
Millis=6e4:6e4:6e6;
power=randi([5 15],1,numel(Millis));
hours=floor(Millis/(6e4*60))+8; minutes=mod(Millis,(6e4*60))/6e4; % Calculate the hours and minutes of your Millisecond vector.
plot(Millis,power)
xlabels=arrayfun(@(x,y) sprintf('%02d:%02d',x,y),hours,minutes,'UniformOutput',0); % Create zero-padded time-strings of the format HH:MM for your XTickLabels
tickDist=10; % define how often you want your XTicks (e.g. 1 if you want the ticks every minute)
set(gca,'XTick',Millis(tickDist:tickDist:end),'XTickLabel',xlabels(tickDist:tickDist:end))

How to use set of temperatures values to predict the status of an equipment in a smart home with HMM

Good morning,
I have a data set of temperature values recorded over a whole day (about 45 values), and I want to evaluate whether an air conditioner is on or off based on those temperature values.
I would like to use the HMM toolbox in MATLAB to predict the air conditioner status (on/off) during one day, and I am confused about how to proceed with training and prediction using hidden Markov models.
I would like to identify the transition matrix, the observation matrix, the observation sequence and the prior probability for this kind of HMM. What training data should I feed into it?
% chaincode data set for class ‘1’
data1{1} = [%values of temperatures in one day%];
data1{2} = [%values of temperatures in one day% ];
data1{3} = [%values of temperatures in one day%];
% HMM for class '0' and random initialization of parameters
hmm0.prior = [1 0 0 0 0];
hmm0.transmat = rand(5,5);
% 5 by 5 transition matrix
hmm0.transmat(2,1) =0; hmm0.transmat(3,1) = 0; hmm0.transmat(3,2) = 0;
hmm0.transmat = mk_stochastic(hmm0.transmat);
hmm0.obsmat = rand(5, 16);
% # of states * # of observation
hmm0.obsmat = mk_stochastic(hmm0.obsmat) ;
% HMM for class '1' and random initialization of parameters
hmm1.prior = [1 0 ];
hmm1.transmat = rand(2,2);
% 2 by 2 transition matrix
hmm1.transmat(2,1) =0;
hmm1.transmat = mk_stochastic(hmm1.transmat);
hmm1.transmat
hmm1.obsmat = rand(2, 40);
% # of states * # of observation
hmm1.obsmat = mk_stochastic(hmm1.obsmat) ;
% Training of HMM model 0 (Baum-Welch algorithm)
[LL0, hmm0.prior, hmm0.transmat, hmm0.obsmat] = dhmm_em(data0, hmm0.prior,hmm0.transmat, hmm0.obsmat) ;
% smoothing of HMM observation parameter: set floor value 1.0e-5
hmm0.obsmat = max(hmm0.obsmat, 1.0e-5);
% Training of HMM model 1 (Baum-Welch algorithm)
[LL1, hmm1.prior, hmm1.transmat, hmm1.obsmat] = dhmm_em(data1, hmm1.prior,hmm1.transmat, hmm1.obsmat) ;
% smoothing of HMM observation parameter: set floor value 1.0e-5
hmm1.obsmat = max(hmm1.obsmat, 1.0e-5);