How to predict temperature for the 4th day, given temperatures for previous days, using a linear perceptron? - matlab

I have four sets of data (3 for training, 1 for testing) that include the hour of the day and temperatures in this format:
Time | Temperature
5,  60
6,  63
7,  70
8,  73
9,  78
10, 81.5
11, 85.1
12, 87
13, 90
I need to train and test a perceptron and then predict what the temperatures will be on the next day at the same hours.
I am trying to use Matlab for this, and I know I am supposed to normalize the data and use time-series prediction, but I can't figure out where to start.
I don't understand what the inputs and outputs are, or what activation function to use so that the output can range linearly from -infinity to +infinity.

I'm pretty sure you won't have to use a perceptron for this task, as you want to perform regression, not classification (a perceptron is a binary classifier; see the Matlab documentation).
To start with normalization: you need to adjust your data so that it has zero mean and a standard deviation of 1. For example:
data = rand(1,100);                      % some example data
data = (data - mean(data)) / std(data);  % zero mean, unit standard deviation
You can interpret your input and output as follows:
You have an underlying function which maps your time values to the temperature values (f: time -> temperature). Time is the independent variable and temperature the dependent variable (see, for example, Wikipedia), and you want to find an approximation of f based on your input data.
For time-series regression you will find a detailed example here. If you are required to use a feedforward network, you can also take a look at this.
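To make this concrete, here is a minimal sketch of the whole workflow, assuming the Neural Network Toolbox and using fitnet (whose output layer is linear by default, so the output is unbounded, which answers the activation-function question). The day-2 and day-3 values below are placeholders; substitute your real training days.
hours = (5:13)';
day1  = [60 63 70 73 78 81.5 85.1 87 90]';  % temperatures from the question
day2  = day1 + randn(9,1);                  % placeholder; use your real day-2 data
day3  = day1 + randn(9,1);                  % placeholder; use your real day-3 data
x = repmat(hours', 1, 3);                   % 1 x 27 inputs: hour of day
t = [day1' day2' day3'];                    % 1 x 27 targets: temperatures
% Normalize to zero mean / unit std, keeping the statistics so the same
% transform can be applied to test data and inverted on predictions.
mx = mean(x); sx = std(x);
mt = mean(t); st = std(t);
net = fitnet(5);                            % small feedforward net, linear output
net = train(net, (x - mx)/sx, (t - mt)/st);
% Predict the 4th day at the same hours and undo the target normalization.
pred = net((hours' - mx)/sx) * st + mt;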

Related

How do I know the confidence level of my correct probability score?

I have a writer recognition system that gives back an NLL (negative log-likelihood) score for a test sample against every trained model. For example, if there are thirteen models to compare the sample against, the NLL output will look like this:
15885.1881156907 17948.1931699086 17205.1548161452 16846.8936368077 20798.8048757930 18153.8179076007 18972.6746781821 17398.9047592641 19292.8326540969 22559.3178790489 17315.0994094185 19471.9518308519 18867.2297851016
Where each column represents the score for that sample against every model. Column 1 gives the score against model 1 and so on.
This test sample was written by the writer of model 1, so the first column should have the minimum value for a correct prediction.
The output provided here gives the desired prediction, since the value in column 1 is the minimum.
When I presented my results I was asked how confident I was about the scores and the predicted values, and to provide a confidence level for each score.
I did some reading after this and found some posts on the 95% confidence interval, which comes up as practically every result to my Google query, but it does not appear to be what I need.
The reason I need this: suppose for a test sample I have scores from 2 models. Then, using the confidence level, I am supposed to know which score to pick.
For example for the same test sample the scores from another model are:
124494.535128967 129586.451168849 126269.733526396 129579.895935672 128582.387405272 125984.657455834 127486.755531507 125162.136816278 129790.811437270 135902.112799503 126599.346536290 136223.382395325 126182.202727967
Both correctly predict, as in both cases the score in column 1 is the minimum. But again, how do I find the confidence level of my score?
Would appreciate any guidance here.
To my knowledge, you cannot evaluate a confidence level for just one value.
Suppose you store your results in a matrix where each column corresponds to a model and each row corresponds to an example (or observation). You can then evaluate the confidence for every single model by using all the predicted results from that model (i.e. you can evaluate a confidence interval for any column of the matrix) with the following procedure:
1) Evaluate the mean value of the column; call it µ.
2) Evaluate the standard deviation of the column; call it σ.
3) Evaluate the standard error of the mean as ε = σ/sqrt(N), where N is the number of samples (rows).
4) The lower bound of the confidence interval is µ - 2ε and the upper bound is µ + 2ε. By straightforward subtraction you can find the amplitude of the interval: the closer it is to zero, the more accurate your measurement.
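A minimal Matlab sketch of that procedure, assuming your scores are collected in a matrix called scores (the variable names here are illustrative):
N     = size(scores, 1);      % number of samples (rows)
mu    = mean(scores, 1);      % per-model mean
sigma = std(scores, 0, 1);    % per-model standard deviation
sem   = sigma / sqrt(N);      % standard error of the mean
lower = mu - 2*sem;           % lower bound of the ~95% interval
upper = mu + 2*sem;           % upper bound
width = upper - lower;        % amplitude: the smaller, the tighter the estimate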
Hope this is what you're looking for.

how to predict the values at future timesteps in matlab ntstool

I have to use a NAR network to train a time series for my project. To get an idea of how the time-series tool (ntstool) works in MATLAB, I used the ntstool GUI with a dataset containing 427 timesteps of one element. While training I used a neural network with 10 hidden neurons and delay value = 5.
Now I have the following two questions:
1) What does the delay value (d) in the GUI mean? Does it mean that while training, the network assumes each timestep's value depends on the last d timesteps' values?
2) How do I predict the values at future timesteps in ntstool?
The delay value means that the neural network's inputs are the current input value plus the N delayed values of the input signal; in your case N = 5. Hope this helps.
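To predict beyond the end of your data, the usual pattern is to train the open-loop NAR network and then close the feedback loop, as sketched below (assuming a reasonably recent toolbox version and that your series is stored in a 1 x 427 cell array T, e.g. T = num2cell(yourVector')):
net = narnet(1:5, 10);                  % 5 feedback delays, 10 hidden neurons
[Xs, Xi, Ai, Ts] = preparets(net, {}, {}, T);
net = train(net, Xs, Ts, Xi, Ai);
[Y, Xf, Af] = net(Xs, Xi, Ai);          % run open loop to get the final states
[netc, Xic, Aic] = closeloop(net, Xf, Af);
yFuture = netc(cell(0, 20), Xic, Aic);  % predict 20 future timesteps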

time shift between target and simulation output using neural network

I'm currently working with neural networks and I'm still a beginner. My purpose is to use an MLP to predict flow time series (I know NARX networks may be more suitable for time-series prediction, but the requirement is an MLP).
For example, I want to predict the flow Q(t+x) from the current and historical flows Q(t...t-n), precipitation P(t...t-m), etc.
The results of my net trainings (training, validation and test of the network) and an additional validation period show relatively good quality (correlation and RMSE). But when I look closer at the outputs of the training and validation periods, there is a lag relative to the targets of the respective period, and my problem is that I don't know why.
The lag exactly corresponds to my forecast period x, no matter how large x is.
I use a standard MLP from the Matlab toolbox with default settings (random data division, trainlm, etc.), as when using the graphical NN tool (but I also tested other settings with my own code).
With a simple Q(t) NAR net it is the same problem. If I try it with regular data, like predicting sin(t+x) from sin(t..t-n), or the same with a rectangular function, there is no time shift; it's all fine.
Only if I use real-world data or irregular (but mostly constant) data like [0.12 0.14 0.13 0.1 0.1 0.1 ... (n times) 0.1 ... 0.1 0.1 0.14 0.15 0.12 ...] is there a shift between the target and the output. Although I train the network with the target Q(t+x), the real training output is Q(t). I have also tried other input variable combinations, from less to more information. My time series covers more than 7 years at hourly resolution, but the problem also occurs at other resolutions.
Is there something wrong in my approach, or something I can try? I've read that others also have this problem, but found no solutions. I don't think it is a failure of my implementation, because I also tried the Matlab tool and the sine function, with the same outcomes. And if I ignore the shift, the accuracy of the values is not bad (that's obviously why the correlation and RMSE look good).
I use Matlab 2012.
Here's also a minimalistic code example with only the most important points; it shows the problem very well.
%% minimalistic example
% but there is the same problem with more input variables
load Q
%% create net inputs and targets
% start point of t
t = 100;
% history data of Q -> Q(t-1), Q(t-2), Q(t-3)
inputs = [Q(t-1:end-1,1) Q(t-2:end-2,1) Q(t-3:end-3,1)]';
% timestep t that is to be predicted
targets = Q(t:end,1)';
%% create fitting net (MLP)
% but it is the same problem for NARnet
% and from here, you can also use the NN graphical tool
% number of hidden neurons
numHiddenNeurons = 6; % the described problem does not depend on this
% value, therefore it is freely chosen
net = fitnet(numHiddenNeurons); % same problem if choosing the old version newfit
% default MLP settings, no changes, but the problem exists even with other
% combinations of settings
% train net
[trained_net,tr] = train(net,inputs,targets);
% apply trained net with given data (create net outputs)
outputs = sim(trained_net,inputs);
figure(1)
hold on
bar(targets',0.6,'FaceColor','r','EdgeColor','none')
bar(outputs',0.2,'FaceColor','b','EdgeColor','none')
legend('observation','prediction')
% zoom in very closely to see individual bars; the bar plot shows the
% time shift very well
% with a longer forecast horizon, the shift is even easier to see
%% the result: targets(1,1)=Q(t), outputs(1,1)=Q(t-1)
%% now try the sine function; the problem will not be there
x = 1:1:1152;
SIN = sin(x);
inputs = [SIN(1,t-1:end-1);SIN(1,t-2:end-2);SIN(1,t-3:end-3)];
targets = SIN(1,t:end);
% start again from above, creating the net
I don't have enough reputation to upload two excerpts of the results of the code for one-step-ahead prediction.
Consider predicting not the absolute value of the flow but the change in flow from the previous period, using the recent changes from previous periods as inputs. As pointed out by Diphtong above, it may well be that the previous flow values are simply not predictive of (contain no useful information about) the next flow value.
Conceptually, this is similar to predicting the next value of a random walk. Imagine you had a situation where the next value of a function was equal to the current value plus some random number between -1.0 and +1.0. If you tried to predict the next value from the previous values, the best that any function approximator/regressor could do to minimize the prediction error would be to use the current value as the best predictor of the next value.
However, in your case it could still be that there is some information in the previous flow values. To prevent the current value from dominating the error term, deny the network the current value as a predictor by feeding it the differences (discrete derivative) of the absolute flow values. If there is no useful information in those either, it should minimize the error by always predicting 0.
In summary, try the following (a code sketch follows the list):
Inputs: change in flow at [t-1], at [t-2], ... , [t-w]
Output: change in flow at [t]
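A minimal Matlab sketch of that setup, assuming Q is your flow vector (a column) and w the number of lagged changes; the names are illustrative:
dQ = diff(Q);                 % change in flow: dQ(k) = Q(k+1) - Q(k)
w  = 3;                       % number of lagged changes used as inputs
n  = length(dQ);
inputs = zeros(w, n - w);     % row k holds the change at lag k
for k = 1:w
    inputs(k, :) = dQ(w - k + 1 : n - k)';
end
targets = dQ(w + 1 : n)';     % change at time t
net = fitnet(6);
net = train(net, inputs, targets);
dQ_pred = net(inputs);
% To recover absolute flows, add the predicted change to the last known flow:
Q_pred = Q(w + 1 : n)' + dQ_pred;   % predictions of Q(w+2 : n+1)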
This "time shift" you are observing is exactly what @Diphtong mentions: your neural network cannot resolve the relationship between the inputs and the output, so it behaves like a "naive predictor" (look it up), where (in the financial stock-market world) the best prediction of tomorrow's stock price is today's price.
It may help, but I've seen deltas of the input time series, LOG(), and SQRT() all perform the same...

Using Linear Prediction Over Time Series to Determine Next K Points

I have a time series of N data points of sunspots and would like to predict, based on a subset of these points, the remaining points in the series, and then compare the correctness.
I'm just getting introduced to linear prediction using Matlab, so I decided to use the following code segment within a loop, so that every point outside the training set (until the end of the given data) has a prediction:
% x is the data; the training set is a subset of x starting from the beginning.
% 'unknown' is the number of points to predict past the end of the training set
% (i.e. the difference in length between the training set and the data vector).
% x_pred is initialized to x.
p = length(training_set);
coeffs = lpc(training_set, p);
for i = 1:unknown
    nextValue = -coeffs(2:end) * x_pred(end-unknown-1+i : -1 : end-unknown-1+i-p+1)';
    x_pred(end-unknown+i) = nextValue;
end
error = norm(x - x_pred)
I have three questions regarding this:
1) Does this appropriately do what I have described? I ask because my error seems rather large (>100) when predicting over only the last 20 points of a dataset that has hundreds of points.
2) Am I interpreting the second argument of lpc correctly? Namely, that it is the 'order', i.e. the number of previous points used to predict the next point?
3) Is there a more efficient, single-line function in Matlab that I can call to replace the looping and compute all necessary predictions for me, given some subset of my overall data as a training set?
I tried looking through the lpc Matlab tutorial, but it didn't seem to do the kind of prediction my needs require. I have also been using "How to use aryule() in Matlab to extend a number series?" as a reference.
After much deliberation and experimentation, I have found the above approach to be correct, and there does not appear to be a single Matlab function that does this work. The large errors are reasonable, since I am using a linear prediction algorithm on a problem (sunspot prediction) that has inherently nonlinear behavior.
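One partial shortcut worth noting: for one-step-ahead prediction (not the recursive multi-step extension above, where each prediction feeds the next), the loop can be replaced by a single filter call, as shown in the lpc documentation:
a = lpc(training_set, p);              % p-th order predictor coefficients
est_x = filter([0 -a(2:end)], 1, x);   % est_x(n) predicts x(n) from x(n-1..n-p)
e = x - est_x;                         % one-step prediction error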
Hope this helps anyone else out there working on something similar.

Fast Fourier transform for deseasonalizing data in MATLAB

I'm very much a novice at signal-processing techniques, but I am trying to apply the fast Fourier transform to a daily time series to remove the seasonality present in the data. The example I am working with is from here:
http://www.mathworks.com/help/signal/ug/frequency-domain-linear-regression.html
While I understand how to implement the code as written in the example, I am having a hard time adapting it to my specific application. What I am trying to do is create a preprocessing function which deseasonalizes the training data using code similar to the example above. Then, using the same coefficients estimated from the in-sample data, deseasonalize the out-of-sample data to preserve its independence from the in-sample data. Basically, once the coefficients are estimated, I will normalize each new data point using those same coefficients. I suspect this is akin to estimating a linear trend, removing it from the in-sample data, and then using the same linear model to detrend unseen data in the same manner.
Obviously, when I estimate the Fourier coefficients, the vector I get out is equal in length to the in-sample data. The out-of-sample data comprises far fewer observations, so directly applying the coefficients is impossible.
Is this sort of analysis possible using this technique, or am I going down a dead-end road? How should I approach this using the code in the example above?
What you want to do is certainly possible, and you are on the right track, but you seem to have misunderstood a few points in the example. First, the example shows that the technique is equivalent to linear regression in the time domain, exploiting the FFT to perform an operation with the same effect in the frequency domain. Second, the trend that is removed is not linear; it is a sum of sinusoids, which is why the FFT is used to identify the relevant frequency components in a relatively tidy way.
In your case it seems you are interested in the residuals. The initial approach is therefore to proceed as in the example as follows:
(1) Perform a rough "detrending" by removing the DC component (the mean of the time-domain data)
(2) FFT the data, inspect the spectrum, and choose the frequency channels that contain most of the signal.
You can then use those channels to generate a trend in the time domain and subtract it from the original data to obtain the residuals. You need not use the IFFT for this, however; instead you can explicitly sum the cosine and sine components. You do this similarly to the last step of the example (which explains how to find the amplitudes via time-domain regression), but substituting the amplitudes obtained from the FFT.
The following code shows how you can do this:
tim = (time - time0)/timestep; % <-- acquisition times for your *new* data, normalized
NFpick = [2 7 13]; % <-- channels you picked to build the detrending baseline
% Compute the trend
mu = mean(ts);
tsdft = fft(ts-mu);
Nchannels = length(ts); % <-- size of time domain data
Mpick = 2*length(NFpick);
X(:,1:2:Mpick) = cos(2*pi*(NFpick-1)'/Nchannels*tim)';
X(:,2:2:Mpick) = sin(-2*pi*(NFpick-1)'/Nchannels*tim)';
% Generate beta vector "bet" containing scaled amplitudes from the spectrum
bet = 2*tsdft(NFpick)/Nchannels;
bet = reshape([real(bet) imag(bet)].', numel(bet)*2, 1); % interleave real (cosine) and imaginary (sine) parts
trend = X*bet + mu;
To remove the trend just do
detrended = dat - trend;
where dat is your new data acquired at times tim. Make sure you define the time origin consistently. Also, this assumes the data is real (not complex), as in the linked example; you'll have to adapt the code for complex data.
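If it helps, one illustrative way to choose NFpick from the in-sample spectrum (how many channels to keep is your call) is:
mu    = mean(ts);
tsdft = fft(ts - mu);
mag   = abs(tsdft(1:floor(length(ts)/2)));    % one-sided magnitude spectrum
[~, idx] = sort(mag, 'descend');
NFpick = reshape(sort(idx(1:3)), 1, []);      % e.g. keep the 3 strongest channels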