Time series forecasting using a neural network

I am stuck on a time series forecasting problem: forecasting end-of-day balances. I have 2 years of data with the daily closing balance for some 100 accounts. I am trying to forecast future values using an MLP and LSTMs, but both give unsatisfactory results. Both models capture the linear trend in the data correctly but fail to pick up any non-linear changes.
Here is my approach:
Feature engineering performed:
Took the last 5 lag values for each day as input
Day, month and weekday extracted and fed as input
So I have 8 columns as input to the model, and the data is scaled with MinMaxScaler before being passed to the model.
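For reference, a minimal sketch of how such a lag-plus-calendar feature set might be built with pandas; the file name and the 'balance' column are illustrative placeholders, not taken from the question:
import pandas as pd

# Hypothetical input: a CSV with a 'date' column and a daily closing 'balance'.
df = pd.read_csv('balances.csv', parse_dates=['date'], index_col='date')

# Five lag features: the closing balances of the previous 5 days.
for lag in range(1, 6):
    df[f'lag_{lag}'] = df['balance'].shift(lag)

# Calendar features extracted from the date index.
df['day'] = df.index.day
df['month'] = df.index.month
df['weekday'] = df.index.weekday

df = df.dropna()  # the first 5 rows lack a complete lag window
X, y = df.drop(columns='balance'), df['balance']  # 8 input columns, 1 target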
LSTM Model:
from keras.models import Sequential
from keras.layers import Dense, LSTM

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(1, 8)))  # one timestep, 8 features
model.add(LSTM(32))
model.add(Dense(1, activation='linear'))
model.compile(loss='mae', optimizer='adam')
MLP Model:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(8, activation='relu', input_dim=trainX.shape[1]))
model.add(Dense(20, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mae', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=32, verbose=0,
          shuffle=False)
Am I doing anything wrong, and how can this model be improved to better capture non-linear behaviour? Here is a sample of the actual values vs. the predictions I am getting.
            predicted        actual    Mape %
date
2017-11-01  3.427375e+07   75248606.05   74.82
2017-11-02  3.516451e+07   65382560.38   60.11
2017-11-03  3.592537e+07   64244508.86   56.54
2017-11-04  3.796975e+07   64244508.86   51.41
2017-11-05  3.806401e+07   64244508.86   51.18
2017-11-06  3.994756e+07   60844214.61   41.47
2017-11-07  4.147836e+07   57346364.66   32.11
2017-11-08  4.280740e+07   48821953.03   13.13
2017-11-09  4.481559e+07   49061219.17    9.05
2017-11-10  4.662069e+07   45530356.65    2.37

Related

Can I add other related inputs to a Neural Net Time Series model in MATLAB?

I am trying to design a neural network to predict weekly peak load demand.
My input data are the peak load demands from the two previous years (they usually follow the same pattern), as well as the average temperature and humidity for the past two years and the forecasted values for the coming year.
For example:
Let us say I'm predicting the weekly peak demand for 2022.
I have weekly peak loads for 2021 and 2020, along with the corresponding weekly average temperature and humidity for 2020 and 2021.
I also have forecasted average temperature and humidity for 2022.
I want the inputs to this neural network to be the historical load data for 2020 and 2021 and the historical temperature and humidity data, as well as the forecasted average temperature and humidity for 2022, in order to get a load prediction for 2022 as the output.
Is there a way I can set this up with the NARX model in MATLAB, or is there another model I should be using that better fits this application?
This is the code describing what I've done so far. I used the 2020 and 2021 data to train the network and then tested it on the 2022 data with a feedback time delay of 1:2, so that the previous outputs are accepted as input.
Is there a way to make it accept two feedbacks as input (for example, one with a 1:2 delay and the other with 1:3, so that the last two outputs are used as feedback)?
[WeekNo, Load2020, temp2020, humidity2020, Holiday2020, Population2020] = readvars('LoadData.xlsx','Sheet','2020A','Range','A2:F53');
[WeekNo, Load2021, temp2021, humidity2021, Holiday2021, Population2021] = readvars('LoadData.xlsx','Sheet','2021A','Range','A2:F53');
[WeekNo3, Load2022, temp2022, humidity2022, Holiday2022, Population2022] = readvars('LoadData.xlsx','Sheet','2022A','Range','A2:F49');
% training data
Week=[WeekNo; WeekNo];
WeeklyPeak=[Load2020;Load2021];
Avg_temp=[temp2020;temp2021];
Avg_humidity=[humidity2020;humidity2021];
Holiday=[Holiday2020;Holiday2021];
Population=[Population2020;Population2021];
xtrain=[Week Avg_temp Avg_humidity Holiday Population];
ytrain=[WeeklyPeak];
%testing data
xtest=[WeekNo3 temp2022 humidity2022 Holiday2022 Population2022];
ytest=[Load2022];
% xtrain - input time series.
% ytrain - feedback time series.
X = tonndata(xtrain,false,false);
T = tonndata(ytrain,false,false);
trainFcn = 'trainlm';
inputDelays = 1:1;
feedbackDelays = 1:2;
hiddenLayerSize = 3;
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open',trainFcn);
[x,xi,ai,t] = preparets(net,X,{},T);
[net,tr] = train(net,x,t,xi,ai);
% Test the Network
X_test = tonndata(xtest,false,false);
T_test = tonndata(ytest,false,false);
[xn,xin,ain,tn] = preparets(net,X_test,{},T_test);
yn = net(xn,xin,ain);
e = gsubtract(tn,yn);
performance = perform(net,tn,yn)
ynn=[cell2mat(xin(2,:))'; cell2mat(yn)'];
plot(WeekNo3, ytest, 'b')
hold on
plot(WeekNo3, ynn, 'r')
I already have the actual load for 2022, but it is only for comparison purposes, to test the accuracy of the predictions I am going to generate (the last plot in my code).
From a quick look at my code: does making the input delay for x 1:1 and the feedback delay 1:2 produce the network I've described, where x holds the current inputs and the feedback is the time-dependent series?

What is wrong with my Siamese network? Why does it output the same value (approx. 0.5) irrespective of the input pairs?

I'm trying to build a Siamese network for the https://www.kaggle.com/moltean/fruits dataset. I've picked 10 images per class from this dataset; there are 131 classes in total. I'm using the model below to train my network, but it is failing to converge. I noticed a strange behaviour: after 3000 epochs the result is 0.5000003 irrespective of the input pair I give, and the loss stops at 0.61. The specifications of the network are as given in the paper. I tried changing the following things:
Changing the Dense layer activation to ReLU
Importing the 'ImageNet' weights for ResNet50
Increasing and decreasing the learning rate.
I also checked the batch inputs to see whether the correct input pair (x) is paired with the correct y value. However, I think I'm doing something fundamentally wrong. Glad if you could help me. Thank you :)
The notebook is hosted on Kaggle: https://www.kaggle.com/krishnaprasad96/siamese-network.
If you have doubts about how certain parts of the code work, refer to https://medium.com/@krishnaprasad_54871/siamese-networks-line-by-line-explanation-for-beginners-55b8be1d2fc6
# Building the shared convolutional encoder and the Siamese head
import keras
from keras.layers import Input, Dense
from keras.models import Model
from keras.regularizers import l2
from keras.optimizers import Adam
from keras.utils import plot_model

input_shape = (100, 100, 3)
left_input = Input(input_shape)
right_input = Input(input_shape)
W_init = keras.initializers.RandomNormal(mean=0.0, stddev=1e-2)
b_init = keras.initializers.RandomNormal(mean=0.5, stddev=1e-2)
model = keras.models.Sequential([
    keras.layers.Conv2D(64, (10, 10), activation='relu', input_shape=input_shape, kernel_initializer=W_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (7, 7), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(128, (4, 4), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(256, (4, 4), activation='relu', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(2e-4)),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(4096, activation='sigmoid', kernel_initializer=W_init, bias_initializer=b_init, kernel_regularizer=l2(1e-3))
])
encoded_l = model(left_input)
encoded_r = model(right_input)
subtracted = keras.layers.Subtract()([encoded_l, encoded_r])
prediction = Dense(1, activation='sigmoid', bias_initializer=b_init)(subtracted)
siamese_net = Model(inputs=[left_input, right_input], outputs=prediction)
optimizer = Adam(learning_rate=0.0006)
siamese_net.compile(loss='binary_crossentropy', optimizer=optimizer)
plot_model(siamese_net, show_shapes=True, show_layer_names=True)
I have seen the notebook on Kaggle. Thanks for all the information. But it seems that the training/validation split is wrong: this model trains on the initial 91 classes only. What about the remaining 40 classes? The train and validation split should be within each class. Suppose I have 10 images in a class: I can use 8 images for training and 2 images for validation. The train/validation split should be over images, not over classes. Also, I couldn't see the testing script. It would be a great help if you could provide that as well.
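A minimal sketch of the per-class split described above, assuming a hypothetical fruits/<class>/<image> directory layout with 10 images per class (the paths are illustrative, not from the notebook):
import os
import random

root = 'fruits'  # hypothetical dataset root: one sub-directory per class
train_split, val_split = {}, {}
for cls in os.listdir(root):
    images = sorted(os.listdir(os.path.join(root, cls)))
    random.shuffle(images)
    val_split[cls] = images[:2]    # 2 of the 10 images for validation
    train_split[cls] = images[2:]  # the remaining 8 for training
This way every one of the 131 classes appears in both the training and the validation set.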

AdaBoost - get predictions for a specific number of estimators

I use AdaBoostClassifier from sklearn.ensemble. I trained my model using 1000 estimators:
model = AdaBoostClassifier(
    base_estimator=DecisionTreeClassifier(max_depth=6),
    n_estimators=1000,
    learning_rate=0.2
)
model.fit(X_train, y_train)
Then, using the generator model.staged_predict_proba(X_test), I found that the best accuracy on the X_test data is reached with 814 estimators.
Now I don't want to use the generator model.staged_predict_proba(X_test_2) to make predictions for new test data X_test_2, because computing predictions for every number of estimators takes a lot of time (the dataset is really big). I just want to calculate the predictions of the model with 814 estimators. I didn't find a way to do it. Is it possible with AdaBoostClassifier? I think it should be.
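One possible workaround (my assumption, not something confirmed in the question): stop the staged generator exactly at the stage you need, so only 814 stages are evaluated instead of all 1000. A second option that is sometimes suggested, truncating the fitted ensemble in place, relies on scikit-learn internals and should be verified before use.
from itertools import islice

# Take only the 814th staged prediction (index 813); islice stops the
# generator there instead of exhausting all 1000 stages.
proba_814 = next(islice(model.staged_predict_proba(X_test_2), 813, None))

# Riskier alternative relying on sklearn internals (verify on your version):
# model.estimators_ = model.estimators_[:814]
# model.estimator_weights_ = model.estimator_weights_[:814]
# proba_814 = model.predict_proba(X_test_2)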

DNN model - strange results: "loss" differs between training/testing and Kaggle

I have trained a simple DNN model, using Keras with the Theano backend, having:
2 * (3x3 conv layer, 2x2 max-pool, 15% dropout),
a 96-unit FC layer,
softmax activation on the final layer,
optimizer: adam.
The goal was to perform a classification into 6 classes.
I have a dataset consisting of 116,000 images, equally distributed among the classes.
After 14 epochs the validation results were great:
loss 0.1 and 92% accuracy on the test data, and during training I had similar results on my training and validation data.
The task is for learning purposes only, defined on the Kaggle website; they want a minimal log-loss result. When I upload my predictions there, the calculated loss is very high (> 2).
Do you have any suggestions?
Some code representing the network and how I use it:
1. Inputs:
116000 images of size 160x160
2. Model
model = Sequential()
model.add(Convolution2D(64, 3, 3, activation='relu', input_shape=(1, 160, 160)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.15))
model.add(Convolution2D(128, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.15))
model.add(Flatten())
model.add(Dense(96, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(classes), activation='softmax'))
3. Compile and fit:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(np.array(trainData[0]), np.array(trainData[1]),
          batch_size=13, nb_epoch=14, verbose=1, shuffle=True, callbacks=callbacks_list,
          validation_data=(np.array(validationData[0]), np.array(validationData[1])))
4. Test with a new set of samples:
scores = model.evaluate(np.array(testData[0]), np.array(testData[1]), verbose=1)
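As an aside, a synthetic illustration (numbers invented, not from the competition) of how 92% accuracy can coexist with a log-loss above 2: a handful of near-certain mistakes dominates the mean log-loss.
import numpy as np

# 100 synthetic predictions: 92 correct and 8 wrong, all made with
# near-total confidence (probability 1 - 1e-12 on the predicted class).
p_true_class = np.array([1 - 1e-12] * 92 + [1e-12] * 8)
print(-np.log(p_true_class).mean())  # ~2.21, despite 92% accuracy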

Using Keras LSTM to predict a single example after using batch training

I have a network model that is trained using batch training. Once it is trained, I want to predict the output for a single example.
Here is my model code:
from keras.models import Sequential
from keras.layers import Dense, LSTM

model = Sequential()
model.add(Dense(32, batch_input_shape=(5, 1, 1)))  # batch size fixed at 5
model.add(LSTM(16, stateful=True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
I have a sequence of single inputs to single outputs. I'm doing some test code to map characters to next characters (A->B, B->C, etc).
I create input data of shape (15, 1, 1) and output data of shape (15, 1), and call the function:
model.fit(x, y, nb_epoch=epochs, batch_size=5, shuffle=False, verbose=0)
The model trains, and now I want to take a single character and predict the next character (input A, it should predict B). I create an input of shape (1, 1, 1) and call:
pred = model.predict(x, batch_size=1, verbose=0)
This gives:
ValueError: Shape mismatch: x has 5 rows but z has 1 rows
I saw one suggested solution: add "dummy data" to your prediction input, so that its shape becomes (5, 1, 1) with data [x 0 0 0 0], and then take just the first element of the output as your value. However, this seems inefficient when dealing with larger batches.
I also tried to remove the batch size from the model creation, but I got the following message:
ValueError: If a RNN is stateful, a complete input_shape must be provided (including batch size).
Is there another way? Thanks for the help.
Currently (Keras v2.0.8) it takes a bit more effort to get predictions on single rows after training in batch.
Basically, the batch_size is fixed at training time, and has to be the same at prediction time.
The workaround right now is to take the weights from the trained model, and use those as the weights in a new model you've just created, which has a batch_size of 1.
The quick code for that is:
model = create_model(batch_size=64)
model.fit(X, y)
weights = model.get_weights()
single_item_model = create_model(batch_size=1)
single_item_model.set_weights(weights)
single_item_model.compile(compile_params)
Here's a blog post that goes into more depth:
https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/
I've used this approach in the past to have multiple models at prediction time: one that makes predictions on big batches, one that makes predictions on small batches, and one that makes predictions on single items. Since batch predictions are much more efficient, this gives us the flexibility to take in any number of prediction rows (not just a number that is evenly divisible by batch_size), while still getting predictions pretty rapidly.
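For completeness, a self-contained sketch of this pattern on the question's toy setup; the create_model builder and the toy data below are illustrative, not part of the original answer:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

def create_model(batch_size):
    # Same stateful architecture as in the question, with the batch size
    # passed in so we can build a training copy and a prediction copy.
    model = Sequential()
    model.add(Dense(32, batch_input_shape=(batch_size, 1, 1)))
    model.add(LSTM(16, stateful=True))
    model.add(Dense(1, activation='linear'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

# Toy data in the question's shapes: (15, 1, 1) inputs, (15, 1) targets.
x = np.arange(15, dtype='float32').reshape(15, 1, 1)
y = np.arange(1, 16, dtype='float32').reshape(15, 1)

train_model = create_model(batch_size=5)
train_model.fit(x, y, epochs=100, batch_size=5, shuffle=False, verbose=0)

# Rebuild with batch_size=1 and copy the learned weights across.
single_item_model = create_model(batch_size=1)
single_item_model.set_weights(train_model.get_weights())
pred = single_item_model.predict(np.zeros((1, 1, 1)), batch_size=1, verbose=0)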
@ClimbsRocks showed a nice workaround. I cannot provide a "correct" answer in the sense of "this is how Keras intends it to be done", but I can share another workaround which might help somebody, depending on the use case.
In this workaround I use predict_on_batch(). This method lets you pass a single sample out of a batch without throwing an error. Unfortunately, it returns output in the shape the target has according to the training settings; however, the first (and only) entry of that output then holds the prediction for your single sample.
You can access it like this:
to_predict = ...  # some single sample that would be part of a batch (must have the right shape)
model.predict_on_batch(to_predict)[0].flatten()  # flatten is optional
The result of the prediction is exactly the same as if you would pass an entire batch to predict().
Here is a code example. The code is from my own question, which also deals with this issue (but in a slightly different manner).
import numpy as np
from keras.models import Sequential
from keras.layers import GRU

sequence_size = 5
number_of_features = 1
input_shape = (sequence_size, number_of_features)
batch_size = 2

model = Sequential()
# Of course you can replace the Gated Recurrent Unit with an LSTM layer
model.add(GRU(100, return_sequences=True, activation='relu', input_shape=input_shape, batch_size=batch_size, name="GRU"))
model.add(GRU(1, return_sequences=True, activation='relu', input_shape=input_shape, batch_size=batch_size, name="GRU2"))
model.compile(optimizer='adam', loss='mse')
model.summary()
# Summary output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
GRU (GRU)                    (2, 5, 100)               30600
_________________________________________________________________
GRU2 (GRU)                   (2, 5, 1)                 306
=================================================================
Total params: 30,906
Trainable params: 30,906
Non-trainable params: 0
def generator(data, batch_size, sequence_size, num_features):
    """Simple generator"""
    while True:
        for i in range(len(data) - (sequence_size * batch_size + sequence_size) + 1):
            start = i
            end = i + (sequence_size * batch_size)
            yield (data[start:end].reshape(batch_size, sequence_size, num_features),
                   data[end - ((sequence_size * batch_size) - sequence_size):end + sequence_size].reshape(batch_size, sequence_size, num_features))
# Task: predict the continuation of a linear range
data = np.arange(100)
# One step per window position the generator can yield
total_batches = len(data) - (sequence_size * batch_size + sequence_size) + 1
hist = model.fit_generator(
    generator=generator(data, batch_size, sequence_size, number_of_features),
    steps_per_epoch=total_batches,
    epochs=200,
    shuffle=False
)
to_predict = np.asarray([[np.asarray([x]) for x in range(95, 100)]])  # only a single element of a batch
correct = np.asarray([100, 101, 102, 103, 104])
print(model.predict_on_batch(to_predict)[0].flatten())
# Output:
# [ 99.92908 100.95854 102.32129 103.28584 104.20213]