MatConvNet nnloss error 'Index exceeds matrix dimensions'

I have made my own IMDB using a set of 51000 images categorized into 43 different categories of road traffic signs. However, when I want to use my own IMDB to train the alexnet network, I get an error which says: Index exceeds matrix dimensions.
Error in vl_nnloss (line 230)
t = - log(x(ci)) ;
Do you have an idea what I am doing wrong? I have checked through my IMDB, and the images, labels and sets have been appropriately created as specified in my code. Also, the image array is declared as type single and not uint8.
Here is my training code below
function [net, info] = alexnet_train(imdb, expDir)
run(fullfile(fileparts(mfilename('fullpath')), '../../', 'matlab', 'vl_setupnn.m')) ;
% some common options
opts.train.batchSize = 100;
opts.train.numEpochs = 20 ;
opts.train.continue = true ;
opts.train.gpus = [1] ;
opts.train.learningRate = [1e-1*ones(1, 10), 1e-2*ones(1, 5)];
opts.train.weightDecay = 3e-4;
opts.train.momentum = 0.;
opts.train.expDir = expDir;
opts.train.numSubBatches = 1;
% getBatch options
bopts.useGpu = numel(opts.train.gpus) > 0 ;
% network definition!
% MATLAB handle, passed by reference
net = dagnn.DagNN() ;
net.addLayer('conv1', dagnn.Conv('size', [11 11 3 96], 'hasBias', true, 'stride', [4, 4], 'pad', [0 0 0 0]), {'input'}, {'conv1'}, {'conv1f' 'conv1b'});
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
net.addLayer('lrn1', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu1'}, {'lrn1'}, {});
net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn1'}, {'pool1'}, {});
net.addLayer('conv2', dagnn.Conv('size', [5 5 48 256], 'hasBias', true, 'stride', [1, 1], 'pad', [2 2 2 2]), {'pool1'}, {'conv2'}, {'conv2f' 'conv2b'});
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
net.addLayer('lrn2', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu2'}, {'lrn2'}, {});
net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn2'}, {'pool2'}, {});
net.addLayer('conv3', dagnn.Conv('size', [3 3 256 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool2'}, {'conv3'}, {'conv3f' 'conv3b'});
net.addLayer('relu3', dagnn.ReLU(), {'conv3'}, {'relu3'}, {});
net.addLayer('conv4', dagnn.Conv('size', [3 3 192 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu3'}, {'conv4'}, {'conv4f' 'conv4b'});
net.addLayer('relu4', dagnn.ReLU(), {'conv4'}, {'relu4'}, {});
net.addLayer('conv5', dagnn.Conv('size', [3 3 192 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu4'}, {'conv5'}, {'conv5f' 'conv5b'});
net.addLayer('relu5', dagnn.ReLU(), {'conv5'}, {'relu5'}, {});
net.addLayer('pool5', dagnn.Pooling('method', 'max', 'poolSize', [3 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu5'}, {'pool5'}, {});
net.addLayer('fc6', dagnn.Conv('size', [6 6 256 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool5'}, {'fc6'}, {'conv6f' 'conv6b'});
net.addLayer('relu6', dagnn.ReLU(), {'fc6'}, {'relu6'}, {});
net.addLayer('fc7', dagnn.Conv('size', [1 1 4096 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu6'}, {'fc7'}, {'conv7f' 'conv7b'});
net.addLayer('relu7', dagnn.ReLU(), {'fc7'}, {'relu7'}, {});
net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 10], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu7'}, {'classifier'}, {'conv8f' 'conv8b'});
net.addLayer('prob', dagnn.SoftMax(), {'classifier'}, {'prob'}, {});
net.addLayer('objective', dagnn.Loss('loss', 'log'), {'prob', 'label'}, {'objective'}, {});
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prob','label'}, 'error') ;
% -- end of the network
% initialization of the weights (CRITICAL!!!!)
initNet(net, 1/100);
% do the training!
info = cnn_train_dag(net, imdb, @(i,b) getBatch(bopts,i,b), opts.train, 'val', find(imdb.images.set == 3)) ;
end
function initNet(net, f)
net.initParams();
f_ind = net.layers(1).paramIndexes(1);
b_ind = net.layers(1).paramIndexes(2);
net.params(f_ind).value = 10*f*randn(size(net.params(f_ind).value), 'single');
net.params(f_ind).learningRate = 1;
net.params(f_ind).weightDecay = 1;
for l=2:length(net.layers)
% is a convolution layer?
if(strcmp(class(net.layers(l).block), 'dagnn.Conv'))
f_ind = net.layers(l).paramIndexes(1);
b_ind = net.layers(l).paramIndexes(2);
[h,w,in,out] = size(net.params(f_ind).value);
net.params(f_ind).value = f*randn(size(net.params(f_ind).value), 'single');
net.params(f_ind).learningRate = 1;
net.params(f_ind).weightDecay = 1;
net.params(b_ind).value = f*randn(size(net.params(b_ind).value), 'single');
net.params(b_ind).learningRate = 0.5;
net.params(b_ind).weightDecay = 1;
end
end
end
% function on charge of creating a batch of images + labels
function inputs = getBatch(opts, imdb, batch)
%[227 by 227 by 3] image
images = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(1,batch) ;
if opts.useGpu > 0
images = gpuArray(images) ;
end
inputs = {'input', images, 'label', labels} ;
end

Your network definition is not correct. The conv1 layer must be [11 11 3 48]. If that doesn't work, check your network again; this error is caused by mistakes in the network definition.
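One more thing worth checking, based only on the code in the question: the 'classifier' layer has 10 outputs ([1 1 4096 10]) while the dataset has 43 categories. A log loss picks the score at each sample's label, so any label above the number of output channels indexes past the end of the score array. The sketch below mimics that indexing in plain MATLAB with made-up labels; it is not the vl_nnloss source, and the variable names are hypothetical.

```matlab
% Illustrative stand-ins, not taken from the real IMDB.
numOutputs = 10;                 % last entry of the classifier 'size'
labels     = [3 7 10 43];        % made-up labels; 43 is beyond 10 outputs
batchSize  = numel(labels);

% Scores shaped like a network output: 1 x 1 x channels x batch.
x = rand(1, 1, numOutputs, batchSize, 'single');

% Linear index of the label's channel for every sample in the batch.
ci = (0:batchSize-1) * numOutputs + labels;

% Labels must lie in 1..numOutputs for x(ci) to be meaningful.
valid = labels >= 1 & labels <= numOutputs;
if all(valid)
    t = -log(x(ci));
    fprintf('mean loss: %f\n', mean(t));
else
    fprintf('%d label(s) outside 1..%d; x(ci) would fail or read the wrong score\n', ...
        nnz(~valid), numOutputs);
end
```

If this is the cause, changing the classifier size to [1 1 4096 43] and making sure the labels run from 1 to 43 should make the index valid.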

Related

Convolutional Neural Networks, Implementation using MatConvNet toolbox. How to deal with Overfitting?

I've recently been working with CNNs and am struggling with what I believe is overfitting. Specifically, even though my training error converges to a minimum, my validation error refuses to drop. My input data is 512 x 650 x 1 x 4000 (2D data, 4000 samples), and there are only two classes to distinguish between (class A and class B). I'm aware that I will need many more samples in the future, but for now I would just like to see my validation error decline even a little before I invest in generating more data.
My networks have all been around 60-70 layers long and have included the following types of layers:
Block Example
Convolutional Layers [3 x 3] filter size, stride [1 x 1], padding [1 1 1 1]
ReLU Layers (Non-linearity)
Batch normalization (Tremendous help to training data convergence and implementation speed)
Max Pooling Layers [2 x 2] filters sizes, stride [2 x 2], padding [0 0 0 0]
I then repeat this "block" until my input data is 1 x 1 x N in size, at which point I run it through a few fully connected layers and then into a softmax.
My actual MatConvNet code is below for inspection and the output plots are attached. For the plots, blue represents my training error and orange represents my validation error. I'm linking my most recent from the code below.
My Questions:
1) How does one know what filter sizes to use for their data? I know it's an empirical process, but surely there is some kind of intuition behind it? I've read papers (VGG and more) on using many small [3x3] filters, but even after designing a 70-layer network with this in mind, there is still no decline in validation error.
2) I have tried dropout layers because of their popularity for reducing overfitting. I placed the dropout layers throughout my network, after the ReLU and pooling layers in the "block" shown above, between all convolutional layers. Unfortunately it had no effect on my validation data; the error was still the same. Next I tried using it only after the fully connected layers, since that's where the most neurons (or feature maps) are created in my architecture, and still no luck. I've read the paper on dropout. Should I give up on using it? Is there once again "a trick" to this?
3) If I try a smaller network (I've read that's a decent way to deal with overfitting), how do I effectively reduce the size of my data? Just max pooling?
ANY suggestions would be wonderful.
Again, thank you all for reading this long question. I assure you I've done my research, and found that asking here might help me more in the long run.
CNN Error Output plot
MatConvNet Code (Matlab Toolbox for CNN Design)
opts.train.batchSize = 25;
opts.train.numEpochs = 200 ;
opts.train.continue = true ;
opts.train.gpus = [1] ;
opts.train.learningRate = 1e-3;
opts.train.weightDecay = 0.04;
opts.train.momentum = 0.9;
opts.train.expDir = 'epoch_data';
opts.train.numSubBatches = 1;
bopts.useGpu = numel(opts.train.gpus) > 0 ;
load('imdb4k.mat');
net = dagnn.DagNN() ;
% Block #1
net.addLayer('conv1', dagnn.Conv('size', [3 3 1 64], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'input'}, {'conv1'}, {'conv1f' 'conv1b'});
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
net.addLayer('bn1', dagnn.BatchNorm('numChannels', 64), {'relu1'}, {'bn1'}, {'bn1f', 'bn1b', 'bn1m'});
net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn1'}, {'pool1'}, {});
% Block #2
net.addLayer('conv2', dagnn.Conv('size', [3 3 64 64], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool1'}, {'conv2'}, {'conv2f' 'conv2b'});
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
net.addLayer('bn2', dagnn.BatchNorm('numChannels', 64), {'relu2'}, {'bn2'}, {'bn2f', 'bn2b', 'bn2m'});
net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn2'}, {'pool2'}, {});
% Block #3
net.addLayer('conv3', dagnn.Conv('size', [3 3 64 128], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool2'}, {'conv3'}, {'conv3f' 'conv3b'});
net.addLayer('relu3', dagnn.ReLU(), {'conv3'}, {'relu3'}, {});
net.addLayer('bn3', dagnn.BatchNorm('numChannels', 128), {'relu3'}, {'bn3'}, {'bn3f', 'bn3b', 'bn3m'});
net.addLayer('pool3', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn3'}, {'pool3'}, {});
% Block #4
net.addLayer('conv4', dagnn.Conv('size', [3 3 128 128], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool3'}, {'conv4'}, {'conv4f' 'conv4b'});
net.addLayer('relu4', dagnn.ReLU(), {'conv4'}, {'relu4'}, {});
net.addLayer('bn4', dagnn.BatchNorm('numChannels', 128), {'relu4'}, {'bn4'}, {'bn4f', 'bn4b', 'bn4m'});
net.addLayer('pool4', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn4'}, {'pool4'}, {});
% Block #5
net.addLayer('conv5', dagnn.Conv('size', [3 3 128 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool4'}, {'conv5'}, {'conv5f' 'conv5b'});
net.addLayer('relu5', dagnn.ReLU(), {'conv5'}, {'relu5'}, {});
net.addLayer('bn5', dagnn.BatchNorm('numChannels', 256), {'relu5'}, {'bn5'}, {'bn5f', 'bn5b', 'bn5m'});
net.addLayer('pool5', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn5'}, {'pool5'}, {});
% Block #6
net.addLayer('conv6', dagnn.Conv('size', [3 3 256 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool5'}, {'conv6'}, {'conv6f' 'conv6b'});
net.addLayer('relu6', dagnn.ReLU(), {'conv6'}, {'relu6'}, {});
net.addLayer('bn6', dagnn.BatchNorm('numChannels', 256), {'relu6'}, {'bn6'}, {'bn6f', 'bn6b', 'bn6m'});
net.addLayer('pool6', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn6'}, {'pool6'}, {});
% Block #7
net.addLayer('conv7', dagnn.Conv('size', [3 3 256 512], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool6'}, {'conv7'}, {'conv7f' 'conv7b'});
net.addLayer('relu7', dagnn.ReLU(), {'conv7'}, {'relu7'}, {});
net.addLayer('bn7', dagnn.BatchNorm('numChannels', 512), {'relu7'}, {'bn7'}, {'bn7f', 'bn7b', 'bn7m'});
net.addLayer('pool7', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn7'}, {'pool7'}, {});
% Block #8
net.addLayer('conv8', dagnn.Conv('size', [3 3 512 512], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool7'}, {'conv8'}, {'conv8f' 'conv8b'});
net.addLayer('relu8', dagnn.ReLU(), {'conv8'}, {'relu8'}, {});
net.addLayer('bn8', dagnn.BatchNorm('numChannels', 512), {'relu8'}, {'bn8'}, {'bn8f', 'bn8b', 'bn8m'});
net.addLayer('pool8', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [1 2], 'pad', [0 0 0 0]), {'bn8'}, {'pool8'}, {});
% Block #9
net.addLayer('conv9', dagnn.Conv('size', [3 3 512 512], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool8'}, {'conv9'}, {'conv9f' 'conv9b'});
net.addLayer('relu9', dagnn.ReLU(), {'conv9'}, {'relu9'}, {});
net.addLayer('bn9', dagnn.BatchNorm('numChannels', 512), {'relu9'}, {'bn9'}, {'bn9f', 'bn9b', 'bn9m'});
net.addLayer('pool9', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'bn9'}, {'pool9'}, {});
% Incorporate MLP
net.addLayer('fc1', dagnn.Conv('size', [1 1 512 1000], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool9'}, {'fc1'}, {'conv15f' 'conv15b'});
net.addLayer('relu10', dagnn.ReLU(), {'fc1'}, {'relu10'}, {});
net.addLayer('bn10', dagnn.BatchNorm('numChannels', 1000), {'relu10'}, {'bn10'}, {'bn10f', 'bn10b', 'bn10m'});
net.addLayer('classifier', dagnn.Conv('size', [1 1 1000 2], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'bn10'}, {'classifier'}, {'conv16f' 'conv16b'});
net.addLayer('prob', dagnn.SoftMax(), {'classifier'}, {'prob'}, {});
% The dagnn.Loss computes the loss incurred by the prediction scores X given the categorical labels
net.addLayer('objective', dagnn.Loss('loss', 'softmaxlog'), {'prob', 'label'}, {'objective'}, {});
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prob','label'}, 'error') ;
First of all, your network seems far too complex for the amount of data; you would need roughly two orders of magnitude more samples to see meaningful results with such a network, and even that only makes sense if the problem itself is that complex. Check whether results improve with a much smaller network. To answer your questions:
1) Filter sizes are empirical, but 1x1, 3x3, and 5x5 filters are the most commonly used. A 70-layer network does not make sense unless the problem is very complex and you have a huge amount of data. To train a network that deep successfully you might also have to look into ResNets.
2) Dropout is most often used in the fully connected layers. You can also look into DropConnect. In general there is no need to use dropout between conv layers.
3) The size of the intermediate maps can easily be reduced with conv + max pooling stacks. You don't have to reduce them to 1x1 before using an MLP; you can attach it once the maps reach a size of, for example, 8x8. Try using more than one FC layer. Also reduce the width of the network (the number of filters per layer) to reduce model complexity.
Overall, you have very little data, which is definitely not going to work for deep models. Fine-tuning a pretrained model might give you better results. It all depends on the data and the task at hand. Also remember that networks like VGG are trained on 1000 different classes with millions of images, which is a very complex problem.
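To make the point about intermediate map sizes concrete, here is a small sketch that tracks the spatial size through the nine pooling layers in the question's code. It assumes the usual output-size rule floor((in + pad - pool) / stride) + 1, that the 3x3 convolutions with padding 1 leave the size unchanged, that the stride is given as [vertical horizontal], and that the input is 512 rows by 650 columns. The variable names are mine.

```matlab
% Spatial size of the input: rows x columns (assumed orientation).
sz = [512 650];

% Pooling settings copied from the question: 2x2 windows, no padding,
% stride [2 2] everywhere except pool8, which uses [1 2].
poolSize = [2 2];
strides  = repmat([2 2], 9, 1);
strides(8, :) = [1 2];

sizes = zeros(9, 2);
for k = 1:9
    % The 3x3 conv with pad 1 and stride 1 keeps sz unchanged,
    % so only the pooling layer changes the map size.
    sz = floor((sz - poolSize) ./ strides(k, :)) + 1;
    sizes(k, :) = sz;
    fprintf('after block %d: %d x %d\n', k, sz(1), sz(2));
end
```

Under these assumptions the maps are already 8 x 10 after block 6, so the fully connected layers could be attached there instead of after block 9, where the maps have shrunk to 1 x 1.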

Deep Neural Network training, why is the network training not converging?

I'm using MatConvNet's DagNN with the AlexNet architecture. The last few layers of my architecture are
net = dagnn.DagNN() ;
imdb_32 =load('imdb_all_32_pd_norm.mat');
imdb_32=imdb_32.imdb;
% some common options
opts.train.batchSize = 100;
opts.train.numEpochs = 100 ;
opts.train.continue = true ;
opts.train.gpus = [] ;
opts.train.learningRate = 0.2;%[0.1 * ones(1,30), 0.01*ones(1,30), 0.001*ones(1,30)] ;%0.002;%[2e-1*ones(1, 10), 2e-2*ones(1, 5)];
opts.train.momentum = 0.9;
opts.train.expDir = expDir;
opts.train.numSubBatches = 1;
bopts.useGpu =0;%numel(opts.train.gpus) > 0 ;
%% NET
net.addLayer('conv1', dagnn.Conv('size', [11 11 3 96], 'hasBias', true, 'stride', [4, 4], 'pad', [20 20 20 20]), {'input'}, {'conv1'}, {'conv1f' 'conv1b'});
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
net.addLayer('lrn1', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu1'}, {'lrn1'}, {});
net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn1'}, {'pool1'}, {});
net.addLayer('conv2', dagnn.Conv('size', [5 5 48 256], 'hasBias', true, 'stride', [1, 1], 'pad', [2 2 2 2]), {'pool1'}, {'conv2'}, {'conv2f' 'conv2b'});
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
net.addLayer('lrn2', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu2'}, {'lrn2'}, {});
net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn2'}, {'pool2'}, {});
net.addLayer('drop2',dagnn.DropOut('rate',0.7),{'pool2'},{'drop2'});
net.addLayer('conv3', dagnn.Conv('size', [3 3 256 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'drop2'}, {'conv3'}, {'conv3f' 'conv3b'});
net.addLayer('relu3', dagnn.ReLU(), {'conv3'}, {'relu3'}, {});
net.addLayer('conv4', dagnn.Conv('size', [3 3 192 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu3'}, {'conv4'}, {'conv4f' 'conv4b'});
net.addLayer('relu4', dagnn.ReLU(), {'conv4'}, {'relu4'}, {});
net.addLayer('conv5', dagnn.Conv('size', [3 3 192 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu4'}, {'conv5'}, {'conv5f' 'conv5b'});
net.addLayer('relu5', dagnn.ReLU(), {'conv5'}, {'relu5'}, {});
net.addLayer('pool5', dagnn.Pooling('method', 'max', 'poolSize', [3 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu5'}, {'pool5'}, {});
net.addLayer('drop5',dagnn.DropOut('rate',0.5),{'pool5'},{'drop5'});
net.addLayer('fc6', dagnn.Conv('size', [1 1 256 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'drop5'}, {'fc6'}, {'conv6f' 'conv6b'});
net.addLayer('relu6', dagnn.ReLU(), {'fc6'}, {'relu6'}, {});
net.addLayer('fc7', dagnn.Conv('size', [1 1 4096 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu6'}, {'fc7'}, {'conv7f' 'conv7b'});
net.addLayer('relu7', dagnn.ReLU(), {'fc7'}, {'relu7'}, {});
classLabels=max(unique(imdb_32.images.labels));
net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 1], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu7'}, {'prediction'}, {'conv8f' 'conv8b'});
net.addLayer('prob', dagnn.SoftMax(), {'prediction'}, {'prob'}, {});
net.addLayer('l2_loss', dagnn.L2Loss(), {'prob', 'label'}, {'objective'});
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prob','label'}, 'error') ;
opts.colorDeviation = zeros(3) ;
net.meta.augmentation.jitterFlip = true ;
net.meta.augmentation.jitterLocation = true ;
net.meta.augmentation.jitterFlip = true ;
net.meta.augmentation.jitterBrightness = double(0.1 * opts.colorDeviation) ;
net.meta.augmentation.jitterAspect = [3/4, 4/3] ;
net.meta.augmentation.jitterScale = [0.4, 1.1] ;
net.meta.augmentation.jitterSaturation = 0.4 ;
net.meta.augmentation.jitterContrast = 0.4 ;
% net.meta.augmentation.jitterAspect = [2/3, 3/2] ;
net.meta.normalization.averageImage=imdb_32.images.data_mean;
initNet_He(net);
info = cnn_train_dag(net, imdb_32, @(i,b) getBatch(bopts,i,b), opts.train, 'val', find(imdb_32.images.set == 2)) ;
The result of each epoch is shown in the attachment. Why aren't the error and objective converging? The regression loss is the MSE loss.
Try decreasing the momentum, say to 0.5.
The bias and initialization parameters of each conv filter have to be chosen based on the application at hand. This result is due to the signal fading as it passes through the successive filters.
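Something else that may be worth ruling out in the posted code: the 'classifier' layer has a single output channel and is followed by dagnn.SoftMax. A softmax normalizes across channels, so with one channel its output is 1 for every input, whatever the score. The sketch below demonstrates this with a hand-written softmax in plain MATLAB; it is not the MatConvNet implementation, and the names are illustrative.

```matlab
% Softmax across the third (channel) dimension, written with bsxfun so it
% also runs on older MATLAB releases.
expShifted = @(x) exp(bsxfun(@minus, x, max(x, [], 3)));
softmax3   = @(x) bsxfun(@rdivide, expShifted(x), sum(expShifted(x), 3));

% Four samples with one channel each: 1 x 1 x 1 x 4.
oneChannel = reshape(single([-5 0.3 12 100]), 1, 1, 1, 4);
p1 = softmax3(oneChannel);
disp(squeeze(p1)');        % every entry is 1, regardless of the score

% Three channels for one sample: the outputs differ and sum to 1.
threeChannels = reshape(single([1 2 3]), 1, 1, 3);
p3 = softmax3(threeChannels);
disp(squeeze(p3)');
```

For a regression target, one option is to feed the single-channel 'prediction' variable straight into the L2 loss without the softmax. Whether that alone makes training converge here would need to be tested.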

Constant error in neural network, MatConvNet

Solved: Previously my dataset had around 1000 images. I increased it to 50 000 and now the neural network learns and works.
I have created a convolutional neural network for recognizing three emotions from facial expressions (positive, neutral, negative). Somehow, my error does not improve (error image). Training and validation error stay constant for 100 epochs. What could be the reason?
Why is the error constant?
Here's my code:
function training(varargin)
setup ;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
rngNum = 1; % rng number for random weight initialization, e.g., 1,2,3
num_fcHiddenNeuron =1024; % # neurons in the fully-connected hidden layer
prob_fcDropout = 0.5; % dropout probability in the fully-connected hidden layer,
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% input data for training deep CNNs
imdb1 = load(['trainingdata']) ;
imdb2 = load(['testdata']) ;
imdb.images.data = cat(4, imdb1.images.data, imdb2.images.data);
imdb.images.labels = cat(2, imdb1.images.labels, imdb2.images.labels);
imdb.images.set = cat(2, imdb1.images.set, imdb2.images.set);
imdb.meta = imdb1.meta;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
trainOpts.batchSize = 200 ;
trainOpts.numEpochs = 100 ;
trainOpts.gpus = [] ;
trainOpts.continue = true ;
trainOpts.learningRate = [0.004*ones(1,25), 0.002*ones(1,25), 0.001*ones(1,25), 0.0005*ones(1,25)];
trainOpts = vl_argparse(trainOpts, varargin);
%% Training Deep CNNs
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% CNN configuration
net.layers = {} ;
% %
% % %% Conv1 - MaxPool1
rng(rngNum) %control random number generation
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(3,3,1,32, 'single'), 0.1*ones(1, 32, 'single')}}, ...
'stride', 1, ...
'pad', 1, ...
'filtersLearningRate', 1, ...
'biasesLearningRate', 1, ...
'filtersWeightDecay', 1/5, ...
'biasesWeightDecay', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
'method', 'max', ...
'pool', [2 2], ...
'stride', 2, ...
'pad', 0) ;
% %%% Conv2 - MaxPool2
rng(rngNum)
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(3,3,32,32, 'single'), 0.1*ones(1, 32, 'single')}}, ...
'stride', 1, ...
'pad', 0, ...
'filtersLearningRate', 1, ...
'biasesLearningRate', 1, ...
'filtersWeightDecay', 1/5, ...
'biasesWeightDecay', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
'method', 'max', ...
'pool', [2 2], ...
'stride', 2, ...
'pad', [1, 0, 1, 0]) ;
% %%% Conv3 - MaxPool3
rng(rngNum)
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(3,3,32,64, 'single'), 0.1*ones(1, 64, 'single')}}, ...
'stride', 1, ...
'pad', 1, ...
'filtersLearningRate', 1, ...
'biasesLearningRate', 1, ...
'filtersWeightDecay', 1/5, ...
'biasesWeightDecay', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', ...
'method', 'max', ...
'pool', [2 2], ...
'stride', 2, ...
'pad', 0) ;
% %%% Fc Hidden
rng(rngNum)
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.001*randn(5,5,64,num_fcHiddenNeuron, 'single'), 0.01*ones(1, num_fcHiddenNeuron, 'single')}}, ...
'stride', 1, ...
'pad', 0, ...
'filtersLearningRate', 1, ...
'biasesLearningRate', 1, ...
'filtersWeightDecay', 1/5, ...
'biasesWeightDecay', 0) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'dropout', ...
'rate', prob_fcDropout) ;
%
% %%% Fc Output
rng(rngNum)
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{zeros(1,1,num_fcHiddenNeuron, 3, 'single'), zeros(1, 3, 'single')}}, ...
'stride', 1, ...
'pad', 0, ...
'filtersLearningRate', 1, ...
'biasesLearningRate', 1, ...
'filtersWeightDecay', 4, ...
'biasesWeightDecay', 0) ;
net.layers{end+1} = struct('type', 'softmaxloss') ;
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% starting to train deep CNN
[net,info] = cnn_train(net, imdb, getBatch(opts), trainOpts, 'val', find(imdb.images.set == 2)) ;
net.layers(end) = [] ;
function fn = getBatch(opts)
% -------------------------------------------------------------------------
fn = @(x,y) getSimpleNNBatch(x,y) ;
end
% -------------------------------------------------------------------------
function [images, labels] = getSimpleNNBatch(imdb, batch)
% -------------------------------------------------------------------------
images = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(1,batch) ;
end

Regression based number estimation?

I'm using AlexNet to train regression based count estimation.
My code is as follows
...
net.addLayer('fc7', dagnn.Conv('size', [1 1 4096 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'bn6'}, {'fc7'}, {'conv7f' 'conv7b'});
net.addLayer('relu7', dagnn.ReLU(), {'fc7'}, {'relu7'}, {});
net.addLayer('bn7',dagnn.BatchNorm('numChannels',4096),{'relu7'},{'bn7'},{'bn7f','bn7b','bn7m'});
classLabels=max(unique(imdb_32.images.labels));
net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 1], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'bn7'}, {'classifier'}, {'conv8f' 'conv8b'});
net.addLayer('prediction', dagnn.SoftMax(), {'classifier'}, {'prediction'}, {});
net.addLayer('objective', dagnn.Loss('loss','logistic'), {'prediction', 'label'}, {'objective'}, {});
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prediction','label'}, 'error') ;
But the prediction is constant and negative for all input images. What am I doing wrong here?

Using AlexNet for prediction on new data after training

I am using Hands-on DL Tutorial (http://www.cvc.uab.es/~gros/index.php/hands-on-deep-learning-with-matconvnet/) for understanding how Convolutional Neural Networks (CNNs) work.
To start, I compiled MatConvNet and ran AlexNet with the network structure as follows:
net = dagnn.DagNN() ;
% special padding for CIFAR-10
net.addLayer('conv1', dagnn.Conv('size', [11 11 3 96], 'hasBias', true, 'stride', [4, 4], 'pad', [20 20 20 20]), {'input'}, {'conv1'}, {'conv1f' 'conv1b'});
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
net.addLayer('lrn1', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu1'}, {'lrn1'}, {});
net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn1'}, {'pool1'}, {});
net.addLayer('conv2', dagnn.Conv('size', [5 5 48 256], 'hasBias', true, 'stride', [1, 1], 'pad', [2 2 2 2]), {'pool1'}, {'conv2'}, {'conv2f' 'conv2b'});
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
net.addLayer('lrn2', dagnn.LRN('param', [5 1 2.0000e-05 0.7500]), {'relu2'}, {'lrn2'}, {});
net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [3, 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'lrn2'}, {'pool2'}, {});
net.addLayer('conv3', dagnn.Conv('size', [3 3 256 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'pool2'}, {'conv3'}, {'conv3f' 'conv3b'});
net.addLayer('relu3', dagnn.ReLU(), {'conv3'}, {'relu3'}, {});
net.addLayer('conv4', dagnn.Conv('size', [3 3 192 384], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu3'}, {'conv4'}, {'conv4f' 'conv4b'});
net.addLayer('relu4', dagnn.ReLU(), {'conv4'}, {'relu4'}, {});
net.addLayer('conv5', dagnn.Conv('size', [3 3 192 256], 'hasBias', true, 'stride', [1, 1], 'pad', [1 1 1 1]), {'relu4'}, {'conv5'}, {'conv5f' 'conv5b'});
net.addLayer('relu5', dagnn.ReLU(), {'conv5'}, {'relu5'}, {});
net.addLayer('pool5', dagnn.Pooling('method', 'max', 'poolSize', [3 3], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu5'}, {'pool5'}, {});
net.addLayer('fc6', dagnn.Conv('size', [1 1 256 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool5'}, {'fc6'}, {'conv6f' 'conv6b'});
net.addLayer('relu6', dagnn.ReLU(), {'fc6'}, {'relu6'}, {});
net.addLayer('fc7', dagnn.Conv('size', [1 1 4096 4096], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu6'}, {'fc7'}, {'conv7f' 'conv7b'});
net.addLayer('relu7', dagnn.ReLU(), {'fc7'}, {'relu7'}, {});
net.addLayer('classifier', dagnn.Conv('size', [1 1 4096 10], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu7'}, {'classifier'}, {'conv8f' 'conv8b'});
net.addLayer('prob', dagnn.SoftMax(), {'classifier'}, {'prob'}, {});
net.addLayer('objective', dagnn.Loss('loss', 'log'), {'prob', 'label'}, {'objective'}, {});
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prob','label'}, 'error') ;
I load the dataset (imdb_cifar10.mat) and train the network:
imdb_cifar10 = load('../data/imdb_cifar10.mat');
[net_alexnet, info] = alexnet_train(imdb_cifar10, 'results/cifar_10_experiment_1');
After 10 epochs, I had 10 net-epoch-x.mat files. I would then like to load one of these files to test on an image, but without success:
net = load('results/cifar_10_experiment_1/net-epoch-10.mat');
net = dagnn.DagNN.loadobj(net.net);
net.meta.classes.description = imdb_cifar10.meta.classes;
im = imread('../data/dog.jpg');
inference_classification(im, net);
Where:
function inference_classification(im, net)
im_ = single(im) ; % note: 0-255 range
im_ = imresize(im_, net.meta.normalization.imageSize(1:2));
im_ = im_ - net.meta.normalization.averageImage;
% run the CNN
net.eval({'input', im_});
% obtain the CNN output
scores = net.vars(net.getVarIndex('prob')).value;
scores = squeeze(gather(scores));
% show the classification results
[bestScore, best] = max(scores);
figure(1) ; clf ; imagesc(im);
title(sprintf('%s (%d), score %.3f', net.meta.classes.description{best}, best, bestScore));
end
MATLAB reported an error at im_ = imresize(im_, net.meta.normalization.imageSize(1:2));
Error
I tried running the code with different versions of MatConvNet (e.g., matconvnet-1.0-beta16 and matconvnet-1.0-beta23), but the result is the same. Could you please give me some advice on solving this problem?
Thank you so much for your time.
Best regards,
An Nhien
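Since the screenshot of the error is not reproduced here, the cause can only be guessed. One possibility is that the training script never filled in net.meta.normalization, in which case net.meta.normalization.imageSize does not exist when inference_classification reads it. Below is a hedged sketch of a guard, using a plain struct as a stand-in for net.meta and a hypothetical 32 x 32 fallback size (CIFAR-10 images are 32 x 32). The field layout of the stand-in is an assumption for illustration only.

```matlab
% Stand-in for net.meta after loading; here it only carries class names.
meta = struct('classes', struct('description', {{'airplane', 'automobile'}}));

fallbackSize = [32 32];   % assumption: CIFAR-10 sized input

% Only read imageSize if both levels of the struct actually exist.
hasSize = isfield(meta, 'normalization') && ...
          isfield(meta.normalization, 'imageSize');
if hasSize
    targetSize = meta.normalization.imageSize(1:2);
else
    targetSize = fallbackSize;
    fprintf('normalization.imageSize missing; using %d x %d\n', targetSize);
end
```

The same kind of check would apply to net.meta.normalization.averageImage on the following line of inference_classification.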