When designing a CNN for 1D time-series signal classification in MATLAB, I get an error saying that the 2-D convolutional layer does not take sequences as input. From my understanding, it is perfectly possible to convolve an "array" with a 3x1 filter. To resolve this issue, MATLAB suggests using a "sequence folding layer".
I get the following error message:
What would be the function of such a sequence folding layer, and how would the architecture need to be changed?
Simply speaking, you're passing in a sequence (or, as you called it, "an array") of images, whereas you need to convert it into a batch of images before performing any convolution operations.
Documentation about sequenceFoldingLayer:
A sequence folding layer converts a batch of image sequences to a batch of images. Use a sequence folding layer to perform convolution operations on time steps of image sequences independently.
Regarding the usage of a sequenceFoldingLayer (once again suggested to check out the documentation):
To use a sequence folding layer, you must connect the miniBatchSize output to the miniBatchSize input of the corresponding sequence unfolding layer. For an example, see Create Network for Video Classification.
That documentation page also has several examples of how to create a sequence folding layer, e.g. one with the name "fold1":
layer = sequenceFoldingLayer('Name','fold1')
... as well as examples on how to properly implement it within your project.
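For the 1-D time-series case from the question, here is a minimal sketch of how the folding/unfolding pair could be wired in (numFeatures, the 16 filters, the 50 hidden units and numClasses are placeholder values, not taken from the question):
numFeatures = 64;   % hypothetical length of the feature vector at each time step
numClasses  = 5;    % hypothetical number of classes
layers = [ ...
    sequenceInputLayer([numFeatures 1 1],'Name','input')  % each time step treated as a numFeatures-by-1 "image"
    sequenceFoldingLayer('Name','fold')                    % sequence of images -> batch of images
    convolution2dLayer([3 1],16,'Name','conv')             % 3x1 filter along the feature dimension
    reluLayer('Name','relu')
    sequenceUnfoldingLayer('Name','unfold')                % batch of images -> sequence again
    flattenLayer('Name','flatten')
    lstmLayer(50,'OutputMode','last','Name','lstm')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];
lgraph = layerGraph(layers);
% the folding and unfolding layers must share the mini-batch size
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');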
What would be the function of such a sequence folding layer and how would the architecture need to be changed?
Experimentally, sequenceFoldingLayer/sequenceUnfoldingLayer works as follows; the two sets of results compared at the end of the script (the values in the red box in the figure) are consistent.
%% Testing the sequenceFoldingLayer/sequenceUnfoldingLayer internal flow
% Difference between using and not using the sequence layers: compare the features extracted with the sequenceFolding/sequenceUnfoldingLayer pair against a plain image network
% Common parameters used below
inputSize = [28 28 1];
filterSize = 5;
numFilters = 20;
numHiddenUnits = 200;
numClasses = 10;
inputData = rand(28,28,1,30); % h*w*c*n
convL = convolution2dLayer(filterSize,numFilters,'Name','conv',...
'Weights',ones(filterSize,filterSize,1,numFilters),...
'Bias',ones(1,1,numFilters));
bnL= batchNormalizationLayer('Name','bn',...
'TrainedVariance',ones(1,1,numFilters),...
'TrainedMean',zeros(1,1,numFilters),...
'Offset',zeros(1,1,numFilters),...
'Scale',ones(1,1,numFilters));
%% first stage, use sequence layer
layers = [ ...
sequenceInputLayer(inputSize,'Name','input')
sequenceFoldingLayer('Name','fold')
convL
bnL
reluLayer('Name','relu')
sequenceUnfoldingLayer('Name','unfold')
flattenLayer('Name','flatten')
lstmLayer(numHiddenUnits,'OutputMode','last','Name','lstm',...
'InputWeights',ones(800,11520),...
'RecurrentWeights',ones(800,200),...
'Bias',ones(800,1));
fullyConnectedLayer(numClasses, 'Name','fc',...
'Weights',ones(numClasses,200),...
'Bias',ones(numClasses,1));
softmaxLayer('Name','softmax')
classificationLayer('Name','classification')];
lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');
assembleNet = assembleNetwork(lgraph);
%% second stage: without the sequence layers
% The sequenceFoldingLayer essentially extracts features from each frame (image) independently, as in the plain image network below
layers2 = [ ...
imageInputLayer(inputSize,'Name','input','Normalization','none')
convL
bnL
reluLayer('Name','relu')
fullyConnectedLayer(numClasses, 'Name','fc',...
'Weights',ones(numClasses,11520),...
'Bias',ones(numClasses,1));
softmaxLayer('Name','softmax')
classificationLayer('Name','classification')];
lgraph2 = layerGraph(layers2);
assembleNet2 = assembleNetwork(lgraph2);
%% visualize
analyzeNetwork(lgraph)
analyzeNetwork(lgraph2)
%% comparison
out1 = activations(assembleNet,inputData,'unfold','OutputAs','channels');
out2 = activations(assembleNet2,inputData,'relu','OutputAs','channels');
t1 = out1{1};
t2 = out2;
%% results are the same
t1(1:10)
t2(1:10)
I have a matrix which is 256x192x80. I want to normalize all slices (the third dimension, of size 80, indexes the slices) without using a for loop.
The way I'm doing it with a for loop is below (im_dom_raw is our matrix):
normalized_raw = zeros(size(im_dom_raw));
for a=1:80
slice_raw = im_dom_raw(:,:,a);
slice_raw = slice_raw-min(slice_raw(:));
slice_raw = slice_raw/(max(slice_raw(:)));
normalized_raw(:,:,a) = slice_raw;
end
The code below implements your normalization approach without using loops. It's based on bsxfun.
% Shift all values to the positive side
slices_raw = bsxfun(@minus,im_dom_raw,min(min(im_dom_raw)));
% Normalize all values with respect to the slice maximum (with input from @Daniel)
normalized_raw2 = bsxfun(@rdivide,slices_raw,max(max(slices_raw)));
% A slightly faster approach would be
%normalized_raw2 = bsxfun(@times,slices_raw,max(max(slices_raw)).^-1);
% ... but it will differ slightly from your approach due to numerical approximation
% Comparison to your previous loop based implementation
sum(abs(normalized_raw(:)-normalized_raw2(:)))
The last line of code outputs
ans =
0
which (thanks to @Daniel) means that both approaches yield exactly the same results.
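As a side note, if you are on MATLAB R2016b or newer, implicit expansion lets you drop bsxfun entirely; a minimal sketch of the same per-slice normalization:
% Same normalization via implicit expansion (requires R2016b or newer)
slices_raw3 = im_dom_raw - min(min(im_dom_raw));         % shift each slice so its minimum is 0
normalized_raw3 = slices_raw3 ./ max(max(slices_raw3));  % scale each slice so its maximum is 1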
I am trying to build a neural network and have the following code:
for i = 1:num_samples-num_reserved
% Getting the sample and transposing it so it can be multiplied
sample = new_data_a(:,i)';
% Normalizing the input vector
sample = sample/norm(sample);
% Calculating output
outputs = sample*connections;
% Neuron that fired the hardest's index (I) and its output (output)
[output, I] = max(outputs);
% Conections leading to this neuron
neuron_connections = connections(:,I);
% Looping through input components
for j = 1:num_features
% Value of this input component
component_input = sample(j);
% Updating connection weights
delta = 0.7*(component_input - neuron_connections(j));
neuron_connections(j) = neuron_connections(j) + delta;
end
% Copying new connection weights into original matrix
connections(:,I) = neuron_connections/norm(neuron_connections);
if(rem(i,100) == 0)
if(I == 1)
delta_track = [delta_track connections(2,I)];
end
end
% Storing current connections
end
I think I'm doing everything right. This loop is repeated about 600 times so as to progressively update connections. The function to update weights I am using is the standard one I found in textbooks.
However, when I look at the values stored in delta_track, these keep oscillating, forming a regular pattern.
Any advice?
You can reduce the feedback factor. Then the network may require more time to learn but is less likely to oscillate.
Another common technique is to add a decay, i.e. reducing the factor each iteration.
In general, neural networks obey the same stability rules as control systems (because, as long as an NN is learning, it is a control system), so similar approaches work as for e.g. PID controllers.
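For example, a minimal sketch of such a decay schedule (the 0.7 starting value matches your code; the decay factor and iteration count are just illustrative):
eta0  = 0.7;                          % your current fixed update factor
decay = 0.995;                        % hypothetical multiplicative decay per sample
nIter = 600;                          % roughly the number of updates you mention
eta   = eta0 * decay.^(0:nIter-1);    % learning rate at each iteration
plot(eta), xlabel('iteration'), ylabel('learning rate')
% In your weight update, replace the fixed 0.7 with eta(i):
% delta = eta(i)*(component_input - neuron_connections(j));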
I am trying to implement Bag of Words in OpenCV and have come up with the implementation below. I am using the Caltech 101 database. However, since it is my first time and I am not familiar with it, I have planned to use two image sets from the database: the chair image set and the soccer ball image set. I have coded the SVM using this.
Everything went all right, except that when I call classifier.predict(descriptor), I do not get the label value as intended. I always get 0 instead of 1, irrespective of my test image. The number of images in the chair dataset is 10 and in the soccer ball dataset is 10. I labelled chair as 0 and soccer ball as 1. The links represent the samples of each category; the top 10 are chairs, the bottom 10 are soccer balls.
function hello
clear all; close all; clc;
detector = cv.FeatureDetector('SURF');
extractor = cv.DescriptorExtractor('SURF');
links = {
'http://i.imgur.com/48nMezh.jpg'
'http://i.imgur.com/RrZ1i52.jpg'
'http://i.imgur.com/ZI0N3vr.jpg'
'http://i.imgur.com/b6lY0bJ.jpg'
'http://i.imgur.com/Vs4TYPm.jpg'
'http://i.imgur.com/GtcwRWY.jpg'
'http://i.imgur.com/BGW1rqS.jpg'
'http://i.imgur.com/jI9UFn8.jpg'
'http://i.imgur.com/W1afQ2O.jpg'
'http://i.imgur.com/PyX3adM.jpg'
'http://i.imgur.com/U2g4kW5.jpg'
'http://i.imgur.com/M8ZMBJ4.jpg'
'http://i.imgur.com/CinqIWI.jpg'
'http://i.imgur.com/QtgsblB.jpg'
'http://i.imgur.com/SZX13Im.jpg'
'http://i.imgur.com/7zVErXU.jpg'
'http://i.imgur.com/uUMGw9i.jpg'
'http://i.imgur.com/qYSkqEg.jpg'
'http://i.imgur.com/sAj3pib.jpg'
'http://i.imgur.com/DMPsKfo.jpg'
};
N = numel(links);
trainer = cv.BOWKMeansTrainer(100);
train = struct('val',repmat({' '},N,1),'img',cell(N,1), 'pts',cell(N,1), 'feat',cell(N,1));
for i=1:N
train(i).val = links{i};
train(i).img = imread(links{i});
if ndims(train(i).img) > 2
train(i).img = rgb2gray(train(i).img);
end;
train(i).pts = detector.detect(train(i).img);
train(i).feat = extractor.compute(train(i).img,train(i).pts);
end;
for i=1:N
trainer.add(train(i).feat);
end;
dictionary = trainer.cluster();
extractor = cv.BOWImgDescriptorExtractor('SURF','BruteForce');
extractor.setVocabulary(dictionary);
for i=1:N
desc(i,:) = extractor.compute(train(i).img,train(i).pts);
end;
a = zeros(1,10)';
b = ones(1,10)';
labels = [a;b];
classifier = cv.SVM;
classifier.train(desc,labels);
test_im =rgb2gray(imread('D:\ball1.jpg'));
test_pts = detector.detect(test_im);
test_feat = extractor.compute(test_im,test_pts);
val = classifier.predict(test_feat);
disp('Value is: ')
disp(val)
end
These are my test samples: a soccer ball image (source: timeslive.co.za) and a chair image.
Searching through this site, I think that my algorithm is okay, even though I am not quite confident about it. If anybody can help me find the bug, it would be appreciated.
Following Amro's code, this was my result:
Distribution of classes:
Value Count Percent
1 62 49.21%
2 64 50.79%
Number of training instances = 61
Number of testing instances = 65
Number of keypoints detected = 38845
Codebook size = 100
SVM model parameters:
svm_type: 'C_SVC'
kernel_type: 'RBF'
degree: 0
gamma: 0.5063
coef0: 0
C: 62.5000
nu: 0
p: 0
class_weights: 0
term_crit: [1x1 struct]
Confusion matrix:
ans =
29 1
1 34
Accuracy = 96.92 %
Your logic looks fine to me.
Now I guess you'll have to tweak the various parameters if you want to improve the classification accuracy. This includes the clustering algorithm parameters (such as the vocabulary size, clusters initialization, termination criteria, etc..), the SVM parameters (kernel type, the C coefficient, ..), the local features algorithm used (SIFT, SURF, ..).
Ideally, whenever you want to perform parameter selection, you ought to use cross-validation. Some methods already have such mechanism embedded (CvSVM::train_auto for instance), but for the most part you'll have to do this manually...
Finally you should follow general machine learning guidelines; see the whole bias-variance tradeoff dilemma. The online Coursera ML class discusses this topic in detail in week 6, and explains how to perform error analysis and use learning curves to decide what to try next (do we need to add more instances, increase model complexity, and so on..).
With that said, I wrote my own version of the code. You might wanna compare it with your code:
% dataset of images
% I previously saved them as: chair1.jpg, ..., ball1.jpg, ball2.jpg, ...
d = [
dir(fullfile('images','chair*.jpg')) ;
dir(fullfile('images','ball*.jpg'))
];
% local-features algorithm used
detector = cv.FeatureDetector('SURF');
extractor = cv.DescriptorExtractor('SURF');
% extract local features from images
t = struct();
for i=1:numel(d)
% load image as grayscale
img = imread(fullfile('images', d(i).name));
if ~ismatrix(img), img = rgb2gray(img); end
% extract local features
pts = detector.detect(img);
feat = extractor.compute(img, pts);
% store along with class label
t(i).img = img;
t(i).class = find(strncmp(d(i).name,{'chair','ball'},4));
t(i).pts = pts;
t(i).feat = feat;
end
% split into training/testing sets
% (a better way would be to use cvpartition from Statistics toolbox)
disp('Distribution of classes:')
tabulate([t.class])
tTrain = t([1:7 11:17]);
tTest = t([8:10 18:20]);
fprintf('Number of training instances = %d\n', numel(tTrain));
fprintf('Number of testing instances = %d\n', numel(tTest));
% build visual vocabulary (by clustering training descriptors)
K = 100;
bowTrainer = cv.BOWKMeansTrainer(K, 'Attempts',5, 'Initialization','PP');
clust = bowTrainer.cluster(vertcat(tTrain.feat));
fprintf('Number of keypoints detected = %d\n', numel([tTrain.pts]));
fprintf('Codebook size = %d\n', K);
% compute histograms of visual words for each training image
bowExtractor = cv.BOWImgDescriptorExtractor('SURF', 'BruteForce');
bowExtractor.setVocabulary(clust);
M = zeros(numel(tTrain), K);
for i=1:numel(tTrain)
M(i,:) = bowExtractor.compute(tTrain(i).img, tTrain(i).pts);
end
labels = vertcat(tTrain.class);
% train an SVM model (perform parameter selection using cross-validation)
svm = cv.SVM();
svm.train_auto(M, labels, 'SvmType','C_SVC', 'KernelType','RBF');
disp('SVM model parameters:'); disp(svm.Params)
% evaluate classifier using testing images
actual = vertcat(tTest.class);
pred = zeros(size(actual));
for i=1:numel(tTest)
descs = bowExtractor.compute(tTest(i).img, tTest(i).pts);
pred(i) = svm.predict(descs);
end
% report performance
disp('Confusion matrix:')
confusionmat(actual, pred)
fprintf('Accuracy = %.2f %%\n', 100*nnz(pred==actual)./numel(pred));
Here is the output:
Distribution of classes:
Value Count Percent
1 10 50.00%
2 10 50.00%
Number of training instances = 14
Number of testing instances = 6
Number of keypoints detected = 6300
Codebook size = 100
SVM model parameters:
svm_type: 'C_SVC'
kernel_type: 'RBF'
degree: 0
gamma: 0.5063
coef0: 0
C: 312.5000
nu: 0
p: 0
class_weights: []
term_crit: [1x1 struct]
Confusion matrix:
ans =
3 0
1 2
Accuracy = 83.33 %
So the classifier correctly labels 5 out of 6 images from the test set, which is not bad for a start :) Obviously you'll get different results each time you run the code due to the inherent randomness of the clustering step.
What is the number of images you are using to build your dictionary, i.e. what is N? From your code, it seems that you are only using 10 images (those listed in links). I hope this list was truncated for the post, because otherwise that would be too few. Typically you need on the order of 1000 or more images to build the dictionary, and the images need not be restricted to only the 2 classes you are classifying. Otherwise, with only 10 images and 100 clusters, your dictionary is likely to be messed up.
Also, you might want to use SIFT as a first choice as it tends to perform better than the other descriptors.
Lastly, you can also debug by checking the detected keypoints. You can get OpenCV to draw the keypoints. Sometimes your keypoint detector parameters are not set properly, resulting in too few keypoints getting detected, which in turn gives poor feature vectors.
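For instance, a quick sanity check using the same mexopencv setup as in your code (cv.drawKeypoints is assumed to be available in your mexopencv build; the file path is just the test image you already use):
img = imread('D:\ball1.jpg');                 % any of your training/test images
if ndims(img) > 2, img = rgb2gray(img); end
detector = cv.FeatureDetector('SURF');
pts = detector.detect(img);
fprintf('Detected %d keypoints\n', numel(pts));
imshow(cv.drawKeypoints(img, pts));           % inspect coverage and density visually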
To understand more about the BOW algorithm, you can take a look at these posts here and here. The second post has a link to a free PDF of an O'Reilly book on computer vision using Python. The BOW model (and other useful stuff) is described in more detail in that book.
Hope this helps.
My question is specific to the "learn_params()" function of the BayesNetToolbox in MATLAB. In the user manual, "learn_params()" is stated to be suitable for use only if the input data is fully observed. I have tried it with a partially observed dataset where I represented unobserved values as NaNs.
It seems like "learn_params()" can deal with NaNs and with node state combinations that do not occur in the dataset. When I apply Dirichlet priors to smooth the zero values, I get 'sensible' MLE distributions for all nodes. I have copied the script where I do this.
Can someone clarify whether what I am doing makes sense, or whether I am missing something, i.e. the reason why "learn_params()" cannot be used with partially observed data?
The MATLAB script where I test this is here:
% Incomplete dataset (where NaN's are unobserved)
Age = [1,2,2,NaN,3,3,2,1,NaN,2,1,1,3,NaN,2,2,1,NaN,3,1];
TNMStage = [2,4,2,3,NaN,1,NaN,3,1,4,3,NaN,2,4,3,4,1,NaN,2,4];
Treatment = [2,3,3,NaN,2,NaN,4,4,3,3,NaN,2,NaN,NaN,4,2,NaN,3,NaN,4];
Survival = [1,2,1,2,2,1,1,1,1,2,2,1,2,2,1,2,1,2,2,1];
matrixdata = [Age;TNMStage;Treatment;Survival];
node_sizes =[3,4,4,2];
% Enter the variablesmap
keys = {'Age', 'TNM','Treatment', 'Survival'};
v= 1:1:length(keys);
VariablesMap = containers.Map(keys,v);
% create the dag and the bnet
N = length(node_sizes); % Instead of entering it manually
dag2 = zeros(N,N);
dag2(VariablesMap('Treatment'),VariablesMap('Survival')) = 1;
bnet21 = mk_bnet(dag2, node_sizes);
draw_graph(bnet21.dag);
dirichletweight=1;
bnet23 = bnet21; % the CPD definitions below expect a network called bnet23
% define the CPD priors you want to use
bnet23.CPD{VariablesMap('Age')} = tabular_CPD(bnet23, VariablesMap('Age'), 'prior_type', 'dirichlet','dirichlet_type', 'unif', 'dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('TNM')} = tabular_CPD(bnet23, VariablesMap('TNM'), 'prior_type', 'dirichlet','dirichlet_type', 'unif', 'dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('Treatment')} = tabular_CPD(bnet23, VariablesMap('Treatment'), 'prior_type', 'dirichlet','dirichlet_type', 'unif','dirichlet_weight', dirichletweight);
bnet23.CPD{VariablesMap('Survival')} = tabular_CPD(bnet23, VariablesMap('Survival'), 'prior_type', 'dirichlet','dirichlet_type', 'unif','dirichlet_weight', dirichletweight);
% Find MLEs from incomplete data with Dirichlet prior CPDs
bnet24 = learn_params(bnet23, matrixdata);
% Look at the new CPT values after parameter estimation has been carried out
CPT24 = cell(1,N);
for i=1:N
s=struct(bnet24.CPD{i}); % violate object privacy
CPT24{i}=s.CPT;
end
According to my understanding of the BNT documentation, you need to make a couple of changes:
Missing values should be represented as empty cells instead of NaN values.
The learn_params_em function is the only one that supports missing values.
My previous response was incorrect, as I mis-recalled which of the BNT learning functions had support for missing values.
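As a minimal sketch of what those two changes could look like for the script above (it reuses bnet23 and matrixdata from the question; the number of EM iterations is an arbitrary choice for illustration):
% Convert the numeric matrix into the cell-array format BNT expects,
% with empty cells (not NaN) marking unobserved values
cases = num2cell(matrixdata);        % nnodes-by-ncases cell array
cases(isnan(matrixdata)) = {[]};     % missing entries become empty cells
% EM-based parameter learning, which supports the missing values
engine = jtree_inf_engine(bnet23);
max_iter = 10;                       % arbitrary number of EM iterations
[bnet_em, LLtrace] = learn_params_em(engine, cases, max_iter);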
I'm working in MATLAB, processing images for steganography. In my work so far, I have been using the block-processing command blockproc to break the image up into blocks to work on it. I'm now looking to start working with two images, the secret and the cover, but I can't find any way to use blockproc with two input matrices instead of one.
Would anyone know of a way to do this?
blockproc allows you to iterate over a single image only, but doesn't stop you from operating on whatever data you would like. The signature of the user function takes as input a "block struct", which contains not only the data field (which is used in all the blockproc examples) but also several other fields, one of which is "location". You can use this to determine "where you are" in your input image and to determine what other data you need to operate on that block.
For example, here's how you could do element-wise multiplication on two same-size images. This is a pretty clunky example, but it's just here to demonstrate how this could look:
im1 = rand(100);
im2 = rand(100);
fun = @(bs) bs.data .* ...
im2(bs.location(1):bs.location(1)+9,bs.location(2):bs.location(2)+9);
im3 = blockproc(im1,[10 10],fun);
im4 = im1 .* im2;
isequal(im3,im4)
Using the "location" field of the block struct you can figure out the appropriate parts of a 2nd, 3rd, 4th, etc. data set you need for that particular block.
hope this helps!
-brendan
I was struggling with the same thing recently and solved it by combining both of my input matrices into a single 3-D matrix, as follows. The commented-out lines were my original code, before introducing block processing. The other problem I had was using variables other than the image matrix in the function: I had to do that part of the calculation first. If someone can simplify it, please let me know!
%%LAB1 - L*a*b nearest neighbour classification
%distance_FG = ((A-FG_A).^2 + (B-FG_B).^2).^0.5;
%distance_BG = ((A-BG_A).^2 + (B-BG_B).^2).^0.5;
distAB = @(bs) ((bs.data(:,:,1)).^2 + (bs.data(:,:,2)).^2).^0.5;
AB = A - FG_A; AB(:,:,2) = B - FG_B;
distance_FG = blockproc(AB, [1000, 1000], distAB);
clear AB
AB = A - BG_A; AB(:,:,2) = B - BG_B;
distance_BG = blockproc(AB, [1000, 1000], distAB);
clear AB
I assume the solution to your problem lies in creating a new matrix that contains both input matrices.
e.g. A(:,:,1) = I1; A(:,:,2) = I2;
Now you can use blockproc on A.
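A toy sketch of that approach (the block size and the element-wise product are just for illustration):
I1 = rand(100); I2 = rand(100);
A = cat(3, I1, I2);                              % channel 1 = I1, channel 2 = I2
fun = @(bs) bs.data(:,:,1) .* bs.data(:,:,2);    % each block carries both channels
out = blockproc(A, [10 10], fun);
isequal(out, I1 .* I2)                           % should return logical 1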