I am trying to learn how to use neural networks in MATLAB, starting with a simple example that uses four data points split into two row vectors. One of them is Input and the other is Temp. The input vector is a vector from 1 to 4.
Next I run some neural network code I found in examples. Now I would like the neural network to predict the outcome for a sample input vector, which is the row vector [5 6].
clear all
clc
Input = [1,2,3,4];
Temp = [.25,.15,.1,.07];
Smpl = [5,6]
net = newff(minmax(Input),[20,1],{'logsig','purelin','trainlm'})
net.trainparam.epochs = 500;
net.trainparam.goal = 1e-25;
net.trainparam.lr = .01;
net = train(net,Input,Temp)
TempPr = net(Input)
error = TempPr - Temp
TempPrSmpl = net(Smpl)
The row vector TempPr generated by the neural network exactly matches the target vector Temp. However, it seems that I am unable to predict new values properly. For example, I try to predict temperature values for inputs 5 and 6, which I expect to be less than .07.
But instead the matlab code is returning:
TempPrSmpl =
0.3560 0.3560
Two questions:
Why is the value being returned from MATLAB greater than .07?
Why are there not two different values being returned from MATLAB (one for 5 and one for 6)?
I have two very large matrices (228453x460) and I want to compute correlation between rows.
for i=1:228453
if(vec1_preprocess(i,1))
for j=1:228453
df = effdf(vec1_preprocess(i,:)',vec2_preprocess(j,:)');
corr_temp = corr(vec1_preprocess(i,:)', vec2_preprocess(j,:)');
p = calculate_p(corr_temp, df);
temp = (meanVec(i)+p)/2;
meanVec(i) = temp;
end
disp(i);
end
end
This takes ~1 day. Is there a more direct way to compute this?
Edit: Code for effdf
function df = effdf(ts1,ts2);
%function df = effdf(ts1,ts2);
ts1=ts1-mean(ts1);
ts2=ts2-mean(ts2);
N=length(ts1);
ac1=xcorr(ts1);
ac1=ac1/max(ac1); % normalized autocorrelation
ac1=ac1(((length(ac1)+3)/2):((length(ac1)+3)/2+floor(N/4)));
ac2=xcorr(ts2);
ac2=ac2/max(ac2); % normalized autocorrelation
ac2=ac2(((length(ac2)+3)/2):((length(ac2)+3)/2+floor(N/4)));
df = 1/((1/N)+(2/N)*sum(((N-(1:length(ac1)))/N)'.*ac1.*ac2));
Since you didn't post the code for your custom functions calculate_p and effdf, I'll assume they are perfectly optimized and are not the bottleneck of your script. Let's focus on what we have.
The first problem I see is:
if (vec1_preprocess(i,1))
A check repeated over 228453 iterations can noticeably increase the running time. Hence, extract only the matrix rows that don't contain a 0 in the first column and perform your calculations on those:
idx = vec1_preprocess(:,1) ~= 0;
vec1_preprocess = vec1_preprocess(idx,:);
for i = 1:size(vec1_preprocess,1)
% ...
end
The second problem is corr. It seems like you are also computing p-values, using calculate_p. Why not use the built-in p-values returned by corr as its second output argument?
[c,p] = corr(A,B);
Alternatively, if Pearson's correlation is what you are looking for, you could replace corr with corrcoef to see if it performs better.
Last but not least (in fact it's the most important thing): is there any reason why you are performing this computation row by row instead of running it on the whole matrices?
If you read the documentation, you'll see that corr computes the correlation between columns, not rows.
To convert rows into columns and columns into rows, simply transpose the matrix:
tmp1 = vec1_preprocess';
tmp2 = vec2_preprocess';
C = corr(tmp1,tmp2);
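Putting the pieces together, a minimal sketch (untested, using the variable names from the question) might look like the following. Note that it ignores the effdf degrees-of-freedom adjustment, and that the full 228453-by-228453 result would be far too large to hold in memory, so in practice the kept rows would have to be processed in blocks:
idx  = vec1_preprocess(:,1) ~= 0;   % keep only rows with a nonzero first column
tmp1 = vec1_preprocess(idx,:)';     % transpose: corr works on columns
tmp2 = vec2_preprocess';
[C, P] = corr(tmp1, tmp2);          % correlations and built-in p-values in one call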
I have implemented a CNN in MATLAB, but my implementation takes too much time. I have identified the most time-consuming part: the max-pooling related code below:
%blockwise operation
fun = @(block_struct) max_matrix(block_struct.data);
%downsampling
maxpool = cell(number_feature_map,1);
for i=1:number_feature_map
maxpool{i}=blockproc(y{i},[2 2],fun);
end
function [maximum]=max_matrix(A)
maximum=max(A(:));
Without this (downsampling) it takes only 2 minutes to converge.
How can I make it efficient?
Instead of blockproc you can use kron to create indices for the blocks and accumarray to apply max to each block. Here it is assumed that the numbers of rows and columns are even, and that the data are random matrices of size [6,8]:
r = 6; c = 8;
idx = kron(reshape(1:(r*c/4),c/2,[]).',ones(2))
for ii=1:number_feature_map
data = rand(r,c);
maxpool{ii} = reshape(accumarray(idx(:),data(:),[],@max),c/2,[]).';
end
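As a quick sanity check, the kron/accumarray result can be compared against blockproc on a small random matrix (a sketch, assuming the Image Processing Toolbox is available for blockproc):
r = 6; c = 8;
data = rand(r,c);
idx  = kron(reshape(1:(r*c/4),c/2,[]).',ones(2));
pool_accum = reshape(accumarray(idx(:),data(:),[],@max),c/2,[]).';
pool_block = blockproc(data,[2 2],@(b) max(b.data(:)));
isequal(pool_accum, pool_block)   % should return 1 (true)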
Summary: This question deals with the improvement of an algorithm for the computation of linear regression.
I have a 3D array (dlMAT) representing monochrome photographs of the same scene taken at different exposure times (the vector IT). Mathematically, every vector along the 3rd dimension of dlMAT represents a separate linear regression problem that needs to be solved. The equation whose coefficients need to be estimated is of the form:
DL = R*IT^P, where DL and IT are obtained experimentally and R and P must be estimated.
The above equation can be transformed into a simple linear model after applying a logarithm:
log(DL) = log(R) + P*log(IT) => y = a + b*x
Presented below is the most "naive" way to solve this system of equations, which essentially involves iterating over all "3rd dimension vectors" and fitting a polynomial of order 1 to (IT, DL(ind1,ind2,:)):
%// Define some nominal values:
R = 0.3;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(3)+P;
rMAT = 0.1*randn(3)+R;
%// Generate "fake" observation data:
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%// Regression:
sol = cell(size(rMAT)); %// preallocation
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedP = cellfun(@(x)x(1),sol); %// Estimate of pMAT
fittedR = cellfun(@(x)exp(x(2)),sol); %// Estimate of rMAT
The above approach seems like a good candidate for vectorization, since it does not utilize MATLAB's main strength, that is, MATrix operations. For this reason, it does not scale very well and takes much longer to execute than I think it should.
There exist alternative ways to perform this computation based on matrix division, as demonstrated here and here, which involve something like this:
sol = [ones(size(x)),log(x)]\log(y);
That is, appending a vector of 1s to the observations, followed by mldivide to solve the equation system.
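For a single "3rd-dimension vector" this would look roughly as follows (a sketch using the variable names from the snippet above; R_est and P_est are just illustrative names):
x = log(IT(:));
y = log(squeeze(dlMAT(ind1,ind2,:)));
sol = [ones(size(x)), x] \ y;   % sol(1) = log(R), sol(2) = P
R_est = exp(sol(1));
P_est = sol(2);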
The main challenge I'm facing is how to adapt my data to the algorithm (or vice versa).
Question #1: How can the matrix-division-based solution be extended to solve the problem presented above (and potentially replace the loops I am using)?
Question #2 (bonus): What is the principle behind this matrix-division-based solution?
The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix. The question discusses a linear problem (linear regression), and those can always be formulated as a matrix problem, which \ (mldivide) can solve in a mean-square error sense. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer.
Below is benchmarking code that compares the original solution with two alternatives suggested in chat (1, 2):
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
for ind2 = 1:size(dlMAT,2)
sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).'; %'
end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(@power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason why method 2 can be vectorized into method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix. If A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the right-hand side with mldivide, this means that the divisions A\X(:,n) can be done in one go for all n with A\X. The same holds for an overdetermined system (linear regression problem), in which there is no exact solution in general, and mldivide finds the matrix that minimizes the mean-square error. In this case too, the operations A\X(:,n) (method 2) can be done in one go for all n with A\X (method 3).
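This can be illustrated with a small numeric sketch (arbitrary sizes, not part of the benchmark): solving all right-hand sides at once gives the same least-squares solutions as solving them column by column.
A = randn(5,2);                 % overdetermined system: 5 equations, 2 unknowns
X = randn(5,3);                 % three right-hand sides
S_all = A \ X;                  % one call for all columns (as in Method 3)
S_col = zeros(2,3);
for n = 1:3
    S_col(:,n) = A \ X(:,n);    % column by column (as in Method 2)
end
max(abs(S_all(:) - S_col(:)))   % ~0, up to round-off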
The implications of improving the algorithm when increasing the size of dlMAT can be seen below:
For the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!
It is also interesting to observe the output of profile (here, for the case of 500*500):
[Profiler screenshots for Method 1, Method 2, and Method 3]
From the above it is seen that rearranging the elements via squeeze and flipud takes up about half (!) of the runtime of Method 2. It is also seen that some time is lost on the conversion of the solution from cells to matrices.
Since the 3rd solution avoids all of these pitfalls, as well as the loops altogether (which mostly means re-evaluation of the script on every iteration), it unsurprisingly results in a considerable speedup.
Notes:
There was very little difference between the "compressed" and the "explicit" versions of Method 3, slightly in favor of the "explicit" version; for this reason the "compressed" version was not included in the comparison.
A solution was attempted where the inputs to Method 3 were gpuArray-ed. This did not provide improved performance (and even somewhat degraded it), possibly due to a wrong implementation, or the overhead associated with copying matrices back and forth between RAM and VRAM.
I am trying to implement a Naive Bayes Classifier using a dataset published by the UCI machine learning team. I am new to machine learning and am trying to understand techniques to use for my work-related problems, so I thought it best to understand the theory first.
I am using the Pima dataset (Link to Data - UCI-ML), and my goal is to build a Naive Bayes univariate Gaussian classifier for the K-class problem (data is only there for K=2). I have split the data and calculated the mean, standard deviation, and priors for each class, but after this I am stuck because I am not sure what I should be doing next and how. I have a feeling that I should be calculating the posterior probability.
Here is my code. I am using percent as a vector because I want to see the behavior as I increase the training data size within the 80:20 split. Basically, if you pass [10 20 30 40] it will take those percentages of the 80% training portion, e.g. use 10% of the 80% as training.
function[classMean] = naivebayes(file, iter, percent)
dm = load(file);
for i=1:iter
idx = randperm(size(dm.data,1))
%Using same idx for data and labels
shuffledMatrix_data = dm.data(idx,:);
shuffledMatrix_label = dm.labels(idx,:);
percent_data_80 = round((0.8) * length(shuffledMatrix_data));
%Doing 80-20 split
train = shuffledMatrix_data(1:percent_data_80,:);
test = shuffledMatrix_data(percent_data_80+1:length(shuffledMatrix_data),:);
train_labels = shuffledMatrix_label(1:percent_data_80,:)
test_labels = shuffledMatrix_label(percent_data_80+1:length(shuffledMatrix_data),:); %test labels must come from shuffledMatrix_label, not shuffledMatrix_data
%Getting the array of percents
for pRows = 1:length(percent)
percentOfRows = round((percent(pRows)/100) * length(train));
new_train = train(1:percentOfRows,:)
new_trin_label = shuffledMatrix_label(1:percentOfRows)
%get unique labels in training
numClasses = size(unique(new_trin_label),1)
classMean = zeros(numClasses,size(new_train,2));
for kclass=1:numClasses
classMean(kclass,:) = mean(new_train(new_trin_label == kclass,:))
std(new_train(new_trin_label == kclass,:))
priorClassforK = length(new_train(new_trin_label == kclass))/length(new_train)
priorClassforK_1 = 1 - priorClassforK
end
end
end
end
First, compute the prior probability of every class label based on frequency counts. Then, for a given sample and a given class in your data set, compute the conditional probability of every feature. After that, multiply the conditional probabilities of all features in the sample together, and multiply the product by the prior probability of the considered class label. Finally, compare the values across all class labels and choose the label of the class with the maximum probability (Bayes classification rule).
For computing the conditional probabilities, you can simply use the normal (Gaussian) density function, with the per-class mean and standard deviation you already computed.
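A minimal sketch of that classification step (not the poster's code; classStd and classPrior are assumed to have been stored per class in the loop above, analogously to classMean, and normpdf from the Statistics Toolbox is assumed to be available). For one test sample x, a 1-by-D row of features:
logPost = zeros(numClasses,1);
for kclass = 1:numClasses
    % sum of log Gaussian densities over the features, plus the log prior
    logPost(kclass) = sum(log(normpdf(x, classMean(kclass,:), classStd(kclass,:)))) ...
        + log(classPrior(kclass));
end
[~, predictedClass] = max(logPost);   % Bayes rule: pick the most probable class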