Random Sampling/Matlab/Matrix - matlab

I am trying to create a set of 320 matrices, each having dimensions 1152 x 241. Each matrix represents a different time step. I am trying to populate each cell, using a random value from another file. This other file is also dimensioned 1152 x 241, but there are ~2520 time steps from which to choose.
So what is supposed to happen is pick a cell, populate with a value from a random time step from the big file, and move onto the adjacent cell and do the same thing. Repeat until 320 matrices have been created.
Problem is I run the code and I only create one matrix. What do I need to do to fix my code so that 320 matrices are created? Thanks!
clear all;
clc;
% Load datafile
load 1979_1999_tropics_subset_3mmhr.mat
% Create empty maps
rain_fake_timeseries = zeros(1152,241,320);
for i = 1:1152; % set longitude
%disp(i)
for j = 1:241; % set latitude
%disp(j)
%for k = 1:320; % create map
%disp(k)
rain_fake_timeseries = datasample(rain_sample_1979_1999,1,3);
%disp(rain_fake_timeseries)
%save random_clus_fake_timeseries.mat rain_fake_timeseries -v7.3;
%end
end
end
save random_clus_fake_timeseries.mat rain_fake_timeseries -v7.3;

This is because you are not properly indexing into your time series array to store the data. What you are doing is that you are only saving the last randomly chosen slice in your time series array. If you look at your loop closely, you are simply overwriting the output array at each iteration of the for loop.
You are also not creating your for loop correctly. If I understand you correctly, each location in a slice represents a unique (x,y) coordinate. For each matrix that you have, you want to sample from this exact same location but temporally search through your ~2500 time instances. As such, you need to use all of your loop variables i, j and k to index into your 3D matrix. You also need to access all time slices at position (i,j) and randomly sample from all of the slices. If I can suggest a small optimization change, we can do this with only two for loops rather than three, randomly choose 320 points at this position for all of the time slices, and store it into the 3D matrix.
In other words:
clear all;
clc;
% Load datafile
load 1979_1999_tropics_subset_3mmhr.mat
% Create empty maps
rain_fake_timeseries = zeros(1152,241,320);
for i = 1 : size(rain_fake_timeseries,1)
for j = 1 : size(rain_fake_timeseries,2)
rain_fake_timeseries(i,j,:) = datasample(rain_sample_1979_1999(i,j,:), ...
size(rain_fake_timeseries,3), 3);
end
end
save random_clus_fake_timeseries.mat rain_fake_timeseries -v7.3;
Note that I have replaced the dimensions in the for loop with calls to size so that you can easily change the size of the matrices and it'll still work without you having to change any constants.
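If the two loops turn out to be slow, the same per-cell sampling can be written as a single indexing operation. This is only a minimal sketch on a small synthetic array: src, n_out and fake are illustrative names, not variables from the question, and the layout (longitude x latitude x time) is assumed. It uses base MATLAB (randi and sub2ind) instead of datasample and, like datasample's default, samples with replacement.

```matlab
% Synthetic stand-in for rain_sample_1979_1999 (assumed layout: lon x lat x time)
src = rand(4, 3, 10);
n_out = 5;                          % number of fake time steps to generate

[nx, ny, nt] = size(src);

% For every cell and every output step, draw a random source time index
t_idx = randi(nt, [nx, ny, n_out]);

% Row and column subscripts, repeated along the third dimension
[I, J] = ndgrid(1:nx, 1:ny);
I = repmat(I, [1 1 n_out]);
J = repmat(J, [1 1 n_out]);

% Convert (row, column, time) subscripts to linear indices and gather values
fake = src(sub2ind(size(src), I, J, t_idx));
```

Each fake(i,j,k) is taken from src(i,j,:) at a random time step. The index arrays are as large as the output, so at the full 1152 x 241 x 320 size this trades memory for speed and the looped version may still be the more practical choice.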

Related

Saving parfor loop data in workspace (Matlab)

Good evening,
May I please get help with a script I'm writing? I have a parfor loop nested within a for loop. The goal is to iterate over a set of data that consists of 10 data subsets generated from an earlier parsim simulink analysis (it's labeled as 1x10 SimulationOutput). Each data subset is 24 rows deep, and a variable length of columns (usually about 200,000 to 300,000 columns of data). Part of the process is to find the maximum or minimum values in each data set. Once that is done, it is to be put into a table, appending data to that table. Ideally, I should have a 6x10 table by the end of it. See below for the code:
% Run Time
tic
% Preallocate memory to increase speed
b=zeros(24,1); %Make space for this array.
c=zeros(500000,1);
d=zeros(500000,1);
e=zeros(500000,1);
f=zeros(500000,1);
g=zeros(500000,1);
h=zeros(500000,1);
%table=[];
for j = 1:length(out(1,:)) %iterate over each run
parfor i = 1:length(out(1,j).PN.time) % Set length of vector
b=out(1,j).PN.signals.values(:,i); % Find the values to work on
c(i)=b(19,:); % Distance to target (m)
d(i)=b(20,:); % Lat. Accelerations, integrated twice (m)
e(i)=b(21,:); % Long. Acceleration, integrated twice (m)
f(i)=b(22,:); % Lat. Guidance Error
g(i)=b(23,:); % Long. Guidance Error
h(i)=b(24,:); % time to target (sec)
end
%For c_min, there are extraneous zeros popping up, exclude them
tc = c;
tc(tc <= 0) = nan;
[c_min, I_1] = min(tc);
% [c_min,I_1]=min(c(c>0)); % Collect the closest missile/target approach (most critical value)
[d_max,I_2]=max(d); % We need to find the max value per run, but wish for the min value
%over all runs.
[e_max,I_3]=max(e); % We need to find the max value per run, but wish for the min value
%over all runs.
[f_min,I_4]=min(f); % We just want the minimum value here.
[g_min,I_5]=min(g); % We just want the minimum value here.
[h_max,I_6]=max(h); % The minimum time is 2nd most critical value, after distance to
%target.
table(:,j)=[ c_min d_max e_max f_min g_min h_max]; %d_max e_max f_min g_min h_max
end
toc
The issue I am having is this: if I set a constant j value, the correct data ends up in the correct location in the table (for example, with j = 7 the 7th column gets the correct data), but I can't get all the values entered correctly when looping over every run. The resulting 6x10 table has repeated values across columns, values from one column showing up in another column, and so on. It is as if the script can no longer differentiate between columns, so values just go wherever.
If anyone has any advice, I'd greatly appreciate it. Thank you,
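One possible explanation, offered as a guess since the data is not available: c through h are preallocated once with 500000 elements and never reset, so after a shorter run their tails still hold values from an earlier, longer run (or the padding zeros), and min/max then see a mix of runs. A minimal sketch that avoids this by slicing each run's rows directly, with no parfor, is below. The struct array out is a synthetic stand-in for the SimulationOutput, and n_cols and results are hypothetical names.

```matlab
% Synthetic stand-in for the 1x10 SimulationOutput: a struct array whose
% PN.signals.values field is 24 x n, with n varying from run to run.
n_cols = [50 80 30];
n_runs = numel(n_cols);
out = repmat(struct('PN', []), 1, n_runs);
for j = 1:n_runs
    vals_j = rand(24, n_cols(j)) + 0.1;
    out(j).PN = struct('signals', struct('values', vals_j));
end

results = zeros(6, n_runs);         % named results to avoid shadowing table
for j = 1:n_runs
    vals = out(j).PN.signals.values;    % only this run's columns
    c = vals(19, :);
    c(c <= 0) = NaN;                % ignore non-positive distances, as in the question
    results(:, j) = [min(c); max(vals(20, :)); max(vals(21, :)); ...
                     min(vals(22, :)); min(vals(23, :)); max(vals(24, :))];
end
```

Because vals is re-read for every j, nothing carries over between runs, and each column of results depends only on its own run.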

Encode each training image as a histogram of the number of times each vocabulary element shows up for Bag of Visual Words

I want to implement bag of visual words in MATLAB. I used SURF to extract features from the images and k-means to cluster those features into k clusters. I now have k centroids and I want to know how many times each cluster is used by assigning each image feature to its closest neighbor. Finally, I'd like to create a histogram of this for each image.
I tried to use knnsearch function but it doesn't work in this case.
Here is my MATLAB code:
clc;
clear;
close all;
folder = 'CarData/TrainImages/cars';
filePattern = fullfile(folder, '*.pgm');
f=dir(filePattern);
files={f.name};
for k=1:numel(files)
fullFileName = fullfile(folder, files{k});
H = fspecial('log');
image=imfilter(imread(fullFileName),H);
temp = detectSURFFeatures(image);
[im_features, temp] = extractFeatures(image, temp);
features{k}= im_features;
end
features = vertcat(features{:});
image_feats = [];
[assignments,centers] = kmeans(double(features),500);
vocab = centers';
I have all the image features in the features array and the cluster centers in the centers array.
You're almost there. You don't need knnsearch at all. The assignments variable tells you which input feature mapped to which cluster. assignments is an N x 1 vector where N is the total number of examples you have, i.e. the total number of rows in the input matrix features. Each value assignments(i) tells you which cluster example i (row i of features) maps to. The cluster centroid for that example is therefore centers(assignments(i), :).
Given how you've called kmeans, each element of this N x 1 vector is between 1 and 500, with 500 being the total number of clusters desired.
Let's do the simple case where we only have one image in your codebook. If this is the case, all you have to do is create a histogram of the assignments variable. The output histogram h will be a 500-element vector with each element h(i) being the number of times an example used centroid i as its representation in your codebook.
Just use the histcounts function and make sure that you specify the bin edges so that they coincide with each cluster ID. Each bin includes its left edge and excludes its right edge (only the last bin includes both), so you need one extra edge at the end: 501 edges for 500 bins.
Something like this will work:
h = histcounts(assignments, 1 : 501);
If you want something simpler and you don't want to worry about specifying the end bin, you can use accumarray to achieve the same result:
h = accumarray(assignments, 1);
accumarray works on key-value pairs: here the key is the centroid that the example mapped to and the value is simply 1 for all keys. accumarray groups all values that share the same key and applies a function to each group. The default behaviour is to sum the values, which effectively computes the histogram.
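To see that the two approaches agree, here is a small sketch with a made-up assignments vector and 5 clusters instead of 500. The third argument to accumarray is an addition on my part: it fixes the output size so that unused trailing clusters still get a zero count.

```matlab
% Hypothetical cluster IDs for six features, with 5 clusters in total
assignments = [1; 3; 3; 2; 5; 3];
num_clusters = 5;

% histcounts needs num_clusters + 1 edges to produce num_clusters bins
h1 = histcounts(assignments, 1 : num_clusters + 1);

% accumarray sums a 1 for every occurrence of each cluster ID;
% the size argument pads clusters that never occur with zeros
h2 = accumarray(assignments, 1, [num_clusters 1]).';

disp(h1);   % 1 1 3 0 1
disp(h2);   % 1 1 3 0 1
```

Cluster 4 is never used, so both histograms report 0 for it.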
However, you want to do this for multiple images, not just a single image.
For Bag of Visual Words problems, we will certainly have more than one training image in our database. Therefore, you want to find the histogram of the features for each image. We can still use the above concept, but one thing I can suggest is you maintain a separate variable that tells you how many features were detected per image, then you can index into the assignments variable to help extract out the correct assigned centroid IDs, then build a histogram of those individually. We can build a 2D matrix where each row delineates the histogram of each image. Remember that in kmeans, each row tells you what cluster each example was assigned to independently of the other examples in your data. Using that, you would use kmeans on the entire training dataset, then be smart about how you're accessing the assignments variable to extract out the assigned clusters for each input image.
Therefore, modify your code so that it looks something like this:
clc;
clear;
close all;
folder = 'CarData/TrainImages/cars';
filePattern = fullfile(folder, '*.pgm');
f=dir(filePattern);
files={f.name};
num_features = zeros(numel(files), 1); % New - for keeping track of # of features per image
for k=1:numel(files)
fullFileName = fullfile(folder, files{k});
H = fspecial('log');
image=imfilter(imread(fullFileName),H);
temp = detectSURFFeatures(image);
[im_features, temp] = extractFeatures(image, temp);
num_features(k) = size(im_features, 1); % New - # of features per image
features{k}= im_features;
end
features = vertcat(features{:});
num_clusters = 500; % Added to make the code adaptive
[assignments,centers] = kmeans(double(features), num_clusters);
counter = 1; % Keeps track of where we need to slice in assignments
% Go through each image and find their histograms
features_hist = zeros(numel(files), num_clusters); % Records the per image histograms
for k = 1 : numel(files)
a = assignments(counter : counter + num_features(k) - 1); % Get the assignments
h = histcounts(a, 1 : num_clusters + 1);
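% Or (the size argument keeps h at num_clusters elements even if the
% highest cluster IDs do not occur in this image):
% h = accumarray(a, 1, [num_clusters 1]).'; % Transpose to make it a row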
% Place in final output
features_hist(k, :) = h;
% Increment counter
counter = counter + num_features(k);
end
features_hist will now be an M x 500 matrix, where M = numel(files) is the number of images and each row is the histogram of one image. The final job would be to use a supervised machine learning algorithm (SVM, neural networks, etc.) where the expected labels are the class descriptions you have assigned to each image and the input features are the per-image histograms. The final result is a learned model: when you have a new image, calculate its SURF features, represent them as a histogram of features like we did above, then feed that into the classification model to get the expected class or label for the image.
P.S. Deep learning / CNNs do a much better job at this, but require much more time to train. If accuracy is what you're after, don't use Bag of Visual Words. It is, however, very quick to implement and is known to perform moderately well, though that of course depends on the kinds of images you want to classify.
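For a new image there are no kmeans assignments, so each of its descriptors has to be matched to the nearest centroid first. A minimal sketch of that step follows, assuming Euclidean distance and using tiny made-up 2-D data in place of real SURF descriptors. new_feats, nearest and new_hist are hypothetical names, and the implicit expansion used for D needs MATLAB R2016b or later.

```matlab
% Hypothetical data: 3 centroids and 4 descriptors from a new image
centers   = [0 0; 10 10; 0 10];
new_feats = [1 1; 9 9; 0.5 0; 1 9];
num_clusters = size(centers, 1);

% Squared Euclidean distance from every descriptor (rows) to every centroid (columns)
D = sum(new_feats.^2, 2) - 2 * (new_feats * centers.') + sum(centers.^2, 2).';

% Index of the nearest centroid for each descriptor
[~, nearest] = min(D, [], 2);

% Histogram of centroid usage for this image, as a row vector
new_hist = accumarray(nearest, 1, [num_clusters 1]).';

disp(new_hist);   % 2 1 1
```

new_hist has the same layout as a row of features_hist, so it can be passed to whatever classifier was trained on those rows.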

Error message when storing output from loop in matrix

I have this program which calculates the realized covariance for each day in my sample but I have some troubles with storing the output in a matrix.
the program is as follows:
for i=1:66:(2071*66)
vec = realized_covariance(datapa(i:i+65),data(i:i+65),datapo(i:i+65),data(i:i+65),'wall','Fixed',fixedInterval,5)
mat(2,4142) = vec
end
Output:
vec =
1.0e-03 *
0.1353 -0.0283
-0.0283 0.0185
Subscripted assignment dimension mismatch.
I have tried various ways to store the output in a matrix, like defining a matrix of zeros to store the output in, or leaving the row dimension of the storing matrix undefined, but nothing seems to do the job.
I would really appreciate advice on how to tackle this challenge.
I have found a solution that does the job.
I defined a matrix and then filled in all my output one at a time using the following:
A = zeros(0,0) %before loop, only serve to define the storing matrix
A = [A; vec]%after the calculating function, inside the loop.
Actually mat(2,4142) is a single location in a matrix; you can't assign four values there.
You need to specify the exact location inside mat every time you assign values into it. Try it like this:
mat=zeros(2,4142); % 2071 blocks of 2 columns each
for k=1:66:(2071*66)
vec=realized_covariance(datapa(k:k+65),data(k:k+65),datapo(k:k+65),data(k:k+65),'wall','Fixed',fixedInterval,5)
mat(:,[(((k-1)/66)*2)+1 (((k-1)/66)*2)+2])=vec;
end
You're trying to store a 2 x 2 matrix into a single element, i.e. 4 elements on the right hand side and one on the left. That won't fit. See it like this: you have a garage beside your house where 1 car fits. You've got three friends coming over and they also want to park their cars inside. That's a problem though, as you've only got space for one. So you have to buy a bigger garage: assign 4 elements on the left (e.g. mat(ii:ii+1,jj:jj+1) = [1 2;3 4]), or use a cell/structure array.
As Steve suggests in a comment below, you can use a 3D matrix quite easily:
counters = 1:66:(2071*66);
mat = zeros(2,2,numel(counters)); %// initialise output matrix
for ii=1:numel(counters)
vec = realized_covariance(datapa(counters(ii):counters(ii)+65),...
data(counters(ii):counters(ii)+65),datapo(counters(ii):counters(ii)+65),...
data(counters(ii):counters(ii)+65),'wall','Fixed',fixedInterval,5);
mat(:,:,ii) = vec; %// store in a 3D matrix
end
Now mat is 3D, with the first two dimensions holding your regular output, i.e. vec, and the last index being the iteration number. So to access the output of iteration 1032 you'd do mat(:,:,1032), which is already a 2 x 2 matrix. A squeeze is only needed when you slice along the third dimension, e.g. squeeze(mat(1,2,:)) to get one entry across all iterations as a vector.
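Since realized_covariance is not a built-in function, here is a self-contained sketch of the same 3D storage pattern with the built-in cov standing in for it. The data is synthetic, and starts and block are hypothetical names.

```matlab
% Synthetic data: 5 "days" of 66 observations each, two series
data = randn(66 * 5, 2);

starts = 1:66:size(data, 1);        % first row of each 66-row block
mat = zeros(2, 2, numel(starts));   % one 2 x 2 slice per block

for ii = 1:numel(starts)
    block = data(starts(ii):starts(ii) + 65, :);
    mat(:, :, ii) = cov(block);     % cov is a stand-in for realized_covariance
end

day3 = mat(:, :, 3);                % 2 x 2 matrix for the third block
offdiag = squeeze(mat(1, 2, :));    % covariance term across all blocks
```

Note that the block end is starts(ii) + 65, a row offset, not starts(ii + 65), which would index into the list of starting rows.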

Sample matrix data and retrieve in Matlab

In MATLAB, I want to sample my data as follows: compute the length of the matrix, take every 20th row, and store those rows in an array. That is how I want to sample my data.
For instance, length(P) is 251.
Now I want to check whether a row of the original P equals the row at the same index in the sampled matrix (this is done in a loop) and, if so, merge the two rows:
[L]=[0];
for ii=1:length(P)
if P(ii,:)== SP{ii}(ii,:) %SP is sample points array
L = [P(ii,:); SP{ii}(ii,:)];
end
end
My Problem:
I'm unable to sample the data the way I want (for example with SP = datasample(P,2);), and I also can't retrieve the computed L properly, possibly because of an indexing problem, i.e.
if L~=0
l=L(ii,:);
end
Sampling every 20th row can be done with a simple for loop instead of a built-in function. The code below is a sketch for a single cell index only.
kk = 0;
for ii=1:floor(size(P{1},1)/20) % number of complete steps of 20 rows
kk = kk+20;
L{ii} =P{1}(kk,:);
end
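If P is a plain numeric matrix rather than a cell array, colon indexing does the same sampling without a loop. A minimal sketch on a synthetic 251-row matrix (rows and SP are illustrative names):

```matlab
P = reshape(1:251*3, 251, 3);       % synthetic 251 x 3 matrix
rows = 20:20:size(P, 1);            % row indices 20, 40, ..., 240
SP = P(rows, :);                    % sampled rows, 12 x 3

disp(size(SP));                     % 12 3
```

Keeping rows around makes it easy to map a sampled row back to the original: SP(k,:) is P(rows(k),:).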

Matlab Plot Smoothing having no effect

I'm currently having some trouble with the 'smooth' command, namely that it seems to have no effect on the generated plot. I have already used the following script to generate a plot
for h=1:4
linespec = {'rx', 'gx', 'bx', 'yx'};
hold on
for b=1:365
y=mean(mean(A(b,6*(h-1)+1:6*h)));
p(h)=plot(b,y,linespec{h});
end
hold off
end
This goes row by row through data set A and takes the average of the values in the first six columns, then columns 7 through 12, 13 through 18 and 19 through 24, generating four plots in total.
The next step was to smooth the resultant plot by averaging the values over a span of 9. So, I tweaked the script to the following;
for h=1:4
linespec = {'rx', 'gx', 'bx', 'yx'};
hold on
for b=1:365
y=mean(mean(A(b,6*(h-1)+1:6*h)));
w = smooth(y,9,'moving');
p(h)=plot(b,w,linespec{h});
end
hold off
end
Essentially just adding the w variable and replacing y with w in the plot command. Yet this has no effect whatsoever on my plot. Matlab doesn't throw up any errors either, so there doesn't seem to be a problem with the input size. Does anyone have an idea as to what the issue might be?
In either version of the loop, you appear to be plotting individual values of y against individual values of b. I presume, then, that y is a single value. You can't smooth a point, so the smooth operation is having no effect.
From the start, you don't need to make a loop to calculate the various means; mean can take a 2D matrix and return a vector. Calculate y in one go, then smooth that vector (should have length 365, I presume - depends on the size of input A). e.g.:
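b = 1:365;
h = 1; % column group to plot (1 to 4); loop over h to get all four
y=mean(A(:,6*(h-1)+1:6*h),2); % row-wise mean of the six columns in group h
w = smooth(y,9,'moving'); % smooth the whole vector at once
plot(b,y,'rx');
hold on
plot(b,w,'gx');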
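For all four column groups at once, a sketch along these lines should work. It assumes A is 365 x 24 (synthetic here) and swaps in the base MATLAB function movmean (R2016a or later) for smooth, which needs the Curve Fitting Toolbox. The two agree in the interior of the series, but as far as I know they treat the first and last few points differently, so the ends of the curves may not match exactly. Y, W and span are illustrative names.

```matlab
A = rand(365, 24);                  % synthetic stand-in for data set A
span = 9;                           % moving-average window length

Y = zeros(365, 4);                  % raw row-wise means, one column per group
W = zeros(365, 4);                  % smoothed versions

for h = 1:4
    cols = 6*(h-1)+1 : 6*h;         % columns 1-6, 7-12, 13-18, 19-24
    Y(:, h) = mean(A(:, cols), 2);  % one mean per row, 365 x 1
    W(:, h) = movmean(Y(:, h), span);
end
```

plot(1:365, W) then draws the four smoothed curves in one call, and plot(1:365, Y, 'x') overlays the raw means for comparison.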