I have a function where I'm receiving input data and other data from four random sources. This function must be repeated for 12 times and 12 should be set so that. This function should also be repeated 10 times. Is there a more compact way to perform what I'm doing below?
for ii=1:10
Percent=0.7;
num_points1 = size(X_1,1);
split_point1 = round(num_points1*Percent);
sequence1 = randperm(num_points1);
X1_train{ii} = X_1(sequence1(1:split_point1),:);
Y1_train{ii} = Y_1(sequence1(1:split_point1));
X1_test{ii} = X_1(sequence1(split_point1+1:end),:);
Y1_test{ii}= Y_1(sequence1(split_point1+1:end));
num_points2 = size(X_2,1);
split_point2 = round(num_points2*Percent);
sequence2 = randperm(num_points2);
X2_train{ii} = X_2(sequence2(1:split_point2),:);
Y2_train{ii} = Y_2(sequence2(1:split_point2));
X2_test{ii} = X_2(sequence2(split_point2+1:end),:);
Y2_test{ii}= Y_2(sequence2(split_point2+1:end));
.
.
.
.
num_points12 = size(X_12,1);
split_point12 = round(num_points12*Percent);
sequence12 = randperm(num_points12);
X12_train{ii} = X_12(sequence12(1:split_point12),:);
Y12_train{ii} = Y_12(sequence12(1:split_point12));
X12_test{ii} = X_12(sequence12(split_point12+1:end),:);
Y12_test{ii}= Y_12(sequence12(split_point12+1:end));
end
The biggest problem you have currently is that you have 12 separate variables to do 12 related operations. Don't do that. Consolidate all of the variables into one container then iterate over the container.
I have the following suggestions for you:
Combine X_1, X_2, ... X_12 into one container. A cell array or structure may be prudent to use here. I'm going to use cell arrays in this case as your code currently employs them and it's probably the easiest thing for you to transition to.
Create four master cell arrays for the training and test set data and labels and within each cell array are nested cell arrays that contain each trial.
Loop over the cell array created in step #1 and assign the results to each of the four master cell arrays.
Therefore, something like this comes to mind:
X = {X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8, X_9, X_10, X_11, X_12};
Y = {Y_1, Y_2, Y_3, Y_4, Y_5, Y_6, Y_7, Y_8, Y_9, Y_10, Y_11, Y_12};
N = numel(X);
num_iterations = 10;
X_train = cell(1, num_iterations);
Y_train = cell(1, num_iterations);
X_test = cell(1, num_iterations);
Y_test = cell(1, num_iterations);
Percent = 0.7;
for ii = 1 : num_iterations
for jj = 1 : N
vals = X{jj};
labels = Y{jj};
num_points = size(vals,1);
split_point = round(num_points*Percent);
sequence = randperm(num_points);
X_train{ii}{jj} = vals(sequence(1:split_point),:);
Y_train{ii}{jj} = labels(sequence(1:split_point));
X_test{ii}{jj} = vals(sequence(split_point+1:end),:);
Y_test{ii}{jj} = labels(sequence(split_point+1:end));
end
end
As such, to access the training data for a particular iteration, you would do:
data = X_train{ii};
ii is the iteration you want to access. data would now be a cell array, so if you want to access the training data for a particular group, you would now do:
group = data{jj};
jj is the group you want to access. However, you can combine this into one step by:
group = X_train{ii}{jj};
You'll see this syntax in various parts of the code I wrote above. You'd do the same for the other data in your code (X_test, Y_train, Y_test).
I think you'll agree that this is more compact and to the point.
Related
Using Matlab R2019a, is there any way to avoid the for-loop in the following code in spite of the dimensions containing different element so that each element has to be checked? M is a vector with indices, and Inpts.payout is a 5D array with numerical data.
for m = 1:length(M)-1
for power = 1:noScenarios
for production = 1:noScenarios
for inflation = 1:noScenarios
for interest = 1:noScenarios
if Inpts.payout(M(m),power,production,inflation,interest)<0
Inpts.payout(M(m+1),power,production,inflation,interest)=...
Inpts.payout(M(m+1),power,production,inflation,interest)...
+Inpts.payout(M(m),power,production,inflation,interest);
Inpts.payout(M(m),power,production,inflation,interest)=0;
end
end
end
end
end
end
It is quite simple to remove the inner 4 loops. This will be more efficient unless you have a huge matrix Inpts.payout, as a new indexing matrix must be generated.
The following code extracts the two relevant 'planes' from the input data, does the logic on them, then writes them back:
for m = 1:length(M)-1
payout_m = Inpts.payout(M(m),:,:,:,:);
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
Inpts.payout(M(m+1),:,:,:,:) = payout_m1;
end
It is possible to avoid extracting the 'planes' and writing them back by working directly with the input data matrix. However, this yields more complex code.
However, we can easily avoid some indexing operations this way:
payout_m = Inpts.payout(M(1),:,:,:,:);
for m = 1:length(M)-1
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
payout_m = payout_m1;
end
Inpts.payout(M(m+1),:,:,:,:) = payout_m1;
It seems like there is not a way to avoid this. I am assuming that each for lop independently changes a variable parameter used in the main calculation. Thus, it is required to have this many for loops. My only suggestion is to turn your nested loops into a function if you're concerned about appearance. Not sure if this will help run-time.
I have a list of parameters and I need to evaluate my method over this list. Right now, I am doing it this way
% Parameters
params.corrAs = {'objective', 'constraint'};
params.size = {'small', 'medium', 'large'};
params.density = {'uniform', 'non-uniform'};
params.k = {3,4,5,6};
params.constraintP = {'identity', 'none'};
params.Npoints_perJ = {2, 3};
params.sampling = {'hks', 'fps'};
% Select the current parameter
for corrAs_iter = params.corrAs
for size_iter = params.size
for density_iter = params.density
for k_iter = params.k
for constraintP_iter = params.constraintP
for Npoints_perJ_iter = params.Npoints_perJ
for sampling_iter = params.sampling
currentParam.corrAs = corrAs_iter;
currentParam.size = size_iter;
currentParam.density = density_iter;
currentParam.k = k_iter;
currentParam.constraintP = constraintP_iter;
currentParam.Npoints_perJ = Npoints_perJ_iter;
currentParam.sampling = sampling_iter;
evaluateMethod(currentParam);
end
end
end
end
end
end
end
I know it looks ugly and if I want to add a new type of parameter, I have to write another for loop. Is there any way, I can vectorize this? Or maybe use 2 for loops instead of so many.
I tried the following but, it doesn't result in what I need.
for i = 1:numel(fields)
% if isempty(params.(fields{i}))
param.(fields{i}) = params.(fields{i})(1);
params.(fields{i})(1) = [];
end
What you need is all combinations of your input parameters. Unfortunately, as you add more parameters the storage requirements will grow quickly (and you'll have to use a large indexing matrix).
Instead, here is an idea which uses linear indicies of a (never created) n1*n2*...*nm matrix, where ni is the number of elements in each field, for m fields.
It is flexible enough to cope with any amount of fields being added to params. Not performance tested, although as with any "all combinations" operation you should be wary of the non-linear increase in computation time as you add more fields to params, note prod(sz)!
The code I've shown is fast, but the performance will depend entirely on which operations you do in the loop.
% Add parameters here
params.corrAs = {'objective', 'constraint'};
params.size = {'small', 'medium', 'large'};
params.density = {'uniform', 'non-uniform'};
% Setup
f = fieldnames( params );
nf = numel(f);
sz = NaN( nf, 1 );
% Loop over all parameters to get sizes
for jj = 1:nf
sz(jj) = numel( params.(f{jj}) );
end
% Loop for every combination of parameters
idx = cell(1,nf);
for ii = 1:prod(sz)
% Use ind2sub to switch from a linear index to the combination set
[idx{:}] = ind2sub( sz, ii );
% Create currentParam from the combination indices
currentParam = struct();
for jj = 1:nf
currentParam.(f{jj}) = params.(f{jj}){idx{jj}};
end
% Do something with currentParam here
% ...
end
Asides:
I'm using dynamic field name references for indexing the fields
I'm passing multiple outputs into a cell array from ind2sub, so you can handle a variable number of field names when ind2sub has one output for each dimension (or field in this use-case).
Here is a vectorized solution :
names = fieldnames(params).';
paramGrid = cell(1,numel(names));
cp = struct2cell(params);
[paramGrid{:}] = ndgrid(cp{:});
ng = [names;paramGrid];
st = struct(ng{:});
for param = st(:).'
currentParam = param;
end
Instead of nested loops we can use ndgrid to create the cartesian product of the cell entries so we can find all combinations of cell entries without loop.
Hello I want to use parallelize my MATLAB code to run High computing server. It is code to make image database for Deep learning. To parallelize the code I found the I have to for parfor loop. But I used that with the first loop or with the second loop it shows me error parfor cannot be run due to the variable imdb and image_counter. Anyone please help me to change the code to work with parfor
for i = 1:length(cur_images)
X = sprintf('image Numb: %d ',i);
disp(X)
cur_image = load(cur_images{i,:});
cur_image=(cur_image.Image.crop);
%----------------------------------------------
cur_image = imresize(cur_image, image_size);
if(rgb < 1)
imdb.images.data(:,:,1,image_counter) = cur_image;
else
imdb.images.data(:,:,1,image_counter) = cur_image(:,:,1);
imdb.images.data(:,:,2,image_counter) = cur_image(:,:,2);
imdb.images.data(:,:,3,image_counter) = cur_image(:,:,3);
imdb.images.data(:,:,4,image_counter) = cur_image(:,:,4);
imdb.images.data(:,:,5,image_counter) = cur_image(:,:,5);
imdb.images.data(:,:,6,image_counter) = cur_image(:,:,6);
imdb.images.data(:,:,7,image_counter) = cur_image(:,:,7);
imdb.images.data(:,:,8,image_counter) = cur_image(:,:,8);
imdb.images.data(:,:,9,image_counter) = cur_image(:,:,9);
imdb.images.data(:,:,10,image_counter) = cur_image(:,:,10);
end
imdb.images.set( 1,image_counter) = set;
image_counter = image_counter + 1;
end
The main problem here is that you can't assign to fields of a structure inside parfor in the way that you're trying to do. Also, your outputs need to be indexed by the loop variable to qualify as "sliced" - i.e. don't use image_counter. Putting this together, you need something more like:
% Make a numeric array to store the output.
data_out = zeros([image_size, 10, length(cur_images)]);
parfor i = 1:length(cur_images)
cur_image = load(cur_images{i, :});
cur_image=(cur_image.Image.crop);
cur_image = imresize(cur_image, image_size);
% Now, assign into 'data_out'. A little care needed
% here.
if rgb < 1
data_tmp = zeros([image_size, 10]);
data_tmp(:, :, 1) = cur_image;
else
data_tmp = cur_image;
end
data_out(:, :, :, i) = data_tmp;
end
imdb.images.data = data_out;
I am writing a code to do some very simple descriptive statistics, but I found myself being very repetitive with my syntax.
I know there's a way to shorten this code and make it more elegant and time efficient with something like a for-loop, but I am not quite keen enough in coding (yet) to know how to do this...
I have three variables, or groups (All data, condition 1, and condition 2). I also have 8 matlab functions that I need to perform on each of the three groups (e.g mean, median). I am saving all of the data in a table where each column corresponds to one of the functions (e.g. mean) and each row is that function performed on the correspnding group (e.g. (1,1) is mean of 'all data', (2,1) is mean of 'cond 1', and (3,1) is mean of 'cond 2'). It is important to preserve this structure as I am outputting to a csv file that I can open in excel. The columns, again, are labeled according the function, and the rows are ordered by 1) all data 2) cond 1, and 3) cond 2.
The data I am working with is in the second column of these matrices, by the way.
So here is the tedious way I am accomplishing this:
x = cell(3,8);
x{1,1} = mean(alldata(:,2));
x{2,1} = mean(cond1data(:,2));
x{3,1} = mean(cond2data(:,2));
x{1,2} = median(alldata(:,2));
x{2,2} = median(cond1data(:,2));
x{3,2} = median(cond2data(:,2));
x{1,3} = std(alldata(:,2));
x{2,3} = std(cond1data(:,2));
x{3,3} = std(cond2data(:,2));
x{1,4} = var(alldata(:,2)); % variance
x{2,4} = var(cond1data(:,2));
x{3,4} = var(cond2data(:,2));
x{1,5} = range(alldata(:,2));
x{2,5} = range(cond1data(:,2));
x{3,5} = range(cond2data(:,2));
x{1,6} = iqr(alldata(:,2)); % inter quartile range
x{2,6} = iqr(cond1data(:,2));
x{3,6} = iqr(cond2data(:,2));
x{1,7} = skewness(alldata(:,2));
x{2,7} = skewness(cond1data(:,2));
x{3,7} = skewness(cond2data(:,2));
x{1,8} = kurtosis(alldata(:,2));
x{2,8} = kurtosis(cond1data(:,2));
x{3,8} = kurtosis(cond2data(:,2));
% write output to .csv file using cell to table conversion
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
I know there is a way to loop through this stuff and get the same output in a much shorter code. I tried to write a for-loop but I am just confusing myself and not sure how to do this. I'll include it anyway so maybe you can get an idea of what I'm trying to do.
x = cell(3,8);
data = [alldata, cond2data, cond2data];
dfunction = ['mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'];
for i = 1:8,
for y = 1:3
x{y,i} = dfucntion(i)(data(1)(:,2));
x{y+1,i} = dfunction(i)(data(2)(:,2));
x{y+2,i} = dfunction(i)(data(3)(:,2));
end
end
T = cell2table(x, 'VariableNames',{'mean', 'median', 'stddev', 'variance', 'range', 'IQR', 'skewness', 'kurtosis'});
writetable(T,'descriptivestats.csv')
Any ideas on how to make this work??
You want to use a cell array of function handles. The easiest way to do that is to use the # operator, as in
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
Also, you want to combine your three data variables into one variable, to make it easier to iterate over them. There are two choices I can see. If your data variables are all M-by-2 in dimension, you could concatenate them into a M-by-2-by-3 three-dimensional array. You could do that with
data = cat(3, alldata, cond1data, cond2data);
The indexing expression into data that retrieves the values you want would be data(:, 2, y). That said, I think this approach would have to copy a lot of data around and probably isn't the best for performance. The other way to combine data together is in 1-by-3 cell array, like this:
data = {alldata, cond1data, cond2data};
The indexing expression into data that retrieves the values you want in this case would be data{y}(:, 2).
Since you are looping from y == 1 to y == 3, you only need one line in your inner loop body, not three.
for y = 1:3
x{y, i} = dfunctions{i}(data{y}(:,2));
end
Finally, to get the cell array of strings containing function names to pass to cell2table, you can use cellfun to apply func2str to each element of dfunctions:
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
The final version looks like this:
dfunctions = {#mean, #median, #std, #var, #range, #iqr, #skewness, #kurtosis};
data = {alldata, cond1data, cond2data};
x = cell(length(data), length(dfunctions));
for i = 1:length(dfunctions)
for y = 1:length(data)
x{y, i} = dfunctions{i}(data{y}(:,2));
end
end
funcnames = cellfun(#func2str, dfunctions, 'UniformOutput', false);
T = cell2table(x, 'VariableNames', funcnames);
writetable(T,'descriptivestats.csv');
You can create a cell array of functions using str2func :
function_string = {'mean', 'median', 'std', 'var', 'range', 'iqr', 'skewness', 'kurtosis'};
dfunction = {};
for ii = 1:length(function_string)
fun{ii} = str2func(function_string{ii})
end
Then you can use it on your data as you'd like to :
for ii = 1:8,
for y = 1:3
x{y,i} = dfucntion{ii}(data(1)(:,2));
x{y+1,i} = dfunction{ii}(data(2)(:,2));
x{y+2,i} = dfunction{ii}(data(3)(:,2));
end
end
I have 4 different lengths of data (in rows) and they all have a differing ammount of columns. I need to apply an equation to each of these columns and then extract the max value from each of them.
The equation I am trying to use is:
averg = mean([interpolate(1:end-2),interpolate(3:end)],2); % this is just getting your average value.
real_num = interpolate(2:end-1);
streaking1 = (abs(real_num-averg)./averg)*100;
An example of one of my data sets is 5448 rows by 13 columns
EDIT
This is the current adapation of Ben A.'s Solution and it is working.
A = interpolate;
averg = (A(1:end-2,:) + A(3:end,:))/2;
center_A = A(2:end-1,:);
streaking = [];
for idx = 1:size(A,2)
streaking(:,idx) = (abs(center_A(idx,:)-averg(idx,:))./averg(idx,:))*100;
end
I'm not entirely sure that I fully follow what you're doing in each step, but here is a stab at it:
A = interpolate;
averg = (A(1:end-2,:) + A(3:end,:))/2;
center_A = A(2:end-1,:);
streaking = [];
for idx = 1:size(A,2)
streaking(:,idx) = (abs(center_A(idx,:)-averg(idx,:))./averg(idx,:))*100;
end
Averg will be a vector of means for each column. I just use the values in the given column as the real_num variable that you had before. I'm not clear why you would need to index that the way you are as nothing is at risk of breaking index rules.
If this helps, great! If not let me know and I'll see if I can revise somewhat.