I have 4 different lengths of data (in rows) and they all have a differing ammount of columns. I need to apply an equation to each of these columns and then extract the max value from each of them.
The equation I am trying to use is:
averg = mean([interpolate(1:end-2),interpolate(3:end)],2); % this is just getting your average value.
real_num = interpolate(2:end-1);
streaking1 = (abs(real_num-averg)./averg)*100;
An example of one of my data sets is 5448 rows by 13 columns
EDIT
This is the current adapation of Ben A.'s Solution and it is working.
A = interpolate;
averg = (A(1:end-2,:) + A(3:end,:))/2;
center_A = A(2:end-1,:);
streaking = [];
for idx = 1:size(A,2)
streaking(:,idx) = (abs(center_A(idx,:)-averg(idx,:))./averg(idx,:))*100;
end
I'm not entirely sure that I fully follow what you're doing in each step, but here is a stab at it:
A = interpolate;
averg = (A(1:end-2,:) + A(3:end,:))/2;
center_A = A(2:end-1,:);
streaking = [];
for idx = 1:size(A,2)
streaking(:,idx) = (abs(center_A(idx,:)-averg(idx,:))./averg(idx,:))*100;
end
Averg will be a vector of means for each column. I just use the values in the given column as the real_num variable that you had before. I'm not clear why you would need to index that the way you are as nothing is at risk of breaking index rules.
If this helps, great! If not let me know and I'll see if I can revise somewhat.
Related
In Matlab (R2021b) I am using some given function, which reads time-dependent values of several variables and returns them in a combined matrix together with a time vector. In the data matrix each column represents one vector of time-dependent values for one variable.
[data,time] = function_reading_data_of_several_values('filename');
For readability of the following code where the variables are further processed, I would like to store these columns in separate vector variables. I am doing it like that:
MomentX = data(1,:);
MomentY = data(2,:);
MomentZ = data(3,:);
ForceX = data(4,:);
ForceY = data(5,:);
ForceZ = data(6,:);
That is working. But is there some simpler (or shorter) way of assigning the column of the matrix to individual vectors? I am asking because in real program I have more than the 6 columns as in the example. Code is getting quite long. I was thinking of something similar to the line below, but that does not work:
[MomentX,MomentY,MomentZ,ForceX,ForceY,ForceZ] = data; %does not work
Do you have any idea? Thanks for help!
Update:
Thanks to the hint here in the group to use tables, a solution could be like this:
...
[data,time] = function_reading_data_of_several_values('filename');
% data in matrix. Each column representing a stime dependent variable
varNames = {'MomentX', 'MomentX',...}; % Names of columns
T=array2table(data','VariableNames',varNames); % Transform to Table
Stress = T.MomentX/W + T.ForceY/A %accesing table columns
...
This seems to work fine and readable to me.
Solution 1: In industrial solutions like dSpace, it is very common to do it in struct arrays:
mydata.X(1).time = [0.01 0.02 0.03 0.04];
mydata.Y(1).name = 'MomentX';
mydata.Y(1).data = [1 2 3 4];
mydata.Y(2).name = 'MomentY';
mydata.Y(2).data = [2 3 4 5];
Solution 2: It is also very common to create tables
See: https://de.mathworks.com/help/matlab/ref/table.html
As already commented, it is probably better to use a table instead of separate variables may not be a good idea. But if you want to, it can be done this way:
A = magic(6): % example 6-column matrix
A_cell = num2cell(A, 1); % separate columns in cells
[MomentX, MomentY, MomentZ, ForceX, ForceY, ForceZ] = A_cell{:};
This is almost the same as your
[MomentX,MomentY,MomentZ,ForceX,ForceY,ForceZ] = data; %does not work
except that the right-hand side needs to be a comma-separated list, which in this case is obtained from a cell array.
Using Matlab R2019a, is there any way to avoid the for-loop in the following code in spite of the dimensions containing different element so that each element has to be checked? M is a vector with indices, and Inpts.payout is a 5D array with numerical data.
for m = 1:length(M)-1
for power = 1:noScenarios
for production = 1:noScenarios
for inflation = 1:noScenarios
for interest = 1:noScenarios
if Inpts.payout(M(m),power,production,inflation,interest)<0
Inpts.payout(M(m+1),power,production,inflation,interest)=...
Inpts.payout(M(m+1),power,production,inflation,interest)...
+Inpts.payout(M(m),power,production,inflation,interest);
Inpts.payout(M(m),power,production,inflation,interest)=0;
end
end
end
end
end
end
It is quite simple to remove the inner 4 loops. This will be more efficient unless you have a huge matrix Inpts.payout, as a new indexing matrix must be generated.
The following code extracts the two relevant 'planes' from the input data, does the logic on them, then writes them back:
for m = 1:length(M)-1
payout_m = Inpts.payout(M(m),:,:,:,:);
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
Inpts.payout(M(m+1),:,:,:,:) = payout_m1;
end
It is possible to avoid extracting the 'planes' and writing them back by working directly with the input data matrix. However, this yields more complex code.
However, we can easily avoid some indexing operations this way:
payout_m = Inpts.payout(M(1),:,:,:,:);
for m = 1:length(M)-1
payout_m1 = Inpts.payout(M(m+1),:,:,:,:);
indx = payout_m < 0;
payout_m1(indx) = payout_m1(indx) + payout_m(indx);
payout_m(indx) = 0;
Inpts.payout(M(m),:,:,:,:) = payout_m;
payout_m = payout_m1;
end
Inpts.payout(M(m+1),:,:,:,:) = payout_m1;
It seems like there is not a way to avoid this. I am assuming that each for lop independently changes a variable parameter used in the main calculation. Thus, it is required to have this many for loops. My only suggestion is to turn your nested loops into a function if you're concerned about appearance. Not sure if this will help run-time.
I have acceleration (10240x31) data that I want to filter by replacing every data point that exceeds the threshold value of 4 times standard derivation of each column with the mean value of the two adjacent data points.
First, I wanted to replace every data point with a zero, if it exceeds the maximum value. This is my loop:
for w = 1:31
Sigma(w) = std(zacceleration(:,w));
zacceleration(zacceleration<(-4*Sigma(w))) = 0;
zacceleration(zacceleration>(4*Sigma(w))) = 0;
end
That code works if w is just one number, for example:
w = 1;
But when w changes every iteration, the filtered data only contains the values that don't exceed the threshold value of the last dataset, Sigma(31).
So, I guess that I overwrite my data or something like that but I cant seem to find a solution.
Can anybody please give me a hint?
Thank you in advance and best regards.
I think I got it now.
Sigma = std(zacceleration);
for a = 1:10240;
for b = 1:31;
if zacceleration(a,b)<(-4*Sigma(b))
zacceleration(a,b) = 0;
end
if zacceleration(a,b)>(4*Sigma(b))
zacceleration(a,b) = 0;
end
end
end
What is the best way to do random sample with replacement from dataset? I am using 316 * 34 as my dataset. I want to segment the data into three buckets but with replacement. Should I use the randperm because I need to make sure I keep the index intact where that index would be handy in identifying the label data. I am new to matlab I saw there are couple of random sample methods but they didn't look like its doing what I am looking for, its strange to think that something like doesn't exist in matlab, but I did the follwoing:
My issue is when I do this row_idx = round(rand(1)*316) sometimes I get zero, that leads to two questions
what should I do to avoid zeor?
What is the best way to do the random sample with replacement.
shuffle_X = X(randperm(size(X,1)),:);
lengthOf_shuffle_X = length(shuffle_X)
number_of_rows_per_bucket = round(lengthOf_shuffle_X / 3)
bucket_cell = cell(3,1)
bag_matrix = []
for k = 1:length(bucket_cell)
for i = 1:number_of_rows_per_bucket
row_idx = round(rand(1)*316)
bag_matrix(i,:) = shuffle_X(row_idx,:)
end
bucket_cell{k} = bag_matrix
end
I could do following:
if row_idx == 0
row_idx = round(rand(1)*316)
assuming random number will never give two zeros values in two consecutive rounds.
randi is a good way to get integer indices for sampling with replacement. Assuming you want to fill three buckets with an equal number of samples, then you can write
data = rand(316,34); %# create some dummy data
number_of_data = size(data,1);
number_of_rows_per_bucket = 50;
bucket_cell = cell(1,3);
idx = randi([1,number_of_data],[number_of_rows_per_bucket,3]);
for iBucket = 1:3
bucket_cell{iBucket} = data(idx(:,iBucket),:);
end
To the question: if you use randperm it will give you a draw order without replacement, since you can draw any item once.
If you use randi it draws you with replacement, that is you draw an item possibly many times.
If you want to "segment" a dataset, that usually means you split the dataset into three distinct sets. For that you use draw without replacement (you don't put the items back; use randperm). If you'd do it with replacement (using randi), it will be incredibly slow, since after some time the chance that you draw an item which you have not before is very low.
(Details in coupon collector ).
If you need a segmentation that is a split, you can just go over the elements and independently decide where to put it. (That is you choose a bucket for each item with replacement -- that is you put any chosen bucket back into the game.)
For that:
% if your data items are vectors say data = [1 1; 2 2; 3 3; 4 4]
num_data = length(data);
bucket_labels = randi(3,[1,num_data]); % draw a bucket label for each item, independently.
for i=1:3
bucket{i} = data(bucket_labels==i,:);
end
%if your data items are scalars say data = [1 2 3 4 5]
num_data = length(data);
bucket_labels = randi(3,[1,num_data]);
for i=1:3
bucket{i} = data(bucket_labels==i);
end
there we go.
I have a data file m.txt that looks something like this (with a lot more points):
286.842995
3.444398
3.707202
338.227797
3.597597
283.740414
3.514729
3.512116
3.744235
3.365461
3.384880
Some of the values (like 338.227797) are very different from the values I generally expect (smaller numbers).
So, I am thinking that
I will remove all the points that lie outside the 3-sigma range. How can I do that in MATLAB?
Also, the bigger problem is that this file has a separate file t.txt associated with it which stores the corresponding time values for these numbers. So, I'll have to remove the corresponding time values from the t.txt file also.
I am still learning MATLAB, and I know there would be some good way of doing this (better than storing indices of the elements that were removed from m.txt and then removing those elements from the t.txt file)
#Amro is close, but the FIND is unnecessary (look up logical subscripting) and you need to include the mean for a true +/-3 sigma range. I would go with the following:
%# load files
m = load('m.txt');
t = load('t.txt');
%# find values within range
z = 3;
meanM = mean(m);
sigmaM = std(m);
I = abs(m - meanM) <= z * sigmaM;
%# keep values within range
m = m(I);
t = t(I);
%# load files
m = load('m.txt');
t = load('t.txt');
%# find outliers indices
z = 3;
idx = find( abs(m-mean(m)) > z*std(m) );
%# remove them from both data and time values
m(idx) = [];
t(idx) = [];