I would like to avoid the for loop in my code, since it is quite computationally intensive.
I search my data frame for a variable: if the variable is 0, the amount 10000 should be added to another variable.
Similarly, if the variable is 1, 20000 should be added.
for i = 1:height(df)
    if df.status(i) == 0
        df.Number(i) = df.Number(i) + 10000;
    elseif df.status(i) == 1
        df.Number(i) = df.Number(i) + 20000;
    end
end
I am very grateful for any advice.
Tim
Assuming df.Number and df.status have the same size and df.status only contains the values 0 and 1, you can condense the loop into a single line:
df.Number = df.Number + 10000 + (df.status==1) * 10000;
A logical value (Boolean) in MATLAB always has the value of either 0 or 1, and comparing status==1 ensures a logical value.
If df.status is a logical, you can skip the comparison: 10000 + df.status * 10000.
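If df.status can also hold values other than 0 and 1, the one-liner above would still add 10000 to those rows. A minimal sketch that avoids this, assuming df is a table with numeric status and Number columns (the sample values below are hypothetical), multiplies each increment by its own mask:

```matlab
% Hypothetical sample table; the column names follow the question.
df = table([0; 1; 2; 0], [5; 5; 5; 5], 'VariableNames', {'status', 'Number'});

% Each comparison yields a logical mask; multiplying the mask by the
% increment adds the amount only in the rows where the mask is true.
df.Number = df.Number + 10000 * (df.status == 0) + 20000 * (df.status == 1);

disp(df.Number.')   % values: 10005 20005 5 10005
```

The row with status 2 keeps its original value, which matches the behavior of the loop in the question.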
I don't know the data type of the data frame but you can approach this by getting the index when that condition is reached.
% Generate a simple dataframe
dataframe = [0 0 1 1 2 2 1 0 5]';
% Get 0s index and add 10,000 to those indexes
idx_0 = dataframe == 0;
dataframe(idx_0) = dataframe(idx_0) + 10000;
% Get 1s index and add 20,000 to those indexes
idx_1 = dataframe == 1;
dataframe(idx_1) = dataframe(idx_1) + 20000;
% Print dataframe variable
dataframe
I have 50 spreadsheets with multiple scored columns:
One column (AG) has numbers coded 1:13, the other, (SEC) has numbers coded 1:6.
Ex:
AG SEC
1 1
2 1
4 1
13 1
3 2
12 2
I want to write a for loop that counts all the 1s in .SEC that correspond to values 1:5 in .AG (the output would be 3; it wouldn't count the 1 corresponding to 13). I need this to happen for all values in .SEC (1:6). The final output would have the spreadsheet name in the first column, and the counts for .SEC=1,2,3,4,5,6 in the subsequent columns.
My current code creates a variable for total .AG counts in .SEC, but is nondiscriminatory (it counts the number of times any number is given in .AG instead of counting for specific values).
scoringfiles is a 50-item path list (when I do readtable(scoringfiles) it iterates through the list and reads the Excel files). filelist is a 50-item list with just the filenames.
for i=1:length(scoringfiles)
    if contains(filelist(i,:),"sheet")
        disp(i)
        sheetnum=[sheetnum extractBetween(filelist{i},1,4)]
        s1=[s1 length(find(readtable(scoringfiles(i,:)).SEC==1))]
        s2=[s2 length(find(readtable(scoringfiles(i,:)).SEC==2))]
        s3=[s3 length(find(readtable(scoringfiles(i,:)).SEC==3))]
        s4=[s4 length(find(readtable(scoringfiles(i,:)).SEC==4))]
        s5=[s5 length(find(readtable(scoringfiles(i,:)).SEC==5))]
        s6=[s6 length(find(readtable(scoringfiles(i,:)).SEC==6))]
    elseif contains(filelist(i,:),"graph")
        disp("not sheet")
    end
end
In MATLAB, i and j denote the imaginary unit by default. To avoid shadowing them, make a habit of using ii and jj as loop variables instead of i and j.
Now back to the main question:
Let's assume you've read the file contents into the data variable. This is going to be an N-by-2 array, with AG in the first column and SEC in the second.
You only care about AG when it is in the range 1:5. Let's create a filter array with true where AG is in the range and false elsewhere.
filter = data(:, 1) >= 1 & data(:, 1) <= 5;
Let's first split the columns into two variables for legibility. Use the filter to select just the rows that match our criteria.
ag = data(filter, 1);
sec = data(filter, 2);
Now you want to go through each unique value in sec, and count the number of ag entries.
unique_sec = unique(sec);
counts = zeros(size(unique_sec)); % Preallocate a zeros array to save our answer in
for ii = 1:length(unique_sec)
    sec_value = unique_sec(ii); % Get the value of SEC
    matches = sec == sec_value; % Make a filter for rows that have this value
    % matches is a logical array. true = 1, false = 0. sum gives number of trues.
    counts(ii) = sum(matches);
end
Alternatively, you could perform the filter for 1 <= AG <= 5 inside the loop if you don't want to filter before:
ag = data(:, 1);
sec = data(:, 2);
unique_sec = unique(sec);
counts = zeros(size(unique_sec));
for ii = 1:length(unique_sec)
    sec_value = unique_sec(ii);
    matches = sec == sec_value & ag >= 1 & ag <= 5; % Add more conditions to the filter
    counts(ii) = sum(matches);
end
If you want to do this for multiple files, iterate over them and read the files into the data variable.
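Note that unique(sec) only returns the SEC values that actually occur after filtering, so the length of counts can vary from file to file. If you want one count for each SEC code 1:6 regardless, accumarray is one option. The following is only a sketch, assuming SEC contains positive integers no larger than 6; the data values are taken from the example in the question:

```matlab
% Example data from the question: column 1 is AG, column 2 is SEC.
data = [1 1; 2 1; 4 1; 13 1; 3 2; 12 2];

in_range = data(:, 1) >= 1 & data(:, 1) <= 5;   % keep rows with AG in 1:5
sec = data(in_range, 2);

% accumarray adds 1 to counts(k) for every remaining row whose SEC equals k.
% The size argument [6 1] reserves one slot per SEC code, so codes that
% never occur get a count of 0.
counts = accumarray(sec, 1, [6 1]);

disp(counts.')   % values: 3 1 0 0 0 0
```

The first entry is 3, as expected in the question: the 1 paired with AG = 13 is not counted.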
I figured out how to apply a filter thanks to Pranav's help. It is as simple as adding the filter to each line of the for loop as it iterates through my spreadsheets. See below:
Note: this example only looks at s1 and s2. In practice, I have this for 6 different numbers, creating 6 tables with counts per spreadsheet.
for i=1:length(scoringfiles)
    filter1 = readtable(scoringfiles(i,:)).AG >= 1;
    filter2 = readtable(scoringfiles(i,:)).AG <= 5;
    if contains(filelist(i,:),"sheet")
        disp(i)
        sheetnum=[sheetnum extractBetween(filelist{i},1,4)]
        s1=[s1 length(find(readtable(scoringfiles(i,:)).SEC==1 & filter1 & filter2))]
        s2=[s2 length(find(readtable(scoringfiles(i,:)).SEC==2 & filter1 & filter2))]
    elseif contains(filelist(i,:),"graph")
        disp("not sheet")
    end
end
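The loop above calls readtable several times per file, and reading the spreadsheet is probably the slowest step. A possible restructuring reads each file once and stores all six counts in one preallocated matrix. This is only a sketch: the in-memory tables below are hypothetical stand-ins for readtable(scoringfiles(ii,:)), and the filename check for "sheet" is omitted for brevity.

```matlab
% Hypothetical stand-ins for the spreadsheets; with real files each T
% would come from a single readtable call instead.
tables = { ...
    table([1; 2; 4; 13; 3; 12], [1; 1; 1; 1; 2; 2], 'VariableNames', {'AG', 'SEC'}), ...
    table([5; 6; 2], [3; 3; 6], 'VariableNames', {'AG', 'SEC'})};

counts = zeros(numel(tables), 6);      % one row per file, one column per SEC code
for ii = 1:numel(tables)
    T = tables{ii};                    % real files: T = readtable(scoringfiles(ii,:));
    in_range = T.AG >= 1 & T.AG <= 5;  % compute the AG filter once per file
    for k = 1:6
        % sum of a logical array counts its true entries
        counts(ii, k) = sum(T.SEC == k & in_range);
    end
end

disp(counts)   % rows: [3 1 0 0 0 0] and [0 0 1 0 0 1]
```

Preallocating counts also avoids growing s1 to s6 on every iteration.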
I have a 1x24 vector (a). I need a command in MATLAB that compares all 24 values of a with a certain value (mean(b)): if an element of a is greater than that value, the variable I should be set to 1, and if it is less, I should be set to 0. I wrote the code below:
for i=1:length(a)
    if a(i) >= mean(b)
        I = 1;
    else
        I = 0;
    end
end
But it only keeps the result of the comparison for the last index of a and sets I=0. How can I fix the code so that it performs the comparison for all indices of a?
In MATLAB, you can use the following syntax to do so:
I = a >= mean(b);
If you want to use your code for doing so, you'll need to initialize I as a vector, and modify its indices as follows:
I = zeros(size(a)); % same shape as a, so I is 1x24 as well
for ii = 1:length(a)
    if a(ii) >= mean(b)
        I(ii) = 1;
    else
        I(ii) = 0;
    end
end
You should read about logical indexing in MATLAB. You don't need for loops for what you are doing. For example, if you have
rng(5);
a = rand(1,10);
b = 0.5;
then, I = a > b; will return a logical array with zeros and ones, where one indicates the position in the array where the given condition is satisfied,
I =
0 1 0 1 0 1 1 1 0 0
Using these indices, you can modify your original array. For example, if you wish to change all values of a greater than b to be 10, you would simply do,
a(a > b) = 10;
Specifically, if you need indices where the condition is satisfied, you can use, find(a > b), which in this example will give you,
ans =
2 4 6 7 8
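To tie this back to the original question, here is a small sketch that checks the loop and the vectorized comparison against each other. The values of a and b are hypothetical; the question only states that a is 1x24.

```matlab
% Hypothetical inputs.
a = 1:24;
b = [10 12 14];          % mean(b) is 12

threshold = mean(b);     % compute the threshold once, outside the loop

% Loop version with a preallocated logical result of the same shape as a.
I_loop = false(size(a));
for ii = 1:numel(a)
    I_loop(ii) = a(ii) >= threshold;
end

% Vectorized version: the comparison is applied element-wise.
I_vec = a >= threshold;

assert(isequal(I_loop, I_vec));   % both approaches agree
num_above = nnz(I_vec);           % 13 elements (12 through 24) meet the condition
```

Here nnz counts the true entries, which is often all that is needed once the logical array exists.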
array = [2 1 3 2 1]
for i = 2:length(array)
    value = array(i);
    j = i - 1;
    array_j=array(1:j);
    array_j_indices=cumsum(array_j>value);
    [~,n]=find(array_j_indices==1);
    newArray=array;
    array(n+1:i)=array_j(array_j>value);
    j=j-max(array_j_indices);
    array(j+1) = value;
end %forLoop
disp(array);
Hello,
I saw this code for vectorising the while loop in insertion code, but I cannot understand how it works.
How does cumsum(array_j>value) work? I understand the cumsum function and have tested it, but I cannot see how the relational operator in (array_j>value) works inside cumsum within the for loop.
Also, I don't understand how [~,n]=find(array_j_indices==1) stores values in n. Does it only store the columns because there is a not (~) in place of the rows?
cumsum(array_j>value)?
array_j>value: due to the sorted nature of array_j, the result is always some zeros followed by some ones, e.g. [0 0 0 0 1 1 1 1]
cumsum(array_j>value) = [0 0 0 0 1 2 3 4]: at most one element will be equal to 1.
[~,n]=find(array_j_indices==1); ?
Because there is only one row, this is equal to n=find(array_j_indices==1);.
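The two steps can be reproduced on a small example. This is only an illustration; array_j and value below are hypothetical and chosen so that the prefix is already sorted, as it is inside the loop.

```matlab
array_j = [1 2 4 5];   % hypothetical sorted prefix
value = 3;             % hypothetical element to insert

mask = array_j > value;          % logical array: [0 0 1 1]
running = cumsum(mask);          % running count of true entries: [0 0 1 2]

% For a row vector the row output of find is always 1, so ~ discards it
% and n receives the column index of the first element greater than value.
[~, n] = find(running == 1);     % n is 3
n_single = find(running == 1);   % same index with a single output
```

The ~ is therefore not a logical not here; it is a placeholder that tells MATLAB to ignore the first output of find.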
Fastest implementation?
Note that this 'vectorised' code is slower than the following (easier) implementation:
for i = 2:length(array)
    value = array(i);
    j = i - 1;
    n=find(array(1:j)>value,1);
    array(n+1:i)=array(n:j);
    array(n) = value;
end
and much slower than the built-in MATLAB sort function.
I would like to know if there is a way to get rid of the inner for loop
for i = 1:numel(VALUES)
    for k = 2:bins+1
        if VALUES(i) < Arr(k)
            answer_list(i) = find(Arr == Arr(k)) - 1;
            break
        end
    end
end
VALUES is a file with 100 doubles ranging from 2 to 4.
Arr is an array with 4 values, starting at the minimum of VALUES, increasing in steps of 1, and ending at the maximum of VALUES.
bins is the length of Arr minus 1,
and answer_list is a column of numbers as long as VALUES that holds the discrete value for each entry, depending on the size of the bins variable.
I think this is what you look for (in comments are the references to the original lines in your code):
out = bsxfun(@lt,VALUES(:).',Arr(:)); % if VALUES(i) < Arr(k):
out2 = size(out,1)-cumsum(out,1); % find(Arr == Arr(k)) - 1;
answer_list = out2(end,any(out,1)).';
This replaces the whole code, not only the inner loop.
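As an alternative to the bsxfun approach, the built-in discretize function assigns each value to a bin given a vector of edges. The sketch below compares it with the loop from the question on hypothetical sample data. It assumes Arr is sorted with unique entries, so find(Arr == Arr(k)) - 1 reduces to k - 1. The two differ at the boundaries: discretize places a value equal to the last edge in the last bin and returns NaN for values outside the edges, whereas the loop leaves such entries unassigned.

```matlab
% Hypothetical sample data; Arr holds the sorted bin edges.
VALUES = [2.0; 2.5; 3.2; 3.9];
Arr = [2 3 4 5];
bins = numel(Arr) - 1;

% Reference: the loop from the question (k - 1 replaces the find call,
% assuming the entries of Arr are unique).
answer_loop = zeros(numel(VALUES), 1);
for ii = 1:numel(VALUES)
    for k = 2:bins+1
        if VALUES(ii) < Arr(k)
            answer_loop(ii) = k - 1;
            break
        end
    end
end

% discretize returns, for each value, the index of the bin it falls into.
answer_vec = discretize(VALUES, Arr);

disp([answer_loop answer_vec])   % both columns: 1 1 2 2
```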
I have a MATLAB time series data set consisting of a signal that can only be 1 or 0. How can I get rid of all the values except for the ones where the signal changes?
For example:
1
1
1
0
1
0
0
0
should ideally result in
1
0
1
0
while keeping the correct time values as well of course.
The thing is that I need to find the frequency of the signal. The time should be measured from one 0->1 transition to the next 0->1 transition. The smallest time (highest frequency) is what I need in the end.
Thanks!
You can use the getsamples method to get a time series containing a subset of the original samples. It remains to identify the indices where the time series changes; for this you can use diff and logical indexing:
ts = timeseries([1 1 1 0 1 0 0 0],1:8)
ts.getsamples([true;squeeze(diff(ts.Data)) ~= 0])
A simple and clever call to diff should be sufficient:
>> A = [1; 1; 1; 0; 1; 0; 0; 0];
>> B = A(diff([-Inf; A]) ~= 0)
B =
1
0
1
0
The code is quite simple. diff computes the differences between adjacent elements of an array. Concretely, given an array A, the output has the following structure:
B = [A(2) - A(1), A(3) - A(2), ..., A(N) - A(N-1)];
N is the total length of the signal, so the result has N-1 elements. A trick you can use is to prepend -Inf (or any value that cannot occur in the signal) to A, so that the difference between this extra element and the true first element is non-zero and the first sample is always kept. That is what diff([-Inf; A]) does. Next, check where the differences are non-zero: each non-zero difference marks a position where the signal changed, so that sample should be kept. The comparison produces a logical array, and the last step is to use it to index into A to get the result.
This only extracts the signal values, however. If you also want the times at which the changes occur, suppose you have a time vector t that is as long as the signal stored in A. You would first store the logical vector in a separate variable, then index into both the time array and the signal array to extract what you need (original idea from user dfri):
ind = diff([-Inf; A]) ~= 0;
times = t(ind);
B = A(ind);
You can use diff and logical to build a logical array, which then serves as an index filter for your data (say t for time and y for the Boolean values):
%// example
t = 0:0.01:0.07;
y = [1,1,1,0,1,0,0,0];
%// find indices to keep
keep = [true logical(diff(y))];
%// truncated data
tTrunc = t(keep)
yTrunc = y(keep)
with the results for the example as follows
tTrunc =
0 0.0300 0.0400 0.0500
yTrunc =
1 0 1 0
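The question also asks for the highest frequency, measured from one 0->1 transition to the next. The answers above stop at extracting the change points, so here is a minimal sketch of that last step. The signal and time vector are hypothetical, with t in seconds, and the code assumes the signal contains at least two rising edges.

```matlab
% Hypothetical signal with three rising edges; t is in seconds.
t = 0:9;
y = [0 1 0 0 1 0 1 0 0 0];

% diff(y) equals 1 exactly where the signal goes from 0 to 1;
% adding 1 gives the index of the sample that holds the new value.
rising = find(diff(y) == 1) + 1;

tRise = t(rising);          % times of the 0->1 transitions: 1 4 6
periods = diff(tRise);      % time between consecutive rising edges: 3 2
minPeriod = min(periods);   % shortest period: 2
maxFreq = 1 / minPeriod;    % highest frequency: 0.5 per second
```

With fewer than two rising edges, periods is empty and no frequency can be derived from the data.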