How can I perform this cumulative sum in MATLAB?

I want to calculate a cumulative sum of the values in column 2 of dat.txt below for each string of ones in column 1. The desired output is shown as dat2.txt:
dat.txt     dat2.txt
1 20        1 20 20     % 20 + 0
1 22        1 22 42     % 20 + 22
1 20        1 20 62     % 42 + 20
0 11        0 11 11
0 12        0 12 12
1 99        1 99 99     % 99 + 0
1 20        1 20 119    % 20 + 99
1 50        1 50 169    % 50 + 119
Here's my initial attempt:
fid=fopen('dat.txt');
A =textscan(fid,'%f%f');
in =cell2mat(A);
fclose(fid);
i = find(in(2:end,1) == 1 & in(1:end-1,1)==1)+1;
out = in;
cumulative =in;
cumulative(i,2)=cumulative (i-1,2)+ cumulative(i,2);
fid = fopen('dat2.txt','wt');
format short g;
fprintf(fid,'%g\t%g\t%g\n',[out cumulative(:)]');
fclose(fid);

Here's a completely vectorized (albeit somewhat confusing-looking) solution that uses the functions CUMSUM and DIFF along with logical indexing to produce the results you want:
>> data = [1 20;... %# Initial data
1 22;...
1 20;...
0 11;...
0 12;...
1 99;...
1 20;...
1 50];
>> data(:,3) = cumsum(data(:,2));         %# Add a third column containing the
                                          %#   cumulative sum of column 2
>> index = (diff([0; data(:,1)]) > 0);    %# Find a logical index showing where
                                          %#   continuous groups of ones start
>> base = data(:,3)-data(:,2);            %# Running total just before each row
>> step = zeros(size(base));
>> step(index) = diff([0; base(index)]);  %# Change in the adjustment at each
                                          %#   group start, so that it also works
                                          %#   for three or more groups
>> offset = cumsum(step);                 %# An adjustment required to zero the
                                          %#   cumulative sum at the start of a
                                          %#   group of ones
>> data(:,3) = data(:,3)-offset;          %# Apply the offset adjustment
>> index = (data(:,1) == 0);              %# Find a logical index showing where
                                          %#   the first column is zero
>> data(index,3) = data(index,2)          %# For each zero in column 1 set the
                                          %#   value in column 3 to be equal to
                                          %#   the value in column 2
data =
1 20 20
1 22 42
1 20 62
0 11 11
0 12 12
1 99 99
1 20 119
1 50 169
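The sample data has only two groups of ones, so it is worth checking the same idea on an input with three. The sketch below is a self-contained check rather than part of the answer above: the test data is made up, the names flag, vals, base, step and expected are illustrative, and the plain loop serves only as a reference result.

```matlab
% Three groups of ones separated by zeros (made-up test data)
flag = [1 1 0 1 1 0 1].';
vals = [1 2 3 4 5 6 7].';

total  = cumsum(vals);                   % cumulative sum over all rows
starts = diff([0; flag]) > 0;            % true where a group of ones begins
base   = total - vals;                   % running total just before each row
step   = zeros(size(vals));
step(starts) = diff([0; base(starts)]);  % change in offset at each group start
result = total - cumsum(step);           % restart the sum at each group
result(flag == 0) = vals(flag == 0);     % zero rows keep their own value

% Reference result from a straightforward loop
expected = zeros(size(vals));
running = 0;
for k = 1:numel(vals)
    if flag(k) == 0
        running = 0;
        expected(k) = vals(k);
    else
        running = running + vals(k);
        expected(k) = running;
    end
end

disp(isequal(result, expected))          % 1 if both approaches agree
```

Using the difference between successive group-start totals (rather than the totals themselves) is what keeps the offset correct when more than two groups are present.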

This solution is not completely vectorized (it loops over the runs of consecutive 1s), but it should be faster. For your data the loop runs only twice. It uses MATLAB's CUMSUM function and assumes that d holds the two columns read from dat.txt.
istart = find(diff([0; d(:,1)])==1); %# start indices of sequential 1s
iend = find(diff([d(:,1); 0])==-1); %# end indices of sequential 1s
dcum = d(:,2);
for ind = 1:numel(istart)
dcum(istart(ind):iend(ind)) = cumsum(dcum(istart(ind):iend(ind)));
end
dlmwrite('dat2.txt',[d dcum],'\t') %# write the tab-delimited file

d=[
1 20
1 22
1 20
0 11
0 12
1 99
1 20
1 50
];
disp(d)
out=d;
%add a column
out(:,3)=0;
csum=0;
for(ind=1:length(d(:,2)))
if(d(ind,1)==0)
csum=0;
out(ind,3)=d(ind,2);
else
csum=csum+d(ind,2);
out(ind,3)=csum;
end
end
disp(out)

Related

How to shift non circularly in Matlab

I am trying to shift non circularly in MATLAB so even if I shift outside of the index it will add 0s to correct it. I tried following the answer in How do I shift columns (left or right) in a matrix? but had no success.
data = [1 2 3 4 5; 11 12 13 14 15; 21 22 23 24 25; 31 32 33 34 35]
d = 3; % shift; positive/negative for right/left
result = zeros(size(data), 'like', data); % preallocate with zeros
result(:,max(1,1+d):min(end,end+d)) = data(:,max(1,1-d):min(end,end-d)); % write values
In my output, result has the same size as data but contains nothing but zeros.
Desired output:
0 0 0 1 2 3 4 5
0 0 0 11 12 13 14 15
0 0 0 21 22 23 24 25
0 0 0 31 32 33 34 35
You can do it by creating a matrix result of the final size, filled with zeros, and then copying the original data into it, making sure you place the data at the right indices.
What you have in your example code does not do what you ask. If I run it, the result is padded correctly but truncated to the size of the original data matrix. Some shifts are defined this way (with the shifted-out columns dropped altogether), but that's not what you asked for.
A simple way to do it is to create a padding matrix of the proper size and concatenate it with your original data matrix, as below:
%% Initial data
data = [1 2 3 4 5; 11 12 13 14 15; 21 22 23 24 25; 31 32 33 34 35] ;
d = 3 ;
%% shift and pad with zeros
nrows = size(data,1) ; % Number of rows in [data]
pad = zeros( nrows , abs(d) ) ; % create padding matrix
if d>0
result = [pad data] ; % Concatenate the filler matrix on the left
else
result = [data pad] ; % Concatenate the filler matrix on the right
end
And just to be sure:
>> result
result =
0 0 0 1 2 3 4 5
0 0 0 11 12 13 14 15
0 0 0 21 22 23 24 25
0 0 0 31 32 33 34 35
If you want to keep the approach from your example code, you have to adjust it a bit to allow for the new columns:
%% create result and copy data
result = zeros( size(data,1) , size(data,2)+abs(d) ) ;
colStart = max(1,1+d) ;
result(:,colStart:colStart+size(data,2)-1) = data ;
This will create the same result matrix as above.
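To make the difference between the two kinds of shift concrete, here is a small sketch that runs both on a two-row example. The names sameSize and grown are illustrative only; the first block reuses the indexing from the question, the second the padding approach.

```matlab
data = [1 2 3 4 5; 11 12 13 14 15];
d = 3;                                    % positive = right, negative = left

% Same-size shift: columns pushed past the edge are dropped
sameSize = zeros(size(data), 'like', data);
sameSize(:, max(1,1+d):min(end,end+d)) = data(:, max(1,1-d):min(end,end-d));

% Growing shift: the matrix gains abs(d) columns of zeros
pad = zeros(size(data,1), abs(d), 'like', data);
if d > 0
    grown = [pad data];                   % zeros on the left
else
    grown = [data pad];                   % zeros on the right
end

disp(sameSize)
disp(grown)
```

With d = 3, sameSize keeps only the first two columns of data behind three zero columns, while grown keeps all five.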

Find last true element of columns

I'd like to extract one value per column of a matrix using a condition. Multiple values on each column match that condition, but only the last one should be selected. It is safe to assume that each row contains at least one such value.
So given an NxM matrix and an equally-sized boolean, extract M values for which the boolean is true and it is the last true value in a column. For example:
m = magic(4);
i = (m > 10);
% m =
% 16 2 3 13
% 5 11 10 8
% 9 7 6 12
% 4 14 15 1
% i =
% 1 0 0 1
% 0 1 0 0
% 0 0 0 1
% 0 1 1 0
And the expected output:
% i_ =
% 1 0 0 0
% 0 0 0 0
% 0 0 0 1
% 0 1 1 0
% x = [16, 14, 15, 12]
I know this could be easily achieved by looping through the columns and using find, but in my experience there often are better ways of formulating these problems.
This would do it
m(max(i.*reshape([1:numel(m)],size(m))))
Explanation
So we are generating an array of indices
reshape([1:numel(m)],size(m))
ans =
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
These are the linear indices of each value. Then we multiply that by i to keep only the indices we are interested in:
i.*reshape([1:numel(m)],size(m))
ans =
1 0 0 13
0 6 0 0
0 0 0 15
0 8 12 0
Then we do a max on that since max works on columns. This will give us the last index in each column.
max(i.*reshape([1:numel(m)],size(m)))
ans =
1 8 12 15
Then apply those indices on m to get the values
m(max(i.*reshape([1:numel(m)],size(m))))
ans =
16 14 15 12
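The question also lists a mask i_ marking the selected elements. As a sketch, it can be built from the same linear indices. The names linIdx, lastIdx and lastMask are illustrative, and the code assumes every column has at least one true entry (otherwise the maximum is 0, which is not a valid index).

```matlab
m = magic(4);
i = (m > 10);

linIdx   = reshape(1:numel(m), size(m));  % linear index of every element
lastIdx  = max(i .* linIdx, [], 1);       % largest true index in each column
lastMask = false(size(i));
lastMask(lastIdx) = true;                 % mask with one true per column
x = m(lastIdx);                           % values at those positions

disp(lastMask)
disp(x)
```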
You can use the second output of max to find the last true element of each column. Before that, the logical matrix should be multiplied by an increasing column vector.
[~, idx] = max((1:size(i, 1)).' .* i, [], 1, 'linear') ;
x = m(idx) ;
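The 'linear' option of max and the implicit expansion in the multiplication are only available in fairly recent MATLAB releases. As a hedged sketch for older versions, the same result can be obtained with bsxfun and sub2ind. The names rowNumbers and lastRow are illustrative, and every column is again assumed to contain at least one true value.

```matlab
m = magic(4);
i = (m > 10);

rowNumbers = (1:size(i,1)).';                          % 1, 2, ..., number of rows
weighted   = bsxfun(@times, rowNumbers, double(i));    % row number where i is true, else 0
[~, lastRow] = max(weighted, [], 1);                   % row of the last true per column
idx = sub2ind(size(m), lastRow, 1:size(m,2));          % convert (row, column) to linear index
x = m(idx);

disp(x)
```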
Here's another way, using accumarray:
[~, col] = find(i); % column indices
lin = find(i); % linear indices
x = accumarray(col, m(lin), [], @(x) x(end));

Dividing a matrix into two parts

I am trying to classify my dataset. To do this, I will use the 4th column of my dataset. If the 4th column of a row is equal to 1, that row will be added to a new matrix called Q1. If the 4th column of a row is equal to 2, that row will be added to matrix Q2.
My code:
i = input('Enter a start row: ');
j = input('Enter a end row: ');
search = importfiledataset('search-queries-features.csv',i,j);
[n, p] = size(search);
if j>n
disp('Please enter a smaller number!');
end
for s = i:j
class_id = search(s,4);
if class_id == 1
Q1 = search(s,1:4)
elseif class_id ==2
Q2 = search(s,1:4)
end
end
This calculates the Q1 and Q2 matrices, but they are always 1x4: each time a new Q1 is assigned, the old one is deleted. I need to append the new row instead, making Q1 2x4, 3x4 and so on whenever the condition is true.
In short, I am trying to divide my dataset into two parts using for loops and if statements.
Dataset:
I need outcome like:
Q1 = [30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1]
Q2 = [34 59 0 2
34 66 9 2]
How can I prevent my code from deleting previous rows of Q1 and Q2 and obtain the entire matrices?
The main problem in your calculation is that you overwrite Q1 and Q2 each loop iteration. Best solution: get rid of the loops and use logical indexing.
You can use logical indexing to quickly determine where a column is equal to 1 or 2:
search = [
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 59 0 2
34 66 9 2
34 58 30 1
34 60 1 1
34 61 10 1
];
Q1 = search(search(:,4)==1,:) % == compares each entry in the fourth column to 1
Q2 = search(search(:,4)==2,:)
Q1 =
30 64 1 1
30 62 3 1
30 65 0 1
31 59 2 1
31 65 4 1
33 58 10 1
33 60 0 1
34 58 30 1
34 60 1 1
34 61 10 1
Q2 =
34 59 0 2
34 66 9 2
Warning: Slow solution!
If you are hell bent on using loops, make sure to not overwrite your variables. Either extend them each iteration (which is very, very slow):
Q1=[];
Q2=[];
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1 = [Q1;search(ii,:)];
end
if search(ii,4)==2
Q2 = [Q2;search(ii,:)];
end
end
MATLAB will put orange wiggles beneath Q1 and Q2, because it's a bad idea to grow arrays in-place. Alternatively, you can preallocate them as large as search and strip off the excess:
Q1 = zeros(size(search)); % Initialise to be as large as search
Q2 = zeros(size(search));
Q1kk = 1; % Initialise counters
Q2kk = 1;
for ii = 1:size(search,1) % loop over all rows
if search(ii,4)==1
Q1(Q1kk,:) = search(ii,:); % store
Q1kk = Q1kk + 1; % Increase row counter
end
if search(ii,4)==2
Q2(Q2kk,:) = search(ii,:);
Q2kk = Q2kk + 1;
end
end
Q1 = Q1(1:Q1kk-1,:); % strip off excess rows
Q2 = Q2(1:Q2kk-1,:);
Another option using accumarray, if Q is your original matrix:
Q = accumarray(Q(:,4),1:size(Q,1),[],@(x){Q(x,:)});
You can access the result with Q{1} (for class_id = 1), Q{2} (for class_id = 2) and so on...
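One caveat with accumarray is that it does not promise to hand the row indices to the function in their original order when the subscripts are unsorted, as they are here. The sketch below sorts the indices inside the anonymous function so each group keeps the row order of the input. The shortened data set and the names rowIdx and groups are illustrative.

```matlab
search = [30 64  1 1
          30 62  3 1
          34 59  0 2
          34 66  9 2
          34 58 30 1];

rowIdx = (1:size(search,1)).';                 % one index per row
% Collect the rows of each class into its own cell, in input order
groups = accumarray(search(:,4), rowIdx, [], @(r) {search(sort(r),:)});

Q1 = groups{1};                                % rows with class_id == 1
Q2 = groups{2};                                % rows with class_id == 2
disp(Q1)
disp(Q2)
```

Storing the result in a separate variable also avoids overwriting the original matrix while the anonymous function still refers to it.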

All unique multiplication products

I'd like to obtain all unique products for a given vector.
For example, given a:
a = [4,10,12,3,6]
I want to obtain a matrix that contains the results of:
4*10
4*12
4*3
4*6
10*12
10*3
10*6
12*3
12*6
3*6
Is there a short and/or quick way of doing this in MATLAB?
EDIT: a may contain duplicate numbers, giving duplicate products - and these must be kept.
Given:
a =
4 10 12 3 6
Construct the matrix of all pairwise products:
>> all_products = a .* a.'
all_products =
16 40 48 12 24
40 100 120 30 60
48 120 144 36 72
12 30 36 9 18
24 60 72 18 36
Now, construct a mask to keep only those values below the main diagonal:
>> mask = tril(true(size(all_products)), -1)
mask =
0 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 1 1 0 0
1 1 1 1 0
and apply the mask to the product matrix:
>> unique_products = all_products(mask)
unique_products =
40
48
12
24
120
30
60
36
72
18
If you have the Statistics Toolbox, you can abuse pdist, which considers only one of the two possible orders for each pair:
result = pdist(a(:), @times);
One option involves nchoosek, which returns all combinations of k elements out of a vector, with one combination per row. prod then computes the product along the rows or columns:
a = [4,10,12,3,6];
b = nchoosek(a,2);
b = prod(b,2); % product along dimension 2, i.e. of each row
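As a quick sanity check, the mask approach and the nchoosek approach can be compared on an input with a repeated value, since the question requires duplicate products to be kept. This is only a sketch: the input is made up, the names allProducts, viaMask and viaNchoosek are illustrative, and the outer product is written as a matrix multiplication so that it does not rely on implicit expansion.

```matlab
a = [2 3 2];                                  % made-up input with a repeated value

allProducts = a(:) * a(:).';                  % all pairwise products (outer product)
mask        = tril(true(numel(a)), -1);       % strictly below the main diagonal
viaMask     = allProducts(mask);              % one product per unordered pair

viaNchoosek = prod(nchoosek(a, 2), 2);        % product of each 2-element combination

disp([viaMask viaNchoosek])                   % the two columns should match
```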
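Try starting with this: build the outer product of a with itself and have the unique function filter the result. For a row vector a, a*a' is the inner product (a single scalar), so the outer product a'*a is used here. Note that this also includes the squares from the diagonal and drops duplicate products, so it does not meet the requirement in the EDIT above.
b = unique(a'*a)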

Matlab: How I can make this transformation on the matrix A?

I have a matrix A 4x10000, I want to use it to find another matrix C.
I'll simplify my problem with a simple example:
Starting from a matrix A:
20  4  4 74 20 20
36  1  1 11 36 36
77  1  1 15 77 77
 3  4  2  6  7  8
I first want to find an intermediate entity B:
              2  3  4  6  7  8
[20 36 77]    0  1  0  0  1  1    3
[ 4  1  1]    1  0  1  0  0  0    2
[74 11 15]    0  0  0  1  0  0    1
An entry is 1 if the value in the first line, together with the vector on the left, forms a column of the matrix A.
The last column of B is the number of 1s in each line.
Finally, I want a matrix C made up of the vectors on the left of B, but only those whose count of 1s is greater than or equal to 2.
For my example:
    20  4
C = 36  1
    77  1
N.B: for my problem, I use a matrix A 4x10000
See if this works for you -
%// We need to replace this, as it's not available in your old version of MATLAB:
%// [unqcols,~,col_match] = unique(A(1:end-1,:).','rows','stable')
A1 = A(1:end-1,:).';
[unqmat_notinorder,row_ind,labels] = unique(A1,'rows');
[tmp_sortedval,ordered_ind] = sort(row_ind);
unqcols = unqmat_notinorder(ordered_ind,:);
[tmp_matches,col_match] = ismember(labels,ordered_ind);
%// OR use - "[tmp2,col_match] = ismember(A1,unqcols,'rows');"
C = unqcols(sum(bsxfun(@eq,col_match,1:max(col_match)),1)>=2,:).';
%// OR use - "C = unqcols(accumarray(col_match,1)>=2,:).'"
This should work:
[a,~,c] = unique(A(1:end-1,:).', 'rows', 'stable');
C=a(histc(c,unique(c))>=2, :).';
Edit: For older versions of MATLAB:
D=A(1:end-1,:);
C=unique(D(:,squeeze(sum(all(bsxfun(@eq, D, permute(D, [1 3 2])))))>=2).', 'rows').';
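For reference, here is a self-contained sketch that runs the 'stable' variant on the example matrix from the question. It counts occurrences with accumarray instead of histc, which is a substitution on my part. The names vecs, label and counts are illustrative, and the count is the number of columns of A that share the same first three entries, which equals the sum of 1s in B as long as no complete column of A is repeated.

```matlab
A = [20  4  4 74 20 20
     36  1  1 11 36 36
     77  1  1 15 77 77
      3  4  2  6  7  8];

% Unique top-three-row vectors, in order of first appearance
[vecs, ~, label] = unique(A(1:end-1,:).', 'rows', 'stable');

counts = accumarray(label, 1);      % how many columns of A carry each vector
C = vecs(counts >= 2, :).';         % keep vectors that occur at least twice

disp(C)
```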