removing uniform columns in MATLAB - matlab

Say I have a 2D matrix A:
A = [ 1 1 0 0
1 0 0 0
1 1 1 0];
A is not necessarily binary, or even integer (i.e., floats are possible). I want to remove any column whose elements all have the same value. In the above example, I would get:
1 0
0 0
1 1
To make this fully general, I'd like to let the user select the dimension along which rows/columns/slices are removed (i.e., with a DIM option).
Any ideas?

You could try using the min and max functions, which allow you to use the dim argument.
For example
index = min(A,[],1)==max(A,[],1);
A(:,index)=[];
will remove the columns you want. It is straightforward to do the same for rows
index = min(A,[],2)==max(A,[],2);
A(index,:)=[];
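For the DIM option asked about in the question, the two snippets above can be folded into one. This is only a minimal sketch for the 2D case; the variable names dim, uniform and B are illustrative and not from the original answer. Note that min and max skip NaN by default, so a column mixing NaN with a single repeated value would also be treated as uniform.

```matlab
A   = [1 1 0 0; 1 0 0 0; 1 1 1 0];
dim = 1;   % 1: compare down each column, 2: compare across each row

% A slice is uniform when its minimum equals its maximum along dim.
uniform = min(A, [], dim) == max(A, [], dim);

if dim == 1
    B = A(:, ~uniform);   % drop uniform columns
else
    B = A(~uniform, :);   % drop uniform rows
end
disp(B)
```

Unlike A(:,index)=[], this keeps A intact and returns the reduced matrix in B.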

One-liner:
B = A(:,range(A)~=0); %//columns
The other one-liner is not that nice, and ugly one-liners should not be written down. :-) This is basically the same solution as S..'s, except that it is far more expensive and requires the Statistics Toolbox (for range).
Please note that the "generality" of subscript-based solutions doesn't extend to N-dimensional arrays as easily, because subscripting an N-D array without first checking its number of dimensions is difficult. Also, for 1D arrays the notion of "uniformity" is a bit odd along the singleton dimension (the result is always empty).

Besides the neat solution provided by @S.., there is also this simple hack for your example:
for ii = 1:size(A,2)
T(ii) = all(A(:,ii) == sum(A(:,ii))/numel(A(:,ii)));
end
A(:,~T)
ans =
1 0
0 0
1 1
As suggested by @gariepy, the right-hand side of the comparison can be replaced with the mean function:
for ii = 1:size(A,2)
T(ii) = all( A(:,ii) == mean(A(:,ii)) );
end
A(:,~T)
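One caveat with the mean-based test: the question allows floats, and for floating-point data the mean of a constant column is not guaranteed to equal its elements exactly, because of rounding in the sum. The sketch below only illustrates how to check this for a given column; col, viaMean and viaFirst are hypothetical names. Comparing against the first element involves no arithmetic and so avoids the issue.

```matlab
col = [0.1; 0.1; 0.1];             % a constant, non-integer column

viaMean  = all(col == mean(col));  % depends on rounding in sum(col)/numel(col)
viaFirst = all(col == col(1));     % exact comparison, no arithmetic involved

fprintf('mean-based test:  %d\n', viaMean);
fprintf('first-based test: %d\n', viaFirst);
```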

A(:,~all(A == repmat(A(1,:),[size(A,1) 1])))
Inspired by @S.., but this only checks whether every element of a column equals the first element of that column. That seems like a little less work for the processor than finding the min and the max and checking them for equality.
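The same first-row comparison can be written without repmat by letting bsxfun expand the row. This is a sketch rather than the original author's code; passing the dimension to all explicitly is an added precaution so that a single-row A is still reduced column by column.

```matlab
A = [1 1 0 0; 1 0 0 0; 1 1 1 0];

% Compare every row of A with its first row, column by column.
sameAsFirst = bsxfun(@eq, A, A(1,:));

% Keep the columns in which at least one element differs from the first.
B = A(:, ~all(sameAsFirst, 1));
disp(B)
```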

Related

Matlab Insert NaN into array

I need to insert NaNs into specific positions of an array.
I wrote code that does this correctly, but as I need to do it for really large arrays, it takes too long to run. Numpy has the function insert(i, x) that inserts an item at a given position. Is there a similar function in Matlab? Or is there a more efficient way to do it?
a = [1 2 3 5 6 7 9 10 13 14];
insertNanIndex = [0 1 0 1 1 0 0 0 1 1 0 0 1 0 0 0];
for i = find(insertNanIndex)
a = [a(1:i-1), NaN, a(i:end)]
end
The efficient way to do this would be to pre-compute the size of the result, make sure that insertNanIndex was large enough to serve as a mask, and insert a into the correct indices all at once. Right now, you are literally re-allocating the entire array for every NaN. Numpy's insert function would be equally inefficient because it would be doing the same operation.
If, as in your example, the number of zeros matches the number of elements of a, you can allocate an array based on insertNanIndex and mask it directly:
result = nan(size(insertNanIndex));
result(~insertNanIndex) = a;
If the number of zeros in insertNanIndex is not equal to the number of elements of a, you can pad or trim it, but in that case it is no longer clear what the operation should mean.
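As a quick check, using the data from the question, the masked assignment can be compared against the original loop. The names looped and result are illustrative; isequaln is used because NaN never compares equal under isequal.

```matlab
a = [1 2 3 5 6 7 9 10 13 14];
insertNanIndex = [0 1 0 1 1 0 0 0 1 1 0 0 1 0 0 0];

% Loop version from the question (reallocates on every insertion).
looped = a;
for i = find(insertNanIndex)
    looped = [looped(1:i-1), NaN, looped(i:end)];
end

% Masked version: allocate once, then fill the non-NaN slots with a.
result = nan(size(insertNanIndex));
result(~insertNanIndex) = a;

disp(isequaln(looped, result))   % isequaln treats NaNs as equal
```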

vectorising while loop insertion sort matlab

array = [2 1 3 2 1]
for i = 2:length(array)
value = array(i);
j = i - 1;
array_j=array(1:j);
array_j_indices=cumsum(array_j>value);
[~,n]=find(array_j_indices==1);
newArray=array;
array(n+1:i)=array_j(array_j>value);
j=j-max(array_j_indices);
array(j+1) = value;
end %forLoop
disp(array);
Hello,
I saw this code for vectorising the while loop of an insertion sort, but I cannot seem to understand how it works.
How does cumsum(array_j>value) work? I understand and have tested the cumsum function, but I can't see how the relational expression (array_j>value) works inside cumsum within the for loop.
Also, I don't understand how [~,n]=find(array_j_indices==1) stores values in n. Does it store only the column indices because the row output is replaced by a tilde (~)?
cumsum(array_j>value)?
array_j>value: due to the sorted nature of array_j, the result is always some zeros followed by some ones, e.g. [0 0 0 0 1 1 1 1]
cumsum(array_j>value) = [0 0 0 0 1 2 3 4]: at most one element will be equal to 1.
[~,n]=find(array_j_indices==1); ?
Because there is only one row, this is equal to n=find(array_j_indices==1);.
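A small worked example may make the two steps concrete. The values of array_j and value below are made up for illustration; mask and counts are hypothetical names for the intermediate results.

```matlab
array_j = [1 2 4 5 7];        % already-sorted prefix of the array
value   = 3;                  % element to be inserted

mask   = array_j > value;     % logical [0 0 1 1 1]
counts = cumsum(mask);        % running count [0 0 1 2 3]

% The first element larger than value is where the count first hits 1.
[~, n] = find(counts == 1);   % column index; same as find(counts == 1) here
disp(n)
```

Here n is 3, the position of the first element greater than value, which is exactly where the new element has to go.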
Fastest implementation?
Note that this 'vectorised' code is slower than the following (simpler) implementation:
for i = 2:length(array)
value = array(i);
j = i - 1;
n=find(array(1:j)>value,1);
array(n+1:i)=array(n:j);
array(n) = value;
end
and much slower than the built-in matlab sort method.

Finding the greatest common divisor of a matrix in MATLAB

I'm looking for a way to divide the elements of a matrix by their greatest common divisor.
For example, I have the matrix
[0,0,0; 2,4,2;-2,0,8]
I can tell the greatest common divisor is 2, so the matrix after the division will be
[0,0,0;1,2,1;-1,0,4]
What is the built-in method that can compute this?
Thanks in advance
P.S. I personally do not like using loops for this computation; it seems like there should be a built-in function that can perform this element-wise division.
Since you don't like loops, how about recursive functions?
iif = @(varargin) varargin{2 * find([varargin{1:2:end}], 1, 'first')}();
gcdrec=@(v,gcdr) iif(length(v)==1,v, ...
v(1)==1,1, ...
length(v)==2,@()gcd(v(1),v(2)), ...
true,@()gcdr([gcd(v(1),v(2)),v(3:end)],gcdr));
mygcd=@(v)gcdrec(v(:)',gcdrec);
A=[0,0,0; 2,4,2;-2,0,8];
divisor=mygcd(A);
A=A/divisor;
The first function, iif, defines an inline conditional construct. This makes it possible to define a recursive function, gcdrec, that finds the greatest common divisor of your array. iif works like this: it tests whether the first argument is true, and if it is, it returns the second argument. Otherwise it tests the third argument, and if that's true, it returns the fourth, and so on. You need to protect recursive calls, and sometimes other quantities appearing inside it, with @(), otherwise you can get errors.
Using iif the recursive function gcdrec works like this:
if the input vector is a scalar, it returns it
else if the first component of the vector is 1, there's no chance to recover, so it returns 1 (allows quick return for large matrices)
else if the input vector is of length 2, it returns the greatest common divisor via gcd
else it calls itself with a shortened vector, in which the first two elements are substituted with their greatest common divisor.
The function mygcd is just a front-end for convenience.
This should be pretty fast, and I guess only the recursion depth could be a problem for very large inputs. I did a quick timing check to compare with the looping version of @Adriaan, using A=randi(100,N,N)-50, with N=100, N=1000 and N=5000 and tic/toc.
N=100:
looping 0.008 seconds
recursive 0.002 seconds
N=1000:
looping 0.46 seconds
recursive 0.04 seconds
N=5000:
looping 11.8 seconds
recursive 0.6 seconds
Update: interestingly, the only reason I didn't trip the recursion limit (which is 500 by default) is that my data didn't have a common divisor. Generating a random matrix and doubling it hits the recursion limit already for N=100. So for large matrices this won't work. Then again, for small matrices @Adriaan's solution is perfectly fine.
I also tried to rewrite it to halve the input vector in each recursive step: this indeed solves the recursion limit problem, but it is very slow (2 seconds for N=100, 261 seconds for N=1000). There might be a middle ground somewhere, where the matrix size is large(ish) and the runtime's not that bad, but I haven't found it yet.
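The recursion above reduces the vector pairwise with gcd. As a sketch of the same reduction written iteratively (this swaps the recursion for a plain loop, so it is not the method described in this answer and its speed has not been measured), the recursion limit never comes into play. The names g and Ared are illustrative; the guard covers an all-zero matrix, for which the running gcd stays 0.

```matlab
A = [0,0,0; 2,4,2; -2,0,8];

% Fold gcd over all elements; gcd(0,x) is abs(x), so 0 is a neutral start.
g = 0;
for x = A(:).'
    g = gcd(g, x);
end

% Divide only when a nonzero divisor was found.
if g > 0
    Ared = A / g;
else
    Ared = A;
end
disp(g)
disp(Ared)
```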
A = [0,0,0; 2,4,2;-2,0,8];
B = 1;
kk = max(abs(A(:))); % start at the end
while B~=0 && kk>=0
tmp = mod(A,kk);
B = sum(tmp(:));
kk = kk - 1;
end
kk = kk+1;
This is probably not the fastest way, but it will do for now. B stores the sum of all elements of the matrix after taking the modulus, and kk is a counter that runs down through the integers. mod(A,kk) computes the remainder after division by kk for each element in A, so if all elements are divisible by kk it returns 0 for each element. sum(tmp(:)) then flattens the remainder matrix into a single column and sums it. That sum is 0 if and only if kk is a common divisor, since the remainders are never negative. As soon as that happens the loop stops. Because kk is decremented at the end of every iteration, it ends up one too low, so one is added after the loop.
Note: I just edited the loop to run backwards since you are looking for the greatest common divisor, not the smallest. With a matrix like [4,8;16,8], the earlier forward-running version would stop at 2, not 4. Apologies for that; this works now, though both other solutions here are much faster.
Finally, dividing matrices can be done like this:
divided_matrix = A/kk;
Agreed, I don't like the loops either! Let's kill them -
unqA = unique(abs(A(A~=0))).'; %//'
col_extent = [2:max(unqA)]'; %//'
B = repmat(col_extent,1,numel(unqA));
B(bsxfun(@gt,col_extent,unqA)) = 0;
divisor = find(all(bsxfun(@times,bsxfun(@rem,unqA,B)==0,B),2),1,'first');
if isempty(divisor)
out = A;
else
out = A/divisor;
end
Sample runs
Case #1:
A =
0 0 0
2 4 2
-2 0 8
divisor =
2
out =
0 0 0
1 2 1
-1 0 4
Case #2:
A =
0 3 0
5 7 6
-5 0 21
divisor =
1
out =
0 3 0
5 7 6
-5 0 21
Here's another approach. Let A be your input array.
Get nonzero values of A and take their absolute value. Call the resulting vector B.
Test each number from 1 to max(B), and see if it divides all entries of B (that is, if the remainder of the division is zero).
Take the largest such number.
Code:
A = [0,0,0; 2,4,2;-2,0,8]; %// data
B = nonzeros(abs(A)); %// step 1
t = all(bsxfun(@mod, B, 1:max(B))==0, 1); %// step 2
result = find(t, 1, 'last'); %// step 3

MATLAB syntax length

I'm reading some MATLAB trying to pick it up. The line below is probably rather simple but I do not understand it.
I understand length will give me the length of a vector, in this case a vector which is part of a struct, index_struct.data_incl.
The actual value of index_struct.data_incl at run time is simply 1. What confuses me is what is inside the brackets, i.e. (index_struct.data_incl == 1). I can't work out what this line is trying to do, simple as it may be!
int_var = length(index_struct.data_incl(index_struct.data_incl == 1));
Try this (but think of x as your index_struct.data_incl):
x = [1 4 5 13 1 1]
length(x(x==1))
ans =
3
It's just counting the number of elements of your x vector that are equal to 1,
because x==1 evaluates to [1 0 0 0 1 1], and then, using logical indexing, x(x==1) evaluates to [1 1 1], whose length is 3.
It could have been written more simply as sum(index_struct.data_incl == 1).
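To see that the variants agree, here is a short sketch on the same x. nnz is added here as a further equivalent and is not mentioned in the answer; mask and the n variables are illustrative names.

```matlab
x = [1 4 5 13 1 1];

mask = (x == 1);                 % logical [1 0 0 0 1 1]

nByLength = length(x(mask));     % the form used in the question
nBySum    = sum(mask);           % the simpler form suggested above
nByNnz    = nnz(mask);           % counts the nonzero (true) entries

disp([nByLength, nBySum, nByNnz])
```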
Without seeing the rest of the code I can only guess, but I suspect that index_struct.data_incl is a vector of length n, meaning that you can read up to n files. All of its values would be 0 at the beginning, and when you read a file you change the corresponding position in index_struct.data_incl from 0 to 1. After some time you can see how many of these files you have read using
int_var = length(index_struct.data_incl(index_struct.data_incl == 1));
because it gives you the number of 1s in the vector index_struct.data_incl.

Create an empty symbolic matrix and predefine the dimension in Matlab?

I want to do some symbolic calculation using Matlab, and then store the values in a matrix.
For numerical work, I often predefine the dimensions in Matlab using zeros, e.g. to create a 4*4 array:
a = zeros(4)
Now I want to do the same thing for a symbolic matrix. Obviously zeros alone doesn't work this time.
I tried to copy the official tutorial at this page http://www.mathworks.com/help/symbolic/sym.html
a = sym('0' ,4) % error
Still didn't work.
Now I have to use ugly code like this:
a = sym('[0 0 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0]');
Since I will use iterations, and the dimension of the matrix grows every time, this method is not convenient.
Do you have any ideas? Thanks a lot!
Num = sym(Num) converts a number or a numeric matrix Num to symbolic form.
a=sym(zeros(4,4))
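A minimal sketch of how this preallocation might be used, assuming the Symbolic Math Toolbox is installed; the size n, the variable x and the assigned expression are made up for illustration.

```matlab
n = 4;
a = sym(zeros(n));    % n-by-n symbolic matrix, every entry is symbolic 0

syms x
a(2,3) = x^2 + 1;     % entries can then be filled in one at a time

disp(class(a))
disp(size(a))
```

Because the size comes from a variable, the same line works when the dimension changes between iterations.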
Can't try but suspect that the variables get initialized as zero by default.
For example when using
a = sym('a' ,[2 2])