I want to find the rows of a matrix that contain specified elements of another matrix.
For example, a=[1 2 3 4 5 6 7] and b=[1 2 0 4;0 9 10 11;3 1 2 12]. I want to find the rows of b that contain at least three elements of a. For this purpose, I used bsxfun as follows:
c=find(sum(any(bsxfun(@eq, b, reshape(a,1,1,[])), 2), 3)>=3);
This works well for small matrices, but for large ones, for example when b has 192799 rows, MATLAB gives the following error:
Requested 192799x4x48854 (35.1GB) array exceeds maximum array size preference.
Creation of arrays greater than this limit may take a long time and cause MATLAB
to become unresponsive. See array size limit or preference panel for more information.
Is there another command that does this task without creating such a large intermediate array?
A possible solution:
a=[1 2 3 4 5 6 7]
b=[1 2 0 4;0 9 10 11;3 1 2 12]
i=ismember(b,a)
idx = sum(i,2)
idx = find(idx>=3)
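As a small sketch of why this avoids the memory problem: ismember(b,a) returns a logical array of the same size as b, so no 3-D intermediate is built. The variable names hits, mask, rows_idx and rows below are only illustrative.

```matlab
a = [1 2 3 4 5 6 7];
b = [1 2 0 4; 0 9 10 11; 3 1 2 12];

hits = sum(ismember(b, a), 2);   % per row of b: number of entries that occur in a
mask = hits >= 3;                % logical column vector, one entry per row of b
rows_idx = find(mask);           % row numbers of the matching rows
rows = b(mask, :);               % the matching rows themselves
```

Note that this counts entries of b, so a value that appears twice in one row of b is counted twice. If distinct values are wanted, that case would need separate handling.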
I'm having trouble randomly shuffling a vector so that no value appears twice in a row (e.g. 1 1 is not acceptable but 1 2 is acceptable), given that each value is repeated equally often.
More specifically, I would like to repeat the vector [1:4] ten times (40 elements in total) so that 1, 2, 3 and 4 would all appear 10 times without any value being repeated consecutively.
If there is any clarification needed please let me know, I hope this question was clear.
This is what I have so far:
cond_order = repmat([1:4],10,1); %make matrix
cond_order = cond_order(:); %make sequence
I know randperm is quite relevant but I'm not sure how to use it with the one condition of non-repeating numbers.
EDIT: Thank you for all the responses.
I realize I was quite unclear. This is the kind of sequence I would like to reject: [1 1 2 2 4 4 4...].
So it doesn't matter if [1 2 3 4] occurs in that order as long as individual values are not repeated consecutively (so both [1 2 3 4 1 2 3 4...] and [4 3 1 2...] are acceptable).
Preferably I am looking for a shuffled vector meeting these criteria:
A) it is random
B) there are no consecutively repeating values (e.g. 1 1 4 4)
C) all four values appear an equal number of times
This uses the rejection sampling idea: keep drawing permutations with randperm until one is found that has no consecutively repeated values.
cond_order = repmat(1:4,10,1); %//make matrix
N = numel(cond_order); %//number of elements
sequence_found = false;
while ~sequence_found
    candidate = cond_order(randperm(N));
    if all(diff(candidate) ~= 0) %// check if no repeated values
        sequence_found = true;
    end
end
result = candidate;
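To check a result against criteria B and C from the question, something like the following sketch could be used. The helper name is_valid and its arguments (n values, each expected k times) are my own and not part of the answer above.

```matlab
% v: candidate sequence, n: number of distinct values (1..n), k: required count of each
is_valid = @(v, n, k) all(diff(v(:)) ~= 0) && ...
    isequal(accumarray(v(:), 1, [n 1]), k * ones(n, 1));

ok_cyclic = is_valid(repmat(1:4, 1, 10), 4, 10);        % 1 2 3 4 1 2 3 4 ... is acceptable
ok_sorted = is_valid(sort(repmat(1:4, 1, 10)), 4, 10);  % 1 1 1 ... has repeated neighbours
```

Calling is_valid(result, 4, 10) after the loop above should return true, since the loop only exits on a permutation with no equal neighbours and a permutation cannot change the counts.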
The solution from mikkola got the method right, but I think there is a more efficient way:
He chose to sample with equal counts and then check the differences. I chose to do it the other way round, sampling with nonzero differences and then checking the counts, and ended up with a solution requiring far fewer iterations.
n=4;
k=10;
d=42; %// dummy initial value chosen to fail the first check
while(~all(sum(bsxfun(@eq,d,(1:n).'),2)==k)) %' //Check that every number appears k times.
    d=mod(cumsum([randi(n,1,1),randi(n-1,1,(n*k)-1)]),n)+1; %// generate new random sample, enforcing a nonzero difference between neighbours.
end
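To see why the cumsum line can never produce equal neighbours, here is a short sketch of that one step in isolation. The length 12 and the name steps are arbitrary choices for illustration.

```matlab
n = 4;
steps = [randi(n), randi(n-1, 1, 11)];  % first value in 1..n, then increments in 1..n-1
d = mod(cumsum(steps), n) + 1;          % running sum wrapped back into 1..n

% Each increment is between 1 and n-1, so it is never a multiple of n,
% which means two neighbouring entries of d can never be equal.
no_repeats = all(diff(d) ~= 0);
in_range   = all(d >= 1 & d <= n);
```

This step alone does not enforce equal counts, which is why the while loop above keeps drawing until every value appears k times.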
A subtle but important distinction: does the author need an equal probability of picking any feasible sequence?
A number of people have mentioned answers of the form, "Let's use randperm and then rearrange the sequence so that it's feasible." That may not work. What will make this problem quite hard is if the author needs an equal chance of choosing any feasible sequence. Let me give an example to show the problem.
Imagine the set of numbers [1 2 2 3 4]. First, let's enumerate the set of feasible sequences:
6 sequences beginning with 1: [1 2 3 2 4], [1 2 3 4 2], [1 2 4 2 3], [1 2 4 3 2], [1 3 2 4 2], [1 4 2 3 2].
Then there are 6 sequences beginning with [2 1]: [2 1 2 3 4], [2 1 2 4 3], [2 1 3 2 4], [2 1 3 4 2], [2 1 4 2 3], [2 1 4 3 2]. By symmetry, there are 18 sequences beginning with 2 (i.e. 6 of [2 1], 6 of [2 3], 6 of [2 4]).
By symmetry there are 6 sequences beginning with 3 and another 6 starting with 4.
Hence there are 6 * 3 + 18 = 36 possible sequences.
Sampling uniformly from feasible sequences, the probability that the first number is 2 is 18/36 = 50 percent! BUT if you just went with a random permutation, the probability that the first digit is 2 would be 40 percent (i.e. 2 of the 5 numbers in the set are 2).
If equal probability of any feasible sequence is required, you want a 50 percent chance of a 2 as the first number, but naive use of randperm followed by rejiggering the numbers at 2:end to make the sequence feasible would give you a 40 percent probability of the first digit being 2.
Note that rejection sampling would get the probabilities right as every feasible sequence would have an equal probability of being accepted. (Of course rejection sampling becomes very slow as probability of being accepted goes towards 0.)
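The counts in this example are small enough to check by brute force. A sketch, assuming it is acceptable to enumerate all orderings with perms (the names P, F, n_feasible and p_first_is_2 are illustrative):

```matlab
s = [1 2 2 3 4];
P = unique(perms(s), 'rows');          % all distinct orderings of the multiset
F = P(all(diff(P, 1, 2) ~= 0, 2), :);  % keep orderings with no equal neighbours

n_feasible   = size(F, 1);             % number of feasible sequences
p_first_is_2 = mean(F(:, 1) == 2);     % share of feasible sequences starting with 2
p_naive      = mean(s == 2);           % chance that a plain random permutation starts with 2
```

This only scales to very short vectors, because perms grows factorially with the length.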
Following some of the discussion on here, I think that there is a trade-off between performance and the theoretical requirements of the application.
If a completely uniform draw from the set of all valid permutations is required, then a pure rejection sampling method will probably be needed. The problem with this, of course, is that as the size of the problem increases, the rejection rate becomes very high. To demonstrate this, if we take the base example in the question to be n multiples of [1 2 3 4], then we can see the number of samples rejected for each valid draw as follows (note the log y axis):
My alternative method is to randomly shuffle the array and then, whenever a consecutive duplicate is detected, randomly shuffle the remaining elements again from that position onwards:
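cond_order = repmat(1:4,10,1); %make matrix
cond_order = reshape(cond_order, numel(cond_order), 1); %make sequence
cond_order = cond_order(randperm(numel(cond_order))); %initial shuffle
i = 2;
while i <= numel(cond_order) %<= so that the last pair is checked too
    if cond_order(i) ~= cond_order(i - 1)
        i = i + 1;
    elseif all(cond_order(i:end) == cond_order(i - 1))
        %the tail cannot be repaired, so start again from a fresh shuffle
        cond_order = cond_order(randperm(numel(cond_order)));
        i = 2;
    else
        tmp = cond_order(i:end);
        cond_order(i:end) = tmp(randperm(numel(tmp))); %reshuffle the tail
    end
end
cond_order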
Note that there is no guarantee that a given pass will converge, but in the case where it becomes clear that it will not converge, we can just start again and it will still be better than re-computing the whole sequence every time.
This definitely meets the last two requirements of the question:
B) there are no consecutively repeating values
C) all 4 values appear an equal number of times
The question is whether it meets the first 'Random' requirement.
If we take the simplest version of the problem, with the input of [1 2 3 4 1 2 3 4] then there are 864 valid permutations (empirically determined!). If we run both methods over 100,000 runs, then we would expect a Gaussian distribution around 115.7 draws per permutation.
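The 864 figure and the expected number of draws per permutation can be reproduced by brute force. This is only a sketch for this small case (perms of 8 elements has 40320 rows), and the variable names are illustrative:

```matlab
s = [1 2 3 4 1 2 3 4];
P = unique(perms(s), 'rows');          % distinct orderings of the multiset
F = P(all(diff(P, 1, 2) ~= 0, 2), :);  % keep orderings with no equal neighbours

n_valid  = size(F, 1);                 % number of valid permutations
expected = 100000 / n_valid;           % mean draws per permutation if sampling is uniform
```

A histogram of how often each row of F is drawn by a given method can then be compared against this expected value.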
As expected, the pure rejection sampling method gives this:
However, my algorithm does not:
There is clearly a bias towards certain samples.
In the end, it depends on the requirements. Both methods sample over the whole distribution so both fill the core requirements of the problem. I have not included performance comparisons, but for anything other than the simplest of cases, I am confident that my algorithm would be much faster. However, the distribution of the draws is not perfectly uniform. Whether it is good enough is dependent on the application and the size of the actual problem.
I have data that is output from a computational chemistry program (Gaussian09) which contains sets of Force Constant data. The data is arranged with indices as the first 2-4 columns (quadratic, cubic and quartic FC's are calculated). As an example the cubic FC's look something like this, and MATLAB has read them in successfully so I have the correct matrix:
cube=[
1 1 1 5 5 5
1 1 2 6 6 6
.
.
4 1 1 8 8 8
4 2 1 9 9 9
4 3 1 7 7 7 ]
I need a way to access the last 3 columns when feeding in the indices of the first 3 columns. Something along the lines of
>>index=find(cube(:,1)==4 & cube(:,2)==3 & cube(:,3)==1);
which would give me the row number of the data with index [ 4 3 1 ] and allow me to read out the values [7 7 7], which I need within loops to calculate anharmonic frequencies.
Is there a way to do this without a bunch of loops?
Thanks in advance,
Ben
You have already found one way to solve this, by using & in your expression (allowing you to make non-scalar comparisons).
Another way is to use ismember:
index = find(ismember(cube(:,1:3),[4 3 1],'rows'));
Note that in many cases, you may not even need the call to find: the logical vector returned by the comparisons or by ismember can be used directly to index into another array.
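As a sketch of the logical-indexing variant, using only the rows shown in the question (the elided rows are left out, and the names mask, mask_and and vals are illustrative):

```matlab
cube = [1 1 1 5 5 5
        1 1 2 6 6 6
        4 1 1 8 8 8
        4 2 1 9 9 9
        4 3 1 7 7 7];

% 'rows' makes ismember compare whole rows instead of individual elements
mask = ismember(cube(:, 1:3), [4 3 1], 'rows');
vals = cube(mask, 4:6);                 % values stored under index [4 3 1]

% the same mask built from element-wise comparisons joined with &
mask_and = cube(:, 1) == 4 & cube(:, 2) == 3 & cube(:, 3) == 1;
```

Without the 'rows' option, ismember tests each element separately and returns a matrix rather than one flag per row.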
I have a matrix of following form in matlab:
3 4
4 3
5 6
6 5
I would like rows 1 and 2 to be considered duplicates, since they contain the same elements, just not in the same order. Similarly, rows 3 and 4 should be considered the same. So, given the matrix above, I would like to have the following as the result:
3 4
5 6
I have tried the unique function but it cannot help me for this purpose.
My actual matrix is quite large, and I don't want to solve the problem with an exhaustive pairwise search, since it is extremely time consuming.
Is there an elegant way of achieving my goal?
This is one way of doing this:
X = [3 4
4 3
5 6
6 5];
X = sort(X, 2);
UniqueRows = unique(X, 'rows');
UniqueRows =
3 4
5 6
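If the rows should come back as they were originally written, and in their original order, one possible variation is to use the index output of unique. This is a sketch assuming a MATLAB version that supports the 'stable' option; the second matrix X2 is a made-up example to show the difference.

```matlab
X = [3 4; 4 3; 5 6; 6 5];
[~, ia] = unique(sort(X, 2), 'rows', 'stable');  % first occurrence of each unordered pair
Y = X(ia, :);                                    % rows taken from the unsorted matrix

X2 = [4 3; 3 4; 6 5; 5 6];                       % hypothetical input with reversed pairs first
[~, ia2] = unique(sort(X2, 2), 'rows', 'stable');
Y2 = X2(ia2, :);                                 % keeps [4 3] and [6 5] as written
```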
I think this question might already have been asked before, but I could not find a proper answer in this forum.
Actually, I have 2 vectors (of unequal length). I need to compare the 2 vectors. I can do it using a for loop, but it is taking a very long time.
Is there any obvious method which I may be missing?
Here is a small code snippet:
a=[ 1 2 3 4 5 6 7 8 1 2 3 4];
b=[ 2 3 4];
How can we compare a and b? Basically, I need the indices in vector a at which b occurs.
Thanks
You can use strfind() for this (it works with doubles):
idx = strfind(a, b);
idx will contain the indices of all matches.
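If you would rather not rely on strfind accepting numeric input, a sliding-window comparison is one alternative. This is only a sketch, and it builds a (numel(a)-numel(b)+1)-by-numel(b) index matrix, so it suits short patterns; the names win and idx2 are illustrative.

```matlab
a = [1 2 3 4 5 6 7 8 1 2 3 4];
b = [2 3 4];

n   = numel(b);
win = bsxfun(@plus, (0:numel(a)-n).', 1:n);         % row r holds the indices r : r+n-1
idx2 = find(all(bsxfun(@eq, a(win), b), 2)).';      % start positions where the window equals b
```

For the vectors in the question, b starts at positions 2 and 10 of a.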