Random select and median of medians - select

I was planning to implement these algorithms in C++ and ran into a problem. What happens when I pass an array with repeated elements and ask for an order statistic? If I ask for the second order statistic of the array [1,2,1], should it return 1 or 2? If 2, is there a quick way to adapt these algorithms to that kind of array? Is moving the repeated elements to the end of the array a good way to do it?
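Under the usual definition, the k-th order statistic is the k-th element of the sorted array with duplicates counted, so for [1,2,1] the second order statistic is 1 (sorted: 1,1,2). The following is a minimal Python sketch of randomized select with a three-way partition, which handles repeated elements without moving them anywhere. The name `quickselect` and the 1-based `k` are choices made for this illustration, and the sketch copies data into new lists instead of partitioning in place as a C++ version typically would.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (1-based), counting duplicates."""
    items = list(values)  # work on a copy so the caller's data is untouched
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(values)")
    while True:
        pivot = random.choice(items)
        # Three-way partition: smaller than, equal to, larger than the pivot.
        less = [x for x in items if x < pivot]
        greater = [x for x in items if x > pivot]
        n_equal = len(items) - len(less) - len(greater)  # always >= 1
        if k <= len(less):
            items = less                 # answer lies among the smaller elements
        elif k <= len(less) + n_equal:
            return pivot                 # answer is the pivot value itself
        else:
            k -= len(less) + n_equal     # skip the smaller and equal elements
            items = greater
```

With this sketch, quickselect([1, 2, 1], 2) returns 1. If the k-th smallest distinct value is wanted instead, one option is to remove duplicates first, for example quickselect(set([1, 2, 1]), 2), which returns 2.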

Related

How to get a new solution with high probability from previously found, incomplete solutions with different probabilities?

I am working on an AI algorithm. When the program first runs, a random solution is generated, and in the first iteration 10 solution vectors are created from it. By analyzing these solutions we can assign each of them a probability (highest, second highest, third highest and so on) of leading to the optimal solution. For the second input of the program I want a vector (a possible solution) obtained from those 10 previously found vectors. But I need that vector to take all the previous solutions into account, each with a different impact depending on its probability ...
e.g. A=[4.7 ,5.6, 3.5,9 ] B=[-7.9 ,8 ,-2.8 ,4.6] C=[7 ,9.7 , 4,6,3.9] ......
I used the mean in my program:
NextPossibleSolution = mean(([A;B;C;]))
But do you think the mean is the right move? I don't think so, because all the solutions contribute equally to the next possible solution (the next input) regardless of their likelihood ... If there is a method, a formula or anything else, please let me know. I really need it badly. A billion thanks.
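One common way to let each previous solution contribute according to its likelihood is a weighted mean, where each vector is multiplied by its weight before averaging. The Python sketch below is only an illustration under that assumption: the function name `weighted_mean` and the example weights are hypothetical, and it assumes all vectors have the same length.

```python
def weighted_mean(vectors, weights):
    """Combine equal-length vectors element-wise, weighting each vector."""
    if len(vectors) != len(weights) or not vectors:
        raise ValueError("need one weight per vector")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all vectors must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Each component is sum(w_i * v_i) / sum(w_i).
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(length)
    ]


# Hypothetical example: the first solution is three times as likely as the second.
next_solution = weighted_mean([[1.0, 2.0], [3.0, 4.0]], [3, 1])
```

With equal weights this reduces to the plain mean used in the question.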

(matlab matrix operation), Is it possible to get a group of value from matrix without loop?

I'm currently implementing a gradient check function that needs to read certain elements from a result matrix. Could someone tell me how to get a group of values from the matrix?
To be specific, for a result matrix res of size M x N, I need the elements res(3,1), res(4,2), res(1,3), res(2,4)...
In my case, M is the dimension and N is the batch size, and there is a label array of size 1 x batch_size, [3 4 1 2...]. So the desired values are res(label(j), j) for j = 1..batch_size. I'm trying to practice vectorized programming, so I would rather not use a loop. Could someone tell me how to get this group of values without iteration?
Cheers.
--------------------------UPDATE----------------------------------------------
The only idea I found is to first build a 'mask matrix' and then multiply it element-wise with the original result matrix (technically called the 'Hadamard product', see the wiki). After that, just extract the non-zero elements and sum them. The code in Matlab should look like:
temp=Mask.*res;
desired_res=temp(temp~=0); %Note: temp(temp~=0) extracts the non-zero elements in 'column' fashion: it searches the temp matrix column by column and puts the non-zero numbers into the container 'desired_res'.
In my case, what I want to do next is simply sum(desired_res), so I don't need to consider the order of the non-zero elements in 'desired_res'.
Based on this idea, creating the mask matrix is the key step. There are two methods to do this.
The code is shown below. The first method uses the accumarray function to put a 1 at each location stored in the matrix 'subs' and 0 everywhere else, which gives a mask matrix of size [row column]. The usage of full(sparse()) is similar. I compared the two methods (repeating around 10 times): full(sparse()) turned out to be faster, with both taking on the order of 10^-4 seconds. The difference is small, but it matters in large-scale experiments. One convenience of accumarray is that the matrix size is passed explicitly. The three-argument call full(sparse(i,j,1)) used below instead creates a matrix of size [max(i), max(j)]; sparse also accepts explicit dimensions as sparse(i,j,v,m,n). The inferred size is sufficient for my case, and I only know a few usages of these functions. If you find out more, please share. Thanks.
Detailed descriptions of these functions can be found on Matlab's official website: accumarray, full and sparse.
% assume we have a label vector
test_labels=ones(10000,1);
% method one, accumarray(subs,1,[row column])
tic
subs=zeros(10000,2);
subs(:,1)=test_labels;
subs(:,2)=1:10000;
k1=accumarray(subs,1,[10, 10000]);
t1=toc % to compare with method two to check which one is faster
% method two: full(sparse(i,j,1))
tic
k2=full(sparse(test_labels,1:10000,1));
t2=toc
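An alternative to building a mask is to convert each (row, column) pair into a single linear index into the column-major storage of the matrix, which is the idea behind Matlab's sub2ind. The Python sketch below only illustrates that index arithmetic on plain lists; the function name `pick_per_column` is hypothetical, and labels are taken to be 1-based as in the question.

```python
def pick_per_column(res, labels):
    """Return res(labels[j], j) for every column j, using linear indices.

    res is a list of M rows with N values each; labels holds one 1-based
    row number per column.
    """
    n_rows = len(res)
    n_cols = len(res[0])
    if len(labels) != n_cols:
        raise ValueError("need exactly one label per column")
    # Flatten column by column, the way Matlab stores a matrix in memory.
    flat = [res[r][c] for c in range(n_cols) for r in range(n_rows)]
    # 0-based linear index of element (label, column): (label - 1) + column * M.
    linear = [(label - 1) + c * n_rows for c, label in enumerate(labels)]
    return [flat[i] for i in linear]


res = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
    [41, 42, 43, 44],
]
picked = pick_per_column(res, [3, 4, 1, 2])  # [31, 42, 13, 24]
```

Unlike the mask approach, this keeps elements that happen to be zero, since nothing is filtered by value.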

Scilab: How to find first 1 in each column without loops

I'm trying to find the first 1 in each column of a matrix without using a for or a while. Say I have
-->A
A =
1. 0. 0. 0.
0. 0. 1. 1.
1. 0. 1. 1.
1. 1. 0. 0.
then I would like to obtain [1,4,2,2] (I can assume there is always a 1 somewhere in each column). The thing is when I use find(A), it gives me [1,3,4,8,10,11,14,15].
I was told not to use loops but matrix operations, because Scilab handles the latter better.
Thank you in advance!
With such a small matrix, a for loop could be fast enough and is probably more readable. But one solution that avoids for loops could be the following.
//Find all rows and cols of the ones in A
[row,col] = find(A);
//Get all row positions that are in a new column index
disp( row( find([1,diff(col)]) ));
I think a more readable solution would be something like the following:
//For each column
for col=1:4
//Find only the first occurrence
disp(find(A(:,col),1));
end
As mentioned, with such a small matrix readability should be a higher priority. You could measure the performance of both (or other) solutions by profiling.
If you would like to read more about some performance-enhancing techniques, have a look here.
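The loop-free answer works because find lists the non-zero entries column by column, so the first entry of each column is exactly where the column index changes. The Python sketch below mimics that reasoning step by step; it is an illustration of the logic rather than a vectorized implementation, and the function name `first_one_per_column` is hypothetical.

```python
def first_one_per_column(matrix):
    """Return the 1-based row of the first non-zero entry in each column.

    Assumes, as in the question, that every column contains at least one 1.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Emulate [row, col] = find(A): non-zero positions in column-major order.
    rows, cols = [], []
    for c in range(n_cols):
        for r in range(n_rows):
            if matrix[r][c]:
                rows.append(r + 1)
                cols.append(c + 1)
    # Emulate row(find([1, diff(col)])): keep entries where the column changes.
    return [
        rows[i]
        for i in range(len(rows))
        if i == 0 or cols[i] != cols[i - 1]
    ]


A = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
]
first_rows = first_one_per_column(A)  # [1, 4, 2, 2]
```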

Find the m-th smallest number in Matlab? [duplicate]

This question already has answers here:
How to find the index of the n smallest elements in a vector
(2 answers)
Closed 9 years ago.
Is there an efficient way to find the m-th smallest number in a vector of length n in Matlab? Do I have to use sort() function? Thanks and regards!
You don't need to sort the list of numbers to find the m-th smallest number. The m-th smallest number can be found in linear time, i.e. if there are n elements in your array you can get a solution in O(n) time by using a selection algorithm with the median of medians algorithm.
The link is to the Wikipedia article,
http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_Median_of_Medians_algorithm
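As a rough illustration of the approach described in that article, here is a minimal Python sketch of selection with a median-of-medians pivot. The function name `median_of_medians_select` and the 1-based `k` are choices made for this example, and the sketch favors clarity over speed: it builds new lists rather than partitioning in place.

```python
def median_of_medians_select(values, k):
    """Return the k-th smallest element (1-based), counting duplicates."""
    items = list(values)
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(values)")
    while len(items) > 5:
        # Sort each group of at most five elements and take its median.
        groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
        medians = [g[(len(g) - 1) // 2] for g in groups]
        # The pivot is the median of those medians, found recursively.
        pivot = median_of_medians_select(medians, (len(medians) + 1) // 2)
        # Three-way partition so repeated values cannot cause an endless loop.
        less = [x for x in items if x < pivot]
        greater = [x for x in items if x > pivot]
        n_equal = len(items) - len(less) - len(greater)
        if k <= len(less):
            items = less
        elif k <= len(less) + n_equal:
            return pivot
        else:
            k -= len(less) + n_equal
            items = greater
    # Five or fewer elements left: sorting them is constant work.
    return sorted(items)[k - 1]
```

The pivot chosen this way is guaranteed to discard a constant fraction of the elements on every pass, which is what gives the worst-case O(n) bound.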
Edit 2: As Eitan pointed out, the first part of the answer doesn't address the question of finding the m-th smallest value; it finds the m-th element after the min value. The rest of the answer remains... +1 for Eitan's sharpness.
While sort is probably very efficient to begin with, you can try to see whether find will be better. For example:
id=find(X>min(X),m,'first');
id(end) % is the index of the m-th element of X (in array order) that is greater than min(X), not of the m-th smallest element
The function find has added functionality that lets you find the 'first' or 'last' elements that meet some criterion. For example, if you want to find the first n elements in array X less than a value y, use find(X<y,n,'first').
This operation stops as soon as n elements meeting the condition have been found, which can result in significant time savings if the array is large and the elements you find happen to be far from the end.
I'd also like to recap what @woodchips already said in this SO discussion, which is somewhat relevant to your question:
The best way to speed up basic built-in algorithms such as sort is to get faster hardware. It will speed everything else up too. MATLAB is already doing that in an efficient manner, using optimized code internally. That said, maybe a GPU add-on can improve this too...
Edit:
For what it's worth, adding to Muster's comment, there is a FEX file called nth_element that is a MEX wrap of C++ that will get a solution in O(n) time for what you need (similar to what @DDD pointed to).
As an alternative solution, you may do the following:
A = randi(100,4000,1);
A = sort(A,'ascend');
m = 5; % how many of the smallest numbers in array A to keep
B = A(1:m); % the m smallest numbers; B(end) is the m-th smallest
I hope this helps.

Choosing desired rows in matlab

I have data with 9 columns and 14470 rows.
The first column is filled with 0 or 1. Zero means that there is no measurement and the whole row is of no interest to me.... Can somebody help me write a loop that goes through all the lines and keeps the data only when the first column contains 1?
You do not need a loop for this. Remember that Matlab is a matrix-oriented programming language, so loops should be avoided. I won't give you the answer; I think you can figure it out yourself, it's easy. This tutorial will help.
Have fun.
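For readers who want to see the idea the answer hints at, the filtering amounts to building a true/false mask from the first column and keeping only the rows where it is true. The Python sketch below shows that logic on a small made-up table; the function name `rows_with_measurement` and the sample values are hypothetical.

```python
def rows_with_measurement(data):
    """Keep only the rows whose first column equals 1."""
    # One true/false flag per row, computed from the first column.
    mask = [row[0] == 1 for row in data]
    return [row for row, keep in zip(data, mask) if keep]


# Made-up sample with 3 columns instead of 9, for brevity.
sample = [
    [1, 0.5, 2.0],
    [0, 9.9, 9.9],
    [1, 0.7, 2.4],
]
filtered = rows_with_measurement(sample)
```

In Matlab the same mask can be used directly as a row index, which is why no explicit loop is required there.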