quicksort matlab, recursive function - matlab

I need to write a recursive function that implements the quicksort algorithm.
I'm having trouble updating the new boundaries. y is a matrix and m is the number of the row that needs to be sorted. Please help...
function [y]=quicksort(y,left,right,m)
i=left;
j=right;
num=randi(length(y)); % pick a random element in the array as pivot
pivot=y(m,num);
if i <= j %find the element fits criteria below before i overlaps j.
while y(m,i) < pivot
i = i + 1;
end
while y(m,j) > pivot
j = j - 1;
end
ytmp=y(:,j);
y(:,j)=y(:,i);
y(:,i)=ytmp;
i = i + 1;
j = j - 1;
%swap the positions of the two elements when y(m,j) < pivot < y(m,i)
else
return
end
return
[y]=quicksort(y,i,right,m); %update the boundaries.
[y]=quicksort(y,left,j,m); %recursively call the function.

You have made some errors. However, since this seems to be homework, I will give you some concrete hints and examples instead of posting the right answer directly. I also limit this to a single vector to keep the answer simpler and more concise.
First, you want to swap all elements which are on the wrong side of the pivot element, not only the first one. So the tip is to use a while loop. You will still need to do the swapping somewhere, so you will need an if-statement as well. Secondly, the last return will always be executed. This means that you will never enter the recursion. Instead, try a condition where you only continue recursing when the number of elements left exceeds one.
if (i < right)
y=qsort(y,i,right); %update the boundaries.
end
if (left < j)
y=qsort(y,left,j); %recursively call the function.
end
Hope this information is enough. Good luck!
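For readers who want something concrete to compare their own attempt against, the two hints can be combined roughly as in the sketch below. It is only one possible arrangement for a single vector: the function name qsort_sketch and the choice of the middle element as pivot are assumptions made for illustration, not part of the original answer. Note that the pivot is taken from inside the current range left:right. Picking it from the whole array, as the question's randi(length(y)) does, can let the inner while loops run past the boundaries.

```matlab
function y = qsort_sketch(y, left, right)
% Sort the elements y(left:right) of a vector in ascending order.
% Hypothetical helper for illustration; save as qsort_sketch.m.
i = left;
j = right;
pivot = y(floor((left + right) / 2));   % pivot chosen inside the current range
while i <= j                            % keep going until the indices cross
    while y(i) < pivot
        i = i + 1;
    end
    while y(j) > pivot
        j = j - 1;
    end
    if i <= j                           % swap only if the indices have not crossed
        tmp = y(i);
        y(i) = y(j);
        y(j) = tmp;
        i = i + 1;
        j = j - 1;
    end
end
if left < j                             % recurse only if more than one element remains
    y = qsort_sketch(y, left, j);
end
if i < right
    y = qsort_sketch(y, i, right);
end
end
```

Calling qsort_sketch([3 1 2 5 4], 1, 5) returns [1 2 3 4 5].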

Related

Find sum distance to horizontal line for all points in Matlab

I have a scatter plot of approximately 30,000 pts, all of which lie above a horizontal line which I've visually defined in my plot. My goal now is to sum the vertical distance of all of these points to this horizontal line.
The data was read in from a .csv file and is already saved to the workspace, but I also need to check whether a value is NaN, and ignore these.
This is where I'm at right now:
vert_deviation = 0;
idx = 1;
while idx <= numel(my_data(:,5)) && isnan(idx) == 0
vert_deviation = vert_deviation + ((my_data(idx,5) - horiz_line_y_val));
idx = idx + 1;
end
I believe a prerequisite of using the && operator is having two logical statements, but I'm not sure how to rewrite this loop that way at the moment. I also don't understand why vert_deviation returns NaN at the moment, but I assume this might have to do with the first mistake I described...
I would really appreciate some guidance here - thank you in advance!
EDIT: The 'horizontal line' is a slight oversimplification - in reality the lower limit I need to find the distance to consists of 6 different line segments
I should have specified that the lower limit to which I need to calculate the distance for all scatterplot points varies for different x values (the horizontal line snippet was meant to be a simplification but may have been misleading... apologies for that)
I first modified the data I had already read into the workspace by replacing all NaN values with 0. Next, I wrote a while loop which defines the number of indices to loop through, and defined an && condition to filter out any zeroes. I then wrote a nested if statement which checks what range of x values the given index falls into, and subsequently takes the delta between the given point and the y value of the linear lower limit for that section of the plot. I repeated this for all points.
while idx <= numel(my_data(:,3)) && not(my_data(idx,3) == 0)
...
if my_data(idx,3) < upper_x_lim && my_data(idx,5) > lower_x_lim
vert_deviation = vert_deviation + (my_data(idx,4) - (m6 * (my_data(idx,5)) + b6))
end
...
m6 and b6 in this case are the slope and y intercept calculated for one section of the plot. The if block is repeated six times, once for each section of the lower limit.
I'm sure there are more elegant ways to do this, so I'm open to any feedback if there's room for improvement!
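One possible way to avoid six separate if blocks, assuming the lower limit is a continuous piecewise-linear curve, is to store its breakpoints and let interp1 evaluate the limit at every x value at once. The sketch below uses made-up breakpoints (xb, yb) and made-up sample points (x, y); all four names are hypothetical and stand in for the relevant columns of my_data. It also assumes every valid x lies within the range covered by the breakpoints.

```matlab
% Hypothetical breakpoints of the piecewise-linear lower limit.
xb = [0 1 2 3];
yb = [0 1 1 3];

% Made-up sample points; NaN marks missing values.
x = [0.5; 1.5; NaN; 2.5];
y = [2;   3;   4;   NaN];

valid = ~isnan(x) & ~isnan(y);          % keep rows where both coordinates exist
limit_y = interp1(xb, yb, x(valid));    % lower limit evaluated at each valid x
vert_deviation = sum(y(valid) - limit_y);
disp(vert_deviation)
```

Here only the first two points are valid. The limit at x = 0.5 is 0.5 and at x = 1.5 is 1, so the sum is 1.5 + 2 = 3.5.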
Your loop doesn't exclude NaN values because isnan(idx) == 0 checks whether the index is NaN, rather than whether the data point is NaN. Every row is therefore added, and a single NaN term makes the whole sum NaN. Instead, check isnan(my_data(idx,5)).
Also, you can simplify your code using for instead of while:
vert_deviation = 0;
for idx=1:size(my_data,1)
if ~isnan(my_data(idx,5))
vert_deviation = vert_deviation + ((my_data(idx,5) - horiz_line_y_val));
end
end
As @Adriaan suggested, you can remove the loop altogether, but it seems that the code in the OP is an oversimplification of the problem. Looking at the additional code posted, I guess it is still possible to remove the loops, but I'm not certain it will be a significant speed improvement. Just use a loop.
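For the simplified single-line case, a loop-free variant could look like the sketch below. The sample values in my_data and horiz_line_y_val are made up for illustration; only column 5 is used, as in the question.

```matlab
% Made-up sample data: only column 5 matters here, as in the question.
my_data = zeros(4, 5);
my_data(:, 5) = [3; NaN; 5; 4];
horiz_line_y_val = 1;                 % assumed height of the horizontal line

vals = my_data(:, 5);
valid = ~isnan(vals);                 % logical mask of usable rows
vert_deviation = sum(vals(valid) - horiz_line_y_val);
disp(vert_deviation)
```

With these values the three valid rows contribute 2 + 4 + 3, so vert_deviation is 9.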

Basic structure of a for loop

I am trying to write a MATLAB function that accepts a non-integer, n, and then returns its factorial, n!. I am supposed to use a for loop. I tried with
"for n >= 0"
but this did not work. Is there a way I can fix this?
I wrote the code below, but it doesn't give me the correct answer.
function fact = fac(n);
for fact = n
if n >=0
factorial(n)
disp(n)
elseif n < 0
disp('Cannot take negative integers')
break
end
end
Any kind of help will be highly appreciated.
You need to read the docs and I would highly recommend doing a basic tutorial. The docs state
for index = values
statements
end
So your first idea of for n >= 0 is completely wrong because a for doesn't allow for the >. That would be the way you would write a while loop.
Your next idea of for fact = n does fit the pattern of for index = values, however, your values is a single number, n, and so this loop will only have one single iteration which is obviously not what you want.
If you wanted to loop from 1 to n you need to create a vector, (i.e. the values from the docs) that contains all the numbers from 1 to n. In MATLAB you can do this easily like this: values = 1:n. Now you can call for fact = values and you will iterate all the way from 1 to n. However, it is very strange practice to use this intermediary variable values, I was just using it to illustrate what the docs are talking about. The correct standard syntax is
for fact = 1:n
Now, for a factorial (although technically you'll get the same thing), it is clearer to actually loop from n down to 1. So we can do that by declaring a step size of -1:
for fact = n:-1:1
So now we can find the factorial like so:
function output = fac(n)
output = n;
for iter = n-1:-1:2 %// note there is really no need to go to 1 since multiplying by 1 doesn't change the value. Also start at n-1 since we initialized output to be n already
output = output*iter;
end
end
Calling the builtin factorial function inside your own function really defeats the purpose of this exercise. Lastly, I see that you have added a little error check to make sure you don't get negative numbers. That is good, but the check should not be inside the loop!
function output = fac(n)
if n < 0
error('Input n must be non-negative'); %// I use error rather than disp here as it gives clearer feedback to the user
elseif n == 0
output = 1; %// by definition
else
output = n;
for iter = n-1:-1:2
output = output*iter;
end
end
end
I don't get what you are trying to do with "for". I think what you want to do is:
function fact = fac(n);
if n >= 0
n = floor(n);
fact = factorial(n);
disp(fact)
elseif n < 0
disp('Cannot take negative integers')
return
end
end
Depending on your preferences, you can replace floor (round towards minus infinity) by round (round towards nearest integer) or ceil (round towards plus infinity). Some rounding operation is necessary to ensure n is an integer.
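As a quick illustration of how the three rounding functions differ, here is a small sketch with made-up inputs:

```matlab
% Compare the three rounding functions on a positive and a negative input.
n = 2.5;
pos = [floor(n), round(n), ceil(n)];      % [2 3 3]
neg = [floor(-n), round(-n), ceil(-n)];   % [-3 -3 -2]
disp(pos)
disp(neg)
```

Note that round moves halves away from zero, so round(2.5) is 3 and round(-2.5) is -3.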

Nearest column in matlab

I want to find the nearest column of a matrix with a vector.
Consider the matrix D and the vector y. I want to speed up this function:
function BOOLEAN = IsExsist(D,y)
[~, Ysize, ~] = size(D);
BOOLEAN = 0;
MIN = 1.5;
for i=1:Ysize
if(BOOLEAN == 1)
break;
end;
if(norm(y - D(:,i),1) < MIN )
BOOLEAN = 1;
end;
end;
end
I am assuming you are looking to speed up this procedure. For that, try this -
[~,nearest_column_number] = min(sum(abs(bsxfun(@minus,D,y))))
The above code uses 1-Norm (as used by you) along all the columns of D w.r.t. y. nearest_column_number is your desired output.
If you are interested in using a threshold MIN for getting the first nearest column number, you may use the following code -
normvals = sum(abs(bsxfun(@minus,D,y)))
nearest_column_number = find(normvals<MIN,1)
BOOLEAN = ~isempty(nearest_column_number)
nearest_column_number and BOOLEAN are the outputs you might be interested in.
If you are looking to make a function out of it, just wrap in the above code into the function format you were using, as you already have the desired output from the code.
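Wrapped into a function, the threshold variant might look like the following sketch. The function name nearest_within and the argument name tol are hypothetical; the explicit dimension argument in sum(..., 1) is an added safeguard so that a D with a single row is still summed column by column.

```matlab
function [found, idx] = nearest_within(D, y, tol)
% Return whether any column of D lies within 1-norm distance tol of the
% column vector y, and the index of the first such column (empty if none).
% Hypothetical helper for illustration; save as nearest_within.m.
normvals = sum(abs(bsxfun(@minus, D, y)), 1);   % 1-norm distance per column
idx = find(normvals < tol, 1);                  % first column under the threshold
found = ~isempty(idx);
end
```

For example, with D = [0 1 5; 0 1 5] and y = [1; 1], the distances are [2 0 8], so nearest_within(D, y, 1.5) returns found = true and idx = 2.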
Edit 1: If you are using this procedure for a case with large D matrices with sizes like 9x88800, use this -
normvals = sum(abs(bsxfun(@minus,D,y)));
BOOLEAN = false;
for k = 1:numel(normvals)
if normvals(k) < MIN
BOOLEAN = true;
break;
end
end
Edit 2: It appears that you are calling this procedure/function many times, which is the bottleneck here. So, my suggestion at this point would be to look into your calling function and see if you could reduce the number of calls; otherwise use your own code or try this slightly modified version of it -
BOOLEAN = false;
for k = 1:size(D,2) %// loop over the columns of D
if norm(y - D(:,k),1) < MIN %// You can try replacing this with "if sum(abs(y - D(:,k))) < MIN" to see if it gives any performance improvement
BOOLEAN = true;
break;
end
end
To find the nearest column of a matrix D to a column vector y, with respect to 1-norm distance, you can use pdist2:
[~, index] = min(pdist2(y.',D.','minkowski',1));
What you are currently trying to do is optimize your Matlab implementation of linear search.
No matter how much you optimize that it will always need to calculate all D=88800 distances over all d=9 dimensions for each search.
Now that's easy to implement in Matlab as discussed in the other answer, but if you are planning to do many such searches, I would recommend to use a different data-structure and search-algorithm instead.
A good candidate would be (binary) space partitioning, which recursively splits your space into two parts along your dimensions. This adds quite some initial overhead to create the tree and makes insert and remove operations a bit more expensive. But as I understand your comments, searches are much more frequent, and their complexity will drop from O(D) to O(log(D)), which is a tremendous improvement for this problem size.
I think that there should be some usable Matlab-implementations of BSP around, e.g. on Mathworks file-exchange.
But if you don't manage to find one, I'd be happy to provide some pointers as well.
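One readily available option, if the Statistics and Machine Learning Toolbox is installed, is its k-d tree searcher, which is a space-partitioning structure of the kind described above. The sketch below is only an illustration under that assumption: the small D and y are made up, and the logarithmic search cost is the typical case for low-dimensional data rather than a guarantee.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
D = [0 1 5; 0 1 5];                  % made-up 2x3 example, columns are points
y = [1.2; 0.9];                      % made-up query vector

% Build the tree once; KDTreeSearcher expects one point per row.
tree = KDTreeSearcher(D.', 'Distance', 'cityblock');

% Reuse the tree for every query: nearest column and its 1-norm distance.
[idx, dist] = knnsearch(tree, y.');
isClose = dist < 1.5;                % same threshold as MIN in the question
```

Here idx is 2 and dist is 0.3, so isClose is true. The tree only pays off when it is built once and queried many times.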

Order of convergence Newton

Hello I have written this to determine a root using Newton's method. The algorithm works. I also tried to implement an Experimental order of convergence EOC. It also works but I get the result that the order of convergence for Newton's method is 1 when in fact it is 2.
function [x,y,eoc,k]=newnew(f,df,x0,xe,eps,kmax)
x = x0;
y = feval(f,x);
for m=1:kmax
z = -y/feval(df,x);
x = x + z;
y = feval(f,x);
k = m;
for n=m
Ek=abs(x-xe);
end
for n=m+1
Ekp=abs(x-xe);
end
eoc=log(Ek)/log(Ekp);
if abs(y)<eps
return
end
end
disp('no convergence');
end
what is wrong?
When you say Ek=abs(x-xe) and Ekp=abs(x-xe), they are exactly the same thing! That's why eoc evaluates to 1 every time.
Notice that you have no n in those equations. In fact, you don't need those extra for n=m loops either. Inside the for m=1:kmax loop, m is a single value not an array.
eoc needs to be calculated by comparing the previous loop iteration to the current one (since it doesn't make much sense to compare to a future loop iteration which hasn't happened yet). Because this looks like homework, I won't give you any code.. but this is a very strong hint.
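For readers who are past the homework stage, the hint can be sketched as follows. This is one possible arrangement, not the asker's final code: the function name newton_eoc and the variables Eprev and Ecur are hypothetical, f and df are assumed to be function handles, and the estimate used is log(E_k)/log(E_(k-1)), which approaches the order of convergence as the errors shrink.

```matlab
function [x, eoc, k] = newton_eoc(f, df, x0, xe, tol, kmax)
% Newton's method with an experimental order of convergence estimate.
% Hypothetical helper for illustration; save as newton_eoc.m.
x = x0;
eoc = NaN;                            % no estimate until two errors are known
Eprev = abs(x - xe);                  % error of the previous iterate
for k = 1:kmax
    x = x - f(x) / df(x);             % Newton step
    Ecur = abs(x - xe);               % error of the current iterate
    if Ecur > 0 && Eprev > 0 && Eprev ~= 1
        eoc = log(Ecur) / log(Eprev); % compare current error to previous one
    end
    Eprev = Ecur;                     % remember the error for the next iteration
    if abs(f(x)) < tol
        return
    end
end
disp('no convergence');
end
```

For example, newton_eoc(@(x) x^2 - 2, @(x) 2*x, 1.5, sqrt(2), 1e-8, 20) stops after three iterations with an eoc close to 2. With a much tighter tolerance the last errors reach machine precision and the estimate becomes unreliable.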

find consecutive nonzero values

I am trying to write a simple MATLAB program that will find the first chain (more than 70) of consecutive nonzero values and return the starting value of that consecutive chain.
I am working with movement data from a joystick and there are a few thousand rows of data with a mix of zeros and nonzero values before the actual trial begins (coming from subjects slightly moving the joystick before the trial actually started).
I need to get rid of these rows before I can start analyzing the movement from the trials.
I am sure this is a relatively simple thing to do so I was hoping someone could offer insight.
Thank you in advance
EDIT: Here's what I tried:
s = zeros(size(x1));
for i=2:length(x1)
if(x1(i-1) ~= 0)
s(i) = 1 + s(i-1);
end
end
display(S);
for a vector x1 which has a max chain of 72, but I don't know how to find the max chain and return its first value, so I know where to trim. I also really don't think this is the best strategy, since the max chain in my data will be tens of thousands of values.
This answer is generic for any chain size. It finds the longest chain in a vector x1 and retrieves the first element of that chain val.
First we'll use bwlabel to label connected components, For example:
s=bwlabel(x1);
Then we can use tabulate to get a frequency table of s, and find the first element of the biggest connected component:
t=tabulate(s(s>0)); % skip label 0, which marks the zero entries of x1
[C,I]=max(t(:,2));
val=x1(find(s==t(I,1),1, 'first'));
This should work for the case you have one distinct maximal size chain. What happens for the case if you have more than one chain that has maximal lengths? (you can still use my code with slight modifications...)
You don't need to use an auxiliary vector to keep track of the index:
count = 0; % length of the current run of nonzero values
for i = 1:length(x)
if x(i) ~= 0
count = count + 1;
elseif count >= 70
lastIndex = i;
break;
else
count = 0;
end
if count == 70
index = i - 69;
end
end
To remove all of the elements in the chain from x, you can simply do:
x = x([lastIndex + 1:end]);
EDIT (based off comment):
Your version didn't work because you never reset the counter when you ran into a 0. That is what the:
else
count = 0;
is for; it resets the process, if you will.
For some more clarity, in your original code, this would be reflected by:
if x1(i-1) ~= 0
s(i) = 1 + s(i-1);
else
s(i) = 0;
end
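For the original goal of finding where the first chain of more than 70 nonzero values starts, a loop-free alternative is to locate the run boundaries with diff. The sketch below is one way to do that; the function name first_long_run and the parameter minLen are hypothetical.

```matlab
function startIdx = first_long_run(x, minLen)
% Index of the first element of the first run of more than minLen
% consecutive nonzero values in the vector x (empty if there is none).
% Hypothetical helper for illustration; save as first_long_run.m.
nz = (x(:).' ~= 0);                   % row vector: true where x is nonzero
d = diff([0 nz 0]);                   % +1 where a run starts, -1 just after it ends
runStart = find(d == 1);
runEnd = find(d == -1) - 1;
runLen = runEnd - runStart + 1;
k = find(runLen > minLen, 1);         % first run that is long enough
if isempty(k)
    startIdx = [];
else
    startIdx = runStart(k);
end
end
```

For example, first_long_run([0 1 1 0 2 3 4 5 0 1], 3) returns 5, the position where the run 2 3 4 5 begins. To drop everything before the trial, something like x1 = x1(first_long_run(x1, 70):end) would then keep the chain and all later samples.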