Sampling according to difference in function value - matlab

I have 20 values x1,...x20. Each value is between 0 and 1, for example 0.22,0.23,0.25,...
x = rand(20,1);
x = sort(x);
Now I would like to choose one data point but not uniform at random. The data point with the lowest value should have the highest probability and the other values should have a probability proportional to the difference in function value to the lowest value.
For example, if the lowest function value is 0.22, a data point with a function value of 0.23 has a difference to the best value of 0.23 - 0.22 = 0.01 and should therefore have a probability similar to the 0.22 value. But a value of 0.3 has a difference of 0.3 - 0.22 = 0.08 and should therefore have a much smaller probability.
How can this be done?

I would leave this as a comment, but I unfortunately don't have the rep yet.
This looks interesting, and I have a few questions for you. (I will edit this answer to be an answer later.)
The data point with the lowest value should have the highest probability and the other values should have a probability proportional to the difference in function value to the lowest value.
Lets take an array of 20 items, and subtract the lowest number from the entire array. This leaves us with our smallest value (which you want to be the most probable) as 0. We need to define a function now, that goes over all of the points and integrates to 1.
I've done the following:
x = rand(20, 1);
x = sort(x);
xx = x - x(1);
I suppose at this point we can invert our answers so the lowest point is 1.
Px = 1 - xx; %For probabilities
TotalP = sum(Px);
Now we have everything we need, I think... So lets see what we can make.
P = Px/TotalP; %This will be our probability.
SanityCheck = sum(P); %Make sure that it sums up to 1.
Looks like that works, so lets make our cumulative sum array, and get an element.
PI = cumsum(P); %This will be the integral form of the probability function.
test = rand; %Create a test number so we can place it in the integral function
index = find(PI > test, 1); %This will return the first entry that is greater than our test value...
result = x(index); %And here's our value
I hope this is along what you were looking for. If not, please comment and I'll get back to you. :)
[edited to incorporate comments]

Related

how to define an array with fractional index number

Like suppose that I need to create a function named pressure denoted by p (a 2-D matrix) which depends on 2 variables r and z.
u, v, w are linear matrices which also depend on 2 variables r and z.
r and z are linear matrix defined below take i={1,2,3,4,5,6,7,8,9,10}
r(i)=i/10
z(i)=i/10
u(i) = 2*r(i) + 3*z(i)
v(i) = 8*r(i) + 4*z(i)
w(i) = 3*r(i) + 2*z(i)
p = p(r,z) %, which is given as,
p(r(i),z(j)) = 2*v(i) - 4*u(i) + w(j)
Now suppose the value of p at a given location (r,z) say (0.4,0.8) is needed, I want that if I give the input p(0.4,0.8), I get the result.
In your case the easiest way is to convert the fractional numbers to integers by multiplying by 10.
This way the location (r,z) = (0.4, 0.8) will become (4,8).
If you don't want to remember every time to provide the locations multiplied by 10, just create a function that will do it for you, so you can call the function with the fractional location.
If your matrices are linear, you will always find a multiplying factor to get rid of the fractional coordinates.
Not entirely sure what you mean here, but if your matrix is only defined in the indices you give (i.e. you only want to draw values from the fixed set of indices you defined), then this should do it:
% the query indices
r_i = 0.4;
z_i = 0.8;
value = p(r_i*10,z_i*10);
if you want to look at values between the ones you defined, you need to look at interpolation:
% the query indices
r_i = 0.46;
z_i = 0.84;
value = interp2(r,z, p, r_i, z_i);
(I may have gotten r and z in that last function in the wrong order, try it out).

find intersection between vectors holding the peaks of two signals

I have two signals and I calculated the local peaks of each signal and saved them in two different vectors for amplitude and another two for timing. I need to get the intersection between peaks. Each peak has a value and a time so I am trying to extract the peaks which has nearly the same amplitude at nearly the same time .. Any help??
My Code:
[svalue1,stime1] = findpeaks(O1);
[svalue2,stime2] = findpeaks(O2);
%note that the peaks count is different in each signal
% This is my try but it is not working
x = length(intersect(ceil(svalue1),ceil(svalue2)))/min(length(svalue1),length(svalue2));
It is my understanding what you want to determine those values in svalue1 and svalue2 that are similar to each other, and what's more important is that they are unequal in length.
What you can do is compare every value in svalue1 with every value in svalue2 and if the difference between a value in svalue1 and a value in svalue2 is less than a certain amount, then we would classify these two elements to be the same.
This can be achieved by bsxfun with the #minus function and eliminating any sign changes with abs. After, we can determine the locations where the values are below a certain amount.
Something like this:
tol = 0.5; %// Adjust if necessary
A = abs(bsxfun(#minus, svalue1(:), svalue2(:).')) <= tol;
[row,col] = find(A);
out = [row,col];
tol is the tolerance that we would use to define whether two values are close together. I chose this to be 0.5, but adjust this for your application. out is a 2D matrix that tells you which value in svalue1 was closest to svalue2. Rather than giving a verbose explanation, let's just show you an example of this working and we can explain along the way.
Let's try this on an example:
>> svalue1 = [0 0.1 1 2.2 3];
>> svalue2 = [0.1 0.2 2 3 4];
Running the above code, we get:
>> out
ans =
1 1
2 1
1 2
2 2
4 3
5 4
Now this makes sense. Each row tells you which value in svalue1 is close to svalue2. For example, the first row says that the first value in svalue1, or 0 is close to the second value in svalue2 or 0.1. The next row says that the second value of svalue1, or 0.2, is close to the first value of svalue2, or 0.
Obviously, this operation includes non-unique values. For example, the row with [1 2] and [2 1] are the same. I'm assuming this isn't a problem, so we'll leave that alone.
Now what I didn't cover is whether the peaks also happen within the same time value. This can be done by performing another bsxfun operation on the time vector values of stime1 and stime2 much like we did with svalue1 and svalue2, and performing a logical AND operation between the two matrices. Should the peaks be the same in both amplitude and time, then the result follows.... so something like this:
tol_amplitude = 5; %// Adjust if necessary
tol_time = 0.5;
A = abs(bsxfun(#minus, svalue1(:), svalue2(:).')) <= tol_amplitude;
Atime = abs(bsxfun(#minus, stime1(:), stime2(:).')) <= tol_time;
Afinal = A & Atime;
[row,col] = find(Afinal);
out = [row,col];
You'll notice that we have two thresholds for the time and the amplitudes. Adjust both if necessary. out will contain the results like we saw earlier, but these will give you those indices that are close in both time and amplitude. If you want to see what those are, you can do something like this:
peaks = [svalue1(out(:,1)) svalue2(out(:,2))];
times = [stime1(out(:,1)) stime2(out(:,2))];
peaks and times will give you what the corresponding peaks and times were that would be considered as "close" between the two signals. The first column denotes the peaks and times for the first signal and the second column is for the peaks and times for the second signal. The difference between columns should be less than their prescribed thresholds.

determine the frequency of a number if a simulation

I have the following function:
I have to generate 2000 random numbers from this function and then make a histogram.
then I have to determine how many of them is greater that 2 with P(X>2).
this is my function:
%function [ output_args ] = Weibullverdeling( X )
%UNTITLED Summary of this function goes here
% Detailed explanation goes here
for i=1:2000
% x= rand*1000;
%x=ceil(x);
x=i;
Y(i) = 3*(log(x))^(6/5);
X(i)=x;
end
plot(X,Y)
and it gives me the following image:
how can I possibly make it to tell me how many values Do i Have more than 2?
Very simple:
>> Y_greater_than_2 = Y(Y>2);
>> size(Y_greater_than_2)
ans =
1 1998
So that's 1998 values out of 2000 that are greater than 2.
EDIT
If you want to find the values between two other values, say between 1 and 4, you need to do something like:
>> Y_between = Y(Y>=1 & Y<=4);
>> size(Y_between)
ans =
1 2
This is what I think:
for i=1:2000
x=rand(1);
Y(i) = 3*(log(x))^(6/5);
X(i)=x;
end
plot(X,Y)
U is a uniform random variable from which you can get the X. So you need to use rand function in MATLAB.
After which you implement:
size(Y(Y>2),2);
You can implement the code directly (here k is your root, n is number of data points, y is the highest number of distribution, x is smallest number of distribution and lambda the lambda in your equation):
X=(log(x+rand(1,n).*(y-x)).*lambda).^(1/k);
result=numel(X(X>2));
Lets split it and explain it detailed:
You want the k-th root of a number:
number.^(1/k)
you want the natural logarithmic of a number:
log(number)
you want to multiply sth.:
numberA.*numberB
you want to get lets say 1000 random numbers between x and y:
(x+rand(1,1000).*(y-x))
you want to combine all of that:
x= lower_bound;
y= upper_bound;
n= No_Of_data;
lambda=wavelength; %my guess
k= No_of_the_root;
X=(log(x+rand(1,n).*(y-x)).*lambda).^(1/k);
So you just have to insert your x,y,n,lambda and k
and then check
bigger_2 = X(X>2);
which would return only the values bigger than 2 and if you want the number of elements bigger than 2
No_bigger_2=numel(bigger_2);
I'm going to go with the assumption that what you've presented is supposed to be a random variate generation algorithm based on inversion, and that you want real-valued (not complex) solutions so you've omitted a negative sign on the logarithm. If those assumptions are correct, there's no need to simulate to get your answer.
Under the stated assumptions, your formula is the inverse of the complementary cumulative distribution function (CCDF). It's complementary because smaller values of U give larger values of X, and vice-versa. Solve the (corrected) formula for U. Using the values from your Matlab implementation:
X = 3 * (-log(U))^(6/5)
X / 3 = (-log(U))^(6/5)
-log(U) = (X / 3)^(5/6)
U = exp(-((X / 3)^(5/6)))
Since this is the CCDF, plugging in a value for X gives the probability (or proportion) of outcomes greater than X. Solving for X=2 yields 0.49, i.e., 49% of your outcomes should be greater than 2.
Make suitable adjustments if lambda is inside the radical, but the algebra leading to solution is similar. Unless I messed up my arithmetic, the proportion would then be 55.22%.
If you still are required to simulate this, knowing the analytical answer should help you confirm the correctness of your simulation.

NaN produced from division of values put a Vector

Below is some MATLAB code which is inside a loop that increments position every cycle. However after this code has run, the values in the vector F1 are all NaN. I print A to the screen and it outputs the expected values.
% 500 Hz Bands
total_energy = sum(F(1:(F_L*8)));
A = (sum(F(1:F_L))) /total_energy %% correct value printed to screen
F1(position) = (sum(F(1:F_L))) /total_energy;
Any help is appreciated
Assuming that F_L = position * interval, I suggest you use something like:
cumulative_energy_distribution = cumsum(abs(F).^2);
positions = 1:q;
F1 = cumulative_energy_distribution(positions * interval) ./ cumulative_energy_distribution(positions * 8 * interval);
sum-of-squares, which is the energy density (and also seen in Parseval's theorem) is monotonically increasing, so you don't need to worry about the energy going back down to zero.
In MATLAB, if you divide zero by zero, you get a NaN. Therefore, its always better to add an infinitesimal number to the denominator such that its addition doesn't change the final result, but it avoids division by zero. You can choose any small value as that infinitesimal number (such as 10^-10) but MATLAB already has a variable called eps.
eps(n) gives a distance to the next-largest number (whose precision is same as n) which could be represented in MATLAB. For example, if n=1, next double-precision number you could get to from 1 is 1+eps(1)=1+2.2204e-16. If n=10^10, then the next number is 10^10+1.9073e-06. eps is same as eps(1)=2.2204e-16. Adding eps doesn't change the output and avoids the situation of 0/0.

Using find on non-integer MATLAB array values

I've got a huge array of values, all or which are much smaller than 1, so using a round up/down function is useless. Is there anyway I can use/make the 'find' function on these non-integer values?
e.g.
ind=find(x,9.5201e-007)
FWIW all the values are in acceding sequential order in the array.
Much appreciated!
The syntax you're using isn't correct.
find(X,k)
returns k non-zero values, which is why k must be an integer. You want
find(x==9.5021e-007);
%# ______________<-- logical index: ones where condition is true, else zeros
%# the single-argument of find returns all non-zero elements, which happens
%# at the locations of your value of interest.
Note that this needs to be an exact representation of the floating point number, otherwise it will fail. If you need tolerance, try the following example:
tol = 1e-9; %# or some other value
val = 9.5021e-007;
find(abs(x-val)<tol);
When I want to find real numbers in some range of tolerance, I usually round them all to that level of toleranace and then do my finding, sorting, whatever.
If x is my real numbers, I do something like
xr = 0.01 * round(x/0.01);
then xr are all multiples of .01, i.e., rounded to the nearest .01. I can then do
t = find(xr=9.22)
and then x(t) will be every value of x between 9.2144444444449 and 9.225.
It sounds from your comments what you want is
`[b,m,n] = unique(x,'first');
then b will be a sorted version of the elements in x with no repeats, and
x = b(n);
So if there are 4 '1's in n, it means the value b(1) shows up in x 4 times, and its locations in x are at find(n==1).