Generate matrix based on density function - matlab

I'm trying to generate a 2-by-6 matrix of random numbers based on their density function, for example
f(x)= 2x-4 for 2 < x < 3; 0 otherwise
So from what I understand I have to find the cumulative distribution function first, x^2 - 4x, and then I have to invert it so that I can use the rand function.
This is the part I do not understand: how do I get the inverted function?

Try something similar to this method: https://stackoverflow.com/a/13914141/1011724.
However, your PDF is continuous so you need to adjust it slightly. The cumsum part becomes your CDF and the sum(r >= ... part becomes a definite integral from 0 to rand (which is just your CDF since it evaluates to 0 at x==0) so (ignoring your limits) you get
X = @(x) x.^2 - 4*x
To generate a random matrix, call X(rand(2,6))
To account for your limits you can just multiply the entire function by x > 2 & x < 3 but also if it's greater than 3 then although the PDF is 0, the CDF should still be 3^2 - 4 = 5
X_limited = @(x) (x.^2 - 4*x).*(x > 2 & x < 3) + (x >= 3)*5
If you plot a graph of (x > 2 & x < 3) you will see it is a rectangular function between 2 and 3, so multiplying by it makes anything outside of that window 0 but leaves anything inside the window unchanged. Similarly, x >= 3 is a step function starting at x == 3 and thus it adds 5 to any values higher than 3. Since the windowing function makes sure the first term is zero when x is greater than 3, this step function ensures a value of 5 for all x greater than 3.
Now you just need to generate random numbers in whatever your range is. Assuming it's between 0 and 5
x = rand(2,6)*5
X_limited(x)
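The inversion step the question asks about can also be worked out in closed form. As a minimal sketch, assuming the density f(x) = 2x - 4 on 2 < x < 3 from the question: the antiderivative needs the constant +4 so that the CDF is 0 at x = 2 and 1 at x = 3, which gives F(x) = x^2 - 4x + 4 = (x - 2)^2. Setting u = (x - 2)^2 and keeping the root that lies in the interval gives x = 2 + sqrt(u), so uniform draws from rand map directly onto samples. The names Finv, U, bigSample and sampleMean are illustrative, not from the thread.

```matlab
% Inverse-transform sampling for f(x) = 2x - 4 on 2 < x < 3.
% CDF: F(x) = (x - 2)^2, so the inverse is Finv(u) = 2 + sqrt(u).
Finv = @(u) 2 + sqrt(u);

U = rand(2, 6);      % uniform draws in (0,1)
X = Finv(U);         % 2-by-6 matrix of samples from f

% Sanity check on a larger sample: the mean of f is 8/3.
bigSample  = Finv(rand(1e5, 1));
sampleMean = mean(bigSample);
```

Every entry of X falls strictly between 2 and 3, and sampleMean should come out close to 8/3.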

Related

Zero crossings around mean

I am working on developing a suite of classifiers for EEG signals and I will need a zero-crossings-around-mean function, defined in the following manner:
Ideally if I have some vector with a range of values representing a sinusoid or any time varying signal, I will want to return a vector of Booleans of the same size as the vector saying if that particular value is a mean crossing. I have the following Matlab implementation:
ZX = @(x) sum(((x - mean(x)>0) & (x - mean(x)<0)) | ((x - mean(x)<0) & (x - mean(x)>0)));
Testing it on toy data:
[0 4 -6 9 -20 -5]
Yields:
0
EDIT:
Yet I believe it should return:
3
What am I missing here?
An expression like:
((x-m)>0) & ((x-m)<0)
is always going to return a vector of all zeros because no individual element of x is both greater and less than zero. You need to take into account the subscripts on the xs in the definition of ZX:
((x(1:end-1)-m)>0) & ((x(2:end)-m)<0)
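Putting the shifted-index idea together, a minimal sketch for a row vector might look like the following. It covers both crossing directions by comparing which side of the mean each pair of neighbouring samples lies on, and it returns a logical vector the same size as x, as the question asks. The names above, isCrossing and cnt are illustrative, and treating a sample exactly equal to the mean as "not above" is an assumption.

```matlab
% Mean-crossing detection by comparing neighbouring samples.
x = [0 4 -6 9 -20 -5];          % toy data from the question (row vector)
above = (x - mean(x)) > 0;      % true where the sample is above the mean

% Index k is a crossing when samples k-1 and k lie on opposite sides.
isCrossing = [false, above(1:end-1) ~= above(2:end)];

cnt = nnz(isCrossing);          % number of mean crossings
```

For the toy data the mean is -3, isCrossing is [0 0 1 1 1 0], and cnt is 3.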
You can use the findpeaks function on -abs(x), where x is your original data, to find the peak locations. This would give you the zero crossings in general for continuous signals which do not have zero as an actual maximum of the signal.
t = 0:0.01:10;
x = sin(pi*t);
plot(t,x)
grid
y = -abs(x);
[P,L] = findpeaks(y,t);
hold on
plot(L,P,'*')
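A simple solution is to use movprod and count the products which are negative. Pass 'Endpoints','discard' so that the truncated first window, which contains only the first element, is not counted as a crossing when that element lies below the mean, i.e.,
cnt = sum(sign(movprod(x-mean(x),2,'Endpoints','discard'))<0);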
With your toy example, you will get cnt = 3.

determine lag between two vector

I want to find the minimum lag between two vectors, that is, the smallest shift at which a pattern in one vector is repeated in the other.
For example, for
x=[0 0 1 2 2 2 0 0 0 0]
y=[1 2 2 2 0 0 1 2 2 2]
I want to obtain 4 for x to y and 2 for y to x.
I found the finddelay(x,y) function, but it works correctly only for x to y (it gives -4 for y to x).
Is there any function that only gives me the lag when moving in the rightward direction of the vector? I would be very thankful for any help with this.
I think this may be a potential bug in finddelay. Note this excerpt from the documentation (emphasis mine):
X and Y need not be exact delayed copies of each other, as finddelay(X,Y) returns an estimate of the delay via cross-correlation. However this estimated delay has a useful meaning only if there is sufficient correlation between delayed versions of X and Y. Also, if several delays are possible, as in the case of periodic signals, the delay with the smallest absolute value is returned. In the case that both a positive and a negative delay with the same absolute value are possible, the positive delay is returned.
This would seem to imply that finddelay(y, x) should return 2, when it actually returns -4.
EDIT:
This would appear to be an issue related to floating-point errors introduced by xcorr as I describe in my answer to this related question. If you type type finddelay into the Command Window, you can see that finddelay uses xcorr internally. Even when the inputs to xcorr are integer values, the results (which you would expect to be integer values as well) can end up having floating-point errors that cause them to be slightly larger or smaller than an integer value. This can then change the indices where maxima would be located. The solution is to round the output from xcorr when you know your inputs are all integer values.
A better implementation of finddelay for integer values might be something like this, which would actually return the delay with the smallest absolute value:
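function delay = finddelay_int(x, y)
    [d, lags] = xcorr(x, y);       % cross-correlation and the matching lags
    d = round(d);                  % inputs are integers, so remove round-off error
    lags = -lags(d == max(d));     % every candidate delay at the maximum
    [~, index] = min(abs(lags));   % choose the candidate closest to zero
    delay = lags(index);
end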
However, in your question you are asking for the positive delays to be returned, which won't necessarily be the smallest in absolute value. Here's a different implementation of finddelay that works correctly for integer values and gives preference to positive delays:
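function delay = finddelay_pos(x, y)
    [d, lags] = xcorr(x, y);
    d = round(d);                  % inputs are integers, so remove round-off error
    lags = -lags(d == max(d));     % candidate delays, in descending order
    pos = lags(lags > 0);          % keep only the positive candidates
    if isempty(pos)
        delay = lags(1);           % no positive delay: take the one closest to zero
    else
        delay = pos(end);          % smallest positive delay
    end
end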
And here are the various results for your test case:
>> x = [0 0 1 2 2 2 0 0 0 0];
>> y = [1 2 2 2 0 0 1 2 2 2];
>> [finddelay(x, y) finddelay(y, x)] % The default behavior, which fails to find
% the delays with smallest absolute value
ans =
4 -4
>> [finddelay_int(x, y) finddelay_int(y, x)] % Correctly finds the delays with the
% smallest absolute value
ans =
-2 2
>> [finddelay_pos(x, y) finddelay_pos(y, x)] % Finds the smallest positive delays
ans =
4 2

sum matrix using logical matrix - index exceeds matrix dimensions

I have two matrices.
mcaps which is a double 1698 x 2
index_g which is a logical 1698 x 2
When using the line of code below I get the error message that Index exceeds matrix dimensions. I don't see how this is the case though?
tsp = nansum(mcaps(index_g==1, :));
Update
Sorry I should have mentioned that I need the sum of each column in the mcaps vector
** Example of data **
mcaps    index_g
5  6     0  0
4  3     0  0
6  5     1  1
4  6     0  1
8  7     0  0
There are two problems here. I missed one. Original answer is below.
What I missed is that when you use the logical index in this way, you are picking out elements of the matrix that may have different numbers of elements in each column, so MATLAB can't return a well-formed matrix back to nansum, and so returns a vector. To get around this, use the fact that 0 + anything = anything, so entries set to zero do not change the sum:
% create a mask of values you don't want to sum. Note that since
% index_g is already logical, you don't have to test equal to 1.
mask = ~index_g | isnan(mcaps);
% create a temporary variable
mcaps_to_sum = mcaps;
% change all of the values that you don't want to sum to zero
mcaps_to_sum(mask) = 0;
% do the sum
tsp = sum(mcaps_to_sum,1);
This is basically all that the nansum function does internally: it sets all of the NaN values to zero and then calls the sum function.
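As a quick check of the masking approach, here is a small sketch that uses the example data from the question. An entry is zeroed when it is either unselected or NaN, which is why the two conditions are combined with |. The variable name masked is illustrative.

```matlab
% Column sums of the selected entries, using the question's sample data.
mcaps   = [5 6; 4 3; 6 5; 4 6; 8 7];
index_g = logical([0 0; 0 0; 1 1; 0 1; 0 0]);

masked = mcaps;                               % work on a copy
masked(~index_g | isnan(mcaps)) = 0;          % zero out unselected or NaN entries

tsp = sum(masked, 1);                         % one sum per column
```

For this data tsp is [6 11]: only the 6 is selected in the first column, and 5 + 6 in the second.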
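index_g == 1 returns a 1698 x 2 logical matrix, but mcaps(index_g == 1, :) then uses that whole matrix as a row index. MATLAB reads it as 3396 row positions, so any true entry in its second column refers to a row beyond 1698, hence the error. To sum the columns, use the optional dim input. You want: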
tsp = nansum(mcaps(index_g == 1),1);

determine the frequency of a number in a simulation

I have the following function:
I have to generate 2000 random numbers from this function and then make a histogram.
Then I have to determine how many of them are greater than 2, using P(X>2).
this is my function:
%function [ output_args ] = Weibullverdeling( X )
%UNTITLED Summary of this function goes here
% Detailed explanation goes here
for i=1:2000
% x= rand*1000;
%x=ceil(x);
x=i;
Y(i) = 3*(log(x))^(6/5);
X(i)=x;
end
plot(X,Y)
and it gives me the following image:
How can I make it tell me how many values greater than 2 I have?
Very simple:
>> Y_greater_than_2 = Y(Y>2);
>> size(Y_greater_than_2)
ans =
1 1998
So that's 1998 values out of 2000 that are greater than 2.
EDIT
If you want to find the values between two other values, say between 1 and 4, you need to do something like:
>> Y_between = Y(Y>=1 & Y<=4);
>> size(Y_between)
ans =
1 2
This is what I think:
for i=1:2000
x=rand(1);
Y(i) = 3*(log(x))^(6/5);
X(i)=x;
end
plot(X,Y)
U is a uniform random variable from which you can get the X. So you need to use rand function in MATLAB.
After which you implement:
size(Y(Y>2),2);
You can implement the code directly (here k is the order of the root, n is the number of data points, y is the largest value of the distribution, x is the smallest value of the distribution, and lambda is the lambda in your equation):
X=(log(x+rand(1,n).*(y-x)).*lambda).^(1/k);
result=numel(X(X>2));
Let's split it up and explain it in detail:
You want the k-th root of a number:
number.^(1/k)
You want the natural logarithm of a number:
log(number)
You want to multiply something:
numberA.*numberB
You want to get, let's say, 1000 random numbers between x and y:
(x+rand(1,1000).*(y-x))
You want to combine all of that:
x= lower_bound;
y= upper_bound;
n= No_Of_data;
lambda=wavelength; %my guess
k= No_of_the_root;
X=(log(x+rand(1,n).*(y-x)).*lambda).^(1/k);
So you just have to insert your x,y,n,lambda and k
and then check
bigger_2 = X(X>2);
which would return only the values bigger than 2 and if you want the number of elements bigger than 2
No_bigger_2=numel(bigger_2);
I'm going to go with the assumption that what you've presented is supposed to be a random variate generation algorithm based on inversion, and that you want real-valued (not complex) solutions so you've omitted a negative sign on the logarithm. If those assumptions are correct, there's no need to simulate to get your answer.
Under the stated assumptions, your formula is the inverse of the complementary cumulative distribution function (CCDF). It's complementary because smaller values of U give larger values of X, and vice-versa. Solve the (corrected) formula for U. Using the values from your Matlab implementation:
X = 3 * (-log(U))^(6/5)
X / 3 = (-log(U))^(6/5)
-log(U) = (X / 3)^(5/6)
U = exp(-((X / 3)^(5/6)))
Since this is the CCDF, plugging in a value for X gives the probability (or proportion) of outcomes greater than X. Solving for X=2 yields 0.49, i.e., 49% of your outcomes should be greater than 2.
Make suitable adjustments if lambda is inside the radical, but the algebra leading to the solution is similar. Unless I messed up my arithmetic, the proportion would then be 55.22%.
If you still are required to simulate this, knowing the analytical answer should help you confirm the correctness of your simulation.
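A simulation along those lines could be sketched as follows, under the same assumptions as the derivation (a negative sign on the logarithm, scale 3 and exponent 6/5 taken from the question's code). The seed and the names pSim and pExact are illustrative.

```matlab
% Compare the simulated proportion of X > 2 with the analytical CCDF value.
rng(1);                             % fixed seed so the run is repeatable
n = 2000;
U = rand(n, 1);                     % uniform draws in (0,1)
X = 3 * (-log(U)).^(6/5);           % inversion with the negative sign restored

pSim   = mean(X > 2);               % simulated proportion greater than 2
pExact = exp(-((2/3)^(5/6)));       % CCDF evaluated at X = 2
```

With 2000 draws pSim should land within a few percentage points of pExact, which is about 0.49. Calling histogram(X) afterwards gives the histogram the question asks for.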

Generate nonnegative random number ( matlab) [duplicate]

This question already has answers here:
Generate a random number with max, min and mean (average) in Matlab
I want to generate an N-dimensional column vector in MATLAB with mean 0.5 (the variance can be adjusted), but I want all numbers to be positive. Does anyone know how to do it?
You can try this:
E.g. create a vector of 1000 random values drawn from a normal distribution with a mean of 0.5 and a standard deviation of 5.
a = 5;
b = 0.5;
y = a.*randn(1000,1) + b;
To make it positive, you can delete all the numbers that are negative or zero and generate more until you have n positive numbers.
Check out here for more info.
It depends on the distribution that you want. The rand(v) function generates uniformly distributed random numbers in the open interval (0,1), so neither 0 nor 1 is ever returned, in an array with dimensions v.
So if you want a 3x4x5x6 array, you would do
myRandArray = rand([3 4 5 6]);
If you want the upper value to be larger, you could do
myRandArray = maxVal * rand([3 4 5 6]);
Where maxVal is the largest value. And if you want a range minVal to maxVal, then do
myRandArray = minVal + (maxVal - minVal) * rand([3 4 5 6]);
For other distributions (like randn for normal distribution) you can make adjustments to the above, obviously. If you want a "truncated normal distribution" - you may need to start with too many values:
dims = [3 4 5 6];
n = prod( dims );
tooMany = 0.5 + randn(2 * n, 1); % column vector of candidates centered on 0.5
tooMany(tooMany <= 0) = [];      % keep only the strictly positive values
if numel( tooMany ) >= n
    myRandArray = reshape(tooMany(1:n), dims);
end
You can obviously improve on this, but it's a general idea.
For example, you could generate only n values, see how many fail (say m), then generate another m * n / (n - m), and repeat as needed until you have exactly n.
Note that the mean of the final distribution is no longer 0.5 since we cut off the tail. A 'normal distribution' cannot remain 'normal' if you exclude certain values. But then you didn't specify what distribution you wanted...
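The "repeat as needed" idea can be sketched as a simple rejection loop. This is only an illustration under assumed parameters (mu = 0.5 from the question, sigma = 1 chosen arbitrarily), and it draws twice the number of values still missing on each pass rather than using the exact m * n / (n - m) estimate.

```matlab
% Draw exactly n strictly positive values from a normal centered on mu.
rng(0);                                  % fixed seed so the run is repeatable
n = 1000;
mu = 0.5;                                % mean before truncation
sigma = 1;                               % assumed standard deviation

y = zeros(0, 1);
while numel(y) < n
    need = n - numel(y);                 % how many values are still missing
    cand = mu + sigma * randn(2 * need, 1);
    y = [y; cand(cand > 0)];             % keep only the positive candidates
end
y = y(1:n);                              % trim any surplus
```

As noted above, the mean of y ends up larger than mu because the negative tail has been cut off. If a mean of exactly 0.5 matters more than the shape of the distribution, rand(n,1) already gives positive values with mean 0.5.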