I have attempted to generate a triangular probability distribution in Matlab, but was not successful. I used the formula at http://en.wikipedia.org/wiki/Triangular_distribution.
n = 10000000;
a = 0.2;
b = 0.7;
c = 0.5;
u = sqrt(rand(n, 1));
x = zeros(n, 1);
for i = 1:n
U = u(i);
if U < (c-a)/(b-a)
X = a + sqrt(U*(b-a)*(c-a));
else
X = b - sqrt((1-U)*(b-a)*(b-c));
end
x(i) = X;
end
hist(x, 100);
The histogram looks like so:
Doesn't look like much of a triangle to me. What's the problem? Am I abusing rand(n)?
you can add up two uniform distributions, the distribution graphs convolve, and you get a triangular distribution.
easy-to-understand example: rolling two dice, each action has uniform distribution to result in a number from 1-6, combined action has triangular distribution to result in a number 2-12
edit: minimal working example:
a=randint(10000,1,10);
b=randint(10000,1,10);
c=a+b;
hist(c,max(c)-min(c)+1)
edit2: looked in your script again. It's working but you've made one mistake:
u = sqrt(rand(n, 1));
should be
u = rand(n, 1);
edit3: optimized code
n = 10000000;
a = 0.2;
b = 0.7;
c = 0.5;
u = rand(n, 1);
x = zeros(n, 1);
idx = find(u < (c-a)/(b-a));
x(idx) = a + sqrt(u(idx)*(b-a)*(c-a));
idx =setdiff(1:n,idx);
x(idx) = b - sqrt((1-u(idx))*(b-a)*(b-c));
hist(x, 100);
This example uses the makedist() and pdf() commands.
a = 2; m = 7; b = 10;
N = 50000; % Number of samples
pd = makedist('Triangular',a,m,b); % Create probability distribution object
T = random(pd,N,1); % Generate samples from distribution
Triangular Distribution with lowerbound a = 7, mode m = 10, and upperbound b = 10.
% Plot PDF & Compare with Generated Sample
X = (a-2:.1:b+2);
figure, hold on, box on
histogram(T,'Normalization','pdf') % Note normalization-pdf option name-value pair
title([num2str(N) ' Samples'])
plot(X,pdf(pd,X),'r--','LineWidth',1.8)
legend('Empirical Density','Theoretical Density','Location','northwest')
MATLAB introduced makedist() in R2013a. Requires Stats toolbox.
Reference:
Triangular Distribution
Change
u = sqrt(rand(n, 1));
to
u = rand(n, 1);
The nice thing about this formula is that you can distribute a sample from a general triangle distribution with a single random sample.
Related
I have been working on a FastICA algorithm implementation using MatLab. Currently the code does not separate the signals as good as id like. I was wondering if anyone here could give me some advice on what I could do to fix this problem?
disp('*****Importing Signals*****');
s = [1,30000];
[m1,Fs1] = audioread('OSR_us_000_0034_8k.wav', s);
[f1,Fs2] = audioread('OSR_us_000_0017_8k.wav', s);
ss = size(f1,1);
n = 2;
disp('*****Mixing Signals*****');
A = randn(n,n); %developing mixing matrix
x = A*[m1';f1']; %A*x
m_x = sum(x, n)/ss; %mean of x
xx = x - repmat(m_x, 1, ss); %centering the matrix
c = cov(x');
sq = inv(sqrtm(c)); %whitening the data
x = c*xx;
D = diff(tanh(x)); %setting up newtons method
SD = diff(D);
disp('*****Generating Weighted Matrix*****');
w = randn(n,1); %Random weight vector
w = w/norm(w,2); %unit vector
w0 = randn(n,1);
w0 = w0/norm(w0,2); %unit vector
disp('*****Unmixing Signals*****');
while abs(abs(w0'*w)-1) > size(w,1)
w0 = w;
w = x*D(w'*x) - sum(SD'*(w'*x))*w; %perform ICA
w = w/norm(w, 2);
end
disp('*****Output After ICA*****');
sound(w'*x); % Supposed to be one of the original signals
subplot(4,1,1);plot(m1); title('Original Male Voice');
subplot(4,1,2);plot(f1); title('Original Female Voice');
subplot(4,1,4);plot(w'*x); title('Post ICA: Estimated Signal');
%figure;
%plot(z); title('Random Mixed Signal');
%figure;
%plot(100*(w'*x)); title('Post ICA: Estimated Signal');
Your covariance matrix c is 2 by 2, you cannot work with that. You have to mix your signal multiple times with random numbers to get anywhere, because you must have some signal (m1) common to different channels. I was unable to follow through your code for fast-ICA but here is a PCA example:
url = {'https://www.voiptroubleshooter.com/open_speech/american/OSR_us_000_0034_8k.wav';...
'https://www.voiptroubleshooter.com/open_speech/american/OSR_us_000_0017_8k.wav'};
%fs = 8000;
m1 = webread(url{1});
m1 = m1(1:30000);
f1 = webread(url{2});
f1 = f1(1:30000);
ss = size(f1,1);
n = 2;
disp('*****Mixing Signals*****');
A = randn(50,n); %developing mixing matrix
x = A*[m1';f1']; %A*x
[www,comp] = pca(x');
sound(comp(:,1)',8000)
Is there any function in Matlab which calculates the correlation ratio?
Here is an implementation I tried to do, but the results are not right.
function cr = correlation_ratio(X, Y, L)
ni = zeros(1, L);
sigmai = ni;
for i = 0:(L-1)
Yn = Y(X == i);
ni(1, i+1) = numel(Yn);
m = (1/ni(1, i+1))*sum(Yn);
sigmai(1, i+1) = (1/ni(1, i+1))*sum((Yn - m).^2);
end
n = sum(ni);
prod = ni.*sigmai;
cr = (1-(1/n)*sum(prod))^0.5;
This is the equation on the Wikipedia page:
where:
η is the correlation ratio,
yx,i are the sample values (x is the class label, i the sample index),
yx (with the bar on top) is the mean of sample values for class x,
y (with the bar on top) is the mean for all samples across all classes, and
nx is the number of samples in class x.
This is how I interpreted it into code:
function eta = correlation_ratio(X, Y)
X = X(:); % make sure we've got column vectors, simplifies things below a bit
Y = Y(:);
L = max(X);
mYx = zeros(1, L+1); % we'll write mean per class here
nx = zeros(1, L+1); % we'll write number of samples per class here
for i = unique(X).'
Yn = Y(X == i);
if numel(Yn)>1
mYx(i+1) = mean(Yn);
nx(i+1) = numel(Yn);
end
end
mY = mean(Y); % mean across all samples
eta = sqrt(sum(nx .* (mYx - mY).^2) / sum((Y-mY).^2));
The loop could be replaced with accumarray.
I am trying to calculate Pearson coefficients between all pair combinations of my variables of all my samples.
Say i have an m*n matrix where m are the variables and n are the samples
i want to calculate for each variable of my data what is the correlation to every other variable.
So, i managed to do that with nested loops:
X = rand[1000 100];
for i = 1:1000
base = X(i, :);
for j = 1:1000
target = X(j, :);
correlation = corrcoef(base, target);
correlation = correlation(2, 1);
corData(1, j) = correlation
end
totalCor(i, :) = corData
end
and it works, but takes too much time to run
I am trying to find a way to run the corrcoef function on a row basis, meaning maybe to create an additional matrix with repmat of the base values and correlate to the X data using some FUN function.
Could not figure out how to use the fun with inputs from to arrays, running between individuals lines/columns
help will be appreciated
This post involves a bit of hacking, so bear with it!
Stage #0 To start off, we have -
for i = 1:N
base = X(i, :);
for j = 1:N
target = X(j, :);
correlation = corrcoef(base, target);
correlation = correlation(2, 1)
corData(1, j) = correlation;
end
end
Stage #1 From the documentation of corrcoef in its source code :
If C is the covariance matrix, C = COV(X), then CORRCOEF(X) is the
matrix whose (i,j)'th element is : C(i,j)/SQRT(C(i,i)*C(j,j)).
After hacking into the code of covariance, we see that for the default case of one input, the covariance formula is simply -
[m,n] = size(x);
xc = bsxfun(#minus,x,sum(x,1)/m);
xy = (xc' * xc) / (m-1);
Thus, mixing the two definitions and putting them into the problem at hand, we have -
m = size(X,2);
for i = 1:N
base = X(i, :);
for j = 1:N
target = X(j, :);
BT = [base(:) target(:)];
xc = bsxfun(#minus,BT,sum(BT,1)/m);
C = (xc' * xc) / (m-1); %//'
corData = C(2,1)/sqrt(C(2,2)*C(1,1))
end
end
Stage #2 This is the final stage where we use the real fun aka bsxfun to kill all loops, like so -
%// Broadcasted subtract of each row by the average of it.
%// This corresponds to "xc = bsxfun(#minus,BT,sum(BT,1)/m)"
p1 = bsxfun(#minus,X,mean(X,2));
%// Get pairs of rows from X and get the dot product.
%// Thus, a total of "N x N" such products would be obtained.
p2 = sum(bsxfun(#times,permute(p1,[1 3 2]),permute(p1,[3 1 2])),3);
%// Scale them down by "size(X,2)-1".
%// This was for the part : "C = (xc' * xc) / (m-1)".
p3 = p2/(size(X,2)-1);
%// "C(2,2)" and "C(1,1)" are diagonal elements from "p3", so store them.
dp3 = diag(p3);
%// Get "sqrt(C(2,2)*C(1,1))" by broadcasting elementwise multiplication
%// of "dp3". Finally do elementwise division of "p3" by it.
totalCor_out = p3./sqrt(bsxfun(#times,dp3,dp3.'));
Benchmarking
This section compares the original approach against the proposed one and also verifies the output. Here's the benchmarking code -
disp('---------- With original approach')
tic
X = rand(1000,100);
corData = zeros(1,1000);
totalCor = zeros(1000,1000);
for i = 1:1000
base = X(i, :);
for j = 1:1000
target = X(j, :);
correlation = corrcoef(base, target);
correlation = correlation(2, 1);
corData(1, j) = correlation;
end
totalCor(i, :) = corData;
end
toc
disp('---------- With the real fun aka BSXFUN')
tic
p1 = bsxfun(#minus,X,mean(X,2));
p2 = sum(bsxfun(#times,permute(p1,[1 3 2]),permute(p1,[3 1 2])),3);
p3 = p2/(size(X,2)-1);
dp3 = diag(p3);
totalCor_out = p3./sqrt(bsxfun(#times,dp3,dp3.')); %//'
toc
error_val = max(abs(totalCor(:)-totalCor_out(:)))
Output -
---------- With original approach
Elapsed time is 186.501746 seconds.
---------- With the real fun aka BSXFUN
Elapsed time is 1.423448 seconds.
error_val =
4.996e-16
I implemented a method for removing shadows based on invariant color features found in the paper Entropy Minimization for Shadow Removal. My implementation seems to be yielding similar computational results sometimes, but they are always off, and my grayscale image is blocky, maybe as a result of incorrectly taking the geometric mean.
Here is an example plot of the information potential from the horse image in the paper as well as my invariant image. Multiply the x-axis by 3 to get theta(which goes from 0 to 180):
And here is the grayscale Image my code outputs for the correct maximum theta (mine is off by 10):
You can see the blockiness that their image doesn't have:
Here is their information potential:
When dividing by the geometric mean, I have tried using NaN and tresholding the image so the smallest possible value is .01, but it doesn't seem to change my output.
Here is my code:
I = im2double(imread(strname));
[m,n,d] = size(I);
I = max(I, .01);
chrom = zeros(m, n, 3, 'double');
for i = 1:m
for j = 1:n
% if ((I(i,j,1)*I(i,j,2)*I(i,j,3))~= 0)
chrom(i,j, 1) = I(i,j,1)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
chrom(i,j, 2) = I(i,j,2)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
chrom(i,j, 3) = I(i,j,3)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
% else
% chrom(i,j, 1) = 1;
% chrom(i,j, 2) = 1;
% chrom(i,j, 3) = 1;
% end
end
end
p1 = mat2gray(log(chrom(:,:,1)));
p2 = mat2gray(log(chrom(:,:,2)));
p3 = mat2gray(log(chrom(:,:,3)));
X1 = mat2gray(p1*1/(sqrt(2)) - p2*1/(sqrt(2)));
X2 = mat2gray(p1*1/(sqrt(6)) + p2*1/(sqrt(6)) - p3*2/(sqrt(6)));
maxinf = 0;
maxtheta = 0;
data2 = zeros(1, 61);
for theta = 0:3:180
M = X1*cos(theta*pi/180) - X2*sin(theta*pi/180);
s = sqrt(std2(X1)^(2)*cos(theta*pi/180) + std2(X2)^(2)*sin(theta*pi/180));
s = abs(1.06*s*((m*n)^(-1/5)));
[m, n] = size(M);
length = m*n;
sources = zeros(1, length, 'double');
count = 1;
for x=1:m
for y = 1:n
sources(1, count) = M(x , y);
count = count + 1;
end
end
weights = ones(1, length);
sigma = 2*s;
[xc , Ak] = fgt_model(sources , weights , sigma , 10, sqrt(length) , 6 );
sum1 = sum(fgt_predict(sources , xc , Ak , sigma , 10 ));
sum1 = sum1/sqrt(2*pi*2*s*s);
data2(theta/3 + 1) = sum1;
if (sum1 > maxinf)
maxinf = sum1;
maxtheta = theta;
end
end
InvariantImage2 = cos(maxtheta*pi/180)*X1 + sin(maxtheta*pi/180)*X2;
Assume the Fast Gauss Transform is correct.
I don't know whether this makes any difference as it is more than a month now, but the blockiness and different information potential plot is simply caused by compression of the used image. You can't expect to be getting same results using this image as they had, because they have used raw, high resolution uncompressed version of it. I have to say I am fairly impressed with your results, especially with implementing the information potential. That thing went over my head a little.
John.
I'm trying to implement a very basic eigenface calculation in Matlab. It kind of works but I get only two meaningful eigenvalues - the rest are zero. The corresponding eigenvectors seem to be right since most of them will show an eigenface when converting to an image.
So why are most of my eigenvalues zero? I need them to be different from zero in order to sort the eigenfaces by their significance (greatest magnitude eigenvalues).
I am reading 400 images, each size h/w = 112/92 px
They can be found here: http://www.cl.cam.ac.uk/Research/DTG/attarchive/pub/data/att_faces.zip
The code:
clear all;
files = dir('eigenfaces2/training/*.pgm');
[numFaces, discard] = size(files);
h = 112;
w = 92;
s = h * w;
%calculate average face
avgFace = zeros(s, 1);
faces = [];
for i=1:numFaces
file = strcat('eigenfaces2/training/', files(i).name);
im = double(imread(file));
im = reshape(im, s, 1);
avgFace = avgFace + im;
faces(:,i) = im;
end
avgFace = avgFace ./ numFaces;
A = [];
for i=1:numFaces
diff = avgFace - faces(i);
A(:,i) = diff;
end
numEigs = 20;
L = (A' * A) / numFaces;
[tmpEigs, discard] = eigs(L, numEigs);
eigenfaces = [];
for i=1:numEigs
v = tmpEigs(:,i);
eigenfaces(:,i) = A * v;
end
%visualize largest eigenfaces
figure;
for i=1:numEigs
eigface = eigenfaces(:,i);
mmax = max(eigface);
mmin = min(eigface);
eigface = 255 .* (eigface-mmin) ./ (mmax-mmin);
eigface = reshape(eigface, h, w);
subplot(4,5,i); imshow(uint8(eigface));
end
I've don't have much experience with computer vision/image recognition, but I think you might want
diff = avgFace - faces(:,i);
in your second for loop. Otherwise it's just subtracting a constant from avgFace each time, and so A (and hence L) only gets a rank of 2.