I'm trying to divide a Gaussian distribution into equiprobable parts. I'm using the following code:
function main
    mu=100;
    sigma=2;
    n=100;
    k=3;
    samp = mu + sigma.*randi([20,100],1,n);
    %hist(samp)
    v=optim_m(samp,k)
end
function v=optim_m(d,k)
    v=-inf;
    mu=mean(d);
    sigma=var(d);
    for i=1:k
        [x, ~] = fminbnd(@(x) (0.5*( 1+erf( (x-mu)/(((sigma^2) )^0.5 ) ) ) -i/k )^2 ,mu-3*sigma,mu+3*sigma);
        v=[v,x];
    end
end
I get some pretty strange results, such as negative values. If I use the same function with small natural numbers, everything seems to work fine.
PS. I'm a noob, so please don't judge too harshly :)
If you have the Statistics Toolbox, you can simply use norminv. For example, for eight equiprobable parts:
boundaries = norminv(1/8:1/8:7/8,mu,sigma);
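Some details in the question's code probably explain the odd values. randi draws uniform integers rather than Gaussian samples, var returns the variance rather than the standard deviation (so the search interval mu-3*sigma to mu+3*sigma becomes very wide and can reach negative numbers), and the normal CDF needs sigma*sqrt(2) in the denominator of the erf argument. If norminv is not available, the boundaries can also be written in closed form with the built-in erfinv. A minimal sketch, assuming k parts and known mu and sigma (the names p, boundaries and cdfAtBoundaries are only illustrative):

```matlab
mu = 100;      % mean
sigma = 2;     % standard deviation (not the variance)
k = 3;         % number of equiprobable parts

p = (1:k-1)/k;                                    % interior cumulative probabilities
boundaries = mu + sigma*sqrt(2)*erfinv(2*p - 1);  % inverse of the normal CDF

% Check: the normal CDF evaluated at the boundaries should reproduce p
cdfAtBoundaries = 0.5*(1 + erf((boundaries - mu)/(sigma*sqrt(2))));
disp([boundaries; cdfAtBoundaries])
```

Only the k-1 interior boundaries are computed, because the outermost parts extend to -inf and +inf.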
Given some data (from the function 1/x), I solved for powers of a model function made of the sum of exponentials. I did this for several different samples of data and I want to plot the powers that I find and test to see what type of distribution they have.
My main problems are:
My loops are quitting when the nlinfit() function says the function gives -Inf values. I've tried playing around with initial conditions but can't see how to fix it. I know that I am working with the function 1/x, so there may not be a way around this problem. Is there at least a way to force the loop to keep going after getting this error? It only happens for some iterations. If I manually regenerate my sample data and run it again it works. So if I could force the loop to keep going it would be ideal. I have tried switching the for loop to a while loop and putting a try catch statement inside but this just seems to run endlessly.
I have code set up to test whether the distribution of the resulting powers is close to an exponential distribution. Is this the correct way to do this? When I add the Monte Carlo tolerance ('MCTol'), it has a drastic effect on p in some cases (going from 0.5 to 0.9). Do I have it set up correctly? Is there also a way to see empirically how far the curve is from the exponential distribution?
Are there better ways to fit this curve or do any of the above? I'm just learning so any suggestions are appreciated.
Here is my current code :
%Function that I use to fit curve and get error for each iteration
function [beta,error] = Fit_and_Calc_Error(T,X,modelfun, beta0)
    opts = statset('nlinfit');
    opts.RobustWgtFun = 'bisquare';
    beta = nlinfit(T,X,modelfun,beta0,opts);
    error = immse(X,modelfun(beta,T));
end
%Starting actual loop stuff
tracking_a = zeros();
tracking_error = zeros();
modelfun = @(a,x)(exp(-a(1)*x)+exp(-a(2)*x));
for i = 1:60
    %make new random data
    x = 20*rand(300,1)+1; %between 1 and 21 (Avoiding x=0)
    X = sort(x);
    Y = 1./(X);
    %fit random data
    beta0 = [max(X(:,1))/10;max(X(:,1))/10];
    [beta,error]= Fit_and_Calc_Error(X,Y,modelfun, beta0);
    %storing data found
    tracking_a(i,1:length(beta)) = beta;
    tracking_error(i) = error;
end
%Testing if first power found has exponential dist
pdN = fitdist(tracking_a(:,1),'Exponential');
histogram(tracking_a(:,1),10)
hold on
x_values2 = 0:0.001:5;
y2 = pdf(pdN,x_values2);
plot(x_values2,y2)
hold off
[h,p] = lillietest(tracking_a(:,1),'Distribution','exponential','MCTol',1e-4)
%Testing if second power found has exponential dist
pdN2 = fitdist(tracking_a(:,2),'Exponential');
histogram(tracking_a(:,2),10)
hold on
y3 = pdf(pdN2,x_values2);
plot(x_values2,y3)
hold off
[h2,p2] = lillietest(tracking_a(:,2),'Distribution','exponential','MCTol',1e-4)
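For the first problem, one possible approach is to keep the for loop, wrap only the nlinfit call in try/catch, and skip a failed iteration with continue. This is a sketch, assuming that a failed fit actually throws an error (if nlinfit only issues a warning, the catch branch never runs) and that NaN rows are an acceptable marker for skipped runs. The names nRuns and ok are introduced here for illustration, and the robust-fit options from the question are left out for brevity.

```matlab
modelfun = @(a,x) exp(-a(1)*x) + exp(-a(2)*x);
nRuns = 60;
tracking_a = nan(nRuns, 2);        % NaN rows mark iterations whose fit failed

for i = 1:nRuns
    X = sort(20*rand(300,1) + 1);  % between 1 and 21, as in the question
    Y = 1./X;
    beta0 = [max(X)/10; max(X)/10];
    try
        beta = nlinfit(X, Y, modelfun, beta0);
    catch err
        fprintf('Run %d skipped: %s\n', i, err.message);
        continue                   % go on with the next random sample
    end
    tracking_a(i,:) = beta(:).';
end

ok = ~any(isnan(tracking_a), 2);   % rows with a successful fit
fprintf('%d of %d fits succeeded\n', nnz(ok), nRuns);
```

The successful rows, tracking_a(ok,1) and tracking_a(ok,2), can then be passed to fitdist and lillietest. lillietest also returns the test statistic and its critical value as third and fourth outputs ([h,p,kstat,critval] = lillietest(...)), which gives a rough empirical measure of how far the sample is from the fitted exponential distribution.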
This is my first go with ML (and Matlab) and I'm following "Learning From Data" by Yaser S. Abu-Mostafa.
I'm trying to implement the Perceptron algorithm. After going through the pseudocode and other people's solutions, I still can't fix my problem (I went through other threads too).
The algorithm separates the data fine, so it works. However, I want to plot a single line, but it seems to separate the data in such a way that the '-1' cluster is split into two or more clusters.
This is the code:
iterations = 100;
dim = 3;
X1=[rand(1,dim);rand(1,dim);ones(1,dim)]; % class '-1' (matches Y below)
X2=[rand(1,dim);1+rand(1,dim);ones(1,dim)]; % class '+1' (matches Y below)
X=[X1,X2];
Y=[-ones(1,dim),ones(1,dim)];
w=[0,0,0]';
% call perceptron
wtag=weight(X,Y,w,iterations);
% predict
ytag=wtag'*X;
% plot prediction over original data
figure;hold on
plot(X1(1,:),X1(2,:),'b.')
plot(X2(1,:),X2(2,:),'r.')
plot(X(1,ytag<0),X(2,ytag<0),'bo')
plot(X(1,ytag>0),X(2,ytag>0),'ro')
legend('class -1','class +1','pred -1','pred +1')
%Why don't I get just one line?
plot(X,Y);
The weight function (Perceptron):
function [w] = weight(X,Y,w_init,iterations)
%WEIGHT Summary of this function goes here
% Detailed explanation goes here
w = w_init;
for iteration = 1 : iterations %<- was 100!
    for ii = 1 : size(X,2) %cycle through training set
        if sign(w'*X(:,ii)) ~= Y(ii) %wrong decision?
            w = w + X(:,ii) * Y(ii); %then add (or subtract) this point to w
        end
    end
    sum(sign(w'*X)~=Y)/size(X,2); %show misclassification rate
end
I don't think the problem is in the second function, but I added it regardless.
I'm pretty sure the algorithm separates the data into more than one cluster, but I can't tell why. Most of the learning I've done so far was math and theory rather than actual coding, so I'm probably missing something obvious.
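A likely reason for the multiple lines is that plot(X,Y) receives the 3-by-6 data matrix and the label vector, so plot draws one line per row of X against Y instead of a decision boundary. With the bias stored in the third component, the boundary is the set of points where w(1)*x1 + w(2)*x2 + w(3) = 0. Below is a minimal sketch, assuming linearly separable data and a nonzero w(2). The six fixed points and the names maxEpochs, mistakes, x1 and x2 are made up for illustration and are not part of the question's code.

```matlab
% Columns are samples: [x1; x2; 1] (the constant 1 carries the bias)
X = [0.2 0.7 0.4 0.3 0.6 0.9;
     0.3 0.5 0.8 1.4 1.7 1.2;
     1   1   1   1   1   1  ];
Y = [-1 -1 -1 1 1 1];

% Perceptron updates, stopping as soon as a full pass makes no mistake
w = zeros(3,1);
maxEpochs = 1000;
for epoch = 1:maxEpochs
    mistakes = 0;
    for ii = 1:size(X,2)
        if sign(w'*X(:,ii)) ~= Y(ii)
            w = w + X(:,ii)*Y(ii);
            mistakes = mistakes + 1;
        end
    end
    if mistakes == 0, break; end
end

% Decision boundary: solve w(1)*x1 + w(2)*x2 + w(3) = 0 for x2
x1 = [0 1];
x2 = -(w(1)*x1 + w(3))/w(2);

figure; hold on
plot(X(1,Y<0), X(2,Y<0), 'b.')
plot(X(1,Y>0), X(2,Y>0), 'r.')
plot(x1, x2, 'k-')
legend('class -1','class +1','decision boundary')
hold off
```

Two points are enough to draw the boundary because it is a straight line in the (x1, x2) plane.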
I am completely new to Matlab. I am trying to simulate a combined Wiener and Poisson process.
Why do I get "Subscripted assignment dimension mismatch"?
I am trying to simulate
Z(t)=lambda*W^2(t)-N(t)
where W is a Wiener process and N is a Poisson process.
The code I am using is below:
T=500
dt=1
K=T/dt
W(1)=0
lambda=3
t=0:dt:T
for k=1:K
r=randn
W(k+1)=W(k)+sqrt(dt)*r
N=poissrnd(lambda*dt,1,k)
Z(k)=lambda*W.^2-N
end
plot(t,Z)
It is true that some indexing is missing, but I think you would benefit from rewriting your code in a more 'Matlab way'. The following code uses the fact that Matlab's basic variables are matrices and computes the results in a vectorized way. Try to understand this style, as it is the way to use Matlab more efficiently and to write shorter, more readable code:
T = 500;
dt = 1;
K = T/dt;
lambda = 3;
t = 1:dt:T;
sqdtr = sqrt(dt)*randn(K-1,1); % define sqrt(dt)*r as a vector
N = poissrnd(lambda*dt,K,1); % define N as a vector
W = cumsum([0; sqdtr],1); % cumulative sum instead of the loop
Z = lambda*W.^2-N; % combining the processes element-wise
plot(t,Z)
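You forgot the indices. Z(k) is a single element, while the right-hand side is a whole vector, hence the dimension mismatch. The line
Z(k)=lambda*W.^2-N
must be
Z(k)=lambda*W(k).^2-N(k)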
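Both answers above draw N as independent Poisson samples. If N(t) is meant to be a Poisson counting process (an assumption about the intent of the question), it would be the running sum of Poisson increments, just as W is the running sum of Gaussian increments. A minimal sketch under that assumption follows. The names dW and dN are introduced here, and poissrnd requires the Statistics Toolbox.

```matlab
T = 500;
dt = 1;
K = T/dt;
lambda = 3;
t = (0:K)*dt;                       % K+1 time points, starting at t = 0

dW = sqrt(dt)*randn(K,1);           % Gaussian increments of the Wiener process
dN = poissrnd(lambda*dt, K, 1);     % Poisson increments of the counting process

W = [0; cumsum(dW)];                % W(0) = 0
N = [0; cumsum(dN)];                % N(0) = 0, non-decreasing
Z = lambda*W.^2 - N;                % combined process, same length as t

plot(t, Z)
```

Prepending the zero keeps W, N and Z the same length as t, so plot(t,Z) works without trimming.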
The following Matlab code is a regression loop:
for j=1:size(X,2)
    IdentityVector=ones((size(t,1)-1),1);
    Y=X((2:end),j);
    if j==1
        X2=[IdentityVector,X((2:end),((j+1):end)),Diff1X];
    elseif j>1 && j<size(X,2)
        X2=[IdentityVector,X((2:end),(1:(j-1))),X((2:end),((j+1):end)),Diff1X];
    elseif j==size(X,2)
        X2=[IdentityVector,X((2:end),(1:(j-1))),Diff1X];
    end
    [b(:,j)]= regress(Y,X2);
end
This works fine for the beta estimates, since the dimensions of beta adjust accordingly for each j. However, if I request additional statistics from the estimation, e.g. [b,bint,r,rint,stats] = regress(y,X), the adjustments I have tried for each j do not work. Any help?
My best guess is that you are treating all the outputs as vectors, which they are not. If you read doc regress, you will see that bint is p-by-2, r is n-by-1, rint is n-by-2 and stats is 1-by-4. That means you can't assign bint(:,j), because bint itself is a matrix. Try cell arrays instead:
[b{j}, bint{j}, r{j}, rint{j}, stats{j}]= regress(Y,X2);
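Put together, the loop with cell outputs might look like the sketch below. It also replaces the three if/elseif branches with a single index vector that leaves out column j. The data are random placeholders: n, p, others, B and R2 are names introduced here, and Diff1X is filled with random numbers only because its definition is not shown in the question.

```matlab
n = 50;                      % number of observations (illustrative)
p = 3;                       % number of columns in X (illustrative)
X = randn(n, p);             % placeholder data
Diff1X = randn(n-1, 1);      % placeholder for the question's Diff1X

b = cell(1,p); bint = cell(1,p); r = cell(1,p); rint = cell(1,p); stats = cell(1,p);
for j = 1:p
    Y = X(2:end, j);
    others = [1:j-1, j+1:p];                        % every column except j
    X2 = [ones(n-1,1), X(2:end, others), Diff1X];   % intercept, other columns, Diff1X
    [b{j}, bint{j}, r{j}, rint{j}, stats{j}] = regress(Y, X2);
end

B  = [b{:}];                      % coefficients, one column per regression
R2 = cellfun(@(s) s(1), stats);   % first entry of stats is R^2
```

Outputs that have the same size for every j, such as b here, can be concatenated back into a matrix after the loop.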
I am trying to integrate all the 2x2 submatrices A(i-1:i,j-1:j) in Matlab without using a loop. Right now I am doing it in a loop, but it is extremely slow. The code is shown below:
A=rand(100);
t=linspace(0,1,100);
for i=2:length(A)
    for j=2:length(A)
        A_minor=A(i-1:i,j-1:j);
        B(i,j)=trapz(t(j-1:j),trapz(t(i-1:i),A_minor));
    end
end
I'd like to do this without using loops to speed up computation.
If you have the Matlab Image Processing Toolbox, you may be able to use blockproc to do what you want.
http://www.mathworks.com/help/images/ref/blockproc.html
To use blockproc, you need to define a function that is executed on each block of the matrix. Passing the x-values to trapz makes things a little trickier; if you can do without them, the code gets simpler. Here I run trapz without them and scale the result afterwards. Note that blockproc works on distinct (non-overlapping) blocks by default, so a [2 2] block size yields a 50-by-50 result rather than one value per sliding 2x2 window as in your loop.
% Data
foo = rand(100);
t = linspace(0,1,100);
% Execute blockproc on the indexes
fooproc = blockproc(foo, [2, 2], @(x) trapz(trapz(x.data)));
fooproc = fooproc * (t(2)-t(1))^2; % re-scale by the square of the step size
If you need to pass the x values to trapz, the solution gets a bit trickier.
As trapz is a simple function (especially on a 2x2 matrix), you can just compute the result directly, without calling a function:
t = linspace(0,1,100); % Note that this is a step size of 0.010101
A = rand(100);
B = nan(size(A));
Atmp = (A(1:end-1,:) + A(2:end,:))/2;
Atmp = (Atmp(:,1:end-1) + Atmp(:,2:end))/2;
B(2:end,2:end) = Atmp * (t(2)-t(1))^2;
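This should give the same result as your for loop for B(2:end,2:end), up to floating-point rounding, but much faster. The first row and column are NaN here, whereas the loop leaves them at zero.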
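The same averaging can also be written as a 2-D convolution with a 2x2 kernel, since on a uniform grid each trapezoid integral is the mean of the four corner values times the squared step size. The sketch below checks this against the double loop on a small matrix. It assumes a uniform grid, and the names n, dt, Bloop, Bconv and maxDiff are introduced only for this comparison.

```matlab
n = 20;                              % small size so the reference loop stays fast
t = linspace(0,1,n);
A = rand(n);
dt = t(2) - t(1);                    % uniform step size

% Reference: the double loop from the question
Bloop = nan(n);
for i = 2:n
    for j = 2:n
        Bloop(i,j) = trapz(t(j-1:j), trapz(t(i-1:i), A(i-1:i, j-1:j)));
    end
end

% Convolution: mean of each sliding 2x2 window, scaled by dt^2
Bconv = nan(n);
Bconv(2:end,2:end) = conv2(A, ones(2)/4, 'valid') * dt^2;

maxDiff = max(max(abs(Bloop(2:end,2:end) - Bconv(2:end,2:end))));
fprintf('largest difference: %g\n', maxDiff);
```

For a non-uniform grid the scalar dt^2 would have to be replaced by the outer product of the row and column step sizes.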