for loop over the first column of a matrix and mean - matlab

I have code that solves some stochastic equations.
The function call
[S,T]=simByEuler(MDL, nPeriods,'DeltaTime',params.dt);
runs fine. S is a matrix whose first column is theta and whose second column is phi, i.e. S = [theta, phi]. S has 1001 rows, so its size is 1001-by-2. T is a time vector of size 1001-by-1; please ignore T and focus on S.
I want to write a for loop that generates 500 samples of theta and then takes the mean of those 500 samples. The call above uses random parameters, so it should sit inside the for loop; each time the loop runs, the code
[S,T]=simByEuler(MDL, nPeriods,'DeltaTime',params.dt);
generates a new random realisation and therefore a new theta. What I am looking for is a block of these 500 theta columns (remember that theta is a column). This block should be of size 1001-by-500. Then I want to take the mean over all the columns of this block, so the mean should be of size 1001-by-1.
My questions are:
1) How do I write a for loop to generate this block of size 1001-by-500?
2) How do I take the mean of all columns in the block so that the mean is of size 1001-by-1?
I hope my questions are clear and I really appreciate any help.

Is this what you are looking for?
N = 500;                      % number of simulated paths (width of the block)
S = zeros(1001, 2, N);        % preallocate: time steps x [theta, phi] x paths
for k = 1:N
    [S(:,:,k), T] = simByEuler(MDL, nPeriods, 'DeltaTime', params.dt);
end
mean_theta = mean(squeeze(S(:,1,:)), 2);   % 1001-by-1 mean of theta over the N paths
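If you prefer to build the 1001-by-500 block of theta explicitly, as described in the question, a minimal variant could look like this (a sketch assuming MDL, nPeriods and params are already defined):
N = 500;                        % number of samples of theta
theta_block = zeros(1001, N);   % preallocate the 1001-by-500 block
for k = 1:N
    [S, T] = simByEuler(MDL, nPeriods, 'DeltaTime', params.dt);
    theta_block(:, k) = S(:, 1);    % first column of S is theta
end
mean_theta = mean(theta_block, 2);  % 1001-by-1 mean across the 500 columns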

Function presents data wrong, Monte Carlo method

I am trying to write a program that does integration with the Monte Carlo method. One of its features is to place dots on the graph in different colours, blue or red, depending on an if statement. The if statement is inside a for loop, and I don't know why, but it seems like the first branch is ignored after the first iteration. The whole thing looks like this:
but it should look like this:
In addition, I don't know why, but it looks like the plot leaves some additional empty space at the top.
The whole code is not finished yet; it's just a matter of a few lines, but these dots are so annoying that I want to figure out what's wrong first. Here's the code.
function p = montecarlo(f, a, b, n, t)
    % f is a function provided by the user
    % a and b define the integration range
    % n is the number of random points
    % t is a t=a:01:b vector used to draw the plot
    upper = max(f(t));
    lower = min(f(t));
    x = a+(b-a).*(rand(n,1))             % vector of random numbers from a to b
    y = lower+(upper-lower).*(rand(n,1)) % vector of random numbers from min to max
    hold on
    for i=1:n
        if y(i)>=f(i)
            plot(x(i),y(i),'bo')
        else
            plot(x(i),y(i),'ro')
        end
        plot(t,f(t),'k')
    end
end
Arguments provided to the function: f = x.^2+3*x+5, a = -4, b = 2, n = 1000.
The problem is simple. The statement:
if y(i)>=f(i)
is wrong. What you want to do is compare the random value y(i) with the function value at the corresponding point x(i), so it should be:
if y(i)>=f(x(i))
    plot(x(i),y(i),'bo')
else
    plot(x(i),y(i),'ro')
end
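A side note, unrelated to the bug: the curve only needs to be drawn once, and the colouring of the points can be vectorised instead of plotted one dot at a time. A rough sketch, reusing the variables from the question and assuming f accepts a vector (as the example f = x.^2+3*x+5 does):
plot(t, f(t), 'k')                   % draw the curve once, outside any loop
hold on
above = y >= f(x);                   % logical mask: points on or above the curve
plot(x(above),  y(above),  'bo')     % blue dots: on or above the curve
plot(x(~above), y(~above), 'ro')     % red dots: below the curve
hold off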

How can I make each array multiply by each other in a nested for loop, using MATLAB?

Currently, I'm working on a school project involving buck converters. As the current through an inductor increases, its inductance decreases. Each phase adds an inductor, and by adding an inductor the current is divided between the phases. The current is ramped from 0 to 500 A.
My issue with the following code is that it does not seem to use each array value of i_L(i,j) correctly: I get some negative values, which is absolutely wrong.
For example, at 500 A with 10 phases, each inductor carries 50 A, and L is then designed from
i_L(i,j) = current(j)./phases(i) = 500/10 = 50 A
L(i,j) = (-9.22297516731983e-16 * 50^4) + (9.96260934359008e-14 * 50^3) - (3.6355216850551e-12 * 50^2) + (9.0205832462444e-12 * 50) + 1.06054781561763e-7 = 1.04106e-7
and so on, creating a 10x10 = 100-element matrix.
clc; clear all;
phases  = linspace(1,10,10);
current = linspace(0,500,10);
for j = 1:10
    for i = 1:10
        i_L(i,j) = current(j)./phases(i);
        L(i,j)   = (-9.22297516731983e-16*(i_L(i,j).^4)) + (9.96260934359008e-14*(i_L(i,j).^3)) - (3.6355216850551e-12*(i_L(i,j).^2)) + (9.0205832462444e-12*(i_L(i,j).^1)) + 1.06054781561763e-7;
    end
end
Thank you!
Your matrix i_L contains values up to 500 (= 500 (current) / 1 (phase)).
The polynomial you are using produces negative values for arguments greater than roughly 130.
So the operation is using each array value correctly.
Maybe you should re-evaluate the polynomial fit if you are dissatisfied with the result.
Try:
x=[0:1:500];
y=(-9.22297516731983*10^(-16).*(x.^(4)))+(9.96260934359008*10^(-14).*(x.^(3)))-(3.6355216850551*10^(-12).*(x.^(2)))+(9.0205832462444*10^(-12).*(x.^(1)))+1.06054781561763E-07;
plot(x,y)
You will see that the polynomial diverges towards negative infinity for large positive values.
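If you want to see exactly where the sign change happens, a quick sketch (not part of the original answer) is to look at the positive real roots of the fitted quartic:
p = [-9.22297516731983e-16, 9.96260934359008e-14, ...
     -3.6355216850551e-12, 9.0205832462444e-12, 1.06054781561763e-7];  % coefficients in descending powers
r = roots(p);                               % all four roots of the quartic
r = r(abs(imag(r)) < 1e-9 & real(r) > 0)    % keep the real, positive one(s)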

comparing generated data to measured data

We have measured data for which we managed to determine the distribution type it follows (Gamma) and its parameters (A, B).
We then generated n samples (10000) from the same distribution, with the same parameters and in the same range (between 18.5 and 59), using a for loop:
for i=1:1:10000
    tot = makedist('Gamma','A',11.8919,'B',2.9927);
    tot = truncate(tot,18.5,59);
    W(i,:) = random(tot,1,1);
end
Then we tried to fit the generated data using:
h1=histfit(W);
After this we tried to plot the Gamma curve, to compare the two curves in the same figure, using:
hold on
h2=histfit(W,[],'Gamma');
h2(1).Visible='off';
The problem is that the two curves are shifted, as in the following figures (Figure 1 is the generated data from the previous code, and Figure 2 is without truncating the generated data).
Does anyone know why?
Thanks in advance
By default histfit fits a normal probability density function (PDF) on the histogram. I'm not sure what you were actually trying to do, but what you did is:
% fit a normal PDF
h1=histfit(W); % this is equal to h1 = histfit(W,[],'normal');
% fit a gamma PDF
h2=histfit(W,[],'Gamma');
Obviously that will result in different fits, because a normal PDF is not a gamma PDF. What you are seeing is simply that the gamma PDF fits the data better, because you sampled the data from that distribution.
If you want to check whether the data follows a certain distribution you can also use a KS-test. In your case
% check if the data follows the distribution specified in tot
[h, p] = kstest(W,'CDF',tot)
If the data follows a gamma dist. then h = 0 and p > 0.05, else h = 1 and p < 0.05.
Now some general comments on your code:
Please look up preallocation of memory; it will speed up loops greatly. E.g.
W = zeros(10000,1);
for i=1:1:10000
    tot = makedist('Gamma','A',11.8919,'B',2.9927);
    tot = truncate(tot,18.5,59);
    W(i,:) = random(tot,1,1);
end
Also,
tot=makedist('Gamma','A',11.8919,'B',2.9927);
tot= truncate(tot,18.5,59);
does not depend on the loop index and can therefore be moved in front of the loop to speed things up further. It is also good practice to avoid using i as a loop variable, since it shadows the imaginary unit.
But you can actually skip the whole loop, because random() can return multiple samples at once:
tot=makedist('Gamma','A',11.8919,'B',2.9927);
tot= truncate(tot,18.5,59);
W =random(tot,10000,1);
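If the goal is to compare the generated samples against the truncated gamma curve directly, one option (a sketch, assuming the Statistics and Machine Learning Toolbox) is to normalise the histogram to a density and overlay the truncated PDF on it:
tot = makedist('Gamma','A',11.8919,'B',2.9927);
tot = truncate(tot,18.5,59);
W   = random(tot,10000,1);
histogram(W, 'Normalization', 'pdf')           % empirical density of the samples
hold on
xg = linspace(18.5, 59, 200);
plot(xg, pdf(tot, xg), 'r', 'LineWidth', 1.5)  % truncated gamma PDF
hold off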

K-means Stopping Criteria in Matlab?

I'm implementing the k-means algorithm in MATLAB without using the built-in kmeans function. The stopping criterion is that the new centroids do not change between iterations, but I cannot implement it in MATLAB. Can anybody help?
Thanks
Setting "no change" as a stopping criterion is a bad idea. There are a few main reasons you shouldn't use a zero-change condition:
1) Even for a well-behaved function, the difference between zero change and a very small change (say 1e-5) could be 1000+ iterations, so you are wasting time trying to make the centroids exactly the same, especially because computers usually keep far more digits than we are interested in. If you only need 1 digit of accuracy, why wait for the computer to find an answer to within 1e-31?
2) Computers have floating point errors everywhere. Try doing some easily reversible matrix operations like a = rand(3,3); b = a*a*inv(a); a-b. Theoretically this should be 0, but you will see it isn't. These errors alone could prevent your program from ever stopping.
3) Dithering. Let's say we have a 1-D k-means problem with 3 numbers that we want to split into 2 groups. In one iteration the grouping can be {a,b} vs {c}; the next iteration it could be {a} vs {b,c}; the next {a,b} vs {c} again, and so on. This is of course a simplified example, but there can be instances where a few data points dither between clusters, and you end up with a never-ending algorithm. Since those few points keep being reassigned, the change is never 0.
The solution is to use a delta threshold: you subtract the current values from the previous ones, and if the difference is less than a threshold you are done. This on its own is powerful, but as with any loop you need a backup escape plan, and that is setting a max_iterations variable. Look at MATLAB's documentation for kmeans: even it has a MaxIter option (default 100), so even if your k-means doesn't converge, at least it won't run endlessly. Something like this might work:
% problem specific
max_iter = 100;
% choose a small number appropriate to your problem
thresh = 1e-3;
% ensures the loop runs the first time
delta_mu = thresh + 1;
num_iter = 0;
% curr_mu must be initialised (e.g. with your starting centroids) before the loop
% do your kmeans in the loop
while (delta_mu > thresh && num_iter < max_iter)
    % save these right away
    old_mu = curr_mu;
    % calculate new means and variances (the standard kmeans iteration),
    % then store the values in a variable called curr_mu
    curr_mu = newly_calculated_values;
    % use the two-norm to collapse the delta into a single number, no matter
    % what the original dimensionality of mu was; if old_mu - curr_mu is
    % 0 the norm is still 0, so it behaves well as a distance measure
    delta_mu = norm(old_mu - curr_mu, 2);
    num_iter = num_iter + 1;
end
Edit: in case you don't know, the 2-norm is essentially the Euclidean distance.
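For completeness, here is a minimal, self-contained sketch of a full k-means loop with this stopping criterion (variable names are hypothetical; X is an n-by-d data matrix, k is the number of clusters, and empty-cluster handling is omitted):
max_iter = 100;
thresh   = 1e-3;
n = size(X, 1);
curr_mu  = X(randperm(n, k), :);    % initialise the centroids from random data points
delta_mu = thresh + 1;              % ensures the loop runs at least once
num_iter = 0;
while (delta_mu > thresh && num_iter < max_iter)
    old_mu = curr_mu;
    % assignment step: squared Euclidean distance to every centroid
    D = zeros(n, k);
    for c = 1:k
        D(:, c) = sum((X - curr_mu(c, :)).^2, 2);   % implicit expansion (R2016b+)
    end
    [~, labels] = min(D, [], 2);
    % update step: recompute each centroid as the mean of its points
    for c = 1:k
        curr_mu(c, :) = mean(X(labels == c, :), 1);
    end
    delta_mu = norm(old_mu - curr_mu, 2);
    num_iter = num_iter + 1;
end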

Matlab vectorization of multiple embedded for loops

Suppose you have 5 vectors: v_1, v_2, v_3, v_4 and v_5. Each vector contains a range of values from a minimum to a maximum, so for example:
v_1 = minimum_value:step:maximum_value;
Each of these vectors uses the same step size but has a different minimum and maximum value, so they are each of a different length.
A function F(v_1, v_2, v_3, v_4, v_5) depends on these vectors and can use any combination of the elements within them (apologies for the poor explanation). I am trying to find the maximum value of F and record the values that produced it. My current approach has been to use multiple nested for loops, as shown below, to evaluate the function for every combination of the vectors' elements:
% Set temp to a small starting value
temp = 0;
% For every combination of the five vectors evaluate the function. If the result
% is greater than the one calculated previously, store it along with the values
% (positions) of the elements within the vectors
for a = 1:length(v_1)
    for b = 1:length(v_2)
        for c = 1:length(v_3)
            for d = 1:length(v_4)
                for e = 1:length(v_5)
                    % The function is a combination of trigonometrics, summations,
                    % multiplications etc.
                    Result = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
                    % If the value of Result is greater than the previous value,
                    % store it and record the values of 'a','b','c','d' and 'e'
                    if Result > temp
                        temp = Result;
                        f = a;
                        g = b;
                        h = c;
                        i = d;
                        j = e;
                    end
                end
            end
        end
    end
end
This gets incredibly slow for small step sizes. If there are around 100 elements in each vector, the number of combinations is around 100*100*100*100*100 = 100^5. This is a problem, as I need small step values to get a suitably converged answer.
I was wondering if it is possible to speed this up using vectorisation or any other method. I also looked at generating the combinations prior to the calculation, but this seemed even slower than my current method. I haven't used MATLAB for a long time, but just looking at the number of nested for loops makes me think that this can definitely be sped up. Thank you for any suggestions.
No matter how you generate your parameter combinations, you will end up calling your function F 100^5 times. The easiest improvement would be to use parfor instead, in order to exploit multi-core computation. If you do that, you should store the calculation results and find the maximum after the loop, because your current approach would not be thread-safe.
Having said that, and not knowing anything about your actual problem, I would advise you to implement a more structured approach: first find a coarse solution with a bigger step size, then narrow it down successively by reducing the min/max values of your parameter intervals. What you have currently is the absolute brute-force method, which will never be very effective.
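A rough sketch of the parfor variant described above (it stores a per-iteration best and reduces afterwards, which keeps the loop thread-safe; F and v_1 ... v_5 are as in the question, and the Parallel Computing Toolbox is assumed):
n1 = length(v_1);
best_val = -inf(n1, 1);        % best function value found for each value of a
best_idx = zeros(n1, 5);       % corresponding indices [a b c d e]
parfor a = 1:n1
    local_best = -inf;
    local_idx  = [a 1 1 1 1];
    for b = 1:length(v_2)
        for c = 1:length(v_3)
            for d = 1:length(v_4)
                for e = 1:length(v_5)
                    Result = F(v_1(a), v_2(b), v_3(c), v_4(d), v_5(e));
                    if Result > local_best
                        local_best = Result;
                        local_idx  = [a b c d e];
                    end
                end
            end
        end
    end
    best_val(a)    = local_best;   % sliced outputs, so the loop stays parfor-safe
    best_idx(a, :) = local_idx;
end
[temp, a_best] = max(best_val);    % overall maximum and the indices that produced it
indices = best_idx(a_best, :);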