I am tasked with creating a t-distribution for a homework problem. I have created the code, but I get a result that doesn't look like a t-distribution. What am I doing wrong?
Task:
u=0
n=20
for i=1:5000;
r=randn(20,1);
x(i)=mean(r);
s(i)=std(r);
t(i)=(x-u)/(s/sqrt(n)) ;
end
hist(t)
Hmm, I suspect that you are not using the operator that you think you are using. division is not just limited to scalars, and here you're accidentally getting a scalar result from a matrix operation.
Hint: when you calculate the ith value of t, you probably only want to be using the ith terms for mean and standard deviation.
As pointed out by Matt, you have forgotten to iterate through the means and standard deviation values. What are you doing now is dividing two arrays. Matlab interprets your code then as scalar product of array x and transposed array s. That is why the result is a scalar and the error is not so easy to spot.
Updated code should be fine:
clc
clear
u=0.0
n=20
for i=1:5000
r=randn(n,1);
x(i)=mean(r);
s(i)=std(r);
t(i)=(x(i)-u)/(s(i)/sqrt(n)) ;
end
hist(t)
Generated result for me:
Hint: For small scripts I advise you to add clc (clear command window) and clear (clears workspace) command lines. Sometimes there might be a lot of garbage from previous runned scripts that might spoil the result, and clearing command window definitely make it easier to debug, at least for me.
Related
I would like to optimize this piece of Matlab code but so far I have failed. I have tried different combinations of repmat and sums and cumsums, but all my attempts seem to not give the correct result. I would appreciate some expert guidance on this tough problem.
S=1000; T=10;
X=rand(T,S),
X=sort(X,1,'ascend');
Result=zeros(S,1);
for c=1:T-1
for cc=c+1:T
d=(X(cc,:)-X(c,:))-(cc-c)/T;
Result=Result+abs(d');
end
end
Basically I create 1000 vectors of 10 random numbers, and for each vector I calculate for each pair of values (say the mth and the nth) the difference between them, minus the difference (n-m). I sum over of possible pairs and I return the result for every vector.
I hope this explanation is clear,
Thanks a lot in advance.
It is at least easy to vectorize your inner loop:
Result=zeros(S,1);
for c=1:T-1
d=(X(c+1:T,:)-X(c,:))-((c+1:T)'-c)./T;
Result=Result+sum(abs(d),1)';
end
Here, I'm using the new automatic singleton expansion. If you have an older version of MATLAB you'll need to use bsxfun for two of the subtraction operations. For example, X(c+1:T,:)-X(c,:) is the same as bsxfun(#minus,X(c+1:T,:),X(c,:)).
What is happening in the bit of code is that instead of looping cc=c+1:T, we take all of those indices at once. So I simply replaced cc for c+1:T. d is then a matrix with multiple rows (9 in the first iteration, and one fewer in each subsequent iteration).
Surprisingly, this is slower than the double loop, and similar in speed to Jodag's answer.
Next, we can try to improve indexing. Note that the code above extracts data row-wise from the matrix. MATLAB stores data column-wise. So it's more efficient to extract a column than a row from a matrix. Let's transpose X:
X=X';
Result=zeros(S,1);
for c=1:T-1
d=(X(:,c+1:T)-X(:,c))-((c+1:T)-c)./T;
Result=Result+sum(abs(d),2);
end
This is more than twice as fast as the code that indexes row-wise.
But of course the same trick can be applied to the code in the question, speeding it up by about 50%:
X=X';
Result=zeros(S,1);
for c=1:T-1
for cc=c+1:T
d=(X(:,cc)-X(:,c))-(cc-c)/T;
Result=Result+abs(d);
end
end
My takeaway message from this exercise is that MATLAB's JIT compiler has improved things a lot. Back in the day any sort of loop would halt code to a grind. Today it's not necessarily the worst approach, especially if all you do is use built-in functions.
The nchoosek(v,k) function generates all combinations of the elements in v taken k at a time. We can use this to generate all possible pairs of indicies then use this to vectorize the loops. It appears that in this case the vectorization doesn't actually improve performance (at least on my machine with 2017a). Maybe someone will come up with a more efficient approach.
idx = nchoosek(1:T,2);
d = bsxfun(#minus,(X(idx(:,2),:) - X(idx(:,1),:)), (idx(:,2)-idx(:,1))/T);
Result = sum(abs(d),1)';
Update: here are the results for the running times for the different proposals (10^5 trials):
So it looks like the transformation of the matrix is the most efficient intervention, and my original double-loop implementation is, amazingly, the best compared to the vectorized versions. However, in my hands (2017a) the improvement is only 16.6% compared to the original using the mean (18.2% using the median).
Maybe there is still room for improvement?
Code
clear;clc
T=800;
Pc=48.45;
Tc=375;
w=0.153;
R=82.06;
a=((0.45724)*(R^2)*(Tc^2))/Pc;
b=((0.07780)*R*Tc)/Pc;
B=(0.37464+(1.54226*w)-(0.26992*(w^2)));
Tr=T/Tc;
s=(1+(B*(1-sqrt(Tr))))^2;
for Vm=90:5:1000
P=((R*T)/(Vm-b))-((a*s)/((Vm)^2+(2*b*Vm)-b^2));
end
plot(Vm, P)
Problem
Every time I run this code, it comes out with a completely empty plot with just numbers on both axes as the image shown below. I've checked my code a few times, but I still can't find the problem, especially since the code runs with no errors. The result I am supposed to be getting on this plot is the behavior of P as the value of Vm increases.
The result of the code
Additional information about the source of the question
Here's the original question if you're interested (Exercise 1).
The original question (Exercise 1)
Try displaying your variables. You'll see Vm is not an array, rather it's a single-valued scalar. When you loop over Vm it takes one value at a time; it doesn't build an array.
MATLAB can do calculations on multiple values at once, so if you define Vm to be an array and drop the loop I'm guessing it'll work...
Try something like this (replace the for-loop with these lines):
Vm = 90:5:1000
P=((R*T)./(Vm-b))-((a*s)./((Vm).^2+(2*b.*Vm)-b^2));
P will then be an array. Notice we use .* rather than * when multiplying by the array Vm since we want to do element-wise multiplication, rather than matrix multiplication. Similarly we use ./ rather than / and .^ rather than ^.
EDIT: If you need to use a for-loop then you could define both P and Vm as arrays, and then work on each element separately within a loop:
Vm = 90:5:1000;
P = NaN(size(Vm));
for i=1:numel(Vm)
P(i)=((R*T)./(Vm(i)-b))-((a*s)./((Vm(i)).^2+(2*b.*Vm(i))-b^2));
end
Since the above is working on scalar values, it doesn't matter if you use .* or *...
I am trying to vectorise a for loop. I have a set of coordinates listed in a [68x200] matrix called plt2, and I have another set of coordinates listed in a [400x1] matrix called trans1. I want to create a three dimensional array called dist1, where in dist1(:,:,1) I have all of the values of plt2 with the first value of trans1 subtracted, all the way through to the end of trans1. I have a for loop like this which works but is very slow:
for i=1:source_points;
dist1(:,:,i)=plt2-trans1(i,1);
end
Thanks for any help.
If I understood correctly, this can be easily solved with bsxfun:
dist1 = bsxfun(#minus, plt2, shiftdim(trans1,-2));
Or, if speed is important, use this equivalent version (thanks to #chappjc), which seems to be much faster:
dist1 = bsxfun(#minus, plt2, reshape(trans1,1,1,[]));
In general, bsxfun is a very useful function for cases like this. Its behaviour can be summarized as follows: for any singleton dimension of any of its two input arrays, it applies an "implicit" for loop to the other array along the same dimension. See the doc for further details.
Vectorizing is a good first optimization, and is usually much easier than going all in writing your own compiled mex-function (in c).
However, the golden middle-way for power users is Matlab Coder (this also applies to slightly harder problems than the one posted, where vectorization is more or less impossible). First, create a small m-file function around the slow code, in your case:
function dist1 = do_some_stuff(source_points,dist1,plt2,trans1)
for i=1:source_points;
dist1(:,:,i)=plt2-trans1(i,1);
end
Then create a simple wrapper function which calls do_some_stuff as well as defines the inputs. This file should really be only 5 rows, with only the bare essentials needed. Matlab Coder uses the wrapper function to understand what typical proper inputs to do_some_stuff are.
You can now fire up the Matlab Coder gui from the Apps section and simply add do_some_stuff under Entry-Point Files. Press Autodefine types and select your wrapper function. Go to build and press build, and you are good to go! This approach usually bumps up the execution speed substantially with almost no effort.
BR
Magnus
I've got an ODE system working perfectly. But now, I want in each iteration, sort in ascending order the solution vector. I've tried many ways but I could not do it. Does anyone know how to do?
Here is a simplified code:
function dtemp = tanque1(t,temp)
for i=1:N
if i==1
dtemp(i)=(((-k(i)*At*(temp(i)-temp(i+1)))/(y))-(U*As(i)*(temp(i)-Tamb)))/(ro(i)*vol_nodo*cp(i));
end
if i>1 && i<N
dtemp(i)=(((k(i)*At*(temp(i-1)-temp(i)))/(y))-((k(i)*At*(temp(i)-temp(i+1)))/(y))-(U*As(i)*(temp(i)-Tamb)))/(ro(i)*vol_nodo*cp(i));
end
if i==N
dtemp(i)=(((k(i)*At*(temp(i-1)-temp(i)))/(y))-(U*As(i)*(temp(i)-Tamb)))/(ro(i)*vol_nodo*cp(i));
end
end
end
Test Script:
inicial=343.15*ones(200,1);
[t temp]=ode45(#tanque1,0:360:18000,inicial);
It looks like you have three different sets of differential equations depending on the index i of the solution vector. I don't think you mean "sort," but rather a more efficient way to implement what you've already done - basically vectorization. Provided I haven't accidentally made any typos (you should check), the following should do what you need:
function dtemp = tanque1(t,temp)
dtemp(1) = (-k(1)*At*(temp(1)-temp(2))/y-U*As(1)*(temp(1)-Tamb))/(ro(1)*vol_nodo*cp(1));
dtemp(2:N-1) = (k(2:N-1).*(diff(temp(1:N-1))-diff(temp(2:N)))*At/y-U*As(2:N-1).*(temp(2:N-1)-Tamb))./(vol_nodo*ro(2:N-1).*cp(2:N-1));
dtemp(N) = (k(N)*At*(temp(N-1)-temp(N))/y-U*As(N)*(temp(N)-Tamb))/(ro(N)*vol_nodo*cp(N));
You'll still need to define N and the other parameters and ensure that temp is returned as a column vector. You could also try replacing N with the end keyword, which might be faster. The two uses of diff make the code shorter, but, depending on the value of N, they may also speed up the calculation. They could be replaced with temp(1:N-2)-temp(2:N-1) and temp(2:N-1)-temp(3:N). It may be possible to collapse these down to a single vectorized equation, but I'll leave that as an exercise for you to attempt if you like.
Note that I also removed a great many unnecessary parentheses for clarity. As you learn Matlab you'll to get used to the order of operations and figure out when parentheses are needed.
I have been happily using MATLAB to solve some project Euler problems. Yesterday, I wrote some code to solve one of these problems (14). When I write code containing long loops I always test the code by running it with short loops. If it runs fine and it does what it's supposed to do I assume this will also be the case when the length of the loop is longer.
This assumption turned out to be wrong. While executing the code below, MATLAB ran out of memory somewhere around the 75000th iteration.
c=1;
e=1000000;
for s=c:e
n=s;
t=1;
while n>1
a(s,t)=n;
if mod(n,2) == 0
n=n/2;
else
n=3*n+1;
end
a(s,t+1)=n;
t=t+1;
end
end
What can I do to prevent this from happening? Do I need to clear variables or free up memory somewhere in the process? Will saving the resulting matrix a to the hard drive help?
Here is the solution, staying as close as possible to your code (which is very close, the main difference is that you only need a 1D matrix):
c=1;
e=1000000;
a=zeros(e,1);
for s=c:e
n=s;
t=1;
while n>1
if mod(n,2) == 0
n=n/2;
else
n=3*n+1;
end
t=t+1;
end
a(s)=t;
end
[f g]=max(a);
This takes a few seconds (note the preallocation), and the result g unlocks the Euler 14 door.
Simply put, there's not enough memory to hold the matrix a.
Why are you making a two-dimensional matrix here anyway? You're storing information that you can compute just as fast as looking it up.
There's a much better thing to memoize here.
EDIT: Looking again, you're not even using the stuff you put in that matrix! Why are you bothering to create it?
The code appears to be storing every sequence in a different row of a matrix. The number of columns of that matrix will be equal to the length of the longest sequence currently found. This means that a sequence of two numbers will be padded with a bunch of right hand zeros.
I am sure you can see how this is incredibly inefficient. That may be the point of the exercise, or it will be for you in this implementation.
Better is to keep a variable like "Seed of longest solution found" which would store the seed for the longest solution. I would also keep a "length of longest solution found" keep the length. As you try every new seed, if it wins the title of longest, then update those variables.
This will keep only what you need in memory.
Short Answer:Use a 2d sparse matrix instead.
Long Answer: http://www.mathworks.com/access/helpdesk/help/techdoc/ref/sparse.html