Variable distributions in Matlab [closed] - matlab

Closed 8 years ago.
I have a Matlab assignment which is very important for my final grade but I'm not sure how to begin. Basically it has 7 clauses that are related to each other but I will ask about the first 3 because I really just want to know how I can approach the assignment and then do the rest on my own.
Create a random vector that includes 5 random variables that are statistically dependent.
I looked this up and the only thing I could think of is using either the pdf command or the cdf command. Should I use either one of them? And how can I make sure the variables are statistically dependent? And since I am not told which distribution to use, is it possible to make Matlab decide that for me?
Show the distribution of each of the random variables on a single plot.
I assume I should just use plot on this one? For this I just need to know how to do the first clause, I guess.
Calculate the covariance matrix of the vector. Check what would have happened if the vector had included 5 variables that are statistically independent.
How do I calculate the covariance matrix? Is there a Matlab command for that?
I really hope you can help me, as I have no idea how to even begin.
Thank you!

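(1) There are many ways to create random variables which are dependent. Perhaps the simplest way to make a variable Y which is dependent on another variable X is to take a function of X and add noise. E.g. let X be a Gaussian variable with some mean and variance, then let Y = X^2 + e where e is a Gaussian variable with mean 0 and any variance. So for your 5 variables: let the first be Gaussian, then let the other 4 be functions of the first plus noise. One caveat: if X has mean 0, then X and X^2 are dependent but uncorrelated, so give X a nonzero mean (or include linear functions of X) if you want the dependence to show up in the covariance matrix.
(2) You can draw all five on one set of axes: call hold on and then plot a histogram (histogram) or a density estimate (ksdensity, from the Statistics Toolbox) for each variable.
(3) cov computes the covariance matrix. If X is a matrix with one row per sample and one column per variable, cov(X) returns the 5-by-5 covariance matrix. For independent variables the off-diagonal entries should be close to zero.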
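A minimal sketch of the three steps above, using only base MATLAB functions (randn, cov, histogram). The mean of 2, the noise level of 0.5, the particular functions of x1, and the variable names (X, Cdep, Cind) are illustrative assumptions, not part of the assignment.

```matlab
rng(0);                              % fix the seed so the run is repeatable
n = 10000;                           % number of samples per variable

% Step 1: five dependent variables, all built from the same base variable x1.
x1    = 2 + randn(n, 1);             % Gaussian, mean 2, variance 1 (assumed values)
noise = 0.5 * randn(n, 4);           % independent zero-mean Gaussian noise
X = [x1, ...
     x1.^2   + noise(:, 1), ...
     3 * x1  + noise(:, 2), ...
     -x1     + noise(:, 3), ...
     sin(x1) + noise(:, 4)];         % n-by-5: rows are samples, columns are variables

% Step 2: empirical distribution of each variable on a single plot.
figure; hold on;
for k = 1:5
    histogram(X(:, k), 'Normalization', 'pdf', 'DisplayStyle', 'stairs');
end
hold off;
legend('X1', 'X2', 'X3', 'X4', 'X5');

% Step 3: covariance matrix of the dependent variables, and of an
% independent set for comparison.
Cdep = cov(X);                       % off-diagonal entries clearly nonzero
Xind = randn(n, 5);                  % five independent Gaussian variables
Cind = cov(Xind);                    % off-diagonal entries close to zero
disp(Cdep); disp(Cind);
```

In this setup Cdep(1,3) should come out near 3, because the third variable is 3*x1 plus independent noise and x1 has variance 1.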

Related

Given a Cost Function, C(weights), that depends on expected and network outputs, how is C differentiated with respect to weights?

I'm building a Neural Network from scratch that categorizes values of x into 21 possible estimates for sin(x). I plan to use MSE as my loss function.
MSE of each minibatch = ||y(x) - a||^2, where y(x) is the vector of network outputs for the x-values in the minibatch and a is the vector of expected outputs that correspond to each x.
After finding the loss, the column vector of all weights in the network is updated: the column vector of Δw's is proportional to the negative of the column vector of partial derivatives of C with respect to each weight.
∇C ≡ (∂C/∂w1, ∂C/∂w2, ...)ᵀ and Δw = −η∇C, where η is the (positive) learning rate.
The problem is, to find the gradient of C, you have to differentiate with respect to each weight. What does that function even look like? It's not just the previously stated MSE right?
Any help is appreciated. Also, apologies in advance if this question is misplaced, I wasn't sure if it belonged here or in a math forum.
Thank you.
(I might add that I have tried to find an answer to this online, but few examples exist that either don't use libraries to do the dirty work or present the information clearly.)
http://neuralnetworksanddeeplearning.com/chap2.html
I had found this a while ago but only now realized its significance. The link describes δ(j,l) as an intermediate value used to arrive at the partial derivative of C with respect to the weights. I will post back here with a full answer if the link above answers my question, as I've seen a few posts similar to mine that have yet to be answered.
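To make the role of that intermediate value concrete, here is a hedged sketch for the simplest possible case: a single sigmoid layer and one training sample, so that C(W) = ||σ(Wx) − a||^2. C depends on the weights only through the network output, and the chain rule gives ∂C/∂W(j,k) = δ(j)·x(k) with δ = 2(y − a)·σ'(z). The layer sizes, the sigmoid activation, and all variable names are assumptions chosen for illustration; a multi-layer network propagates δ backwards layer by layer as the linked chapter describes. The analytic gradient is checked against finite differences.

```matlab
rng(1);                                   % repeatable random numbers
x = randn(3, 1);                          % one input sample (3 features, assumed)
a = rand(2, 1);                           % expected output for that sample
W = randn(2, 3);                          % weights of a single layer

sigma = @(z) 1 ./ (1 + exp(-z));          % sigmoid activation (assumed)
cost  = @(W) sum((sigma(W * x) - a).^2);  % C as a function of the weights

% Analytic gradient via the chain rule.
y     = sigma(W * x);                     % network output
delta = 2 * (y - a) .* y .* (1 - y);      % dC/dz, using sigma' = y.*(1 - y)
gradW = delta * x.';                      % dC/dW(j,k) = delta(j) * x(k)

% Numerical check: central finite differences, one weight at a time.
h = 1e-6;
gradNum = zeros(size(W));
for idx = 1:numel(W)
    E = zeros(size(W));
    E(idx) = h;
    gradNum(idx) = (cost(W + E) - cost(W - E)) / (2 * h);
end
maxErr = max(abs(gradW(:) - gradNum(:))); % should be tiny

% One gradient-descent step, Δw = −η∇C.
eta  = 0.1;
Wnew = W - eta * gradW;
```

So the function being differentiated is still the MSE; it is just viewed as a function of the weights, with the inputs and expected outputs held fixed.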

Solving a system of linear inequalities in Matlab and getting the entire set of solutions [duplicate]

This question already has an answer here:
How to set the objective function when using linprog in Matlab to solve a system of linear inequalities? [closed]
(1 answer)
Closed 6 years ago.
I want to solve a system of linear inequalities in Matlab, where the unknowns are x(1), x(2), x(3), x(4). I want the entire set of solutions x(1), x(2), x(3), x(4). Hence, I can't use linprog because it gives me just one feasible point.
Clarification: This question https://stackoverflow.com/questions/37258835/how-to-set-the-objective-function-when-using-linprog-in-matlab-to-solve-a-system was about linprog, which however gives only one possible solution. Here I'm asking how to find the entire set of solutions.
This is the set of inequalities. Any suggestion?
5x(1)+3x(2)+3x(3)+5x(4)<5
-5x(1)-3x(2)-3x(3)-5x(4)<-3
-x(2)-x(3)<0
x(2)+x(3)<1
-x(1)-x(4)<0
x(1)+x(4)<1
-3x(3)-5x(4)<-1
3x(3)+5x(4)>3
x(3)<1
-x(3)<0
x(4)<1
-x(4)<0
-5x(1)-3x(2)<0
5x(1)+3x(2)<2
x(2)<1
-x(2)<0
x(1)<1
-x(1)<0
With continuous variables we basically have zero, one, or infinitely many solutions. Of course, listing all the solutions is not possible when there are infinitely many. However, there is a concept of corner points (vertices) in linear programming, and those points we can enumerate, although with much effort.
Here are some links for tools that can do this:
cdd
ppl
A different approach is to enumerate the optimal bases using extra binary variables. (You have a zero objective so this becomes effectively: enumerating all feasible LP bases). This approach makes the problem a MIP. We can enumerate this by an algorithm like:
1. solve mip
2. if infeasible: stop
3. add constraint to forbid current point
4. goto step 1
Here is a link that illustrates this approach to enumerate all optimal bases of a (continuous) LP problem.
Note that enumerating all integer solutions of a system of inequalities is easier. Many constraint programming tools will do that for you automatically. In addition we can use the "cutting plane" technique described above.
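For a system as small as the one in the question, the corner points can also be found by brute force in base MATLAB. The sketch below is not what cdd or ppl do internally; it simply relies on the fact that every vertex of the closed region A*x <= b in four dimensions is the intersection of four linearly independent constraint boundaries, so it tries every choice of 4 of the 18 rows. The matrix A and vector b transcribe the inequalities in the order listed, with 3x(3)+5x(4)>3 rewritten as -3x(3)-5x(4)<-3; the tolerance and variable names are assumptions. Because the original inequalities are strict, the vertices themselves lie on the boundary, and the solutions of the strict system are the interior of the region they span.

```matlab
% Each row of A with the matching entry of b encodes one inequality A(i,:)*x < b(i).
A = [ 5  3  3  5;  -5 -3 -3 -5;   0 -1 -1  0;   0  1  1  0;
     -1  0  0 -1;   1  0  0  1;   0  0 -3 -5;   0  0 -3 -5;
      0  0  1  0;   0  0 -1  0;   0  0  0  1;   0  0  0 -1;
     -5 -3  0  0;   5  3  0  0;   0  1  0  0;   0 -1  0  0;
      1  0  0  0;  -1  0  0  0];
b = [5; -3; 0; 1; 0; 1; -1; -3; 1; 0; 1; 0; 0; 2; 1; 0; 1; 0];

tol    = 1e-9;                            % feasibility tolerance (assumed)
combos = nchoosek(1:size(A, 1), 4);       % every choice of 4 constraints
V      = zeros(0, 4);                     % one vertex per row

for k = 1:size(combos, 1)
    rows = combos(k, :);
    Ak   = A(rows, :);
    if rank(Ak) < 4
        continue;                         % boundaries do not meet in a single point
    end
    v = Ak \ b(rows);                     % point where the 4 boundaries intersect
    if all(A * v <= b + tol)              % keep it only if all 18 constraints hold
        V(end + 1, :) = v.';              %#ok<AGROW>
    end
end

V = unique(round(V, 8), 'rows');          % merge duplicates found via different rows
disp(V);
```

This scales badly (the number of combinations grows quickly with the number of constraints and variables), which is why dedicated vertex-enumeration tools are preferable for anything larger.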

How am I to predict and extend data that I have acquired as a 1D vector in MATLAB? [duplicate]

This question already has answers here:
How can I extrapolate to higher values in Matlab? [closed]
(2 answers)
Closed 7 years ago.
Currently I have a 1D vector which, when plotted, gives the blue line in the plot below. Now I want to extend this line based on the data values of the vector I already have (as shown by the red line). I am aware that I could apply simple machine learning to this problem, but is there a built-in MATLAB library function which can also achieve this?
What exactly would you call this problem of extending the data? It's not interpolation, and I'm not sure extrapolation is the right concept. Do not hesitate to ask any questions that would clarify this problem.
Extrapolation is what you're looking for. Since the final part of the curve you want to estimate is rather linear you can use the linear extrapolation.
Let's say our function is f(i)=i for i=1,...,50, with some random noise added.
signal=(1:50)+rand(1,50);
The original signal looks like
Now let's say we want to estimate the following 10 samples, that is for i=51,...,60. By means of linear extrapolation, we can append these 10 samples by the following loop:
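for i = 51:60
    % Line through the previous two samples, evaluated at i. The factor
    % (i-(i-2))/((i-1)-(i-2)) from the general formula always equals 2 here.
    signal(i) = signal(i-2) + 2*(signal(i-1) - signal(i-2));
end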
The original formula has been taken from here, in which x_star=i, x_{k-1}=i-2, x_{k}=i-1, y(x_star) is the value we're estimating, y_{k-1}=signal(i-2), y_{k}=signal(i-1). Obviously you should re-adapt such formula with the function you're using. Basically you're using the previous 2 values to evaluate the new value.
Now that these newly estimated 10 samples have been appended, signal has the form
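Because the loop above uses only the last two samples, the slope of the extension is set entirely by the noise in those two points. A hedged alternative, sketched below, fits a straight line to the last several samples with polyfit and evaluates it at the new indices with polyval; the window length nFit = 20 is an arbitrary assumption. For comparison, interp1 with the 'linear' and 'extrap' options reproduces the two-point extrapolation without a loop.

```matlab
rng(0);                                    % repeatable noise
signal = (1:50) + rand(1, 50);             % same test signal as above

% Least-squares line through the last nFit samples.
nFit = 20;                                 % window length (assumed)
iFit = (50 - nFit + 1):50;                 % indices used for the fit
p    = polyfit(iFit, signal(iFit), 1);     % p(1) = slope, p(2) = intercept

iNew     = 51:60;                          % indices to predict
extended = [signal, polyval(p, iNew)];     % original data plus 10 extrapolated samples

% Two-point linear extrapolation using a built-in function.
twoPoint = interp1(1:50, signal, iNew, 'linear', 'extrap');

plot(1:50, signal, 'b', iNew, extended(iNew), 'r', iNew, twoPoint, 'k--');
legend('data', 'polyfit extrapolation', 'two-point extrapolation');
```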

Matlab Confidence Interval for Degrees of Freedom

I would like to calculate a Confidence Interval along with my Degrees of Freedom (DOF) estimation in Matlab. I am trying to run the following line of code:
[R, DoF, ciDOF] = copulafit('t', U); % fit the copula
The code line without the "ciDOF" arguments takes between 1-3 hours to run with my data. I tried to run the code with the "ciDOF" argument several times, but the calculations seem to take very long (I stopped the calculation after 8 hours). No error message is generated.
Does anyone have experience with this argument and could kindly tell me how long I should expect the calculation to take (the size of my data is 167*19) and if I have specified the "ciDOF" argument correctly?
Many thanks for the help!
Carolin
If your data matrix U is of size 167 x 19, then what you are asking for is a copula fit that depends on 19 dimensions, making your copula a distribution in a 20-dimensional space with 19 dependent variables.
This is almost certainly why it is taking so long: whether it is your intention or not, you are asking MATLAB to solve a minimization problem that takes 19 marginal distributions and comes up with the 19-variate joint distribution (the copula), where each marginal distribution (represented by a 167 x 1 column vector) is uniform.
Most likely this is a limit of the MATLAB implementation, which is iterating through many independent computations and then trying to combine them to fit the joint distribution's ideal conditions.
First and foremost -- and not to be insulting or insinuating -- you should definitely check that you really are trying to find a 19-variate copula. Also, just in case, make sure that your matrix U is oriented in the proper way, because if you have it transposed, you could be trying to ask for the solution to a 167-variate distribution.
But, if this is what you are actually trying to do, there is not really an easy way to predict how long it will take or how long it should take. Even with multiple dimensions, if your marginals are simple or uniform already, that would greatly reduce the copula computation. But, really, there is no way to tell.
Although this may seem like a cop-out, you may actually have better luck switching from MATLAB to R, especially if you have a lot of multivariate data, and you will probably find a lot more functionality in R than in MATLAB. R is freely available and comes with a Graphical User Interface (GUI), in case you aren't comfortable with command-line programming.
There are many more sources, but here is one PDF lecture on computing copula-fits in R:
http://faculty.washington.edu/ezivot/econ589/copulasPowerpoint.pdf
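Before leaving MATLAB, two things may be worth trying; the sketch below shows both on synthetic data, since the real U is not available. First, check the orientation and range of U: copulafit expects one row per observation and one column per variable, with all values strictly between 0 and 1. Second, copulafit accepts a 'Method' option, and 'ApproximateML' is intended to speed up t-copula fits, although as far as I recall the documentation cautions that its estimates and confidence limits may be inaccurate for small to moderate samples, which may apply to 167 observations. The 3-variable correlation matrix, the 5 degrees of freedom, and the sample size of 500 are assumptions made only so the example runs quickly. It requires the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                         % repeatable synthetic data
Rtrue = [1 0.5 0.2; 0.5 1 0.3; 0.2 0.3 1];      % assumed correlation matrix
U = copularnd('t', Rtrue, 5, 500);              % 500 observations, 3 variables

% Sanity checks on the input before fitting.
[nObs, nVar] = size(U);                         % rows = observations, columns = variables
assert(nObs > nVar, 'U may be transposed.');
assert(all(U(:) > 0 & U(:) < 1), 'U must lie strictly inside (0,1).');

% Approximate maximum likelihood fit, with a confidence interval for the DoF.
[Rhat, nuHat, nuCI] = copulafit('t', U, 'Method', 'ApproximateML');
fprintf('DoF estimate %.2f, CI [%.2f, %.2f]\n', nuHat, nuCI(1), nuCI(2));
```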

Explanation of two integral equations and implementation

I have a problem with the two equations shown in the pictures.
I have two vectors representing C(m) and S(m) in the two equations. I am trying to implement these equations in Matlab. Instead of doing a continuous integral operation, I think I should do a summation. For example, for the first equation:
A1 = sqrt(sum(C.^2));
Am I right? Also, I am not sure how to implement equation two, which contains a ||dM||. Please help.
What is the mathematical meaning of these two equations? I think the first one may be related to the 'sum of squares': if C(m) is a vector, does this equation measure the total variance of the random variable in vector C, or some kind of average of vector C? What about the second one?
Thanks very much for your help!
A.
In MATLAB there are typically two different ways to do an integration.
For people who have access to the Symbolic Math Toolbox, symbolic integration is an option. If this is the case for you, I would look into help int and see which inputs you need.
For the rest, numerical integration is available. This basically means that you evaluate the function at many points and form a weighted sum of the function values (for equally spaced points, roughly the mean of the values times the length of the interval). See trapz for sampled data and integral for function handles.
For the mathematical meaning some more context would be helpful, and you may want to ask this question at math.stackexchange.com or on the site of whatever field you are in. (stats, physics?)
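As a hedged illustration of the numerical route: the pictures are not available, so the sketch below assumes the first equation has the form A1 = sqrt(∫ C(m)^2 dm), which is only inferred from the asker's A1 line. It uses C(m) = sin(m) on [0, pi] as a stand-in, for which the exact value is sqrt(pi/2). The point is that a plain sum omits the spacing dm, whereas trapz (or a sum multiplied by dm) approximates the integral.

```matlab
m  = linspace(0, pi, 1001);               % sample points (assumed grid)
C  = sin(m);                              % stand-in for the sampled vector C(m)
dm = m(2) - m(1);                         % spacing between samples

A1_plainSum = sqrt(sum(C.^2));            % asker's version: no dm, depends on grid size
A1_sum      = sqrt(sum(C.^2) * dm);       % Riemann sum, includes the spacing
A1_trapz    = sqrt(trapz(m, C.^2));       % trapezoidal rule on the sampled data
A1_integral = sqrt(integral(@(t) sin(t).^2, 0, pi));  % needs a function handle

fprintf('plain sum %.4f, sum*dm %.4f, trapz %.4f, integral %.4f, exact %.4f\n', ...
        A1_plainSum, A1_sum, A1_trapz, A1_integral, sqrt(pi / 2));
```

If the samples are not equally spaced, trapz(m, C.^2) still works because it takes the sample positions as its first argument.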