This question already has answers here:
Test if a data distribution follows a Gaussian distribution in MATLAB
(3 answers)
Closed 8 years ago.
I have a data set and I want to find which distribution fits it. How can I check different distributions against this data set? Is there any code or automated way to do that in MATLAB?
Thanks.
I think what you're looking for is called the Bayesian Information Criterion, or BIC. Check it out on Wikipedia. Then pick several candidate distributions, calculate the BIC for each distribution on your data, and finally see which one has the best (lowest) BIC.
Although I make this sound like a simple problem, it actually isn't. For many distributions, calculating the BIC requires numerical optimization over the parameters of the distribution. However, for some distributions MATLAB can compute the Maximum Likelihood Estimate (MLE) for you automatically, which is part of what you'll need for the BIC.
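For example, a minimal sketch of this comparison (assuming the Statistics and Machine Learning Toolbox and that data is a column vector of your observations; the candidate families below are just examples):
candidates = {'Normal', 'Lognormal', 'Gamma', 'Exponential'};  % Lognormal/Gamma need positive data
n   = numel(data);
bic = zeros(size(candidates));
for k = 1:numel(candidates)
    pd     = fitdist(data, candidates{k});    % MLE fit of this family to the data
    nll    = negloglik(pd);                   % negative log-likelihood at the MLE
    p      = numel(pd.ParameterValues);       % number of estimated parameters
    bic(k) = 2*nll + p*log(n);                % BIC = 2*NLL + p*log(n)
end
[~, best] = min(bic);                         % the lowest BIC wins
fprintf('Best fit by BIC: %s\n', candidates{best});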
This question already has an answer here:
How to set the same initial seed random numbers in Matlab?
(1 answer)
Closed 6 years ago.
I run an ANN in MATLAB, and its output is not consistent every time I run it. How can I overcome this problem? I use the same data and the same ANN structure.
clear;
clc;
load ('C:\USers\ARMA\Desktop\DATA.txt');
data=DATA;
N=length(data);
DT=data;
X=DT(1:N,1:2);
Y=DT(1:N,3);
H=3;
net=newff(minmax(X),[H,1],{'logsig','purelin'},'traingdx');
net=init(net);
net.trainParam.lr=0.9;        % learning rate
net.trainParam.mc=0.1;        % momentum constant
net.trainParam.epochs=10000;  % maximum number of epochs
net.trainParam.goal=0.001;    % performance goal
net.trainParam.show=1000;     % epochs between progress displays
[net,tr]=train(net,X,Y);
plotperform(tr)
The ANN toolbox uses randomised initial values for the weights and biases, so the results are sensitive to them.
You need to fix them (most simply by fixing the random seed) before training to get reproducible results.
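A minimal sketch of the idea, assuming a MATLAB release with rng (older releases seed rand/randn directly): fix the seed before creating and initialising the network, so every run starts from the same initial weights.
rng(42);                                                 % any fixed seed
net = newff(minmax(X), [H, 1], {'logsig', 'purelin'}, 'traingdx');
net = init(net);                                         % initial weights are now reproducible
[net, tr] = train(net, X, Y);                            % gives the same result on every run
If your network also divides the data randomly (net.divideFcn set to 'dividerand'), the seed covers that too, as long as it is set before each run rather than once per session.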
This question already has an answer here:
How to set the objective function when using linprog in Matlab to solve a system of linear inequalities? [closed]
(1 answer)
Closed 6 years ago.
I want to solve a system of linear inequalities in Matlab, where the unknowns are x(1), x(2), x(3), x(4). I want the entire set of solutions x(1), x(2), x(3), x(4). Hence, I can't use linprog because it gives me just one feasible point.
Clarification: the question https://stackoverflow.com/questions/37258835/how-to-set-the-objective-function-when-using-linprog-in-matlab-to-solve-a-system was about linprog, which, however, gives only one feasible solution. Here I am asking how to find the entire set of solutions.
This is the set of inequalities. Any suggestion?
5x(1)+3x(2)+3x(3)+5x(4)<5
-5x(1)-3x(2)-3x(3)-5x(4)<-3
-x(2)-x(3)<0
x(2)+x(3)<1
-x(1)-x(4)<0
x(1)+x(4)<1
-3x(3)-5x(4)<-1
3x(3)+5x(4)>3
x(3)<1
-x(3)<0
x(4)<1
-x(4)<0
-5x(1)-3x(2)<0
5x(1)+3x(2)<2
x(2)<1
-x(2)<0
x(1)<1
-x(1)<0
With continuous variables we basically have zero, one, or infinitely many solutions, so explicitly listing all solutions is not possible. However, linear programming has the concept of corner points (vertices of the feasible polyhedron), and those points we can enumerate, although with considerable effort.
Here are some links for tools that can do this:
cdd
ppl
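If you want to stay inside MATLAB, here is a rough brute-force sketch of the corner-point idea (my own illustration, not a polished tool). It assumes the system has been written as A*x <= b, with the single ">" row multiplied by -1 first; every vertex is the intersection of 4 constraints taken as equalities, so we try all such choices and keep the points that satisfy the whole system.
n   = size(A, 2);                    % number of unknowns (4 here)
m   = size(A, 1);                    % number of inequalities
tol = 1e-9;
vertices = [];
combos = nchoosek(1:m, n);           % every choice of n constraints to make active
for k = 1:size(combos, 1)
    Ak = A(combos(k, :), :);
    bk = b(combos(k, :));
    if rank(Ak) < n, continue; end   % these constraints do not meet in a single point
    x = Ak \ bk;                     % solve the active constraints as equalities
    if all(A*x <= b + tol)           % keep it only if it satisfies every inequality
        vertices = [vertices; x.'];  % candidate corner point
    end
end
vertices = unique(round(vertices, 8), 'rows');   % drop numerical duplicates
The feasible set is then the convex hull of these vertices. This approach explodes combinatorially for larger systems, which is exactly why dedicated tools such as cdd and ppl exist.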
A different approach is to enumerate the optimal bases using extra binary variables. (You have a zero objective, so this effectively becomes: enumerating all feasible LP bases.) This approach makes the problem a MIP, which we can enumerate with an algorithm like:
1. Solve the MIP.
2. If infeasible: stop.
3. Add a constraint to forbid the current point.
4. Go to step 1.
Here is a link that illustrates this approach to enumerate all optimal bases of a (continuous) LP problem.
Note that enumerating all integer solutions of a system of inequalities is easier. Many constraint programming tools will do that for you automatically. In addition we can use the "cutting plane" technique described above.
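As a sketch of that solve-and-forbid loop for the integer case (here binary, since all variables are bounded between 0 and 1), using intlinprog from the Optimization Toolbox and again assuming the system is encoded as A*x <= b (the strict "<" signs in the question would need to be tightened explicitly for integer variables):
n = size(A, 2);
intcon = 1:n;                           % all variables integer
lb = zeros(n, 1); ub = ones(n, 1);      % 0..1 bounds as in the question
Acut = zeros(0, n); bcut = zeros(0, 1); % accumulated "no-good" cuts
solutions = [];
opts = optimoptions('intlinprog', 'Display', 'off');
while true
    [x, ~, exitflag] = intlinprog(zeros(n, 1), intcon, [A; Acut], [b; bcut], ...
                                  [], [], lb, ub, opts);
    if exitflag <= 0, break; end        % infeasible: every solution has been found
    x = round(x);
    solutions = [solutions; x.'];       % record this integer solution
    Acut = [Acut; 2*x.' - 1];           % cut excluding exactly this 0/1 point:
    bcut = [bcut; sum(x) - 1];          %   sum_{x_i=1} x_i - sum_{x_i=0} x_i <= sum(x) - 1
end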
Would it be accurate to include an expert system in an image classifying application? (I am working with Matlab, have some experience with image processing and no experience with expert systems.)
What I'm planning on doing is adding an extra feature vector that is actually an answer to a question. Is this fine?
For example: assume I have two questions that I want the answers to, Question 1 and Question 2. Knowing the answers to these 2 questions should help classify the test image more accurately. I understand expert systems are coded differently from an image classifier, but my question is: would it be wrong to include the answers to these 2 questions in numerical form (1 for yes, 0 for no) and pass this information, along with the other feature vectors, into a classifier?
If it matters, my current classifier is an SVM.
Regarding training images: yes, they too will be trained with the 2 extra feature vectors.
Converting a set of comments to an answer:
A similar question on Cross Validated already explains that this can be done as long as the data is properly preprocessed.
In short: you can combine them as long as the training (and test) data is properly preprocessed, e.g. standardized. Standardization improves the performance of most linear classifiers because it scales the variables so that they carry similar weight in the learning process, and it improves numerical stability (and often performance) when the variables follow roughly Gaussian distributions.
With that, if the continuous variables are standardized and the categorical variables are encoded as (-1, +1), the SVM should work well. Whether or not it improves the performance of the classifier depends on the quality of those categorical variables.
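A minimal sketch of that preprocessing (the variable names Xcont, answers and labels are hypothetical, and fitcsvm assumes the Statistics and Machine Learning Toolbox):
Xstd  = zscore(Xcont);                 % standardize the continuous image features
Xcat  = 2*answers - 1;                 % encode the yes/no answers as +1/-1
Xall  = [Xstd, Xcat];                  % append the answers as extra feature columns
model = fitcsvm(Xall, labels, 'KernelFunction', 'linear');   % linear SVM on the combined features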
Answering the other question in the comments: when using a kernel SVM with, for example, a chi-square kernel, the rows of the training data are supposed to behave like histograms (all positive and usually L1-normalized), so introducing a (-1, +1) feature breaks the kernel. With an RBF kernel the rows of the data are supposed to be L2-normalized, and again, introducing (-1, +1) features might cause unexpected behaviour (I'm not sure exactly what the effect would be).
I worked on a similar problem. If multiple features can be extracted from your images, you can train a different classifier with each set of features. Think of these classifiers as experts that answer questions based on the features they were trained on. Instead of using labels as outputs, it is better to use confidence values; uncertainty can be very important here. You can use these experts to generate values, and those values can then be combined and used to train another classifier.
This question already has answers here:
How can I extrapolate to higher values in Matlab? [closed]
(2 answers)
Closed 7 years ago.
Currently I have a 1-D vector which, when plotted, gives the blue line in the plot below. Now I want to extend this line based on the data values of the vector I already have (as shown by the red line). I am aware that I could apply simple machine learning to this problem, but is there a built-in MATLAB function which can also achieve this?
What exactly would you call this problem of extending the data? It's not interpolation, and I'm not sure extrapolation is the right concept here. Do not hesitate to ask any questions that would clarify this problem.
Extrapolation is what you're looking for. Since the final part of the curve you want to estimate is rather linear, you can use linear extrapolation.
Let's say our function is f(i)=i for i=1,...,50, with some random noise added.
signal=(1:50)+rand(1,50);
The original signal looks like this:
Now let's say we want to estimate the following 10 samples, that is for i=51,...,60. By means of linear extrapolation, we can append these 10 samples by the following loop:
for i=51:60
signal(i)=signal(i-2)+((i-(i-2))/((i-1)-(i-2)))*(signal(i-1)-signal(i-2));
end
The original formula has been taken from here, with x_star = i, x_{k-1} = i-2, x_k = i-1, y(x_star) the value we're estimating, y_{k-1} = signal(i-2) and y_k = signal(i-1). With these substitutions the update simplifies to signal(i) = 2*signal(i-1) - signal(i-2). Obviously you should adapt this formula to the function you're using; basically you're using the previous 2 values to estimate the new one.
Now that these newly estimated 10 samples have been appended, signal has the form shown in the plot.
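If you prefer a built-in function, interp1 can do the same kind of linear extrapolation; a sketch on the same example signal:
signal   = (1:50) + rand(1, 50);                              % the noisy example signal
extended = interp1(1:50, signal, 1:60, 'linear', 'extrap');   % samples 51..60 are extrapolated
plot(1:60, extended);
Since linear extrapolation just continues the slope of the last two samples, this gives the same straight-line continuation as the loop above.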
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I have a Matlab assignment which is very important for my final grade but I'm not sure how to begin. Basically it has 7 clauses that are related to each other but I will ask about the first 3 because I really just want to know how I can approach the assignment and then do the rest on my own.
Create a random vector that includes 5 random variables that are statistically dependent.
I looked this up and the only thing I could think of is using either the pdf command or the cdf command. Should I use one of them? And how can I make sure the variables are statistically dependent? And since I am not told which distribution to use, is it possible to have Matlab choose one for me?
Show the distribution of each of the random variables on a single plot.
I assume I should just use plot on this one? For this I just need to know how to do the first clause, I guess.
Calculate the covariance matrix of the vector. Check what would have happened if the vector had included 5 variables that are statistically independent.
How do I calculate the covariance matrix? Is there a Matlab command for that?
I really hope you can help me, as I have no idea how to even begin.
Thank you!
(1) There are many ways to create random variables which are dependent. Perhaps the simplest way to make a variable Y that is dependent on another variable X is to take a function of X and add noise. E.g. let X be a Gaussian variable with some mean and variance, then let Y = X^2 + e, where e is a Gaussian variable with mean 0 and any variance. So for your 5 variables: let the first be Gaussian, then let the other 4 be functions of the first plus noise, as in the sketch below.
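A minimal sketch of point (1) (n is just an arbitrary sample count):
n  = 1000;                       % number of samples
X1 = 2 + 3*randn(n, 1);          % Gaussian, mean 2, standard deviation 3
X2 = X1.^2   + randn(n, 1);      % the other four are functions of X1 plus noise
X3 = 0.5*X1  + randn(n, 1);
X4 = sin(X1) + randn(n, 1);
X5 = 4 - X1  + randn(n, 1);
V  = [X1 X2 X3 X4 X5];           % n-by-5 matrix, one dependent variable per column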
(2) Dunno for sure but I think you can use plot for that.
(3) cov computes covariance if I remember correctly.
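A possible sketch for points (2) and (3), assuming the matrix V from the sketch above and ksdensity from the Statistics Toolbox:
figure; hold on
for k = 1:5
    [f, xi] = ksdensity(V(:, k));   % estimated pdf of the k-th variable
    plot(xi, f);                    % all five distributions on one plot
end
hold off
legend('X1', 'X2', 'X3', 'X4', 'X5');
C    = cov(V);           % dependent case: clearly non-zero off-diagonal entries
Vind = randn(n, 5);      % 5 independent standard Gaussian variables for comparison
Cind = cov(Vind);        % off-diagonal entries close to zero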