How to find values of the input data in plotsomhits - matlab

I have used the SOM toolbox in MATLAB on the iris data set. Following the example below and using plotsomhits, I can see how many data values fall into each neuron of the SOM grid. However, I want to know the actual data values that are grouped in each neuron of the given SOM configuration. Is there any way to do that? This is the example I used:
net = selforgmap([8 8]);
view(net)
[net,tr] = train(net,x);
nntraintool
plotsomhits(net,x)

It is not that hard. plotsomhits just plots, in a fancy way, the results of simulating the net.
So if you simulate the net and add up the "hits" per neuron, you have the result.
Basically:
hits=sum(sim(net,x)');
For your net, these were my results, which coincide with the numbers in the plotsomhits figure:
hits= 6 5 0 7 15 7 4 0 8 20 3 3 9 3 0 8 6 3 11 4 5 5 7 10 1
PS: you can learn a lot from this amazing SO answer:
MATLAB: help needed with Self-Organizing Map (SOM) clustering

You need to convert the output vectors to indices first; then you can see which input values each neuron corresponds to.
>> input_neuron_mapping = vec2ind(net(x))';
Now look at the inputs assigned to a neuron.
For example, to see the input values for neuron 2:
>> neuron_2_input_indices = find(input_neuron_mapping == 2)
>> neuron_2_input_values = x(:, neuron_2_input_indices)
Because each column of x is one sample, this displays every input vector from your data that was mapped to neuron 2.
Read here for more details:
https://bioinformaticsreview.com/20220603/how-to-get-input-values-from-som-sample-hits-plot-in-matlab/
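Putting the two answers together, a minimal sketch might look like the following. It assumes the toolbox's bundled iris_dataset as x (the question does not show how x was loaded), and the names winners, hits and valuesPerNeuron are illustrative only.

```matlab
% Assumption: x is the toolbox's iris data, one sample per column (4x150).
x = iris_dataset;
net = selforgmap([8 8]);
net = train(net, x);

outputs  = net(x);                % one column per sample, one row per neuron
winners  = vec2ind(outputs);      % index of the winning neuron for each sample
nNeurons = size(outputs, 1);

% Hit counts per neuron, i.e. the numbers that plotsomhits draws.
hits = accumarray(winners(:), 1, [nNeurons 1])';

% Input vectors grouped by neuron: valuesPerNeuron{k} has hits(k) columns.
valuesPerNeuron = arrayfun(@(k) x(:, winners == k), 1:nNeurons, ...
                           'UniformOutput', false);
```

valuesPerNeuron{2} then holds the same columns as the neuron 2 example above, and empty cells correspond to neurons with zero hits.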

Related

What is the difference between predict and svmclassify?

I tried the following code
data = [27 9 0
11.6723281 28.93422177 0
25 9 0
23 8 0
5.896096039 23.97745722 1
21 6 0
21.16823369 5.292058423 0
4.242640687 13.43502884 1
22 6 0];
Attributes = data(:,1:2);
Classes = data(:,3);
train = [1 3 4 5 6 7];
test = [2 8 9];
%%# Train
SVMModel = fitcsvm(Classes(train),Attributes(train,:))
classOrder = SVMModel.ClassNames
sv = SVMModel.SupportVectors;
figure
gscatter(train(:,1),train(:,2),Classes)
hold on
plot(train(:,1),train(:,2),'ko','MarkerSize',10)
legend('good','bad','Support Vector')
hold off
I tried both predict and svmclassify, but each of them returns an error. What is the basic difference between these two functions?
[label,score] = predict(SVMModel,test);
label = svmclassify(SVMModel, test);
First off, there is a prominent note at the top of the documentation page for svmclassify:
svmclassify will be removed in a future release. See fitcsvm, ClassificationSVM, and CompactClassificationSVM instead.
MATLAB's naming is a bit vague here, as there are many functions named predict that use different schemes and algorithms. You want the one for SVMs, i.e. the predict method of the ClassificationSVM object that fitcsvm returns. In principle it should return the same result as svmclassify, but a different output may result if MATLAB picked a different predict than you intended, or if predict uses a newer algorithm than the deprecated svmclassify.
The conclusion is that you should use the newest functions, so that your code keeps running in future releases and you get the newest algorithms. MATLAB chooses the correct version of predict based on the type of model object you pass as the first argument.
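For reference, here is a hedged sketch of how the training and prediction calls from the question are usually written with the newer API. It assumes the Statistics and Machine Learning Toolbox is installed; trainIdx and testIdx are renamed stand-ins for the question's train and test index vectors (the rename also avoids shadowing other functions called train). Two details differ from the question's code: fitcsvm takes the predictor matrix first and the labels second, and predict expects rows of predictor data rather than row indices. svmclassify, in contrast, works on the struct returned by the older svmtrain, not on a fitcsvm model.

```matlab
data = [27          9           0
        11.6723281  28.93422177 0
        25          9           0
        23          8           0
        5.896096039 23.97745722 1
        21          6           0
        21.16823369 5.292058423 0
        4.242640687 13.43502884 1
        22          6           0];
Attributes = data(:, 1:2);
Classes    = data(:, 3);
trainIdx = [1 3 4 5 6 7];   % rows used for training (hypothetical name)
testIdx  = [2 8 9];         % rows held out for testing (hypothetical name)

% Predictors first, class labels second.
SVMModel = fitcsvm(Attributes(trainIdx, :), Classes(trainIdx));

% Pass the held-out rows themselves, not their indices.
[label, score] = predict(SVMModel, Attributes(testIdx, :));
```

label has one entry per test row and score has one column per class, in the order given by SVMModel.ClassNames.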

Average on contiguous segments of a vector

I'm sure this is a trivial question for a signals person. I need to find the MATLAB function that outputs the average of contiguous segments of length windowSize of a vector, e.g.
origSignal: [1 2 3 4 5 6 7 8 9];
windowSize = 3;
output = [2 5 8]; % i.e. [(1+2+3)/3 (4+5+6)/3 (7+8+9)/3]
EDIT: None of the options presented in "How can I (efficiently) compute a moving average of a vector?" seems to work, because I need the window of size 3 to slide without including any of the previous elements... Maybe I'm missing it. Take a look at my example...
Thanks!
If the length of the original data is always a multiple of windowSize:
mean(reshape(origSignal,windowSize,[]));
Else, in one line:
mean(reshape(origSignal(1:end-mod(length(origSignal),windowSize)),windowSize,[]))
This is the same as before, but the signal is truncated first: the leftover values at the end that do not fill a complete window (fewer than windowSize of them) are dropped.
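As a quick check of the truncating version, here is a small sketch on a signal whose length is not a multiple of the window size. The names nFull and blockMeans are illustrative, and the extra dimension argument to mean is my addition, not part of the answer.

```matlab
origSignal = [1 2 3 4 5 6 7 8 9 10 11];   % 11 samples, so 2 are left over
windowSize = 3;

% Number of samples that fit into complete windows.
nFull = floor(numel(origSignal) / windowSize) * windowSize;

% One window per column, then average down each column.
blockMeans = mean(reshape(origSignal(1:nFull), windowSize, []), 1);
% blockMeans is [2 5 8]
```

Passing 1 as the dimension keeps the result correct for windowSize = 1, where reshape returns a row vector and a plain mean would collapse it to a single number.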

Why should I transpose in a neural network in MATLAB?

I would like to ask a question about the MATLAB transpose operator. For example, in this case:
input=input';
This transposes input, but I want to learn why we should transpose when using an artificial neural network in MATLAB.
My second question is:
I am trying to build a classifier using an ANN in MATLAB. I displayed the results like this:
a=sim(neuralnetworkname,test)
where test is my test data for the neural network,
and the result looks like this:
a =
Columns 1 through 12
2.0374 3.9589 3.2162 2.0771 2.0931 3.9947 3.1718 3.9813 2.1528 3.9995 3.8968 3.9808
Columns 13 through 20
3.9996 3.7478 2.1088 3.9932 2.0966 2.0644 2.0377 2.0653
If a value in a is about 2, the sample is benign; if it is about 4, it is malignant.
So I want to calculate, for example, that there are 100 benign samples among 500 data points (100/500). How can I print this 100/500 to the screen?
I tried to be clear, but if I wasn't clear enough, I can try to explain more. Thanks.
First Question
You don't need to transpose the input values every time. By default, the MATLAB neural network tools read the input column by column, i.e. each column is treated as one sample. So you have two choices: 1. Change the layout of the dataset 2. Transpose the input
Second Question
Suppose you have matrix like this:
a=[1 2 3 4 5 6 7 8 9 0 0 0];
To count how many elements are below 8, write this:
sum(a<8) %[1 2 3 4 5 6 7 0 0 0]
Output will be:
10
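Applied to the network output from the question, the same counting idea could be sketched as follows. The cut-off of 3 (halfway between the benign value 2 and the malignant value 4) is my assumption, and nBenign and nTotal are illustrative names.

```matlab
a = [2.0374 3.9589 3.2162 2.0771 2.0931 3.9947 3.1718 3.9813 2.1528 3.9995 ...
     3.8968 3.9808 3.9996 3.7478 2.1088 3.9932 2.0966 2.0644 2.0377 2.0653];

threshold = 3;                  % assumption: midpoint between 2 and 4
nBenign = sum(a < threshold);   % outputs closer to 2 count as benign
nTotal  = numel(a);

% Prints "9/20 benign (45.0%)" for the values above.
fprintf('%d/%d benign (%.1f%%)\n', nBenign, nTotal, 100 * nBenign / nTotal);
```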

Unreasonable [positive] log-likelihood values from matlab "fitgmdist" function

I want to fit a Gaussian mixture model to a data set that contains about 120k samples, each with about 130 dimensions. I ran the following in MATLAB (with 1000 clusters):
gm = fitgmdist(data, 1000, 'Options', statset('Display', 'iter'), 'RegularizationValue', 0.01);
I get the following outputs:
iter log-likelihood
1 -6.66298e+07
2 -1.87763e+07
3 -5.00384e+06
4 -1.11863e+06
5 299767
6 985834
7 1.39525e+06
8 1.70956e+06
9 1.94637e+06
The log likelihood is bigger than 0! I think it's unreasonable, and don't know why.
Could somebody help me?
First of all, it is not a problem of how large your dataset is.
Here is some code that produces similar results with a quite small dataset:
options = statset('Display', 'iter');
x = ones(5,2) + (rand(5,2)-0.5)/1000;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 64.4731
2 73.4987
3 73.4987
Of course you know that the log function (the natural logarithm) has a range from -inf to +inf. I guess your problem is that you think the input to the log (i.e. the likelihood) should be bounded by [0,1]. However, the likelihood here is the value of a pdf evaluated at the data, which means it can be very large for a very dense dataset.
PDFs must be positive (which is why we can use the log on them) and must integrate to 1. But they are not bounded by [0,1].
You can verify this by reducing the density in the above code
x = ones(5,2) + (rand(5,2)-0.5)/1;
fitgmdist(x,1,'Options',options);
this produces
iter log-likelihood
1 -8.99083
2 -3.06465
3 -3.06465
So I would assume that your dataset contains several duplicate (or very close) values.
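The claim that a density can exceed 1 while still integrating to 1 can also be checked without fitgmdist. This is only an illustrative sketch with a single narrow 1-D Gaussian; sigma = 0.01 is an arbitrary choice of mine and has nothing to do with the asker's data.

```matlab
sigma = 0.01;                                % assumption: a very narrow Gaussian

% Value of the normal pdf at its mean: 1/(sigma*sqrt(2*pi)), about 39.9.
peakDensity = 1 / (sigma * sqrt(2*pi));
logPeak = log(peakDensity);                  % about 3.69, i.e. positive

% The density still integrates to (approximately) 1.
t = linspace(-10*sigma, 10*sigma, 20001);
pdfVals = exp(-t.^2 / (2*sigma^2)) / (sigma * sqrt(2*pi));
area = trapz(t, pdfVals);
```

Each tightly clustered sample therefore contributes a positive term to the summed log-likelihood, which is how the total can grow well above 0.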

Questions about matlab median filter commands

This is a question about Matlab/Octave.
I am seeing some results from medfilt1 (the 1D median filter command in MATLAB) that confuse me.
EDIT: Sorry, I forgot to mention that I am using Octave 3.2.4 for Windows. This is where I see this behavior.
Please see the questions below, and point out if I am missing something.
1] I have a 1D data array b=[ 3 5 -8 6 0];
out=medfilt1(b,3);
I expected the output to be [3 3 5 0 0], but the output shown is [4 3 5 0 3].
How come? What is wrong here?
FYI: the help says it pads the data at the boundaries with 0 (zero).
2] How does medfilt2 (the 2D median filter command in MATLAB) work?
Help says "Each output pixel contains the median value in the m-by-n neighborhood around the corresponding pixel in the input image".
For m=3, n=3, does it take the 3x3 matrix MAT centred on each input pixel and compute median(median(MAT)) to get the median value of the m-by-n neighbourhood?
Any pointers will help.
thank you. -AD
I was not able to replicate your error with MATLAB 7.11.0, but from the information in your question it seems that your version of medfilt1 does not differentiate between an odd and an even n.
When finding the median of a vector of even length, one usually takes the mean of the two middle values,
median([1 3 4 5]) = (3+4)/2 = 3.5
This seems to be what happens in your case. Instead of treating n as odd and setting the value to 3, n is treated as even and your first out value is calculated to be
median([0 3 5]) = (3+5)/2 = 4
and so on. EDIT: This only seems to happen at the endpoints, which suggests that the zero padding is not working properly in your Octave version.
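To confirm the expected values by hand, here is a minimal sketch of a zero-padded median filter for odd n. It is a plain loop written for illustration, not the actual medfilt1 implementation, and half and padded are hypothetical names.

```matlab
b = [3 5 -8 6 0];
n = 3;                                   % window length, assumed odd here
half = (n - 1) / 2;

% Pad both ends with zeros, as the help text describes.
padded = [zeros(1, half) b zeros(1, half)];

out = zeros(size(b));
for k = 1:numel(b)
    % Window centred on b(k), which sits at padded(k + half).
    out(k) = median(padded(k : k + n - 1));
end
% out is [3 3 5 0 0]
```

This matches the [3 3 5 0 0] the asker expected, so the [4 3 5 0 3] result points at the endpoint handling in that Octave version.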
For your second question, you are almost right: it takes the 3x3 neighbourhood around each pixel, but it does not compute median(median(MAT)); it computes median(MAT(:)). There is a difference!
A = [1 2 3
14 5 33
11 7 13];
median(median(A)) = 11
median(A(:)) = 7