Jensen–Shannon divergence analysis of numbers - distance

The Jensen–Shannon divergence is a method of measuring the similarity between two probability distributions, and it is bounded by 1 (0 <= JSD(p,q) <= 1). I have applied a Python implementation of the Jensen–Shannon divergence and I want to analyse my results, but I can't work out what the resulting numbers mean. What is the meaning of JSD(p,q) = 1 or JSD(p,q) = 0?

If you're calculating the Jensen–Shannon distance, JSD(p,q) = 0 means the two distributions are identical. The closer the value is to 1 (the maximum, which is reached when the two distributions have completely disjoint support, assuming base-2 logarithms), the more different they are.
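To see the two extremes numerically, here is a minimal sketch of the divergence itself (my own illustration in MATLAB rather than the asker's Python code; p and q are assumed to already be valid probability vectors):
% Jensen-Shannon divergence with base-2 logarithms, so 0 <= JSD(p,q) <= 1
p = [1 0 0];
q = [0 0 1];                                          % no overlap with p; set q = p for the other extreme
m   = (p + q) / 2;                                    % mixture distribution
kl  = @(a,b) sum(a(a>0) .* log2(a(a>0) ./ b(a>0)));   % KL divergence, base 2 (0*log 0 taken as 0)
JSD = 0.5*kl(p, m) + 0.5*kl(q, m)                     % 1 here; 0 when q = p
With q equal to p the result is exactly 0 (identical distributions); with the non-overlapping p and q above it is exactly 1.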

Related

MATLAB: How to compute the similarity of two signals and get the correct consistency or coherence metric

I was wondering about the consistency metric. As I understand it, it lets us gauge the similarity between two signals, right? If so, when the value is high (say 0.5 to 1), does that mean there is a strong similarity between the signals? And when the value is low (say 0.1-0.43), does that indicate poor coherence between the signals (poor similarity, i.e. the signals are probably different)? Finally, if the metric is < 0, does that mean the signals are totally different? I ask because I'm getting negative numbers. Is this interpretation valid?
Can someone give me a clear explanation of the consistency metric of a signal? Here is my small code and figure. Thanks in advance.
s1 = signal3;
s2 = signal4;
if ~isequal(s1, s2)
    C1 = xcorr(s1);                  % auto-correlation of each signal
    C2 = xcorr(s2);
    signal_mix  = C1 .* C2;          % mixing vector
    signal_mix1 = signal_mix;
else                                 % identical signals
    C1 = xcorr(s1);
    C2 = C1;
    signal_mix1 = C1;
end
for i = 1:length(signal_mix1)        % 1:length(...), not length(...), so every lag is visited
    signal_mix1(i) = min(C1(i), C2(i)) / max(C1(i), C2(i));   % per-lag consistency score
end
signal_mix2 = sum(signal_mix1)       % accumulated consistency metric
Depending on your use case, you might want to consider the dynamic time warping (DTW) distance as a similarity metric (MATLAB has a built-in function for it). One problem with using correlation as a metric is that it always compares the same time step of the two signals, so two identical signals where one is time-delayed can end up with a low correlation. The DTW distance addresses this by also allowing comparisons with values at adjacent time steps.
The downside of the DTW distance is that the value itself can't be interpreted on its own, only relative to other distances. You can tell that two signals A & B with a distance of 150 are more similar than A & C with a distance of 250, but a distance of 150 on its own doesn't tell you much.
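A small sketch of the idea (this assumes the Signal Processing Toolbox, which provides dtw; the signals here are made up purely for the illustration):
t  = linspace(0, 2*pi, 200);
s1 = sin(t);
s2 = sin(t - 0.8);                  % same shape as s1, just delayed
s3 = randn(size(s1));               % unrelated signal
d12 = dtw(s1, s2)                   % small distance: the shapes match despite the delay
d13 = dtw(s1, s3)                   % larger distance: only the comparison of the two values is meaningful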
First of all, you could use the xcorr function to calculate the cross-correlation between two signals.
from Matlab help:
r = xcorr(x,y) returns the cross-correlation of two discrete-time
sequences. Cross-correlation measures the similarity between a vector
x and shifted (lagged) copies of a vector y as a function of the lag.
If x and y have different lengths, the function appends zeros to the
end of the shorter vector so it has the same length as the other.
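For instance, the lag at which the cross-correlation peaks tells you by how much one signal is shifted relative to the other (the vectors below are made up for the illustration):
x = [0 0 1 2 3 2 1 0 0];
y = [1 2 3 2 1 0 0 0 0];            % the same pulse, occurring two samples earlier
[r, lags] = xcorr(x, y);
[~, k] = max(r);
lags(k)                             % 2 -> the pulse in y occurs two samples earlier than in x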
additionally you could use xcov:
xcov computes the mean of its inputs, subtracts the mean, and then
calls xcorr.
The result of xcov can be interpreted as an estimate of the covariance
between two random sequences or as the deterministic covariance
between two deterministic signals.
In the case of your example you are calling xcorr with a single signal, so it computes the auto-correlation, i.e. the correlation of the signal with lagged copies of itself.
update:
based on the comment, it seems you need the linear correlation, which can be calculated with the corr function:
p=corr(x,y)
The value of p is 1 when x and y behave exactly like each other, and -1 when they behave exactly opposite to each other.
When p is 0, there is no (linear) correlation between the two signals.
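A quick illustration of the three cases (corr is in the Statistics and Machine Learning Toolbox; corrcoef in base MATLAB gives the same number):
x = (1:100)';
corr(x,  2*x + 3)        %  1  -> perfectly linearly related
corr(x, -x)              % -1  -> exactly opposite behaviour
corr(x, randn(100,1))    % ~0  -> no linear relationship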

Poor performance help - multi-class classification by ANN

I'm implementing a 7-class classification task with normalised features and one-hot encoded labels. However, the training and validation accuracies have been extremely poor.
As shown, I normalised the features with the StandardScaler() method, and each feature vector turns out to be a 54-dimensional numpy array. I also one-hot encoded the labels in the following manner.
As illustrated below, the labels are (num_y, 7) numpy arrays.
My network architecture:
It is shown here how I designed my model. I'm wondering whether the poor results have something to do with the choice of loss function (I've been using categorical cross-entropy).
I'd appreciate any response. Thanks a lot!
The way the accuracy is being computed is most likely what's wrong. The code in question is not provided in your post, but I suspect you are comparing the true labels directly with your model outputs. Your model probably returns a vector of dimensionality 7 which constitutes a probability distribution over the classes (due to the softmax activation in your final layer), like this:
model returns: (0.7 0 0.02 0.02 0.02 0.04 0.2) -- they sum to 1 because they represent probabilities
and then you are comparing these numbers with: (1 0 0 0 0 0 0)
What you have to do is translate the model output into the corresponding predicted label: (0.7 0 0.02 0.02 0.02 0.04 0.2) corresponds to (1 0 0 0 0 0 0) because the first output neuron has the largest value (0.7). You can do that by applying an argmax (the index of the maximum) to the model outputs.
To make sure that's what's wrong with your problem formulation, print the vector you are comparing against the true labels when computing your accuracy, and check whether it consists of 7 numbers that sum up to 1.
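A tiny numeric sketch of that conversion (written in MATLAB for consistency with the rest of this page; in Keras/numpy the equivalent step is np.argmax along the class axis, and the numbers below are invented):
probs  = [0.70 0.00 0.02 0.02 0.02 0.04 0.20;   % model outputs, one row per sample
          0.10 0.60 0.05 0.05 0.05 0.05 0.10];
y_true = [1 0 0 0 0 0 0;                        % one-hot encoded true labels
          0 0 1 0 0 0 0];
[~, pred]  = max(probs,  [], 2);                % predicted class = index of the largest probability
[~, truth] = max(y_true, [], 2);                % true class = index of the single 1
accuracy   = mean(pred == truth)                % 0.5 for this toy data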

How do I write a script to find the cost of each item in a list if only the total cost is known? MATLAB

I am trying to write a script in MATLAB for my class. The scenario is that there are four different types of pens. I only know the total cost of all four pens (total is not actually given to me). I am trying to find the individual cost of each different type of pen. My 3 "friends" also each bought the four pens themselves. That makes for a total of 16 pens among 4 people. Everyone's total cost should be the same. The book suggests creating a matrix for the pens made up of columns for each different type of pen and rows for each of the people (4x4). It also says to have a column vector for the totals each person spent on the pens, which I presume would all be the same. I am stuck and really not sure how to go about solving this since I do not know the cost of even one of the pens. Any help would greatly be appreciated.
@TTT is right, linear algebra solves your task. The great thing about MATLAB is that it can actually do the linear algebra without the fuss of writing for-loops.
Here is a simple example that should suit your case.
Footnote:
Note that the matrix inversion with inv() will be flagged as inefficient by the MATLAB editor, because it is much faster and more accurate to calculate inv(NumPens) * total in one step (written as NumPens\total) than to explicitly compute the inverse of the matrix first -- but for teaching the linear algebra, the explicit version is clearer!
total = [17; 13; 12; 27];   % 4x1 vector: total amount each person paid
NumPens = [1 1 3 1
           1 0 1 1
           0 1 0 2
           3 0 1 1];        % 4x4 matrix: rows = persons, columns = how many pens of each type they bought
% total = NumPens * x       % the original system of equations
x = inv(NumPens) * total    % x = price of each pen type
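For completeness, the backslash form mentioned in the footnote gives the same solution without forming the inverse, and multiplying back is an easy sanity check:
x = NumPens \ total         % same result, solved without the explicit inverse
NumPens * x                 % reproduces the totals [17; 13; 12; 27]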

Cosine distance range interpretation

I am trying to use the cosine distance in pdist2, and I am confused about its output. As far as I know it should be between 0 and 1. Since MATLAB uses 1 - cosine, 1 would be the highest dissimilarity and 0 the lowest. However, the output seems to range from roughly 0.5 to 1.5!
Can somebody please advise me on how to interpret this output?
From help pdist2:
'cosine' - One minus the cosine of the included angle
between observations (treated as vectors)
Since the cosine varies between -1 and 1, the result of pdist2(...'cosine') varies between 0 and 2. If you want the cosine, use 1-pdist2(matrix1,matrix2,'cosine').
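A tiny illustration of the three landmark values (the vectors are made up; pdist2 is in the Statistics and Machine Learning Toolbox):
a = [1 0];  b = [1 0];  c = [0 1];  d = [-1 0];
pdist2(a, b, 'cosine')    % 0 -> same direction   (cosine  1)
pdist2(a, c, 'cosine')    % 1 -> orthogonal       (cosine  0)
pdist2(a, d, 'cosine')    % 2 -> opposite         (cosine -1)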

Approximation for mean(abs(fft(vector)))?

Some MATLAB code I'm trying to simplify goes through the effort of computing an FFT, only to take the absolute value and then the mean:
> vector = [0 -1 -2 -1 0 +1 +2 +1];
> mean(abs(fft(vector)))
ans = 2
All these coefficients are built up, then chopped back down to a single value. The platform I'm working on is really limited, and if I can get by without using FFT, all the better. Is there a way to approximate these operations without FFT?
You can assume the vector will be at most 64 values in length.
I think this is essentially being used as a (very inefficient) measure of the signal's overall magnitude. See Parseval's theorem: the mean of abs(fft(vector)).^2 equals length(vector) times the mean of vector.^2, so the RMS value can be obtained without any FFT at all. Note that mean(abs(fft(vector))) is not exactly the RMS (for your example it gives 2, while the RMS is about 1.22), so check whether the downstream code depends on the precise value. The FFT-free expression is:
sqrt(mean(vector.*vector))
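A quick numeric check of that relationship (this snippet is my addition, base MATLAB only):
vector   = [0 -1 -2 -1 0 1 2 1];
rms_time = sqrt(mean(vector.^2))                              % 1.2247
rms_freq = sqrt(mean(abs(fft(vector)).^2) / numel(vector))    % 1.2247, identical by Parseval's theorem
mean(abs(fft(vector)))                                        % 2, a related but different quantity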
If your vector is real and has zero mean, as in your example, you can take advantage of the fact that the two halves of the DFT are complex conjugates of each other (a basic signal-processing property) and save half of the abs computations. Also, use sum, which is faster than mean:
fft_vector = fft(vector);
len = length(fft_vector)/2;
sum(abs(fft_vector(1:len)))/len    % the second half of the spectrum has the same magnitudes