How to get the most frequent value of vector on Matlab ? - matlab

I have a vector A that contains let say [1,2,2,4]. I am looking for a way to get the most frequent value on A (here 2).

This is more a statistical question. The technical term for your request is mode.
So, in MATLAB, you can simply do:
A=[1,2,2,4]
[my_value,my_frequency]=mode(A)

Related

Element wise exp() of scipy sparse matrix

I have a very big sparse csc_matrix x. I want to do elementwise exp() on it. Basically what I want is to get the same result as I would have got with numpy.exp(x.toarray()). But I can't do that(my memory won't allow me to convert the sparse matrix into an array). Is there any way out? Thanks in advance!
If you don't have the memory to hold x.toarray(), you don't have the memory to hold the output you're asking for. The output won't be sparse; in fact, unless your input has negative infinities in it, the output probably won't have a single 0.
It'd probably be better to compute exp(x)-1, which is as simple as
x.expm1()
If you want to do something on nonzeros only: the data attribute is writable at least in some representations including csr and csc. Some representations allow for duplicate entries, so make sure you are acting on a "normalised" form.
To change non-zero elements, maybe this would work for you:
x = some big sparse matrix
np.exp( x.data, out=x.data ) # ask np.exp() to store results in existing x.data
presumably slower:
# above seems more efficient (no new memory alloc).
x.data = np.exp( x.data )
I've been wrestling with how to get an element-wise log2() of each non-zero array element. I ended up doing smth like:
np.log2( x.data, out=x.data )
The following two techniques seem like exactly what I was looking for. My matrix is sparse but it still plenty of non-zero elements.
Credit to #DSM here for the idea of directly changing x.data, I think that is a superb insight about sparse matrices.
Credit to #Mike Müller for the idea of using "out" as itself. In the same thread, #kmario23 points out an important caveat about promoting .data to floats (inputs could be int or smth) so it is compatible with the .exp() or whatever function, I would want to do that if I was writing smth for general use.
note: I'm just starting to learn about sparse matrices, so would like to know if this is a bad idea for reason(s) I'm not seeing. Please do let me know if I'm on thin ice with this.
Normally I wouldn't mess with private attributes, but .data shows up pretty clearly in the attributes documentation for the various sparse matrices I've looked at.

Multiexperiments in iddata function from Matlab's system ID toolbox

I am trying to evaluate with iddata (INFO) in matlab, a number of N_E experiments.
I already computed and have as cell arrays of size 1xN_E the outputs and inputs, y and u respectively. Every entry of the cell arrays y and u is a vector of length N=316 (SISO system). For the sake of correctness, period is also a cell array of size 1xN_E, with the period in every entry.
Using the command:
data = iddata(y,u,period);
doesn't produce the expected averaged data-set. Instead, it is handled as a 361x361MIMO system (!).
I've already tried transposing, without results.
data = iddata(y.',u.',period.');
Does someone know why this happens, and how can I produce the desired multi-experiment data-set?
P.S. the documentation I read is for Matlab R2014b, and I am running R2013b. Does someone know if this was not supported in my edition? Or how can I find out?
Actually, the Matlab documentation provides an answer to my question.
The function iddata is very strict regarding how the dimension of output y, input u and period period are defined.
Defining 1xN_experiments cell arrays for y,u and period (note: same size for all!; also N_experimentsx1 won't be recognized by iddata) and then using iddata:
data = iddata(y,u,period);
gives the desired iddata structure.
Note all vectors within y and u must be of same length(!)

matlab zplane function: handles of vectors

I'm interested in understanding the variety of zeroes that a given function produces with the ultimate goal of identifying the what frequencies are passed in high/low pass filters. My idea is that finding the lowest value zero of a filter will identify the passband for a LPF specifically. I'm attempting to use the [hz,hp,ht] = zplane(z,p) function to do so.
The description for that function reads "returns vectors of handles to the zero lines, hz". Could someone help me with what a vector of a handle is and what I do with one to be able to find the various zeros?
For example, a simple 5-point running average filter:
runavh = (1/5) * ones(1,5);
using zplane(runavh) gives an acceptable pole/zero plot, but running the [hz,hp,ht] = zplane(z,p) function results in hz=175.1075. I don't know what this number represents and how to use it.
Many thanks.
Using the get command, you can find out things about the data.
For example, type G=get(hz) to get a list of properties of the zero lines. Then the XData is given by G.XData, i.e. X=G.XData.
Alternatively, you can only pull out the data you want
X=get(hz,'XData')
Hope that helps.

Suppress kinks in a plot matlab

I have a csv file which contains data like below:[1st row is header]
Element,State,Time
Water,Solid,1
Water,Solid,2
Water,Solid,3
Water,Solid,4
Water,Solid,5
Water,Solid,2
Water,Solid,3
Water,Solid,4
Water,Solid,5
Water,Solid,6
Water,Solid,7
Water,Solid,8
Water,Solid,7
Water,Solid,6
Water,Solid,5
Water,Solid,4
Water,Solid,3
The similar pattern is repeated for State: "Solid" replaced with Liquid and Gas.
And moreover the Element "Water" can be replaced by some other element too.
Time as Integer's are in seconds (to simplify) but can be any real number.
Additionally there might by some comment line starting with # in between the file.
Problem Statement: I want to eliminate the first dip in Time values and smooth out using some quadratic or cubic or polynomial interpolation [please notice the first change from 5->2 --->8. I want to replace these numbers to intermediate values giving a gradual/smooth increase from 5--->8].
And I wish this to be done for all the combinations of Elements and States.
Is this possible through some sort of coding in Matlab etc ?
Any Pointers will be helpful !!
Thanks in advance :)
You can use the interp1 function for 1D-interpolation. The syntax is
yi = interp1(x,y,xi,method)
where x are your original coordinates, y are your original values, xi are the coordinates at which you want the values to be interpolated at and yi are the interpolated values. method can be 'spline' (cubic spline interpolation), 'pchip' (piece-wise Hermite), 'cubic' (cubic polynomial) and others (see the documentation for details).
You have alot of options here, it really depends on the nature of your data, but I would start of with a simple moving average (MA) filter (which replaces each data point with the average of the neighboring data points), and see were that takes me. It's easy to implement, and fine-tuning the MA-span a couple of times on some sample data is usually enough.
http://www.mathworks.se/help/curvefit/smoothing-data.html
I would not try to fit a polynomial to the entire data set unless I really needed to compress it, (but to do so you can use the polyfit function).

How to get level of fitness of data to a distribution by using probplot() in Matlab?

I have 2 sets of data of float numbers, set A and set B. Both of them are matrices of size 40*40. I would like to find out which set is closer to the normal distribution. I know how to use probplot() in matlab to plot the probability of one set. However, I do not know how to find out the level of the fitness of the distribution is.
In python, when people use problot, a parameter ,R^2, shows how good the distribution of the data is against to the normal distribution. The closer the R^2 value to value 1, the better the fitness is. Thus, I can simply use the function to compare two set of data by their R^2 value. However, because of some machine problem, I can not use the python in my current machine. Is there such parameter or function similar to the R^2 value in matlab ?
Thank you very much,
Fitting a curve or surface to data and obtaining the goodness of fit, i.e., sse, rsquare, dfe, adjrsquare, rmse, can be done using the function fit. More info here...
The approach of #nate (+1) is definitely one possible way of going about this problem. However, the statistician in me is compelled to suggest the following alternative (that does, alas, require the statistics toolbox - but you have this if you have the student version):
Given that your data is Normal (not Multivariate normal), consider using the Jarque-Bera test.
Jarque-Bera tests the null hypothesis that a given dataset is generated by a Normal distribution, versus the alternative that it is generated by some other distribution. If the Jarque-Bera test statistic is less than some critical value, then we fail to reject the null hypothesis.
So how does this help with the goodness-of-fit problem? Well, the larger the test statistic, the more "non-Normal" the data is. The smaller the test statistic, the more "Normal" the data is.
So, assuming you have converted your matrices into two vectors, A and B (each should be 1600 by 1 based on the dimensions you provide in the question), you could do the following:
%# Build sample data
A = randn(1600, 1);
B = rand(1600, 1);
%# Perform JB test
[ANormal, ~, AStat] = jbtest(A);
[BNormal, ~, BStat] = jbtest(B);
%# Display result
if AStat < BStat
disp('A is closer to normal');
else
disp('B is closer to normal');
end
As a little bonus of doing things this way, ANormal and BNormal tell you whether you can reject or fail to reject the null hypothesis that the sample in A or B comes from a normal distribution! Specifically, if ANormal is 1, then you fail to reject the null (ie the test statistic indicates that A is probably drawn from a Normal). If ANormal is 0, then the data in A is probably not generated from a Normal distribution.
CAUTION: The approach I've advocated here is only valid if A and B are the same size, but you've indicated in the question that they are :-)