Removing mean does not give symmetric signal - matlab

I am using 64-bit Windows with Matlab R2017a.
I have Matlab data stored in a vector here. When I plot the data using the command figure; plot(B), it looks like this:
Normally, when you remove the mean from a signal like this which looks almost periodic, the signal becomes symmetric about the x-axis. I tried this using the code B2 = B - mean(B);. Upon plotting with the command figure; plot(B2), I get this:
which is not symmetric (max value is around 0.9 and min value is around -1.25). However, this result is not true for a very similar dataset found here. Before removing the mean, C looks like this:
And after, C2 = C - mean(C) looks like this:
which is symmetric about the x-axis (max value is around 1.1 and min value is around -1.1).
What results in this difference for these two seemingly similar datasets?

"Normally, when you remove the mean from a signal like this which looks almost periodic, the signal becomes symmetric about the x-axis."
That is only true if your values are distributed symmetrically around the mean. And your "looks periodic" is exactly what your dataset is: it looks kind of periodic, but it isn't. You have many more values close to zero than close to -2. You can see this a) in the median, which is -0.1618 for dataset B, and b) visually: the time the signal rests near zero is much longer (approx. 700 samples) than the time it spends around -2.2 (~400 samples).
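A quick way to convince yourself is a made-up signal (an assumed stand-in, not your data) that rests at 0 for 700 samples and at -2.2 for 400 samples in each cycle:

```matlab
% Synthetic stand-in for B (assumed shape, not the real dataset)
oneCycle = [zeros(1,700), -2.2*ones(1,400)];
B = repmat(oneCycle, 1, 5);      % five identical cycles

B2 = B - mean(B);                % remove the mean
fprintf('mean = %.4f, median = %.4f\n', mean(B), median(B));
fprintf('after mean removal: max = %.4f, min = %.4f\n', max(B2), min(B2));
```

The mean is pulled towards the level where the signal spends most of its time, so after subtraction the maximum comes out at 0.8 and the minimum at -1.4, which is the same kind of asymmetry you see in B2.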

While Christian's answer is 100% correct, it doesn't offer a solution to the problem.
To center your signal around the x-axis the way you want, you would need to calculate:
B3 = B - (max(B) + min(B))/2
Note: This only works so nicely because your signal "looks periodic".
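As a small sketch of the midrange idea, here it is applied to a made-up two-level signal (an assumed stand-in, not the real dataset):

```matlab
% Synthetic stand-in for B (assumed shape, not the real dataset)
B = repmat([zeros(1,700), -2.2*ones(1,400)], 1, 5);

B3 = B - (max(B) + min(B))/2;    % subtract the midrange instead of the mean
fprintf('max = %.4f, min = %.4f\n', max(B3), min(B3));
```

Here the extremes end up at +1.1 and -1.1. Keep in mind that max and min are each determined by a single sample, so one outlier or noise spike is enough to shift the result.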

Related

MATLAB Code: Help required on algorithm

I have the following code from my predecessor. I am unable to figure out what math is happening here, how the values avgCov and stdCov differ, and what they signify.
Cprofile_f is a curve similar to a Gaussian curve, i.e. a peak. Cprofile_f is an array of known size (5700).
b1, d1 are index values. Usually, b1 is 2000, d1 is 4300.
avgCov=sum(Cprofile_f(b1:d1))/(d1-b1)
stdCov=0;
for ii=b1:d1
    stdCov = stdCov + sqrt((avgCov - Cprofile_f(ii))^2);
end
stdCov =1- stdCov/(d1-b1)/avgCov
I am trying to figure out what stdCov means here.
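Looks like it's computing the average (avgCov) and standard deviation (stdCov, sort of) in order to compute 1 minus the coefficient of variation (stored in stdCov). "Sort of" because sqrt((avgCov - Cprofile_f(ii))^2) is simply abs(avgCov - Cprofile_f(ii)), so the loop accumulates the mean absolute deviation rather than a true standard deviation.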
https://en.wikipedia.org/wiki/Coefficient_of_variation
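To make that reading concrete, here is a hedged sketch that rewrites the loop in vectorized form and compares the two. The Gaussian-like Cprofile_f below is a hypothetical stand-in, since the real data is not shown; b1 and d1 use the typical values from the question.

```matlab
% Hypothetical stand-in for Cprofile_f: a Gaussian-like peak with 5700 samples
t = 1:5700;
Cprofile_f = exp(-((t - 3150)/1500).^2);
b1 = 2000; d1 = 4300;

% Vectorized version
seg    = Cprofile_f(b1:d1);
n      = d1 - b1;                     % divisor used in the question (seg actually has n+1 samples)
avgCov = sum(seg)/n;                  % (approximate) mean of the segment
madCov = sum(abs(avgCov - seg))/n;    % mean absolute deviation, since sqrt(x^2) == abs(x)
stdCov = 1 - madCov/avgCov;           % 1 minus (mean abs. deviation / mean)

% Loop version from the question, kept for comparison
stdLoop = 0;
for ii = b1:d1
    stdLoop = stdLoop + sqrt((avgCov - Cprofile_f(ii))^2);
end
stdLoop = 1 - stdLoop/(d1-b1)/avgCov;

fprintf('vectorized: %.6f, loop: %.6f\n', stdCov, stdLoop);
```

A value close to 1 means the profile is nearly flat between b1 and d1; the more the curve varies relative to its mean, the smaller the value gets.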

MATLAB Simple - Linear Predictive Coding and Energy Forecasting

I have a dataset with 274 samples (9 months) of the daily energy (watt-hours) used by a residential household. I'm not sure if I'm applying the lpc function correctly.
My code is the following:
filename='9-months.csv';
energy = csvread(filename);
C=zeros(5,1);
counter=0;
N=3;
for n=274:-1:31
    w2=energy(1:n-1,1);
    a=lpc(w2,N);
    energy_estimated=0;
    for X = 1:N
        energy_estimated = energy_estimated + (-a(X+1)*energy(n-X));
    end
    w_real=energy(n);
    error2=abs(w_real-energy_estimated);
    counter=counter+1;
    C(counter,1)=error2;
end
mean_error=round(mean(C));
With "n" being the sample under analysis, I use the values of the energy array from 1 to n-1 to calculate the lpc coefficients (with N=3).
After that, the inner "for" loop applies the calculated coefficients in order to calculate the estimated energy.
Finally, error2 holds the absolute error between the real energy and the estimated value.
In the example presented ( http://www.mathworks.com/help/signal/ref/lpc.html ) some filters are used. Do I need to apply any filter to it? Is my methodology correct?
Thank you very much in advance!
The lpc seems to be used correctly, but there are a few other things about your code. I am addressing the part at the "for n" loop:
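estimates=zeros(274,1); % preallocate the vector of predictions
for n=31:274 % to me it seems more logical to go forward in time
    w2=energy(1:n-1,1);
    a=lpc(w2,N);
    % run the predictor over samples 1..n so that the last output is the estimate of energy(n)
    energy_estimate=filter([0 -a(2:end)],1,energy(1:n,1));
    estimates(n)=energy_estimate(end);
end
err=energy(31:274,1)-estimates(31:274); % named err to avoid shadowing the built-in error function
meanerror=mean(err); % you don't really round mean errors
filter does exactly what you are trying to do with the X=1:N loop, but it performs the calculation for the entire input vector. Its k-th output is the estimate of sample k from the N samples before it, so to get the estimate of energy(n) the input has to run up to sample n (the coefficients still come from w2 only). If you just want the last value, index the result with (end) as well.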
There is also no reason to calculate the error for every single value inside the loop and then add it to a vector; you can do that faster after the loop.
If you are trying to estimate future values with lpc, it could work like that, but you are implying that every value depends only on the last 3 values. Have you tried something like a polynomial approach? I would think that this would be closer to reality.
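To check that filter and the explicit loop give the same one-step prediction, here is a small sketch with made-up numbers. Both the readings and the coefficient vector a are hypothetical; a only mimics the [1 a(2) ... a(N+1)] layout returned by lpc, so no toolbox call is needed.

```matlab
% Hypothetical data and coefficients, for illustration only (no call to lpc)
energy = [10; 12; 11; 13; 12; 14; 13; 15];   % made-up daily readings
a = [1 -0.5 -0.3 -0.2];                      % made-up predictor in lpc's output layout
N = numel(a) - 1;
n = numel(energy);                           % predict the last sample from the N before it

% Explicit loop, as in the question
est_loop = 0;
for X = 1:N
    est_loop = est_loop - a(X+1)*energy(n-X);
end

% Same prediction via filter: output k is the estimate of sample k
est_all    = filter([0 -a(2:end)], 1, energy(1:n));
est_filter = est_all(end);

fprintf('loop: %.4f, filter: %.4f, actual: %.4f\n', est_loop, est_filter, energy(n));
```

Both come out at 13.1 for these numbers (0.5*13 + 0.3*14 + 0.2*12). The first N outputs of filter are based on fewer than N past samples, so only outputs from index N+1 onward are full predictions.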

Matlab fast neighborhood operation

I have a problem. I have a matrix A with integer values between 0 and 5,
for example:
x=randi(5,10,10)
Now I want to apply a 3x3 filter which gives me the most common value in each neighborhood.
I have tried 2 solutions:
fun = @(z) mode(z(:));
y1 = nlfilter(x,[3 3],fun);
which takes very long...
and
y2 = colfilt(x,[3 3],'sliding',@mode);
which also takes long.
I have some really big matrices and both solutions take a long time.
Is there any faster way?
+1 to @Floris for the excellent suggestion to use hist. It's very fast. You can do a bit better though. hist is based on histc, which can be used instead. histc is a compiled function, i.e., not written in Matlab, which is why the solution is much faster.
Here's a small function that attempts to generalize what @Floris did (that solution also returns a vector rather than the desired matrix) and achieve what you're doing with nlfilter and colfilt. It doesn't require that the input have particular dimensions and uses im2col to efficiently rearrange the data. In fact, the first three lines and the call to im2col are virtually identical to what colfilt does in your case.
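function a=intmodefilt(a,nhood)
[ma,na] = size(a);
aa(ma+nhood(1)-1,na+nhood(2)-1) = 0; % zero-padded working copy
aa(floor((nhood(1)-1)/2)+(1:ma),floor((nhood(2)-1)/2)+(1:na)) = a;
edges = min(min(a(:)),0):max(max(a(:)),0); % integer bins, always covering the padding value 0
[~,a(:)] = max(histc(im2col(aa,nhood,'sliding'),edges)); % index of the fullest bin per neighborhood
a = a+edges(1)-1; % convert bin index back to value (also correct when min(a(:)) is not 1)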
Usage:
x = randi(5,10,10);
y3 = intmodefilt(x,[3 3]);
For large arrays, this is over 75 times faster than colfilt on my machine. Replacing hist with histc is responsible for a factor of two speedup. There is of course no input checking so the function assumes that a is all integers, etc.
Lastly, note that randi(IMAX,N,N) returns values in the range 1:IMAX, not 0:IMAX as you seem to state.
One suggestion would be to reshape your array so each 3x3 block becomes a column vector. If your initial array dimensions are divisible by 3, this is simple. If they aren't, you need to work a little bit harder. And you need to repeat this nine times, starting at different offsets into the matrix - I will leave that as an exercise.
Here is some code that shows the basic idea (using only functions available in FreeMat - I don't have Matlab on my machine at home...):
N = 100;
A = randi(0,5*ones(3*N,3*N)); % FreeMat syntax; in MATLAB use A = randi([0 5],3*N,3*N);
B = reshape(permute(reshape(A,[3 N 3 N]),[1 3 2 4]), [ 9 N*N]);
hh = hist(B, 0:5); % histogram of each 3x3 block: bin with largest value is the mode
[mm mi] = max(hh); % mi will contain bin with largest value
figure; hist(B(:),0:5); title 'histogram of B'; % flat, as expected
figure; hist(mi-1, 0:5); title 'histogram of mi' % not flat?...
Here are the plots:
The strange thing, when you run this code, is that the distribution of mi is not flat, but skewed towards smaller values. When you inspect the histograms, you will see that is because you will frequently have more than one bin with the "max" value in it. In that case, you get the first bin with the max number. This is obviously going to skew your results badly; something to think about. A much better filter might be a median filter - the one that has equal numbers of neighboring pixels above and below. That has a unique solution (while mode can have up to four values, for nine pixels - namely, four bins with two values each).
Something to think about.
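The tie-breaking effect is easy to reproduce on a single neighborhood. This is a minimal sketch with a made-up 3x3 block (flattened to a vector) in which four values tie:

```matlab
block  = [1 1 2 2 3 3 4 4 5];    % made-up neighborhood: values 1..4 each occur twice
counts = histc(block, 0:5);      % counts for the values 0..5
[~, idx] = max(counts);          % max returns the first of the tied bins
modeFromHist = idx - 1;          % bin index -> value (bins start at 0)

fprintf('histogram mode: %d, mode(): %d, median(): %d\n', ...
        modeFromHist, mode(block), median(block));
```

Both the histogram approach and the built-in mode pick the smallest of the tied values (1 here), while the median of the same block is 3.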
Can't show you a mex example today (wrong computer); but there are ample good examples on the Mathworks website (and all over the web) that are quite easy to follow. See for example http://www.shawnlankton.com/2008/03/getting-started-with-mex-a-short-tutorial/

Calculating the maximum distance between elements of vector in MATLAB

Let's assume that we have a vector like
x = -1:0.05:1;
ids = randperm(length(x));
x = x(ids(1:20));
I would like to calculate the maximum distance between the elements of x in some idiomatic way. It would be easy to just iterate over all possible combinations of x's elements but I feel like there could be a way to do it with MATLAB's built-in functions in some crazy but idiomatic way.
What about
max_dist = max(x) - min(x)
?
Do you mean the difference between the largest and smallest elements in your vector? If you do, then something like this will work:
max(x) - min(x)
If you don't, then I've misunderstood the question.
This is an interpoint distance computation, although a simple one, since you are working in one dimension. Really that point which falls at a maximum distance in one dimension is always one of two possible points. So all you need do is grab the minimum value and the maximum value from the list, and see which is farther away from the point in question. So assuming that the numbers in x are real numbers, this will work:
xmin = min(x);
xmax = max(x);
maxdistance = max(x - xmin,xmax - x);
As an alternative, some time ago I put a general interpoint distance computation tool up on the file exchange (IPDM). It is smart enough to special case simple problems like the 1-d farthest point problem. This call would do it for you:
D = ipdm(x,'subset','farthest','result','struct');
Of course, it will not be as efficient as the simple code I wrote above, since it is a fully general tool.
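If you want to convince yourself that the min/max shortcut matches the brute-force answer, a quick check along these lines should do. The names D, bruteMax and bruteFarthest are just illustrative, and bsxfun is used so that it also runs on older releases:

```matlab
x = -1:0.05:1;
ids = randperm(length(x));
x = x(ids(1:20));

% Brute force: all pairwise distances in a 20x20 matrix
D = abs(bsxfun(@minus, x(:), x(:).'));
bruteMax      = max(D(:));        % largest distance between any two elements
bruteFarthest = max(D, [], 1);    % for each element, distance to its farthest neighbour

% Shortcuts using only the extremes
fastMax      = max(x) - min(x);
fastFarthest = max(x - min(x), max(x) - x);

fprintf('brute force: %.2f, shortcut: %.2f\n', bruteMax, fastMax);
```

The brute-force version needs O(n^2) memory and time, while the shortcut is a single pass over the vector.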
Uhh... I would love to have MATLAB at hand and it's still early in the morning, but what about something like:
max_dist = max(x(2:end) - x(1:end-1));
I don't know if this is what you are looking for.

modem.oqpskmod for BER

Hi, can anyone show how to use modem.oqpskmod for BER? Thanks!
h = modem.oqpskmod
y = modulate(h, values);
g = modem.oqpskdemod(h)
z = demodulate(g, y)
Let's assume that I have an array called values which contains only 1s and 0s.
My question is: how would I calculate the BER? That is, of course, if my code above is correct.
Based on this Wikipedia page, you simply have to compute the number of incorrect bits and divide by the total number of transferred bits to get the bit error rate (BER). If values is the unmodulated input signal and z is the output signal after modulation and demodulation, you can compute it like this:
BER = sum(logical(values(:)-z(:)))/numel(values);
EDIT: I modified the above code just in case you run into two situations:
1. If z has values other than 0 and 1.
2. If z is a different size than values (i.e. row vector versus column vector).
I don't know if you are ever likely to come across these two situations, but better safe than sorry. ;)
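As a small sanity check of the formula, here is a sketch that does not need the modem objects at all. The bit vectors are made up, and the "received" vector z is simply the input with two bits flipped and a different orientation:

```matlab
values = [0 1 1 0 1 0 0 1 1 0];   % made-up transmitted bits (row vector)
z = values.';                      % pretend receiver output (column vector)
z([3 8]) = 1 - z([3 8]);           % flip two bits to simulate errors

BER  = sum(logical(values(:) - z(:)))/numel(values);   % formula from above
BER2 = mean(values(:) ~= z(:));                        % equivalent one-liner

fprintf('BER = %.2f (alternative: %.2f)\n', BER, BER2);
```

Two errors in ten bits give a BER of 0.2, and the (:) indexing takes care of the row versus column mismatch.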