vDSP Biquad Remove Transient Signals Swift

I am using the vDSP biquad function from Apple's Accelerate framework https://developer.apple.com/documentation/accelerate/1450838-vdsp_biquad?language=objc#parameters to emulate Python's sosfilt function. The filtering itself works, but the output contains a start-up transient that I am trying to remove. In Python, this is done by computing the initial conditions with sosfilt_zi https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.sosfilt_zi.html#scipy.signal.sosfilt_zi and multiplying them by the first element of the data being filtered. I followed this amazing example: Matlab filtfilt() function implementation in Java.
When I print the initial conditions in Python using sosfilt_zi, I get two values for each section (I'm using a cascade of two second-order sections), so I have four values in total.
However, the single-channel biquad function on iOS asks for a Delay array of length (2*M)+2. I assumed that the initial conditions are related to the Delay and that this is what I need to change in order to adjust my signal and remove the transient.
The documentation for the Delay parameter of the iOS biquad function reads: "An array of single-precision values initialized with direct-form 1 “past” state data for each section of the biquad. The length of the array should be (2 * M) + 2, where M is the number of sections. For each section m, Delay[2m:2m+1] represent the two delayed input values for section m and Delay[2M:2M+1] represent the two delayed output values of the filter. After this function executes, this array contains the final state data of the filters."
If I have 2 sections, then my delay array should look like :
[0:1] - input delay value for section 0
[2:3] - input delay value for section 1
[4:5] - output delay value
I initially set the Delay to an array of 6 zeros. If the output of sosfilt_zi gives the proper values for the two input sections of the Delay, how can I find the delayed output values of the filter?
Is anyone familiar with vDSP and able to remove transient signals, or able to explain how to adjust the Delay values for cascaded second-order sections? Is adjusting the Delay values the correct move, and how could I get these values in Python?
I was thinking that my delay array would look like (using values shown in second link below):
[0.72095219 -0.28358224] - section 0
[-0.80004591 0.80004591] - section 1
However, I don't know what the output values would be, or whether this is even correct. I'd appreciate any help! I've found very little documentation and very few examples for the iOS biquad function. Thank you!
Butterworth Bandpass filter - Initial Conditions is being applied
Output of sosfilt_zi - gives initial conditions
Result of Biquad Filter from iOS vDSP on top and result of sosfilt with initial conditions on bottom
Delays being initialized
Biquad Function
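One way to think about the Delay array, sketched here in Python: sosfilt_zi scaled by the first sample describes a filter that has already settled on a constant input equal to that sample. In direct form 1 the same steady state can be written down directly, because each section's past inputs and past outputs are constants related by the section's DC gain. The sketch below assumes the (2*M)+2 layout quoted above and SciPy-style sos rows (b0, b1, b2, a0, a1, a2); the function names and coefficient values are hypothetical, and the small direct-form 1 simulator is only there to check that the start-up transient disappears. It is not a reimplementation of vDSP_biquad.

```python
def steady_state_delays(sos, x0):
    """Direct-form 1 delays for a cascade that has settled on constant input x0.

    Layout (assumed from the quoted documentation): two past inputs per
    section, followed by two past outputs of the last section.
    """
    delays = []
    level = x0
    for b0, b1, b2, a0, a1, a2 in sos:
        delays += [level, level]                  # past inputs of this section
        level *= (b0 + b1 + b2) / (a0 + a1 + a2)  # DC gain of this section
    delays += [level, level]                      # past outputs of the cascade
    return delays


def df1_cascade(sos, delays, x):
    """Plain direct-form 1 cascade using the same delay layout (for checking)."""
    m_sections = len(sos)
    # hist[m] holds the two most recent samples of the signal entering section m;
    # hist[m_sections] holds the two most recent outputs of the whole cascade.
    hist = [[delays[2 * m], delays[2 * m + 1]] for m in range(m_sections + 1)]
    out = []
    for sample in x:
        v = sample
        for m, (b0, b1, b2, a0, a1, a2) in enumerate(sos):
            y = (b0 * v + b1 * hist[m][0] + b2 * hist[m][1]
                 - a1 * hist[m + 1][0] - a2 * hist[m + 1][1]) / a0
            hist[m] = [v, hist[m][0]]   # shift this section's input history
            v = y
        hist[m_sections] = [v, hist[m_sections][0]]
        out.append(v)
    return out


# Hypothetical two-section cascade (not the Butterworth filter from the question).
sos = [(0.2, 0.4, 0.2, 1.0, -0.5, 0.3),
       (0.5, 0.25, 0.0, 1.0, -0.5, 0.0)]
x0 = 2.0
delays = steady_state_delays(sos, x0)
settled = df1_cascade(sos, delays, [x0] * 20)          # stays flat from sample 0
from_zero = df1_cascade(sos, [0.0] * 6, [x0] * 20)     # shows the transient
```

With these delays a constant input produces a constant output from the first sample, whereas the all-zero Delay array shows the start-up transient. Both values in each pair are equal, so the ordering inside a pair does not matter for this particular initialization.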

Related

Add missing/extra values in data array in Matlab

I have recorded WiFi CSI sensor data: 5000 packets in 5 seconds (5000 packets x 57 subcarriers). Due to the dynamic hardware configuration, I sometimes receive only 4998 x 57. I want to estimate and add the 2 missing rows so that every recording has a consistent 5000 rows x 57 columns.
As you can see, some data are 5000x57 and some are 4998x57.
You can achieve your desired output using the mean() function combined with the concatenation operator [] and repmat(), like this:
A=randi(100,4998,57);
A=[A;repmat(mean(A),2,1)];
Most Matlab functions that take arrays as input operate on each column separately, unless the input array has just 1 row. The mean function does the same, so you can simply append its output to your arrays.
If you show me the code that you used to import the data, I might be able to help you create a cleaner data structure and thus be able to automatically process all of your arrays. The way the data is currently designed it's only possible to do this with dynamic variable names which is considered bad programming practice.
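The padding idea above can also be sketched outside Matlab. The following Python version is a minimal illustration, assuming the data is a list of equal-length rows; the function name pad_with_column_means and the tiny sample matrix are hypothetical and stand in for the 4998x57 recordings.

```python
def pad_with_column_means(rows, target_rows):
    """Append copies of the per-column mean until there are target_rows rows."""
    n_cols = len(rows[0])
    means = [sum(r[c] for r in rows) / len(rows) for c in range(n_cols)]
    missing = max(target_rows - len(rows), 0)
    # Each appended row is an independent copy of the column means.
    return rows + [list(means) for _ in range(missing)]


A = [[1.0, 10.0],
     [3.0, 30.0]]
padded = pad_with_column_means(A, 4)
```

Recordings that already have the target number of rows are returned unchanged, so the same call can be applied to every array.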

what is the difference between defining a vector using linspace and defining a vector using steps?

I am trying to learn the basics of Matlab
and wanted to write a Matlab script.
In this script I defined a vector x with a step "d" of (2*pi/1000),
and I wanted to plot two sine functions of x:
the first with a frequency of 1 and the second with a frequency of 10.3.
This is what I did:
d=(2*pi/1000);
x=-pi:d:pi;
first=sin(x);
second=sin(10.3*x);
plot(x,first,x,second);
My question:
What is the difference between:
x=linspace(-pi,pi,1000);
and ..
d=(2*pi/1000);
x=-pi:d:pi;
I am asking because I thought they were the same, but I think there is something wrong with my assumption.
Also, is there a more efficient way to write a sine function with a given frequency?
The main difference can be summarized as predefined size vs predefined step. Your example highlights it very well (1000 elements vs 1001 elements).
The linspace function produces a fixed-length vector (the length being defined by the third input argument, which defaults to 100) whose lower and upper limits are set, respectively, by the first and the second input arguments. The step is computed internally by the function itself (step = (x2 - x1) / (n - 1)).
The colon operator defines a vector of elements whose values range between the specified lower and upper limits. The step, which is an optional parameter that defaults to 1, determines the vector length: the length of the result is the number of steps needed to reach the upper limit, starting from the lower one. As a side note, on this MathWorks thread you can find a very interesting discussion concerning the behavior of the colon operator with respect to floating-point handling.
Another difference, related to the first one, is that linspace always includes the upper limit value, while the colon operator only contains it if the specified step allows it (0:5:14 = [0 5 10]).
As a general rule, I prefer to use the former when I want to produce a vector of a predefined length (pretty obvious, isn't it?), and the latter when I need to create a sequence whose length has only marginal relevance (or no relevance at all).
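The size-versus-step distinction can be made concrete with a short sketch, written here in Python rather than Matlab. The two helpers are simplified stand-ins, not the Matlab implementations: linspace fixes the element count and derives the step, while colon fixes the step and derives the count. The small tolerance in colon is an assumption made so that floating-point rounding does not drop the endpoint.

```python
import math


def linspace(a, b, n):
    """n points from a to b inclusive; the step is derived from n."""
    step = (b - a) / (n - 1)
    return [a + i * step for i in range(n)]


def colon(a, d, b, tol=1e-9):
    """Points a, a+d, a+2d, ... that do not exceed b; the count is derived from d."""
    count = math.floor((b - a) / d + tol) + 1
    return [a + i * d for i in range(count)]


fixed_length = linspace(-math.pi, math.pi, 1000)           # 1000 elements
fixed_step = colon(-math.pi, 2 * math.pi / 1000, math.pi)  # 1001 elements
```

Dividing the interval into 1000 steps gives 1001 points, which is why the two vectors in the question differ in length by one.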

(matlab matrix operation), Is it possible to get a group of value from matrix without loop?

I'm currently implementing a gradient check function that needs to read certain elements from the result matrix. Could someone tell me how to get a group of values from the matrix?
To be specific, for a result matrix res of size M x N, I need to get the elements res(3,1), res(4,2), res(1,3), res(2,4)...
In my case, M is the dimension and N is the batch size, and there is a label array of size 1 x batch_size, [3 4 1 2...]. So the desired values are res(label(j), j) for each column j = 1:batch_size. I'm trying to practice vectorized programming, so I would rather not use a loop. Could someone tell me how to get this group of values without iteration?
Cheers.
--------------------------UPDATE----------------------------------------------
The only idea I found is to first build a 'mask matrix' and then multiply it element-wise with the original result matrix (technically called the 'Hadamard product', see the wiki). After that, just take the non-zero elements and sum them. The Matlab code should look like:
temp=Mask.*res;
desired_res=temp(temp~=0); %Note: the temp(temp~=0) extract non-zero elements in a 'column' fashion: it searches temp matrix column by column then put the non-zero number into container 'desired_res'.
In my case, what I wanna do next is simply sum(desired_res) so I don't need to consider the order of those non-zero elements in 'desired_res'.
Based on the idea above, creating the mask matrix is the key step. There are two methods to do this.
The code is shown below. The first method uses the accumarray function to put a '1' at each location stored in the matrix 'subs' and '0' everywhere else, which gives a mask matrix of size [row column]. The usage of full(sparse()) is similar. I compared the two methods (repeated around 10 times); full(sparse()) turned out to be faster, with both taking on the order of 10^-4 seconds. That is a small difference, but in large-scale experiments it matters. One benefit of accumarray as used here is that the matrix size is given explicitly, whereas the three-argument call full(sparse(i, j, 1)) creates a matrix of size [max(i), max(j)]; sparse also accepts the size as extra arguments, sparse(i, j, v, m, n). In my case this is sufficient, and I only know a few of the ways these functions can be used. If you find out more, please share with us. Thanks.
Detailed descriptions of these functions can be found on Matlab's official website: accumarray and full, sparse.
% assume we have a label vector
test_labels=ones(10000,1);
% method one, accumarray(subs,1,[row column])
tic
subs=zeros(10000,2);
subs(:,1)=test_labels;
subs(:,2)=1:10000;
k1=accumarray(subs,1,[10, 10000]);
t1=toc % to compare with method two to check which one is faster
%method two: full(sparse(),1)
tic
k2=full(sparse(test_labels,1:10000,1));
t2=toc
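For comparison, the mask-and-multiply approach amounts to picking one element per column, namely row label(j) in column j. In Matlab this kind of selection is commonly written with linear indices (for example via sub2ind); the sketch below shows the same selection in Python, assuming 1-based labels as in the question. The helper name pick_by_label and the sample matrix are hypothetical.

```python
def pick_by_label(res, labels):
    """Return res[labels[j]][j] for every column j (labels are 1-based)."""
    return [res[labels[j] - 1][j] for j in range(len(labels))]


res = [[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34],
       [41, 42, 43, 44]]
labels = [3, 4, 1, 2]

picked = pick_by_label(res, labels)  # elements (3,1), (4,2), (1,3), (2,4)
total = sum(picked)
```

Unlike temp(temp~=0), direct selection keeps entries whose value happens to be zero, which does not change the sum but does matter if the individual values are needed.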

MATLAB Murphy's HMM Toolbox: Inconsistent Output Sequence and Label Statesname and Symbols

Hi, I have been using Murphy's HMM toolbox with Gaussian mixture outputs. In brief, I have 2 datasets for training. Each dataset comprises 2000 observations with 11 dimensions per observation. I implemented the following steps to observe the output path sequence.
N_states=2
N_Gaussian_Mixture=1
For each dataset, an HMM model was generated. The steps are:
Step 1: mixgauss_init() was used to generate the GMM signature for my training data.
Step 2: After declaring the matrices for Prior and Transmat, mhmm_em() was used to generate the HMM model for the training dataset.
Testing: 2 test samples from each dataset were tested using mhmm_logprob(). The outputs were correctly predicted from the log-likelihood scores in every run.
However, when I tried to observe the state sequence of the HMM model (Dataset_123 with testdata_123) via mixgauss_prob() followed by viterbi_path(), the output sequences were inconsistent. For example, in the first run the output sequence can be 2221111111111, but when I rerun the program the sequence can change to 1111111111111 or 1111111111222. Initially I thought it could be due to my Prior matrix. I fixed the Prior value, but that did not help.
Secondly, is it possible to assign labels to the states and to the sequence, as in the Matlab function hmmgenerate?
hmmgenerate(...,'Symbols',SYMBOLS) specifies the symbols that are emitted. SYMBOLS can be a numeric array or a cell array of the names of the symbols. The default symbols are integers 1 through N, where N is the number of possible emissions.
hmmgenerate(...,'Statenames',STATENAMES) specifies the names of the states. STATENAMES can be a numeric array or a cell array of the names of the states. The default state names are 1 through M, where M is the number of states.
Thank you for your time. I hope to hear from the experts.

MATLAB Simple - Linear Predictive Coding and Energy Forecasting

I have a dataset with 274 samples (9 months) of the daily energy (watt-hours) used by a residential household. I'm not sure if I'm applying the lpc function correctly.
My code is the following:
filename='9-months.csv';
energy = csvread(filename);
C=zeros(5,1);
counter=0;
N=3;
for n=274:-1:31
    w2=energy(1:n-1,1);
    a=lpc(w2,N);
    energy_estimated=0;
    for X = 1:N
        energy_estimated = energy_estimated + (-a(X+1)*energy(n-X));
    end
    w_real=energy(n);
    error2=abs(w_real-energy_estimated);
    counter=counter+1;
    C(counter,1)=error2;
end
mean_error=round(mean(C));
With "n" being the sample under analysis, I use the energy array's values from 1 to n-1 to calculate the lpc coefficients (with N=3).
The inner "for" loop shown above then applies the calculated coefficients to compute the estimated energy.
Finally, error2 holds the absolute error between the real energy and the estimated value.
The example at http://www.mathworks.com/help/signal/ref/lpc.html uses some filters. Do I need to apply any filter here? Is my methodology correct?
Thank you very much in advance!
The lpc function seems to be used correctly, but there are a few other things about your code. I am addressing the part at the "for n" loop:
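estimates=zeros(1,274); % preallocate the vector of predictions
for n=31:274 % going forward in time seems more logical to me
    w2=energy(1:n-1,1);
    a=lpc(w2,N);
    % filter over samples 1..n; the leading 0 keeps energy(n) out of its own estimate
    energy_estimate=filter([0 -a(2:end)],1,energy(1:n,1));
    estimates(n)=energy_estimate(end); % prediction of energy(n) from energy(n-1) .. energy(n-N)
end
err=energy(31:274)-estimates(31:274)'; % named err to avoid shadowing the built-in error()
meanerror=mean(err); % you don't really round mean errors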
filter does exactly what you are trying to do with the X=1:N loop, but it performs the calculation for the entire input vector. If you just want the last value, take the (end) element as well.
There is also no reason to calculate the error for every single value and add it to a vector inside the loop; you can do that faster after the calculation.
If you are trying to estimate future values with lpc, it could work like that, but you are implying that every value depends only on the last 3 values. Have you tried something like a polynomial approach? I would think that this would be closer to reality.
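The relationship between the inner X=1:N loop and the filter call can be checked with a short sketch, written here in Python. It assumes the Matlab convention that the coefficient vector a starts with 1; the coefficient values and the sample series are hypothetical and do not come from the 9-month dataset. The point it illustrates is the alignment: because of the leading 0 in [0 -a(2:end)], output k of the filter depends only on samples before k, so it matches the loop's prediction of sample k.

```python
def fir_filter(h, x):
    """y[k] = sum_i h[i] * x[k - i], i.e. filter(h, 1, x) with zero initial state."""
    return [sum(h[i] * x[k - i] for i in range(len(h)) if k - i >= 0)
            for k in range(len(x))]


def predict_next(a, history):
    """One-step linear prediction from the last len(a)-1 samples of history."""
    order = len(a) - 1
    return -sum(a[i] * history[-i] for i in range(1, order + 1))


a = [1.0, -0.5, 0.2, -0.1]            # hypothetical LPC coefficients, a[0] = 1
x = [4.0, 5.0, 3.0, 6.0, 7.0, 5.0]    # hypothetical daily energy values

h = [0.0] + [-c for c in a[1:]]       # Python counterpart of [0 -a(2:end)]
filtered = fir_filter(h, x)

n = len(x) - 1                         # index of the sample being predicted
loop_prediction = predict_next(a, x[:n])   # uses only samples before index n
```

Here filtered[n] and loop_prediction agree, so the last filter output over samples up to and including n is the prediction of sample n from its predecessors.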