Converting .wav files to an array in MATLAB - matlab

I am interested in creating an artificial neural network to create a simple guitar tuner and I am in the process of training the ANN in MATLAB. I am currently using [y, Fs, nbits] = wavread('file.wav') to capture the sound and then call X = fft(y,256) to generate an array. I want to generate the array so that I can use it as the input for the ANN. I would like to know if there is a better way to do this conversion since I am not getting the desired result.

I guess your y(:,1) and y(:,2) represents the left and right channels respectively, and they are probably identity. You can apply fft on only one of them.
Another approach if you are handling two different channels is that you can use two nodes at input layer with y(:,1) and y(:,2) as the value.

Related

Combining a LGR and a Kalman Filter into a single control

What I'm trying to do
I'm trying to create a LQG controller to control a given system by combining a Linear Quadratic Regulator (LQR) and a Kalman Filter.
Where I'm stuck
I have found both separately, but am unsure how I combine them in MATLAB. This is the system and LQR solution:
I am able to create both the Kalman Filter and the LQR seperately, but I can't figure out how to combine the LQR to take the Kalman Filter State Estimate as its input.
Akal = Afull;
Bkal = [B1, B2];
Ckal = Cfull;
Dkal = [0 0];
sys_kal = ss(Akal,Bkal,Ckal,Dkal);
[KEST,L,P] = kalman(sys_kal,E_d, E_n, 0)
[K,S,e] = lqr(Afull,B1,Q,r);
When I use the kalman function, here is what size(KEST) gives me:
State-space model with 11 outputs, 2 inputs, and 10 states.
I want my U to use the estimate given by the new SS system KEST. KEST provides an estimate for the output (y, dimension 1), and an estimate for all 10 states (X, dimension 10). I can write/draw out the closed loop control path that I am looking to create using the LQR and Kalman functions, but I am stuck at this point because I don't know how to implement it via MATLAB. I am unsure of the syntax as well.
I have searched for MATLAB examples but haven't found any that show me how to combine what I have found. I know that KEST is a state space model but I don't know how to use it or select a single output.
What I'm hoping to get help with
If I use bode(KEST), it gives me a BODE plot of all 11 outputs. I am not sure how to select just a single output of KEST.
I would like to have U = -K*X_est, but currently I just know the value of K. I don't know how to obtain an X_est from my KEST state space system.
What you are looking for is a command called lqgreg:
rlqg = lqgreg(kest,k)
Also make sure that kest outputs are the 10 states, and the y (output) is not included in the estimation.
To understand it better: LQR is a state-feedback, so the control is feeding back all of your states with an optimal k gain. The size of k is equal to the states of your system model.
Usual problem with LQR is that you rarely have all states measured. Here comes the kalman filter and helps to estimate all states from the measured outputs and known inputs. Note that kalman filter can be used for many other things, but here the only role of kalman filter is to create state estimates for state feedback, therefore it should estimate nothing else than the system states.
If you need to know you system output (for plotting or anything else) it is very easy to calculate from state estimates and the input (Cx + Du), but you can create another kalman filter just for the output estimation. This later solution is not acceptable when running in a microcontroller or other low capacity enviroment, since you are duplicate basically the same algorithm.

How to model best an input/output model with neural network?

The theme here is the use of neural network in learning time histories.
Lets consider for clarity Y = f(X) where X is a vector 1xN and Y 1xN.
In most of the models I can find or test online, X is directly the time vectorised with regular timesteps (X=T).
The prediction task performed on the time history is done therefore using the output Y, and using a sequence of this let say Y(i:i+Nsample) as an Neural Network input and then the output is Y(i+Nsample+1). Then the prediction is performed moving the window one step at the time ( i= i+1).
Now my question is the following. In the case where we have a vector X which is a generic function whose values are known, the problem to model with neural network is:
knowing X(i:i+Nsample+1) and Y(i:i+Nsample) we want to predict Y(i:i+Nsample+1)
then we can do i=i+1 and proceed forward.
What are the best solutions to design such a system, is there an example with Keras or other project from which I could be inspired?
I see several solutions but without being convinced.
a)Set a multidimensional vector (2xNsample) as input [X(i:i+Nsample ) ; Y(i-1:i+Nsample-1)] and predict Y(:i+Nsample) (treat the output as a second input)
b) set two separate lstm for X and Y and then concatenate them in some way

Neural Network Multiple Inputs and one output

i want to create a Neural Network with "three (2D) Matrices" as a inputs , and
the output is a 1 (2D) Matrix , so the three inputs is :
1-2D Matrix Contains ( X ,Y ) Coordinates From a device
2-2D Matrix Contains ( X ,Y ) Coordinates From another different Device
3-2D Matrix Contains the True exact( X , Y ) Coordinates that i already
measured it ( i don't know if that exact include from the inputs or What??)
***Note that each input have his own Error and i want to make the Neural Network
to Minimize that error and choose the best result depends on the true exact (X,Y)***
**Notice that : im Working on object tracking that i extracting (x,y) Coordinate
from the camera and the other device is same the data so for example
i will simulate the Coordinates as Follows:
{ (1,2), (1,3), (1,4), (1,5) , (1,6).......}
and so on
For Sure the Output is one 2D Matrix the best or the True Exact (x,y) ,So im a
beginner and i want to understand how to create this Network with this Different
inputs and choose the best training method to have the Best Results ...?!
thanks in Advance
It sounds like what you want is a HxWx2 input where the first channel (depthwise layer) is your 1st input and the 2nd channel is your 2nd input. Your "true exact" coordinates would be the target that your net output is compared to, rather than being an input.
Note that neural nets don't really handle regression (real-valued outputs) very well - you may get better results dividing your coordinate range into buckets and then treating it as a classification problem instead (use softmax loss vs regression mean-squared error loss).
Expanding on the regression vs classification point:
A regression problem is one where you want the net to output a real value such as a coordinate value in range 0-100. A classification problem is one where you want the net to output a set of propabilities that your input belongs to a given class it was trained on (e.g. you train a net on images belonging to classes "cat" "dog" and "rabbit").
It turns out that modern neural nets are much better at classification than regression, because the way they work is basically by subdividing the N-dimensional input spaces into sub-regions corresponding to the outputs they are being trained to make. What they are naturally doing is classifying.
The obvious way to turn a regression problem into a classification problem, which may work better, is to divide your desired output range into sub-ranges (aka buckets) which you treat as classes. e.g. instead of training your net to output a single (or multiple) value in range 0-100, instead train it to output class probabilities representing, for example, each of 10 separate sub-ranges (classes) 0-9, 10-19, 20-20, etc.

Using large input values with Auto Encoders

I have created an Auto Encoder Neural Network in MATLAB. I have quite large inputs at the first layer which I have to reconstruct through the network's output layer. I cannot use the large inputs as it is,so I convert it to between [0, 1] using sigmf function of MATLAB. It gives me a values of 1.000000 for all the large values. I have tried using setting the format but it does not help.
Is there a workaround to using large values with my auto encoder?
The process of convert your inputs to the range [0,1] is called normalization, however, as you noticed, the sigmf function is not adequate for this task. This link maybe is useful to you.
Suposse that your inputs are given by a matrix of N rows and M columns, where each row represent an input pattern and each column is a feature. If your first column is:
vec =
-0.1941
-2.1384
-0.8396
1.3546
-1.0722
Then you can convert it to the range [0,1] using:
%# get max and min
maxVec = max(vec);
minVec = min(vec);
%# normalize to -1...1
vecNormalized = ((vec-minVec)./(maxVec-minVec))
vecNormalized =
0.5566
0
0.3718
1.0000
0.3052
As #Dan indicates in the comments, another option is to standarize the data. The goal of this process is to scale the inputs to have mean 0 and a variance of 1. In this case, you need to substract the mean value of the column and divide by the standard deviation:
meanVec = mean(vec);
stdVec = std(vec);
vecStandarized = (vec-meanVec)./ stdVec
vecStandarized =
0.2981
-1.2121
-0.2032
1.5011
-0.3839
Before I give you my answer, let's think a bit about the rationale behind an auto-encoder (AE):
The purpose of auto-encoder is to learn, in an unsupervised manner, something about the underlying structure of the input data. How does AE achieves this goal? If it manages to reconstruct the input signal from its output signal (that is usually of lower dimension) it means that it did not lost information and it effectively managed to learn a more compact representation.
In most examples, it is assumed, for simplicity, that both input signal and output signal ranges in [0..1]. Therefore, the same non-linearity (sigmf) is applied both for obtaining the output signal and for reconstructing back the inputs from the outputs.
Something like
output = sigmf( W*input + b ); % compute output signal
reconstruct = sigmf( W'*output + b_prime ); % notice the different constant b_prime
Then the AE learning stage tries to minimize the training error || output - reconstruct ||.
However, who said the reconstruction non-linearity must be identical to the one used for computing the output?
In your case, the assumption that inputs ranges in [0..1] does not hold. Therefore, it seems that you need to use a different non-linearity for the reconstruction. You should pick one that agrees with the actual range of you inputs.
If, for example, your input ranges in (0..inf) you may consider using exp or ().^2 as the reconstruction non-linearity. You may use polynomials of various degrees, log or whatever function you think may fit the spread of your input data.
Disclaimer: I never actually encountered such a case and have not seen this type of solution in literature. However, I believe it makes sense and at least worth trying.

Time of Arrival estimation of a signal in Matlab

I want to estimate the time of arrival of GPR echo signals using Music algorithm in matlab, I am using the duality property of Fourier transform.
I am first applying FFT on the obtained signal and then passing these as parameters to pmusic function, i am still getting the result in frequency domain.?
Short Answer: You're using the wrong function here.
As far as I can tell Matlab's pmusic function returns the pseudospectrum of an input signal.
If you click on the pseudospectrum link, you'll see that the pseudospectrum of a signal lives in the frequency domain. In particular, look at the plot:
(from Matlab's documentation: Plotting Pseudospectrum Data)
Notice that the result is in the frequency domain.
Assuming that by GPR you mean Ground Penetrating Radar, then try radar or sonar echo detection approach to estimate the two way transit time.
This can be done and the theory has been published in several papers. See, for example, here:
STAR Channel Estimation in DS-CDMA Systems
That paper describes spatiotemporal estimation (i.e. estimation of both time and direction of arrival), but you can ignore the spatial part and just do temporal estimation if you have a single-antenna receiver.
You probably won't want to use Matlab's pmusic function directly. It's always quicker and easier to write these sorts of functions for yourself, so you know what is actually going on. In the case of MUSIC:
% Get noise subspace (where M is number of signals)
[E, D] = eig(Rxx);
[lambda, idx] = sort(diag(D), 'descend');
E = E(:, idx);
En = E(:,M+1:end);
% [Construct matrix S, whose columns are the vectors to search]
% Calculate MUSIC null spectrum and convert to dB
Z = 10*log10(sum(abs(S'*En).^2, 2));
You can use the Phased array system toolbox of MATLAB if you want to estimate the DOA using different algorithms using a single command. Such as for Root MUSIC it is phased.RootMUSICEstimator phased.ESPRITEstimator.
However as Harry mentioned its easy to write your own function, once you define the signal subspace and receive vector, you can directly apply it in the MUSIC function to find its peaks.
This is another good reference.
http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1143830