What format should the LSTM input have for EEG classification?

I'm new to deep learning (and to this forum).
I have a problem understanding the input format of the LSTM network I am trying to build for a sequence classification problem. Here is my dataset:
EEG_1      Labels_1      EEG_2      Labels_2      …
epoch_1    1             epoch_1    1
epoch_2    0             epoch_2    0
…          …             …          …
epoch_n    1             epoch_n    1
Each epoch contains 50 samples, and the label is the class of those 50 samples.
For one EEG, the labels of successive epochs are dependent, which is why I would like to build an RNN that learns to classify each epoch of the signal. On the other hand, the different EEGs are not linked to each other (they come from different patients), so I would like the RNN to treat them as separate examples during training (the EEGs of two patients are independent).
While browsing the literature, I have seen quite a few examples of how to feed one signal to an RNN, but never (or I misunderstood) several independent signals. Can you tell me how I can do that? The penny just doesn't drop for me. What would be the input format of this LSTM?
Thank you in advance

Related

multiple regressor time series producing one output

Absolute beginner here. I'm trying to use a neural network to predict the price of a product that's being shipped, using temperature, deaths during a pandemic, rain volume, and a column of 0s and 1s (a dummy variable).
So imagine that I have a dataset with those values as well as a column giving me the time in a year/week format.
I started reading Rob Hyndman's forecasting book but I haven't yet seen anything that can help me. One idea that I have is to make a variable that's going to take out each column of the dataframe and make it into a time series. For example, for rain, I can do something like
rain <- df$rain_inches
cost <- mainset3 %>% select(approx_cost)
raintimeseries <- ts(rain, frequency=52, start=c(2015,1), end=c(2021,5))
I would do the same for the other regressors.
I want to use neural networks on each of the regressors to predict cost and then put them all together.
Ideally I'm thinking it would be a good idea to train on, say, 3/4 of the time series data, test on the remaining 1/4, and then possibly make predictions.
I'm now seeing that even if I am using one regressor I'm still left with a multivariate time series and I've only found examples online for univariate models.
I'd appreciate if someone could give me ideas on how to model my problem using neural networks
I saw this link: Keras RNN with LSTM cells for predicting multiple output time series based on multiple input time series,
but I just see a bunch of functions and nowhere where I can actually insert my own data.
The solution to your problem is the same as for the univariate case you found online, except that you need to work differently with your feature/independent set. Your y variable (cost) remains as is, but your X variables will have to be arranged in 3 dimensions: (number of observations, time steps, number of independent variables).
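As a rough sketch of that reshaping in Python (the data, window length, and variable names below are made up for illustration, not taken from your dataset):

import numpy as np

# Hypothetical weekly data: 300 weeks, 4 regressors (temperature, deaths, rain, dummy) and the cost target.
features = np.random.rand(300, 4)
cost = np.random.rand(300)

def make_windows(features, target, time_steps):
    """Stack sliding windows into shape (observations, time steps, variables)."""
    X, y = [], []
    for i in range(len(features) - time_steps):
        X.append(features[i:i + time_steps])   # the past `time_steps` rows of all regressors
        y.append(target[i + time_steps])       # the cost value right after the window
    return np.array(X), np.array(y)

X, y = make_windows(features, cost, time_steps=8)
print(X.shape)  # (292, 8, 4) -> (observations, time steps, independent variables)
print(y.shape)  # (292,)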

Extracting Patterns using Neural Networks

I am trying to extract common patterns that always appear whenever a certain event occurs.
For example, patients A, B, and C all had a heart attack. Using the readings from their pulse, I want to find the common patterns before the heart attack occurred.
In the next stage I want to do this using multiple dimensions. For example, using the readings of the patients' pulse, temperature, and blood pressure, what are the common patterns that occurred across the three dimensions, taking into consideration the timing and order between each dimension?
What is the best way to solve this problem using Neural Networks and which type of network is best?
(Just need some pointing in the right direction)
and thank you all for reading
The described problem looks like a time-series prediction problem: a basic prediction problem for a continuous or discrete phenomenon generated by some existing process. As raw data we have a sequence of samples x(t), x(t+1), x(t+2), ..., where x(·) is the output of the considered process and t is some arbitrary timepoint.
For a neural-network solution we reorganize this raw data into new sequences. As usual, X is the matrix of input vectors used for training the ANN. For time-series prediction we construct this collection according to the following scheme.
In the most basic form, your input vector is the sequence of samples (x(t-k), x(t-k+1), ..., x(t-1), x(t)): the sample taken at some timepoint t together with its k predecessors from timepoints t-k, t-k+1, ..., t-1. You generate one example like this for every possible timepoint t.
But the key is to preprocess data so that we get the best prediction results.
Assuming your data (the phenomenon) is continuous, you should consider applying some sampling technique. You could start by experimenting with some naive sampling period Δt, but there are stronger methods. See for example the Nyquist–Shannon sampling theorem, whose key idea is that the continuous x(t) can be recovered from the discrete samples x(nΔt). This is reasonable when we consider that we probably expect our ANN to do something similar.
Assuming your data is discrete... you should still try sampling, as this will speed up your computations and might provide better generalization. But the key advice is: do experiments! The best architecture depends on the data, which will also need to be preprocessed correctly.
The next thing is the network output layer. From your question it appears that this will be a binary class prediction, but maybe a wider prediction vector is worth considering. How about predicting the future of the considered samples, that is x(t+1), x(t+2), ..., and experimenting with different horizons (lengths of the future)?
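A minimal sketch of this windowed construction in Python (the signal, window length k, and horizon below are placeholders, not values from the question):

import numpy as np

# Hypothetical raw signal x(t), e.g. a patient's pulse sampled at regular intervals.
x = np.random.rand(1000)

def make_examples(x, k, horizon):
    """Input: (x(t-k), ..., x(t)); target: (x(t+1), ..., x(t+horizon))."""
    X, Y = [], []
    for t in range(k, len(x) - horizon):
        X.append(x[t - k:t + 1])            # current sample plus its k predecessors
        Y.append(x[t + 1:t + 1 + horizon])  # the future samples to predict
    return np.array(X), np.array(Y)

X, Y = make_examples(x, k=10, horizon=2)
print(X.shape, Y.shape)  # (988, 11) (988, 2)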
Further reading:
Somebody mentioned Python here. Here is a good tutorial on time-series prediction with Keras: Victor Schmidt, Keras recurrent tutorial, Deep Learning Tutorials.
This paper is good if you need a real example: Fessant, Françoise, Samy Bengio, and Daniel Collobert. "On the prediction of solar activity using different neural network models." Annales Geophysicae. Vol. 14. No. 1. 1996.

ANN multiple vs single outputs

I recently started studying ANN, and there is something that I've been trying to figure out that I can't seem to find an answer to (probably because it's too trivial or because I'm searching for the wrong keywords..).
When do you use multiple outputs instead of a single output? I guess in the simplest case of 1/0 classification it's easiest to use the "sign" as the output activation function. But in which cases do you use several outputs? Is it when you have, for instance, a multiclass problem, where you want to classify something as, say, A, B, or C, and you choose one output neuron for each class? How do you determine which class it belongs to?
In a classification context, there are a couple of situations where using multiple output units can be helpful: multiclass classification, and explicit confidence estimation.
Multiclass
For the multiclass case, as you wrote in your question, you typically have one output unit in your network for each class of data you're interested in. So if you're trying to classify data as one of A, B, or C, you can train your network on labeled data, but convert all of your "A" labels to [1 0 0], all your "B" labels to [0 1 0], and your "C" labels to [0 0 1]. (This is called a "one-hot" encoding.) You also probably want to use a logistic activation on your output units to restrict their activation values to the interval (0, 1).
Then, when you're training your network, it's often useful to optimize a "cross-entropy" loss (as opposed to a somewhat more intuitive Euclidean distance loss), since you're basically trying to teach your network to output the probability of each class for a given input. Often one uses a "softmax" (also sometimes called a Boltzmann) distribution to define this probability.
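A minimal sketch of that setup (assuming Keras; the layer sizes and data below are invented for illustration):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

# Hypothetical data: 200 samples with 10 features, labels A/B/C encoded as 0/1/2.
X = np.random.rand(200, 10)
y = to_categorical(np.random.randint(0, 3, size=200), num_classes=3)  # A->[1 0 0], B->[0 1 0], C->[0 0 1]

model = Sequential([
    Dense(32, activation="relu", input_shape=(10,)),
    Dense(3, activation="softmax"),   # one output unit per class, outputs sum to 1
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, y, epochs=5, verbose=0)
pred_class = model.predict(X[:1]).argmax(axis=1)  # index of the most probable class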
For more info, please check out http://www.willamette.edu/~gorr/classes/cs449/classify.html (slightly more theoretical) and http://deeplearning.net/tutorial/logreg.html (more aimed at the code side of things).
Confidence estimation
Another cool use of multiple outputs is to use one output as a standard classifier (e.g., just one output unit that generates a 0 or 1), and a second output to indicate the confidence that this network has in its classification of the input signal (e.g., another output unit that generates a value in the interval (0, 1)).
This could be useful if you trained up a separate network on each of your A, B, and C classes of data, but then also presented data to the system later that came from class D (or whatever) -- in this case, you'd want each of the networks to indicate that they were uncertain of the output because they've never seen something from class D before.
Have a look at the softmax layer, for instance. The maximum output of this layer is your class, and it has a nice theoretical justification.
To be concise: you take the previous layer's output and interpret it as a vector in m-dimensional space. After that you fit K Gaussians to it, which share a covariance matrix. If you model it this way and write out the equations, it amounts to a softmax layer. For more details see "Machine Learning: A Probabilistic Perspective" by Kevin Murphy.
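Writing out those equations makes the connection explicit. With class-conditional densities $p(x \mid y = k) = \mathcal{N}(x \mid \mu_k, \Sigma)$ (shared covariance $\Sigma$) and class priors $\pi_k$, Bayes' rule gives

$$p(y = k \mid x) = \frac{\pi_k \exp\left(-\tfrac{1}{2}(x-\mu_k)^\top \Sigma^{-1} (x-\mu_k)\right)}{\sum_j \pi_j \exp\left(-\tfrac{1}{2}(x-\mu_j)^\top \Sigma^{-1} (x-\mu_j)\right)} = \frac{\exp(w_k^\top x + b_k)}{\sum_j \exp(w_j^\top x + b_j)},$$

with $w_k = \Sigma^{-1}\mu_k$ and $b_k = -\tfrac{1}{2}\mu_k^\top \Sigma^{-1}\mu_k + \log \pi_k$. The quadratic term $x^\top \Sigma^{-1} x$ cancels because $\Sigma$ is shared across classes, leaving exactly a softmax over linear functions of $x$.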
This is just one example of using the last layer for multiclass classification. You can also use multiple outputs for something else. For instance, you can train an ANN to "compress" your data, that is, to compute a function from an N-dimensional to an M-dimensional space that minimizes the loss of information (this model is called an autoencoder).

How to use created "net" neural network object for prediction?

I used ntstool to create NAR (nonlinear Autoregressive) net object, by training on a 1x1247 input vector. (daily stock price for 6 years)
I have finished all the steps and saved the resulting net object to workspace.
Now I am clueless about how to use this object to predict y(t) for, say, t = 2000 (I trained the model for t = 1:1247).
In some other threads, people recommended using the sim(net, t) function; however, this gives me the same result for any value of t (the same happens with net(t)).
I am not familiar with the specific neural net commands, but I think you are approaching this problem in the wrong way. Typically you want to model the evolution in time. You do this by specifying a certain window, say 3 months.
What you are training on now is a single input vector, which contains no information about evolution in time. The reason you always get the same prediction is that you only used a single point for training (even though it is 1247-dimensional, it is still one point).
You probably want to make input vectors of this nature (for simplicity, assume you are working with months):
[month1 month2; month2 month3; month3 month4]
This example contains 2 training points with the evolution of 3 months. Note that they overlap.
Use the Network
After the network is trained and validated, the network object can be used to calculate the network response to any input. For example, if you want to find the network response to the fifth input vector in the building data set, you can use the following
a = net(houseInputs(:,5))
a =
34.3922
If you try this command, your output might be different, depending on the state of your random number generator when the network was initialized. Below, the network object is called to calculate the outputs for a concurrent set of all the input vectors in the housing data set. This is the batch mode form of simulation, in which all the input vectors are placed in one matrix. This is much more efficient than presenting the vectors one at a time.
a = net(houseInputs);
Each time a neural network is trained it can result in a different solution, due to different initial weight and bias values and different divisions of the data into training, validation, and test sets. As a result, different neural networks trained on the same problem can give different outputs for the same input. To ensure that a neural network of good accuracy has been found, retrain several times.
There are several other techniques for improving upon initial solutions if higher accuracy is desired. For more information, see Improve Neural Network Generalization and Avoid Overfitting.

Time series classification MATLAB

My task is to classify time-series data with use of MATLAB and any neural-network framework.
Describing task more specifically:
It is a problem from the computer-vision field: a scene boundary detection task.
The source data are 4 arrays of neighbouring-frame histogram correlations from the video stream.
Based on these data, we have to classify the time series into 2 classes:
"scene break"
"no scene break"
So the network input is 4 double values for each source data entry, and the output is one binary value. An example of the source data is shown below:
0.997894,0.999413,0.982098,0.992164
0.998964,0.999986,0.999127,0.982068
0.993807,0.998823,0.994008,0.994299
0.225917,0.000000,0.407494,0.400424
0.881150,0.999427,0.949031,0.994918
The problem is that the pattern-recognition tools from the MATLAB Neural Network Toolbox (like patternnet) treat the source data entries as independent. But I strongly believe that the results will be accurate only if the net makes its decision based on the history of previous correlations.
But I also did not manage to get a valid response from the recurrent nets intended for time-series analysis (like delaynet and narxnet).
narxnet and delaynet return lousy results, and it looks like these types of networks are not meant for classification tasks. I am not inserting any code here because it is almost entirely autogenerated with the MATLAB Neural Network Toolbox GUI.
I would appreciate any help, especially some advice on which tool fits my task better.
I am not sure how difficult this problem is to classify.
Given your sample, a feed-forward neural network with 4 inputs and 1 output is sufficient.
If you insist on using historical inputs, you simply pre-process your input d such that
your new input D(t) (a vector at time t) is the concatenation of d(t), the 1x4 vector at time t; d(t-1), the 1x4 vector at time t-1; ...; and d(t-k), the 1x4 vector at time t-k.
If t-k < 0, just treat the missing entries as 0.
So you have a 1x(4(k+1)) vector as input, and 1 output.
As Dan mentioned, you need to find a good k.
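A rough sketch of that pre-processing (in Python for brevity, even though the thread uses MATLAB; the array names, sizes, and k below are placeholders):

import numpy as np

# Hypothetical: corr is an (n_frames, 4) array of the histogram correlations,
# labels is an (n_frames,) array of 0/1 scene-break flags.
corr = np.random.rand(500, 4)
labels = np.random.randint(0, 2, size=500)

k = 3  # number of historical steps; needs to be tuned
n, m = corr.shape
D = np.zeros((n, m * (k + 1)))          # one 1x(4*(k+1)) row per frame
for t in range(n):
    for lag in range(k + 1):
        if t - lag >= 0:                # missing history is left as zeros
            D[t, lag * m:(lag + 1) * m] = corr[t - lag]
# D[t] = [d(t), d(t-1), ..., d(t-k)]; feed D and labels to a feed-forward classifier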
Speaking of the weights, I think additional pre-processing such as a windowing method on the input is not necessary, since the neural network will be trained to assign weights to each input dimension.
It sounds a bit messy, since the neural network would consider each input dimension independently; that means you lose the information that the four values are neighbouring correlations.
One possible solution is a pre-processing step that extracts neighbourhood features, e.g. using the mean and standard deviation as two features representative of the originals.
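A one-line version of that idea (Python, with a hypothetical array name):

import numpy as np

# Hypothetical: corr holds the 4 neighbouring-frame correlations per entry.
corr = np.random.rand(500, 4)
summary = np.column_stack([corr.mean(axis=1), corr.std(axis=1)])  # mean and std per entry
print(summary.shape)  # (500, 2)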