multiple regressor time series producing one output - neural-network

Absolute beginner here. I'm trying to use a neural network to predict price of a product that's being shipped while using temperature, deaths during a pandemic, rain volume, and a column of 0 and 1's (dummy variable).
So imagine that I have a dataset given those values as well as column giving me time in a year/week format.
I started reading Rob Hyndman's forecasting book but I haven't yet seen anything that can help me. One idea that I have is to make a variable that's going to take out each column of the dataframe and make it into a time series. For example, for rain, I can do something like
rain <- df$rain_inches cost<-mainset3 %>% select(approx_cost) raintimeseries <-ts(rain, frequency=52, start=c(2015,1), end=c(2021,5))
I would the same for the other regressors.
I want to use neural networks on each of the regressors to predict cost and then put them all together.
Ideally I'm thinking it would be a good idea to train on say, 3/4 ths of the time series data and test on the remain 1/4 and then possibly make predictions.
I'm now seeing that even if I am using one regressor I'm still left with a multivariate time series and I've only found examples online for univariate models.
I'd appreciate if someone could give me ideas on how to model my problem using neural networks
I saw this link Keras RNN with LSTM cells for predicting multiple output time series based on multiple intput time series
but I just see a bunch of functions and nowhere where I can actually insert my function

The solution to your problem is the same as for the univariate case you found online except for the fact that you just need to work differently with your feature/independent set. your y variable or cost variable remains as is but your X variables will have to be in 3 dimensions which is (Number of observations, time steps, number of independent variables)

Related

Matlab's fitcsvm classfication model "Beta weights" empty

Hello problem solvers,
I am currently analyzing some fMRI data of a visual task and stuck with a problem. The idea was to train an SVM classifier on some data and then use the weight vector to project novel data onto it. However, the model struct created by fitcsvm (Matlab 2018a) only sometimes actually contains a weight vector. In many cases mdl.Betas is empty.
mdl = fitcsvm(training, labels)
weights = mdl.Beta
I looked into why that could be and made sure the input is double, is not just zeros and does not contain NaNs. So far, I have not been able to identify a rule as to why it sometimes returns empty and sometimes not. If anything it seems that as I increase the amount of input data, mdl.Betas is less frequently empty. But I can only change so much about that. :(
I am happy about any help!
Thanks!
Edit: Unfortunately, like so often with fMRI, training data is quite limited. The training data consists of 50 to 300 features (depending on brain area) and one example for class A and five examples for class B.

How to Predict a Trend in MATLAB with irregular sampling time?

I'm using MATLAB to predict a trend with a machine learning approach.
My data file is an .xlsx file containing a timeline in one column (various sampling timestamps, i.e. numbers that represents seconds), and in the other columns I have some integers representing my trend.
My .xlsx file is pretty much like this:
0,0100 | 0
0,0110 | 1
0,0135 | 5
And so on.
I used "|" to distinguish between columns. The sampling time is not regular.
Given 10 values of the trend taken from 10 consecutive timestamps, I'd like to predict the 11th value at a given timestamp. For example, if the 9th value is at 34,010 and the 10th value is at 34,568s I'd like to know the value at 37,431s.
How can I do it?
I've found this link: Time Series Forecasting Using Deep Learning, but there the sampling time is regular.
Should I interpolate my trend values and re-sample them with a constant sampling time?
I would distinguish the forecasting problem from the data sampling time problem. You are dealing substantially with missing data.
Forecasting problem: You may use any machine learning technique just ignoring missing data. If you are not familiar with machine learning, I would suggest you to use LASSO (least absolute shrinkage and selection operator), which has been demonstrated to have predicting power (see "Sparse Signals in the Cross-Section of Returns" by ALEX CHINCO, ADAM D. CLARK-JOSEPH, and MAO YE).
Missing imputation problem: In the first place you should consider the reason why you have missing data. Sometime it makes no sense to impute values because the information that the value is missing is itself important and should not be overridden. Otherwise you have multiple options, other than linear interpolation, to estimate the missing values. For example check the MATLAB function fillmissing.

Appropriate method for clustering ordinal variables

I was reading through all (or most) previously asked questions, but couldn't find an answer to my problem...
I have 13 variables measured on an ordinal scale (thy represent knowledge transfer channels), which I want to cluster (HCA) for a following binary logistic regression analysis (including all 13 variables is not possible due to sample size of N=208). A Factor Analysis seems inappropriate due to the scale level. I am using SPSS (but tried R as well).
Questions:
1: Am I right in using the Chi-Squared measure for count data instead of the (squared) euclidian distance?
2. How can I justify a choice of method? I tried single, complete, Ward and average, but all give different results and I can't find a source to base my decision on.
Thanks a lot in advance!
Answer 1: Since the variables are on ordinal scale, the chi-square test is an appropriate measurement test. Because, "A Chi-square test is designed to analyze categorical data. That means that the data has been counted and divided into categories. It will not work with parametric or continuous data (such as height in inches)." Reference.
Again, ordinal scaled data is essentially count or frequency data you can use regular parametric statistics: mean, standard deviation, etc Or non-parametric tests like ANOVA or Mann-Whitney U test to compare 2 groups or Kruskal–Wallis H test to compare three or more groups.
Answer 2: In a clustering problem, the choice of distance method solely depends upon the type of variables. I recommend you to read these detailed posts 1, 2,3

Extracting Patterns using Neural Networks

I am trying to extract common patterns that always appear whenever a certain event occurs.
For example, patient A, B, and C all had a heart attack. Using the readings from there pulse, I want to find the common patterns before the heart attack stroke.
In the next stage I want to do this using multiple dimensions. For example, using the readings from the patients pulse, temperature, and blood pressure, what are the common patterns that occurred in the three dimensions taking into consideration the time and order between each dimension.
What is the best way to solve this problem using Neural Networks and which type of network is best?
(Just need some pointing in the right direction)
and thank you all for reading
Described problem looks like a time series prediction problem. That means a basic prediction problem for a continuous or discrete phenomena generated by some existing process. As a raw data for this problem we will have a sequence of samples x(t), x(t+1), x(t+2), ..., where x() means an output of considered process and t means some arbitrary timepoint.
For artificial neural networks solution we will consider a time series prediction, where we will organize our raw data to a new sequences. As you should know, we consider X as a matrix of input vectors that will be used in ANN learning. For time series prediction we will construct a new collection on following schema.
In the most basic form your input vector x will be a sequence of samples (x(t-k), x(t-k+1), ..., x(t-1), x(t)) taken at some arbitrary timepoint t, appended to it predecessor samples from timepoints t-k, t-k+1, ..., t-1. You should generate every example for every possible timepoint t like this.
But the key is to preprocess data so that we get the best prediction results.
Assuming your data (phenomena) is continuous, you should consider to apply some sampling technique. You could start with an experiment for some naive sampling period Δt, but there are stronger methods. See for example Nyquist–Shannon Sampling Theorem, where the key idea is to allow to recover continuous x(t) from discrete x(Δt) samples. This is reasonable when we consider that we probably expect our ANNs to do this.
Assuming your data is discrete... you still should need to try sampling, as this will speed up your computations and might possibly provide better generalization. But the key advice is: do experiments! as the best architecture depends on data and also will require to preprocess them correctly.
The next thing is network output layer. From your question, it appears that this will be a binary class prediction. But maybe a wider prediction vector is worth considering? How about to predict the future of considered samples, that is x(t+1), x(t+2) and experiment with different horizons (length of the future)?
Further reading:
Somebody mentioned Python here. Here is some good tutorial on timeseries prediction with Keras: Victor Schmidt, Keras recurrent tutorial, Deep Learning Tutorials
This paper is good if you need some real example: Fessant, Francoise, Samy Bengio, and Daniel Collobert. "On the prediction of solar activity using different neural network models." Annales Geophysicae. Vol. 14. No. 1. 1996.

How to use created "net" neural network object for prediction?

I used ntstool to create NAR (nonlinear Autoregressive) net object, by training on a 1x1247 input vector. (daily stock price for 6 years)
I have finished all the steps and saved the resulting net object to workspace.
Now I am clueless on how to use this object to predict the y(t) for example t = 2000, (I trained the model for t = 1:1247)
In some other threads, people recommended to use sim(net, t) function - however this will give me the same result for any value of t. (same with net(t) function)
I am not familiar with the specific neural net commands, but I think you are approaching this problem in the wrong way. Typically you want to model the evolution in time. You do this by specifying a certain window, say 3 months.
What you are training now is a single input vector, which has no information about evolution in time. The reason you always get the same prediction is because you only used a single point for training (even though it is 1247 dimensional, it is still 1 point).
You probably want to make input vectors of this nature (for simplicity, assume you are working with months):
[month1 month2; month2 month 3; month3 month4]
This example contains 2 training points with the evolution of 3 months. Note that they overlap.
Use the Network
After the network is trained and validated, the network object can be used to calculate the network response to any input. For example, if you want to find the network response to the fifth input vector in the building data set, you can use the following
a = net(houseInputs(:,5))
a =
34.3922
If you try this command, your output might be different, depending on the state of your random number generator when the network was initialized. Below, the network object is called to calculate the outputs for a concurrent set of all the input vectors in the housing data set. This is the batch mode form of simulation, in which all the input vectors are placed in one matrix. This is much more efficient than presenting the vectors one at a time.
a = net(houseInputs);
Each time a neural network is trained, can result in a different solution due to different initial weight and bias values and different divisions of data into training, validation, and test sets. As a result, different neural networks trained on the same problem can give different outputs for the same input. To ensure that a neural network of good accuracy has been found, retrain several times.
There are several other techniques for improving upon initial solutions if higher accuracy is desired. For more information, see Improve Neural Network Generalization and Avoid Overfitting.
strong text