Is neural network suitable for supervised learning where the data (inputs and outputs) are continuous? - neural-network

I am working on a regression model with a set of 158 inputs and 4 outputs of glass manufacturing project which is a continuous process of inputs and outputs. Is the usage of Neural Net a suitable solution for such kind of regression models? If yes, I have understood that Recurrent Neural Nets can be used for time series data, which Recurrent Neural Net shall I use? If usage of NN is not suitable, what are the other types of solutions available other than Linear Regression and Regression Trees?

Neural Networks are indeed suitable for continuous data. In fact, it is continous by default I would say. It is possible to have discrete I/O for sure, it all depend on your functions.
Secondly, it is true that RNN are suitable for time series, in a way. RNN are in fact suitable for timesteps more than timestamps. RNN are working by iterations. Typically, each iteration can be seen as a fixed step forward in time. This said, if you data is more like (date, value) (what I call timestamp), it may not be so good. It would not be absolutely impossible, but that's not the idea.
Hope it helps, start with simple RNN, try to understand how it works, then, if you need more, read about more complex cells.

Related

how to derive a model equation from the artificial neural networks?

I have used the neural network software for predicting the continous data. Obviously the prediction was better than the results obtained through regression analysis. Now i would like to derive a model expression from the trained weights obtained from the training of the continous data through the software, as suggested by many researchers on how to interpret the trained weights and biases for deriving the model equation i tried to derive one from the similar lines.
After deriving the equation i found that the equation was not able to replicate the same results as given by the neural network software. so i am exploring the new methods to derive the equation. I want to know where i am going wrong and if any one can provide me steps for deriving one it will be helpful.
I have read sometime ago about what you're talking about, but with some diferences. It would probably be useful to you. It's called 'knowledge distilling', if I remember well, and it is a way of extracting the knowledge inside the blackbox that a neural network is. It consists, roughly speaking, in training a simpler model that is easier to interpret, but preserving al the predictive power of the original neural network. I'm speaking from memory, so I'm sorry about the lack of detail. A search on Google will provide the exact references for it.
Hope to have helped.

Deep Learning Neural Networks for Time Series Prediction

I'm starting a work on Internet traffic prediction (time series prediction) using artificial neural networks, but I have few experience with the matter.
Does anyone knows which method is the best for that? (which type
of neural network to use for time series prediction)
Is Deep Learning with unsupervised training a good idea for time
series learning?
You can do time-series prediction with neural nets, but it can get pretty tricky.
1) The obvious choice is a recurrent neural network (RNN). However, these can be really difficult to train, and I would not recommend RNNs if this is your first time using neural nets. Recently there has been some interesting work on easing the training of RNNs (e.g. Hessian-free optimization), but again - it's probably not for beginners ;-) Alternatively, you could try a scheme where you use a standard neural net (i.e. not a RNN), and try to predict the next frame of data from the previous? That might work.
2) This question is too general, there is no categorical right answer. Yes, you can use unsupervised feature learning as part of your solution (e.g. pre-training your model), but if your end goal is time-series prediction you will need to do some supervised learning too.
Good luck!

Parameter settings for neural networks based classification using Matlab

Recently, I am trying to using Matlab build-in neural networks toolbox to accomplish my classification problem. However, I have some questions about the parameter settings.
a. The number of neurons in the hidden layer:
The example on this page Matlab neural networks classification example shows a two-layer (i.e. one-hidden-layer and one-output-layer) feed forward neural networks. In this example, it uses 10 neurons in the hidden layer
net = patternnet(10);
My first question is how to define the best number of neurons for my classification problem? Should I use cross-validation method to get the best performed number of neurons using a training data set?
b. Is there a method to choose three-layer or more multi-layer neural networks?
c. There are many different training method we can use in the neural networks toolbox. A list can be found at Training methods list. The page mentioned that the fastest training function is generally 'trainlm'; however, generally speaking, which one will perform best? Or it totally depends on the data set I am using?
d. In each training method, there is a parameter called 'epochs', which is the training iteration for my understanding. For each training method, Matlab defined the maximum number of epochs to train. However, from the example, it seems like 'epochs' is another parameter we can tune. Am I right? Or we just set the maximum number of epochs or leave it as default?
Any experience with Matlab neural networks toolbox is welcome and thanks very much for your reply. A.
a. You can refer to How to choose number of hidden layers and nodes in neural network? and ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hu
Surely you can do cross-validation to determine the parameter of best number of neurons. But it's not recommended as it's more suitable to use it in the stage of weights training of a certain network.
b. Refer to ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hl
And for more layers of neural network, you can refer to Deep Learning, which is very hot in recent years and gets state-of-the-art performances in many of the pattern recognition tasks.
c. It depends on your data. trainlm performs better on function fitting (nonlinear regression) problems than on pattern recognition problems while training large networks and pattern recognition networks, trainscg and trainrp are good choices. Generally, Gradient Descent and Resilient Backpropagation is recommended. More detailed comparison can be found here: http://www.mathworks.cn/cn/help/nnet/ug/choose-a-multilayer-neural-network-training-function.html
d. Yes, you're right. We can tune the epochs parameter. Generally you can output the recognition results/accuracy at every epoch and you will see that it is promoting more and more slowly, and the more epochs the more computing time. You can make a compromise between the accuracy and computation time.
For part b of your question:
You can use like this code:
net = patternnet([10 15 20]);
This script create a network with 3 hidden layer that first layer has 10 neurons, second layer has 15 neurons and 3th layer has 20 neurons.

Continuously train MATLAB ANN, i.e. online training?

I would like to ask for ideas what options there is for training a MATLAB ANN (artificial neural network) continuously, i.e. not having a pre-prepared training set? The idea is to have an "online" data stream thus, when first creating the network it's completely untrained but as samples flow in the ANN is trained and converges.
The ANN will be used to classify a set of values and the implementation would visualize how the training of the ANN gets improved as samples flows through the system. I.e. each sample is used for training and then also evaluated by the ANN and the response is visualized.
The effect that I expect is that for the very first samples the response of the ANN will be more or less random but as the training progress the accuracy improves.
Any ideas are most welcome.
Regards, Ola
In MATLAB you can use the adapt function instead of train. You can do this incrementally (change weights every time you get a new piece of information) or you can do it every N-samples, batch-style.
This document gives an in-depth run-down on the different styles of training from the perspective of a time-series problem.
I'd really think about what you're trying to do here, because adaptive learning strategies can be difficult. I found that they like to flail all over compared to their batch counterparts. This was especially true in my case where I work with very noisy signals.
Are you sure that you need adaptive learning? You can't periodically re-train your NN? Or build one that generalizes well enough?

Optimization of Neural Network input data

I'm trying to build an app to detect images which are advertisements from the webpages. Once I detect those I`ll not be allowing those to be displayed on the client side.
Basically I'm using Back-propagation algorithm to train the neural network using the dataset given here: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements.
But in that dataset no. of attributes are very high. In fact one of the mentors of the project told me that If you train the Neural Network with that many attributes, it'll take lots of time to get trained. So is there a way to optimize the input dataset? Or I just have to use that many attributes?
1558 is actually a modest number of features/attributes. The # of instances(3279) is also small. The problem is not on the dataset side, but on the training algorithm side.
ANN is slow in training, I'd suggest you to use a logistic regression or svm. Both of them are very fast to train. Especially, svm has a lot of fast algorithms.
In this dataset, you are actually analyzing text, but not image. I think a linear family classifier, i.e. logistic regression or svm, is better for your job.
If you are using for production and you cannot use open source code. Logistic regression is very easy to implement compared to a good ANN and SVM.
If you decide to use logistic regression or SVM, I can future recommend some articles or source code for you to refer.
If you're actually using a backpropagation network with 1558 input nodes and only 3279 samples, then the training time is the least of your problems: Even if you have a very small network with only one hidden layer containing 10 neurons, you have 1558*10 weights between the input layer and the hidden layer. How can you expect to get a good estimate for 15580 degrees of freedom from only 3279 samples? (And that simple calculation doesn't even take the "curse of dimensionality" into account)
You have to analyze your data to find out how to optimize it. Try to understand your input data: Which (tuples of) features are (jointly) statistically significant? (use standard statistical methods for this) Are some features redundant? (Principal component analysis is a good stating point for this.) Don't expect the artificial neural network to do that work for you.
Also: remeber Duda&Hart's famous "no-free-lunch-theorem": No classification algorithm works for every problem. And for any classification algorithm X, there is a problem where flipping a coin leads to better results than X. If you take this into account, deciding what algorithm to use before analyzing your data might not be a smart idea. You might well have picked the algorithm that actually performs worse than blind guessing on your specific problem! (By the way: Duda&Hart&Storks's book about pattern classification is a great starting point to learn about this, if you haven't read it yet.)
aplly a seperate ANN for each category of features
for example
457 inputs 1 output for url terms ( ANN1 )
495 inputs 1 output for origurl ( ANN2 )
...
then train all of them
use another main ANN to join results