Lake Visitor Modeling by Neural Networks - matlab

Let's say I want to model the number of visitors at an arbitrary lake at a specific time.
Given Data:
Time Series of Amount of Visitors for 12 lakes.
Weather Time Series for the 12 lakes
Number of Trees at lake
Percentage of grass/stone ground of the beach.
Hereby I want to use a Neural Network (NN) to model the amount of visitors and I have some essential questions which I want to introduce step by step. Note that the visitor time series shall not be used!
1) we only use the Inputs:
Time of Day
Day of Week
So there are two inputs and one output. I read about a rule of thumb which says that the number of hidden neurons should be chosen as
#inputs >= #hidden neurons >= #outputs.
Is the number of inputs here 2, or is it an estimate of the real number of variables the output depends on (such as weather, mood of persons, economic situation, ...)? If it is 2, I should choose 1 or 2 hidden neurons, correct?
2) If I want to include lake-specific parameters such as the number of trees or the ground ratio, can I just add these as additional inputs (constant for each of the twelve lakes), or would that not help for some reason? How could I make sure that there is a causal connection between these inputs and the output?
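To make question 2 concrete, here is a minimal sketch of what "adding them as additional inputs" would look like: the per-lake constants are simply repeated on every time step of that lake. The lake names, the numbers and the helper build_rows are all hypothetical and only serve as illustration.

```python
# Hypothetical per-lake constants; the numbers are made up for illustration.
lake_static = {
    "lake_a": {"trees": 120, "grass_ratio": 0.7},
    "lake_b": {"trees": 15, "grass_ratio": 0.2},
}

def build_rows(lake, hours, weekdays):
    """Return one input row per time step: [hour, weekday, trees, grass_ratio]."""
    static = lake_static[lake]
    return [
        [hour, weekday, static["trees"], static["grass_ratio"]]
        for hour, weekday in zip(hours, weekdays)
    ]

# The static columns are identical for every row that belongs to the same lake.
rows = build_rows("lake_a", hours=[9, 12, 15], weekdays=[5, 5, 6])
rows += build_rows("lake_b", hours=[9, 12, 15], weekdays=[5, 5, 6])
for row in rows:
    print(row)
```

With twelve lakes, each static column takes at most twelve distinct values across the whole training set, which is worth keeping in mind when judging how much the network can learn from it.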
3) Weather is a time series, so which weather values should I use? How do I find the optimal delay, for example? Would Granger causality be a means to determine that?
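As a rough illustration of the delay question, the sketch below scans a few candidate lags and reports the correlation between weather[t - lag] and visitors[t]. This is a plain lagged-correlation scan, not a Granger causality test, and the data is synthetic (the visitor numbers are constructed to follow the temperature two steps earlier). The function names are my own.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def lagged_correlations(weather, visitors, max_lag):
    """Correlation between weather[t - lag] and visitors[t] for each lag."""
    result = {}
    for lag in range(max_lag + 1):
        w = weather[: len(weather) - lag]  # weather, shifted back by `lag`
        v = visitors[lag:]                 # visitors aligned with it
        result[lag] = pearson(w, v)
    return result

# Synthetic data: visitors follow the temperature two steps earlier.
temperature = [15, 18, 22, 27, 24, 19, 16, 21, 26, 29, 23, 17]
visitors = [0, 0] + [10 * t for t in temperature[:-2]]

corr = lagged_correlations(temperature, visitors, max_lag=4)
best_lag = max(corr, key=lambda lag: abs(corr[lag]))
print(best_lag, corr)
```

A scan like this only finds linear dependence at a single lag; it is meant as a starting point for choosing which delayed weather values to try as inputs.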
Hope you can help. I just want to discuss the strengths of NNs for modeling and hear your opinion. I would use MATLAB's Neural Network Toolbox for this.
Thanks in advance.

Related

Choice of Neural Network and Activation Function

I am very new to the field of neural networks. Apologies if this question is very amateurish.
I am looking to build a neural network model to predict whether a particular image that I am about to post on a social media platform will get a certain engagement rate.
I have around 120 images with historical data about the engagement rate. The following information is available:
Images of size 501 px x 501 px
Type of image (Exterior photoshoot/Interior photoshoot)
Day of posting the image (Sunday/Monday/Tuesday/Wednesday/Thursday/Friday/Saturday)
Time of posting the image (18:33, 10:13, 19:36 etc)
No. of people who have seen the post (15659, 35754, 25312 etc)
Engagement rate (5.22%, 3.12%, 2.63% etc)
I would like the model to predict if a certain image when posted on a particular day and time will give an engagement rate of 3% or more.
As you may have noticed, the input data is images, text (signifying what type or day), time and numbers.
Could you please help me understand how to build a neural network for this problem?
P.S: I am very new to this field. It would be great if you can give a detailed direction how I should proceed to solve this problem.
A neural network has three kinds of neuronal layers:
Input layer. It stores the inputs this network will receive. The number of neurons must equal the number of inputs you have;
Hidden layer. It uses the inputs that come from the previous layer and it does the necessary calculations so as to obtain a result, which passes to the output layer. More complex problems may require more than one hidden layer. As far as I know, there is not an algorithm to determine the number of neurons in this layer, so I think you determine this number based on trial and error and previous experience;
Output layer. It gets the results from the hidden layer and passes them on to the user. The number of neurons in the output layer equals the number of outputs you have.
According to what you write here, your training database has 5 inputs and one output (the engagement rate). This means that your artificial neural network (ANN) will have 5 neurons on the input layer and one neuron on the output layer.
I am not sure if you can pass images as inputs to a neural network. Also, because in theory there are infinitely many types of images, I think you should categorize them a bit, with each category receiving a number. An example of categorization would be:
Images with dogs are in category 1;
Images with hospitals are in category 2, etc.
So, your data will look like this:
Image category (dogs=1, hospitals=2, etc.);
Type of image (Exterior photoshoot=1, interior photoshoot=2);
Posting day (Sunday=1, Monday=2, etc.);
Time of posting the image;
Number of people who have seen the post;
Engagement rate (this is the output, not an input).
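A minimal sketch of turning one record from the list above into numbers. Note that it swaps in one-hot encoding for the day and the image type instead of the integer codes suggested above, because integer codes imply an ordering (Sunday < Monday < ...) that the network may pick up on; that swap is my assumption, as are the function names. The image category and the number of views are left out only to keep the sketch short. The label follows the question: 1 if the engagement rate is 3% or more.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
TYPES = ["Exterior", "Interior"]

def one_hot(value, categories):
    """Return a list with a 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == c else 0.0 for c in categories]

def encode_post(image_type, day, time_hhmm, engagement_pct):
    """Encode one post as (features, label)."""
    hours, minutes = map(int, time_hhmm.split(":"))
    time_fraction = (hours * 60 + minutes) / (24 * 60)  # time of day in [0, 1)
    features = one_hot(image_type, TYPES) + one_hot(day, DAYS) + [time_fraction]
    label = 1 if engagement_pct >= 3.0 else 0  # target: engagement of 3% or more
    return features, label

features, label = encode_post("Exterior", "Monday", "18:33", 5.22)
print(len(features), features, label)
```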
The number of hidden layers and the number of neurons in each hidden layer depend on your problem's complexity. Having 120 pictures, I think one hidden layer with 10 neurons is enough.
The ANN will have one output (the engagement rate).
Once the database containing the information about the 120 pictures (known as the training database) is created, the next step is to train the ANN using it. However, there is some discussion here.
Training an ANN means computing some parameters of the neurons (their weights and biases) using an optimization algorithm so that the sum of squared errors is minimized. The training process has some degree of randomness to it. To minimize the effect of the randomness factor and to get as precise estimations as possible, your training database must have:
Consistent data;
Many records;
I don't know how consistent your data are, but from my experience, a small training database with consistent data beats a huge database with non-consistent ones.
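To illustrate what "training" means in the description above, here is a small sketch of a network with one hidden layer trained by stochastic gradient descent on the squared error. It is written in plain Python on a toy problem (y = x^2) and is not what any particular ANN software does internally; the layer size, learning rate and number of epochs are arbitrary choices of mine. The random initial weights are one source of the randomness mentioned above, which is why a seed is fixed here.

```python
import math
import random

def train(xs, ys, hidden=3, lr=0.05, epochs=2000, seed=0):
    """Train a 1-input, 1-output network; return the SSE recorded in each epoch."""
    rng = random.Random(seed)  # random initial weights: one source of randomness
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias
    history = []
    for _ in range(epochs):
        sse = 0.0
        for x, y in zip(xs, ys):
            # Forward pass: tanh hidden layer, linear output.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - y
            sse += err ** 2
            # Backward pass: gradient of err**2 (the factor 2 is folded into lr).
            for j in range(hidden):
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(sse)
    return history

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [x ** 2 for x in xs]
history = train(xs, ys)
print("SSE first epoch:", history[0], "SSE last epoch:", history[-1])
```

Changing the seed gives different starting weights and usually a slightly different final error, which is the effect the training database is supposed to dampen.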
Judging by the problem, I think you should use the default activation function provided by the software you use for ANN handling.
Once you have trained your network, it is time to see how efficient this training was. The software which you use for the ANN should provide you with tools to estimate this, and those tools should be documented. If the training is satisfactory for you, you may begin using the ANN. If it is not, you may either re-train the ANN or use a larger database.

how to conduct the proper multilayer perceptron on interval data

I have a dataset of the daily temperature for a couple of years. The data is in interval form, consisting of the daily high temperature and the daily low temperature.
I want to forecast the data, and I recently read several papers mentioning that the multilayer perceptron is well suited for this. However, after reading the papers I am still puzzled. I know that in order to do this I need an input layer, a hidden layer and an output layer. But in MATLAB, though I have the code already, I still don't know how to simulate it. What should I use as input and output? Should I use the interval data as both input and output? And how do I choose the hidden layer?
The inputs to an MLP network are the features you use to make the prediction. The output is what you are trying to predict. The hidden layer determines how well the network predicts; you want it only as large as it needs to be to achieve reasonable prediction results. If it is too large, the network just memorizes the training data rather than generalizing from the pattern.
For example, your input layer could be the day of the year (1-365), the high of the day, and the low of the day. I assume the output would then be the high and low temperature for the next day?
The more relevant input features you have the better the network will be.
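The input/output layout suggested above can be sketched as follows: each training input is (day of year, high, low) for one day, and the matching target is (high, low) of the following day. The function name and the temperature values are made up for illustration.

```python
def make_pairs(records):
    """records: list of (day_of_year, high, low), one entry per consecutive day.

    Returns (inputs, targets) where targets[i] is the (high, low) of the day
    after inputs[i].
    """
    inputs, targets = [], []
    for today, tomorrow in zip(records, records[1:]):
        inputs.append(list(today))
        targets.append([tomorrow[1], tomorrow[2]])
    return inputs, targets

# Made-up temperatures for four consecutive days.
records = [(1, 5.0, -2.0), (2, 6.5, -1.0), (3, 4.0, -3.5), (4, 7.0, 0.5)]
inputs, targets = make_pairs(records)
for x, y in zip(inputs, targets):
    print(x, "->", y)
```

The last day has no "next day" in the data, so it appears only as a target, never as an input.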

How to create a neural network for finding geolocation

An order of food was made at a certain time from a certain place. I have the following info:
Location - delivery address
Date and time of the order
Day of the week
The weather on that day
The aim is to train the neural network and then predict future orders, e.g. the places where most of the orders will be tomorrow.
What type of neural network should be used to achieve that goal?
What framework or library is best suited for this kind of neural network?
It would be great to have an example of a working similar neural network!
Your data is a time series and as such can be tackled by e.g. a Recurrent Neural Network (RNN). The popular choice here is LSTM.
You should consider what granularity your location system should have. The exact coordinates are irrelevant; you should translate them to addresses or, better yet, districts. You can try a square grid, but something hand-crafted will work better. Visualise your training data, see how the orders cluster, and create districts based on that.
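As a starting point for the square-grid variant mentioned above (not the hand-crafted districts, which depend on your city), coordinates can be binned into cells and orders counted per cell. The cell size of 0.01 degrees and the coordinates are arbitrary example values.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the index of the square grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

# Made-up delivery coordinates (latitude, longitude).
orders = [
    (52.5205, 13.4093),
    (52.5208, 13.4051),
    (52.5312, 13.3847),
    (52.5203, 13.4066),
]

orders_per_cell = Counter(grid_cell(lat, lon) for lat, lon in orders)
busiest_cell, count = orders_per_cell.most_common(1)[0]
print(busiest_cell, count)
```

The per-cell counts for each day (or hour) would then form the time series that the network is trained on.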
Expect seasonality of various forms. Your ANN should for sure look at least at a full month, if not a year (depending on climate). You can start with a week though.

what should be my input in ANN

I am getting confused about the input data set. I am studying artificial neural networks, and my purpose is to use historical data (I have stock data for the last 10 years) to predict the stock value in the future (for example 2015). So, what is my input? For example, I have Excel sheet data as [Column1-Date | Column2-High | Column3-Low | Column4-Opening | Column5-Closing]
By profession I am a quant and I am currently pursuing a master's degree in Computer Science. There are many considerations when selecting financial inputs for a neural network, including:
Select indicators which are positively correlated to returns.
Indicators are independent variables which have predictive power on the dependent variable (stock returns). Common popular indicators include technical indicators derived from price and volume data, fundamental indicators about the underlying company or asset, and quantitative indicators such as descriptive statistics or even model parameters. If you have many indicators, you can narrow them down using correlation analysis, best subset, or principal component analysis.
Pre-process the indicators for use in Neural Networks
Neural networks work by connecting perceptrons together. Each perceptron contains an activation function e.g. the sigmoid function or tanh. Most activation functions have an active range. For the sigmoid function this is between -sqrt(3) and +sqrt(3). What this means is that you should normalize your data to within the active range and seriously consider removing outliers.
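A minimal sketch of the pre-processing step described above: outliers are clipped first, then the values are rescaled linearly into a target range. The range of -sqrt(3) to +sqrt(3) is simply taken from the text above, the clipping bounds are picked by hand for this example, and the prices are made up.

```python
import math

def clip(values, lower, upper):
    """Limit every value to the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

def scale_to_range(values, low=-math.sqrt(3), high=math.sqrt(3)):
    """Linearly rescale values so that min -> low and max -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Constant input: map everything to the middle of the range.
        return [(low + high) / 2 for _ in values]
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]

closing = [101.2, 99.8, 100.5, 250.0, 102.1]  # 250.0 plays the role of an outlier
clipped = clip(closing, lower=95.0, upper=105.0)  # bounds chosen by hand here
scaled = scale_to_range(clipped)
print(scaled)
```

Without the clipping step, the single outlier would squeeze all other values into a narrow band near the lower end of the range.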
There are many other potential issues with using Neural Networks. I wrote an article a while back which identified ten issues, including the ones mentioned here. Feel free to check it out.

Multi Step Prediction Neural Networks

I have been working with the MATLAB neural network toolkit. Here I am using the NARX network. I have a dataset consisting of prices of an object as well as the quantity of the object purchased over a period of time. Essentially, this network does one-step prediction, which is defined mathematically as follows:
y(t) = f(y(t-1), y(t-2), ..., y(t-ny), x(t-1), x(t-2), ..., x(t-nx))
Here y(t) is the price at time t and x is the amount. So the input features I am using are price and amount, and the target is the price at time t+1. Suppose I have 100 records of such transactions and each transaction consists of the price and the amount. Then essentially my neural network can predict the price of the 101st transaction. This works fine for one-step predictions. However, if I want to do multi-step predictions, say predict 10 transactions ahead (the 110th transaction), then I assume that I do a one-step prediction of the price, feed it back into the neural network, and keep doing this until I reach the 110th prediction. The problem is that after I predict the 101st price, I can feed this price into the neural network to predict the 102nd price, but I do not know the amount of the object at the 101st transaction. How do I go about this? I was thinking about setting my targets to be the prices of transactions that are 10 transactions ahead of the current one, so that when I predict the 101st transaction, I am essentially predicting the price of the 110th transaction. Is this a viable solution or am I going about this in a completely wrong manner? Thanks in advance for any help.
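For reference, the "targets 10 transactions ahead" idea from the question can be written down as follows. This only builds the training pairs; it is a sketch with made-up data and a hypothetical helper name, not NARX toolbox code.

```python
def direct_targets(prices, amounts, horizon):
    """Pair the features at step t with the price `horizon` steps later."""
    inputs, targets = [], []
    for t in range(len(prices) - horizon):
        inputs.append([prices[t], amounts[t]])
        targets.append(prices[t + horizon])
    return inputs, targets

# Made-up series of 20 transactions.
prices = [float(p) for p in range(100, 120)]
amounts = [float(a % 5 + 1) for a in range(20)]

inputs, targets = direct_targets(prices, amounts, horizon=10)
print(len(inputs), inputs[0], targets[0])
```

With this layout the last `horizon` transactions have no target yet, so they are usable only for making the forecast, not for training.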
Similar to what kostas said, once you have the predicted 101 price, you can use all your data to predict the 101 amount, then use that to predict the 102 price, then use the 102 price to predict the 102 amount, etc. However, this compounds any error in your predictions for each variable. To mitigate that, you can add several other features, like a tapering discount on past values or a measure of error to use in the prediction (search for temporal difference learning for similar ideas in the reinforcement learning realm).
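The alternating scheme described above can be sketched as a loop. The two model functions below are trivial stand-ins for trained networks (they are assumptions for illustration only); the point is the order in which predictions are fed back.

```python
def forecast(prices, amounts, price_model, amount_model, steps):
    """Alternate price and amount predictions, feeding each one back in."""
    prices, amounts = list(prices), list(amounts)  # do not modify the inputs
    n = len(prices)
    for _ in range(steps):
        # Predict the next price from everything known so far.
        prices.append(price_model(prices, amounts))
        # Predict the matching amount, now including the new price.
        amounts.append(amount_model(prices, amounts))
    return prices[n:], amounts[n:]

# Stand-in models, placeholders for trained networks.
def price_model(prices, amounts):
    return prices[-1] + 0.1 * amounts[-1]

def amount_model(prices, amounts):
    return amounts[-1]

future_prices, future_amounts = forecast(
    [10.0, 10.5], [5.0, 5.0], price_model, amount_model, steps=3
)
print(future_prices, future_amounts)
```

Because every predicted value becomes an input to the next step, an error made early on is carried through all later steps, which is the compounding effect mentioned above.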
I guess you can use a separate neural network to do time series prediction for x in order to produce x(t+1) up to x(t+10) and then use these values to feed another ANN to predict y(t).