How to create your own Autoencoder in Matlab?

I am trying to make an autoencoder that works on the ORL dataset. I have the images ready as vectors (a 1024-by-400 matrix), and I was thinking of making an autoencoder with linear (fully connected) layers.
A little searching on the Internet turns up the trainAutoencoder function:
network = trainAutoencoder(fea, 512)
But this function doesn't seem to let me make an autoencoder with multiple layers. By googling, I found the stacked autoencoder, which solves that problem. What I am asking here is how to change the activation function (for example to ReLU) instead of the sigmoid that comes by default.
autoenc1 = [featureInputLayer(32*32)
    fullyConnectedLayer(16*16,"Name","fc_1")
    reluLayer("Name","relu_1")
    fullyConnectedLayer(8*8,"Name","fc_2")
    fullyConnectedLayer(16*16,"Name","fc_3")
    reluLayer("Name","relu_2")
    fullyConnectedLayer(32*32,"Name","fc_4")
    classificationLayer("Name","classoutput")]
Is it possible to write an autoencoder in this way? I know a classification output doesn't make sense for an unsupervised network, but MATLAB forced me to put something there. Is it possible to make an autoencoder using Deep Network Designer?
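For reference, this is the direction I am experimenting with, since as far as I can tell trainAutoencoder only exposes sigmoid-style transfer functions. The idea (just a sketch, I am not sure it is the intended approach) is to replace the classification layer with a regressionLayer and let trainNetwork reconstruct the input from itself:
% fea is my 1024-by-400 (features-by-images) matrix; trainNetwork with a
% featureInputLayer expects observations-by-features, hence the transposes.
layers = [featureInputLayer(32*32)
    fullyConnectedLayer(16*16)
    reluLayer
    fullyConnectedLayer(8*8)        % bottleneck / code layer
    fullyConnectedLayer(16*16)
    reluLayer
    fullyConnectedLayer(32*32)
    regressionLayer];               % mean-squared-error reconstruction loss
opts = trainingOptions("adam", "MaxEpochs", 200, "Verbose", false);
net  = trainNetwork(fea', fea', layers, opts);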

Related

Can I train Word2vec using a Stacked Autoencoder with non-linearities?

Every time I read about Word2vec, the embedding is obtained with a very simple Autoencoder: just one hidden layer, linear activation for the initial layer, and softmax for the output layer.
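To make that architecture concrete, in MATLAB layer terms (my own sketch; vocabSize and embedDim are placeholder values) I picture it like this:
vocabSize = 10000;                        % hypothetical vocabulary size
embedDim  = 300;                          % hypothetical embedding dimension
layers = [featureInputLayer(vocabSize)    % one-hot encoded context word
    fullyConnectedLayer(embedDim)         % linear hidden layer -> the embedding
    fullyConnectedLayer(vocabSize)
    softmaxLayer                          % softmax over the vocabulary
    classificationLayer];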
My question is: why can't I train some Word2vec model using a stacked Autoencoder, with several hidden layers with fancier activation functions? (The softmax at the output would be kept, of course.)
I have never found any explanation of this, so any hint is welcome.
Word vectors are nothing but the hidden states of a neural network trying to get good at something.
To answer your question: of course you can.
But if you are going to do that, why not use fancier networks/encoders as well, like a BiLSTM or a Transformer?
This is what the people who created things like ELMo and BERT did (though their networks were a lot fancier).

Unsupervised training of sparse autoencoders in matlab

I've tried to follow the example provided at MathWorks for training a deep sparse autoencoder (4 layers), so I pre-trained the autoencoders separately and then stacked them into a deep network. When I try to fine-tune this network, though, via the
train(deepnet, InputDataset)
instruction, the training stops instantly and I receive a "performance goals met" message.
Is there a way to train and fine-tune a deep autoencoder network in an unsupervised manner in MATLAB (no labels provided)?
Have you set the MSE goal? Secondly, for fine-tuning the network, use a conventional back-propagation algorithm in a supervised fashion, with the inputs themselves serving as the targets (reconstruction).
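A minimal sketch of what I mean (assuming deepnet came from stack() and InputDataset is your features-by-samples matrix): set an explicit performance goal and regress the inputs onto themselves.
deepnet.trainParam.goal   = 1e-5;  % MSE goal; tune for your data
deepnet.trainParam.epochs = 400;
deepnet = train(deepnet, InputDataset, InputDataset);  % targets = inputs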

How does a neural network work with correlated image data

I am new to TensorFlow and deep learning. I am trying to create a fully connected neural network for image processing, and I am somewhat confused.
We have an image, say 28x28 pixels. That gives 784 inputs to the NN. For non-correlated inputs this is fine, but image pixels are generally correlated. Consider, for instance, a picture of a cow's eye. How can a neural network understand this when all the pixels are lined up in an array for a fully connected network? How does it determine the correlation?
Please research some tutorials on CNNs (convolutional neural networks); here is a starting point for you. A fully connected layer of an NN surrenders all of the correlation information it might have had about the input; structurally, it implements the assumption that the inputs are statistically independent.
A convolution layer, by contrast, depends on the physical organization of the inputs (such as pixel adjacency), using that to find simple combinations (convolutions) of features from one layer to the next.
Bottom line: your NN doesn't find the correlation; the topology is wrong and cannot do the job you want.
Also, please note that a layered network of fully connected neurons with purely linear weight combinations is not deep learning. Deep learning requires at least one hidden layer, a topology which fosters "understanding" of intermediate structures. A purely linear, fully connected layering provides no such hidden layers: even if you program hidden layers, the outputs remain a simple linear combination of the inputs.
Deep learning requires some other discrimination, such as convolutions, pooling, rectification, or other non-linear combinations.
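A minimal sketch of the contrast (shown in MATLAB layer terms to match the rest of this page; layer sizes are illustrative): the convolution layers below exploit pixel adjacency, unlike a flat fully connected stack.
layers = [imageInputLayer([28 28 1])
    convolution2dLayer(3, 16, "Padding", "same")  % 3x3 kernels see neighboring pixels
    reluLayer                                     % non-linearity
    maxPooling2dLayer(2, "Stride", 2)             % pool local responses
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer]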
Let's break it into pieces to understand the intuition behind how an NN learns to predict.
To predict the class of a given image, we have to find a correlation, or direct link, between its input values and the class. We might hope that a single pixel could tell us which class the image belongs to, but that is impossible, so what we have to do is build up more complex functions, or complex features, that generate data correlated with the wanted class.
To make it simpler, imagine you want to build an AND function (p and q) or an OR function (p or q). In both cases there is a direct link between the input and the output: in the AND function, if there is a 0 in the input, the output is always zero. But what if we want the XOR function (p xor q)? There is no direct link between the input and the output. The answer is to build a first layer that classifies AND and OR, and then a second layer that takes the results of the first layer and builds the XOR function:
(p xor q) = (p or q) and not (p and q)
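A tiny illustration of this (my own sketch; two hidden neurons is the classic minimal solution): a network with one hidden layer can learn XOR, which no single linear layer can represent.
X = [0 0 1 1; 0 1 0 1];           % columns are the inputs (p, q)
T = double(xor(X(1,:), X(2,:)));  % XOR targets: [0 1 1 0]
net = feedforwardnet(2);          % one hidden layer with 2 neurons
net = train(net, X, T);
round(net(X))                     % should recover [0 1 1 0]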
By applying this method in a multi-layer NN you get the same result, but then you have to deal with a huge number of parameters. One solution is to extract features from the images that are representative, high-variance, and uncorrelated with each other but correlated with their class, and feed those to the network; you can look up image feature extraction on the web.
This is a small explanation of how to see the link between images and their classes and how an NN works to classify them. You need to understand the NN concept first; then you can go on to read about deep learning.

Patternet not converging to a solution

I need to use a neural network as a binary classifier. I am using MATLAB to classify the data, specifically with patternnet. The problem is that the neural network doesn't seem to find a solution: the performance is asymptotic and doesn't move at all, staying static across the whole training session.
I have had better results with the plain feedforward net: I get real values as the output rather than binary ones, so I define a threshold (for instance, above 0.5 is 1, below 0.5 is 0). Is there a better way to do it?
Why does the feedforward pattern network seem useless for this task while the regular feedforward net for fitting seems the better approach?
You can try standardizing your data. With non-standardized data, the training algorithm tends to get stuck in local minima.
You can also increase the number of hidden-layer neurons, or even the number of hidden layers, and try again.
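A minimal sketch of the standardization step (assuming X is features-by-samples and T holds the 0/1 labels; Xnew is hypothetical unseen data):
[Xs, settings] = mapstd(X);    % zero mean, unit variance per feature row
net = patternnet(20);          % hidden-layer size is illustrative
net = train(net, Xs, T);
yhat = net(mapstd('apply', Xnew, settings)) > 0.5;  % same scaling, then threshold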

Optimization of Neural Network input data

I'm trying to build an app that detects which images on a webpage are advertisements; once detected, I won't allow them to be displayed on the client side.
Basically, I'm using the back-propagation algorithm to train a neural network on the dataset given here: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements.
But the number of attributes in that dataset is very high. In fact, one of the project mentors told me that if I train the neural network with that many attributes, it will take a long time to train. So is there a way to optimize the input dataset, or do I just have to use all of the attributes?
1558 is actually a modest number of features/attributes, and the number of instances (3,279) is also small. The problem is not on the dataset side but on the training-algorithm side.
ANNs are slow to train, so I'd suggest you use logistic regression or an SVM. Both are very fast to train, and SVMs in particular have many fast algorithms.
Also, in this dataset you are really analyzing text, not images, and I think a linear-family classifier, i.e. logistic regression or an SVM, is better suited to your job.
If this is for production and you cannot use open-source code, note that logistic regression is very easy to implement compared to a good ANN or SVM.
If you decide to use logistic regression or an SVM, I can further recommend some articles or source code for you to refer to.
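A quick sketch of the two fast baselines suggested above (assuming X is samples-by-features and y is a logical label vector; both functions are from the Statistics and Machine Learning Toolbox):
svmMdl = fitcsvm(X, y, 'KernelFunction', 'linear');  % linear SVM
lrMdl  = fitclinear(X, y, 'Learner', 'logistic');    % logistic regression; fast for high-dimensional data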
If you're actually using a backpropagation network with 1558 input nodes and only 3279 samples, then the training time is the least of your problems: Even if you have a very small network with only one hidden layer containing 10 neurons, you have 1558*10 weights between the input layer and the hidden layer. How can you expect to get a good estimate for 15580 degrees of freedom from only 3279 samples? (And that simple calculation doesn't even take the "curse of dimensionality" into account)
You have to analyze your data to find out how to optimize it. Try to understand your input data: which (tuples of) features are (jointly) statistically significant? (Use standard statistical methods for this.) Are some features redundant? (Principal component analysis is a good starting point for this; see the sketch below.) Don't expect the artificial neural network to do that work for you.
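A sketch of the PCA step suggested above (assuming X is the 3279-by-1558 samples-by-features matrix): keep enough components to explain, say, 95% of the variance.
[coeff, score, ~, ~, explained] = pca(X);
k  = find(cumsum(explained) >= 95, 1);  % number of components to keep
Xr = score(:, 1:k);                     % reduced feature matrix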
Also, remember Duda & Hart's famous "no free lunch" theorem: no classification algorithm works for every problem, and for any classification algorithm X there is a problem where flipping a coin leads to better results than X. If you take this into account, deciding which algorithm to use before analyzing your data might not be a smart idea; you might well have picked the algorithm that actually performs worse than blind guessing on your specific problem! (By the way, Duda, Hart & Stork's book on pattern classification is a great starting point for learning about this, if you haven't read it yet.)
Apply a separate ANN to each category of features, for example:
457 inputs, 1 output, for the URL terms (ANN1)
495 inputs, 1 output, for origurl (ANN2)
...
Then train all of them, and
use another main ANN to join the results, as sketched below.
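A sketch of this per-category ensemble (the index ranges are hypothetical; adapt them to the actual attribute groups in the dataset; X is features-by-samples with labels T):
urlIdx  = 1:457;                 % "url terms" feature rows
origIdx = 458:952;               % "origurl" feature rows
net1 = train(patternnet(10), X(urlIdx,:),  T);       % ANN1
net2 = train(patternnet(10), X(origIdx,:), T);       % ANN2
scores  = [net1(X(urlIdx,:)); net2(X(origIdx,:))];   % stack sub-network outputs
mainNet = train(patternnet(5), scores, T);           % main ANN joins the results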