I have two questions:
Do I have to apply feature scaling to ALL features in a neural network (and in deep learning too)?
How can I scale categorical features in a dataset for a neural network (if needed)?
It depends on what you are trying to do. You could:
Use one-hot encoding to turn categorical features into numeric values.
For numerical features, you could apply min-max scaling (subtract each feature's minimum and divide by its range),
and then you will get values in [0,1], which makes them well suited to feed a neural network.
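As a minimal sketch of the two steps above in plain Python (the category list and feature values are made up for illustration):

```python
def one_hot(value, categories):
    """Encode a categorical value as a one-hot vector over a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_scale(values):
    """Scale a list of numeric values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up example data
colors = ["red", "green", "blue"]
print(one_hot("green", colors))     # [0.0, 1.0, 0.0]
print(min_max_scale([10, 20, 40]))  # [0.0, 0.333..., 1.0]
```

In practice you would fit the min/max on the training set only and reuse those values at test time, so the test data cannot leak into the scaling.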
Related
In a neural network, for an intermediate layer, I need to threshold the output. The output of each neuron in the layer is a real value, but I need to binarize it (to 0 or 1). But with hard thresholding, backpropagation won't work. Is there a way to achieve this?
Details:
I have a GAN-like setup, i.e. two neural networks trained end-to-end. The output of the first neural network is real-valued. I need those to be binary values. I read that Gumbel Softmax (Categorical Reparameterization) is used to handle discrete variables in a neural network. Is there a way to use that for my use-case? If yes, how? If not, is there any other way?
From what I could gather on the internet, Gumbel is a probability distribution, and using it we can generate samples from a discrete distribution. But for my use-case, I need a function that can take a real input and output a binary value, i.e. an activation function of that form. How can I achieve that?
Thanks!
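The binary case of the Gumbel-Softmax trick mentioned above (a relaxed Bernoulli sample, sometimes called "binary concrete") can be sketched as follows; the logit and temperature values are illustrative assumptions:

```python
import math
import random

def sample_binary_concrete(logit, tau, rng=random):
    """Relaxed Bernoulli sample: differentiable for tau > 0,
    approaching a hard 0/1 value as tau -> 0."""
    u = rng.random()
    # Difference of two Gumbel samples reduces to logistic noise in the binary case
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return 1.0 / (1.0 + math.exp(-(logit + logistic_noise) / tau))

random.seed(0)
soft = sample_binary_concrete(logit=2.0, tau=1.0)   # smooth value in (0, 1)
hard = sample_binary_concrete(logit=2.0, tau=0.01)  # typically very close to 0 or 1
print(soft, hard)
```

As the temperature tau approaches 0 the sample hardens toward 0/1 while staying differentiable for tau > 0, which is what allows backpropagation through the (near-)binary output.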
Suppose we have a set of images and labels meant for a machine-learning classification task. The problem is that these images come with a relatively short retention policy. While one could train a model online (i.e. update it with new image data every day), I'm ideally interested in a solution that can somehow retain images for training and testing.
To this end, I'm interested in whether there are any known techniques, for example some kind of one-way hashing on images, which obfuscates the image but still allows for deep learning techniques on it.
I'm not an expert on this, but the way I'm thinking about it is as follows: we have an NxN image I (say 1024x1024) with pixel values in P := {0,1,...,255}^3, and a one-way hash map f : P^(NxN) -> S. Then, when we train a convolutional neural network on I, we first map the convolutional filters via f, to then train in a high-dimensional space S. I think there's no need for f to be locally sensitive, in that pixels near each other don't need to map to values in S near each other, as long as we know how to map the convolutional filters to S. Please note that it's imperative that f is not invertible, and that the resulting stored image in S is unrecognizable.
One option for f, S is to use a convolutional neural network on I and then extract the representation of I from its fully-connected layer. This is not ideal because there's a high chance that this network won't retain the finer features needed for the classification task, so I think this rules out a CNN or autoencoder for f.
I want to use a pre-trained model for face identification. I am trying to use a Siamese architecture, which requires only a small number of images. Could you point me to any trained model that I can adapt for the Siamese architecture? How can I change the network so that I can feed it two images to find their similarity (I do not want to create an image as in the tutorial here)? I only want to use the system for a real-time application. Do you have any recommendations?
I suppose you can use this model, described in Xiang Wu, Ran He, Zhenan Sun, Tieniu Tan, A Light CNN for Deep Face Representation with Noisy Labels (arXiv 2015), as a starting point for your experiments.
As for the Siamese network, what you are trying to learn is a mapping from a face image into some high-dimensional vector space, in which distances between points reflect (dis)similarity between faces.
To do so, you only need one network that gets a face as an input and produces a high-dimensional vector as an output.
However, to train this single network using the Siamese approach, you are going to duplicate it: create two instances of the same net (you need to explicitly link the weights of the two copies). During training you provide pairs of faces to the nets, one to each copy; a single loss layer on top of the two copies then compares the high-dimensional vectors representing the two faces and computes a loss according to the "same/not same" label associated with the pair.
Hence, you only need the duplication for the training. In test time ('deploy') you are going to have a single net providing you with a semantically meaningful high dimensional representation of faces.
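As a minimal sketch of the loss described above, a contrastive loss on the distance between the two copies' outputs might look like this (the margin and embedding values are illustrative, not taken from any cited paper):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb1, emb2, same, margin=1.0):
    """Pull embeddings of the same face together; push different faces
    apart until they are at least `margin` away from each other."""
    d = euclidean(emb1, emb2)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Illustrative embeddings (in practice, outputs of the two weight-shared nets)
anchor   = [0.1, 0.9]
positive = [0.2, 0.8]
negative = [0.6, 0.4]
print(contrastive_loss(anchor, positive, same=True))   # small: pair is close
print(contrastive_loss(anchor, negative, same=False))  # nonzero: pair is within the margin
```

The margin keeps already-well-separated negative pairs from contributing to the loss, so the net spends capacity only on hard negatives.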
For a more advanced Siamese architecture and loss, see this thread.
On the other hand, you might want to consider the approach described in Oren Tadmor, Yonatan Wexler, Tal Rosenwein, Shai Shalev-Shwartz, Amnon Shashua Learning a Metric Embedding for Face Recognition using the Multibatch Method (arXiv 2016). This approach is more efficient and easy to implement than pair-wise losses over image pairs.
Self-organizing maps are more suited to clustering (dimensionality reduction) than to classification, but SOMs are used in learning vector quantization (LVQ) for fine-tuning. LVQ, however, is a supervised learning method, so to use SOMs with LVQ, LVQ must be provided with a labelled training data set. Since SOMs only do clustering, not classification, and thus cannot produce labelled data, how can a SOM be used as an input for LVQ?
Does LVQ fine-tune the clusters in the SOM?
Before being used in LVQ, should the SOM output be put through another classification algorithm, so that it can classify the inputs and these labelled inputs may be used in LVQ?
It must be clear that supervised learning differs from unsupervised learning in that, in the former, the target values are known.
Therefore, the output of supervised models is a prediction.
Instead, the output of unsupervised models is a label whose meaning we don't yet know. For this reason, after clustering, it is necessary to profile each of those new labels.
Having said that, you could label the dataset using an unsupervised learning technique such as a SOM. Then you should profile each class to make sure you understand its meaning.
At this point, you can pursue two different paths depending on your final objective:
1. use this new variable as a means of dimensionality reduction
2. use this new dataset, with the additional variable representing the class, as labelled data that you then try to predict using LVQ
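Path 2 can be sketched with the classic LVQ1 update rule; the cluster members, prototype coordinates, and learning rate below are made-up illustrations:

```python
def majority_label(cluster_members):
    """Assign a SOM cluster a label by majority vote over its members' labels."""
    counts = {}
    for label in cluster_members:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

def lvq1_update(prototype, proto_label, x, x_label, lr=0.1):
    """LVQ1: move the winning prototype toward x if the labels agree,
    away from x otherwise."""
    sign = 1.0 if proto_label == x_label else -1.0
    return [p + sign * lr * (xi - p) for p, xi in zip(prototype, x)]

# Illustrative: a SOM cluster whose members are mostly class "A"
label = majority_label(["A", "A", "B"])             # "A"
proto = lvq1_update([0.5, 0.5], label, [1.0, 0.0], "A")  # pulled toward the sample
print(label, proto)
```

This is exactly the "fine-tuning" step asked about: the SOM supplies the initial prototypes and (after profiling) their labels, and LVQ1 nudges those prototypes using the labelled data.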
Hope this can be useful!
I'm developing a project for university. I have to create a classifier for a disease. The dataset I have contains several inputs (symptoms), and each of them is associated with a multiplicative probability factor (e.g. if a patient has symptom A, he has double the probability of having that disease).
So, how can I build this type of classifier? Is there any type of neural network or other tool to do this?
Thanks in advance
You should specify how much labeled data and unlabeled data you have.
Let's assume you have only labeled data. Then you could use neural networks, but IMHO, SVMs or random forests are the best techniques for a first try.
Note that if you use machine learning techniques, your prior information about the symptoms (the multiplicative coefficients) is not used, because the labels are used instead. If you want to use these coefficients, it's no longer machine learning.
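If you nevertheless want to use the coefficients directly, one reading of the question's setup is a naive-Bayes-style update. Applying the factors to the odds rather than the probability is my own assumption (it keeps the result in [0, 1]); the base rate and factor values are invented for illustration:

```python
def disease_probability(base_rate, symptom_factors, present_symptoms):
    """Combine a base rate with per-symptom multiplicative factors.
    Factors are applied to the odds so the result stays in [0, 1]."""
    odds = base_rate / (1.0 - base_rate)
    for s in present_symptoms:
        odds *= symptom_factors.get(s, 1.0)  # absent symptoms contribute factor 1
    return odds / (1.0 + odds)

# Made-up numbers: symptom "A" doubles the odds, "B" triples them
factors = {"A": 2.0, "B": 3.0}
p = disease_probability(0.10, factors, ["A", "B"])
print(round(p, 3))  # 0.4
```

A rule like this could also serve as a sanity-check baseline next to whatever learned classifier you train on the labels.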
You can use a neural network for this purpose as well. In your situation, binding symptom A to a higher chance of disease B is exactly what a neural network should be able to accomplish: it learns the connection weights from input A (symptom A) to disease B. On your side, you can instill such a classification rule provided you have enough examples in your training data set. I also propose that you try two different approaches: 1. a single neural network with N outputs (N = number of diseases to classify); 2. a separate neural network for each disease.