Deep Learning on Encrypted Images - hash

Suppose we have a set of images and labels meant for a machine-learning classification task. The problem is that these images come with a relatively short retention policy. While one could train a model online (i.e. update it with new image data every day), I'm ideally interested in a solution that can somehow retain images for training and testing.
To this end, I'd like to know whether there are any known techniques, for example some kind of one-way hashing on images, that obfuscate the image but still allow deep learning techniques to be applied to it.
I'm not an expert on this, but the way I'm thinking about it is as follows: we have an NxN image I (say 1024x1024) with pixel values in P := {0,1,...,255}^3, and a one-way hash map f: P^(NxN) -> S. Then, when we train a convolutional neural network on I, we first map the convolutional filters via f, and then train in the high-dimensional space S. I think there's no need for f to be locally sensitive, in that pixels near each other don't need to map to values in S near each other, as long as we know how to map the convolutional filters to S. Please note that it's imperative that f is not invertible, and that the resulting stored image in S is unrecognizable.
One option for f, S is to run a convolutional neural network on I and extract the representation of I from its fully connected layer. This is not ideal because there's a high chance that this network won't retain the finer features needed for the classification task. So I think this rules out a CNN or autoencoder for f.
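As a rough illustration of the tension in the question (a sketch, not a proposed f): an off-the-shelf cryptographic hash is one-way, but it is designed so that nearly identical inputs produce unrelated digests. SHA-256 and the toy 4x4 images below are assumptions chosen only for the demonstration.

```python
import hashlib

def digest_bits(pixels):
    """SHA-256 digest of a flat list of 8-bit pixel values, as a 256-character bit string."""
    digest = hashlib.sha256(bytes(pixels)).digest()
    return "".join(f"{byte:08b}" for byte in digest)

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Two hypothetical 4x4 grayscale images that differ in one pixel by one intensity level.
image_a = [10 * i for i in range(16)]
image_b = list(image_a)
image_b[5] += 1

bits_a = digest_bits(image_a)
bits_b = digest_bits(image_b)
changed = hamming(bits_a, bits_b)
print(f"{changed} of {len(bits_a)} digest bits differ")
```

With a hash of this kind, the digest carries no usable notion of "similar images", so any structure a classifier could exploit would have to be preserved by a different choice of f.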

Related

Face Recognition based on Deep Learning (Siamese Architecture)

I want to use a pre-trained model for face identification. I am trying to use a Siamese architecture, which requires only a small number of images. Could you point me to a trained model that I can adapt to the Siamese architecture? How can I change the network so that I can feed it two images and find their similarity (I do not want to create images as in the tutorial here)? I only want to use the system for a real-time application. Do you have any recommendations?
I suppose you can use this model, described in Xiang Wu, Ran He, Zhenan Sun, Tieniu Tan, "A Light CNN for Deep Face Representation with Noisy Labels" (arXiv 2015), as a starting point for your experiments.
As for the Siamese network, what you are trying to learn is a mapping from a face image into some high-dimensional vector space, in which distances between points reflect (dis)similarity between faces.
To do so, you only need one network that gets a face as an input and produce a high-dim vector as an output.
However, to train this single network using the Siamese approach, you are going to duplicate it: creating two instances of the same net (you need to explicitly link the weights of the two copies). During training you are going to provide pairs of faces to the nets: one to each copy, then the single loss layer on top of the two copies can compare the high-dimensional vectors representing the two faces and compute a loss according to a "same/not same" label associated with this pair.
Hence, you only need the duplication for the training. In test time ('deploy') you are going to have a single net providing you with a semantically meaningful high dimensional representation of faces.
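A minimal sketch of the training-time setup described above, assuming a contrastive loss as the "same/not same" loss (the answer does not name a specific loss) and a toy one-layer linear map standing in for the real network. The weights and the three "face" vectors are made up for illustration.

```python
import math

def embed(weights, x):
    """Toy 'network': one linear layer mapping an input vector to an embedding."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def distance(u, v):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(weights, x1, x2, same, margin=1.0):
    """Pull 'same' pairs together; push 'not same' pairs at least `margin` apart."""
    # Both inputs pass through the SAME weights: this is the weight sharing.
    d = distance(embed(weights, x1), embed(weights, x2))
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 2x3 weight matrix and made-up input vectors.
weights = [[0.5, 0.0, 0.5],
           [0.0, 1.0, 0.0]]
face_a = [1.0, 0.0, 1.0]
face_a_again = [1.0, 0.0, 1.0]
face_b = [0.0, 3.0, 0.0]

print(contrastive_loss(weights, face_a, face_a_again, same=True))
print(contrastive_loss(weights, face_a, face_b, same=False))
# At deploy time only the single embedding function is needed.
print(embed(weights, face_a))
```

In a real framework the two "copies" are usually just the same module called twice, which ties the weights automatically.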
For a more advanced Siamese architecture and loss see this thread.
On the other hand, you might want to consider the approach described in Oren Tadmor, Yonatan Wexler, Tal Rosenwein, Shai Shalev-Shwartz, Amnon Shashua, "Learning a Metric Embedding for Face Recognition using the Multibatch Method" (arXiv 2016). This approach is more efficient and easier to implement than pair-wise losses over image pairs.

Convolution Neural Network for image detection/classification

Here is the setup: I have a set of images (labeled, split into train and test) and I want to train a conv net that tells me whether or not a specific object is present in an image.
To do this, I followed the TensorFlow tutorial on MNIST and trained a simple conv net on images of size 128x128 cropped to the area of interest (the object). The architecture is as follows: three successive blocks, each consisting of 2 conv layers and 1 max-pool down-sampling layer, followed by one fully connected softmax layer (with two classes, 0 and 1, for whether the object is present or not).
I implemented it using TensorFlow, and it works quite well, but since I have enough computing power I was wondering how I could increase the complexity of the classifier:
- adding more layers?
- adding more channels at each layer? (currently 32, 64, 128, and 1024 for the fully connected layer)
- anything else?
But the most important part is that I now want to detect this same object in larger images (roughly 600x600, whereas the object should be around 100x100).
I was wondering how I could use the previously trained "small" network for small images to pretrain a larger network on the large images? One option could be to classify the image using a sliding window of size 128x128 and scan the whole image, but I would like, if possible, to train a whole network on it.
Any suggestions on how to proceed? Or an article / resource tackling this kind of problem? (I am really new to deep learning, so sorry if this is a stupid question...)
Thanks !
I suggest that you continue reading on the field overall. Your search keys include CNN, image classification, neural net, AlexNet, GoogleNet, and ResNet. This will return many articles, on-line classes and lectures, and other materials to help you learn about classification with neural nets.
Don't just add layers or filters: the complexity of the topology (net design) must be fitted to the task; a net that's too complex will over-fit the training data. The one you've been using is probably LeNet; the three I cite above are for the ImageNet image classification contest.
Since you are working with images, I would suggest using a pretrained image classification network (like VGG, AlexNet, etc.) and fine-tuning it on your 128x128 image data. In my experience, unless you have a very large data set, a fine-tuned network gives better accuracy and also saves training time. After building a good image classifier on your data set, you can use any popular algorithm to generate region proposals from the large image. Then pass the region proposals to the classification network one by one and check whether it classifies each one as positive or negative. If a region is classified as positive, your object is most probably present in that region; otherwise it is not. If the classifier finds the object in many region proposals, you can use non-maximum suppression to reduce the number of positive proposals.
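The last step, non-maximum suppression, can be sketched as follows. This is the common greedy variant, assuming boxes are given as (x1, y1, x2, y2) corners and that the classifier's positive score is available for each proposal. The boxes, scores, and the 0.5 overlap threshold are made-up values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# Hypothetical positive proposals: two overlapping boxes and one separate box.
boxes = [(100, 100, 200, 200), (110, 105, 210, 205), (400, 400, 500, 500)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

The function returns indices into the proposal list, so the surviving boxes and their scores can be looked up afterwards.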

How does a neural network work with correlated image data

I am new to TensorFlow and deep learning. I am trying to create a fully connected neural network for image processing. I am somewhat confused.
We have an image, say 28x28 pixels. This gives 784 inputs to the NN. For non-correlated inputs this is fine, but image pixels are generally correlated. For instance, consider a picture of a cow's eye. How can a neural network understand this when all the pixels are lined up in an array for a fully-connected network? How does it determine the correlation?
Please research some tutorials on CNN (Convolutional Neural Network); here is a starting point for you. A fully connected layer of a NN surrenders all of the correlation information it might have had with the input. Structurally, it implements the principle that the inputs are statistically independent.
Alternatively, a convolution layer depends upon the physical organization of the inputs (such as pixel adjacency), using that to find simple combinations (convolutions) of features from one layer to another.
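As a small illustration of how a convolution uses pixel adjacency, here is a sketch of a single 3x3 filter slid over a toy image. Like most CNN libraries it actually computes a cross-correlation (no kernel flip). The image and the filter values are made up, and a real layer would learn its filters.

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding), summing elementwise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Toy 4x6 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Hypothetical 3x3 filter that responds to left-to-right brightness increases.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

response = correlate2d_valid(image, kernel)
for row in response:
    print(row)
```

The response is nonzero only where the window straddles the edge, which is information that depends on which pixels are neighbours.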
Bottom line: your NN doesn't find the correlation: the topology is wrong, and cannot do the job you want.
Also, please note that a layered network consisting of fully-connected neurons with linear weight combinations, is not deep learning. Deep learning has at least one hidden layer, a topology which fosters "understanding" of intermediate structures. A purely linear, fully-connected layering provides no such hidden layers. Even if you program hidden layers, the outputs remain a simple linear combination of the inputs.
Deep learning requires some other discrimination, such as convolutions, pooling, rectification, or other non-linear combinations.
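The point about purely linear layers can be checked numerically. The sketch below uses made-up integer weights: two stacked linear layers give exactly the same output as one layer whose matrix is their product, while inserting a rectification (ReLU) between them does not collapse this way.

```python
def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Matrix-matrix product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def relu(v):
    """Rectification: negative values become zero."""
    return [max(0, x) for x in v]

# Hypothetical weights: a 3x2 "hidden" layer followed by a 2x3 output layer.
w1 = [[1, -2], [3, 1], [0, 2]]
w2 = [[1, 1, -1], [2, 0, 1]]
x = [1, 2]

two_linear_layers = matvec(w2, matvec(w1, x))
collapsed = matvec(matmul(w2, w1), x)        # one layer with weights w2 @ w1
with_relu = matvec(w2, relu(matvec(w1, x)))  # non-linearity between the layers

print(two_linear_layers, collapsed, with_relu)
```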
Let's break it into pieces to understand the intuition behind how a NN learns to predict.
To predict the class of a given image, we have to find a correlation, or a direct link, between its input values and the class. We could hope that a single pixel tells us that the image belongs to a class, but that is impossible, so instead we have to build up more complex functions (call them complex features) that produce data correlated with the desired class.
To make it simpler, imagine you want to build the AND function (p and q) or the OR function (p or q). In both cases there is a direct link between the input and the output: for AND, if there is a 0 in the input, the output is always zero. But what if we want the XOR function (p xor q)? There is no direct link between the input and the output. The answer is to build a first layer that computes AND and OR, and then a second layer that takes the results of the first layer and computes XOR:
(p xor q) = (p or q) and not (p and q)
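The identity above maps directly onto a two-layer network of threshold units. In this sketch the weights and biases are set by hand rather than learned, purely to show that one extra layer is enough.

```python
def step(z):
    """Threshold unit: outputs 1 when the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def xor_net(p, q):
    # First layer: two units computing OR and AND of the inputs.
    h_or = step(p + q - 0.5)
    h_and = step(p + q - 1.5)
    # Second layer: fires when OR is on and AND is off.
    return step(h_or - h_and - 0.5)

for p in (0, 1):
    for q in (0, 1):
        print(p, q, xor_net(p, q))
```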
Applying this method in a multi-layer NN gives the same result, but then you have to deal with a huge number of parameters. One way to avoid this is to extract features from the images that are representative, vary between images, are uncorrelated with each other, and are correlated with the class, and feed those to the network. You can look up image feature extraction on the web.
This is a short explanation of how to see the link between images and their classes, and how a NN works to classify them. You need to understand the NN concept first; then you can go on to read about deep learning.

Matlab Neural Network to classify fingerprint

I have already extracted the features from a fingerprint database, and now a neural network should be applied to classify the images by gender. I haven't worked with NNs yet and know only a little about them.
What type of NN should be used? Is it an Artificial Neural Network or a Multi-layer Perceptron?
If the images are not all the same size, does it matter?
Maybe some code sample in this area could help.
A neural network is a function approximator. You can think of it as a high-tech cousin to piecewise linear fitting. If you want to fit the most complex phenomena ever with a single parameter - you are going to get the mean and should not be surprised if it isn't infinitely useful. To get a useful fit, you must couple the nature of the phenomena being modeled with the NN. If you are modeling a planar surface, then you are going to need more than one coefficient (typically 3 or 4 depending on your formulation).
One of the questions behind this question is "what is the basis of fingerprints". By basis I mean the heavily baggaged word from Linear Algebra and calculus that talks about vector spaces, span, and eigens. Once you know what the "basis" is then you can build a neural network to approximate the basis, and this neural network will give reasonable results.
So while I was looking for a paper on the basis, I found this:
http://phys.org/news/2012-02-experts-human-error-fingerprint-analysis.html
http://phys.org/news/2013-07-fingerprint-grading.html
http://phys.org/news/2013-04-forensic-scientists-recover-fingerprints-foods.html
http://phys.org/news/2012-11-method-artificial-fingerprints.html
http://phys.org/news/2011-08-chemist-contributes-method-recovering-fingerprints.html
And here you go, a good document of the basis of fingerprints:
http://math.arizona.edu/~anewell/publications/Fingerprint_Formation.pdf
Taking a very crude stab, you might try growing some variation on a narxnet (nonlinear autoregressive network with external inputs). I would grow it until it characterizes your set, using some sort of capacity-doubling scheme. I would look at convergence rates as a function of "size" so that the smaller networks inform how long convergence takes for the larger ones. That means it might take a very large network to make this work, but large networks are like the 787: they cost a lot, take forever to build, and sometimes do not fly well.
If I were being clever, I would pay attention to the article by Kucken and formulate the inputs as some sort of inverse modeling of a stress field.
Best of luck.
You can try a SOM/LVQ network for classification in MATLAB. Image size does matter: you should normalize the images down to a standard size before doing the feature extraction. This ensures that each element of the feature vector is assigned to an input neuron.
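function scan(img)
    % Build one grayscale-histogram feature column per JPEG in the current folder.
    files = dir('*.jpg');
    hist = [];
    for n = 1 : length(files)
        filename = files(n).name;
        file = imread(filename);
        % Resize to 50x50 so every histogram is computed over the same number of pixels.
        hist = [hist, imhist(rgb2gray(imresize(file, [50 50])))]; %#ok
    end
    % Cluster the histograms with a 10x10 self-organizing map (unsupervised).
    som = selforgmap([10 10]);
    som = train(som, hist);
    t = som(hist); % extract class data
    % Train an LVQ classifier on the SOM-derived classes.
    net = lvqnet(10);
    net = train(net, hist, t);
    % like() is not defined in this answer; it is a helper the poster did not include.
    like(img, hist, files, net)
end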
This paper doesn't have code examples, but it may be helpful: An Effective Fingerprint Verification Technique, Gogoi & Bhattacharyya
This paper presents an effective method for fingerprint verification based on a data mining technique called minutiae clustering and a graph-theoretic approach to analyze the process of fingerprint comparison to give a feature space representation of minutiae and to produce a lower bound on the number of detectably distinct fingerprints. The method also proving the invariance of each individual fingerprint by using both the topological behavior of the minutiae graph and also using a distance measure called Hausdorff distance. The method provides a graph based index generation mechanism of fingerprint biometric data. The self-organizing map neural network is also used for classifying the fingerprints.

How do neural networks handle large images where the area of interest is small?

If I've understood correctly, when training neural networks to recognize objects in images it's common to map each pixel to a single input-layer node. However, sometimes we might have a large picture with only a small area of interest. For example, if we're training a neural net to recognize traffic signs, we might have images where the traffic sign covers only a small portion of the image, while the rest is taken up by the road, trees, sky, etc. Creating a neural net which tries to find a traffic sign at every position seems extremely expensive.
My question is, are there any specific strategies to handle these sort of situations with neural networks, apart from preprocessing the image?
Thanks.
Using 1 pixel per input node is usually not done. What enters your network is the feature vector, and as such you should input actual features, not raw data. Inputting raw data (with all its noise) will not only lead to bad classification, but training will also take longer than necessary.
In short: preprocessing is unavoidable. You need a more abstract representation of your data. There are hundreds of ways to deal with the problem you're asking. Let me give you some popular approaches.
1) Image processing to find regions of interest. When detecting traffic signs, a common strategy is to use edge detection (i.e. convolution with some filter), apply some heuristics, use a threshold filter, and isolate regions of interest (blobs, strongly connected components, etc.), which are taken as input to the network.
2) Applying features without any prior knowledge or image processing. Viola/Jones use a specific image representation, from which they can compute features in a very fast way. Their framework has been shown to work in real-time. (I know their original work doesn't state NNs but I applied their features to Multilayer Perceptrons in my thesis, so you can use it with any classifier, really.)
3) Deep Learning.
Learning better representations of the data can be incorporated into the neural network itself. These approaches are among the most actively researched at the moment. Since this is a very large topic, I can only give you some keywords so that you can research it on your own. Autoencoders are networks that learn efficient representations. It is possible to use them with conventional ANNs. Convolutional Neural Networks seem a bit sophisticated at first sight, but they are worth checking out. Before the actual classification stage, they have alternating layers of subwindow convolution (edge detection) and resampling. CNNs are currently able to achieve some of the best results in OCR.
In every scenario you have to ask yourself: Am I 1) giving my ANN a representation that has all the data it needs to do the job (a representation that is not too abstract) and 2) keeping too much noise away (and thus staying abstract enough).
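For approach 2, the "specific image representation" used by Viola and Jones is the integral image, from which the sum over any rectangle takes four lookups. The sketch below is a minimal version with a single two-rectangle Haar-like feature. The function names and the toy 3x4 image are illustrative, not taken from their paper.

```python
def integral_image(image):
    """ii[r][c] is the sum of image rows 0..r-1 and columns 0..c-1 (extra zero row and column)."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1, from four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: right half minus left half of a (height x 2*width) window."""
    left_sum = box_sum(ii, top, left, top + height, left + width)
    right_sum = box_sum(ii, top, left + width, top + height, left + 2 * width)
    return right_sum - left_sum

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
ii = integral_image(image)
print(box_sum(ii, 1, 1, 3, 3))
print(two_rect_feature(ii, 0, 0, 3, 2))
```

A vector of such feature values, computed at several positions and scales, is what would be fed to the classifier in place of raw pixels.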
We usually don't use a fully connected network for images, because the number of units in the input layer would be huge. Among neural networks, the type designed specifically for images is the convolutional neural network (CNN).
However, the CNN plays the role of a feature extractor. The encoded features are finally fed into a fully connected network, which acts as the classifier. In your case, I don't know how small your object is compared to the full image. But if the object of interest is really small, then even with a CNN the image classification performance won't be very good. In that case we probably need to use object detection (which uses a sliding window) to deal with it.
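A sliding-window detector can be sketched as below. The window size, stride, and image size are arbitrary choices, and `classify` is a hypothetical stand-in for a trained network that returns True when the crop contains the object.

```python
def sliding_windows(image_h, image_w, window=128, stride=64):
    """Yield (top, left) corners of every window that fits fully inside the image."""
    for top in range(0, image_h - window + 1, stride):
        for left in range(0, image_w - window + 1, stride):
            yield top, left

def detect(image, classify, window=128, stride=64):
    """Run `classify` on each crop; return corners of the windows it flags as positive."""
    h, w = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(h, w, window, stride):
        crop = [row[left:left + window] for row in image[top:top + window]]
        if classify(crop):
            hits.append((top, left))
    return hits

# Toy 600x600 image with a 100x100 "object" of ones on a background of zeros.
image = [[0] * 600 for _ in range(600)]
for r in range(200, 300):
    for c in range(330, 430):
        image[r][c] = 1

# Stand-in classifier (an assumption): positive only if the whole object is inside the crop.
def classify(crop):
    return sum(map(sum, crop)) >= 100 * 100

hits = detect(image, classify)
print(hits)
```

A smaller stride gives more candidate positions at a higher cost; a real detector would also repeat the scan at several scales.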
If you want to recognize small objects in a large image, you should use a "scanning window".
Within the "scanning window" you can apply dimension-reduction methods:
DCT (http://en.wikipedia.org/wiki/Discrete_cosine_transform)
PCA (http://en.wikipedia.org/wiki/Principal_component_analysis)
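As a rough sketch of the DCT option: the window's pixels are transformed and only the first few (lowest-frequency) coefficients are kept as the feature vector. For brevity this uses an unnormalized 1-D DCT-II on a flattened window, which is a simplification; for images a 2-D DCT is the usual choice. The helper names and sample values are made up.

```python
import math

def dct2_1d(signal):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(signal)
    return [sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
            for k in range(n)]

def reduce_window(pixels, keep):
    """Describe a flattened window by its first `keep` (lowest-frequency) DCT coefficients."""
    return dct2_1d(pixels)[:keep]

flat = [5.0] * 8                        # constant window: all energy in coefficient 0
ramp = [float(i) for i in range(8)]     # smooth gradient

flat_coeffs = dct2_1d(flat)
ramp_features = reduce_window(ramp, keep=3)
print([round(c, 6) for c in flat_coeffs])
print([round(c, 6) for c in ramp_features])
```

Unlike PCA, the DCT basis is fixed, so no training set is needed to compute the reduced representation.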