Neural Network to identify Seven-Segment Numerals

I am studying machine learning and I am working on my first neural network as a project for one of my classes. I am programming the network in Java. The point of the network is to identify seven-segment numerals (like on a regular digital clock). The network does not actually have to be linked to any real sensors; it just needs to work in theory based on inputs given as 0's and 1's in text form (not binary), which correspond to a hypothetical sensor matrix laid across the top of the number.
My question is, what sort of output am I looking to get?
Will the binary output just correspond to the same sort of matrix as the input, or is the binary output supposed to represent the input number in binary, such as returning 111 for 7?
If it does just return another matrix, what is the point of the network?

The input for a seven-segment numeral would be a (1 X 7) vector, with 1 for segments that are on and 0 for segments that are off.
As for the output, you don't specify what you want it to be, so let's assume you want it to tell you "which digit is the screen showing". Since there are 10 digits (0 through 9), you have 10 possible answers. The output would be a (1 X 10) vector, with each element corresponding to one of the digits. Each value represents how confident the network is that this digit is the correct answer (typically the output values lie in [0, 1], but it depends on your setup). Ideally, you would want the network to return a vector with 1 in one position and zeros in all the others.
Note, however, that in this case a classifier is not very useful. A classification algorithm generalizes what it has seen in the past. So it would be useful for handwriting recognition, because even if the same person writes the same digit twice, it is not exactly the same. In your case, each digit looks the same across all 7-segment displays, so your network is not really learning, but rather memorizing the input.
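To make the shapes concrete, here is a minimal sketch in Python of the 7-element input, the 10-element one-hot target, and how an output vector is read back as a digit. The segment ordering (a to g) and all names are assumptions made for illustration. The lookup table at the end shows why memorization is enough for this problem.

```python
# Segment order assumed for this sketch: a, b, c, d, e, f, g
# a = top, b = top-right, c = bottom-right, d = bottom,
# e = bottom-left, f = top-left, g = middle.
SEGMENTS = {
    0: (1, 1, 1, 1, 1, 1, 0),
    1: (0, 1, 1, 0, 0, 0, 0),
    2: (1, 1, 0, 1, 1, 0, 1),
    3: (1, 1, 1, 1, 0, 0, 1),
    4: (0, 1, 1, 0, 0, 1, 1),
    5: (1, 0, 1, 1, 0, 1, 1),
    6: (1, 0, 1, 1, 1, 1, 1),
    7: (1, 1, 1, 0, 0, 0, 0),
    8: (1, 1, 1, 1, 1, 1, 1),
    9: (1, 1, 1, 1, 0, 1, 1),
}


def one_hot(digit, num_classes=10):
    """Target vector: 1.0 at the index of the digit, 0.0 everywhere else."""
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


def decode(output):
    """Read a 10-element output vector as the digit with the highest value."""
    return max(range(len(output)), key=lambda i: output[i])


# The full training set: ten (input, target) pairs, one per digit.
TRAINING_SET = [(list(SEGMENTS[d]), one_hot(d)) for d in range(10)]

# Since every digit has exactly one pattern, a plain lookup does the same job.
LOOKUP = {pattern: digit for digit, pattern in SEGMENTS.items()}
```

A trained network would return something like ten confidence values for each input, and `decode` picks the most confident one.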

Related

Restricting output classes in multi-class classification in Tensorflow

I am building a bidirectional LSTM to do multi-class sentence classification.
I have 13 classes in total to choose from. I multiply the output of my LSTM network by a matrix whose dimensionality is [2*num_hidden_unit,num_classes] and then apply softmax to get the probability of the sentence falling into each of the 13 classes.
So if we consider output[-1] as the network output:
W_output = tf.Variable(tf.truncated_normal([2*num_hidden_unit,num_classes]))
result = tf.matmul(output[-1],W_output) + bias
and I get my [1, 13] matrix (assuming I am not working with batches for the moment).
Now, I also have information that a given sentence does not fall into a given class for sure and I want to restrict the number of classes considered for a given sentence. So let's say for instance that for a given sentence, I know it can fall only in 6 classes so the output should really be a matrix of dimensionality [1,6].
One option I was thinking of is to put a mask over the result matrix, where I multiply the entries corresponding to the classes that I want to keep by 1 and the ones I want to discard by 0, but in this way I will just lose some of the information instead of redirecting it.
Does anyone have a clue on what to do in this case?
I think your best bet is, as you seem to have described, using a weighted cross entropy loss function where the weights are 0 for your "impossible" classes and 1 for the other, possible classes. TensorFlow has a weighted cross entropy loss function.
Another interesting but probably less effective method is to feed whatever information you have about which classes your sentence can or cannot fall into to the network at some point (probably towards the end).
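As a different option from the weighted loss suggested above, one common way to "redirect" the probability mass is to exclude the impossible classes before the softmax normalization rather than multiplying by 0 afterwards. Below is a minimal sketch in plain Python (not TensorFlow). The function name `masked_softmax` and the example values are made up for illustration.

```python
import math


def masked_softmax(logits, allowed):
    """Softmax over `logits`, restricted to positions where `allowed` is True.

    Disallowed classes get probability 0 and the remaining probabilities
    still sum to 1, so the mass is redistributed instead of being lost.
    """
    kept = [x for x, ok in zip(logits, allowed) if ok]
    if not kept:
        raise ValueError("at least one class must be allowed")
    m = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) if ok else 0.0 for x, ok in zip(logits, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical network output for one sentence (13 classes), 6 of them allowed.
logits = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 0.2, -2.0, 1.0, 0.7, -0.3, 0.1]
allowed = [True, False, True, False, True, False, False,
           True, False, True, False, False, True]
probs = masked_softmax(logits, allowed)
```

The output keeps its [1, 13] shape, which makes batching easier than producing a differently sized vector per sentence.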

Using neural network for classification in matlab

I'm working on an optical character recognition problem. I've successfully extracted the features, which form a [1X32] matrix (I've extracted 32 features from each segmented character). I have the complete training data set (the images of every individual character), but I'm breaking my head over creating the Input & Target data set matrices. So please tell me about those matrices, the testing data, and the format in which I will get output from the neural network.
1) There are 258 different patterns (characters), so should there be 258 class labels?
My input matrix size is: No. of rows = 32 (features), No. of cols = 258*4 = 1032 (No. of characters * No. of instances for each character)
2) What should be the size of my target matrix? Just draw a dummy target matrix for my case.
Did you check the Neural Network Toolbox of MATLAB already (http://www.mathworks.co.uk/help/nnet/examples/crab-classification.html?prodcode=NN&language=en)? There you can find some examples of how to work with neural networks.
Regarding your two specific questions:
1) Typically, if you want to differentiate between N different characters, you will need N class labels. So in your case, yes, you should have 258 class labels. The output of a classification problem using neural networks is typically a binary output where 1 goes for the identified class and 0 for the remaining classes. It can happen, however, if you use a sigmoid function as the last activation function, that no output node is exactly 0 or 1. In this case you can, for example, take the maximum of all output nodes to get the most probable class for a given input.
2) The target matrix should be a binary matrix where 1 goes for the correct class and 0 for all the other classes for each input. So in your case it should be a 258*1032 matrix. Again, I recommend you check the link given above.
Good luck.
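For illustration, the 258 x 1032 target matrix described above could be built as in the following Python sketch (not MATLAB). It assumes the columns are ordered as the 4 instances of the first character, then the 4 instances of the second, and so on. That ordering and the helper names are assumptions, and class indices are 0-based here whereas MATLAB indices start at 1.

```python
NUM_CLASSES = 258
INSTANCES_PER_CLASS = 4
NUM_SAMPLES = NUM_CLASSES * INSTANCES_PER_CLASS  # 1032 columns


def build_targets(labels, num_classes):
    """Return a num_classes x len(labels) matrix with a single 1 per column."""
    targets = [[0] * len(labels) for _ in range(num_classes)]
    for col, label in enumerate(labels):
        targets[label][col] = 1
    return targets


def predicted_class(output_column):
    """Pick the class whose output node has the largest value."""
    return max(range(len(output_column)), key=lambda i: output_column[i])


# Assumed column order: all instances of class 0, then class 1, and so on.
labels = [c for c in range(NUM_CLASSES) for _ in range(INSTANCES_PER_CLASS)]
targets = build_targets(labels, NUM_CLASSES)
```

Column j of `targets` lines up with column j of the 32 x 1032 input matrix, and `predicted_class` is the "take the maximum" step for a single output column.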

Neural Network theory to implementation mix up

I'm looking to create a neural network for the first time in MATLAB. As such I'm just a little confused and need some quick guidance. Below is an image:
The thing I currently need verified concerns the values that are generated by my hidden layer and passed on to my output layer: are these values 0's and 1's? That is, do the nodes u0 to unh output 0's and 1's, or values in between 0 and 1 like 0.8, 0.4, etc.? Another question: should my output node then output a value in between 0 and 1, so that an error can be found and used in the back propagation?
Like I said it's my first time doing this so I just need some guidance.
Not quite, the output of the hidden layer is like any other layer and each node gives a ranged value. The output of any node in a neural network is thus usually restricted to the [0, 1] or the [-1, 1] range. Your output node will similarly output a range of values, but that range is oftentimes thresholded to snap to 0 or 1 for simplicity of interpretation.
This however, doesn't mean that the outputs are linearly distributed. Usually you have a sigmoid, or some other non-linear, distribution which spreads more information through the middle, [-0.5, 0.5], range rather than evenly across the domain. Sometimes specialty functions are used to detect certain patterns, such as sinusoids -- though generally this is rarer and usually unnecessary.
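A small Python sketch of what a single node computes may help here: a weighted sum passed through a sigmoid gives a value strictly between 0 and 1 for moderate inputs, and thresholding is an optional last step for interpretation only. The function names and the 0.5 cutoff are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def node_output(inputs, weights, bias):
    """Output of one node: sigmoid of the weighted sum of its inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def threshold(y, cutoff=0.5):
    """Snap a ranged output to 0 or 1 (for reading the result, not for training)."""
    return 1 if y >= cutoff else 0
```

For back propagation you would keep the unthresholded value from `node_output`, since the error is computed from the ranged output. `math.tanh` plays the same role when the [-1, 1] range is wanted.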

Confusion with inputs and targets for a neural network

Recently I've posted many questions regarding a character recognition program that I am making. I thought I had it working fully until today. I think it has to do with my training of the network. What follows is an explanation of how I think the training and simulation procedure goes.
Given these two images
targets
inputs
I want to train the network to recognize the letter D. Note that before this is done, I've processed the images into a binary matrix. For training I use
[net,tr] = train(net,inputs,targets);
where instead of inputs I pass targets, because I want to train the network to recognize all the letters in the target image.
I then run
outputs = sim(net,inputs);
where inputs is the image with the letter "D", or an image with any other letter that is in ABCD. The basic premise here is that I want to train the network to recognize all the letters in ABCD, then choose any letter A, B, C, or D and see if the network recognizes this chosen letter.
Question:
Am I correct with the training procedure?
Well it greatly depends on how you implemented your neural network. Although regarding the question you're asking I guess you didn't implement it yourself but used some ready made API.
Anyways, you should first understand the tools you use before you use them (here neural networks).
A neural network takes an input and performs linear or non-linear transformations of the input and returns an output.
Inputs and outputs are always numeric values. However they may represent any kind of data.
Inputs can be:
Pixels of an image
Real valued or integer attributes
Categories
etc.
In your case the inputs are the pixels of your character images (your binary matrices).
Outputs can be:
Classes (if you're doing classification)
Values (if you're doing regression)
Next value in a time series (if you're doing time series prediction)
In your case, you're doing classification (predicting which character the inputs represent) so your output is a class.
For you to understand how the network is trained, I'll first explain how to use it once it's trained and then what it implies for the training phase.
So once you've trained your network, you will give it the binary matrix representing your image and it will output the class (the character), which will be (for example): 0 for A, 1 for B, 2 for C and 3 for D. In other words, you have:
Input: binary matrix (image)
Output: 0,1,2 or 3 (depending on which character the network recognizes in the image)
The training phase consists in telling the network which output you would like for each input.
The type of data used during the training phase is the same as the one being used in the "prediction phase". Hence, for the training phase:
Inputs: binary matrices [A,B,C,D] (One for each letter! Very important !)
Targets: corresponding classes [0,1,2,3]
This way, you're telling the network to learn that if you give it the image of A it should output 0, if you give it the image of B it should output 1, and so on.
Note: You were mistaken because you thought of the "inputs" as the inputs you wanted to give the network after the training phase, when they were actually the inputs given to the network during the training phase.
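To illustrate the pairing of inputs and targets, here is a small self-contained Python sketch. It is not MATLAB's `train`/`sim`: a simple multi-class perceptron is swapped in as the learner, and the 3x3 "letter" images are made up, so treat it only as an illustration of how the data is laid out for training and for prediction.

```python
# Toy 3x3 binary images standing in for the letters (made up for this sketch).
IMAGES = {
    "A": [0, 1, 0,
          1, 1, 1,
          1, 0, 1],
    "B": [1, 1, 0,
          1, 1, 0,
          1, 1, 0],
    "C": [0, 1, 1,
          1, 0, 0,
          0, 1, 1],
    "D": [1, 1, 0,
          1, 0, 1,
          1, 1, 0],
}
CLASSES = ["A", "B", "C", "D"]

# Training data: one input per letter, and the class index it should map to.
inputs = [IMAGES[letter] for letter in CLASSES]
targets = [0, 1, 2, 3]


def predict(weights, pixels):
    """Return the class index with the highest score for this image."""
    x = pixels + [1]  # constant bias input
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=lambda i: scores[i])


def train(inputs, targets, num_classes, max_epochs=1000):
    """Multi-class perceptron: adjust weights whenever a guess is wrong."""
    weights = [[0] * (len(inputs[0]) + 1) for _ in range(num_classes)]
    for _ in range(max_epochs):
        mistakes = 0
        for pixels, target in zip(inputs, targets):
            guess = predict(weights, pixels)
            if guess != target:
                mistakes += 1
                for k, v in enumerate(pixels + [1]):
                    weights[target][k] += v  # pull the right class closer
                    weights[guess][k] -= v   # push the wrong class away
        if mistakes == 0:
            break
    return weights


weights = train(inputs, targets, num_classes=4)
# Prediction uses the same kind of input as training: one binary image.
recognized = CLASSES[predict(weights, IMAGES["D"])]
```

The important part is that `inputs` holds one image per letter and `targets` holds the matching class for each of them. The image you want recognized later is only passed to `predict`, never used as the training target.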

Few questions about kohonen neural network

I have a big data set (time series, about 50 parameters/values). I want to use a Kohonen network to group similar data rows. I've read a bit about Kohonen neural networks and I understand the idea, but:
I don't know how to implement a Kohonen network with so many dimensions. I found an example on CodeProject, but only with a 2- or 3-dimensional input vector. When I have 50 parameters, shall I create 50 weights in my neurons?
I don't know how to update the weights of the winning neuron (how do I calculate the new weights?).
My English is not perfect and I don't understand everything I read about Kohonen networks, especially the descriptions of the variables in the formulas, which is why I'm asking.
One should distinguish the dimensionality of the map, which is usually low (e.g. 2 in the common case of a rectangular grid) and the dimensionality of the reference vectors which can be arbitrarily high without problems.
Look at http://www.psychology.mcmaster.ca/4i03/demos/competitive-demo.html for a nice example with 49-dimensional input vectors (7x7 pixel images). The Kohonen map in this case has the form of a one-dimensional ring of 8 units.
See also http://www.demogng.de for a java simulator for various Kohonen-like networks including ring-shaped ones like the one at McMasters. The reference vectors, however, are all 2-dimensional, but only for easier display. They could have arbitrary high dimensions without any change in the algorithms.
Yes, each neuron would need 50 weights (the number of neurons in the map is a separate choice). However, these types of networks are usually low dimensional, as described in this self-organizing map article. I have never seen them use more than a few inputs.
You have to use an update formula. From the same article: Wv(s + 1) = Wv(s) + Θ(u, v, s) α(s)(D(t) - Wv(s))
Yes, you'll need 50 inputs for each neuron.
You basically do a linear interpolation between each neuron's weights and the target (input) vector, using W(s + 1) = W(s) + Θ() * α(s) * (Input(t) - W(s)), with Θ being your neighbourhood function.
You should also update all your neurons, not only the winner.
Which function you use as a neighbourhood function depends on your actual problem.
A common property of such a function is that it has the value 1 when i = k and falls off with the Euclidean distance. Additionally, it shrinks with time (in order to localize clusters).
Simple neighbourhood functions include linear interpolation (up to a "maximum distance") or a Gaussian function.
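The update rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the map is a one-dimensional chain of 8 neurons, the neighbourhood function is a Gaussian of the distance along that chain, and the values of alpha and the radius are arbitrary. Only the 50-dimensional weight vectors come from the question.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))


def gaussian_neighbourhood(map_distance, radius):
    """Equals 1 for the winner itself and falls off with distance on the map."""
    return math.exp(-(map_distance ** 2) / (2.0 * radius ** 2))


def som_step(weights, x, alpha, radius):
    """Apply W(s+1) = W(s) + theta * alpha * (x - W(s)) to every neuron."""
    winner = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        theta = gaussian_neighbourhood(abs(i - winner), radius)
        for k in range(len(w)):
            w[k] += theta * alpha * (x[k] - w[k])
    return winner


# 8 neurons on a chain, each with 50 weights (one per input parameter).
rng = random.Random(0)
DIM = 50
weights = [[rng.random() for _ in range(DIM)] for _ in range(8)]
x = [rng.random() for _ in range(DIM)]  # one data row
winner = som_step(weights, x, alpha=0.5, radius=1.0)
```

In a full training loop you would call `som_step` for every data row over many passes, shrinking both `alpha` and `radius` over time.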