iRPROP+ Multilayer Perceptron

Hello everyone. This is the code of the iRPROP+ algorithm for my MLP. When I try to train my network, the standard deviation decreases for about 1500 epochs (very slowly: from ~0.5 to 0.4732), but then it suddenly starts to increase.
Can someone tell me what I did wrong?
public void RPROP()
{
    double a = 1.2, b = 0.5, nMax = 50, nMin = 0.000001;
    // Backward pass: compute the error term (Delta) for every neuron.
    for (int l = Network.Length - 1; l > 0; l--)
    {
        for (int i = 0; i < Network[l].getSize(); i++)
        {
            Neuron n = Network[l].Neurons[i];
            double sum = 0;
            if (l == Network.Length - 1)
                n.Delta = (n.Output - DesiredOutput[i]) * ActFunc.calcDeprivateFunction(n.Output);
            else
            {
                for (int k = 0; k < Network[l + 1].getSize(); k++)
                {
                    sum += Network[l + 1].Neurons[k].getWeight(i) * Network[l + 1].Neurons[k].Delta;
                }
                n.Delta = sum * ActFunc.calcDeprivateFunction(n.Output);
            }
        }
    }
    // iRPROP+ update: adapt each neuron's step size N from the sign of Delta.
    for (int l = 1; l < Network.Length; l++)
    {
        for (int i = 0; i < Network[l].getSize(); i++)
        {
            Neuron n = Network[l].Neurons[i];
            if ((n.PrevDelta * n.Delta) > 0)
            {
                // Same sign as last step: grow the step and move against the gradient.
                n.N = Math.Min(a * n.PrevN, nMax);
                n.Bias -= n.N * Math.Sign(n.Delta);
                for (int j = 0; j < Network[l - 1].getSize(); j++)
                {
                    n.setWeight(j, n.getWeight(j) - n.N * Math.Sign(n.Delta));
                }
                n.PrevDelta = n.Delta;
            }
            else if ((n.PrevDelta * n.Delta) < 0)
            {
                // Sign flip: shrink the step and, if the error grew, revert the last move.
                n.N = Math.Max(b * n.PrevN, nMin);
                if (this.CurrentError > this.LastError)
                {
                    n.Bias += n.PrevN * Math.Sign(n.PrevDelta);
                    for (int j = 0; j < Network[l - 1].getSize(); j++)
                    {
                        n.setWeight(j, n.getWeight(j) + n.PrevN * Math.Sign(n.PrevDelta));
                    }
                }
                n.Delta = 0;
            }
            else // (n.PrevDelta * n.Delta) == 0
            {
                n.Bias -= n.N * Math.Sign(n.Delta);
                for (int j = 0; j < Network[l - 1].getSize(); j++)
                {
                    n.setWeight(j, n.getWeight(j) - n.N * Math.Sign(n.Delta));
                }
                n.PrevDelta = n.Delta;
            }
            n.PrevN = n.N;
        }
    }
}

At first glance: you compute the error for a single training element and immediately teach it to the network. Try running over the full training set without changing the weights, just accumulating the Deltas. After that, update the weights once, set the previous Deltas, and start over (as sketched below).
Also, there is no update for the neuron threshold.
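A minimal sketch of that batch-style epoch, with entirely hypothetical names (Network, forwardPass, accumulateDeltas, applyRpropUpdate are illustrations, not the poster's API):

    #include <vector>

    // Illustrative skeleton of batch (epoch) training: gradients are summed
    // over the whole training set and the weights are updated exactly once.
    struct Sample { std::vector<double> input, target; };

    struct Network {
        void zeroAccumulatedDeltas() { /* reset per-weight gradient sums */ }
        void forwardPass(const std::vector<double>& in) { /* compute outputs only */ }
        void accumulateDeltas(const std::vector<double>& target) { /* add dE/dw; no weight change */ }
        void applyRpropUpdate() { /* one iRPROP+ step from the summed gradients; store PrevDelta */ }
    };

    void trainEpoch(Network& net, const std::vector<Sample>& trainSet) {
        net.zeroAccumulatedDeltas();
        for (const Sample& s : trainSet) {
            net.forwardPass(s.input);
            net.accumulateDeltas(s.target);
        }
        net.applyRpropUpdate();   // single weight update per epoch
    }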

Related

How to get the hash key?

When I solve this question (149. Max Points on a Line) on LeetCode, it has a bug with this test case:
Input: [[0,0],[94911151,94911150],[94911152,94911151]]
Output: 3
Expected: 2
This is my code:
/**
 * Definition for a point.
 * struct Point {
 *     int x;
 *     int y;
 *     Point() : x(0), y(0) {}
 *     Point(int a, int b) : x(a), y(b) {}
 * };
 */
class Solution {
public:
    int maxPoints(vector<Point>& points) {
        int size = points.size();
        int ans = 0;
        if (size == 0) return 0;
        unordered_map<double, int> mp;
        double k;
        for (int i = 0; i < size; ++i) {
            int num = 0;
            for (int j = i + 1; j < size; ++j) {
                if (points[i].x == points[j].x && points[i].y == points[j].y) {
                    num++;
                    continue;
                }
                // My question is about the code below:
                // how can I get the hash key according to the slope?
                if (points[j].x - points[i].x != 0)
                    k = (double)(points[j].y - points[i].y) / (double)(points[j].x - points[i].x); // calculate the slope
                else k = INT_MAX;
                mp[k]++;
            }
            if (mp[k] == 0) mp[k] = 1, num--;
            for (auto it = mp.begin(); it != mp.end(); ++it) {
                if (it->second > ans) {
                    ans = it->second;
                    ans += num;
                }
            }
            mp.clear();
        }
        return ans + 1;
    }
};
In the above test case, when it calculates the slope between [0,0] and [94911151,94911150], it comes back with k = 1. So I want to know: how do I get the right hash key to solve this problem?
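Since two different rational slopes can round to the same double, one common fix (a sketch, not the only way) is to key the map on the slope as a reduced integer fraction instead of a floating-point number:

    #include <numeric>   // std::gcd (C++17)
    #include <utility>

    // Sketch: canonical integer key for the slope between (x1,y1) and (x2,y2).
    // std::gcd ignores signs, so we normalize the sign ourselves (dx > 0),
    // making equal slopes map to exactly the same (dy, dx) pair.
    std::pair<int, int> slopeKey(int x1, int y1, int x2, int y2) {
        int dx = x2 - x1, dy = y2 - y1;
        if (dx == 0) return {1, 0};          // vertical line
        if (dy == 0) return {0, 1};          // horizontal line
        int g = std::gcd(dx, dy);            // non-negative for any sign of inputs
        dx /= g;
        dy /= g;
        if (dx < 0) { dx = -dx; dy = -dy; }  // canonical sign
        return {dy, dx};
    }

A std::map<std::pair<int,int>, int> accepts this key directly; std::unordered_map would additionally need a hash function for the pair. With integer keys, the slopes 94911150/94911151 and 94911151/94911152 from the failing test stay distinct instead of colliding at the same double.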

neural network for mnist keeps guessing 1 digit

I'm working on a feed-forward neural network for solving the MNIST dataset without a library, to help me better understand the concepts behind neural networks. But I think I'm missing something, as the network keeps guessing just one number, for example digit 5 or digit 9, even when the weights are purely random.
node[0] is always the bias.
The feedforward pass:
int c = 1;
input[0] = 1;
for (int j = 0; j < 28; j++)
{
    for (int k = 0; k < 28; k++)
    {
        if (traindata[i, j, k] > 126)
        {
            input[c] = 1;
        }
        else
        {
            input[c] = 0;
        }
        //Console.Write(input[c]);
    }
    //Console.WriteLine();
}
//MessageBox.Show("Test");
//feed forward
hiddenlayer1[0] = 1;
double temp;
for (int j = 1; j < HIDDEN1; j++)
{
    temp = 0;
    for (int k = 0; k < INPUT; k++)
    {
        temp += input[k] * Winput_hiddenlayer1[k, j];
    }
    hiddenlayer1[j] = sigmoid(temp);
    //MessageBox.Show(hiddenlayer1[j].ToString());
}
hiddenlayer2[0] = 1;
for (int j = 1; j < HIDDEN2; j++)
{
    temp = 0;
    for (int k = 0; k < HIDDEN1; k++)
    {
        temp += hiddenlayer1[k] * Whiddenlayer1_hiddenlayer2[k, j];
    }
    hiddenlayer2[j] = sigmoid(temp);
}
for (int j = 0; j < OUTPUT; j++)
{
    temp = 0;
    for (int k = 0; k < HIDDEN2; k++)
    {
        temp += hiddenlayer2[k] * Whiddenlayer2_output[k, j];
    }
    output[j] = sigmoid(temp);
}
and the backpropagation:
//set desired output
for (int j = 0; j < OUTPUT; j++)
{
    Doutput[j] = 0;
}
Doutput[labeltrain[i]] = 1;
//for (int j = 0; j < OUTPUT; j++)
//{
//    Console.Write(Doutput[j].ToString());
//} Console.WriteLine();
//MessageBox.Show("Test");
//output error calculation
for (int j = 0; j < OUTPUT; j++)
{
    outputerror[j] = (Doutput[j] - output[j]) * (1.0 - output[j]);
    //Console.WriteLine("expected: " + Doutput[j]);
    //Console.WriteLine("real: " + output[j]);
    //Console.WriteLine("(Doutput[j] - output[j]): " + (Doutput[j] - output[j]));
    //Console.WriteLine("1.0 - output[j]: " + (1.0 - output[j]));
    //Console.WriteLine("output error: " + outputerror[j]);
    //MessageBox.Show("Test");
}
//hidden2 error calculation
for (int j = 0; j < HIDDEN2; j++)
{
    temp = 0;
    for (int k = 0; k < OUTPUT; k++)
    {
        for (int l = 0; l < HIDDEN1; l++)
        {
            temp += outputerror[k] * Whiddenlayer1_hiddenlayer2[l, k];
        }
    }
    hidden2error[j] = temp * hiddenlayer2[j] * (1.0 - hiddenlayer2[j]);
}
//hidden1 error calculation
for (int j = 0; j < HIDDEN1; j++)
{
    temp = 0;
    for (int k = 0; k < HIDDEN2; k++)
    {
        for (int l = 0; l < INPUT; l++)
        {
            temp += hidden2error[k] * Winput_hiddenlayer1[l, k];
        }
    }
    hidden1error[j] = temp * hiddenlayer1[j] * (1.0 - hiddenlayer1[j]);
}
//hidden2-output weight adjustment
for (int j = 0; j < HIDDEN2; j++)
{
    for (int k = 0; k < OUTPUT; k++)
    {
        Whiddenlayer2_output[j, k] += LEARNING_RATE * outputerror[k] * hiddenlayer2[j];
    }
}
//hidden1-hidden2 weight adjustment
for (int j = 0; j < HIDDEN1; j++)
{
    for (int k = 0; k < HIDDEN2; k++)
    {
        Whiddenlayer1_hiddenlayer2[j, k] += LEARNING_RATE * hidden2error[k] * hiddenlayer1[j];
    }
}
//input-hidden1 weight adjustment
for (int j = 0; j < INPUT; j++)
{
    for (int k = 0; k < HIDDEN1; k++)
    {
        Winput_hiddenlayer1[j, k] += LEARNING_RATE * hidden1error[k] * input[j];
    }
}
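For comparison only (this is not the poster's code), a sketch of the textbook sigmoid backpropagation deltas: the output error includes the full derivative output * (1 - output), and each hidden layer's error sums over the weights leading out of that layer to the next one. Names here are illustrative:

    #include <cstddef>
    #include <vector>

    // Reference deltas for sigmoid units:
    //   output layer: d_j = (target_j - out_j) * out_j * (1 - out_j)
    //   hidden layer: d_j = (sum_k d_k * w[j][k]) * h_j * (1 - h_j)
    std::vector<double> outputDeltas(const std::vector<double>& target,
                                     const std::vector<double>& out) {
        std::vector<double> d(out.size());
        for (std::size_t j = 0; j < out.size(); ++j)
            d[j] = (target[j] - out[j]) * out[j] * (1.0 - out[j]);
        return d;
    }

    std::vector<double> hiddenDeltas(const std::vector<double>& h,           // this layer's activations
                                     const std::vector<double>& nextDeltas,  // next layer's deltas
                                     const std::vector<std::vector<double>>& w) { // w[j][k]: unit j -> next unit k
        std::vector<double> d(h.size());
        for (std::size_t j = 0; j < h.size(); ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < nextDeltas.size(); ++k)
                sum += nextDeltas[k] * w[j][k];
            d[j] = sum * h[j] * (1.0 - h[j]);
        }
        return d;
    }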

Where does the huge performance difference between the two versions of the code come from?

I am working on a problem: given a string s, partition s such that every substring of the partition is a palindrome.
Return the minimum cuts needed for a palindrome partitioning of s. The problem can also be found here: https://oj.leetcode.com/problems/palindrome-partitioning-ii/
Version 1 is a solution I found online.
Version 2 is my code.
They both seem to work in very similar ways. However, with a reasonably large input, version 2 takes more than 6000 milliseconds whereas version 1 takes around 71 milliseconds.
Can anyone provide any idea where the time difference comes from?
Version 1:
int minSol(string s) {
    int len = s.size();
    vector<int> D(len + 1);
    vector<vector<int>> P;
    for (int i = 0; i < len; i++){
        vector<int> t(len);
        P.push_back(t);
    }
    for (int i = 0; i <= len; i++)
        D[i] = len - i;
    for (int i = 0; i < len; i++)
        for (int j = 0; j < len; j++)
            P[i][j] = false;
    for (int i = len - 1; i >= 0; i--){
        for (int j = i; j < len; j++){
            if (s[i] == s[j] && (j - i < 2 || P[i + 1][j - 1])){
                P[i][j] = true;
                D[i] = min(D[i], D[j + 1] + 1);
            }
        }
    }
    return D[0] - 1;
}
Version 2:
int minCut(string s) {
    int size = s.size();
    vector<vector<bool>> map;
    for (int i = 0; i < size; i++){
        vector<bool> t;
        for (int j = 0; j < size; j++){
            t.push_back(false);
        }
        map.push_back(t);
    }
    vector<int> minCuts;
    for (int i = 0; i < size; i++){
        map[i][i] = true;
        minCuts.push_back(size - i - 1);
    }
    for (int i = size - 1; i >= 0; i--){
        for (int j = size - 1; j >= i; j--){
            if (s[i] == s[j] && (j - i <= 1 || map[i + 1][j - 1])){
                map[i][j] = true;
                if (j == size - 1){
                    minCuts[i] = 0;
                } else if (minCuts[i] > minCuts[j + 1] + 1){
                    minCuts[i] = minCuts[j + 1] + 1;
                }
            }
        }
    }
    return minCuts[0];
}
I would guess it's because in the second version you're doing size^2 push_backs, whereas in the first version you're only doing size push_backs.
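If allocation is the culprit, both tables can be built at their final size up front so no push_back runs inside the loops; a minimal sketch of that change (same recurrence as version 2, illustrative only). Version 2 also uses vector<bool>, a packed specialization whose element access goes through a proxy object, which can add further overhead in hot inner loops.

    #include <string>
    #include <vector>

    // Sketch: preallocate the DP tables once, then fill them in place.
    int minCutPrealloc(const std::string& s) {
        int size = (int)s.size();
        if (size == 0) return 0;
        std::vector<std::vector<int>> pal(size, std::vector<int>(size, 0)); // size x size, zeroed; int, not vector<bool>
        std::vector<int> minCuts(size);
        for (int i = 0; i < size; i++) {
            pal[i][i] = 1;
            minCuts[i] = size - i - 1;            // same initial values as version 2
        }
        for (int i = size - 1; i >= 0; i--)
            for (int j = size - 1; j >= i; j--)
                if (s[i] == s[j] && (j - i <= 1 || pal[i + 1][j - 1])) {
                    pal[i][j] = 1;
                    if (j == size - 1) minCuts[i] = 0;
                    else if (minCuts[i] > minCuts[j + 1] + 1) minCuts[i] = minCuts[j + 1] + 1;
                }
        return minCuts[0];
    }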

Looking for SLAB6 implementation

I'm looking to implement SLAB6 in my raycaster, especially the kv6 support for voxel models. However, the SLAB6 source by Ken Silverman is totally unreadable (mostly ASM), so I was hoping someone could point me to a proper C / Java source that loads kv6 models, or explain the workings in some pseudocode (preferably the latter, since I want to know how to support kv6; I know how it works). Thanks, Kaj
EDIT: the implementation would be in Java.
I found some code in an application called VoxelGL (author not mentioned in the source code):
void CVoxelWorld::generateSlabFromData(unsigned char *data, VoxelData *vdata, Slab *slab)
{
    int currentpattern = 1;
    int i = 0;
    int n, totalcount, v, count;
    n = 0;
    v = 0;
    // First pass: count the runs (n) and the solid voxels (v) in the 256-entry column.
    while (1)
    {
        while (data[i] == currentpattern)
        {
            if (currentpattern == 1)
                v++;
            i++;
            if (i == 256)
                break;
        }
        n++;
        if (i == 256)
        {
            if (currentpattern == 0)
                n--;
            break;
        }
        currentpattern ^= 1;
    }
    slab->nentries = n;
    if (slab->description != 0) delete [] slab->description;
    if (slab->data != 0) delete [] slab->data;
    slab->description = new int[n];
    slab->data = new VoxelData[v];
    totalcount = 0;
    v = 0;
    currentpattern = 1;
    // Second pass: store each run length, and copy voxel data for the solid runs.
    for (i = 0; i < n; i++)
    {
        count = 0;
        while (data[totalcount] == currentpattern)
        {
            count++;
            totalcount++;
            if (totalcount == 256)
                break;
        }
        slab->description[i] = count - 1;
        if (i % 2 == 0)
        {
            memcpy(slab->data + v, vdata + totalcount - count, 3 * count);
            v += count;
        }
        currentpattern ^= 1;
    }
}
And:
#define clustersize 8
Slab *CVoxelWorld::getSlab(int x, int z)
{
    int xgrid = x / clustersize;
    int ygrid = z / clustersize;
    int clusteroffset = xgrid * 1024 * clustersize + ygrid * clustersize * clustersize;
    // clustersize is a power of two, so (x & (clustersize - 1)) == x % clustersize
    return &m_data[clusteroffset + (x & (clustersize - 1)) + (z & (clustersize - 1)) * clustersize];
}
And:
int CVoxelWorld::isSolid(int x, int y, int z)
{
    Slab *slab;
    if (y < 0 || y > 256)
        return 0;
    slab = getSlab(x, z);
    int counter = 0;
    // Walk the alternating runs: even entries are solid, odd entries are empty.
    for (int i = 0; i < slab->nentries; i++)
    {
        int height = slab->description[i] + 1;
        if (i % 2 == 0)
        {
            if (y >= counter && y < counter + height)
                return 1;
        }
        counter += height;
    }
    return 0;
}

Teaching a Neural Net: Bipolar XOR

I'm trying to teach a neural net with 2 inputs, 4 hidden nodes (all in the same layer), and 1 output node. The binary representation works fine, but I have problems with the bipolar one. I can't figure out why, but the total error will sometimes converge to the same number around 2.xx. My sigmoid is 2/(1 + exp(-x)) - 1. Perhaps I'm applying the sigmoid in the wrong place. For example, to calculate the output error, should I be comparing the sigmoided output with the expected value, or with the sigmoided expected value?
I was following this website: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html, but they use different functions than the ones I was instructed to use. Even when I tried to implement their functions, I still ran into the same problem. Either way, I get stuck about half the time at the same number (a different number for different implementations). Please tell me if I have made a mistake in my code somewhere, or if this is normal (I don't see how it could be). Momentum is set to 0. Is this a common zero-momentum problem? The error functions we are supposed to be using are:
if u_i is an output unit:
    Error(i) = (C_i - u_i) * f'(S_i)
if u_i is a hidden unit:
    Error(i) = Error(Output) * weight(i to output) * f'(S_i)
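As a quick sanity check (mine, not from the original post): the bipolar sigmoid f(x) = 2/(1 + exp(-x)) - 1 does satisfy f'(x) = 0.5 * (1 - f(x)^2), which matches the fPrime below and can be confirmed numerically:

    #include <cmath>
    #include <cstdio>

    // Numerical check that f'(x) = 0.5 * (1 - f(x)^2)
    // for the bipolar sigmoid f(x) = 2 / (1 + exp(-x)) - 1.
    double f(double x) { return 2.0 / (1.0 + std::exp(-x)) - 1.0; }

    int main() {
        for (double x : {-2.0, -0.5, 0.0, 0.5, 2.0}) {
            double numeric = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6;   // central difference
            double closedForm = 0.5 * (1.0 - f(x) * f(x));
            std::printf("x=%5.2f  numeric=%.8f  closed-form=%.8f\n", x, numeric, closedForm);
        }
        return 0;
    }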
public double sigmoid( double x ) {
    double fBipolar, fBinary, temp;
    temp = (1 + Math.exp(-x));
    fBipolar = (2 / temp) - 1;
    fBinary = 1 / temp;
    if (bipolar) {
        return fBipolar;
    } else {
        return fBinary;
    }
}

// Initialize the weights to random values.
private void initializeWeights(double neg, double pos) {
    for (int i = 0; i < numInputs + 1; i++) {
        for (int j = 0; j < numHiddenNeurons; j++) {
            inputWeights[i][j] = Math.random() - pos;
            if (inputWeights[i][j] < neg || inputWeights[i][j] > pos) {
                print("ERROR ");
                print(inputWeights[i][j]);
            }
        }
    }
    for (int i = 0; i < numHiddenNeurons + 1; i++) {
        hiddenWeights[i] = Math.random() - pos;
        if (hiddenWeights[i] < neg || hiddenWeights[i] > pos) {
            print("ERROR ");
            print(hiddenWeights[i]);
        }
    }
}

// Computes output of the NN without training. I.e. a forward pass.
public double outputFor(double[] argInputVector) {
    for (int i = 0; i < numInputs; i++) {
        inputs[i] = argInputVector[i];
    }
    double weightedSum = 0;
    for (int i = 0; i < numHiddenNeurons; i++) {
        weightedSum = 0;
        for (int j = 0; j < numInputs + 1; j++) {
            weightedSum += inputWeights[j][i] * inputs[j];
        }
        hiddenActivation[i] = sigmoid(weightedSum);
    }
    weightedSum = 0;
    for (int j = 0; j < numHiddenNeurons + 1; j++) {
        weightedSum += (hiddenActivation[j] * hiddenWeights[j]);
    }
    return sigmoid(weightedSum);
}

// Computes the derivative of f.
public static double fPrime(double u) {
    double fBipolar, fBinary;
    fBipolar = 0.5 * (1 - Math.pow(u, 2));
    fBinary = u * (1 - u);
    if (bipolar) {
        return fBipolar;
    } else {
        return fBinary;
    }
}

// This method is used to update the weights of the neural net.
public double train(double[] argInputVector, double argTargetOutput) {
    double output = outputFor(argInputVector);
    double lastDelta;
    double outputError = (argTargetOutput - output) * fPrime(output);
    if (outputError != 0) {
        for (int i = 0; i < numHiddenNeurons + 1; i++) {
            hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
            deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
            hiddenWeights[i] += deltaHiddenWeights[i];
        }
        for (int in = 0; in < numInputs + 1; in++) {
            for (int hid = 0; hid < numHiddenNeurons; hid++) {
                lastDelta = deltaInputWeights[in][hid];
                deltaInputWeights[in][hid] = learningRate * hiddenError[hid] * inputs[in] + (momentum * lastDelta);
                inputWeights[in][hid] += deltaInputWeights[in][hid];
            }
        }
    }
    return 0.5 * (argTargetOutput - output) * (argTargetOutput - output);
}
General coding comments:
initializeWeights(-1.0, 1.0);
may not actually get the initial values you were expecting.
initializeWeights should probably have:
inputWeights[i][j] = Math.random() * (pos - neg) + neg;
// ...
hiddenWeights[i] = (Math.random() * (pos - neg)) + neg;
instead of:
Math.random() - pos;
so that this works:
initializeWeights(0.0, 1.0);
and gives you initial values between 0.0 and 1.0 rather than between -1.0 and 0.0.
lastDelta is used before it is assigned a value (for a local variable in Java, that is a compile error):
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
I'm not sure if the + 1 in numInputs + 1 and numHiddenNeurons + 1 is necessary.
Remember to watch out for rounding of ints: 5/2 = 2, not 2.5!
Use 5.0/2.0 instead. In general, add the .0 in your code when the output should be a double.
Most importantly, have you trained the NeuralNet long enough?
Try running it with numInputs = 2, numHiddenNeurons = 4, learningRate = 0.9, and train 1,000 or 10,000 times.
Using numHiddenNeurons = 2, it sometimes gets "stuck" when trying to solve the XOR problem.
See also XOR problem - simulation