Question about Perceptron activation threshold - neural-network

I have a Perceptron written in Javascript that works fine, code below. My question is about the threshold in the activation function. Other code I have seen has something like if (sum > 0) {return 1} else {return 0}. My perceptron only works with if (sum > 1) {return 1} else {return 0}. Why is that? Full code below.
(function () {
"use strict";
function Perceptron(numInputs=2) {
let weights = Array.from({length: numInputs}, () => 2 * Math.random() - 1); // [-1, 1)
this.learningRate = 0.5;
this.train = function (inputs, goal) {
// feed forward
let guess = this.predict(inputs);
// back propagation
let error = goal - guess;
if (error !== 0) {
for (let i = 0; i < weights.length; i += 1) {
weights[i] += this.learningRate * error * inputs[i];
}
}
}
this.predict = function (inputs) {
// transfer function
let sumProducts = 0;
for (let i = 0; i < inputs.length; i += 1) {
sumProducts += inputs[i] * weights[i];
}
// activation function, threshold = 1
return (sumProducts >= 1) ? 1 : 0; // <-- this is the line I have a question about
}
}
// train
let data = [
[0, 0, 0],
[0, 1, 0],
[1, 0, 0],
[1, 1, 1]
];
let p = new Perceptron(2);
const epochs = 20;
for (let i = 0; i < epochs; i += 1) {
let r = Math.floor(Math.random() * data.length);
p.train(data[r].slice(0, 2), data[r].slice(-1));
}
// test
for (let i = 0; i < data.length; i += 1) {
let inputs = data[i].slice(0, 2);
console.log(`inputs = ${inputs}; output = ${p.predict(inputs)}`);
}
}());

Your perceptron lacks a bias term, your equation is of form SUM_i w_i x_i, instead of SUM_i w_i x_i + b. With the functional form you have it is impossible to separate points, where the separating hyperplane does not cross the origin (and yours does not). Alternatively you can add a column of "1s" to your data, it will serve the same purpose, as the corresponding w_i will just behave as b (since all x_i will be 1)

For anyone else following this question, here is the final code. Thanks to #lejlot for helping!
/*jslint for:true, long:true, devel:true, this:true */
/*jshint esversion: 6*/
(function () {
"use strict";
function Perceptron(numInputs = 2) {
this.weights = Array.from({length: numInputs}, () => 2 * Math.random() - 1); // [-1, 1)
this.learningRate = 0.5;
this.bias = 1;
this.train = function (inputs, goal) {
// feed forward
let guess = this.predict(inputs);
// back propagation
let error = goal - guess;
if (error !== 0) {
for (let i = 0; i < this.weights.length; i += 1) {
this.weights[i] += error * inputs[i] * this.learningRate;
}
// Bias should be learned, not fixed. The update rules is
// identical to the weight, just lacks "* inputs[i]" (as
// described in the answer it is equivalent to having a constant
// input of 1, and multiplying by 1 does nothing).
this.bias += error * this.learningRate;
}
};
this.predict = function (inputs) {
// transfer function
let sumProducts = 0;
for (let i = 0; i < inputs.length; i += 1) {
sumProducts += inputs[i] * this.weights[i];
}
// Your perception lacks a bias term, your equation if of form
// SUM_i w_i x_i, instead of SUM_i w_i x_i + b. With the
// functional form you have it is impossible to separate points,
// where the separating hyperplane does not cross the origin
// (and yours does not). Alternatively you can add a column of
// "1s" to your data, it will serve the same purpose, as the
// corresponding w_i will just behave as "b" (since all x_i will
// be 1)
sumProducts += this.bias;
// activation function
return (sumProducts >= 0) ? 1 : 0;
};
}
// train - boolean AND
let data = [
[0, 0, 0],
[0, 1, 0],
[1, 0, 0],
[1, 1, 1]
];
let p = new Perceptron(2);
const epochs = 100;
for (let i = 0; i < epochs; i += 1) {
let r = Math.floor(Math.random() * data.length);
p.train(data[r].slice(0, 2), data[r].slice(-1));
}
// test
for (let i = 0; i < data.length; i += 1) {
let inputs = data[i].slice(0, 2);
console.log(`inputs = ${inputs}; output = ${p.predict(inputs)}`);
}
}());

Related

How to get the hash key?

When I solve this question(149. Max Points on a Line) on leetcode, it have a bug when met this case:
Input [[0,0],[94911151,94911150],[94911152,94911151]]
Output 3
Expected 2
This is my code:
/**
* Definition for a point.
* struct Point {
* int x;
* int y;
* Point() : x(0), y(0) {}
* Point(int a, int b) : x(a), y(b) {}
* };
*/
class Solution {
public:
int maxPoints(vector<Point>& points) {
int size = points.size();
int ans = 0;
if (size == 0) return 0;
unordered_map<double, int> mp;
double k;
for (int i = 0; i < size; ++i) {
int num = 0;
for (int j = i + 1; j < size; ++j) {
if (points[i].x == points[j].x && points[i].y == points[j].y) {
num++;
continue;
}
// my question in below code.
// how can I get the hash key according to slope
if (points[j].x - points[i].x != 0)
k = (double)(points[j].y - points[i].y) / (double)(points[j].x - points[i].x); // calculate the slope.
else k = INT_MAX;
mp[k]++;
}
if (mp[k] == 0) mp[k] = 1, num--;
for (auto it = mp.begin(); it != mp.end(); ++it) {
if (it->second > ans) {
ans = it->second;
ans += num;
}
}
mp.clear();
}
return ans+1;
}
};
In above test case, when it calculate the slope with [0,0] and [94911151,94911150] it comeback k = 1. So I want to know how to get the right hash key to solve this problem?

(Physics engine for games like Sugar, Sugar) SpriteKit performance optimization for many physics sprites

I'm a iOS game developer and I saw an interesting physics & draw game "Sugar, Sugar" recently. In the game, there are lots of pixel particles (thousands of them) generated from the screen and free falling to the ground. Player can draw any shape of lines, which can guide those particles to certain cups. A image from google:
I'm trying to achieve similar effect using SpriteKit with Swift. Here's what I got:
Then I encounter a performance problem. Once the number of particles > 100. The CPU and energy costs are very high. (I use iPhone 6s). So I believe the Physics Engine in "Sugar, Sugar" is much simpler than the realistic SpriteKit. But I don't know what's the physics engine there and how can I achieve this in SpriteKit?
PS:
I use one single image as texture for all those particles, only loaded once to save performance. I only use SKSpriteNode, no ShapeNode is used for performance reason too.
I have not done a sand sim for a long time so I thought I would create a quick demo for you.
It is done in javascript, left mouse adds sand, right mouse draws lines. Depending on the machine it will handle thousands of grains of sand.
It works by creating an array of pixels, each pixel has a x,y position a delta x,y and a flag to indicate it is inactive (dead). Every frame I clear the display and then add the walls. Then for each pixel I check if there are pixels to the sides or below (depending on the direction of movement) and add sideways slippage, bounce of wall, or gravity. If a pixel has not moved for some time I set it as dead and only draw it to save time on the calculations.
The sim is very simple, the first pixel (grain) will never bump into another because it is drawn with a clear display, pixels can only see pixels created before them. But this works well as they self organize and will not overlap each other.
You can find the logic in the function display, (second function from bottom) there is some code for auto demo, then code for drawing the walls, displaying the walls, getting the pixel data and then doing the sim for each pixel.
Its not perfect (like the game you have mentioned) but it is just a quick hack to show how it is done. Also I made it to big for the inset window so best viewed full page.
/** SimpleFullCanvasMouse.js begin **/
const CANVAS_ELEMENT_ID = "canv";
const U = undefined;
var w, h, cw, ch; // short cut vars
var canvas, ctx, mouse;
var globalTime = 0;
var createCanvas, resizeCanvas, setGlobals;
var L = typeof log === "function" ? log : function(d){ console.log(d); }
createCanvas = function () {
var c,cs;
cs = (c = document.createElement("canvas")).style;
c.id = CANVAS_ELEMENT_ID;
cs.position = "absolute";
cs.top = cs.left = "0px";
cs.width = cs.height = "100%";
cs.zIndex = 1000;
document.body.appendChild(c);
return c;
}
resizeCanvas = function () {
if (canvas === U) { canvas = createCanvas(); }
canvas.width = Math.floor(window.innerWidth/4);
canvas.height = Math.floor(window.innerHeight/4);
ctx = canvas.getContext("2d");
if (typeof setGlobals === "function") { setGlobals(); }
}
setGlobals = function(){ cw = (w = canvas.width) / 2; ch = (h = canvas.height) / 2; }
mouse = (function(){
function preventDefault(e) { e.preventDefault(); }
var mouse = {
x : 0, y : 0, w : 0, alt : false, shift : false, ctrl : false, buttonRaw : 0,
over : false, // mouse is over the element
bm : [1, 2, 4, 6, 5, 3], // masks for setting and clearing button raw bits;
mouseEvents : "mousemove,mousedown,mouseup,mouseout,mouseover,mousewheel,DOMMouseScroll".split(",")
};
var m = mouse;
function mouseMove(e) {
var t = e.type;
m.x = e.offsetX; m.y = e.offsetY;
if (m.x === U) { m.x = e.clientX; m.y = e.clientY; }
m.alt = e.altKey; m.shift = e.shiftKey; m.ctrl = e.ctrlKey;
if (t === "mousedown") { m.buttonRaw |= m.bm[e.which-1]; }
else if (t === "mouseup") { m.buttonRaw &= m.bm[e.which + 2]; }
else if (t === "mouseout") { m.buttonRaw = 0; m.over = false; }
else if (t === "mouseover") { m.over = true; }
else if (t === "mousewheel") { m.w = e.wheelDelta; }
else if (t === "DOMMouseScroll") { m.w = -e.detail; }
if (m.callbacks) { m.callbacks.forEach(c => c(e)); }
e.preventDefault();
}
m.addCallback = function (callback) {
if (typeof callback === "function") {
if (m.callbacks === U) { m.callbacks = [callback]; }
else { m.callbacks.push(callback); }
} else { throw new TypeError("mouse.addCallback argument must be a function"); }
}
m.start = function (element, blockContextMenu) {
if (m.element !== U) { m.removeMouse(); }
m.element = element === U ? document : element;
m.blockContextMenu = blockContextMenu === U ? false : blockContextMenu;
m.mouseEvents.forEach( n => { m.element.addEventListener(n, mouseMove); } );
if (m.blockContextMenu === true) { m.element.addEventListener("contextmenu", preventDefault, false); }
}
m.remove = function () {
if (m.element !== U) {
m.mouseEvents.forEach(n => { m.element.removeEventListener(n, mouseMove); } );
if (m.contextMenuBlocked === true) { m.element.removeEventListener("contextmenu", preventDefault);}
m.element = m.callbacks = m.contextMenuBlocked = U;
}
}
return mouse;
})();
var done = function(){
window.removeEventListener("resize",resizeCanvas)
mouse.remove();
document.body.removeChild(canvas);
canvas = ctx = mouse = U;
L("All done!")
}
resizeCanvas(); // create and size canvas
mouse.start(canvas,true); // start mouse on canvas and block context menu
window.addEventListener("resize",resizeCanvas); // add resize event
var simW = 200;
var simH = 200;
var wallCanvas = document.createElement("canvas");
wallCanvas.width = simW;
wallCanvas.height = simH;
var wallCtx = wallCanvas.getContext("2d");
var bounceDecay = 0.7;
var grav = 0.5;
var slip = 0.5;
var sandPerFrame = 5;
var idleTime = 50;
var pixels = [];
var inactiveCounter = 0;
var demoStarted;
var lastMouse;
var wallX;
var wallY;
function display(){ // Sim code is in this function
var blocked;
var obstructed;
w = canvas.width;
h = canvas.height;
var startX = Math.floor(w / 2) - Math.floor(simW / 2);
var startY = Math.floor(h / 2) - Math.floor(simH / 2);
if(lastMouse === undefined){
lastMouse = mouse.x + mouse.y;
}
if(lastMouse === mouse.x + mouse.y){
inactiveCounter += 1;
}else{
inactiveCounter = 0;
}
if(inactiveCounter > 10 * 60){
if(demoStarted === undefined){
wallCtx.beginPath();
var sy = simH / 6;
for(var i = 0; i < 4; i ++){
wallCtx.moveTo(simW * (1/6) - 10,sy * i + sy * 1);
wallCtx.lineTo(simW * (3/ 6) - 10,sy * i + sy * 2);
wallCtx.moveTo(simW * (5/6) + 10,sy * i + sy * 0.5);
wallCtx.lineTo(simW * (3/6) +10,sy * i + sy * 1.5);
}
wallCtx.stroke();
}
mouse.x = startX * 4 + (simW * 2);
mouse.y = startY * 4 + (simH * 2 )/5;
lastMouse = mouse.x + mouse.y;
mouse.buttonRaw = 1;
}
ctx.setTransform(1,0,0,1,0,0); // reset transform
ctx.globalAlpha = 1; // reset alpha
ctx.clearRect(0,0,w,h);
ctx.strokeRect(startX+1,startY+1,simW-2,simH-2)
ctx.drawImage(wallCanvas,startX,startY); // draws the walls
if(mouse.buttonRaw & 4){ // if right button draw walls
if(mouse.x/4 > startX && mouse.x/4 < startX + simW && mouse.y/4 > startY && mouse.y/4 < startY + simH){
if(wallX === undefined){
wallX = mouse.x/4 - startX
wallY = mouse.y/4 - startY
}else{
wallCtx.beginPath();
wallCtx.moveTo(wallX,wallY);
wallX = mouse.x/4 - startX
wallY = mouse.y/4 - startY
wallCtx.lineTo(wallX,wallY);
wallCtx.stroke();
}
}
}else{
wallX = undefined;
}
if(mouse.buttonRaw & 1){ // if left button add sand
for(var i = 0; i < sandPerFrame; i ++){
var dir = Math.random() * Math.PI;
var speed = Math.random() * 2;
var dx = Math.cos(dir) * 2;
var dy = Math.sin(dir) * 2;
pixels.push({
x : (Math.floor(mouse.x/4) - startX) + dx,
y : (Math.floor(mouse.y/4) - startY) + dy,
dy : dx * speed,
dx : dy * speed,
dead : false,
inactive : 0,
r : Math.floor((Math.sin(globalTime / 1000) + 1) * 127),
g : Math.floor((Math.sin(globalTime / 5000) + 1) * 127),
b : Math.floor((Math.sin(globalTime / 15000) + 1) * 127),
});
}
if(pixels.length > 10000){ // if over 10000 pixels reset
pixels = [];
}
}
// get the canvas pixel data
var data = ctx.getImageData(startX, startY,simW,simH);
var d = data.data;
// handle each pixel;
for(var i = 0; i < pixels.length; i += 1){
var p = pixels[i];
if(!p.dead){
var ind = Math.floor(p.x) * 4 + Math.floor(p.y) * 4 * simW;
d[ind + 3] = 0;
obstructed = false;
p.dy += grav;
var dist = Math.floor(p.y + p.dy) - Math.floor(p.y);
if(Math.floor(p.y + p.dy) - Math.floor(p.y) >= 1){
if(dist >= 1){
bocked = d[ind + simW * 4 + 3];
}
if(dist >= 2){
bocked += d[ind + simW * 4 * 2 + 3];
}
if(dist >= 3){
bocked += d[ind + simW * 4 * 3 + 3];
}
if(dist >= 4){
bocked += d[ind + simW * 4 * 4 + 3];
}
if( bocked > 0 || p.y + 1 > simH){
p.dy = - p.dy * bounceDecay;
obstructed = true;
}else{
p.y += p.dy;
}
}else{
p.y += p.dy;
}
if(d[ind + simW * 4 + 3] > 0){
if(d[ind + simW * 4 - 1] === 0 && d[ind + simW * 4 + 4 + 3] === 0 ){
p.dx += Math.random() < 0.5 ? -slip/2 : slip/2;
}else
if(d[ind + 4 + 3] > 0 && d[ind + simW * 4 - 1] === 0 ){
p.dx -= slip;
}else
if(d[ind - 1] + d[ind - 1 - 4] > 0 ){
p.dx += slip/2;
}else
if(d[ind +3] + d[ind + 3 + 4] > 0 ){
p.dx -= slip/2;
}else
if(d[ind + 1] + d[ind + 1] > 0 && d[ind + simW * 4 + 3] > 0 && d[ind + simW * 4 + 4 + 3] === 0 ){
p.dx += slip;
}else
if(d[ind + simW * 4 - 1] === 0 ){
p.dx += -slip/2;
}else
if(d[ind + simW * 4 + 4 + 3] === 0 ){
p.dx += -slip/2;
}
}
if(p.dx < 0){
if(Math.floor(p.x + p.dx) - Math.floor(p.x) <= -1){
if(d[ind - 1] > 0){
p.dx = -p.dx * bounceDecay;
}else{
p.x += p.dx;
}
}else{
p.x += p.dx;
}
}else
if(p.dx > 0){
if(Math.floor(p.x + p.dx) - Math.floor(p.x) >= 1){
if(d[ind + 4 + 3] > 0){
p.dx = -p.dx * bounceDecay;
}else{
p.x += p.dx;
}
}else{
p.x += p.dx;
}
}
var ind = Math.floor(p.x) * 4 + Math.floor(p.y) * 4 * simW;
d[ind ] = p.r;
d[ind + 1] = p.g;
d[ind + 2] = p.b;
d[ind + 3] = 255;
if(obstructed && p.dx * p.dx + p.dy * p.dy < 1){
p.inactive += 1;
if(p.inactive > idleTime){
p.dead = true;
}
}
}else{
var ind = Math.floor(p.x) * 4 + Math.floor(p.y) * 4 * simW;
d[ind ] = p.r;
d[ind + 1] = p.g;
d[ind + 2] = p.b;
d[ind + 3] = 255;
}
}
ctx.putImageData(data,startX, startY);
}
function update(timer){ // Main update loop
globalTime = timer;
display(); // call demo code
// continue until mouse right down
if (!(mouse.buttonRaw & 2)) { requestAnimationFrame(update); } else { done(); }
}
requestAnimationFrame(update);
/** SimpleFullCanvasMouse.js end **/
* { font-family: arial; }
canvas { image-rendering: pixelated; }
<p>Right click drag to draw walls</p>
<p>Left click hold to drop sand</p>
<p>Demo auto starts in 10 seconds is no input</p>
<p>Sim resets when sand count reaches 10,000 grains</p>
<p>Middle button quits sim</p>

imregionalmax matlab function's equivalent in opencv

I have an image of connected components(circles filled).If i want to segment them i can use watershed algorithm.I prefer writing my own function for watershed instead of using the inbuilt function in OPENCV.I have successfu How do i find the regionalmax of objects using opencv?
I wrote a function myself. My results were quite similar to MATLAB, although not exact. This function is implemented for CV_32F but it can easily be modified for other types.
I mark all the points that are not part of a minimum region by checking all the neighbors. The remaining regions are either minima, maxima or areas of inflection.
I use connected components to label each region.
I check each region for any point belonging to a maxima, if yes then I push that label into a vector.
Finally I sort the bad labels, erase all duplicates and then mark all the points in the output as not minima.
All that remains are the regions of minima.
Here is the code:
// output is a binary image
// 1: not a min region
// 0: part of a min region
// 2: not sure if min or not
// 3: uninitialized
void imregionalmin(cv::Mat& img, cv::Mat& out_img)
{
// pad the border of img with 1 and copy to img_pad
cv::Mat img_pad;
cv::copyMakeBorder(img, img_pad, 1, 1, 1, 1, IPL_BORDER_CONSTANT, 1);
// initialize binary output to 2, unknown if min
out_img = cv::Mat::ones(img.rows, img.cols, CV_8U)+2;
// initialize pointers to matrices
float* in = (float *)(img_pad.data);
uchar* out = (uchar *)(out_img.data);
// size of matrix
int in_size = img_pad.cols*img_pad.rows;
int out_size = img.cols*img.rows;
int x, y;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
neighborCheck(in, out, i, x, y, img_pad.cols); // all regions are either min or max
}
cv::Mat label;
cv::connectedComponents(out_img, label);
int* lab = (int *)(label.data);
in = (float *)(img.data);
in_size = img.cols*img.rows;
std::vector<int> bad_labels;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
if (lab[i] != 0) {
if (neighborCleanup(in, out, i, x, y, img.rows, img.cols) == 1) {
bad_labels.push_back(lab[i]);
}
}
}
std::sort(bad_labels.begin(), bad_labels.end());
bad_labels.erase(std::unique(bad_labels.begin(), bad_labels.end()), bad_labels.end());
for (int i = 0; i < out_size; ++i) {
if (lab[i] != 0) {
if (std::find(bad_labels.begin(), bad_labels.end(), lab[i]) != bad_labels.end()) {
out[i] = 0;
}
}
}
}
int inline neighborCleanup(float* in, uchar* out, int i, int x, int y, int x_lim, int y_lim)
{
int index;
for (int xx = x - 1; xx < x + 2; ++xx) {
for (int yy = y - 1; yy < y + 2; ++yy) {
if (((xx == x) && (yy==y)) || xx < 0 || yy < 0 || xx >= x_lim || yy >= y_lim)
continue;
index = xx*y_lim + yy;
if ((in[i] == in[index]) && (out[index] == 0))
return 1;
}
}
return 0;
}
void inline neighborCheck(float* in, uchar* out, int i, int x, int y, int x_lim)
{
int indexes[8], cur_index;
indexes[0] = x*x_lim + y;
indexes[1] = x*x_lim + y+1;
indexes[2] = x*x_lim + y+2;
indexes[3] = (x+1)*x_lim + y+2;
indexes[4] = (x + 2)*x_lim + y+2;
indexes[5] = (x + 2)*x_lim + y + 1;
indexes[6] = (x + 2)*x_lim + y;
indexes[7] = (x + 1)*x_lim + y;
cur_index = (x + 1)*x_lim + y+1;
for (int t = 0; t < 8; t++) {
if (in[indexes[t]] < in[cur_index]) {
out[i] = 0;
break;
}
}
if (out[i] == 3)
out[i] = 1;
}
The following listing is a function similar to Matlab's "imregionalmax". It looks for at most nLocMax local maxima above threshold, where the found local maxima are at least minDistBtwLocMax pixels apart. It returns the actual number of local maxima found. Notice that it uses OpenCV's minMaxLoc to find global maxima. It is "opencv-self-contained" except for the (easy to implement) function vdist, which computes the (euclidian) distance between points (r,c) and (row,col).
input is one-channel CV_32F matrix, and locations is nLocMax (rows) by 2 (columns) CV_32S matrix.
int imregionalmax(Mat input, int nLocMax, float threshold, float minDistBtwLocMax, Mat locations)
{
Mat scratch = input.clone();
int nFoundLocMax = 0;
for (int i = 0; i < nLocMax; i++) {
Point location;
double maxVal;
minMaxLoc(scratch, NULL, &maxVal, NULL, &location);
if (maxVal > threshold) {
nFoundLocMax += 1;
int row = location.y;
int col = location.x;
locations.at<int>(i,0) = row;
locations.at<int>(i,1) = col;
int r0 = (row-minDistBtwLocMax > -1 ? row-minDistBtwLocMax : 0);
int r1 = (row+minDistBtwLocMax < scratch.rows ? row+minDistBtwLocMax : scratch.rows-1);
int c0 = (col-minDistBtwLocMax > -1 ? col-minDistBtwLocMax : 0);
int c1 = (col+minDistBtwLocMax < scratch.cols ? col+minDistBtwLocMax : scratch.cols-1);
for (int r = r0; r <= r1; r++) {
for (int c = c0; c <= c1; c++) {
if (vdist(Point2DMake(r, c),Point2DMake(row, col)) <= minDistBtwLocMax) {
scratch.at<float>(r,c) = 0.0;
}
}
}
} else {
break;
}
}
return nFoundLocMax;
}
I do not know if it is what you want, but in my answer to this post, I gave some code to find local maxima (peaks) in a grayscale image (resulting from distance transform).
The approach relies on subtracting the original image from the dilated image and finding the zero pixels).
I hope it helps,
Good luck
I had the same problem some time ago, and the solution was to reimplement the imregionalmax algorithm in OpenCV/Cpp. It is not that complicated, because you can find the C++ source code of the function in the Matlab distribution. (somewhere in toolbox). All you have to do is to read carefully and understand the algorithm described there. Then rewrite it or remove the matlab-specific checks and you'll have it.

Teaching a Neural Net: Bipolar XOR

I'm trying to to teach a neural net of 2 inputs, 4 hidden nodes (all in same layer) and 1 output node. The binary representation works fine, but I have problems with the Bipolar. I can't figure out why, but the total error will sometimes converge to the same number around 2.xx. My sigmoid is 2/(1+ exp(-x)) - 1. Perhaps I'm sigmoiding in the wrong place. For example to calculate the output error should I be comparing the sigmoided output with the expected value or with the sigmoided expected value?
I was following this website here: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html , but they use different functions then I was instructed to use. Even when I did try to implement their functions I still ran into the same problem. Either way I get stuck about half the time at the same number (a different number for different implementations). Please tell me if I have made a mistake in my code somewhere or if this is normal (I don't see how it could be). Momentum is set to 0. Is this a common 0 momentum problem? The error functions we are supposed to be using are:
if ui is an output unit
Error(i) = (Ci - ui ) * f'(Si )
if ui is a hidden unit
Error(i) = Error(Output) * weight(i to output) * f'(Si)
public double sigmoid( double x ) {
double fBipolar, fBinary, temp;
temp = (1 + Math.exp(-x));
fBipolar = (2 / temp) - 1;
fBinary = 1 / temp;
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// Initialize the weights to random values.
private void initializeWeights(double neg, double pos) {
for(int i = 0; i < numInputs + 1; i++){
for(int j = 0; j < numHiddenNeurons; j++){
inputWeights[i][j] = Math.random() - pos;
if(inputWeights[i][j] < neg || inputWeights[i][j] > pos){
print("ERROR ");
print(inputWeights[i][j]);
}
}
}
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenWeights[i] = Math.random() - pos;
if(hiddenWeights[i] < neg || hiddenWeights[i] > pos){
print("ERROR ");
print(hiddenWeights[i]);
}
}
}
// Computes output of the NN without training. I.e. a forward pass
public double outputFor ( double[] argInputVector ) {
for(int i = 0; i < numInputs; i++){
inputs[i] = argInputVector[i];
}
double weightedSum = 0;
for(int i = 0; i < numHiddenNeurons; i++){
weightedSum = 0;
for(int j = 0; j < numInputs + 1; j++){
weightedSum += inputWeights[j][i] * inputs[j];
}
hiddenActivation[i] = sigmoid(weightedSum);
}
weightedSum = 0;
for(int j = 0; j < numHiddenNeurons + 1; j++){
weightedSum += (hiddenActivation[j] * hiddenWeights[j]);
}
return sigmoid(weightedSum);
}
//Computes the derivative of f
public static double fPrime(double u){
double fBipolar, fBinary;
fBipolar = 0.5 * (1 - Math.pow(u,2));
fBinary = u * (1 - u);
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// This method is used to update the weights of the neural net.
public double train ( double [] argInputVector, double argTargetOutput ){
double output = outputFor(argInputVector);
double lastDelta;
double outputError = (argTargetOutput - output) * fPrime(output);
if(outputError != 0){
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
hiddenWeights[i] += deltaHiddenWeights[i];
}
for(int in = 0; in < numInputs + 1; in++){
for(int hid = 0; hid < numHiddenNeurons; hid++){
lastDelta = deltaInputWeights[in][hid];
deltaInputWeights[in][hid] = learningRate * hiddenError[hid] * inputs[in] + (momentum * lastDelta);
inputWeights[in][hid] += deltaInputWeights[in][hid];
}
}
}
return 0.5 * (argTargetOutput - output) * (argTargetOutput - output);
}
General coding comments:
initializeWeights(-1.0, 1.0);
may not actually get the initial values you were expecting.
initializeWeights should probably have:
inputWeights[i][j] = Math.random() * (pos - neg) + neg;
// ...
hiddenWeights[i] = (Math.random() * (pos - neg)) + neg;
instead of:
Math.random() - pos;
so that this works:
initializeWeights(0.0, 1.0);
and gives you initial values between 0.0 and 1.0 rather than between -1.0 and 0.0.
lastDelta is used before it is declared:
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
I'm not sure if the + 1 on numInputs + 1 and numHiddenNeurons + 1 are necessary.
Remember to watch out for rounding of ints: 5/2 = 2, not 2.5!
Use 5.0/2.0 instead. In general, add the .0 in your code when the output should be a double.
Most importantly, have you trained the NeuralNet long enough?
Try running it with numInputs = 2, numHiddenNeurons = 4, learningRate = 0.9, and train for 1,000 or 10,000 times.
Using numHiddenNeurons = 2 it sometimes get "stuck" when trying to solve the XOR problem.
See also XOR problem - simulation

How to determine Y Axis values on a chart

I'm working on a charting algorithm that will give me a set n array of y axis values I would use on my graph.
The main problem is that I also want to calculate the number of number of steps to use and also use nice numbers for them. It must be able to take integers and doubles and be able to handle small ranges (under 1) and large ranges (over 10000 etc).
For example, if I was given a range of 0.1 - 0.9, ideally i would have values of 0, 0.2, 0.4, 0.6, 0.8, 1 but if I were given 0.3 to 0.7 I might use 0.3, 0.4, 0.5, 0.6, 0.7
This is what I have so far, it works well with small ranges, but terribly in large ranges, and doesn't give me nice numbers
-(double*)yAxisValues:(double)min (double):max {
double diff = max - min;
double divisor = 1.0;
if (diff > 1) {
while (diff > 1) {
diff /= 10;
divisor *= 10;
}
} else {
while (diff < 1) {
diff *= 10;
divisor *= 10;
}
}
double newMin = round(min * divisor) / divisor;
double newMax = round(max * divisor) / divisor;
if (newMin > min) {
newMin -= 1.0/divisor;
}
if (newMax < max) {
newMax += 1.0/divisor;
}
int test2 = round((newMax - newMin) * divisor);
if (test2 >= 7) {
while (test2 % 6 != 0 && test2 % 5 != 0 && test2 % 4 != 0 && test2 % 3 != 0) {
test2++;
newMax += 1.0/divisor;
}
}
if (test2 % 6 == 0) {
test2 = 6;
} else if (test2 % 5 == 0) {
test2 = 5;
} else if (test2 % 4 == 0 || test2 == 2) {
test2 = 4;
} else if (test2 % 3 == 0) {
test2 = 3;
}
double *values = malloc(sizeof(double) * (test2 + 1));
for (int i = 0; i < test2 + 1; i++) {
values[i] = newMin + (newMax - newMin) * i / test2;
}
return values;
}
Any suggestions?
Here's a snippet of code that does something similar, though has a slightly different approach. The "units" refer to what your are plotting on the graph. So if your scale is so that one unit on your graph should be 20 pixels on screen, this function would return how many units each step should be. With that information you can then easily calculate what the axis values are and where to draw them.
- (float)unitsPerMajorGridLine:(float)pixelsPerUnit {
float amountAtMinimum, orderOfMagnitude, fraction;
amountAtMinimum = [[self minimumPixelsPerMajorGridLine] floatValue]/pixelsPerUnit;
orderOfMagnitude = floor(log10(amountAtMinimum));
fraction = amountAtMinimum / pow(10.0, orderOfMagnitude);
if (fraction <= 2) {
return 2 * pow(10.0, orderOfMagnitude);
} else if (fraction <= 5) {
return 5 * pow(10.0, orderOfMagnitude);
} else {
return 10 * pow(10.0, orderOfMagnitude);
}
}
Simple adaptation for JavaScript (thanks a lot to Johan Kool for source)
const step = (() => {let pixelPerUnit = height / (end - size)
, amountAtMinimum = minimumPixelsPerMajorGridLine / pixelPerUnit
, orderOfMagnitude = Math.floor(Math.log10(amountAtMinimum))
, fraction = amountAtMinimum / Math.pow(10.0, orderOfMagnitude);
let result;
if (fraction <= 2) {
result = 2 * Math.pow(10.0, orderOfMagnitude);
} else if (fraction <= 5) {
result = 5 * Math.pow(10.0, orderOfMagnitude);
} else {
result = 10 * Math.pow(10.0, orderOfMagnitude);
}})();
let arr = [];
arr.push(start);
let curVal = start - start % step + step
, pxRatio = height / (end - start);
while (curVal < end) {
arr.push(curVal);
curVal += step;
}
arr.push(end);