Neural Net with batch size 1 and one input at a time - autoencoder

I tried a very simple autoencoder: 3 inputs, one hidden layer with 2 neurons, and an output layer with 3 neurons.
The data are just numbers:
0.01 0.02 ........... 1.0
0.011 0.021 1.01
0.012 0.022 1.02
That works if all 100 samples are passed in at once and trained for 200 epochs.
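// Build 100 samples of three values spaced 0.01 apart.
// (The original inner loop over j was redundant: it wrote the same three values three times.)
for (size_t i = 0; i < 100; i++)
{
    samples[i][0] = (float)(i + 1) * 0.01;
    samples[i][1] = (float)(i + 1) * 0.01 + 0.01;
    samples[i][2] = (float)(i + 1) * 0.01 + 0.01 + 0.01;
}
// Train on all 100 samples as one batch for NUM_EPOCHS epochs.
net.fit<mse>(optimizer, samples, samples, 100, NUM_EPOCHS,
             onMinibatch, onEpoch);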
But feeding one sample at a time doesn't work.
Like this:
for (int i = 0; i < NUM_EPOCHS; i++)
{
    for (int j = 0; j < 100; j++)
    {
        samples[0][0] = (float)(j + 1) * 0.01;
        samples[0][1] = (float)(j + 1) * 0.01 + 0.01;
        samples[0][2] = (float)(j + 1) * 0.01 + 0.01 + 0.01;
        net.fit<mse>(optimizer, samples, samples, 1, 1,
                     onMinibatch, onEpoch);
    }
}
Is it that bad to feed single samples into a neural network?
Many thanks for your help.

I made a mistake. Using
net.train_once<mse>...
instead of net.fit<mse> does the job.
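Feeding one sample at a time is ordinary online (stochastic) gradient descent, and it can work as long as the weights carry over from one update to the next. The following is a minimal sketch of that idea in plain Python, not the library used in the question: TinyAutoencoder, its train_once method, the linear activations, and the learning rate of 0.05 are all assumptions made for illustration.

```python
import random

def make_samples():
    # Same 100 three-value rows as in the question.
    return [[(i + 1) * 0.01, (i + 1) * 0.01 + 0.01, (i + 1) * 0.01 + 0.02]
            for i in range(100)]

class TinyAutoencoder:
    """3 -> 2 -> 3 linear autoencoder; a hypothetical stand-in for the real net."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
        self.b1 = [0.0, 0.0]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
        self.b2 = [0.0, 0.0, 0.0]

    def forward(self, x):
        h = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def loss(self, x):
        # Squared reconstruction error for one sample.
        _, y = self.forward(x)
        return sum((yi - xi) ** 2 for yi, xi in zip(y, x))

    def train_once(self, x, lr=0.05):
        # One gradient step on a single sample; weights persist between calls.
        h, y = self.forward(x)
        dy = [2.0 * (yi - xi) for yi, xi in zip(y, x)]
        dh = [sum(self.w2[k][j] * dy[k] for k in range(3)) for j in range(2)]
        for k in range(3):
            for j in range(2):
                self.w2[k][j] -= lr * dy[k] * h[j]
            self.b2[k] -= lr * dy[k]
        for j in range(2):
            for i in range(3):
                self.w1[j][i] -= lr * dh[j] * x[i]
            self.b1[j] -= lr * dh[j]

def mean_loss(net, samples):
    return sum(net.loss(x) for x in samples) / len(samples)

samples = make_samples()
net = TinyAutoencoder()
before = mean_loss(net, samples)
for epoch in range(200):
    for x in samples:          # batch size 1: one sample per update
        net.train_once(x)
after = mean_loss(net, samples)
print("mean loss before:", before, "after:", after)
```

The only point of the sketch is that the same net object is updated in place on every call, so 200 passes of 100 single-sample updates accumulate.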

Related

Writing a MATLAB code using an algorithm for a reciprocal matrix

I want to write a code for the following algorithm using MATLAB.
**Input:** A square reciprocal matrix T = (tij) for (i, j = 1, 2, . . . , n),
**Output:** A square reciprocal matrix C = (cij) for (i, j = 1, 2, . . . , n);
1: for i = 1; i < n; i ++ do
2: for j = i ; j < n; j ++ do
3: cij = sqrt(tij /tji );
4: cji = 1/cij ;
5: end for
6: end for
I have the following matrix T:
T=[...
0.08 0.02 0.34 0.67;...
0.01 0.08 0.17 0.34;...
0.02 0.04 0.09 0.18;...
0.01 0.02 0.04 0.09]
The answer C I found on a paper is:
C = [...
1 2 4 8;...
0.50 1 2 4;...
0.25 0.50 1 2;...
0.13 0.25 0.50 1]
So far, I have tried the following code, but I am not certain about it. I couldn't find the exact answer C above. Any idea or help, please?
n = size(T, 1);   % n was not defined in the original snippet
C = zeros(n, n);
for i = 1:n
    for j = i:n
        C(i,j) = sqrt(T(i,j)/T(j,i));
        C(j,i) = 1/C(i,j);
    end
end
disp(C)
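As a cross-check of the loop above, here is the same algorithm as a small Python sketch. The function name reciprocal_from is made up for illustration, indices are 0-based rather than 1-based, and T is the two-decimal matrix printed in the question.

```python
import math

def reciprocal_from(T):
    """Build C with C[i][j] = sqrt(T[i][j] / T[j][i]) and C[j][i] = 1 / C[i][j]."""
    n = len(T)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            C[i][j] = math.sqrt(T[i][j] / T[j][i])
            C[j][i] = 1.0 / C[i][j]
    return C

T = [
    [0.08, 0.02, 0.34, 0.67],
    [0.01, 0.08, 0.17, 0.34],
    [0.02, 0.04, 0.09, 0.18],
    [0.01, 0.02, 0.04, 0.09],
]
C = reciprocal_from(T)
for row in C:
    print(" ".join(f"{v:6.2f}" for v in row))
```

With T as printed, entry (1,2) comes out as sqrt(0.02/0.01), roughly 1.41, rather than the 2 shown in the paper's C. So the loop does what the pseudocode says, and the mismatch seems to come from the input values rather than from the code; whether the paper used an unrounded or different T is only a guess.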

CUDA fft 2d different results from MATLAB fft on 2d

I have tried to do a simple fft and compare the results between MATLAB and CUDA on 2d arrays.
MATLAB:
array of 9 numbers 1-9
I = [1 2 3
4 5 6
7 8 9];
and use this code:
fft(I)
gives the results:
12.0000 + 0.0000i 15.0000 + 0.0000i 18.0000 + 0.0000i
-4.5000 + 2.5981i -4.5000 + 2.5981i -4.5000 + 2.5981i
-4.5000 - 2.5981i -4.5000 - 2.5981i -4.5000 - 2.5981i
And CUDA code:
int FFT_Test_Function() {
    int width = 3;
    int height = 3;
    int n = width * height;
    double in[width][height];
    Complex out[width][height];
    // Fill the input with 1..9 in row-major order.
    for (int i = 0; i < width; i++)
    {
        for (int j = 0; j < height; j++)
        {
            in[i][j] = (i * width) + j + 1;
        }
    }
    // Allocate the buffer
    cufftDoubleReal *d_in;
    cufftDoubleComplex *d_out;
    unsigned int out_mem_size = sizeof(cufftDoubleComplex)*n;
    unsigned int in_mem_size = sizeof(cufftDoubleReal)*n;
    cudaMalloc((void **)&d_in, in_mem_size);
    cudaMalloc((void **)&d_out, out_mem_size);
    // Save time stamp
    milliseconds timeStart = getCurrentTimeStamp();
    cufftHandle plan;
    cufftResult res = cufftPlan2d(&plan, width, height, CUFFT_D2Z);
    if (res != CUFFT_SUCCESS) { cout << "cufft plan error: " << res << endl; return 1; }
    cudaCheckErrors("cuda malloc fail");
    // Copy the input to the device one row at a time.
    for (int i = 0; i < width; i++)
    {
        cudaMemcpy(d_in + (i * width), &in[i], height * sizeof(double), cudaMemcpyHostToDevice);
        cudaCheckErrors("cuda memcpy H2D fail");
    }
    cudaCheckErrors("cuda memcpy H2D fail");
    // Note: a D2Z transform writes only width * (height/2 + 1) complex values.
    res = cufftExecD2Z(plan, d_in, d_out);
    if (res != CUFFT_SUCCESS) { cout << "cufft exec error: " << res << endl; return 1; }
    // Copy the result back to the host one row at a time.
    for (int i = 0; i < width; i++)
    {
        cudaMemcpy(&out[i], d_out + (i * width), height * sizeof(Complex), cudaMemcpyDeviceToHost);
        cudaCheckErrors("cuda memcpy D2H fail");
    }
    cudaCheckErrors("cuda memcpy D2H fail");
    milliseconds timeEnd = getCurrentTimeStamp();
    milliseconds totalTime = timeEnd - timeStart;
    std::cout << "Total time: " << totalTime.count() << std::endl;
    return 0;
}
With this CUDA code I got the result:
You can see that CUDA gives different results.
What am I missing?
Thank you very much for your attention!
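The cuFFT result looks correct, but your MATLAB code is wrong: fft applied to a matrix transforms each column separately, whereas cufftPlan2d sets up a 2D transform. It should be: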
octave:1> I = [ 1 2 3; 4 5 6; 7 8 9 ]
I =
1 2 3
4 5 6
7 8 9
octave:2> fft2(I)
ans =
45.00000 + 0.00000i -4.50000 + 2.59808i -4.50000 - 2.59808i
-13.50000 + 7.79423i 0.00000 + 0.00000i 0.00000 + 0.00000i
-13.50000 - 7.79423i 0.00000 - 0.00000i 0.00000 - 0.00000i
Note the use of fft2.
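The difference between the two calls can be reproduced without MATLAB or a GPU. The sketch below uses a naive O(n^2) DFT in plain Python; the helper names dft, dft_columns, dft_rows, and dft2 are made up for illustration, and this is not how fft or cuFFT are implemented internally. It only aims to show that a column-wise 1D transform and a full 2D transform give different numbers for this input.

```python
import cmath

def dft(seq):
    """Naive 1D discrete Fourier transform of a sequence."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft_columns(m):
    """Transform each column separately (what fft does for a matrix)."""
    cols = [dft([row[c] for row in m]) for c in range(len(m[0]))]
    return [[cols[c][r] for c in range(len(cols))] for r in range(len(m))]

def dft_rows(m):
    """Transform each row separately."""
    return [dft(row) for row in m]

def dft2(m):
    """2D transform: columns first, then rows (what fft2 computes)."""
    return dft_rows(dft_columns(m))

I = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

col_only = dft_columns(I)
full_2d = dft2(I)

for name, result in (("column-wise", col_only), ("2D", full_2d)):
    print(name)
    for row in result:
        print("  ".join(f"{v.real:9.4f}{v.imag:+9.4f}i" for v in row))
```

Here col_only should reproduce the fft(I) output from the question (first row 12, 15, 18), while full_2d should reproduce the fft2(I) output above (45 in the top-left corner).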

XOR with Neural Networks (Matlab)

So, I'm hoping this is a real dumb thing I'm doing, and there's an easy answer. I'm trying to train a 2x3x1 neural network to do the XOR problem. It wasn't working, so I decided to dig in to see what was happening. Finally, I decided to assign the weights myself. These are the weights I came up with:
theta1 = [11 0 -5; 0 12 -7;18 17 -20];
theta2 = [14 13 -28 -6];
(In MATLAB notation.) I deliberately tried to make no two weights the same (barring the zeros).
And my code, really simple in MATLAB, is:
function layer2 = xornn(iters)
    if nargin < 1
        iters = 50
    end
    function s = sigmoid(X)
        s = 1.0 ./ (1.0 + exp(-X));
    end
    T = [0 1 1 0];
    X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
    theta1 = [11 0 -5; 0 12 -7;18 17 -20];
    theta2 = [14 13 -28 -6];
    for i = [1:iters]
        layer1 = [sigmoid(theta1 * X); 1 1 1 1];
        layer2 = sigmoid(theta2 * layer1)
        delta2 = T - layer2;
        delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
        % remove the bias from delta 1. There's no real point in a delta on the bias.
        delta1 = delta1(1:3,:);
        theta2d = delta2 * layer1';
        theta1d = delta1 * X';
        theta1 = theta1 - 0.1 * theta1d;
        theta2 = theta2 - 0.1 * theta2d;
    end
end
I believe that's right. I tested various parameters (of the thetas) with the finite differences method to see if they were right, and they seemed to be.
But, when I run it, it eventually just all boils down to returning all zeros. If I do xornn(1) (for 1 iteration) I get
0.0027 0.9966 0.9904 0.0008
But, if I do xornn(35)
0.0026 0.9949 0.9572 0.0007
(It's started a descent in the wrong direction) and by the time I get to xornn(45) I get
0.0018 0.0975 0.0000 0.0003
If I run it for 10,000 iterations, it just returns all 0's.
What is going on? Must I add regularization? I would have thought such a simple network wouldn't need it. But, regardless, why does it move away from an obvious good solution that I have hand fed it?
Thanks!
AAARRGGHHH! The solution was simply a matter of changing
theta1 = theta1 - 0.1 * theta1d;
theta2 = theta2 - 0.1 * theta2d;
to
theta1 = theta1 + 0.1 * theta1d;
theta2 = theta2 + 0.1 * theta2d;
sigh
Now tho, I need to figure out how I'm computing the negative derivative somehow when what I thought I was computing was the ... Never mind. I'll post here anyway, just in case it helps someone else.
So, z is the sum of inputs to the sigmoid, and y is the output of the sigmoid.
C = -(T * Log[y] + (1-T) * Log[1-y])
dC/dy = -((T/y) - (1-T)/(1-y))
      = -((T(1-y)-y(1-T))/(y(1-y)))
      = -((T-Ty-y+Ty)/(y(1-y)))
      = -((T-y)/(y(1-y)))
      = ((y-T)/(y(1-y))) # This is the source of all my woes.
dy/dz = y(1-y)
dC/dz = ((y-T)/(y(1-y))) * y(1-y)
      = (y-T)
So the problem is that I was accidentally computing T-y, because I forgot about the negative sign in front of the cost function. Then I was subtracting what I thought was the gradient, but was in fact the negative gradient. And there, that's the problem.
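The result dC/dz = y - T can be checked numerically with a central finite difference, in the same spirit as the finite-difference tests mentioned earlier. This is a small Python sketch; the function names and the step size eps are choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(z, t):
    # Cross-entropy cost for one output: C = -(T log y + (1 - T) log(1 - y)).
    y = sigmoid(z)
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))

def analytic_grad(z, t):
    # dC/dz from the derivation: y - T.
    return sigmoid(z) - t

def numeric_grad(z, t, eps=1e-6):
    # Central finite difference approximation of dC/dz.
    return (cost(z + eps, t) - cost(z - eps, t)) / (2 * eps)

for z in (-2.0, 0.5, 3.0):
    for t in (0, 1):
        print(z, t, analytic_grad(z, t), numeric_grad(z, t))
```

If the two columns agree, then T - y is the negative gradient, and a descent step has to add 0.1 * (T - y) * input rather than subtract it.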
With the sign of the update fixed:
function layer2 = xornn(iters)
    if nargin < 1
        iters = 50
    end
    function s = sigmoid(X)
        s = 1.0 ./ (1.0 + exp(-X));
    end
    T = [0 1 1 0];
    X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
    theta1 = [11 0 -5; 0 12 -7;18 17 -20];
    theta2 = [14 13 -28 -6];
    for i = [1:iters]
        layer1 = [sigmoid(theta1 * X); 1 1 1 1];
        layer2 = sigmoid(theta2 * layer1)
        delta2 = T - layer2;
        delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
        % remove the bias from delta 1. There's no real point in a delta on the bias.
        delta1 = delta1(1:3,:);
        theta2d = delta2 * layer1';
        theta1d = delta1 * X';
        theta1 = theta1 + 0.1 * theta1d;
        theta2 = theta2 + 0.1 * theta2d;
    end
end
xornn(50) returns 0.0028 0.9972 0.9948 0.0009 and
xornn(10000) returns 0.0016 0.9989 0.9993 0.0005
Phew! Maybe this will help someone else in debugging their version.
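For anyone without MATLAB, here is a rough Python port of xornn that takes the update sign as a parameter, so both behaviours can be compared side by side. It is a sketch under the assumption that it mirrors the MATLAB code above line for line (same hand-picked weights, same 0.1 step, output taken from the last forward pass); the sign argument and the abs_error helper are additions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

T = [0, 1, 1, 0]
# One (x1, x2, bias) tuple per sample, i.e. the columns of X in the post.
X = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]

def xornn(iters, sign):
    """sign=+1 is the fixed update, sign=-1 is the original buggy one."""
    theta1 = [[11.0, 0.0, -5.0], [0.0, 12.0, -7.0], [18.0, 17.0, -20.0]]
    theta2 = [14.0, 13.0, -28.0, -6.0]
    out = []
    for _ in range(iters):
        theta1d = [[0.0] * 3 for _ in range(3)]
        theta2d = [0.0] * 4
        out = []
        for x, t in zip(X, T):
            # Hidden layer plus a constant 1 for the output bias.
            hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                      for row in theta1] + [1.0]
            y = sigmoid(sum(w * h for w, h in zip(theta2, hidden)))
            out.append(y)
            delta2 = t - y
            for j in range(4):
                theta2d[j] += delta2 * hidden[j]
            for j in range(3):
                delta1 = hidden[j] * (1.0 - hidden[j]) * theta2[j] * delta2
                for i in range(3):
                    theta1d[j][i] += delta1 * x[i]
        for j in range(3):
            for i in range(3):
                theta1[j][i] += sign * 0.1 * theta1d[j][i]
        for j in range(4):
            theta2[j] += sign * 0.1 * theta2d[j]
    return out

def abs_error(outputs):
    return sum(abs(t - y) for t, y in zip(T, outputs))

fixed = xornn(50, +1)
buggy = xornn(50, -1)
print("fixed:", [round(v, 4) for v in fixed])
print("buggy:", [round(v, 4) for v in buggy])
```

Because the deltas are built from T - y, they already point downhill on the cost, which is why the + sign is the one that keeps the hand-fed solution.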

MATLAB bsxfun or vectorization

I have been working on vectorizing my code, mostly using bsxfun, but I came across a scenario that I can't quite crack. Here is a small sample of the problem. I would like to remove the for loops in this code, but I am having a hard time with the tempEA line.
Index = [2; 3; 4;];
dTime = [25; 26; 27; 28; 25; 26; 27; 28; 27; 28];
dIndex = [3; 3; 3; 2; 1; 3; 2; 4; 4; 2];
aTime = [30; 38; 34; 39; 30; 38; 34; 39; 34; 39];
aIndex = [4; 2; 5; 4; 5; 4; 4; 2; 2; 4];
EA = zeros(numel(Index));
for i = 1:numel(Index)
    for j = 1:numel(Index)
        tempEA = aTime(Index(i) == dIndex(:,1) & Index(j) == aIndex(:,1));
        if i == j
        elseif tempEA > 0
            EA(i,j) = min(tempEA);
        else
            EA(i,j) = 50;
        end
    end
end
The answer should look like this:
EA =
0 50 34
38 0 30
34 50 0
Thanks for help in advance.
This uses bsxfun; no loops. It assumes you don't have NaNs among your aTime values.
N = numel(Index);
ii = bsxfun(@eq, dIndex.', Index); % selected values according to each i
jj = bsxfun(@eq, aIndex.', Index); % selected values according to each j
[ igrid jgrid ] = ndgrid(1:N); % generate all combinations of i and j
match = double(ii(igrid(:),:) & jj(jgrid(:),:)); % each row contains the matches for an (i,j) combination
match(~match) = NaN; % these entries will not be considered when minimizing
result = min(bsxfun(@times, aTime, match.')); % minimize according to each row of "match"
result = reshape(result,[N N]);
result(isnan(result)) = 50; % set NaN to 50
result(result<=0) = 50; % set nonpositive values to 50
result(1:N+1:end) = 0; % set diagonal to 0
The line result(result<=0) = 50; is only necessary if your aTime can contain nonpositive values. Can it? Or is your elseif tempEA > 0 just a way of checking that tempEA is not empty?
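To check the expected EA matrix independently of MATLAB, the same lookup can be written as a single pass over the records in Python. This is a sketch rather than a translation of the bsxfun answer: the function name earliest_arrivals and the dictionary approach are made up for illustration, and it assumes every aTime value is positive, so the tempEA > 0 test only ever distinguishes "some match" from "no match".

```python
Index = [2, 3, 4]
dTime = [25, 26, 27, 28, 25, 26, 27, 28, 27, 28]
dIndex = [3, 3, 3, 2, 1, 3, 2, 4, 4, 2]
aTime = [30, 38, 34, 39, 30, 38, 34, 39, 34, 39]
aIndex = [4, 2, 5, 4, 5, 4, 4, 2, 2, 4]

def earliest_arrivals(index, d_index, a_index, a_time, default=50):
    # Smallest aTime seen for each (dIndex, aIndex) pair.
    best = {}
    for d, a, t in zip(d_index, a_index, a_time):
        key = (d, a)
        if key not in best or t < best[key]:
            best[key] = t
    # Diagonal stays 0; pairs with no matching record get the default.
    return [[0 if i == j else best.get((di, aj), default)
             for j, aj in enumerate(index)]
            for i, di in enumerate(index)]

EA = earliest_arrivals(Index, dIndex, aIndex, aTime)
for row in EA:
    print(row)
```

For the sample data this should print the same three rows as the expected EA in the question.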

MATLAB neural networks

I'm using a simple XOR input and output data set to train a neural network before attempting anything harder, but for some reason it won't work.
Could someone please explain what I am doing wrong?
This is my code:
%user specified values
hidden_neurons = 3;
epochs = 10000;
t_input = [1 1; 1 0; 0 1; 0 0];
t_output = [1; 0; 0; 1];
te_input = [1 1; 1 0; 0 1; 0 0];
net = newff(t_input, t_output, 1);
net = init(net);
net = train(net, t_input, t_output);
net.trainParam.show = 50;
net.trainParam.lr = 0.25;
net.trainParam.epochs = epochs;
net.trainParam.goal = 1e-5;
net = train(net, t_input, t_output);
out = sim(net, te_input);
This is my error message:
??? Error using ==> network.train at 145
Targets are incorrectly sized for network.
Matrix must have 2 columns.
Error in ==> smallNN at 11
net = train(net, t_input, t_output);
The toolbox expects one sample per column, not one per row, so the inputs and targets must be transposed. Change the data set creation lines to:
t_input = [1 1; 1 0; 0 1; 0 0]';
t_output = [1; 0; 0; 1]';
te_input = [1 1; 1 0; 0 1; 0 0]';
Now it works.
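The layout change is easy to see with plain nested lists. The following Python sketch only illustrates the shapes before and after the transpose (the transpose helper is hypothetical and unrelated to the toolbox itself):

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

rows_input = [[1, 1], [1, 0], [0, 1], [0, 0]]   # 4 samples x 2 features
rows_output = [[1], [0], [0], [1]]              # 4 samples x 1 target

t_input = transpose(rows_input)     # 2 features x 4 samples
t_output = transpose(rows_output)   # 1 target  x 4 samples

print(t_input)
print(t_output)
```

After the transpose, inputs and targets have the same number of columns (4, one per sample), which is the condition the error message complains about.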