I have gpu
>> d = gpuDevice
d =
CUDADevice with properties:
Name: 'GeForce 800M'
Index: 1
ComputeCapability: '2.1'
SupportsDouble: 1
DriverVersion: 6
ToolkitVersion: 5
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535 65535]
SIMDWidth: 32
TotalMemory: 2.1475e+09
FreeMemory: 1.9886e+09
MultiprocessorCount: 1
ClockRateKHz: 1475000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1`
I try use gpu in neuralnetwork train, but my gpu slower than cpu.
If I try use gpuArray, gpu is faster than cpu, but I haven't speed acceleration in neural network training.
For example
>> a1 = rand(1000); b1 = rand(1000); tic; c1 = a1 * b1; toc;
Elapsed time is 0.044095 seconds.
>> a2 = gpuArray(rand(1000)); b2 = gpuArray(rand(1000)); tic; c2 = a2 * b2; toc;
Elapsed time is 0.000416 seconds.
But in code
net = newff(H, F, Layers, { 'tansig' 'tansig'}, 'traingdx', 'learngdm', 'mse');
net.trainParam.epochs = Epochs;
net.trainParam.show = 500;
net.trainParam.time = 495;
net.trainParam.goal = 2.0000e-11;
net.trainParam.max_fail = 200000;
net.trainParam.min_grad = 1.0000e-050;
net.performParam.regularization = 0.05;
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 0;
if Gpu1 == 1
net = train(net, H, F, 'useGPU', 'yes', 'showResources','yes');
else
net = train(net, H, F, 'showResources','yes');
end;
tic; net = net_example(300, [23, 9], rand(100, 1000), rand(1, 1000), 1); toc;
Computing Resources:
GPU device #1, GeForce 800M
works slower than
tic; net = net_example(300, [23, 9], rand(100, 1000), rand(1, 1000), 0); toc;
Computing Resources:
MEX2
Related
I have the following code:
tic;
H = rand(100, 1000);
F = rand(1, 1000);
net = newff(H, F, [30, 10], { 'tansig' 'tansig'}, 'traingdx', 'learngdm', 'mse');
net.trainParam.epochs = 400;
net.performParam.regularization = 0.05;
net.divideParam.trainRatio = 1;
net.divideParam.valRatio = 0;
net.divideParam.testRatio = 0;
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 0;
% net = train(net, H, F, 'useGPU', 'yes', 'showResources', 'yes'); % line 1
net = train(net, H, F, 'showResources', 'yes'); % line 2
toc;
with line 2 uncommented I get
Computing Resources:
GPU device #1, GeForce 800M
Elapsed time is 5.084222 seconds.
and with line 1 uncommented I get
Computing Resources:
MEX2
Elapsed time is 1.870803 seconds.
Why is GPU slower than CPU?
My GPU properties:
CUDADevice with properties:
Name: 'GeForce 800M'
Index: 1
ComputeCapability: '2.1'
SupportsDouble: 1
DriverVersion: 6
ToolkitVersion: 5
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535 65535]
SIMDWidth: 32
TotalMemory: 2.1475e+09
FreeMemory: 1.9886e+09
MultiprocessorCount: 1
ClockRateKHz: 1475000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1`
I want to customize my neural network training to process within the range of [0,65] - I tried the code:
net.inputs{1}.processParams;
ymin = 0;
ymax = 65;
net.outputs{2}.processParams;
ymin = 0;
ymax = 65;
Yet when I check the Time Series-Response graph it still processes values beyond these two values.
Using:
SampleData:
18
39
51
26
13
9
13
9
13
13
13
4
46
52
52
8
T = tonndata(SampleData,false,false);
trainFcn = 'trainbr';
feedbackDelays = 1:2;
hiddenLayerSize = 400;
net = narnet(feedbackDelays,hiddenLayerSize, 'open', trainFcn);
[x,xi,ai,li] = preparets(net,{},{},T);
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
then....to scale the training processing...
net.inputs{1}.processParams;
ymin = 0;
ymax = 65;
net.outputs{2}.processParams;
ymin = 0;
ymax = 65;
Well another problem has pop up recently.
I have a set representing a curve and a line I drew with the line() function.
So far my code is :
clc, clear all, close all;
n = 800/1500;
I = [ 0 1.1 4 9.5 15.3 19.5 23.1 26 28.2 30.8 33.3 35.9];
E_up = [ 5.8 10.5 28 60.3 85.5 100.3 108 113.2 117 120.5 123.5 126];
E_up = E_up./n;
Iw = [ 34 31.5 28.2 23.9 19.9 16.1 13 8.1 3.5 1.2 0 NaN];
E_down = [124.6 122.5 118.8 112.2 103.9 93.1 81.6 59.1 29.6 14.5 9.5 NaN];
E_down = E_down./n;
x_est = I;
y_est = spline(Iw,E_down,x_est)
A(:,1)= E_up
A(:,2) = y_est
ma = mean(A,2)
% figure()
% hold all
% % plot(x_est,y_est,'ro')
% plot(I,E_up,'b-',Iw,E_down,'g-')
% plot(I,ma,'r')
% grid on
% legend('up','down','mean')
%dane_znamionowe
clc, clear all, close all;
%data_entry
n = 800/1500;
I = [ 0 1.1 4 9.5 15.3 19.5 23.1 26 28.2 30.8 33.3 35.9];
E_up = [ 5.8 10.5 28 60.3 85.5 100.3 108 113.2 117 120.5 123.5 126];
E_up = E_up./n; %rescalling_EMF
Iw = [ 34 31.5 28.2 23.9 19.9 16.1 13 8.1 3.5 1.2 0 NaN];
E_down = [124.6 122.5 118.8 112.2 103.9 93.1 81.6 59.1 29.6 14.5 9.5 NaN];
E_down = E_down./n; %rescalling_EMF
Un = 220;
In = 28.8;
wn = 1500;
wmax = 3000;
P = 5.5e3;
Rs = 15.8/25;
%interpolation
x_est = I;
y_est = spline(Iw,E_down,x_est);
%mean_values
A(:,1)= E_up;
A(:,2) = y_est;
ma = mean(A,2);
%party_Xd
figure()
[ax,h1,h2] = plotyy(I+30,wn,I,ma,'plot','plot');
set(ax(1),'ylim',[0 3000],'ytick',[1500 3000]);
set(ax(2),'ylim',[0 300],'ytick',[100 200 300]);
hold(ax(1))
hold(ax(2))
%stable_parts
set(ax,'NextPlot','add')
plot(ax(2),I,ma,'b')
plot(ax(2),0,Un,'m*')
i2 = 0:0.01:70;
plot(ax(2),i2,Un-(i2*Rs),'m--')
iin = 0:1:300;
plot(ax(2),In,iin,'g-')
plot(ax(1),i2,wn,'k-','linewidth',8)
plot(ax(1),28.8,1500,'g*')
%loop
p1x = [35 45 55 65];
for ii = 1 :length(p1x)
x11 = p1x(ii);
y11 = 0;
x21 = In;
y21 = wn;
x1 = [35 45 55 65];
y1 = [0 0 0 0];
x2 = [In In In In];
y2 = [wn wn wn wn];
slope = (y21-y11)/(x21-x11);
xLeft = 0;
yLeft = slope * (xLeft - x11) + y11;
xRight = 70;
yRight = slope * (xRight - x11) + y11;
plot(ax(2),x11,0,'r.')
a1 = line([xLeft, xRight], [yLeft, yRight], 'Color', 'c');
x0 = (max(min(x1),min(x2))+min(max(x1),max(x2)))/2;
fun1 = #(x) interp1(x1,y1,x,'linear');
fun2 = #(x) interp1(x2,y2,x,'linear');
difffun = #(x) fun1(x)-fun2(x);
crossing = fzero(difffun,x0); %crossing x coordinate
crossval = fun1(crossing);
end
My graph looks like this which is pretty decent.But I need to find the intersection point of the cyan line and blue curve.
An answer based on my solution to a similar question:
%dummy input
x1=[0 1 2 3];
y1=[1 4 2 0];
x2=[-1 3 4 5];
y2=[-1 2 5 3];
x0 = (max(min(x1),min(x2))+min(max(x1),max(x2)))/2;
fun1 = #(x) interp1(x1,y1,x,'linear','extrap');
fun2 = #(x) interp1(x2,y2,x,'linear','extrap');
difffun = #(x) fun1(x)-fun2(x);
crossing = fzero(difffun,x0); %crossing x coordinate
crossval = fun1(crossing); %substitute either function at crossing point
plot(x1,y1,'b-',x2,y2,'r-',crossing,crossval,'ks');
legend('line1','line2','crossover','location','nw');
after which your crossing point is given by [crossing, crossval].
Result:
I'm working on Matlab and was wondering how I add terms within a large matrix. Specifically, I have a 4914x4914 matrix and would like to create a 189x189 matrix, where each term is equal to the sum of the terms in each 26x26 subset.
To illustrate, say I had the magic 4x4 matrix as follows:
[16 2 3 13;
5 11 10 8;
9 7 6 12;
4 14 15 1]
and I wanted to create a 2x2 matrix equal to the sum of each 2x2 matrix within the original magic 4x4, i.e.:
[(16+2+5+11) (3+13+10+8);
(9+7+4+14) (6+12+15+1)]
Grateful for any advice!
Thanks
jake
Assuming A to be the input 4914x4914 matrix, this could be an efficient (in terms of runtime) approach -
sublen = 26; %// subset length
squeeze(sum(reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]),2))
For a generic block size, let's have a function -
function out = sum_blocks(A,block_nrows, block_ncols)
out = squeeze(sum(reshape(sum(reshape(A,block_nrows,[])),...
size(A,1)/block_nrows,block_ncols,[]),2));
return
Sample run -
>> A = randi(9,4,6);
>> A
A =
8 2 4 9 4 5
3 3 8 3 6 8
9 6 6 7 1 9
4 5 5 7 1 2
>> sum_blocks(A,2,3)
ans =
28 35
35 27
>> sum(sum(A(1:2,1:3)))
ans =
28
>> sum(sum(A(1:2,4:6)))
ans =
35
>> sum(sum(A(3:4,1:3)))
ans =
35
>> sum(sum(A(3:4,4:6)))
ans =
27
If you would like to avoid squeeze -
sum(permute(reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]),[1 3 2]),3)
Benchmarking
Hoping you would care about performance, here are the benchmark results for all the solutions posted here. The benchmarking code that I have used -
num_runs = 100; %// Number of iterations to run benchmarks
A = rand(4914);
for k = 1:50000
tic(); elapsed = toc(); %// Warm up tic/toc
end
disp('---------------------- With squeeze + reshape + sum')
tic
for iter = 1:num_runs
sublen = 26; %// subset length
out1 = squeeze(sum(reshape(sum(reshape(A,sublen,[])),...
size(A,1)/sublen,sublen,[]),2));
end
time1 = toc;
disp(['Avg. elapsed time = ' num2str(time1/num_runs) ' sec(s)']), clear out1 sublen
disp('---------------------- With kron + matrix multiplication')
tic
for iter = 1:num_runs
n = 189; k = 26;
B = kron(speye(k), ones(1,n));
result = B*A*B';
end
time2 = toc;
disp(['Avg. elapsed time = ' num2str(time2/num_runs) ' sec(s)']),clear result n k B
disp('---------------------- With accumarray')
tic
for iter = 1:num_runs
s = 26; n = size(A,1)/s;
subs = kron(reshape(1:(n^2), n, n),ones(s));
out2 = reshape(accumarray(subs(:), A(:)), n, n);
end
time2 = toc;
disp(['Avg. elapsed time = ' num2str(time2/num_runs) ' sec(s)']),clear s n subs out2
The benchmarks results I got on my system -
---------------------- With squeeze + reshape + sum
Avg. elapsed time = 0.050729 sec(s)
---------------------- With kron + matrix multiplication
Avg. elapsed time = 0.068293 sec(s)
---------------------- With accumarray
Avg. elapsed time = 0.64745 sec(s)
An alternative way is to reshape the whole matrix into a 4D matrix and sum the elements over first and third dimension:
result = squeeze(sum(sum(reshape(A,26,189,26,189),1),3));
If you don't have the image processing toolbox then you can do this using accumarray:
s = 26;
n = size(A,1)/s;
subs = kron(reshape(1:(n^2), n, n),ones(s));
reshape(accumarray(subs(:), A(:)), n, n)
this is reusable should you decide to aggregate some way other than a simple sum e.g. a median:
reshape(accumarray(subs(:), A(:), [], #median), n, n)
You can use matrix multiplication, of course:
n = 26;
k = 189;
B = kron(speye(k), ones(1,n));
result = B*A*B';
I am facing a problem of classification between 4 classes, I used for this classification Weka and I get a result in this form:
Correctly Classified Instances 3860 96.5 %
Incorrectly Classified Instances 140 3.5 %
Kappa statistic 0.9533
Mean absolute error 0.0178
Root mean squared error 0.1235
Relative absolute error 4.7401 %
Root relative squared error 28.5106 %
Total Number of Instances 4000
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.98 0.022 0.936 0.98 0.957 0.998 A
0.92 0.009 0.973 0.92 0.946 0.997 B
0.991 0.006 0.982 0.991 0.987 1 C
0.969 0.01 0.971 0.969 0.97 0.998 D
Weighted Avg. 0.965 0.012 0.965 0.965 0.965 0.998
=== Confusion Matrix ===
a b c d <-- classified as
980 17 1 2 | a = A
61 920 1 18 | b = B
0 0 991 9 | c = C
6 9 16 969 | d = D
My goal now is to draw (The Detection Error Trade-off) DET curve from results provided by Weka.
I found a MATLAB code that allows me to draw the DET curve, here are some line of code in this function:
Ntrials_True = 1000;
True_scores = randn(Ntrials_True,1);
Ntrials_False = 1000;
mean_False = -3;
stdv_False = 1.5;
False_scores = stdv_False * randn(Ntrials_False,1) + mean_False;
%-----------------------
% Compute Pmiss and Pfa from experimental detection output scores
[P_miss,P_fa] = Compute_DET(True_scores,False_scores);
the code of function Compute_DET is:
[Pmiss, Pfa] = Compute_DET(true_scores, false_scores)
num_true = max(size(true_scores));
num_false = max(size(false_scores));
total=num_true+num_false;
Pmiss = zeros(num_true+num_false+1, 1); %preallocate for speed
Pfa = zeros(num_true+num_false+1, 1); %preallocate for speed
scores(1:num_false,1) = false_scores;
scores(1:num_false,2) = 0;
scores(num_false+1:total,1) = true_scores;
scores(num_false+1:total,2) = 1;
scores=DETsort(scores);
sumtrue=cumsum(scores(:,2),1);
sumfalse=num_false - ([1:total]'-sumtrue);
Pmiss(1) = 0;
Pfa(1) = 1.0;
Pmiss(2:total+1) = sumtrue ./ num_true;
Pfa(2:total+1) = sumfalse ./ num_false;
return
but I have a problem with the translation of the meaning of different parameters. for example what is the significance of mean_False and stdv_False and what is the correspondence with the parameters of Weka?