Here is the code I have so far:
function [u]=example222(xrange,trange,uinit,u0bound,uLbound);
n = length(xrange);
m = length(trange);
u = zeros(n,m); %%%
Dx = (xrange(n)-xrange(1))/(n-1);
Dt = (trange(m)-trange(1))/(m-1);
u(:,1)=uinit';
u(1,:)=u0bound;
u(n,:)=uLbound;
gegu=0.08;
alpha=0;
koefa=(-Dt*gegu/(2*Dx));
koefb=(alpha*Dt/(Dx)^2);
u
% first time step
for i = 2:n-1
    u(i,2) = u(i,1)+2*koefa*(u(i+1,1)-u(i-1,1))+(koefb/2)*(u(i-1,1)-2*u(i,1)+u(i+1,1));
end
% subsequent time steps
for j = 2:m-1
    for i = 2:n-1
        u(i,j+1) = u(i,j-1)+koefa*(u(i+1,j)-u(i-1,j))+koefb*(u(i-1,j)-2*u(i,j)+u(i+1,j));
    end
end
______________________________________
x = (0:0.1:1);
t = (0:0.8:8) ;
u0=zeros;uL=zeros;
uinit=1-(10*x-1).^2;
[u]=example222(x,t,uinit,u0,uL);
surf(x,t,u.','EdgeColor','black') % u is n-by-m (x by t); surf expects rows to follow t
The next thing I need to do is restrict uinit=1-(10*x-1).^2 to an interval:
if x-0.08*t < 0.2, then uinit=1-(10*x-1).^2;
else uinit=0;
Can someone help me with that, please? I tried to do it with if clauses and loops and couldn't make it work. Help is greatly appreciated.
There are many ways to define a piecewise function. My usual method is as follows:
uinit = zeros(size(x));
I = x<(0.08*t+0.2); %// Find the indices where x<(0.08*t+0.2)
uinit(I) = 1-(10*x(I)-1).^2;
This case is easier than what you may often have, since you want all the other values of uinit to be zero. If you had another function in the other region (say x^2), you could also do:
uinit(~I) = x(~I).^2;
The operation ~I gives false where I is true and true where I is false, so you get the logical complement of I. (Indexing with 1-I would not work, because 1-I is a numeric array containing zeros, and zero is not a valid index.)
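As a minimal sketch of the masking approach above, assuming the initial condition is evaluated at t = 0 (so the test reduces to x < 0.2) and using x.^2 purely as an illustrative second branch:

```matlab
x  = 0:0.1:1;                   % spatial grid from the question
t0 = 0;                         % assumption: initial condition taken at t = 0
I  = x < (0.08*t0 + 0.2);       % logical mask for the first region

uinit = zeros(size(x));         % default branch: zero everywhere
uinit(I) = 1 - (10*x(I) - 1).^2;

% Variant with a second formula on the complement (x.^2 is only illustrative)
v = zeros(size(x));
v(I)  = 1 - (10*x(I) - 1).^2;
v(~I) = x(~I).^2;               % ~I is the logical complement of I
```

Because I is a logical array, both I and ~I can be used directly as indices, and no if clause or loop is needed.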
I am currently looking for the most efficient way to shift and rearrange large matrices. Essentially, I have data with some parabolic shift that needs to be corrected in order to shift the "signal" to a linear event.
I have currently tried the following solutions and tried timing them. Is there any other method that may prove to be more efficient?
DATA = ones(100000,501);
DATA(10000,251) = 100;
for i=1:250
DATA(10000+i^2-1000:10000+i^2+1000,251-i) = 100;
DATA(10000+i^2-1000:10000+i^2+1000,251+i) = 100;
end
k = abs(-250:1:250).^2;
d = size(DATA,1);
figure(99)
imagesc(DATA)
t_INDEX = timeit(@()fun_INDEX(DATA,k))
t_SNIPPET = timeit(@()fun_SNIPPET(DATA,k))
t_CIRCSHIFT = timeit(@()fun_CIRCSHIFT(DATA,k))
t_INDEX_clean = timeit(@()fun_INDEX_clean(DATA,k))
t_SPARSE = timeit(@()fun_SPARSE(DATA,k))
t_BSXFUN = timeit(@()fun_BSXFUN(DATA,k))
function fun_INDEX(DATA,k)
DATA_1 = zeros(size(DATA));
for i=1:size(DATA,2)
DATA_1(:,i) = DATA([k(i)+1:end 1:k(i)],i);
end
figure(1)
imagesc(DATA_1)
end
function fun_SNIPPET(DATA,k)
kmax = max(k);
DATA_2 = zeros(size(DATA,1)-kmax,size(DATA,2));
for i=1:size(DATA,2)
DATA_2(:,i) = DATA(k(i)+1:end-kmax+k(i),i);
end
figure(2)
imagesc(DATA_2)
end
function fun_CIRCSHIFT(DATA,k)
DATA_3 = zeros(size(DATA));
for i=1:size(DATA,2)
DATA_3(:,i) = circshift(DATA(:,i),-k(i),1);
end
figure(3)
imagesc(DATA_3)
end
function fun_INDEX_clean(DATA,k)
[m, n] = size(DATA);
k = size(DATA,1)-k;
DATA_4 = zeros(m, n);
for i = (1 : n)
DATA_4(:, i) = [DATA((m - k(i) + 1 : m), i); DATA((1 : m - k(i) ), i)];
end
figure(4)
imagesc(DATA_4)
end
function fun_SPARSE(DATA,k)
[m,n] = size(DATA);
k = -k;
S = full(sparse(mod(k,m)+1,1:n,1,m,n));
DATA_5 = ifft(fft(DATA).*fft(S),'symmetric');
figure(5)
imagesc(DATA_5)
end
function fun_BSXFUN(DATA,k)
DATA = DATA';
k = -k;
[m,n] = size(DATA);
idx0 = mod(bsxfun(@plus,n-k(:),1:n)-1,n);
DATA_6 = DATA(bsxfun(@plus,(idx0*m),(1:m)'));
figure(6)
imagesc(DATA_6)
end
Is there any way to decrease computation time for this kind of problem?
Thanks in advance for any tips!
One option would be to use MATLAB's GPU functions, if your workstation has a GPU. Provided the entire data set fits on the GPU at once, it should start to outperform the CPU circshift at around 1000 x 1000 matrix size.
The implementation only requires you to copy your data to the GPU with a single statement and then call circshift on the newly created gpuArray.
A short discussion of its performance can be found here: https://www.mathworks.com/matlabcentral/answers/274619-circshift-slower-on-gpu . In particular, the last post describes a much faster GPU implementation for the case where you don't actually need a circular shift but can get away with zero padding on one side, which might be relevant.
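A rough sketch of what that could look like, assuming the Parallel Computing Toolbox (which provides gpuArray and gather) is available. The CPU fallback, the small test matrix and the shift vector k are illustrative choices, not part of the original answer, and no timing claim is implied:

```matlab
DATA = rand(2000, 50);          % small stand-in for the large matrix
k = (0:49).^2;                  % illustrative per-column shifts

try
    D = gpuArray(DATA);         % single statement that copies the data to the GPU
catch
    D = DATA;                   % assumption: fall back to the CPU if no GPU is usable
end

out = zeros(size(D), 'like', D);
for i = 1:size(D, 2)
    out(:, i) = circshift(D(:, i), -k(i), 1);   % same call as on the CPU
end

if isa(out, 'gpuArray')
    out = gather(out);          % copy the result back to host memory
end
```

Whether this beats the CPU version depends on the matrix size and on how much time the host-to-device copy takes, so it is worth timing on the real data.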
I recently stumbled upon a performance problem while implementing a simulation algorithm. I managed to find the bottleneck function (specifically, it's the internal call to arrayfun that slows everything down):
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = arrayfun(@(x) find(x <= the_f,1,'first'),r);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
It is being used in other parts of code as follows:
h0 = zeros(1,sims);
for i = 1:sims
p = simulate_frequency(the_f,k,n);
h0(i) = max(abs(p - the_p));
end
Here are some possible values:
% Test Case 1
sims = 10000;
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
k = 9;
n = 95;
% Test Case 2
sims = 10000;
the_f = [0.0413; 0.0791; 0.1139; 0.1461; 0.1760; 0.2041; 0.2304; 0.2552; 0.2787; 0.3010; 0.3222; 0.3424; 0.3617; 0.3802; 0.3979; 0.4149; 0.4313; 0.4471; 0.4623; 0.4771; 0.4913; 0.5051; 0.5185; 0.5314; 0.5440; 0.5563; 0.5682; 0.5797; 0.5910; 0.6020; 0.6127; 0.6232; 0.6334; 0.6434; 0.6532; 0.6627; 0.6720; 0.6812; 0.6901; 0.6989; 0.7075; 0.7160; 0.7242; 0.7323; 0.7403; 0.7481; 0.7558; 0.7634; 0.7708; 0.7781; 0.7853; 0.7923; 0.7993; 0.8061; 0.8129; 0.8195; 0.8260; 0.8325; 0.8388; 0.8450; 0.8512; 0.8573; 0.8633; 0.8692; 0.8750; 0.8808; 0.8864; 0.8920; 0.8976; 0.9030; 0.9084; 0.9138; 0.9190; 0.9242; 0.9294; 0.9344; 0.9395; 0.9444; 0.9493; 0.9542; 0.9590; 0.9637; 0.9684; 0.9731; 0.9777; 0.9822; 0.9867; 0.9912; 0.9956; 1.000];
k = 90;
n = 95;
The scalar sims must be in the range 1000 to 1000000. The vector of cumulative frequencies the_f never contains more than 100 elements. The scalar k is the number of elements in the_f. Finally, the scalar n is the number of elements in the empirical sample vector, and can be very large (up to 10000 elements, as far as I can tell).
Any clue about how to improve the computation time of this process?
This seems to be slightly faster for me in the second test case, not the first. The time differences might be larger for longer the_f and larger values of n.
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
[row,col] = find(r <= the_f); % Implicit singleton expansion going on here!
[~,ind] = unique(col,'first');
x = row(ind);
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
I'm using implicit singleton expansion in r <= the_f; use bsxfun if you have an older version of MATLAB (but you know the drill).
find then returns the row and column of every location where r is less than or equal to the_f. unique then gives, for each column, the index of its first entry in that list, which corresponds to the smallest matching row.
Credit: Andrei Bobrov over on MATLAB Answers
Another option (derived from this other answer) is a bit shorter but also a bit more obscure IMO:
mask = r <= the_f;
[x,~] = find(mask & (cumsum(mask,1)==1));
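As a quick sanity check, the sketch below compares the arrayfun version from the question with the find/unique version, the cumsum mask version, and a simple counting variant on the first test case. It assumes implicit expansion (R2016b or later) and that the last entry of the_f is 1, so every sample matches at least one threshold:

```matlab
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
n = 95;
r = rand(1, n);

% Reference: one find per sample, as in the question
x_ref = arrayfun(@(v) find(v <= the_f, 1, 'first'), r);

% find/unique version (r <= the_f expands to a k-by-n logical matrix)
[row, col] = find(r <= the_f);
[~, ind] = unique(col, 'first');
x_unique = row(ind).';

% cumsum mask version: keep only the first true entry of each column
mask = r <= the_f;
[x_mask, ~] = find(mask & (cumsum(mask, 1) == 1));
x_mask = x_mask.';

% counting version: thresholds strictly below each sample, plus one
x_sum = sum(r > the_f, 1) + 1;

agree = isequal(x_ref, x_unique, x_mask, x_sum);
```

If agree is true, the variants are interchangeable for this input and can be timed against each other with timeit.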
If I want performance, I would avoid arrayfun. Even this for loop is faster:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = zeros(1,n); % preallocate instead of growing x inside the loop
for i = 1:numel(r)
x(i) = find(r(i)<the_f,1,'first');
end
sim = (histcounts(x,[1:k Inf]) ./ n).';
end
Running 10000 sims with the first set of the sample data gives the following timing.
Your arrayfun function:
>Elapsed time is 2.848206 seconds.
The for loop function:
>Elapsed time is 0.938479 seconds.
Inspired by Cris Luengo's answer, I suggest below:
function sim = simulate_frequency(the_f,k,n)
r = rand(1,n); %
x = sum(r > the_f)+1;
sim = (histcounts(x,[1:k Inf]) ./ n)';
end
Time:
>Elapsed time is 0.264146 seconds.
You can use histcounts with r as its input:
r = rand(1,n);
sim = (histcounts(r,[-inf ;the_f]) ./ n).';
If histc is used instead of histcounts the whole simulation can be vectorized:
r = rand(n,sims);
p = histc(r, [-inf; the_f],1);
p = [p(1:end-2,:) ;sum(p(end-1:end,:))]./n;
h0 = max(abs(p-the_p(:))); %h0 = max(abs(bsxfun(@minus,p,the_p(:))));
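Since histc is no longer recommended in recent MATLAB releases, the same whole-simulation vectorization can also be sketched with a comparison along a third dimension plus accumarray. This is only an illustration: the_p is not given in the question, so it is assumed here to be the increments of the_f, and sims is kept small:

```matlab
the_f = [0.3010; 0.4771; 0.6021; 0.6990; 0.7782; 0.8451; 0.9031; 0.9542; 1.0000];
the_p = diff([0; the_f]);       % assumption: theoretical frequencies implied by the_f
k = numel(the_f);
n = 95;
sims = 200;                     % kept small for the illustration

r = rand(n, sims);              % one column per simulation

% Bin index of every sample: thresholds strictly below it, plus one
x = squeeze(sum(reshape(r, 1, n, sims) > the_f, 1)) + 1;    % n-by-sims

% Count how often each bin occurs in each simulation (column)
cols = repmat(1:sims, n, 1);
counts = accumarray([x(:), cols(:)], 1, [k, sims]);
p = counts ./ n;                % empirical frequencies, k-by-sims

h0 = max(abs(p - the_p), [], 1);    % one statistic per simulation
```

The intermediate logical array has k*n*sims elements, so for the largest sizes mentioned in the question it may be necessary to process the simulations in chunks.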
I originally wrote the following MATLAB code to find intersections between a set of axis-aligned bounding boxes (AABBs) and space partitions (here 8 partitions). I believe it is readable by itself; moreover, I have added some comments for clarity.
function [A,B] = AABBPart(bbx,it) % bbx: aabb, it: iteration
global F
IT = it+1;
n = size(bbx,1);
F = cell(n,it);
A = Part([min(bbx(:,1:3)),max(bbx(:,4:6))],it,0); % recursive partitioning
B = F; % matlab does not allow
function s = Part(bx,it,J) % output to be global
s = {};
if it < 1; return; end
s = cell(8,1);
p = bx(1:3);
q = bx(4:6);
h = 0.5*(p+q);
prt = [p,h;... % 8 sub-parts (octa)
h(1),p(2:3),q(1),h(2:3);...
p(1),h(2),p(3),h(1),q(2),h(3);...
h(1:2),p(3),q(1:2),h(3);...
p(1:2),h(1),h(1:2),q(3);...
h(1),p(2),h(3),q(1),h(2),q(3);...
p(1),h(2:3),h(1),q(2:3);...
h,q];
for j=1:8 % check for each sub-part
k = 0;
t = zeros(0,1);
for i=1:n
if all(bbx(i,1:3) <= prt(j,4:6)) && ... % intersection test for
all(prt(j,1:3) <= bbx(i,4:6)) % every aabb and sub-part
k = k+1;
t(k) = i;
end
end
if ~isempty(t)
s{j,1} = [t; Part(prt(j,:),it-1,j)]; % recursive call
for i=1:numel(t) % collecting the results
if isempty(F{t(i),IT-it})
F{t(i),IT-it} = [-J,j];
else
F{t(i),IT-it} = [F{t(i),IT-it}; [-J,j]];
end
end
end
end
end
end
Concerns:
In my tests, a few intersections seem to be missing, say 10 or so for a setup of 1000 or more boxes. I would be glad if you could help find any problematic parts of the code.
I am also concerned about using global F. I would prefer to get rid of it.
Any better solution in terms of speed would be welcome.
Note that the code is complete, and you can easily try it with the following setup.
n = 10000; % in the original application, n would be millions
bbx = rand(n,6);
it = 3;
[A,B] = AABBPart(bbx,it);
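On the speed concern, one possible direction is to replace the inner loop over all boxes with a single vectorized overlap test per sub-part. The sketch below shows that test in isolation and checks it against the loop form. It assumes implicit expansion (R2016b or later), and it builds boxes whose minimum corner is not larger than the maximum corner, which the rand(n,6) setup above does not guarantee; the partition part is a hypothetical example:

```matlab
n = 1000;
bbx = rand(n, 6);
bbx(:, 4:6) = bbx(:, 1:3) + 0.1 * rand(n, 3);   % assumption: max corner >= min corner
part = [0 0 0 0.5 0.5 0.5];                     % hypothetical sub-part [min, max]

% Vectorized overlap test: one logical per box
hit = all(bbx(:, 1:3) <= part(4:6), 2) & all(part(1:3) <= bbx(:, 4:6), 2);
t = find(hit);

% Loop form, as in the question, for comparison
t_loop = zeros(0, 1);
for i = 1:n
    if all(bbx(i, 1:3) <= part(4:6)) && all(part(1:3) <= bbx(i, 4:6))
        t_loop(end+1, 1) = i;
    end
end

same = isequal(t, t_loop);
```

A further refinement along the same lines would be to pass only the indices in t to the recursive call, so that deeper levels test fewer boxes; that changes the structure of the function and is not shown here.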
I have written a function that estimates the inverse of e and loops through values of n until the approximated value is within a given tolerance of the actual value.
Currently I use this code:
function [approx, n] = calc_e(tolerance)
for n = 1:inf
approx = ((1-1/n)^n);
diff = (1/exp(1)) - approx;
if diff < tolerance, break; end
end
end
This works fine; however, I have been told that it could be made more efficient by using a while loop, but I can't work out how to do it that way.
Can anybody shed some light on this?
Simply do:
function [approx, n] = calc_e(tolerance)
n = 1;
approx = (1-1/n)^n;
while (1/exp(1)) - approx >= tolerance
n = n + 1;
approx = (1-1/n)^n; % keep the returned value in sync with n
end
end
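For illustration, the same loop can be written as a short script with an iteration cap, so that a tolerance that is too small cannot make it run indefinitely. The names maxIter and target, and the tolerance value, are hypothetical choices for this sketch:

```matlab
tolerance = 1e-3;               % example tolerance
maxIter   = 1e7;                % hypothetical safety cap on n
target    = 1/exp(1);

n = 1;
approx = (1 - 1/n)^n;
while target - approx >= tolerance && n < maxIter
    n = n + 1;
    approx = (1 - 1/n)^n;       % update the estimate for the new n
end
```

The sequence (1-1/n)^n increases towards 1/e, so the difference target - approx stays positive and shrinks as n grows, which is why a one-sided comparison is enough here.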
I'm trying to optimize the performance (i.e. speed) of my code. I'm new to vectorization and tried to vectorize it myself, but was unsuccessful (I also tried bsxfun, parfor, etc.). Can anyone help me optimize this code and give a short description of how to do it?
% for simplicity, create dummy data
Z = rand(250,1);
z1 = rand(100,100);
z2 = rand(100,100);
% updated with the missing parameters, thanks @Bas Swinckels and @Daniel R
j = 2;
n = length(Z);
h = 0.4;
tic
[K1, K2] = size(z1);
result = zeros(K1,K2);
for l = 1 : K1
for m = 1: K2
result(l,m) = sum(K_h(h, z1(l,m), Z(j+1:n)).*K_h(h, z2(l,m), Z(1:n-j)));
end
end
result = result ./ (n-j);
toc
The K_h.m function is the boundary kernel and defined as (x is scalar and y can be vector)
function res = K_h(h, x,y)
res = 0;
if (x >= 0 && x < h)
denominator = integral(@kernelFunc,-x./h,1);
res = 1./h.*kernelFunc((x-y)/h)/denominator;
elseif (x >= h && x <= 1-h)
res = 1./h*kernelFunc((x-y)/h);
elseif (x > 1-h && x <= 1)
denominator = integral(@kernelFunc,-1,(1-x)./h);
res = 1./h.*kernelFunc((x-y)/h)/denominator;
else
fprintf('x is out of [0,1]');
return;
end
end
It takes a long time to obtain the results: Elapsed time is 13.616413 seconds.
Thank you. Any comments are welcome.
P.S. Sorry for my poor English.
Some observations: it seems that Z(j+1:n) and Z(1:n-j) are constant inside the loop, so do the indexing operation before the loop. Next, the loop body is really simple: every result(l, m) depends only on z1(l, m) and z2(l, m). This is an ideal case for the use of arrayfun. A solution might look something like this (untested):
tic
% do constant stuff outside of the loop
Zhigh = Z(j+1:n);
Zlow = Z(1:n-j);
result = arrayfun(@(zz1, zz2) sum(K_h(h, zz1, Zhigh).*K_h(h, zz2, Zlow)), z1, z2);
result = result ./ (n-j);
toc
I am not sure if this will be a lot faster, since I guess the running time will not be dominated by the for-loops, but by all the work done inside the K_h function.
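If the work inside K_h is indeed the bottleneck, one way to cut it down is to evaluate the kernel for all grid points and all samples at once. The sketch below does this for the interior branch only. kernelFunc is not shown in the question, so an Epanechnikov kernel is assumed as a stand-in, and K_int is a hypothetical helper that omits the boundary normalisation; the two boundary branches would additionally need one denominator per grid point:

```matlab
% Stand-in kernel (assumption): Epanechnikov, without the boundary correction
kernelFunc = @(u) 0.75 * (1 - u.^2) .* (abs(u) <= 1);
K_int = @(h, x, y) kernelFunc((x - y) ./ h) ./ h;   % interior branch of K_h only

Z  = rand(250, 1);
z1 = rand(20, 20);              % smaller than in the question, for illustration
z2 = rand(20, 20);
j = 2; n = length(Z); h = 0.4;

Zhigh = Z(j+1:n).';             % 1-by-(n-j) row of samples
Zlow  = Z(1:n-j).';

% Rows index grid points, columns index samples (implicit expansion)
A = K_int(h, z1(:), Zhigh);
B = K_int(h, z2(:), Zlow);
result = reshape(sum(A .* B, 2), size(z1)) ./ (n - j);

% Double loop with the same stand-in kernel, for comparison
ref = zeros(size(z1));
for l = 1:size(z1, 1)
    for m = 1:size(z1, 2)
        ref(l, m) = sum(K_int(h, z1(l, m), Zhigh) .* K_int(h, z2(l, m), Zlow)) / (n - j);
    end
end
maxErr = max(abs(result(:) - ref(:)));
```

The matrices A and B have numel(z1) rows and n-j columns, so memory use grows with the product of the grid size and the sample size.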