Related
I have a data, which may be simulated in the following way:
N = 10^6;%10^8;
K = 10^4;%10^6;
subs = randi([1 K],N,1);
M = [randn(N,5) subs];
M(M<-1.2) = nan;
In other words, it is a matrix, where the last row is subscripts.
Now I want to calculate nanmean() for each subscript. Also I want to save number of rows for each subscript. I have a 'dummy' code for this:
uniqueSubs = unique(M(:,6));
avM = nan(numel(uniqueSubs),6);
for iSub = 1:numel(uniqueSubs)
tmpM = M(M(:,6)==uniqueSubs(iSub),1:5);
avM(iSub,:) = [nanmean(tmpM,1) size(tmpM,1)];
end
The problem is, that it is too slow. I want it to work for N = 10^8 and K = 10^6 (see commented part in the definition of these variables.
How can I find the mean of the data in a faster way?
This sounds like a perfect job for findgroups and splitapply.
% Find groups in the final column
G = findgroups(M(:,6));
% function to apply per group
fcn = #(group) [mean(group, 1, 'omitnan'), size(group, 1)];
% Use splitapply to apply fcn to each group in M(:,1:5)
result = splitapply(fcn, M(:, 1:5), G);
% Check
assert(isequaln(result, avM));
M = sortrows(M,6); % sort the data per subscript
IDX = diff(M(:,6)); % find where the subscript changes
tmp = find(IDX);
tmp = [0 ;tmp;size(M,1)]; % add start and end of data
for iSub= 2:numel(tmp)
% Calculate the mean over just a single subscript, store in iSub-1
avM2(iSub-1,:) = [nanmean(M(tmp(iSub-1)+1:tmp(iSub),1:5),1) tmp(iSub)-tmp(iSub-1)];tmp(iSub-1)];
end
This is some 60 times faster than your original code on my computer. The speed-up mainly comes from presorting the data and then finding all locations where the subscript changes. That way you do not have to traverse the full array each time to find the correct subscripts, but rather you only check what's necessary each iteration. You thus calculate the mean over ~100 rows, instead of first having to check in 1,000,000 rows whether each row is needed that iteration or not.
Thus: in the original you check numel(uniqueSubs), 10,000 in this case, whether all N, 1,000,000 here, numbers belong to a certain category, which results in 10^12 checks. The proposed code sorts the rows (sorting is NlogN, thus 6,000,000 here), and then loop once over the full array without additional checks.
For completion, here is the original code, along with my version, and it shows the two are the same:
N = 10^6;%10^8;
K = 10^4;%10^6;
subs = randi([1 K],N,1);
M = [randn(N,5) subs];
M(M<-1.2) = nan;
uniqueSubs = unique(M(:,6));
%% zlon's original code
avM = nan(numel(uniqueSubs),7); % add the subscript for comparison later
tic
uniqueSubs = unique(M(:,6));
for iSub = 1:numel(uniqueSubs)
tmpM = M(M(:,6)==uniqueSubs(iSub),1:5);
avM(iSub,:) = [nanmean(tmpM,1) size(tmpM,1) uniqueSubs(iSub)];
end
toc
%%%%% End of zlon's code
avM = sortrows(avM,7); % Sort for comparison
%% Start of Adriaan's code
avM2 = nan(numel(uniqueSubs),6);
tic
M = sortrows(M,6);
IDX = diff(M(:,6));
tmp = find(IDX);
tmp = [0 ;tmp;size(M,1)];
for iSub = 2:numel(tmp)
avM2(iSub-1,:) = [nanmean(M(tmp(iSub-1)+1:tmp(iSub),1:5),1) tmp(iSub)-tmp(iSub-1)];
end
toc %tic/toc should not be used for accurate timing, this is just for order of magnitude
%%%% End of Adriaan's code
all(avM(:,1:6) == avM2) % Do the comparison
% End of script
% Output
Elapsed time is 58.561347 seconds.
Elapsed time is 0.843124 seconds. % ~70 times faster
ans =
1×6 logical array
1 1 1 1 1 1 % i.e. the matrices are equal to one another
I have a 1x2 cell A in Matlab. A{i} is a cell of dimension 30494866x1 for i=1,2. A{i}(j) is a 1x21 char for i=1,2 and j=1,...,30494866.
For example I report here A{2}(1:3)
'116374117927631468606'
'112188647432305746617'
'116374117927631468606'
I want to count how many times each 1x21 char in A{2} is repeated. For example, just considering A{2}(1:3), I want to get
'116374117927631468606' 2
'112188647432305746617' 1
What I am doing at the moment is
a=unique(A{2},'stable');
b=cellfun(#(x) sum(ismember(A{2},x)),a);
However this is incredibly slow (running since yesterday). Do you have any suggestion on how I can speed up the code?
Since you want to know how many times each 21-char string used:
1) sort the cell
2) count how many times each string is used in a for loop.
Your code is O(n^2) so it's very slow. This should take less than a minute.
Based on your code
B=sort(A{2});
U=sort(unique(B));
C=zeros(numel(U),1);
cnt = 1;
for j=1:numel(B)
if strcmp(U(cnt),B(j))==1
C(cnt)=C(cnt)+1;
else
cnt = cnt +1;
if cnt <= numel(U)
C(cnt) = C(cnt)+1;
end
end
end
You can do this with the standard unique- accumarray couple:
data = {'116374117927631468606'
'112188647432305746617'
'116374117927631468606'};
[uu, ~, ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [uu, num2cell(count)];
Or, a little more memory-efficient:
data = {'116374117927631468606'
'112188647432305746617'
'116374117927631468606'};
[~, vv, ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [data(vv) num2cell(count)];
here is the code I have, its not simple subtraction. We want subtract each value in one vector from each value in the other vector, within certain bounds tmin and tmax. time_a and time_b are the very long vectors with times (in ps). binsize is just for grouping times in a similar range for plotting. The longest way possible would be to loop through each element and subtract each element in the other vector, but this would take forever and we are talking about vectors with hundreds of megabytes up to gb.
function [c, dt, dtEdges] = coincidence4(time_a,time_b,tmin,tmax,binsize)
% round tmin, tmax to a intiger multiple of binsize:
if mod(tmin,binsize)~=0
tmin=tmin-mod(tmin,binsize)+binsize;
end
if mod(tmax,binsize)~=0
tmax=tmax-mod(tmax,binsize);
end
dt = tmin:binsize:tmax;
dtEdges = [dt(1)-binsize/2,dt+binsize/2];
% dtEdges = linspace((tmin-binsize/2),(tmax+binsize/2),length(dt));
c = zeros(1,length(dt));
Na = length(time_a);
Nb = length(time_b);
tic1=tic;
% tic2=tic1;
% bbMax=Nb;
bbMin=1;
for aa = 1:Na
ta = time_a(aa);
bb = bbMin;
% tic
while (bb<=Nb)
tb = time_b(bb);
d = tb - ta;
if d < tmin
bbMin = bb;
bb = bb+1;
elseif d > tmax
bb = Nb+1;
else
% tic
% [dum, dum2] = histc(d,dtEdges);
index = floor((d-dtEdges(1))/(dtEdges(end)-dtEdges(1))*(length(dtEdges)-1)+1);
% toc
% dt(dum2)
c(index)=c(index)+1;
bb = bb+1;
end
end
% if mod(aa, 200) == 0
% toc(tic2)
% tic2=tic;
% end
end
% c=c(1:end-1);
toc(tic1)
end
Well, not a final answer but a few clue to simplify and accelerate your system:
First, use cached values. For example, in your line:
index = floor((d-dtEdges(1))/(dtEdges(end)-dtEdges(1))*(length(dtEdges)-1)+1);
your loop repeat the same computations every iteration. You can calculate the value before starting the loop, cache it then reuse the stored result:
cached_dt_constant = (dtEdges(end)-dtEdges(1))*(length(dtEdges)-1) ;
Then in your loop simply use:
index = floor( (d-dtEdges(1)) / cached_dt_constant +1 ) ;
if you have so many loop iteration you'll save valuable time this way.
Second, I am not entirely sure of what the computations are trying to achieve, but you can save time again by using the indexing power of matlab. By replacing the lower part of your code like this, I get an execution time 2 to 3 time faster (and the same results obviously).
Na = length(time_a);
Nb = length(time_b);
tic1=tic;
dtEdge_span = (dtEdges(end)-dtEdges(1)) ;
cached_dt_constant = dtEdge_span * (length(dtEdges)-1) ;
for aa = 1:Na
ta = time_a(aa);
d = time_b - ta ;
iok = (d>=tmin) & (d<=tmax) ;
index = floor( (d(iok)-dtEdges(1)) ./ cached_dt_constant +1 ) ;
c(index) = c(index) +1 ;
end
toc(tic1)
end
Now there is only one loop to go through, the inner loop has been removed and replaced by vectorized calculation. By scratching the head a bit further there might be a way to do even without the top loop and use only vectorized computations. Although this will require to have enough memory to handle quite big arrays in one go.
If the precision of each value is not critical (I see you round and floor values often), try converting your initial vectors to 'single' type instead of the default matlab 'double'. that would almost double the size of array your memory will be able to handle in one go.
I have surjective functions created by matching one element in an array MatchesX.trainIdx to one or more elements in a second array MatchesX.queryIdx.
To obtain only the bijective elements of said funciton I run the same function forward
Matches1=Matcher.match(Descriptors1,Descriptors2);
and then backwards
Matches2=Matcher.match(Descriptors2,Descriptors1);
and then look for the elements occuring in both function in following fashion:
k=1;
DoubleMatches=Matches1;
for i=1:length(Matches1)
for j=1:length(Matches2)
if((Matches1(i).queryIdx==Matches2(j).trainIdx)&&(Matches1(i).trainIdx==Matches2(j).queryIdx))
DoubleMatches(k)=Matches1(i);
k=k+1;
end
end
end
DoubleMatches(k:end)=[];
This of course does the work, but it is rather unelegant and seems to bother the JIT accelerator (calc time with accel on and accel off is the same).
Can you think of a way to vectorize this expresion? Is there any other way of avoiding the JIT from "striking"?
Thanks a lot and sorry about the strange structs, I'm working with MEX-functions. Let me know if rewriting the code in "normal" arrays would help
Access to data in multi-dimensional structures is notoriously slow in MATLAB, so transforming your data to an ordinary array will certainly help:
kk = 1;
DoubleMatches = Matches1;
%// transform to regular array
Matches1queryIdx = [Matches1.queryIdx];
Matches1trainIdx = [Matches1.trainIdx];
Matches2queryIdx = [Matches2.queryIdx];
Matches2trainIdx = [Matches2.trainIdx];
%// loop through transformed data instead of structures
for ii = 1:length(Matches1queryIdx)
for jj = 1:length(Matches1queryIdx)
if((Matches1queryIdx(ii)==Matches2trainIdx(jj)) && ...
(Matches1trainIdx(ii)==Matches2queryIdx(jj)))
DoubleMatches(kk) = Matches1(ii);
kk = kk+1;
end
end
end
DoubleMatches(kk:end)=[];
There is also a solution that is almost entirely vectorized:
matches = sum(...
bsxfun(#eq, [Matches1.queryIdx], [Matches2.trainIdx].') & ...
bsxfun(#eq, [Matches1.trainIdx], [Matches2.queryIdx].'));
contents = arrayfun(#(x)..
repmat(Matches1(x),1,matches(x)), 1:numel(matches), ...
'Uniformoutput', false);
DoubleMatches2 = [contents{:}]';
Note that this can be a lot more memory intensive (it has O(N²) peak memory footprint, as opposed to O(N) for the others, although the data type at peak memory is logical and thus 8x smaller than double...). Better do some checks beforehand which one you should use.
A little test. I used the following dummy data:
Matches1 = struct(...
'queryIdx', num2cell(randi(25,1000,1)),...
'trainIdx', num2cell(randi(25,1000,1))...
);
Matches2 = struct(...
'queryIdx', num2cell(randi(25,1000,1)),...
'trainIdx', num2cell(randi(25,1000,1))...
);
and the following test:
%// Your original method
tic
kk = 1;
DoubleMatches = Matches1;
for ii = 1:length(Matches1)
for jj = 1:length(Matches2)
if((Matches1(ii).queryIdx==Matches2(jj).trainIdx) && ...
(Matches1(ii).trainIdx==Matches2(jj).queryIdx))
DoubleMatches(kk) = Matches1(ii);
kk = kk+1;
end
end
end
DoubleMatches(kk:end)=[];
toc
DoubleMatches1 = DoubleMatches;
%// Method with data transformed into regular array
tic
kk = 1;
DoubleMatches = Matches1;
Matches1queryIdx = [Matches1.queryIdx];
Matches1trainIdx = [Matches1.trainIdx];
Matches2queryIdx = [Matches2.queryIdx];
Matches2trainIdx = [Matches2.trainIdx];
for ii = 1:length(Matches1queryIdx)
for jj = 1:length(Matches1queryIdx)
if((Matches1queryIdx(ii)==Matches2trainIdx(jj)) && ...
(Matches1trainIdx(ii)==Matches2queryIdx(jj)))
DoubleMatches(kk) = Matches1(ii);
kk = kk+1;
end
end
end
DoubleMatches(kk:end)=[];
toc
DoubleMatches2 = DoubleMatches;
% // Vectorized method
tic
matches = sum(...
bsxfun(#eq, [Matches1.queryIdx], [Matches2.trainIdx].') & ...
bsxfun(#eq, [Matches1.trainIdx], [Matches2.queryIdx].'));
contents = arrayfun(#(x)repmat(Matches1(x),1,matches(x)), 1:numel(matches), 'Uniformoutput', false);
DoubleMatches3 = [contents{:}]';
toc
%// Check if all are equal
isequal(DoubleMatches1,DoubleMatches2, DoubleMatches3)
Results:
Elapsed time is 6.350679 seconds. %// ( 1×) original method
Elapsed time is 0.636479 seconds. %// (~10×) method with regular array
Elapsed time is 0.165935 seconds. %// (~40×) vectorized
ans =
1 %// indeed, outcomes are equal
Assuming Matcher.match returns array of the same objects as passed to it as arguments you can solve this like this
% m1 are all d1s which have relation to d2
m1 = Matcher.match(d1,d2);
% m2 are all d2s, which have relation to m1
% and all m1 already have backward relation
m2 = Matcher.match(d2,m1);
Please help me to improve the following Matlab code to improve execution time.
Actually I want to make a random matrix (size [8,12,10]), and on every row, only have integer values between 1 and 12. I want the random matrix to have the sum of elements which has value (1,2,3,4) per column to equal 2.
The following code will make things more clear, but it is very slow.
Can anyone give me a suggestion??
clc
clear all
jum_kel=8
jum_bag=12
uk_pop=10
for ii=1:uk_pop;
for a=1:jum_kel
krom(a,:,ii)=randperm(jum_bag); %batasan tidak boleh satu kelompok melakukan lebih dari satu aktivitas dalam satu waktu
end
end
for ii=1:uk_pop;
gab1(:,:,ii) = sum(krom(:,:,ii)==1)
gab2(:,:,ii) = sum(krom(:,:,ii)==2)
gab3(:,:,ii) = sum(krom(:,:,ii)==3)
gab4(:,:,ii) = sum(krom(:,:,ii)==4)
end
for jj=1:uk_pop;
gabh1(:,:,jj)=numel(find(gab1(:,:,jj)~=2& gab1(:,:,jj)~=0))
gabh2(:,:,jj)=numel(find(gab2(:,:,jj)~=2& gab2(:,:,jj)~=0))
gabh3(:,:,jj)=numel(find(gab3(:,:,jj)~=2& gab3(:,:,jj)~=0))
gabh4(:,:,jj)=numel(find(gab4(:,:,jj)~=2& gab4(:,:,jj)~=0))
end
for ii=1:uk_pop;
tot(:,:,ii)=gabh1(:,:,ii)+gabh2(:,:,ii)+gabh3(:,:,ii)+gabh4(:,:,ii)
end
for ii=1:uk_pop;
while tot(:,:,ii)~=0;
for a=1:jum_kel
krom(a,:,ii)=randperm(jum_bag); %batasan tidak boleh satu kelompok melakukan lebih dari satu aktivitas dalam satu waktu
end
gabb1 = sum(krom(:,:,ii)==1)
gabb2 = sum(krom(:,:,ii)==2)
gabb3 = sum(krom(:,:,ii)==3)
gabb4 = sum(krom(:,:,ii)==4)
gabbh1=numel(find(gabb1~=2& gabb1~=0));
gabbh2=numel(find(gabb2~=2& gabb2~=0));
gabbh3=numel(find(gabb3~=2& gabb3~=0));
gabbh4=numel(find(gabb4~=2& gabb4~=0));
tot(:,:,ii)=gabbh1+gabbh2+gabbh3+gabbh4;
end
end
Some general suggestions:
Name variables in English. Give a short explanation if it is not immediately clear,
what they are indented for. What is jum_bag for example? For me uk_pop is music style.
Write comments in English, even if you develop source code only for yourself.
If you ever have to share your code with a foreigner, you will spend a lot of time
explaining or re-translating. I would like to know for example, what
%batasan tidak boleh means. Probably, you describe here that this is only a quick
hack but that someone should really check this again, before going into production.
Specific to your code:
Its really easy to confuse gab1 with gabh1 or gabb1.
For me, krom is too similar to the built-in function kron. In fact, I first
thought that you are computing lots of tensor products.
gab1 .. gab4 are probably best combined into an array or into a cell, e.g. you
could use
gab = cell(1, 4);
for ii = ...
gab{1}(:,:,ii) = sum(krom(:,:,ii)==1);
gab{2}(:,:,ii) = sum(krom(:,:,ii)==2);
gab{3}(:,:,ii) = sum(krom(:,:,ii)==3);
gab{4}(:,:,ii) = sum(krom(:,:,ii)==4);
end
The advantage is that you can re-write the comparsisons with another loop.
It also helps when computing gabh1, gabb1 and tot later on.
If you further introduce a variable like highestNumberToCompare, you only have to
make one change, when you certainly find out that its important to check, if the
elements are equal to 5 and 6, too.
Add a semicolon at the end of every command. Having too much output is annoying and
also slow.
The numel(find(gabb1 ~= 2 & gabb1 ~= 0)) is better expressed as
sum(gabb1(:) ~= 2 & gabb1(:) ~= 0). A find is not needed because you do not care
about the indices but only about the number of indices, which is equal to the number
of true's.
And of course: This code
for ii=1:uk_pop
gab1(:,:,ii) = sum(krom(:,:,ii)==1)
end
is really, really slow. In every iteration, you increase the size of the gab1
array, which means that you have to i) allocate more memory, ii) copy the old matrix
and iii) write the new row. This is much faster, if you set the size of the
gab1 array in front of the loop:
gab1 = zeros(... final size ...);
for ii=1:uk_pop
gab1(:,:,ii) = sum(krom(:,:,ii)==1)
end
Probably, you should also re-think the size and shape of gab1. I don't think, you
need a 3D array here, because sum() already reduces one dimension (if krom is
3D the output of sum() is at most 2D).
Probably, you can skip the loop at all and use a simple sum(krom==1, 3) instead.
However, in every case you should be really aware of the size and shape of your
results.
Edit inspired by Rody Oldenhuis:
As Rody pointed out, the 'problem' with your code is that its highly unlikely (though
not impossible) that you create a matrix which fulfills your constraints by assigning
the numbers randomly. The code below creates a matrix temp with the following characteristics:
The numbers 1 .. maxNumber appear either twice per column or not at all.
All rows are a random permutation of the numbers 1 .. B, where B is equal to
the length of a row (i.e. the number of columns).
Finally, the temp matrix is used to fill a 3D array called result. I hope, you can adapt it to your needs.
clear all;
A = 8; B = 12; C = 10;
% The numbers [1 .. maxNumber] have to appear exactly twice in a
% column or not at all.
maxNumber = 4;
result = zeros(A, B, C);
for ii = 1 : C
temp = zeros(A, B);
for number = 1 : maxNumber
forbiddenRows = zeros(1, A);
forbiddenColumns = zeros(1, A/2);
for count = 1 : A/2
illegalIndices = true;
while illegalIndices
illegalIndices = false;
% Draw a column which has not been used for this number.
randomColumn = randi(B);
while any(ismember(forbiddenColumns, randomColumn))
randomColumn = randi(B);
end
% Draw two rows which have not been used for this number.
randomRows = randi(A, 1, 2);
while randomRows(1) == randomRows(2) ...
|| any(ismember(forbiddenRows, randomRows))
randomRows = randi(A, 1, 2);
end
% Make sure not to overwrite previous non-zeros.
if any(temp(randomRows, randomColumn))
illegalIndices = true;
continue;
end
end
% Mark the rows and column as forbidden for this number.
forbiddenColumns(count) = randomColumn;
forbiddenRows((count - 1) * 2 + (1:2)) = randomRows;
temp(randomRows, randomColumn) = number;
end
end
% Now every row contains the numbers [1 .. maxNumber] by
% construction. Fill the zeros with a permutation of the
% interval [maxNumber + 1 .. B].
for count = 1 : A
mask = temp(count, :) == 0;
temp(count, mask) = maxNumber + randperm(B - maxNumber);
end
% Store this page.
result(:,:,ii) = temp;
end
OK, the code below will improve the timing significantly. It's not perfect yet, it can all be optimized a lot further.
But, before I do so: I think what you want is fundamentally impossible.
So you want
all rows contain the numbers 1 through 12, in a random permutation
any value between 1 and 4 must be present either twice or not at all in any column
I have a hunch this is impossible (that's why your code never completes), but let me think about this a bit more.
Anyway, my 5-minute-and-obvious-improvements-only-version:
clc
clear all
jum_kel = 8;
jum_bag = 12;
uk_pop = 10;
A = jum_kel; % renamed to make language independent
B = jum_bag; % and a lot shorter for readability
C = uk_pop;
krom = zeros(A, B, C);
for ii = 1:C;
for a = 1:A
krom(a,:,ii) = randperm(B);
end
end
gab1 = sum(krom == 1);
gab2 = sum(krom == 2);
gab3 = sum(krom == 3);
gab4 = sum(krom == 4);
gabh1 = sum( gab1 ~= 2 & gab1 ~= 0 );
gabh2 = sum( gab2 ~= 2 & gab2 ~= 0 );
gabh3 = sum( gab3 ~= 2 & gab3 ~= 0 );
gabh4 = sum( gab4 ~= 2 & gab4 ~= 0 );
tot = gabh1+gabh2+gabh3+gabh4;
for ii = 1:C
ii
while tot(:,:,ii) ~= 0
for a = 1:A
krom(a,:,ii) = randperm(B);
end
gabb1 = sum(krom(:,:,ii) == 1);
gabb2 = sum(krom(:,:,ii) == 2);
gabb3 = sum(krom(:,:,ii) == 3);
gabb4 = sum(krom(:,:,ii) == 4);
gabbh1 = sum(gabb1 ~= 2 & gabb1 ~= 0)
gabbh2 = sum(gabb2 ~= 2 & gabb2 ~= 0);
gabbh3 = sum(gabb3 ~= 2 & gabb3 ~= 0);
gabbh4 = sum(gabb4 ~= 2 & gabb4 ~= 0);
tot(:,:,ii) = gabbh1+gabbh2+gabbh3+gabbh4;
end
end