I'm trying to figure out how to access Matlab sub-arrays (part of an array) with a generic set of subscript vectors.
In general, the problem is defined as:
Given two n-dim endpoints of an array index (both size nd), one having the initial set of indices (startInd) and the other having the last set of indices (endInd), how to access the sub-matrix which is included between the pair of index-sets?
For example, I want to replace this:
Mat=rand(10,10,10,10);
Mat(2:7, 1:6, 1:6, 2:8) = 1.0;
With an operation that can accept any set of two n-dim vectors specifying the indices for the last operation, which is "abstractly" expressed as:
Mat=rand(10,10,10,10);
startInd=[2 1 1 2];
endInd =[7 6 6 8];
IndexVar=???
Mat(IndexVar) = 1.0;
Thus I want to access the sub-matrix Mat(2:7, 1:6, 1:6, 2:8) using a variable or some other generic form that allows a generic n-dim. Preferably not a loop (as it is slow).
I have tried using something of this nature:
% Generate each index list separately:
nDims=length(startInd);
ind=cell(nDims,1);
for j=1:nDims
ind{j}=startInd(j):1:endInd(j);
end
% Access the matrix:
S.type = '()';
S.subs = ind;
Mat=subsasgn(Mat,S,1.0)
This seems to get the job done, but it is very slow and memory-expensive. Still, it might give someone an idea...
If you don't mind looping over dimensions (which should be much faster than looping over array entries):
indexVar = arrayfun(@(a,b) colon(a,b), startInd, endInd, 'UniformOutput', false);
Mat(indexVar{:}) = 1;
This uses arrayfun (essentially a loop) to create a cell array with the indexing vectors, which is then expanded into a comma-separated list.
Now that I see your code: this uses the same approach, only that the loop is replaced by arrayfun and the comma-separated list allows a more natural indexing syntax instead of subsasgn.
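As a rough check of the comma-separated-list approach, here is a minimal sketch that compares it against the hard-coded indexing from the question. The names Ref, idx, sameResult and blockSize are only illustrative and do not appear in the original posts.

```matlab
% Sketch: build one index vector per dimension, then expand the cell
% array into a comma-separated list of subscripts.
Mat = rand(10,10,10,10);
startInd = [2 1 1 2];
endInd   = [7 6 6 8];

% Reference result using hard-coded subscripts.
Ref = Mat;
Ref(2:7, 1:6, 1:6, 2:8) = 1.0;

% Generic version: idx{k} holds startInd(k):endInd(k).
idx = arrayfun(@(a,b) a:b, startInd, endInd, 'UniformOutput', false);
Mat(idx{:}) = 1.0;

sameResult = isequal(Mat, Ref);   % both assignments touch the same block
blockSize  = size(Mat(idx{:}));   % size of the addressed sub-array
```

The same idx{:} expansion works for reading as well as assigning, and for any number of dimensions, as long as startInd and endInd have one entry per dimension.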
My goal is to compare two matrices: A and B in two different files:
function [Result]=test()
A_Mat= load('fileA', 'A')
B_Mat= load('fileB', 'B')
Result= A_Mat == B_Mat
end
The result that I want is a matrix that includes the difference between A and B.
The error that I have is:
error: binary operator '==' not implemented for 'scalar struct' by 'scalar struct' operations
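The load function doesn't return what you think it returns: called with an output argument, it returns a struct with one field per loaded variable. Reading the extensive and easily comprehensible MATLAB documentation always helps. Without an output argument, the variables are loaded directly into the workspace: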
function Result=test()
load('fileA', 'A');
load('fileB', 'B');
Result = A == B
end
Use the isequal function.
isequal(A,B)
If you simply want the difference between A and B you should first use load as dasdingonesin suggested, and either check for full matrix equality with isequal or elementwise equality with ==. The difference, however, is simply given by - of course:
isequal(A,B); % returns a boolean for full matrix equality
A==B; % returns a logical matrix with element wise equality
A-B; % returns a matrix with differences between the two matrices
Do note that isequal can deal with matrices of unequal size (it will simply return 0), whilst both == and - will crash with the error Matrix dimensions must agree.
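To make the three options concrete, here is a small sketch on two made-up 2x2 matrices (the values and variable names are illustrative only):

```matlab
A = [1 2; 3 4];
B = [1 0; 3 5];

allEqual  = isequal(A, B);        % one logical for the whole matrix
elemEqual = A == B;               % logical matrix, one entry per element
delta     = A - B;                % numeric differences
sizeCheck = isequal(A, [1 2 3]);  % different sizes: false, no error
```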
First, the operator == does work on matrices, and it returns a logical matrix of true/false (1/0) where the corresponding items are equal or different respectively. From the error you got it seems that you didn't read matrices from the file, but structs, and indeed, == doesn't work for structs.
You can use isequal for both structs and matrices. This function returns only one value - 1 or 0 (true/false).
ADDED
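After seeing @dasdingonesin's answer, which actually pointed to the exact problem, I just wanted to add that when you write
A_Mat= load('fileA', 'A')
it returns a struct with the field A, so A_Mat is a struct and not the matrix itself.
So store the struct in a variable (say s) and extract the field:
s = load('fileA', 'A');
A_Mat = s.A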
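A self-contained way to see this behaviour is sketched below. It writes a throwaway MAT-file to a temporary location (the file name comes from tempname, not from the question) and then loads it back into a struct.

```matlab
% Create a small test file so the example does not depend on fileA.
A = magic(3);
tmpFile = [tempname '.mat'];   % hypothetical temporary file
save(tmpFile, 'A');

% With an output argument, load returns a struct, not the matrix.
s = load(tmpFile, 'A');
gotStruct = isstruct(s);

% The matrix itself is the field named after the saved variable.
A_Mat = s.A;

delete(tmpFile);               % clean up the temporary file
```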
I'm trying to insert multiple values into an array using a 'values' array and a 'counter' array. For example, if:
a=[1,3,2,5]
b=[2,2,1,3]
I want the output of some function
c=somefunction(a,b)
to be
c=[1,1,3,3,2,5,5,5]
Where a(1) recurs b(1) number of times, a(2) recurs b(2) times, etc...
Is there a built-in function in MATLAB that does this? I'd like to avoid using a for loop if possible. I've tried variations of 'repmat()' and 'kron()' to no avail.
This is basically run-length decoding.
Problem Statement
We have an array of values, vals, and an array of run lengths, runlens:
vals = [1,3,2,5]
runlens = [2,2,1,3]
We need to repeat each element of vals as many times as given by the corresponding element of runlens. Thus, the final output would be:
output = [1,1,3,3,2,5,5,5]
Prospective Approach
One of the fastest tools with MATLAB is cumsum and is very useful when dealing with vectorizing problems that work on irregular patterns. In the stated problem, the irregularity comes with the different elements in runlens.
Now, to exploit cumsum, we need to do two things here: Initialize an array of zeros and place "appropriate" values at "key" positions over the zeros array, such that after "cumsum" is applied, we would end up with a final array of repeated vals of runlens times.
Steps: Let's number the above mentioned steps to give the prospective approach an easier perspective:
1) Initialize zeros array: What must be the length? Since we are repeating runlens times, the length of the zeros array must be the summation of all runlens.
2) Find key positions/indices: These key positions are the places along the zeros array where each element of vals starts to repeat.
Thus, for runlens = [2,2,1,3], the key positions mapped onto the zeros array would be:
[X 0 X 0 X X 0 0] % where X's are those key positions.
3) Find appropriate values: The last step before using cumsum is to put "appropriate" values into those key positions. Since cumsum is applied right afterwards, these must be the differences between consecutive elements of vals (computed with diff, with a leading 0 prepended), so that cumsum brings back the original values. Because these differences are placed on the zeros array at positions separated by the runlens distances, cumsum yields each vals element repeated runlens times as the final output.
Solution Code
Here's the implementation stitching up all the above mentioned steps -
% Calculate cumsumed values of runLengths.
% We would need this to initialize zeros array and find key positions later on.
clens = cumsum(runlens)
% Initialize zeros array
array = zeros(1,(clens(end)))
% Find key positions/indices
key_pos = [1 clens(1:end-1)+1]
% Find appropriate values
app_vals = diff([0 vals])
% Map app_vals at key_pos on array
array(key_pos) = app_vals
% cumsum array for final output
output = cumsum(array)
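For the example inputs from the problem statement, the intermediate values can be traced as in the sketch below. It assumes that vals and runlens are row vectors and that every run length is at least 1, since a zero run length would produce duplicate key positions.

```matlab
vals    = [1 3 2 5];
runlens = [2 2 1 3];

clens    = cumsum(runlens);        % [2 4 5 8]
array    = zeros(1, clens(end));   % 1x8 row of zeros
key_pos  = [1 clens(1:end-1)+1];   % [1 3 5 6]
app_vals = diff([0 vals]);         % [1 2 -1 3]
array(key_pos) = app_vals;         % [1 0 2 0 -1 3 0 0]
output   = cumsum(array);          % [1 1 3 3 2 5 5 5]
```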
Pre-allocation Hack
The code listed above pre-allocates with zeros. According to this UNDOCUMENTED MATLAB blog on faster pre-allocation, one can achieve much faster pre-allocation with -
array(clens(end)) = 0; % instead of array = zeros(1,(clens(end)))
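One caveat worth keeping in mind: this shortcut only yields an all-zeros row vector when the variable does not exist yet. A minimal sketch, using a hypothetical variable arr and a fixed length of 8:

```matlab
clear arr                   % the trick relies on arr not existing yet
arr(8) = 0;                 % assigning past the end creates a 1x8 array of zeros
isAllZero = all(arr == 0);  % every element is zero
sz = size(arr);             % [1 8]
```

If arr already held data from earlier code, the old contents would be kept, which is presumably why the function below only uses the trick on a fresh local variable.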
Wrapping up: Function Code
To wrap up everything, we would have a compact function code to achieve this run-length decoding like so -
function out = rld_cumsum_diff(vals,runlens)
clens = cumsum(runlens);
idx(clens(end))=0;
idx([1 clens(1:end-1)+1]) = diff([0 vals]);
out = cumsum(idx);
return;
Benchmarking
Benchmarking Code
Listed next is the benchmarking code to compare runtimes and speedups of the cumsum+diff approach from this post against the other, cumsum-only approach on MATLAB R2014b -
datasizes = [reshape(linspace(10,70,4).'*10.^(0:4),1,[]) 10^6 2*10^6]; %
fcns = {'rld_cumsum','rld_cumsum_diff'}; % approaches to be benchmarked
for k1 = 1:numel(datasizes)
n = datasizes(k1); % Create random inputs
vals = randi(200,1,n);
runs = [5000 randi(200,1,n-1)]; % 5000 acts as an aberration
for k2 = 1:numel(fcns) % Time approaches
tsec(k2,k1) = timeit(@() feval(fcns{k2}, vals,runs), 1);
end
end
figure, % Plot runtimes
loglog(datasizes,tsec(1,:),'-bo'), hold on
loglog(datasizes,tsec(2,:),'-k+')
set(gca,'xgrid','on'),set(gca,'ygrid','on'),
xlabel('Datasize ->'), ylabel('Runtimes (s)')
legend(upper(strrep(fcns,'_',' '))),title('Runtime Plot')
figure, % Plot speedups
semilogx(datasizes,tsec(1,:)./tsec(2,:),'-rx')
set(gca,'ygrid','on'), xlabel('Datasize ->')
legend('Speedup(x) with cumsum+diff over cumsum-only'),title('Speedup Plot')
Associated function code for rld_cumsum.m:
function out = rld_cumsum(vals,runlens)
index = zeros(1,sum(runlens));
index([1 cumsum(runlens(1:end-1))+1]) = 1;
out = vals(cumsum(index));
return;
Runtime and Speedup Plots
Conclusions
The proposed approach seems to be giving us a noticeable speedup over the cumsum-only approach, which is about 3x!
Why is this new cumsum+diff based approach better than the previous cumsum-only approach?
The reason lies in the final step of the cumsum-only approach, which has to map the "cumsumed" indices into vals, i.e. index sum(runlens) elements. In the cumsum+diff approach, we instead compute diff on vals, so MATLAB processes only n elements there (where n is the number of run lengths). Since sum(runlens) is typically many times larger than n, the new approach is noticeably faster.
Benchmarks
Updated for R2015b: repelem now fastest for all data sizes.
Tested functions:
MATLAB's built-in repelem function that was added in R2015a
gnovice's cumsum solution (rld_cumsum)
Divakar's cumsum+diff solution (rld_cumsum_diff)
knedlsepp's accumarray solution (knedlsepp5cumsumaccumarray) from this post
Naive loop-based implementation (naive_jit_test.m) to test the just-in-time compiler
Results of test_rld.m on R2015b:
Old timing plot using R2015a here.
Findings:
On R2015b:
repelem is always the fastest by roughly a factor of 2.
rld_cumsum_diff is consistently faster than rld_cumsum.
Earlier findings on R2015a, superseded by the R2015b results:
repelem is fastest for small data sizes (less than about 300-500 elements)
rld_cumsum_diff becomes significantly faster than repelem around 5 000 elements
repelem becomes slower than rld_cumsum somewhere between 30 000 and 300 000 elements
rld_cumsum has roughly the same performance as knedlsepp5cumsumaccumarray
naive_jit_test.m has nearly constant speed, on par with rld_cumsum and knedlsepp5cumsumaccumarray for smaller sizes and a little faster for large sizes
Old rate plot using R2015a here.
Conclusion
On R2015b, use repelem. On R2015a, use repelem below about 5 000 elements and the cumsum+diff solution above that.
There's no built-in function I know of, but here's one solution:
index = zeros(1,sum(b));
index([1 cumsum(b(1:end-1))+1]) = 1;
c = a(cumsum(index));
Explanation:
A vector of zeroes is first created of the same length as the output array (i.e. the sum of all the replications in b). Ones are then placed in the first element and each subsequent element representing where the start of a new sequence of values will be in the output. The cumulative sum of the vector index can then be used to index into a, replicating each value the desired number of times.
For the sake of clarity, this is what the various vectors look like for the values of a and b given in the question:
index = [1 0 1 0 1 1 0 0]
cumsum(index) = [1 1 2 2 3 4 4 4]
c = [1 1 3 3 2 5 5 5]
EDIT: For the sake of completeness, there is another alternative using ARRAYFUN, but this seems to take anywhere from 20-100 times longer to run than the above solution with vectors up to 10,000 elements long:
c = arrayfun(@(x,y) x.*ones(1,y),a,b,'UniformOutput',false);
c = [c{:}];
There is finally (as of R2015a) a built-in and documented function to do this, repelem. The following syntax, where the second argument is a vector, is relevant here:
W = repelem(V,N), with vector V and vector N, creates a vector W where element V(i) is repeated N(i) times.
Or put another way, "Each element of N specifies the number of times to repeat the corresponding element of V."
Example:
>> a=[1,3,2,5]
a =
1 3 2 5
>> b=[2,2,1,3]
b =
2 2 1 3
>> repelem(a,b)
ans =
1 1 3 3 2 5 5 5
The performance problems in MATLAB's built-in repelem have been fixed as of R2015b. I have run the test_rld.m program from chappjc's post in R2015b, and repelem is now faster than the other algorithms by about a factor of 2.
Sorry for the title. I could not think of something better.
I have the following problem.
I have two four-column matrices built up like this:
Property | X | Y | Z
The two matrices have different sizes, since matrix 1 has a large amount of additional rows compared to matrix 2.
What I want to do is the following:
I need to create a third matrix that only features those rows of the large matrix that are identical in columns X, Y and Z to rows in matrix 2 (the property column is always different).
I tried an if-statement but it did not really work out due to my programming syntax. Has somebody a tip?
Thank you!
I tried something like this (in this case A is the larger matrix, and I want its property column for the X, Y, Z positions that are identical to those in another matrix B). I am terrible with MATLAB syntax...
if (A(:,2) == B(:,2) and (A(:,3) == B(:,3) and (A(:,4) == B(:,4))
newArray(:,1) = A(:,1);
end
Use ismember with the 'rows' option to find the desired rows, and then use that as an index to build the result:
ind = ismember(A(:,2:4), B(:,2:4), 'rows');
C = A(ind,:);
I have assumed that a row of A is selected if its last three columns match those of any row of B.
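A small sketch with made-up data may help to see what ind and C look like. The matrices below are purely illustrative, with the property value in column 1 and X, Y, Z in columns 2 to 4:

```matlab
% Larger matrix A: rows 1 and 4 share the same X, Y, Z position.
A = [10 1 1 1;
     20 2 2 2;
     30 3 3 3;
     40 1 1 1];
% Smaller matrix B with different property values.
B = [99 1 1 1;
     98 3 3 3];

% ind(k) is true if the X, Y, Z part of row k of A occurs in any row of B.
ind = ismember(A(:,2:4), B(:,2:4), 'rows');
C = A(ind,:);     % rows 1, 3 and 4 of A

props = C(:,1);   % just the property column, if that is all that is needed
```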
In every other language, if I have a matrix and index it with a single index, the result is an array. I don't know why in Matlab a single index into a matrix gives a single element; that's stupid.
Anyway in C:
mat[4][4];
mat[0] is an array.
In Matlab:
mat=[1 2; 3 4];
How do I take the first row of the matrix? mat(1) is 1, not [1 2].
EDIT: There is another problem, I have a problem with this function:
function str= split(string, del)
index=1;
found=0;
str=['' ; ''];
for i=1:length(string)
if string(i)==del
found=1;
index=1;
elseif found==1
str(2,index)=string(i);
index=index+1;
else
str(1,index)=string(i);
index=index+1;
end
end
end
This sometimes returns a matrix and sometimes an array.
For example, if I use split('FF','.') I get 'FF' as the result, but what if I want to return a matrix? I can't even choose the dimensions of the matrix; in this context a weakly typed language is a big disadvantage.
You have to say which columns you want. : stands for all indices in a dimension, so to take first row
mat(1,:)
It is not stupid, but useful. If you address a matrix with only one index, it implicitly gets converted to a vector. This gives you the option to use linear indices (see sub2ind).
This will extract the second row
vector = mat(2,:)
And this will extract the second column
vector = mat(:,2)
To extract the last row, you can use
vector = mat(end,:)
Hope this helps you
From Matrix Indexing in MATLAB:
When you index into the matrix A using only one subscript, MATLAB
treats A as if its elements were strung out in a long column vector,
by going down the columns consecutively
I just hope it doesn't look stupid to you anymore (along with the right answers from angainor and Marwan)
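To tie the answers together, here is a short sketch on the 2x2 matrix from the question, showing row extraction next to single-subscript (linear) indexing. The variable names are illustrative only.

```matlab
mat = [1 2; 3 4];

firstRow  = mat(1,:);           % [1 2], the whole first row
secondLin = mat(2);             % 3, because elements are counted down the columns
asRow     = mat(:).';           % [1 3 2 4], the column-major element order
k = sub2ind(size(mat), 1, 2);   % linear index of row 1, column 2, which is 3
viaLinear = mat(k);             % 2, same element as mat(1,2)
```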