cellfun with conditionals in MATLAB

Is it possible to use cellfun with a conditional? For example, I have a 144x53 cell array in which the first four columns are strings and the rest are floats. However, some of the numeric cells are empty. I wonder whether it is possible to use cellfun(@(x)sqrt(x), cellarray) with my array. As it stands, this is not possible because of the strings and empty cells. Currently, this is the solution that I use:
for n = 1:size(results, 1)
    for k = 1:size(results, 2)
        % Skip the string cells; only numeric cells are transformed
        if ~ischar(results{n,k})
            results{n,k} = sqrt(results{n,k});
        end
    end
end
Is it possible to vectorize this instead?

You can create a logical array by checking whether each element is numeric, and then use it to apply cellfun only to the subset of the cell array that contains numeric data.
C = {1, 2, 'string', 4};
% Logical array that is TRUE when the element is numeric
is_number = cellfun(@isnumeric, C);
% Perform this operation and replace only the numeric values
C(is_number) = cellfun(@sqrt, C(is_number), 'UniformOutput', false);
% 1 1.4142 'string' 2
As pointed out by @excaza, you may also consider leaving it as a loop, since the loop is more performant on newer versions of MATLAB (R2015b and newer).
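The question also mentions empty cells. Here is a minimal sketch of the same masking idea that additionally skips them. The sample data and the name mask are made up for illustration and are not from the original post.

```matlab
% Hypothetical sample data: strings, numbers, and empty cells
C = {'id', 4, [], 9; 'name', 16, 25, []};
% TRUE only for cells that are numeric AND non-empty
mask = cellfun(@(x) isnumeric(x) && ~isempty(x), C);
% Apply sqrt to the selected cells; all other cells are left untouched
C(mask) = cellfun(@sqrt, C(mask), 'UniformOutput', false);
```

Since sqrt([]) simply returns [], the extra isempty test is optional for sqrt, but it matters for functions that fail on empty input.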

Related

Matlab - Using two index arrays to operate on a sub-matrix

I'm trying to figure out how to access Matlab sub-arrays (part of an array) with a generic set of subscript vectors.
In general, the problem is defined as:
Given two n-dim endpoints of an array index (both size nd), one having the initial set of indices (startInd) and the other having the last set of indices (endInd), how to access the sub-matrix which is included between the pair of index-sets?
For example, I want to replace this:
Mat=rand(10,10,10,10);
Mat(2:7, 1:6, 1:6, 2:8) = 1.0;
With an operation that can accept any set of two n-dim vectors specifying the indices for the last operation, which is "abstractly" expressed as:
Mat=rand(10,10,10,10);
startInd=[2 1 1 2];
endInd =[7 6 6 8];
IndexVar=???
Mat(IndexVar) = 1.0;
Thus I want to access the sub-matrix Mat(2:7, 1:6, 1:6, 2:8) using a variable or some other generic form that allows a generic n-dim. Preferably not a loop (as it is slow).
I have tried using something of this nature:
% Generate each index list separately:
nDims=length(startInd);
ind=cell(nDims,1);
for j=1:nDims
ind{j}=startInd(j):1:endInd(j);
end
% Access the matrix:
S.type = '()';
S.subs = ind;
Mat=subsasgn(Mat,S,1.0);
This seems to get the job done, but it is very slow and memory-expensive. Still, it might give someone an idea...
If you don't mind looping over dimensions (which should be much faster than looping over array entries):
indexVar = arrayfun(@(a,b) colon(a,b), startInd, endInd, 'UniformOutput', false);
Mat(indexVar{:}) = 1;
This uses arrayfun (essentially a loop) to create a cell array with the indexing vectors, which is then expanded into a comma-separated list.
Now that I see your code: this uses the same approach, only that the loop is replaced by arrayfun and the comma-separated list allows a more natural indexing syntax instead of subsasgn.
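As a quick sanity check, the comma-separated-list indexing can be compared against the hard-coded ranges from the question. This is only a sketch: it uses zeros instead of rand so the comparison is exact, and Ref is a name introduced here for illustration.

```matlab
Mat = zeros(10,10,10,10);
startInd = [2 1 1 2];
endInd   = [7 6 6 8];
% One index vector per dimension, collected in a cell array
indexVar = arrayfun(@(a,b) a:b, startInd, endInd, 'UniformOutput', false);
% indexVar{:} expands to indexVar{1}, indexVar{2}, ... as separate subscripts
Mat(indexVar{:}) = 1;
% Reference result using the hard-coded ranges
Ref = zeros(10,10,10,10);
Ref(2:7, 1:6, 1:6, 2:8) = 1;
sameResult = isequal(Mat, Ref);
```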

MATLAB: return both arguments from ISMEMBER when used inside SPLITAPPLY

How can I access both arguments of ismember when it is used inside splitapply?
splitapply only returns scalar values for each group, so in order to compute nonscalar values for each group (as returned by the first output argument of ismember), one has to enclose the function call (in this case ismember) in curly braces {} inside an anonymous function so that it returns a cell array.
But now, when I provide two output arguments to splitapply, I get an error:
Output argument "varargout{2}" (and maybe others) not assigned during call to
"@(x,y) {ismember(x,y)}"
ADD 1
I can create another function, say, ismember2cell which would apply ismember and turn outputs into cell arrays:
function [a, b] = ismember2cell(x,y)
[a,b] = ismember(x,y);
a = {a};
b = {b};
end
but maybe there is a solution which doesn't require this workaround.
One potentially faster option is to just do what splitapply is already doing under the hood by splitting your data into cell arrays (using functions like mat2cell or accumarray) and then using cellfun to apply your function across them. Using cellfun will allow you to easily capture multiple outputs (such as from ismember). For example:
% Sample data:
A = [1 2 3 4 5];
B = [1 2 1 5 5];
G = [1 1 1 2 2]; % Group index
% Group data into cell arrays:
cellA = accumarray(G(:), A(:), [], @(x) {x(:).'}); % See note below about (:).' syntax
cellB = accumarray(G(:), B(:), [], @(x) {x(:).'});
% Apply function:
[Lia, Locb] = cellfun(@ismember, cellA, cellB, 'UniformOutput', false);
NOTE: My sample data are row vectors, but I had to use the colon operator to reshape them into column vectors when passing them to accumarray (it wants columns). Once distributed into a cell array, each piece of the vector would still be a column vector, and I simply wanted to keep them as row vectors to match the original sample data. The syntax (:).' is a colon reshaping followed by a nonconjugate transpose, ensuring a row vector as a result no matter the shape of x. In this case I probably could have just used .', but I've gotten into the habit of never assuming what the shape of a variable is.
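To make the result concrete, here is the same sample data run end to end, with the per-group membership flags inspected afterwards. The data and variable names are those of the answer above; only the inspection lines are added.

```matlab
A = [1 2 3 4 5];
B = [1 2 1 5 5];
G = [1 1 1 2 2]; % Group index
cellA = accumarray(G(:), A(:), [], @(x) {x(:).'});
cellB = accumarray(G(:), B(:), [], @(x) {x(:).'});
[Lia, Locb] = cellfun(@ismember, cellA, cellB, 'UniformOutput', false);
% Group 1 compares [1 2 3] against [1 2 1]; group 2 compares [4 5] against [5 5]
disp(Lia{1}) % 1 1 0
disp(Lia{2}) % 0 1
% Locb{k} holds, for each element of group k in A, an index into that group's B
% (0 where there is no match)
```

Both Lia and Locb are 2-by-1 cell arrays here, one cell per group.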
I cannot find a general solution, but the accepted answer of this post helped me define a helper function for your problem:
function varargout = out2cell(varargin)
% Call the function handle in varargin{1} and wrap each output in a cell
[x{1:nargout}]=feval(varargin{:});
varargout = num2cell(x);
end
I think that you may succeed by calling it as follows, where G is the grouping vector:
[Lia, Locb] = splitapply(@(x,y) out2cell(@ismember, x, y), A, B, G);

MATLAB: Simple cellfun does not work on string vector

I constructed my vector as such:
v = ['asdf'; 'qwer'; 'zxcv'];
I just wanted to take the first 2 characters, and I wrote a simple cellfun like so:
A = cellfun(@(x) x(1:2), v, 'UniformOutput', false);
However, it says:
error: cellfun: C must be a cell array
How should I extract the first 2 characters of each string?
That's because v is not a cell array. Turn it into one:
v = {'asdf'; 'qwer'; 'zxcv'};
If you can't use cell arrays, do what Divakar suggested and turn v into one by using cellstr:
v = ['asdf'; 'qwer'; 'zxcv'];
v_cell = cellstr(v);
If you want to avoid the temporary variable, pass the result of cellstr(v) directly to cellfun:
A = cellfun(@(x) x(1:2), cellstr(v), 'UniformOutput', false);
If you want to un-cell the cell array, use cell2mat:
Aout = cell2mat(A);
I question the efficiency of the above though. If you just want to extract the first two characters of your cell array then turn it back into a character array, why don't you simply index the first two columns of all of the rows in the original character array? The use of cellfun adds unnecessary overhead when simple indexing would do the trick. Indexing is much more readable in this instance than using cellfun, which adds a layer of obfuscation.
Aout = v(:,1:2);
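A short sketch confirming that the two routes give the same character matrix for the question's data. The names viaCellfun and viaIndex are introduced here for illustration only.

```matlab
v = ['asdf'; 'qwer'; 'zxcv'];   % 3-by-4 character matrix
% Route 1: convert to a cell array, slice each string, convert back
viaCellfun = cell2mat(cellfun(@(x) x(1:2), cellstr(v), 'UniformOutput', false));
% Route 2: index the first two columns directly
viaIndex = v(:, 1:2);
sameResult = isequal(viaCellfun, viaIndex);
```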

Substrings from a Cell Array in Matlab

I'm new to MATLAB and I'm struggling to comprehend the subtleties between array-wise and element-wise operations. I'm working with a large dataset and I've found the simplest methods aren't always the fastest. I have a very large cell array of strings, like in this simplified example:
% A vertical array of same-length strings
CellArrayOfStrings = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'};
I'm trying to extract an array of substrings, for example:
'a1'
'b1'
'c1'
'd1'
I'm happy enough with an element-wise reference like this:
% Simple element-wise substring operation
MySubString = CellArrayOfStrings{2}(3:4); % Expected result is 'b1'
But I can't work out the notation to reference them all in one go, like this:
% Desired result is 'a1','b1','c1','d1'
MyArrayOfSubStrings = CellArrayOfStrings{:}(3:4); % Incorrect notation!
I know that Matlab is capable of performing very fast array-wise operations, such as strcat, so I was hoping for a technique that works at a similar speed:
% An array-wise operation which works quickly
tic
speedTest = strcat(CellArrayOfStrings,'hello');
toc % About 2 seconds on my machine with >500K array elements
All the for loops and functions which use behind-the-scenes iteration I have tried run too slowly with my dataset. Is there some array-wise notation that would do this? Would somebody be able to correct my understanding of element-wise and array-wise operations?! Many thanks!
I can't work out the notation to reference them all in one go, like this:
MyArrayOfSubStrings = CellArrayOfStrings{:}(3:4); % Incorrect notation!
This is because curly braces ({}) return a comma-separated list, which is equivalent to writing the contents of these cells in the following way:
c{1}, c{2}, and so on....
When the subscript index refers to only one element, MATLAB's syntax allows you to use parentheses (()) after the curly braces to extract a sub-array (a substring in your case). However, this syntax is prohibited when the comma-separated list contains multiple items.
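A small sketch of what the comma-separated list means in practice, using a two-element cell array made up for illustration:

```matlab
c = {'aaa123'; 'bbb123'};
% c{:} expands to c{1}, c{2}, so it can feed two output variables at once
[first, second] = c{:};
% Chained indexing is fine when the braces select a single cell
sub = c{2}(3:4);
% By contrast, c{:}(3:4) raises an error, because the braces
% produce more than one item
```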
So what are the alternatives?
Use a for loop:
MyArrayOfSubStrings = char(zeros(numel(CellArrayOfStrings), 2));
for k = 1:size(MyArrayOfSubStrings, 1)
MyArrayOfSubStrings(k, :) = CellArrayOfStrings{k}(3:4);
end
Use cellfun (a slight variant of Dang Khoa's answer):
MyArrayOfSubStrings = cellfun(@(x){x(3:4)}, CellArrayOfStrings);
MyArrayOfSubStrings = vertcat(MyArrayOfSubStrings{:});
If your original cell array contains strings of a fixed length, you can follow Dan's suggestion and convert the cell array into an array of strings (a matrix of characters), reshape it and extract the desired columns:
MyArrayOfSubStrings =vertcat(CellArrayOfStrings{:});
MyArrayOfSubStrings = MyArrayOfSubStrings(:, 3:4);
Employ more complicated methods, such as regular expressions:
MyArrayOfSubStrings = regexprep(CellArrayOfStrings, '^..(..).*', '$1');
MyArrayOfSubStrings = vertcat(MyArrayOfSubStrings{:});
There are plenty of solutions to pick from; just pick the one that suits you best :) I think that with MATLAB's JIT acceleration, a simple loop would be sufficient in most cases.
Also note that in all my suggestions the resulting cell array of substrings is converted into an array of strings (a character matrix). This is just for the sake of the example; obviously you can keep the substrings stored in a cell array, should you decide so.
cellfun operates on every element of a cell array, so you could do something like this:
>> CellArrayOfStrings = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'};
>> MyArrayofSubstrings = cellfun(@(str) str(3:4), CellArrayOfStrings, 'UniformOutput', false)
MyArrayofSubstrings =
'a1'
'b1'
'c1'
'd1'
If you want a character matrix instead of a cell array whose elements are the strings, use char on MyArrayofSubstrings. Note that if the strings differ in length, char pads the shorter ones with trailing spaces.
You can do this:
C = {'aaa123'; 'bbb123'; 'ccc123'; 'ddd123'}
t = reshape([C{:}], 6, [])'
t(:, 3:4)
But only if your strings are all of equal length I'm afraid.
You can use char to convert them to a character array, do the indexing, and convert the result back to a cell array:
A = char(CellArrayOfStrings);
B = cellstr(A(:,3:4));
Note that if strings are of different lengths, char pads them with spaces at the end to create the array. Therefore if you index for a column that is beyond the length of one of the short strings you may receive some space characters.
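The padding caveat can be seen with two strings of different lengths. This is a made-up example rather than the question's data:

```matlab
C = {'ab'; 'abcd'};
A = char(C);              % 2-by-4 char array; the first row is 'ab' plus two spaces
cols = A(:, 3:4);         % first row now consists only of padding spaces
B = cellstr(cols);        % cellstr strips trailing spaces again
firstIsEmpty = isempty(B{1});
```

Here B{1} ends up empty, because columns 3 and 4 lie beyond the end of 'ab', while B{2} is 'cd'.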

Is there a splat operator (or equivalent) in Matlab?

If I have an array (of unknown length until runtime), is there a way to call a function with each element of the array as a separate parameter?
Like so:
foo = @(varargin) sum(cell2mat(varargin));
bar = [3,4,5];
foo(*bar) == foo(3,4,5)
Context: I have a list of indices to an n-d array, Q. What I want is something like Q(a,b,:), but I only have [a,b]. Since I don't know n, I can't just hard-code the indexing.
There is no operator in MATLAB that will do that. However, if your indices (i.e. bar in your example) were stored in a cell array, then you could do this:
bar = {3,4,5}; % Cell array instead of standard array
foo(bar{:}); % Pass the contents of each cell as a separate argument
The {:} creates a comma-separated list from a cell array. That's probably the closest thing you can get to the "operator" form you have in your example, aside from overriding one of the existing operators (illustrated here and here) so that it generates a comma-separated list from a standard array, or creating your own class to store your indices and defining how the existing operators operate for it (neither option for the faint of heart!).
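If the indices arrive as an ordinary numeric vector, one way to get the cell array is num2cell. This sketch applies it both to the foo example and to the Q(a,b,:) case from the question; Q, idx, and the other names are made up for illustration.

```matlab
foo = @(varargin) sum(cell2mat(varargin));
bar = [3,4,5];
barCell = num2cell(bar);     % {3, 4, 5}
total = foo(barCell{:});     % same as foo(3,4,5)

% Indexing an N-D array with leading subscripts known only at runtime
Q = reshape(1:24, 2, 3, 4);
idx = [2 3];
idxCell = num2cell(idx);
slice = Q(idxCell{:}, :);    % same as Q(2,3,:)
```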
For your specific example of indexing an arbitrary N-D array, you could also compute a linear index from your subscripted indices using the sub2ind function (as detailed here and here), but you might end up doing more work than you would for my comma-separated list solution above. Another alternative is to compute the linear index yourself, which would sidestep converting to a cell array and use only matrix/vector operations. Here's an example:
% Precompute these somewhere:
scale = cumprod(size(Q)).';
scale = [1; scale(1:end-1)];
shift = [0 ones(1, ndims(Q)-1)];
% Then compute a linear index like this:
indices = [3 4 5];
linearIndex = (indices-shift)*scale;
Q(linearIndex) % Equivalent to Q(3,4,5)
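The hand-computed linear index can be checked against sub2ind. This sketch assumes a 4-by-5-by-6 array chosen only for illustration.

```matlab
Q = reshape(1:120, 4, 5, 6);
% Strides for each dimension: [1; 4; 20] for a 4-by-5-by-6 array
scale = cumprod(size(Q)).';
scale = [1; scale(1:end-1)];
% All subscripts except the first are zero-based in the linear-index formula
shift = [0 ones(1, ndims(Q)-1)];
indices = [3 4 5];
linearIndex = (indices - shift) * scale;   % 3*1 + 3*4 + 4*20
matchesSub2ind = (linearIndex == sub2ind(size(Q), 3, 4, 5));
```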