I have a cell with mathematical expressions that I would like to convert to a numeric array. It look as follows:
a = {};
a{1,1} = '0.55';
a{2,1} = '0.25 + 0.50';
Now I would like to receive the result (but preferably without a for loop):
b(1) = 0.55;
b(2) = 0.75;
How can I achieve this efficiently?
b = cellfun(#eval,a); will create an array b of the same size as cell array a, with each value the evaluation of the corresponding string in the cell array.
a = {};
a{1,1} = '0.55';
a{2,1} = '0.25 + 0.50';
a=repmat(a,1000,20); %Make it big for performance evaluation
tic
b1 = cellfun(#eval,a);
toc %0.662187 seconds
Another option is to make a big string expression so that eval is called only once rather than several times due to cellfun internal loop. This is less safe as abnormal values in the cell array a will likely cause the code to crash, while it may simply produce NaN in the code above.
tic
% add a comma separator after each value
strCell = cellfun(#(x) [x ','],transpose(a),'uniformoutput',false);
% add a semicolon separator at the end of each row
strCell(end,:) = cellfun(#(x) [x(1:end-1) ';'], strCell(end,:), 'uniformoutput',false);
% remove the last separator
strCell{end}=strCell{end}(1,end-1);
% evaluate the line
b2=eval(['[' strCell{:} ']']);
toc %0.313738 seconds but sometimes more than 1 seconds
Related
I Have a 8x18 structure with each cel containing a column vector of occurrences of a single event. I want to obtain data from some of these fields concatenated in a single array, without having to loop through it. I can't seem to find a way to vertically concatenate the fields I am interested in in a single array.
As an example I create the following structure with between 1 and 5 occurrences per cell:
s(62).vector(8,18).heading.occurrences=[1;2;3];
for i=1:62
for j=1:8
for k=1:18
y=ceil(rand(1)*5);
s(i).vector(j,k).heading.occurrences=rand(y,1);
end
end
end
Now if want to obtain all occurrences in several cells while keeping i constant at for instant i=1 the following works:
ss=s(1).vector([1 26 45]);
h=[ss.heading];
cell2mat({h.occurrences}')
Now I would want to do the same for s, for instance s([1 2 3]).vector([1 26 45]), how would that work? I have tried xx=s([1 2 3]), yy=xx.vector([1 26 45]) but this however yields the error:
Expected one output from a curly brace or dot indexing expression, but there were 3 results.
Is this also possible with a vector operation?
Here's a vectorized solution that accommodates using index vectors for s and the field vector:
sIndex = [1 2 3]; % Indices for s
vIndex = [1 26 45]; % Indices for 'vector' field
v = reshape(cat(3, s(sIndex).vector), 144, []);
h = [v(vIndex, :).heading];
out = vertcat(h.occurrences);
It uses cat to concatenate all the vector fields into an 8-by-18-by-numel(sIndex) matrix, reshapes that into a 144-by-numel(sIndex) matrix, then indexes the rows specified by vIndex and collects their heading and occurrences fields, using vertcat instead of cell2mat.
It's difficult to vectorize the entire operation, but this should work.
% get vector field and store in cell array
s_new = { s(1:3).vector };
% now extract heading field, this is a cell-of-cells
s_new_heading = cellfun(#(x) { x.heading }', s_new, 'UniformOutput', false);
occurences = {};
for iCell = 1:length(s_new_heading)
% use current cell
cellHere = s_new_heading{iCell};
% retain indices of interest, these could be different for each cell
cellHere = cellHere([ 1 26 45 ]);
% extract occurrences
h = cellfun(#(x) x.occurrences, cellHere, 'UniformOutput', false);
h_mat = cell2mat(h);
% save them in cell array
occurences = cat(1, occurences, h_mat);
end
I have a structure 1x300 called struct with 3 fields but I'm using only the third field called way. This field is, for each 300 lines, a vertor of index.
Here an exemple with 3 lines to explain my problem : I woud like to search if the last index of the first line is present in an other vector (line) of the field way.
way
[491751 491750 491749 492772 493795 494819 495843 496867]
[491753 491754 491755 491756]
[492776 493800 494823 495847 496867]
I tried with intersect function :
Inter=intersect(struct(1).way(end), struct.way);
but Matlab returns me an error :
Error using intersect (line 80)
Too many input arguments.
Error in file2 (line 9)
Inter=intersect(struct(1).way(end), struct.way);
I don't understand why I have this error. Any explanations and/or other(s) solution(s)?
Let the data be defined as
st(1).way = [491751 491750 491749 492772 493795 494819 495843 496867];
st(2).way = [491753 491754 491755 491756];
st(3).way = [492776 493800 494823 495847 496867]; % define the data
sought = st(1).way(end);
If you want to know which vectors contain the desired value: pack all vectors into a cell array and pass that to cellfun with an anonymous function as follows:
ind = cellfun(#(x) ismember(sought, x), {st.way});
This gives:
ind =
1×3 logical array
1 0 1
If you want to know for each vector the indices of the matching: modify the anonymous function to output a cell with the indices:
ind = cellfun(#(x) {find(x==sought)}, {st.way});
or equivalently
ind = cellfun(#(x) find(x==sought), {st.way}, 'UniformOutput', false);
The result is:
ind =
1×3 cell array
[8] [1×0 double] [5]
Or, to exclude the reference vector:
n = 1; % index of vector whose final element is sought
ind = cellfun(#(x) {find(x==st(n).way(end))}, {st([1:n-1 n+1:end]).way});
You propbably want to use ismember.
Consider what you are passing to the intersect/ismember functions too, struct.way isn't a valid argument, you may need to loop to iterate over each line of your struct (in this case it would be easier to have a cell array, or matrix with equal length rows).
output = zeros(300);
for ii = 1:300
for jj = 1:300
if ii ~= jj && ismember(struct(ii).way(end), struct(jj).way)
output(ii,jj) = 1;
end
end
end
Now you have a matrix output where the elements which are 1 identify a match between the last element in way in the struct row ii and the vector struct(jj).way, where ii are the matrix row numbers and jj the column numbers.
I have one big cell with N by 1 dimension. Each row is either a string or a double. A string is a variable name and the sequential doubles are its values until the next string (another variable name). For example:
data = {
var_name1;
val1;
val2;
val3;
val4;
val5;
var_name2;
val1;
val2;
var_name3;
val1;
val2;
val3;
val4;
val5;
val6;
val7}
and so on. I want to separate the data cell into three cells; {var_name and it's 5 values}, {var_name and it's 2 values}, {var_name and it's 7 values}. I try not to loop as much as possible and have found that vectorization along with cellfun works really well. Is it possible? The data cell has close to million rows.
I believe the following should do what you're after. The main pieces are to use cumsum to work out which name each row corresponds to, and then accumarray to build up lists per name.
% Make some data
data = {'a'; 1; 2; 3;
'b'; 4; 5;
'c'; 6; 7; 8; 9;
'd';
'e'; 10; 11; 12};
% Which elements are the names?
isName = cellfun(#ischar, data);
% Use CUMSUM to work out for each row, which name it corresponds to
whichName = cumsum(isName);
% Pick out only the values from 'data', and filter 'whichName'
% for just the values
justVals = data(~isName);
whichName = whichName(~isName);
% Use ACCUMARRAY to build up lists per name. Note that the function
% used by ACCUMARRAY must return something scalar from a column of
% values, so we return a scalar cell containing a row-vector
% of those values
listPerName = accumarray(whichName, cell2mat(justVals), [], #(x) {x.'});
% All that remains is to prepend the name to each cell. This ends
% up with each row of output being a cell like {'a', [1 2 3]}.
% It's simple to make the output be {'a', 1, 2, 3} by adding
% a call to NUM2CELL on 'v' in the anonymous function.
nameAndVals = cellfun(#(n, v) [{n}, v], data(isName), listPerName, ...
'UniformOutput', false);
cellfun is for applying a function to each element of a cell.
When you pass multiple arguments to cellfun like that, it takes the ith argument of data, indx_first, and indx_last, and uses each of them in the anonymous function. Substituting those variables in, your function evaluates to x(y : z), for each element x in data. In other words, you're doing data{i}(y : z), i.e., indexing the actual elements of the cell array, rather than indexing the cell array itself. I don't think that's what you want. Really you want data{y : z}, for each (y, z) pair given by corresponding elements in indx_first and indx_last, right?
If that's indeed the case, I don't see a vectorized way to solve your problem, because each of the "variables" has different size. But you do know how many variables you have, which is the size of indx_first. So I'd pre-allocate and then loop, like so:
>> vars = cell(length(indx_first), 2);
>> for i = 1:length(vars)
vars{i, 1} = data{indx_first(i) - 1}; % store variable name in first column
vars{i, 2} = [data{indx_first(i) : indx_last(i)}]; % store data in last column
end
At the end of this, you'll have a cell array with 2 columns. The first column in each row is the name of the variable. The second is the actual data. I.e.
{'var_name1', [val1 val2 val3 val4 val5];
'var_name2', [val1 val2];
.
.
.
I have a file in the following format in matlab:
user_id_a: (item_1,rating),(item_2,rating),...(item_n,rating)
user_id_b: (item_25,rating),(item_50,rating),...(item_x,rating)
....
....
so each line has values separated by a colon where the value to the left of the colon is a number representing user_id and the values to the right are tuples of item_ids (also numbers) and rating (numbers not floats).
I would like to read this data into a matlab cell array or better yet ultimately convert it into a sparse matrix wherein the user_id represents the row index, and the item_id represents the column index and store the corresponding rating in that array index. (This would work as I know a-priori the number of users and items in my universe so ids cannot be greater than that ).
Any help would be appreciated.
I have thus far tried the textscan function as follows:
c = textscan(f,'%d %s','delimiter',':') %this creates two cells one with all the user_ids
%and another with all the remaining string values.
Now if I try to do something like str2mat(c{2}), it works but it stores the '(' and ')' characters also in the matrix. I would like to store a sparse matrix in the fashion that I described above.
I am fairly new to matlab and would appreciate any help regarding this matter.
f = fopen('data.txt','rt'); %// data file. Open as text ('t')
str = textscan(f,'%s'); %// gives a cell which contains a cell array of strings
str = str{1}; %// cell array of strings
r = str(1:2:end);
r = cellfun(#(s) str2num(s(1:end-1)), r); %// rows; numeric vector
pairs = str(2:2:end);
pairs = regexprep(pairs,'[(,)]',' ');
pairs = cellfun(#(s) str2num(s(1:end-1)), pairs, 'uni', 0);
%// pairs; cell array of numeric vectors
cols = cellfun(#(x) x(1:2:end), pairs, 'uni', 0);
%// columns; cell array of numeric vectors
vals = cellfun(#(x) x(2:2:end), pairs, 'uni', 0);
%// values; cell array of numeric vectors
rows = arrayfun(#(n) repmat(r(n),1,numel(cols{n})), 1:numel(r), 'uni', 0);
%// rows repeated to match cols; cell array of numeric vectors
matrix = sparse([rows{:}], [cols{:}], [vals{:}]);
%// concat rows, cols and vals into vectors and use as inputs to sparse
For the example file
1: (1,3),(2,4),(3,5)
10: (1,1),(2,2)
this gives the following sparse matrix:
matrix =
(1,1) 3
(10,1) 1
(1,2) 4
(10,2) 2
(1,3) 5
I think newer versions of Matlab have a stringsplit function that makes this approach overkill, but the following works, if not quickly. It splits the file into userid's and "other stuff" as you show, initializes a large empty matrix, and then iterates through the other stuff, breaking it apart and placing in the correct place in the matrix.
(I Didn't see the previous answer when I opened this for some reason - it is more sophisticated than this one, though this may be a little easier to follow at the expense of slowness). I throw in the \s* into the regex in case the spacing is inconsistent, but otherwise don't perform much in the way of data-sanity-checking. Output is the full array, that you can then turn into a sparse array if desired.
% matlab_test.txt:
% 101: (1,42),(2,65),(5,0)
% 102: (25,78),(50,12),(6,143),(2,123)
% 103: (23,6),(56,3)
clear all;
fclose('all');
% your path will vary, of course
file = '<path>/matlab_test.txt';
f = fopen(file);
c = textscan(f,'%d %s','delimiter',':');
celldisp(c)
uids = c{1}
tuples = c{2}
% These are stated as known
num_users = 3;
num_items = 40;
desired_array = zeros(num_users, num_items);
expression = '\((\d+)\s*,\s*(\d+)\)'
% Assuming length(tuples) == num_users for simplicity
for k = 1:num_users
uid = uids(k)
tokens = regexp(tuples{k}, expression, 'tokens');
for l = 1:length(tokens)
item_id = str2num(tokens{l}{1})
rating = str2num(tokens{l}{2})
desired_array(uid, item_id) = rating;
end
end
I want to count all elements in a cell array, including those in "nested" cells.
For a cell array
>> C = {{{1,2},{3,4,5}},{{{6},{7},{8}},{9}},10}
C = {1x2 cell} {1x2 cell} [10]
The answer should be 10.
One way is to use [C{:}] repeatedly until there are no cells left and then use numel but there must be a better way?
Since you are only interested in the number of elements, here is a simplified version of flatten.m that #Ansari linked to:
function n = my_numel(A)
n = 0;
for i=1:numel(A)
if iscell(A{i})
n = n + my_numel(A{i});
else
n = n + numel(A{i});
end
end
end
The result:
>> C = {{{1,2},{3,4,5}},{{{6},{7},{8}},{9}},10};
>> my_numel(C)
ans =
10
EDIT:
If you are feeling lazy, we can let CELLPLOT do the counting:
hFig = figure('Visible','off');
num = numel( findobj(cellplot(C),'type','text') );
close(hFig)
Basically we create an invisible figure, plot the cell array, count how many "text" objects were created, then delete the invisible figure.
This is how the plot looks like underneath:
Put this in a function (say flatten.m) (code from MATLAB Central):
function C = flatten(A)
C = {};
for i=1:numel(A)
if(~iscell(A{i}))
C = [C,A{i}];
else
Ctemp = flatten(A{i});
C = [C,Ctemp{:}];
end
end
Then do numel(flatten(C)) to find the total number of elements.
If you don't like making a separate function, you can employ this clever (but nasty) piece of code for defining a flatten function using anonymous functions (code from here):
flatten = #(nested) feval(Y(#(flat) #(v) foldr(#(x, y) ifthenelse(iscell(x) | iscell(y), #() [flat(x), flat(y)], #() {x, y}), [], v)), nested);
Either way, you need to recurse to flatten the cell array then count.