Loop to create cell array - matlab

I have a structure named data. The structure has 250 elements and one field called codes (whose dimension varies).
As an example: data(1).codes is a 300 x 1 cell of strings and data(2).codes is a 100 x 1 cell of strings.
What I am trying to do is to create a big cell with three columns: id count codes where id indexes the element number (1 to 250), count indexes the row of the string and codes are just the codes.
An example to make it clear:
for k = 1:size(data,2)
    id = repmat(k,size(data(k).codes,1),1);
    count = linspace(1, size(data(k).codes,1), size(data(k).codes,1))';
    codes = data(k).codes;
end
The loop above creates the columns I want. Now I just need to append them one below the other and then save to Excel. If these were only numbers, I would know how to concatenate/append matrices. But with cells I am unsure how to do it.
Here is what I have tried:
output = {};
for k = 1:size(data,2)
    id = repmat(k,size(data(k).codes,1),1);
    count = linspace(1, size(data(k).codes,1), size(data(k).codes,1))';
    codes = data(k).codes;
    output{1,1} = {output{1,1}; id};
    output{1,2} = {output{1,2}; count};
    output{1,3} = {output{1,3}; codes};
end

Build up your output into a new cell array, allowing for pre-allocation, then concatenate all of your results.
% Initialise
output = cell(size(data,2), 1);
% Create output for each element of data
for k = 1:size(data,2)
    id = repmat(k,size(data(k).codes,1),1);
    count = linspace(1, size(data(k).codes,1), size(data(k).codes,1))';
    codes = data(k).codes;
    % add to output
    output{k} = [id, count, codes];
end
% Vertically concatenate all cell elements
output = vertcat(output{:});
Note: this assumes codes is numeric, in which case the output is a numeric matrix. If codes is a cell array (as in the question), convert the numeric columns id and count to cells so that all three can be concatenated, like so:
id = repmat({k}, size(data(k).codes,1), 1);
count = num2cell(linspace( ... )');
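Putting the cell version together, a minimal sketch could look like the following. The two-element `data` struct and its code strings are made-up stand-ins for the asker's 250-element structure, and `(1:n)'` is used in place of `linspace` for the row counter.

```matlab
% Hypothetical stand-in for the asker's struct (2 elements instead of 250)
data(1).codes = {'A1'; 'B2'; 'C3'};
data(2).codes = {'D4'; 'E5'};

output = cell(numel(data), 1);          % pre-allocate one slot per element
for k = 1:numel(data)
    n     = numel(data(k).codes);
    id    = repmat({k}, n, 1);          % n-by-1 cell holding the element index
    count = num2cell((1:n)');           % n-by-1 cell holding the row counter
    codes = data(k).codes(:);           % force a column in case codes is a row
    output{k} = [id, count, codes];     % n-by-3 cell block
end
output = vertcat(output{:});            % stack all blocks into one cell array
```

The resulting cell array can then be handed to `xlswrite` or, in R2019a and later, `writecell` to produce the Excel file.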


accessing nth column of a cell array in matlab

I have a cell array with, for example, 3 cells containing matrices of size (3,8), (3,2) and (3,30). I want to access the nth column of the whole data without converting my cell to a matrix. For example, if I search for the 8th column, it must be the second column of the 3rd cell. One way is to convert it into a matrix, but my cell is too long and I run out of memory when I try to convert the whole cell to a matrix. I then tried the code below, but it doesn't work correctly. I want to know what I'm doing wrong.
Any help is appreciated.
function [col,i,idx] = find_cellCol(cel, idx)
lgh = length(cel);
i = 1;
me = zeros(2,length(cel));
while( i <= lgh && length(cel{1,i})<=idx)
    idx = idx - length(cel{1,i});
    i = i+1;
end%end while
if idx == 0
    col = cel{1,i-1}(:,end);
else
    col = cel{1,i}(:,idx);
end
end
Get only the number of lines of each matrix in each cell, then take the cumulative sum of those numbers and check in which cell you reach the 8th line.
%dummy data
x{1} = rand(3,8);
x{2} = rand(3,2);
x{3} = rand(3,20);
val = 8;
csize = cellfun(@(x) size(x,1),x); %get the number of lines for each cell
csum = cumsum(csize); % [3,6,9]
ind = find(csum>=val,1); % on which cell do we reach the # line
x{ind}((val-csum(ind))+csize(ind),:) %access the right line
fprintf('Accessing the line %d of the cell %d',(val-csum(ind))+csize(ind),ind)
Which will return:
Accessing the line 2 of the cell 3
The given example misled me, since I was sure that you were trying to access a line (first dimension) and not a column (second dimension).
But if you want to access a column you can simply adjust the code above:
val = 8;
csize = cellfun(@(x) size(x,2),x); %get the size of the second dimension now.
csum = cumsum(csize);
ind = find(csum>=val,1);
x{ind}(:,(val-csum(ind))+csize(ind)) %access the right column
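As a quick sanity check of the column lookup, the sketch below runs the same cumulative-sum logic over every global column index of a small made-up cell array and compares each result against the fully concatenated matrix. The concatenation (`whole`) exists only for verification on toy data; it is exactly what the asker cannot afford on the real data.

```matlab
% Small made-up cell array with block widths 3, 2 and 4
x = {magic(3), [10 11; 12 13; 14 15], reshape(1:12, 3, 4)};

csize = cellfun(@(m) size(m, 2), x);    % number of columns per cell
csum  = cumsum(csize);                  % running total: [3 5 9]
whole = [x{:}];                         % concatenation, for checking only

for val = 1:csum(end)
    ind      = find(csum >= val, 1);            % cell holding global column val
    localCol = val - (csum(ind) - csize(ind));  % column index inside that cell
    col      = x{ind}(:, localCol);
    assert(isequal(col, whole(:, val)));        % must match the concatenation
end
```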

How to make calculations on certain cells (within a table) that meet specific criteria?

I have the following code:
L_sum = zeros(height(ABC),1);
for i = 1:height(ABC)
    L_sum(i) = sum(ABC{i, ABC.L(i,4:281)});
end
Here is my table:
Problem: My sum function sums the entire row values (col. 4-281) per date whereas I only want those cells to be added whose headers are in the cell array of ABC.L, for any given date.
X = ABC.L{1, 1}; gives (excerpt):
Red arrow: what sum function is referencing (L of same date).
Green arrow: what I am trying to reference now (L of previous date).
Thanks for your help
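In general, in MATLAB you don't need for loops to do simple operations like selective sums. For example:
Data = [1 2 3;
4 5 6;
7 8 9;
7 7 7];
% example selections (illustrative values, not from the original post)
RowsToSum = [1 3];
ColToSum = [2 3];
% sum along the second dimension of a 2-D array
Result=sum(Data(RowsToSum,ColToSum), 2)
% table mode
DataTable = array2table(Data);
Result2=sum(DataTable{RowsToSum,ColToSum}, 2)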
To do that you need to first extract the columns you want to sum, and then sum them:
% some arbitrary data:
ABC = table;
ABC.L{1,1} = {'aa','cc'};
ABC.L{2,1} = {'aa','b'};
ABC.L{3,1} = {'aa','d'};
ABC.L{4,1} = {'b','d'};
ABC{1:4,2:5} = magic(4);
ABC.Properties.VariableNames(2:5) = {'aa','b','cc','d'}
% summing the correct columns:
L_sum = zeros(height(ABC),1);
col_names = ABC.Properties.VariableNames; % just to make things shorter
for k = 1:height(ABC)
    % the following 'cellfun' compares each column to the values in ABC.L{k},
    % and returns a cell array of the result for each of them, then
    % 'cell2mat' converts it to logical array, and 'any' combines the
    % results for all elements in ABC.L{k} to one logical vector:
    col_to_sum = any(cell2mat(...
        cellfun(@(x) strcmp(col_names,x),ABC.L{k},...
        'UniformOutput', false).'),1);
    % then a logical indexing is used to define the columns for summation:
    L_sum(k) = sum(ABC{k,col_to_sum});
end
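The column selection can probably be written more compactly with `ismember`, which returns the same kind of logical mask over the variable names. The sketch below is one possible variant, not the answerer's code; it rebuilds the toy table with a direct `table(...)` call, which is an assumption about how the data could be laid out.

```matlab
% Toy data mirroring the example above: L lists the headers to sum per row
L    = {{'aa','cc'}; {'aa','b'}; {'aa','d'}; {'b','d'}};
vals = magic(4);
ABC  = table(L, vals(:,1), vals(:,2), vals(:,3), vals(:,4), ...
    'VariableNames', {'L','aa','b','cc','d'});

col_names = ABC.Properties.VariableNames;
L_sum = zeros(height(ABC), 1);
for k = 1:height(ABC)
    % logical mask: true for every variable whose name appears in ABC.L{k}
    col_to_sum = ismember(col_names, ABC.L{k});
    L_sum(k)   = sum(ABC{k, col_to_sum});
end
% L_sum is [19; 16; 21; 15] for this toy table
```

To sum against the list from the previous date instead, the mask would presumably be built from `ABC.L{k-1}` with the loop starting at `k = 2`.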

Join rows in Matrix

I have a very big matrix that looks like this:
There can be at most 2 rows with the same id.
I want to reshape the matrix into the following, preferably removing the id's which only appear once:
Here is an alternative using accumarray to identify values sharing the same index. The code is commented and you can have a look at every intermediary output to see what exactly is going on.
%// Create matrix with your data
id = [1;2;1;3;3];
value = [434 ;454353;4353;3432;4323];
M = [id value]
%// Find unique indices to build final output.
UniqueIdx = unique(M(:,1),'rows')
%// Find values corresponding to every index. Use cell array to account for different sized outputs.
NewM = accumarray(id,value,[],@(x) {x})
%// Get number of elements
NumElements = cellfun(@(x) size(x,1),NewM)
%// Discard rows having orphan index.
NewM(NumElements==1) = [];
UniqueIdx(NumElements==1) = [];
%// Build Output.
Results = [UniqueIdx, [NewM{:}].']
And the output. I can't use the function table to build a nice output, but if you can, the result looks much nicer :)
Results =
1 434 4353
3 3432 4323
This code does the interesting job of sorting the matrix according to the id and removing the orphans.
x = sortrows(x,1); % sort x according to index
idx = x(:,1);
idxs = 1:max(idx);
rm = idxs(hist(idx, idxs) == 1); %find orphans
x( ismember(x(:,1),rm), : ) = [] %remove orphans
This last part then just shapes the array the way you want it
y = reshape(x', 4, []);
y( 3, : ) = [];
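A third possible route, sketched here under the stated assumption that each id occurs at most twice, combines both ideas: sort by id, count occurrences per id, keep only the pairs, and read the two values of each pair from alternating rows. Unlike `accumarray(id, ...)`, it does not require the ids to be small positive integers, because `unique` maps them to group numbers first.

```matlab
% Same toy data as above: column 1 = id, column 2 = value
M = [1 434; 2 454353; 1 4353; 3 3432; 3 4323];

M = sortrows(M, 1);                  % bring equal ids next to each other
[~, ~, grp] = unique(M(:, 1));       % group number for every row
counts = accumarray(grp, 1);         % how many rows each id has
P = M(counts(grp) == 2, :);          % keep only ids that appear twice

% Rows now come in pairs: odd rows hold the first value, even rows the second
Results = [P(1:2:end, 1), P(1:2:end, 2), P(2:2:end, 2)];
```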

matlab parse file into cell array

I have a file in the following format in matlab:
user_id_a: (item_1,rating),(item_2,rating),...(item_n,rating)
user_id_b: (item_25,rating),(item_50,rating),...(item_x,rating)
so each line has values separated by a colon where the value to the left of the colon is a number representing user_id and the values to the right are tuples of item_ids (also numbers) and rating (numbers not floats).
I would like to read this data into a matlab cell array or better yet ultimately convert it into a sparse matrix wherein the user_id represents the row index, and the item_id represents the column index and store the corresponding rating in that array index. (This would work as I know a-priori the number of users and items in my universe so ids cannot be greater than that ).
Any help would be appreciated.
I have thus far tried the textscan function as follows:
c = textscan(f,'%d %s','delimiter',':') %this creates two cells one with all the user_ids
%and another with all the remaining string values.
Now if I try to do something like str2mat(c{2}), it works but it stores the '(' and ')' characters also in the matrix. I would like to store a sparse matrix in the fashion that I described above.
I am fairly new to matlab and would appreciate any help regarding this matter.
f = fopen('data.txt','rt'); %// data file. Open as text ('t')
str = textscan(f,'%s'); %// gives a cell which contains a cell array of strings
str = str{1}; %// cell array of strings
r = str(1:2:end);
r = cellfun(@(s) str2num(s(1:end-1)), r); %// rows; numeric vector
pairs = str(2:2:end);
pairs = regexprep(pairs,'[(,)]',' ');
pairs = cellfun(@(s) str2num(s(1:end-1)), pairs, 'uni', 0);
%// pairs; cell array of numeric vectors
cols = cellfun(@(x) x(1:2:end), pairs, 'uni', 0);
%// columns; cell array of numeric vectors
vals = cellfun(@(x) x(2:2:end), pairs, 'uni', 0);
%// values; cell array of numeric vectors
rows = arrayfun(@(n) repmat(r(n),1,numel(cols{n})), 1:numel(r), 'uni', 0);
%// rows repeated to match cols; cell array of numeric vectors
matrix = sparse([rows{:}], [cols{:}], [vals{:}]);
%// concat rows, cols and vals into vectors and use as inputs to sparse
For the example file
1: (1,3),(2,4),(3,5)
10: (1,1),(2,2)
this gives the following sparse matrix:
matrix =
(1,1) 3
(10,1) 1
(1,2) 4
(10,2) 2
(1,3) 5
I think newer versions of MATLAB have a strsplit function that makes this approach overkill, but the following works, if not quickly. It splits the file into user ids and "other stuff" as you show, initializes a large empty matrix, and then iterates through the other stuff, breaking it apart and placing each rating in the correct place in the matrix.
(I didn't see the previous answer when I opened this for some reason. It is more sophisticated than this one, though this one may be a little easier to follow at the expense of speed.) I throw the \s* into the regex in case the spacing is inconsistent, but otherwise don't perform much in the way of data sanity checking. The output is the full array, which you can then turn into a sparse array if desired.
% matlab_test.txt:
% 101: (1,42),(2,65),(5,0)
% 102: (25,78),(50,12),(6,143),(2,123)
% 103: (23,6),(56,3)
clear all;
% your path will vary, of course
file = '<path>/matlab_test.txt';
f = fopen(file);
c = textscan(f,'%d %s','delimiter',':');
uids = c{1}
tuples = c{2}
% These are stated as known
num_users = 3;
num_items = 40;
desired_array = zeros(num_users, num_items);
expression = '\((\d+)\s*,\s*(\d+)\)'
% Assuming length(tuples) == num_users for simplicity
for k = 1:num_users
    uid = uids(k)
    tokens = regexp(tuples{k}, expression, 'tokens');
    for l = 1:length(tokens)
        item_id = str2num(tokens{l}{1})
        rating = str2num(tokens{l}{2})
        desired_array(uid, item_id) = rating;
    end
end
fclose(f);
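For comparison, here is a rough sketch of the `strsplit` route that builds the sparse matrix directly. To stay self-contained it parses two in-memory lines (the same example file content used earlier) instead of reading a file; the variable names are illustrative only.

```matlab
% In-memory stand-in for the file contents (one line per user)
txt = {'1: (1,3),(2,4),(3,5)'; '10: (1,1),(2,2)'};

rows = []; cols = []; vals = [];
for k = 1:numel(txt)
    parts = strsplit(txt{k}, ':');             % {user id, tuple string}
    uid   = str2double(parts{1});
    tok   = regexp(parts{2}, '\((\d+)\s*,\s*(\d+)\)', 'tokens');
    nums  = str2double(vertcat(tok{:}));       % n-by-2: [item_id, rating]
    rows  = [rows; repmat(uid, size(nums, 1), 1)]; %#ok<AGROW>
    cols  = [cols; nums(:, 1)];                    %#ok<AGROW>
    vals  = [vals; nums(:, 2)];                    %#ok<AGROW>
end
S = sparse(rows, cols, vals);                  % users as rows, items as columns
```

If the number of users and items is known in advance, passing them as the fourth and fifth arguments of `sparse` fixes the matrix size regardless of which ids actually occur.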

3D cell arrays in matlab

I am currently working using matlab, I have uploaded a csv file into a cell array that I have named B. What I now wish to do is to input the information of B into a 3-D cell array, the 3rd dimension of the array being the first column of B which are strings ranging from "chr1" to "chr24". The full length of B is m, and the maximum length of any "chr" is maxlength. I doubt that this is the best way of going about it but here is my code:
for j = 1:m
    Ind = findstr(B{1}{j}, 'chr');
    Num = B{1}{j}(Ind+3:end-1);
    cnum = str2num(Num);
    for i = 1:24
        if cnum == i
            for k = 2:9
                for l = 1:maxlength
                    C{l}{k}{i} = B{k}{j};
                end
            end
        end
    end
end
The 3-D array that comes out of this does not match the corresponding values in the initial array. I also want to know if this is the right way to create a 3-D array, I can't seem to find anything on the matlab website about them.
There are a few possible issues with your approach: First of all, Matlab indexing is different from c-style indexing into tables. myCell{i}{j} is the j-th element of the cell array that is contained in the i-th element of the cell array myCell. If you want to index into a 2-d cell array, you would get the contents of the element in row i, column j as myCell{i,j}.
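The difference between the two indexing styles can be seen in a small sketch. The contents are made up, and the last part shows one way a genuinely 3-D cell array might be created and addressed.

```matlab
% Nested cells: a 1-by-2 cell whose elements are themselves 1-by-2 cells
nested = {{'a', 'b'}, {'c', 'd'}};
fromNested = nested{2}{1};     % element 1 of the cell stored in element 2 -> 'c'

% True 2-D cell array: 2 rows, 2 columns
flat = {'a', 'b'; 'c', 'd'};
fromFlat = flat{2, 1};         % row 2, column 1 -> 'c'

% True 3-D cell array: pre-allocate, then index with three subscripts
C3 = cell(2, 3, 4);
C3{1, 2, 3} = 'chr1';
```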
If the columns 2 through 9 of your .csv file contain all numeric data, it may be a lot more convenient to use either a 1D cell array with an entry for every chromosome, or to use a 2D or 3D numeric array if you get, for each chromosome, a single row, or a table, respectively.
Here's one way to do it
%# convert chromosomes to numbers
chromosomes = B{1};
chromosomes = strrep(chromosomes,'X','25');
chromosomes = strrep(chromosomes,'Y','26');
tmp = regexp(chromosomes,'chr(\d+)','tokens','once');
cnum = cellfun(@(x)str2double(x{1}),tmp);
%# catenate the rest of B into a 2D numeric array
allNumbers = cell2mat(cat(2,B{2:end}));
%# now we can make a table with [chromosomeNumber,allOtherNumbers]
finalTable = [cnum,allNumbers]
%# alternatively, if there are multiple entries for each chromosome, we can
%# group the data in a cell array, so that the i-th entry corresponds to chr.i
%# for readability, use a loop
outputCell = cell(26,1); %# assume 26 chromosomes
for i=1:26
    outputCell{i} = allNumbers(cnum==i,:);
end
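To try the grouping without the original CSV, a small made-up `B` can be used. The sketch assumes, as the answer does, that `B{1}` is a column cell array of chromosome names and that the remaining cells hold numeric values; the real import may look different (for instance, the question's code strips a trailing character from each name).

```matlab
% Made-up stand-in for the imported CSV: names plus two numeric columns
B = {{'chr1'; 'chr2'; 'chr1'; 'chrX'}, {10; 20; 30; 40}, {1; 2; 3; 4}};

names = strrep(strrep(B{1}, 'X', '25'), 'Y', '26');   % chrX -> chr25, chrY -> chr26
tok   = regexp(names, 'chr(\d+)', 'tokens', 'once');
cnum  = cellfun(@(t) str2double(t{1}), tok);          % numeric chromosome index

allNumbers = cell2mat(cat(2, B{2:end}));              % 4-by-2 numeric array

outputCell = cell(26, 1);
for i = 1:26
    outputCell{i} = allNumbers(cnum == i, :);         % all rows of chromosome i
end
```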
I've managed to do this with only two for loops, here is my code:
C = zeros(26,8,maxlength);
next = zeros(1,26);
for j = 1:m
    Ind = findstr(B{1}{j}, 'chr');
    Num = B{1}{j}(Ind+3:end-1);
    cnum = str2num(Num);
    if strcmp(Num, 'X')
        cnum = 25;
    end
    if strcmp(Num, 'Y')
        cnum = 26;
    end
    next(cnum) = next(cnum) + 1;
    for k = 2:9
        D{cnum}{k-1}{next(cnum)} = B{k}{j};
        C(cnum,k-1,next(cnum)) = str2num(B{k}{j});
    end
end