Printing progress in command window - matlab

I'd like to use fprintf to show code execution progress in the command window.
I've got an N x 1 array of structures, let's call it myStructure. Each element has the fields name and data. I'd like to print the name side by side with the number of data points, like so:
name1 number1
name2 number2
name3 number3
name4 number4
...
I can use repmat N times along with fprintf. The problem with that is that all the numbers have to come in between the names in a cell array C.
fprintf(repmat('%s\t%d',N,1),C{:})
I can use cellfun to get the names and number of datapoints.
names = {myStructure.name};
numpoints = cellfun(@numel,{myStructure.data});
However I'm not sure how to get this into a cell array with alternating elements for C to make the fprintf work.
Is there a way to do this? Is there a better way to get fprintf to behave as I desire?

You're very close. What I would do is change your cellfun call so that the output is a cell array instead of a numeric array. Use the 'UniformOutput' flag and set this to 0 or false.
When you're done, make a new cell array where both the name cell array and the size cell array are stacked on top of each other. You can then call fprintf once.
% Save the names in a cell array
A = {myStructure.name};
% Save the sizes in another cell array
B = cellfun(@numel, {myStructure.data}, 'UniformOutput', 0);
% Create a master cell array where the first row are the names
% and the second row are the sizes
out = [A; B];
% Print out the elements side-by-side
fprintf('%s\t%d\n', out{:});
The trick with the fprintf call is that when you unroll the cell array using {:}, this creates a comma-separated list unrolled in column-major format, and so doing out{:} actually gives you:
A{1}, B{1}, A{2}, B{2}, ..., A{n}, B{n}
... which provides the interleaving you need. Therefore, providing this order into fprintf coincides with the format specifiers that are specified and thus gives you what you need. That's why it's important to stack the cell arrays so that each column gives the information you need.
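To see the column-major unrolling in isolation, here is a minimal sketch. The names and sizes are made up to match the example run further down, and sprintf stands in for fprintf so the result can be inspected as a string:

```matlab
A = {'hello', 'hi', 'huh'};          % names (made-up values for illustration)
B = {5, 9, 24};                      % sizes, already stored as a cell array
out = [A; B];                        % 2x3 cell: names on row 1, sizes on row 2
flat = out(:)';                      % same order that out{:} produces
s = sprintf('%s\t%d\n', out{:});     % format string is reused for every name/size pair
```

Here flat comes out as {'hello', 5, 'hi', 9, 'huh', 24}, which is exactly the alternating order the format string expects.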
Minor Note
Of course, one of the easiest ways to tackle your problem is a simple for loop. Even though for loops are often discouraged in MATLAB in favor of vectorized code, their performance has come a long way throughout MATLAB's evolution.
Simply put, just do this:
for ii = 1 : numel(myStructure)
    fprintf('%s\t%d\n', myStructure(ii).name, numel(myStructure(ii).data));
end
The above code is arguably more readable in comparison to what we did above with cell arrays. You're accessing the structure directly rather than having to create intermediate variables for the purpose of calling fprintf once.
Example Run
Here's an example of this running. Using the data shown below:
clear myStructure;
myStructure(1).name = 'hello';
myStructure(1).data = rand(5,1);
myStructure(2).name = 'hi';
myStructure(2).data = zeros(3,3);
myStructure(3).name = 'huh';
myStructure(3).data = ones(6,4);
I get the following output after running the printing code:
hello 5
hi 9
huh 24
We can see that the sizes are correct as the first element in the structure is simply a random 5 element vector, the second element is a 3 x 3 = 9 zeroes matrix while the last element is a 6 x 4 = 24 ones matrix.

Related

MATLAB automatically assigns unwanted dimensions to a dynamically updated cell array

I want to generate a 3D cell array called timeData so that timeData(:,:,a) for some integer a is an nx1 matrix of data, and the number of rows n varies with the value of a in a 1:1 correspondence. To do this, I am generating a 2D array of data called data that is nx1. This assignment statement takes place within a for loop as follows:
% Before iterating, I define an array of indices where I want to store the
% data sets in timeData. This choice of storage location is for
% organizational purposes.
A = [2, 5, 9, 21, 34, 100]; % Notice they are in ascending order, but have
% gaps that have no predictability.
sizeA = size(A);
numIter = sizeA(2);
for m = 1:numIter % numIter is the number of data sets that I need to store
% in timeData
% At this point, some code that is entirely irrelevant to my question
% generates a nx1 array of data. One example of this data array is below.
data = [1.1;2.3;5.5;4.4]; % This is one example of what data could be. Its
% number of rows, n, changes each iteration, as
% do its contents.
B = size(data);
timeData(1:B(1),1,A(m)) = num2cell(data);
end
This code does put all contents of data in the appropriate locations within timeData as I want. However, it also adds {0x0 double} rows to all 2D arrays of timeData(:,:,a) for any a whose corresponding number of rows n was not the largest number of rows. Thus, there are many of these 2D arrays that have 10 to a couple hundred 0-valued rows that I don't want. For values of a that did not have a corresponding data set, the content of timeData(:,:,a) is an nx1 array of {0x0 double}.
I need to iterate over the contents of timeData in subsequent code, and I need to be able to find the size of the data set that is in timeData(:,:,a) without somehow discounting all the {0x0 double}.
How can I modify my assignment statement to fix this?
Edit: Desired output of the above example is the following with n = 5. Let this data set be represented by a = 9.
timeData(:,:,9) = {[1.1]}
{[2.3]}
{[5.5]}
{[8.6]}
{[4.4]}
Now, consider the possibility that a previous or subsequent value of the A matrix had a data set with n = 7, and n = 7 is the largest data set (largest n value). timeData(:,:,9) outputs like so in my code:
timeData(:,:,9) = {[1.1]}
{[2.3]}
{[5.5]}
{[8.6]}
{[4.4]}
{[0x0 double]}
{[0x0 double]}
@Dev-iL, as I understand it, your answer gives me the ability to delete the cells that have {[0x0 double]} in them (this is what I mean by "discounting"). This is a good plan B, but is there a way to prevent the {[0x0 double]} cells from showing up in the first place?
Edit 2: Update to the above statement "your answer gives me the ability to delete the cells that have {[0x0 double]} in them (this is what I mean by "discounting")". The cellfun(@isempty... ) function makes the {[0x0 double]} cells go to {[0x0 cell]}; it does not remove them. In other words, size(timeData(:,:,9)) is the same before and after the command is performed. This is not what I want. I want size(timeData(:,:,9)) to be 5x1 no matter what n is for any other value of a.
Edit 3: I just realized that the most desired output would be the following:
timeData(:,:,9) = {[1.1;2.3;5.5;8.6;4.4]} % An n x 1 column matrix within
% the cell.
but I can work with this outcome or the outcome as described above.
Unfortunately, I don't understand the structure of your dataset, which is why I can't suggest a better assignment method. However, I'd like to point out an operation that can help you deal with your data after it's been created:
cellfun(@isempty,timeData);
What the above does is return a logical array the size of timeData, indicating which cells contain something "empty". Typically, an array of arbitrary datatype is considered "empty" when it has at least one dimension that is equal to 0.
How can you use it to your advantage?
%% Example 1: counting non-empty cells:
nData = sum(~cellfun(@isempty,timeData(:)));
%% Example 2: assigning empty cells in place of empty double arrays:
timeData(cellfun(@isempty,timeData)) = {{}};
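As for avoiding the padding in the first place, one option that matches the layout from "Edit 3" is to store each data set as a whole column vector inside a single cell. A cell array is rectangular, so one-value-per-cell storage has to pad shorter columns, whereas one-vector-per-cell does not. The following is only a sketch under assumptions: dataSets is a hypothetical stand-in for whatever the loop body generates, and A is shortened to three indices.

```matlab
A = [2, 5, 9];                                  % storage indices (shortened from the question)
dataSets = {[1.1; 2.3], [4; 5; 6; 7], ...       % hypothetical data, one n-by-1 vector
            [1.1; 2.3; 5.5; 8.6; 4.4]};         % per iteration
timeData = cell(1, 1, max(A));                  % one cell per possible index a
for m = 1:numel(A)
    timeData{1, 1, A(m)} = dataSets{m};         % whole column goes into one cell
end
n9 = size(timeData{1, 1, 9}, 1);                % number of rows stored at a = 9
nStored = sum(~cellfun(@isempty, timeData(:))); % how many indices hold data
```

With this layout the size of each data set is independent of the others: n9 is 5 here even though other data sets have 2 and 4 rows, and indices without data simply hold an empty cell.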

Avoid looping in matlab when creating cells containing cell arrays

I'm trying to create a map that has two-element cell arrays as values. Map expects that keys and values have the same number of elements. This code packs those cell arrays into cells in a loop, but I'm suspecting that it can be simplified somehow. Example code:
cells1={'foo1';'foo2';'foo3'};
cells2={'bar1';'bar2';'bar3'};
cells3={'baz1';'baz2';'baz3'};
values=cell(size(cells1));
for ii=1:size(cells1,1)
values{ii}={{cells2{ii},cells3{ii}}};
end
keys=cells1;
containers.Map(keys,values);
You can use concatenation and num2cell with the second (dimension) argument, applied twice if you want to obtain an identical result:
% your code
cells1={'foo1';'foo2';'foo3'};
cells2={'bar1';'bar2';'bar3'};
cells3={'baz1';'baz2';'baz3'};
values=cell(size(cells1));
for ii=1:size(cells1,1)
values{ii}={{cells2{ii},cells3{ii}}};
end
% simplified
c = num2cell(num2cell([cells2,cells3],2),2);
% you can also do c = num2cell([cells2,cells3],2); which isn't identical but may be sufficient
isequal(c,values) % yes
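For completeness, here is a sketch of how the simpler single num2cell variant might be used with containers.Map. The variable names m and v are illustrative only, and this assumes a plain 1x2 cell per key is acceptable as the value:

```matlab
cells1 = {'foo1'; 'foo2'; 'foo3'};
cells2 = {'bar1'; 'bar2'; 'bar3'};
cells3 = {'baz1'; 'baz2'; 'baz3'};
c = num2cell([cells2, cells3], 2);   % 3x1 cell, each entry a 1x2 cell of strings
m = containers.Map(cells1, c);       % one value per key, so the counts match
v = m('foo2');                       % look up the pair stored under 'foo2'
```

Because c has exactly one element per key, the Map constructor accepts it without the extra level of nesting.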

Using Matlab to randomly split an Excel Sheet

I have an Excel sheet containing 1838 records and I need to RANDOMLY split these records into 3 Excel Sheets. I am trying to use Matlab but I am quite new to it and I have just managed the following code:
[xlsn, xlst, raw] = xlsread('data.xls');
numrows = 1838;
randindex = ceil(3*rand(numrows, 1));
raw1 = raw(:,randindex==1);
raw2 = raw(:,randindex==2);
raw3 = raw(:,randindex==3);
Your general procedure will be to read the spreadsheet into some matlab variables, operate on those matrices such that you end up with three thirds and then write each third back out.
So you've got the read covered with xlsread, which in your code results in the two arrays xlsn and xlst (plus raw). I would suggest using the syntax
[~, ~, raw] = xlsread('data.xls');
In the xlsread help file (you can access this by typing doc xlsread into the command window) it says that the three output arguments hold the numeric cells, the text cells and the whole lot. This is because a matlab matrix can only hold one type of value, whereas a spreadsheet will usually contain both text and numbers. The raw value will hold all of the values but in a 'cell array' instead, a different kind of matlab data type.
So then you will have a cell array called raw. From here you want to do four things:
work out how many rows you have (I assume each record is a row) by using the size function and specifying the appropriate dimension (again check the help file to see how to do this)
create an index of random numbers between 1 and 3 inclusive, which you can use as a mask
randindex = ceil(3*rand(numrows, 1));
apply the mask to your cell array to extract the records matching each index
raw1 = raw(randindex==1,:); % rows are records; do the same for the other two index values
write each cell back to a file
xlswrite('output1.xls', raw1);
You will probably have to fettle the arguments to get it to work the way you want but be sure to check the doc functionname page to get the syntax just right. Your main concern will be to get the indexing correct - matlab indexes row-first whereas spreadsheets tend to be column-first (e.g. cell A2 is column A and row 2, but matlab matrix element M(1,2) is the first row and the second column of matrix M, i.e. cell B1).
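To make the row-versus-column point concrete, here is a small sketch on an in-memory cell array. The contents of raw and the fixed randindex are made up so the result is reproducible; in practice raw would come from xlsread and randindex from rand:

```matlab
raw = {1, 'a'; 2, 'b'; 3, 'c'; 4, 'd'};   % hypothetical 4x2 stand-in: one record per row
randindex = [1; 2; 1; 3];                 % fixed group labels instead of random ones
raw1 = raw(randindex == 1, :);            % mask in the row subscript, all columns kept
```

Since each record is a row, the logical mask goes in the first subscript and the colon in the second, so raw1 ends up holding the first and third records.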
UPDATE: splitting the file evenly is surprisingly more trouble. Because we're using random numbers for the index, it's not guaranteed to split evenly. So instead we can generate a vector of random floats and then pick out the lowest 33% of them to make index 1, the highest 33% to make index 3, and let the rest be 2.
randvec = rand(numrows, 1); % float between 0 and 1
pct33 = prctile(randvec,100/3); % value of 33rd percentile
pct67 = prctile(randvec,200/3); % value of 67th percentile
randindex = ones(numrows,1);
randindex(randvec>pct33) = 2;
randindex(randvec>pct67) = 3;
It probably still won't be absolutely even - 1838 isn't a multiple of 3. You can see how many members each group has this way
numel(find(randindex==1))
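If the group sizes must differ by at most one, an alternative to the percentile approach is to shuffle the row numbers and deal the labels out in turn. This is a different technique from the one above, offered only as a sketch; shuffled and counts are illustrative names:

```matlab
numrows = 1838;                                  % record count from the question
shuffled = randperm(numrows);                    % random ordering of the row numbers
randindex = zeros(numrows, 1);
randindex(shuffled) = mod(0:numrows-1, 3) + 1;   % deal labels 1, 2, 3 round-robin
counts = accumarray(randindex, 1)';              % size of each group
```

For 1838 rows this always gives groups of 613, 613 and 612 records, while which rows land in which group is still random.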

subindex into a cell array of strings

I have a 6 x 3 cell (called strat) where the first two columns contain text, the last column has either 1 or 2.
I want to take a subset of this cell array. Basically select only the rows where the last column has a 1 in it.
I tried the following,
ff = strat(strat(:, 3), 1:2) == 1;
The error message is,
Function 'subsindex' is not defined for values of class 'cell'.
How can I index into a cell array?
The contents of a cell array are accessed through braces {} instead of parentheses (). Then, as a 2nd subtlety, when pulling values out of a cell array, you need to gather them...for numerics you gather them into regular arrays using [] and for strings you gather them into a new cell array using {}. Confusing, eh?
ff = { strat{ [strat{:,3}]==1 , 1:2 } };
Gathering into cell arrays this way can often give the wrong shape when you're done. So, you might try something like this
ind = find([strat{:,3}]==1); %find the relevant indices
ff = {strat{ind,1}; strat{ind,2}}'; %this will probably give you the right shape
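A shorter route, sketched here with made-up contents for strat, is to build the logical mask with braces and then index the cell array with parentheses. Parentheses return a sub-cell array, so the shape is preserved without any re-gathering:

```matlab
strat = {'a', 'x', 1; 'b', 'y', 2; 'c', 'z', 1; ...
         'd', 'w', 2; 'e', 'v', 1; 'f', 'u', 2};   % hypothetical 6x3 example
mask = [strat{:, 3}] == 1;     % braces pull the numbers out into a logical row vector
ff = strat(mask, 1:2);         % parentheses keep the result as a cell array
```

Here ff is a 3x2 cell holding the first two columns of the rows whose third column is 1.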

Fastest way of finding repeated values in different cell arrays of different size

The problem is the following:
I have a cell array of the form indx{jj} where each jj is an array of 1xNjj, meaning they all have different size. In my case max(jj)==3, but let's consider a general case for the sake of it.
How would you find the value(s) repeated in all the jj in the fastest way?
I can guess how to do it with several for loops, but is there a "one (three?) liner"?
Simple example:
indx{1}=[ 1 3 5 7 9];
indx{2}=[ 2 3 4 1];
indx{3}=[ 1 2 5 3 3 5 4];
ans=[1 3];
One possibility is to use a for loop with intersect:
result = indx{1}; %// will be changed
for n = 2:numel(indx)
result = intersect(result, indx{n});
end
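Run on the example from the question, the loop looks like this (a self-contained sketch of the same approach):

```matlab
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};   % example data from the question
result = indx{1};                                   % start with the first vector
for n = 2:numel(indx)
    result = intersect(result, indx{n});            % keep only values seen so far in every cell
end
```

intersect returns sorted values without repetitions, so result ends up as [1 3], matching the expected answer.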
Almost no-loop approach (almost because cellfun essentially uses loop(s) inside it, but its effect here is minimal as we are using it to find just the number of elements in each cell) -
lens = cellfun(@numel,indx); %// number of elements in each cell
val_ind = bsxfun(@ge,lens,[1:max(lens)]'); %// mask of valid positions per column
vals = horzcat(indx{:});
mat1(max(lens),numel(lens))=0; %// zero-padded matrix, one column per cell
mat1(val_ind) = vals;
unqvals = unique(vals);
out = unqvals(all(any(bsxfun(@eq,mat1,permute(unqvals,[1 3 2]))),2));
Another possibility that I could suggest, though Luis Mendo's answer is very good, is to take all of the vectors in your cell array and remove the duplicates. This can be done through cellfun, and specifying unique as the function to operate on. You'd have to set the UniformOutput flag to false as we are outputting a cell array at each index. You also have to be careful in that each cell array is assumed to be all row vectors, or all column vectors. You can't mix the way the arrays are shaped or this method won't work.
Once you do this, concatenate all of the vectors into a single array through cell2mat, then do a histogram through histc. You'd specify the edges to be all of the unique numbers in the single array created before. Note that you'd have to make an additional call to unique on the output single array before proceeding. Once you calculate the histogram, for any entries with a bin count equal to the total number of elements in your cell array (which is 3 in your case), then these are values that you see in all of your cells. As such:
A = cell2mat(cellfun(@unique, indx, 'uni', 0));
edge_values = unique(A);
h = histc(A, edge_values);
result = edge_values(h == numel(indx));
With the unique call for each cell array, if a number appears in every single cell, then the total number of times you see this number should equal the total number of cells you have.
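Here is the counting idea traced on the question's example. This is a sketch that assumes indx is a 1x3 cell of row vectors, as in the question:

```matlab
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};   % example data from the question
A = cell2mat(cellfun(@unique, indx, 'uni', 0));     % per-cell duplicates removed, then joined
edge_values = unique(A);                            % candidate values: 1 2 3 4 5 7 9
h = histc(A, edge_values);                          % how many cells contain each candidate
result = edge_values(h == numel(indx));             % keep values present in all 3 cells
```

The counts come out as [3 2 3 2 2 1 1], so only 1 and 3 reach the required count of 3 and result is [1 3].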