Replace a string in a cell array with a 1x3 numeric cell array - matlab

I have cell array data as below:
data=
'A' [0.006] 'B'
'C' [3.443] 'C'
I would like to convert the character in the first column into a 1x3 vector, meaning that
'A' is replaced by [0] [0] [0],
'C' is replaced by [0] [1] [0], and so on.
The result would be
[0] [0] [0] [0.006] 'B'
[0] [1] [0] [3.443] 'C'
The code I tried is below:
B=data(1:end,1);
B=regexprep(B,'A','[0 0 0]');
B=regexprep(B,'C','[0 1 0]');
The result shows
B=
'[0 0 0]'
'[0 1 0]'
which is wrong: each character is replaced by a string, not by a 1x3 array. Please help.

Since you did not specify the rule to convert letters to numbers,
I assumed you want to replace A with 000, B with 001, ..., H with 111
(i.e. the numbers 0 to 7 in binary, corresponding to the letters A to H).
In case you want to go up to Z, the code below can be easily changed (the codes then need 5 bits instead of 3).
%# your data cell array
data = {
'A' [0.006] 'B'
'C' [3.443] 'C'
};
%# compute binary numbers equivalent to letters A to H
binary = num2cell(dec2bin(0:7)-'0'); %# use 0:25 to go up to Z
%# convert letters into row indices in the above cell array "binary"
idx = cellfun(@(c) c-'A'+1, upper(data(:,1)));
%# replace first column, and build new data
newData = [binary(idx,:) data(:,2:end)]
The result:
newData =
[0] [0] [0] [0.006] 'B'
[0] [1] [0] [3.443] 'C'
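If the letter-to-vector rule is not the binary counting assumed above, the same replacement can be driven by an explicit lookup table. The following is only a sketch: the names keys and codes, and the rows chosen for them, are hypothetical and would need to be replaced by the actual rule.

```matlab
% Hypothetical lookup table: keys{k} is replaced by the k-th row of codes.
keys  = {'A', 'B', 'C'};
codes = [0 0 0; 0 0 1; 0 1 0];

data = {'A', 0.006, 'B'; 'C', 3.443, 'C'};

% Row of "codes" for each entry of the first column (0 means "not found").
[found, row] = ismember(data(:,1), keys);
assert(all(found), 'Unknown key in first column');

% Turn the selected numeric rows into cells and prepend them to the rest.
newData = [num2cell(codes(row,:)), data(:,2:end)];
```

With this layout, changing the rule only means editing keys and codes; the indexing code stays the same.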

Related

Enumerating combinations of cells

Say I have 3 cells:
M1={ [1,1,1], [2,2,2] }
M2={ [3,3], [4,4] }
M3={ [5], [6] }
I want to take every element in M1, combine it with every element of M2, combine that with every element of M3, etc.
For the input above, I would like to produce one giant cell like:
[1,1,1],[3,3],[5]
[1,1,1],[3,3],[6]
[1,1,1],[4,4],[5]
[1,1,1],[4,4],[6]
[2,2,2],[3,3],[5]
[2,2,2],[3,3],[6]
[2,2,2],[4,4],[5]
[2,2,2],[4,4],[6]
How can I do this? In general, the number of cells (M1,M2...Mn), and their size, are unknown (and changing).
This function does what you want:
function C = add_permutations(A,B)
% A is a cell array NxK, B is 1xM
% C is a cell array N*M x K+1
N = size(A,1);
A = reshape(A,N,1,[]);
C = cat(3,repmat(A,1,numel(B)),repmat(B,N,1));
C = reshape(C,[],size(C,3));
It creates all combinations of two cell arrays by replicating them along different dimensions, then concatenating along the 3rd dimension and collapsing the first two dimensions. Because we want to call it repeatedly with different cell arrays, input A (NxK) has K elements in each row: these are the previous combinations. B is a cell vector; each of its elements will be combined with each row of A.
You use it as follows:
M1 = { 'a', 'b', 'c', 'd' }; % These are easier for debugging than OP's input, but cell elements can be anything at all.
M2 = { 1, 2 };
M3 = { 10, 12 };
X = M1.';
X = add_permutations(X,M2);
X = add_permutations(X,M3);
X now contains:
X =
16×3 cell array
'a' [1] [10]
'b' [1] [10]
'c' [1] [10]
'd' [1] [10]
'a' [2] [10]
'b' [2] [10]
'c' [2] [10]
'd' [2] [10]
'a' [1] [12]
'b' [1] [12]
'c' [1] [12]
'd' [1] [12]
'a' [2] [12]
'b' [2] [12]
'c' [2] [12]
'd' [2] [12]
That's not a permutation, it's an enumeration: you have 3 symbols, each with 2 possible values, and you are simply enumerating all possible "numbers". You can think about it the same way as if you were counting binary numbers with 3 digits.
In this case, one way to enumerate all these possibilities is with ndgrid. If M1 has n1 elements, M2 has n2 elements, etc:
n1 = numel(M1);
n2 = numel(M2);
n3 = numel(M3);
[a,b,c] = ndgrid(1:n1, 1:n2, 1:n3);
Here a, b and c are each a 3-dimensional array representing the "grid" of combinations. You don't need the grid shape itself, so you can flatten them into vectors and use those to pick combinations of the elements of M1, M2, M3, like so
vertcat( M1(a(:)), M2(b(:)), M3(c(:)) )
If you are interested in generalising this for any number of Ms, this can also be done, but keep in mind that these "grids" are growing very fast as you increase their dimensionality.
Note: vertcat stands for "vertical concatenation". It is vertical rather than horizontal because the result of M1(a(:)) is a row-shaped cell, even though a(:) is a column vector. That's just an indexing headache; you can simply transpose the result if you want it Nx3.
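The generalisation to any number of Ms mentioned above could look roughly like the sketch below. The names sets and combos are hypothetical, and the code assumes at least two input cells, because ndgrid called with a single vector behaves differently.

```matlab
M1 = {[1,1,1], [2,2,2]};
M2 = {[3,3], [4,4]};
M3 = {5, 6};

% Cell vectors to combine; the wrapper name "sets" is hypothetical.
sets = {M1, M2, M3};
n = numel(sets);

% One index grid per input: idx{k} holds indices into sets{k}.
ranges = cellfun(@(s) 1:numel(s), sets, 'UniformOutput', false);
idx = cell(1, n);
[idx{:}] = ndgrid(ranges{:});

% Column k of the result holds the elements picked from sets{k}.
combos = cell(numel(idx{1}), n);
for k = 1:n
    combos(:, k) = reshape(sets{k}(idx{k}(:)), [], 1);
end
```

Each row of combos is one combination. With ndgrid the first input varies fastest, so the row order differs from the listing in the question, but the set of rows is the same.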

In Matlab, How to eliminate empty columns from the cell array?

In a 3x18 cell array, 7 columns are empty, and I need a new cell array that is 3x11. Any suggestions that avoid looping?
Let's consider the following cell array. Its second column consists only of [], so it should be removed.
>> c = {1 , [], 'a'; 2, [], []; 3, [], 'bc'}
c =
[1] [] 'a'
[2] [] []
[3] [] 'bc'
You can compute a logical index to tell which columns should be kept and then use it to obtain the result:
>> keep = any(~cellfun('isempty',c), 1); %// keep columns that don't only contain []
keep =
1 0 1 %// column 2 should be removed
>> result = c(:,keep)
result =
[1] 'a'
[2] []
[3] 'bc'
How it works:
cellfun('isempty',c) is a matrix the same size as c. It contains 1 at entry (m,n) if and only if c{m,n} is empty.
~cellfun('isempty',c) is the logical negation of the above, so it contains 1 where c is not empty.
any(~cellfun('isempty',c), 1) applies any to each column of the above. So it's a row vector whose n-th entry equals 1 if any of the cells in the n-th column of c is non-empty, and 0 otherwise.
The above is used as a logical index to select the desired columns of c.
Use cellfun to detect empty elements, then from that find the columns that contain empty elements and delete those:
cellarray(:, any(cellfun(@isempty, cellarray), 1)) = [];
If instead you'd like to keep columns with at least one non-empty element, use all instead of any.
For example:
>> cellarray = {1 2 ,[], 4;[], 5, [], 3}
[1] [2] [] [4]
[] [5] [] [3]
>> cellarray(:,any(cellfun(@isempty, cellarray), 1))=[]
cellarray =
[2] [4]
[5] [3]
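To make the difference between any and all concrete, here is a small sketch on the same example. The variable names isEmpty, dropAny and dropAll are only illustrative.

```matlab
cellarray = {1, 2, [], 4; [], 5, [], 3};
isEmpty = cellfun(@isempty, cellarray);    % 2x4 logical, true where cell is []

% Keep only columns without a single empty cell (removes columns 1 and 3).
dropAny = cellarray(:, ~any(isEmpty, 1));

% Keep columns with at least one non-empty cell (removes only column 3).
dropAll = cellarray(:, ~all(isEmpty, 1));
```

For the original question, where only fully empty columns should go, the all variant is the one that matches.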

Left join for cell arrays in MATLAB

I have 2 cell arrays in MATLAB, for example:
A= {'jim',4,'paul',5,'sean',5,'rose',1}
and the second:
B= {'jim','paul','george','bill','sean','rose'}
I want to do an SQL-style left join, so that I get all the values from B together with their match from A. If a name does not appear in A, its value should be 0. That means:
C= {'jim',4,'paul',5,'george',0,'bill',0,'sean',5,'rose',1}
I didn't find any relevant function to help.
Thanks.
Approach #1
%// Inputs
A= {'paul',5 ,'sean' ,5,'rose', 1,'jim',4}
B= {'jim', 'paul', 'george', 'bill', 'sean' ,'rose'}
%// Reshape A to extract the names and the numerals separately later on
Ar = reshape(A,2,[]);
%// Account for unsorted A with respect to B
[sAr,idx] = sort(Ar(1,:))
Ar = [sAr ; Ar(2,idx)]
%// Detect the presence of A's in B's and find the corresponding indices
[detect,pos] = ismember(B,Ar(1,:))
%// Setup the numerals for the output as row2
row2 = num2cell(zeros(1,numel(B)));
row2(detect) = Ar(2,pos(detect)); %//extracting names and numerals here
%// Append numerals as a new row into B and reshape as 1D cell array
out = reshape([B;row2],1,[])
Code run -
A =
'paul' [5] 'sean' [5] 'rose' [1] 'jim' [4]
B =
'jim' 'paul' 'george' 'bill' 'sean' 'rose'
out =
'jim' [4] 'paul' [5] 'george' [0] 'bill' [0] 'sean' [5] 'rose' [1]
Approach #2
If you want to work with the numerals in the cell arrays as strings, you can use this modified version -
%// Inputs [Please edit these to your actual inputs]
A= {'paul',5 ,'sean' ,5,'rose', 1,'jim',4};
B= {'jim', 'paul', 'george', 'bill', 'sean' ,'rose'}
%// Convert the numerals into string format for A
A = cellfun(@(x) num2str(x),A,'Uni',0)
%// Reshape A to extract the names and the numerals separately later on
Ar = reshape(A,2,[]);
%// Account for unsorted A with respect to B
[sAr,idx] = sort(Ar(1,:));
Ar = [sAr ; Ar(2,idx)];
%// Detect the presence of A's in B's and find the corresponding indices
[detect,pos] = ismember(B,Ar(1,:));
%// Setup the numerals for the output as row2
row2 = num2cell(zeros(1,numel(B)));
row2 = cellfun(@(x) num2str(x),row2,'Uni',0); %// Convert to string formats
row2(detect) = Ar(2,pos(detect)); %//extracting names and numerals here
%// Append numerals as a new row into B and reshape as 1D cell array
out = reshape([B;row2],1,[])
Code run -
B =
'jim' 'paul' 'george' 'bill' 'sean' 'rose'
A =
'paul' '5' 'sean' '5' 'rose' '1' 'jim' '4'
out =
'jim' '4' 'paul' '5' 'george' '0' 'bill' '0' 'sean' '5' 'rose' '1'
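As a more compact variant of Approach #1, the names and values can be pulled out of A by stride and matched with ismember alone, which does not need sorted input. This is a sketch under the assumption that A always alternates name, value; the variable names names, values and joined are illustrative.

```matlab
A = {'paul', 5, 'sean', 5, 'rose', 1, 'jim', 4};
B = {'jim', 'paul', 'george', 'bill', 'sean', 'rose'};

names  = A(1:2:end);   % keys stored at the odd positions of A
values = A(2:2:end);   % matching values at the even positions

% found(k) is true if B{k} is in names; pos(k) is its position there.
[found, pos] = ismember(B, names);

joined = repmat({0}, 1, numel(B));    % default 0 for names missing from A
joined(found) = values(pos(found));

% Interleave names and values into one flat 1-by-2N cell array.
out = reshape([B; joined], 1, []);
```

The default in repmat({0}, ...) plays the role of the NULL that a left join would return for unmatched rows.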

Finding which letter has maximal occurrence

I searched the MATLAB documentation and the net for an answer, but in vain, so I need your help.
I have used the code below to find the number of occurrences of the letters in an array:
characterCell = {'a' 'b' 'b' 'a' 'b' 'd' 'c' 'c'}; %# Sample cell array
matchCell = {'a' 'b' 'c' 'd' 'e'}; %# Letters to count
[~,index] = ismember(characterCell,matchCell); %# Find indices in matchCell
counts = accumarray(index(:),1,[numel(matchCell) 1]); %# Accumulate indices
results = [matchCell(:) num2cell(counts)]
results =
'a' [2]
'b' [3]
'c' [2]
'd' [1]
'e' [0]
Now I need to find which letter has the highest number of occurrences.
How do I get its index?
The mode function tells you the most frequent value. Apply it to the numeric indices returned by ismember above, then look up the letter:
mostCommonLetter = matchCell{mode(index)};
The index is the second output of the function max.
So you should do:
[~,index]=max(counts)
mostCommonLetter=matchCell{index};
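Note that max returns only the first index when several letters share the highest count. A small sketch for collecting all of them, using made-up counts that contain a tie (the names maxCount, first and allMax are illustrative):

```matlab
matchCell = {'a', 'b', 'c', 'd', 'e'};
counts = [2; 3; 3; 1; 0];    % made-up counts: 'b' and 'c' are tied

% max reports the first position of the largest value only.
[maxCount, first] = max(counts);

% Logical indexing returns every letter whose count equals the maximum.
allMax = matchCell(counts == maxCount);
```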

Sorting a cell array

I want to sort the rows according to their second entries, i.e. by the second column. Each entry of the second column is a char array (representing a time stamp). There might also be missing values, i.e. an entry in the second column can be []. How do I do this?
You need to use the sortrows() function.
If the matrix you want to sort is A, then use
sorted_matrix = sortrows(A,2);
http://www.mathworks.com/help/techdoc/ref/sortrows.html
I would first convert the time stamps from strings to numeric values using the function DATENUM. Then you will want to replace the contents of the empty cells with a placeholder, like NaN. Then you can use the function SORTROWS to sort based on the second column. Here is an example:
>> mat = {1 '1/1/10' 3; 4 [] 6; 7 '1/1/09' 9} %# Sample cell array
mat =
[1] '1/1/10' [3]
[4] [] [6]
[7] '1/1/09' [9]
>> validIndex = ~cellfun('isempty',mat(:,2)); %# Find non-empty indices
>> mat(validIndex,2) = num2cell(datenum(mat(validIndex,2))); %# Convert dates
>> mat(~validIndex,2) = {NaN}; %# Replace empty cells with NaN
>> mat = sortrows(mat,2) %# Sort based on the second column
mat =
[7] [733774] [9]
[1] [734139] [3]
[4] [ NaN] [6]
The NaN values will be sorted to the bottom in this case.
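If sortrows does not accept a cell array with numeric entries in your MATLAB release (treat that as an assumption to check), the same ordering can be obtained by sorting a numeric key vector and reordering the rows with the resulting index. This sketch also leaves the original date strings in place; the names keys, order and sorted are illustrative.

```matlab
mat = {1, '1/1/10', 3; 4, [], 6; 7, '1/1/09', 9};

% Build one numeric sort key per row; NaN marks a missing time stamp.
validIndex = ~cellfun('isempty', mat(:,2));
keys = nan(size(mat,1), 1);
keys(validIndex) = datenum(mat(validIndex,2));

% Ascending sort places NaN last, so rows without a stamp end up at the bottom.
[~, order] = sort(keys);
sorted = mat(order, :);
```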