Combine 2 rows in a cell array - matlab

I have a number of rows in a cell array with many empty cells at the end of each row, like this:
'a' 'b' 'c' 'd' [] [] [] [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []
I would like to copy the second row onto the end of the first row, as such:
'a' 'b' 'c' 'd' '1' '2' '3' [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []
Please note that the code given above is an arbitrary example to demonstrate what I wish to do. In reality I will include this functionality as a step in a more complex function.
I have tried searching for the first empty element in the cell array row, but for some reason isempty does not see them as empty. Is there an alternative method that someone could point me towards?
EDIT:
After the steps carried out above, the second row will be deleted, giving:
'a' 'b' 'c' 'd' '1' '2' '3' [] []
'w' 'x' 'y' 'z' [] [] [] [] []
Although the real cell array will have many more rows than 3.

I think this does what you want. I've denoted your cell array as c.
n1 = find(cellfun('isempty',c(1,:)), 1); %// first empty cell in row 1
n2 = find(cellfun('isempty',c(2,:)), 1); %// first empty cell in row 2
c(1,n1:n1+n2-2) = c(2,1:n2-1); %// copy the relevant part of row 2 onto row 1
This automatically extends your cell horizontally if the number of non-empty cells in row 2 exceeds the number of empty cells in row 1.
Example: input:
c = {'a' 'b' 'c' 'd' [] [] [] [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []}
Output:
c =
'a' 'b' 'c' 'd' '1' '2' '3' [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []
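The question does not show the isempty attempt, so the following is only a guess at the cause: indexing a cell array with parentheses returns a 1x1 cell, which is never empty, whereas braces return the cell's contents. The sketch below shows the difference on the example data, then applies the merge from this answer and deletes row 2 as requested in the EDIT. The names tf_paren and tf_brace are illustrative only, and the merge assumes both rows contain at least one empty cell.

```matlab
c = {'a' 'b' 'c' 'd' [] [] [] [] []
     '1' '2' '3' [] [] [] [] [] []
     'w' 'x' 'y' 'z' [] [] [] [] []};

% Parentheses return a 1x1 cell, which is itself not empty
tf_paren = isempty(c(1,5));
% Braces return the contents of the cell, which are empty here
tf_brace = isempty(c{1,5});

% Merge row 2 onto the end of row 1 (same steps as the answer above)
n1 = find(cellfun('isempty', c(1,:)), 1);  % first empty cell in row 1
n2 = find(cellfun('isempty', c(2,:)), 1);  % first empty cell in row 2
c(1, n1:n1+n2-2) = c(2, 1:n2-1);

% Delete row 2 once it has been copied
c(2,:) = [];
```

If a row might have no empty cells at all, find returns [] and the index arithmetic needs a guard for that case.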

Here's one general approach that uses efficient logical indexing to select the non-empty cells, needs only a single call to cellfun('isempty', ...), and extends the array automatically, as discussed in Luis's solution -
C = {
'a' 'b' 'c' 'd' [] [] [] [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []} %// Input cell array
N = 2; %// Rows 1 to N are merged into row 1; N can range from 2 up to the number of rows in C
Ct = C'; %//'# Transpose input cell array, as collecting elements that way is easier
vals = Ct(~cellfun('isempty',Ct(:,1:N))); %//'# elements from selected row(s)
C(1,1:numel(vals)) = vals; %// Place the values into the first row
With N = 2 which is the case stated in the problem, output would be -
C =
'a' 'b' 'c' 'd' '1' '2' '3' [] []
'1' '2' '3' [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] []
With N = 3, you would copy the second and third rows at the end of the first row. Thus, the output would be -
C =
'a' 'b' 'c' 'd' '1' '2' '3' 'w' 'x' 'y' 'z'
'1' '2' '3' [] [] [] [] [] [] [] []
'w' 'x' 'y' 'z' [] [] [] [] [] [] []
and so on.

Related

Deleting and retaining data in a cell array -Matlab

I have a set of data in a cell array, part of which is shown below. The first three columns of rows 2 and 3 are the same. From column 4 onwards, row 2 contains only P0702, which is already captured in row 3 (which has the same first three columns as row 2), so I would like to delete row 2. Similarly, rows 5 and 6 have the same data in the first three columns. P0882 and P0702 in row 5 are also present in row 6, so I would want to delete row 5.
Data before removing duplicates
'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] [] [] 'P0702' [] [] [] [] []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700'
Data after removing duplicates
'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700'
Any help would be great on this.
On first reading the question I thought this should be possible in 2 or 3 lines, but it took a few more lines of code to solve:
M={'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] [] [] 'P0702' [] [] [] [] []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700' }
%r contains the number of nonempty cells per row; you want the rows with the highest r
r=sum(cellfun(@(x)~isempty(x),(M(:,4:end))),2);
%Create a index matrix which maps each string of first and third column to
%a double, which allows to use unique.
[~,~,index]=unique(M(:,1));
index(:,2)=[M{:,2}];
[~,~,index(:,3)]=unique(M(:,3));
%fill fourth column with consecutive numbers, used to restore original
%ordering
index(:,4)=1:size(index,1);
%Next two lines, sort index to have rows with highest r first
[~,sorted_most_content]=sort(-r);
index=index(sorted_most_content,:);
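%Now the first three columns of index identify duplicates and the best choice
%comes first, so finally unique can be used. This relies on unique returning
%the first occurrence of each row, which is the default since R2013a.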
[~,indices_unique_content,~]=unique(index(:,1:3),'rows');
%use previously appended consecutive numbers to get line numbers we want.
%sort restores original ordering.
unique_content_inorder=sort(index(indices_unique_content,4));
%The data you want:
M(unique_content_inorder,:)
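The same "keep the row with the most non-empty cells" heuristic can also be sketched with an explicit loop over groups, which avoids depending on which occurrence unique returns. This is an alternative formulation under the same assumption as the answer above (the fullest row in each group contains the codes of the others), not a check that one row's codes are a true subset of another's. The names keys, g, keep and result are illustrative.

```matlab
M={'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] [] [] 'P0702' [] [] [] [] []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700' };

% Number of non-empty cells from column 4 onwards, per row
r = sum(~cellfun('isempty', M(:,4:end)), 2);

% Build one text key per row from the first three columns
numStr = cellfun(@num2str, M(:,2), 'UniformOutput', false);
keys = strcat(M(:,1), '|', numStr, '|', M(:,3));

% g(k) is the group number of row k; rows with equal keys share a group
[~, ~, g] = unique(keys);

% In each group keep the row with the most non-empty cells
keep = false(size(M,1), 1);
for k = 1:max(g)
    rows = find(g == k);
    [~, best] = max(r(rows));   % max returns the first maximum on ties
    keep(rows(best)) = true;
end

result = M(keep, :);   % original row order is preserved
```

For the data above, keep marks rows 1, 3, 4 and 6.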

Deleting rows with specific rules

I have a 20*3 cell array and I need to delete the rows that contain "137", "2" or "n:T".
Origin data:
'T' '' ''
'NP(*)' '' ''
[ 137] '' ''
[ 2] '' ''
'ARE' 'and' 'NP(FCC_A1#1)'
'' '' '1:T'
[ 1200] [0.7052] ''
[1.2051e+03] [0.7076] ''
'ARE' 'and' 'NP(FCC_A1#3)'
'' '' '2:T'
[ 1200] [0.0673] ''
[1.2051e+03] [0.0671] ''
'ARE' 'and' 'NP(M23C6)'
'' '' '3:T'
[ 1200] [0.2275] ''
[1.2051e+03] [0.2253] ''
[ 137] '' ''
[ 2] '' ''
And I want it to be like
'T' '' ''
'NP(*)' '' ''
'ARE' 'and' 'NP(FCC_A1#1)'
[ 1200] [0.7052] ''
[1.2051e+03] [0.7076] ''
'ARE' 'and' 'NP(FCC_A1#3)'
[ 1200] [0.0673] ''
[1.2051e+03] [0.0671] ''
'ARE' 'and' 'NP(M23C6)'
[ 1200] [0.2275] ''
[1.2051e+03] [0.2253] ''
I've tried regexp and strcmp and they don't work well. Plus, the cell array is also hard to deal with. Can anyone help?
Thank you in advance.
If you can somehow read your original data so that all cells are strings or empty arrays (not numeric values), you can do it with strcmp and regexprep:
% The variable 'data' is a 2D-cell array of strings or empty arrays
datarep = regexprep(data,'^\d+:T','2'); % replace 'n:T' with '2' for convenience
remove1 = strcmp('2',datarep); % this takes care of '2' and 'n:T'
remove2 = strcmp('137',datarep); % this takes care of '137'
rows_keep = find(~sum(remove1|remove2,2)); % rows that will be kept
solution = data(rows_keep,:)
For example, with this data
'aa' 'bb' 'cc'
'dd' 'dd' '2'
'137' 'dd' 'dd'
'dd' 'dd' '11:T'
'1:T' '1:137' 'dd'
'dd' '' []
the result in the variable solution is
'aa' 'bb' 'cc'
'dd' '' []
I just tried the following code on my desktop and it seems to do the trick. I used a as the name of your cell array.
L = size(a, 1);
mask = false(L, 1);
for ii = 1:L
if isnumeric(a{ii, 1}) && (a{ii, 1} == 137 || a{ii, 1} == 2)
mask(ii) = true;
elseif ~isempty(a{ii, 3}) && strcmp(a{ii, 3}(end-1:end), ':T')
mask(ii) = true;
end
end
b = a(~mask, :)
Now, b should be the cell array you wanted. Basically, I created a logical mask that marks the rows that satisfy the deletion rules, then used its negation to select the rows to keep.
Here is another simple option:
%Anonymous function that checks if a cell is equal to 137 or to 2 or fits the '*:T*' pattern
Eq137or2 = @(x) sum(x == 137 | x == 2) | sum(strfind(num2str(x), ':T') > 1)
%Use the anonymous functions to find the rows you don't want
mask = sum(cellfun(Eq137or2, a),2)
%Remove the unwanted rows
a(~mask, :)
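For comparison, here is a minimal vectorized sketch of the same masking idea that keeps the numeric cells numeric. It assumes, as in the question's data, that 137 and 2 only appear as numeric scalars in column 1 and that the 'n:T' markers only appear in column 3. The small array a and the names hitNum, hitT and b are illustrative.

```matlab
% Small mixed-type sample shaped like the question's data
a = {'T'   ''     ''
     137   ''     ''
     2     ''     ''
     'ARE' 'and'  'NP(M23C6)'
     ''    ''     '3:T'
     1200  0.2275 ''};

% Column 1: numeric scalar equal to 137 or 2
hitNum = cellfun(@(x) isnumeric(x) && isscalar(x) && (x == 137 || x == 2), a(:,1));

% Column 3: text of the form digits followed by ':T'
hitT = cellfun(@(x) ischar(x) && ~isempty(regexp(x, '^\d+:T$', 'once')), a(:,3));

% Keep every row that matches neither rule
b = a(~(hitNum | hitT), :);
```

Checking the type before comparing avoids treating the characters of a string as numbers, which is what x == 137 does when x is text.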

how to edit cells array in matlab

I have this cell array, which comes from MATLAB code that generates Dewey IDs:
POT1 =
'a0' [] [] []
'a0' 'c0' [] []
'a0' 'b0' [] []
'a0' 'c1' [] []
'a0' 'd0' [] []
'a0' 'c0' 'd1' []
'a0' 'b0' 'd2' []
'a0' 'd0' 'd3' []
'a0' 'd0' 'c2' []
'a0' 'd0' 'b1' []
'a0' 'd0' 'd4' []
'a0' 'c1' 'c3' []
'a0' 'c1' 'b2' []
'a0' 'c1' 'c3' 'd5'
'a0' 'c1' 'b2' 'd6'
'a0' 'd0' 'b1' 'd7'
'a0' 'd0' 'c2' 'd8'
Note that column 1 is the parent of column 2, column 2 is the parent of column 3, and so on.
So I want to write code that gives the full name of each cell, as follows:
POT1 =
a0 [] [] []
a0 a0.c0 [] []
a0 a0.b0 [] []
a0 a0.c1 [] []
a0 a0.d0 [] []
a0 a0.c0 a0.c0.d1 []
a0 a0.b0 a0.b0.d2 []
a0 a0.d0 a0.d0.d3 []
a0 a0.d0 a0.d0.c2 []
a0 a0.d0 a0.d0.b1 []
.
.
.
.
The code I wrote is not complete and gives me an "Index exceeds matrix dimensions" error:
for i=1:length(POT1)
for j=3:size(POT1,2)
if ~isempty(POT1{i,j})
POT1{i,j}=[POT1{i,j-2} POT1{i,j-1} POT1{i,j}];
end
end
end
POT1
I think you're on the right track, but it's easier if you process it column by column. This way, you just have to look one column back for each entry:
for jj=2:size(POT1,2)
for ii=1:size(POT1,1)
if ~isempty(POT1{ii,jj})
POT1{ii,jj}=[POT1{ii,jj-1} '.' POT1{ii,jj}];
end
end
end
By the way, length returns the length of a vector or the largest array dimension, so it is safer to use size(POT1,1) for the number of rows.
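A worked run of the column-by-column loop on a three-row excerpt of POT1 shows why looking one column back is enough: by the time column jj is processed, column jj-1 already holds the full path.

```matlab
% Three rows taken from the question's data
POT1 = {'a0' [] [] []
        'a0' 'c0' [] []
        'a0' 'c1' 'c3' 'd5'};

for jj = 2:size(POT1,2)          % columns, left to right
    for ii = 1:size(POT1,1)      % rows
        if ~isempty(POT1{ii,jj})
            % Prefix the entry with its (already expanded) parent
            POT1{ii,jj} = [POT1{ii,jj-1} '.' POT1{ii,jj}];
        end
    end
end
```

After the loop the last row reads 'a0', 'a0.c1', 'a0.c1.c3', 'a0.c1.c3.d5', and the empty cells are left untouched.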
If the empty cells held a two-space string ('  ') instead of [], it would be really easy.
You could convert the array into a char matrix, and the rest would be as simple as this:
[POT1(:,1:2) '.' POT1(:,3:4)]
Afterwards you can just strip the spaces and you are done.

Extract a single column from a matrix

I have a matrix generated by a program written in MATLAB, something like this:
'A' 'B' 'C' 'D' 'E'
[ 4] [ 1] [ 0.9837] [ 0.9928] [0.9928]
[ 4] [ 1] [ 0.9995] [ 0.9887] [0.9995]
[ 4] [ 1] [ 0.9982] [ 0.9995] [0.9995]
[ 4] [ 1] [ 0.9959] [ 0.9982] [0.9887]
I am trying to extract the column 'D' without the header 'D'.
I can put it into a temporary variable and then extract the column data, but I am wondering if it could be done in a single step.
Thanks
If your variable is data, then data(2:end,4) should do it.
Edit:
For example:
>> data
data =
'A' 'B' 'C' 'D' 'E'
[4] [1] [0.9837] [0.9928] [0.9928]
[4] [1] [0.9995] [0.9887] [0.9995]
[4] [1] [0.9982] [0.9995] [0.9995]
[4] [1] [0.9959] [0.9982] [0.9887]
>> data(2:end,4) %Extract the data as a cell array
ans =
[0.9928]
[0.9887]
[0.9995]
[0.9982]
>> cell2mat(data(2:end,4)) %Convert to a numeric (typical) array
ans =
0.9928
0.9887
0.9995
0.9982
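As a small variation on the cell2mat call above, brace indexing expands the selected cells into a comma-separated list that can be passed straight to vertcat. This sketch assumes every selected cell holds a numeric scalar, as in the question; the name col is illustrative.

```matlab
data = {'A' 'B' 'C' 'D' 'E'
        4 1 0.9837 0.9928 0.9928
        4 1 0.9995 0.9887 0.9995};

% Braces yield the cell contents as a comma-separated list;
% vertcat stacks them into a numeric column vector
col = vertcat(data{2:end,4});
```

Either form skips the header row in one step, with no temporary variable.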

removing duplicates - ** only when the duplicates occur in sequence

I would like to do something similar to the following, except that I would only like to remove 'g' and 'g', because they are the duplicates that occur one after the other. I would also like to keep the sequence the same.
Any help would be appreciated!!!
I have this cell array in MATLAB:
y = { 'd' 'f' 'a' 'g' 'g' 'w' 'a' 'h'}
ans =
'd' 'f' 'a' 'w' 'a' 'h'
There was an error in my first answer (below) when used on multiple duplicates (thanks grantnz). Here's an updated version:
>> y = { 'd' 'f' 'a' 'g' 'g' 'w' 'a' 'h' 'h' 'i' 'i' 'j'};
>> i = find(diff(char(y)) == 0);
>> y([i; i+1]) = []
y =
'd' 'f' 'a' 'w' 'a' 'j'
OLD ANSWER
If your "cell vector" always contains only single character elements you can do the following:
>> y = { 'd' 'f' 'a' 'g' 'g' 'w' 'a' 'h'}
y =
'd' 'f' 'a' 'g' 'g' 'w' 'a' 'h'
>> y(find(diff(char(y)) == 0) + [0 1]) = []
y =
'd' 'f' 'a' 'w' 'a' 'h'
Look at it like this: you want to keep an element if and only if either (1) it's the first element or (2) its predecessor is different from it and either (3) it's the last element or (4) its successor is different from it. So:
y([true ~strcmp(y(1:(end-1)),y(2:end))] & [~strcmp(y(1:(end-1)),y(2:end)) true])
or, perhaps better,
different = ~strcmp(y(1:(end-1)),y(2:end));
result = y([true different] & [different true]);
This should work:
y([ diff([y{:}]) ~= 0 true])
or slightly more compactly
y(diff([y{:}]) == 0) = []
Correction: the above won't remove both of the duplicates. Use this instead:
ind = diff([y{:}]) == 0;
y([ind 0] | [0 ind]) = []
By the way, this works even if there are multiple duplicate sequences,
e.g.:
y = { 'd' 'f' 'a' 'g' 'g' 'w' 'a' 'h' 'h'};
ind = diff([y{:}]) == 0;
y([ind 0] | [0 ind]) = []
y =
'd' 'f' 'a' 'w' 'a'
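The char- and diff-based versions above assume every cell holds a single character. A minimal sketch of the strcmp variant from the earlier answer, applied here to multi-character strings and to a run of three equal entries, is below; the sample values and the names different and result are illustrative.

```matlab
% Multi-character entries, a run of three, and a run of two
y = {'ab' 'cd' 'cd' 'cd' 'ef' 'ab' 'gh' 'gh'};

% different(k) is true when y{k} and y{k+1} are not equal
different = ~strcmp(y(1:end-1), y(2:end));

% Keep an element only if it differs from both of its neighbours
result = y([true different] & [different true]);
```

Here result is {'ab' 'ef' 'ab'}: every member of a consecutive run is dropped, while the non-adjacent repeat of 'ab' is kept.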