sort columns in Matlab - matlab

I have two columns of data imported using textscan. The data look like this, where U means undetected and D means detected:
mydata=
.51 U
.57 D
.48 U
.47 D
mydata = [4x1 double] [4x1 char]
I want to sort the data by the first column so that it looks like this:
.47 D
.48 U
.51 U
.57 D
I would like to preserve the cell structure so that the following commands, which assign a logical value, still work:
c = zeros(size(mydata,1),1); % preallocate empty matrix
c = mydata{2} == 'U';
for i = 1:size(mydata,1)
curValue = mydata{i,2};
data{i,3} = ~isempty(curValue) && ischar(curValue) && strcmp(curValue ,'U');
end
I read about sortrows, but that function is used to sort matrices containing only numbers.
Does anyone have a solution for sorting arrays with a mixture of numbers and characters?

You can SORT by one vector and apply the sorting index to another vector.
[mydata{1},idx] = sort(mydata{1});
mydata{2} = mydata{2}(idx);

I don't think you can directly sort the cell array, because each cell is considered a different "entity". You can always sort the numbers, use the indices to sort the characters, and then put it back in the cell array:
nums = mydata{1};
chars = mydata{2};
[~, ind] = sort(nums);
sortednums = nums(ind);
sortedchars = chars(ind);
mydata{1} = sortednums;
mydata{2} = sortedchars;
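Both answers rely on the second output of sort. A minimal sketch, assuming mydata is a 1x2 cell like the one textscan returns (the values below are the question's sample data, typed in by hand):

```matlab
% Sample data from the question, written out as a 1x2 cell (assumed layout)
mydata = {[.51; .57; .48; .47], ['U'; 'D'; 'U'; 'D']};

% Sort the numeric column and keep the permutation index
[mydata{1}, idx] = sort(mydata{1});

% Reorder the character column with the same index
mydata{2} = mydata{2}(idx);

% The logical flag from the question still works on the sorted cell
isUndetected = (mydata{2} == 'U');
```

Because only the contents of each cell are reordered, mydata stays a 1x2 cell and later code that indexes mydata{1} or mydata{2} should not need to change.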

Related

operations with structure in matlab

I have a 1x300 structure called struct with 3 fields, but I'm using only the third field, called way. For each of the 300 elements, this field is a vector of indices.
Here is an example with 3 elements to explain my problem: I would like to check whether the last index of the first vector is present in any other vector of the field way.
way
[491751 491750 491749 492772 493795 494819 495843 496867]
[491753 491754 491755 491756]
[492776 493800 494823 495847 496867]
I tried the intersect function:
Inter=intersect(struct(1).way(end), struct.way);
but Matlab returns an error:
Error using intersect (line 80)
Too many input arguments.
Error in file2 (line 9)
Inter=intersect(struct(1).way(end), struct.way);
I don't understand why I get this error. Is there an explanation or another solution?
Let the data be defined as
st(1).way = [491751 491750 491749 492772 493795 494819 495843 496867];
st(2).way = [491753 491754 491755 491756];
st(3).way = [492776 493800 494823 495847 496867]; % define the data
sought = st(1).way(end);
If you want to know which vectors contain the desired value: pack all vectors into a cell array and pass that to cellfun with an anonymous function as follows:
ind = cellfun(@(x) ismember(sought, x), {st.way});
This gives:
ind =
1×3 logical array
1 0 1
If you want to know, for each vector, the indices of the matches: modify the anonymous function to output a cell with the indices:
ind = cellfun(@(x) {find(x==sought)}, {st.way});
or equivalently
ind = cellfun(@(x) find(x==sought), {st.way}, 'UniformOutput', false);
The result is:
ind =
1×3 cell array
[8] [1×0 double] [5]
Or, to exclude the reference vector:
n = 1; % index of vector whose final element is sought
ind = cellfun(@(x) {find(x==st(n).way(end))}, {st([1:n-1 n+1:end]).way});
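Put together, the steps above can be sketched as one short script. The names hasMatch and matchingRows are introduced here for illustration only:

```matlab
% Example data from the question
st(1).way = [491751 491750 491749 492772 493795 494819 495843 496867];
st(2).way = [491753 491754 491755 491756];
st(3).way = [492776 493800 494823 495847 496867];

n = 1;                        % index of the reference vector
sought = st(n).way(end);      % its last element

% One logical per vector: does it contain the sought value?
hasMatch = cellfun(@(x) ismember(sought, x), {st.way});

% Ignore the reference vector itself
hasMatch(n) = false;
matchingRows = find(hasMatch);
```

Masking the reference entry afterwards keeps the result aligned with the original struct indices, which the [1:n-1 n+1:end] variant does not.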
You probably want to use ismember.
Also consider what you are passing to intersect/ismember: struct.way expands to a comma-separated list of all 300 vectors, so each one is passed as a separate argument, which is why intersect complains about too many input arguments. You may need to loop over each element of your struct (a cell array, or a matrix with equal-length rows, would be easier to work with).
output = zeros(300);
for ii = 1:300
for jj = 1:300
if ii ~= jj && ismember(struct(ii).way(end), struct(jj).way)
output(ii,jj) = 1;
end
end
end
Now you have a matrix output where the elements which are 1 identify a match between the last element in way in the struct row ii and the vector struct(jj).way, where ii are the matrix row numbers and jj the column numbers.

Accessing data from cell arrays at the same time

Suppose A is an x-by-y cell array in which each cell holds an array of size m-by-n and of type double.
Example: If A is a 1x2 cell array
A =
[100x1 double] [100x1 double]
Suppose I want to access the first element of each cell at the same time. How can I do so?
Similarly, if we need to access the ith element from every cell, how do we generalise the code?
Cell creation with two 1x10 arrays:
A{1} = zeros(1,10);
A{2} = zeros(1,10);
A =
[1x10 double] [1x10 double]
Adding some data which will be used for fetching later:
A{1}(5) = 5;
A{2}(5) = 10;
Routine to fetch the data at the same index from both arrays inside the cell:
cellfun(@(x) x(5), A)
ans =
5 10
As User1551892 suggested, you could use cellfun. Another way is to convert the cell to a matrix first.
The speed of the operation depends on the number of cells and the size of the matrix within each cell.
% Number of cells
x = 3;
y = 2;
% Size of matrix
m = 1;
n = 100;
% Add some random numbers
A = cell(x,y);
for i = 1:numel(A)
A{i} = round(rand(m,n)*100);
end
% Index to pick in each matrix
idx = 5;
% Convert to matrix
B = [A{:}];
% Pick the number
val = B(idx:(n*m):end);
Some tic/toc measurements show that the method above is faster for the example values. As long as one of n or m is small, the method is fine.
But if both m and n grow large, cellfun is faster:
val = cellfun(@(x) x(idx), A);
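A small check that the two approaches return the same values, using hand-picked vectors instead of random data (the contents of A below are hypothetical):

```matlab
% Three 1x10 vectors with known contents
A = {10:10:100, 1:10, 101:110};
idx = 5;                             % element to pick from every cell

% Approach 1: cellfun with an anonymous function
viaCellfun = cellfun(@(x) x(idx), A);

% Approach 2: concatenate, then step through with a stride of numel per cell
B = [A{:}];
viaMatrix = B(idx:numel(A{1}):end);
```

The strided version assumes every cell holds the same number of elements; cellfun does not need that assumption as long as each cell has at least idx elements.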
An alternative is to simply access the cell elements directly. For example, take a cell like the one you defined:
A{1}(1:10) = randi([2 5],1,10);
A{2}(1:10) = randi([2 5],1,10);
Now, if you want to access the ith elements, simply set i and they will be collected in the matrix below:
i = 3;
ObsMatrix = [A{1}(i) A{2}(i)]
ObsMatrix =
2 5
If A has an unknown number of cells, you can simply use a for loop. It picks the ith element from every cell and puts it in ObsMat:
i = 3;
ObsMat = zeros(1, numel(A)); % preallocate the result
for j = 1:numel(A)
    ObsMat(j) = A{j}(i); % ith element of the jth cell
end
cellfun is also essentially a wrapper around a for loop.
ObsMat =
2 5

Extract Matrix columns and store them in individual vectors

I have a matrix A of size MxN, where M is large and N is around 30. Is there a way to split it into its columns with something like:
[A,B,C,...,AD] = A(:,1:30)
The reason I am asking is that I would like to give the columns specific names (here A, B, C, ..., AD) without being forced to write:
[A,B,C,...,AD] = deal(A(:,1),A(:,2),A(:,3),...,A(:,30))
It's usually better to keep all columns together in the matrix and just access them through their column index.
Anyway, if you really need to separate them into variables, you can convert the matrix to a cell array of its columns with num2cell, and then generate a comma-separated list to be used in the right-hand side of the assignment. Note also that in recent Matlab versions you can remove deal:
A = magic(3); % example matrix
Ac = num2cell(A, 1);
[c1 c2 c3] = Ac{:}; % or [c1 c2 c3] = deal(Ac{:});
For generating that lexicographical sequence I recently, out of ignorance, wrote this
Data = rand(2,671);
r = rem(size(Data,2),26);
m = floor(size(Data,2)/26);
Alf = char('A'+(0:25)'); %TeX-like char seq
if m == 0
zzz = Alf(1:r);
else
zzz = Alf;
for x = 1:m-1
zzz = char(zzz,[char(Alf(x)*ones(26,1)),Alf]);
end
if r > 0
zzz = char(zzz, [char(Alf(m)*ones(r,1)),Alf(1:r)] ); % the loop above ends at prefix Alf(m-1), so the next prefix is Alf(m)
end
end
Depending on the number of columns, it generates column names up to ZZ. Please let me know if there is a ready-made command for this in Matlab.
You would never ever use eval for such things!!! eval use is dangerous and wrong (but you can't resist):
% ==========
% Assign Data to indices
% ==========
for ind = 1:size(Data,2)
eval([zzz(ind,:) '= Data(:,' num2str(ind) ');']);
end
and your workspace looks like an alphabet soup.
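If named access is really needed, one way to avoid eval is a struct with dynamic field names. This is only a sketch: the names cell array and the cols struct are hypothetical and not part of the answers above.

```matlab
Data = magic(4);                      % example matrix (assumption)
names = {'A', 'B', 'C', 'D'};         % one hypothetical name per column

cols = struct();
for ind = 1:size(Data, 2)
    % Dynamic field name: cols.A, cols.B, ... instead of eval
    cols.(names{ind}) = Data(:, ind);
end
```

The columns are then available as cols.A, cols.B and so on, while the workspace holds a single variable.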

Random permutation of each cell in a cell array

I have a 1-by-4 cell array, D. Each cell contains a 2-by-2 double matrix. I want to randomly permute the elements of each matrix independently, so that the result is a cell array of the same size as D with permuted matrices, and then invert the permutation to recover the original D.
For a single matrix I have the following code, which works well:
A=rand(3,3)
p=randperm(numel(A));
A(:)=A(p)
[p1,ind]=sort(p);
A(:)=A(ind)
but it doesn't work for a cell array.
The simplest solution for you is to use a loop:
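nd = numel(D);
D_permuted = cell(1, nd); % preallocate the outputs
D_ind = cell(1, nd);
for d = 1:nd
    A = D{d};
    p = randperm(numel(A)); % random permutation of the linear indices
    A(:) = A(p);
    [~, ind] = sort(p); % ind is the inverse permutation
    D_permuted{d} = A;
    D_ind{d} = ind;
end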
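The stored inverse indices can then be used to undo the shuffle. A self-contained sketch of the round trip, with hypothetical contents for D and a hypothetical D_restored output:

```matlab
% Hypothetical 1-by-4 cell of 2-by-2 matrices
D = {[1 2; 3 4], [5 6; 7 8], [9 10; 11 12], [13 14; 15 16]};
nd = numel(D);

% Forward pass: permute each matrix and remember the inverse permutation
D_permuted = cell(1, nd);
D_ind = cell(1, nd);
for d = 1:nd
    A = D{d};
    p = randperm(numel(A));
    A(:) = A(p);
    [~, ind] = sort(p);
    D_permuted{d} = A;
    D_ind{d} = ind;
end

% Backward pass: apply the inverse permutation to recover D
D_restored = cell(1, nd);
for d = 1:nd
    A = D_permuted{d};
    A(:) = A(D_ind{d});
    D_restored{d} = A;
end
```

Since sort(p) returns the index vector that maps p back to 1:numel(A), indexing the permuted matrix with it puts every element back in its original position.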
Assuming your D matrix is just a list of identically sized (e.g. 2-by-2) matrices, then you could avoid the loop by using a 3D double matrix instead of the cell-array.
For example, if you had a D like this:
n = 5;
D = repmat([1,3;2,4],1,1,n)*10 %// Example data
Then you can do the permutation like this
m = 2*2; %// Here m is the product of the dimensions of each matrix you want to shuffle
[~,I] = sort(rand(m,n)); %// This is just a trick to get the equivalent of a vectorized form of randperm as unfortunately randperm only accepts scalars
I = bsxfun(@plus, I, (0:n-1)*m); %// offset each column so that its indices point into its own 2-by-2 slice
idx = reshape(I,2,2,n);
D_shuffled = D(idx);
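When the slices are not identical, the linear indices have to point into the correct slice, which means adding an offset of m per slice to the sorted indices. A quick check with distinct slices (the data below are hypothetical):

```matlab
n = 3;
D = reshape(1:2*2*n, 2, 2, n);        % distinct values in every slice
m = 2*2;                              % elements per slice

[~, I] = sort(rand(m, n));            % each column is a permutation of 1:m
I = bsxfun(@plus, I, (0:n-1)*m);      % shift column k into slice k's index range
D_shuffled = D(reshape(I, 2, 2, n));
```

Each slice of D_shuffled then contains exactly the elements of the corresponding slice of D, in a random order.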

3D cell arrays in matlab

I am currently working in Matlab and have loaded a csv file into a cell array named B. I now want to put the contents of B into a 3-D cell array whose third dimension corresponds to the first column of B, which holds strings ranging from "chr1" to "chr24". The full length of B is m, and the maximum length of any "chr" is maxlength. I doubt that this is the best way of going about it, but here is my code:
for j = 1:m ,
Ind = findstr(B{1}{j}, 'chr');
Num = B{1}{j}(Ind+3:end-1);
cnum = str2num(Num);
for i = 1:24,
if cnum == i;
for k = 2:9 ,
for l = 1:maxlength ,
C{l}{k}{i} = B{k}{j};
C{l}{k}{i}
end
end
end
end
end
The 3-D array that comes out of this does not match the corresponding values in the initial array. I also want to know if this is the right way to create a 3-D array, I can't seem to find anything on the matlab website about them.
Thanks
There are a few possible issues with your approach: First of all, Matlab indexing is different from c-style indexing into tables. myCell{i}{j} is the j-th element of the cell array that is contained in the i-th element of the cell array myCell. If you want to index into a 2-d cell array, you would get the contents of the element in row i, column j as myCell{i,j}.
If the columns 2 through 9 of your .csv file contain all numeric data, it may be a lot more convenient to use either a 1D cell array with an entry for every chromosome, or to use a 2D or 3D numeric array if you get, for each chromosome, a single row, or a table, respectively.
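The difference between nested indexing and multi-dimensional indexing can be seen in a tiny example. The variable names below are made up for illustration:

```matlab
% A cell whose elements are themselves cells: chained braces
nested = {{1, 2}, {3, 4}};
a = nested{2}{1};          % first element of the second inner cell

% A genuine 2-D cell array: one pair of braces, two subscripts
flat = {1, 2; 3, 4};
b = flat{2, 1};            % row 2, column 1

% A 3-D cell array is created and indexed the same way, with three subscripts
C = cell(2, 2, 3);
C{1, 2, 3} = 'chr3';
sz = size(C);
```

Here a and b are both 3, but nested is a 1-by-2 cell while flat is 2-by-2; only cell(2, 2, 3) gives an actual 3-D cell array.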
Here's one way to do it
%# convert chromosomes to numbers
chromosomes = B{1};
chromosomes = strrep(chromosomes,'X','25'); %# strrep needs a string as replacement
chromosomes = strrep(chromosomes,'Y','26');
tmp = regexp(chromosomes,'chr(\d+)','tokens','once');
cnum = cellfun(@(x)str2double(x{1}),tmp);
%# catenate the rest of B into a 2D array
allNumbers = cell2mat(cat(2,B{2:end}));
%# now we can make a table with [chromosomeNumber,allOtherNumbers]
finalTable = [cnum,allNumbers]
%# alternatively, if there are multiple entries for each chromosome, we can
%# group the data in a cell array, so that the i-th entry corresponds to chr.i
%# for readability, use a loop
outputCell = cell(26,1); %# assume 26 chromosomes
for i=1:26
outputCell{i} = allNumbers(cnum==i,:);
end
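The parsing and grouping steps can be tried on a few made-up labels. The chromosomes and allNumbers values below are hypothetical stand-ins for the contents of B:

```matlab
% Hypothetical labels and numeric columns standing in for B
chromosomes = {'chr1'; 'chrX'; 'chr12'; 'chrY'; 'chr1'};
allNumbers = [10 11; 20 21; 30 31; 40 41; 50 51];

% Map X and Y to 25 and 26, then extract the number after 'chr'
chromosomes = strrep(chromosomes, 'X', '25');
chromosomes = strrep(chromosomes, 'Y', '26');
tmp = regexp(chromosomes, 'chr(\d+)', 'tokens', 'once');
cnum = cellfun(@(x) str2double(x{1}), tmp);

% Group the rows by chromosome number
outputCell = cell(26, 1);
for i = 1:26
    outputCell{i} = allNumbers(cnum == i, :);
end
```

Chromosomes with no rows end up as empty 0-by-2 arrays in outputCell, so the i-th entry always corresponds to chromosome i.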
I've managed to do this with only two for loops, here is my code:
C = zeros(26,8,maxlength);
next = zeros(1,26);
for j = 1:m ,
Ind = findstr(B{1}{j}, 'chr');
Num = B{1}{j}(Ind+3:end-1);
cnum = str2num(Num);
if Num == 'X'
cnum = 25;
end
if Num == 'Y'
cnum = 26;
end
next(cnum) = next(cnum) + 1;
for k = 2:9 ,
D{cnum}{k-1}{next(cnum)} = B{k}{j};
C(cnum,k-1,next(cnum)) = str2num(B{k}{j});
end
end