Identifying uniques in a cell array

I have a 45x2 cell in MATLAB, with the first column an arbitrarily sized matrix of doubles.
Some of these matrices are repeated, whilst others aren't. I'm attempting to extract only the unique matrices (while recording the number of repeats) and keep the second column as is.
I've tried a number of things (tabulate, hist et al) but they all fail because of the cell structure (I think). How would one go about doing this, short of looping through each of them individually?

If you convert your matrices to strings, you can run unique on them:
%# create a sample cell array
mc = {magic(3);magic(4);magic(4);magic(5);magic(3);magic(4)}
%# convert to strings
mcs = cellfun(@(x)(mat2str(x)),mc,'uniformoutput',false);
%# run unique
[uniqueCells,idxOfUnique,idxYouWant] = unique(mcs);
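The question also asks for the number of repeats, which the third output of unique makes easy to recover. A minimal sketch, assuming the same sample data as above; uniqueMatrices and repeatCounts are illustrative names, not part of the original answer:

```matlab
% Sample cell array with repeated matrices
mc = {magic(3); magic(4); magic(4); magic(5); magic(3); magic(4)};

% Convert each matrix to its string form so unique can compare them
mcs = cellfun(@mat2str, mc, 'UniformOutput', false);

% idxOfUnique picks one representative per distinct matrix,
% idxYouWant maps every original cell to its group number
[~, idxOfUnique, idxYouWant] = unique(mcs);

% The unique matrices themselves (ordered by their string form)
uniqueMatrices = mc(idxOfUnique);

% Count how many original cells fall into each group
repeatCounts = accumarray(idxYouWant(:), 1);
```

Here repeatCounts(k) is the number of times uniqueMatrices{k} occurs in mc. The same idxOfUnique can be used to pick the matching rows of the second column of the original 45x2 cell.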

Related

calculating the number of columns in a row of a cell array in matlab

I've got a cell array full of numbers, with 44 rows and a different number of columns in each row.
How can I calculate the number of columns in each row (that is, the columns whose contents are not empty)?
I've tried two different approaches, and both were wrong.
The first one:
%a is the cell array
s=length(a)
It gives 44, which is the number of rows.
The second one:
[row, columns]=size(a)
But it doesn't work either, because the number of columns is different in each row.
What I mean is the number of columns that are not empty.
For example, I need the number of columns in row one, which is 43 (a{1,1:43}), but what I get is the number of columns of an individual element such as a{1,1}, which is 384, or a{1,2}, a{1,3}, and so on.

You need to access each member of the cell array separately: you are looking for the size of the data contained in the cell, and the cell is only the container. Two methods:
for loop:
cell_content_lengths=zeros(1,length(a));
for v=1:length(a)
    cell_content_lengths(v)=length(a{v});
end
cellfun:
cell_content_lengths=cellfun(@length,a);
Any empty cells will just have length 0. Extending the for loop to matrices is trivial, and you can extend the cellfun version to cells containing matrices with something like this, if you are interested:
cell_content_sizes=cell2mat(cellfun(@size,a,'uniformoutput',false));
(Note that for the above, each element of a needs to have the same number of dimensions; otherwise it will give errors about concatenating matrices of different sizes.)
EDIT
Based on your comment I think I understand what you are looking for:
non_empty_cols = sum(~cellfun(@isempty,a),2);
With thanks to @MZimmerman6 who understood it before me.

So what you're really asking, is "How many non-empty elements are in each row of my cell array?"
filledCells = ~cellfun(@isempty,a);
columns = sum(filledCells,2);
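A small worked example may make the result easier to check. This is only a sketch with made-up data: a 3-by-4 cell array stands in for the 44-row one, and columnsPerRow is an illustrative name.

```matlab
% Build a 3-by-4 cell array; cells that are never assigned stay empty ([])
a = cell(3, 4);
a(1, 1:3) = {rand(1, 5), rand(1, 2), 7};   % row 1: three filled cells
a(2, 1)   = {[1 2 3]};                     % row 2: one filled cell
a(3, 1:4) = {1, 2, 3, 4};                  % row 3: four filled cells

% Logical mask of the non-empty cells
filledCells = ~cellfun(@isempty, a);

% Sum along dimension 2 to count the filled cells in each row
columnsPerRow = sum(filledCells, 2);
```

For this data columnsPerRow is [3; 1; 4]. Note that the size of the data inside each cell (for example the 1-by-5 vector in a{1,1}) plays no role in the count.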

Using ismember() with cell arrays containing vectors

I am using a cell array to contain 1x2 vectors of grid locations in the form [row, col].
I would like to check if another grid location is included in this cell array.
Unfortunately, my current code results in an error, and I cannot quite understand why:
in_range = ismember( 1, ismember({[player.row, player.col]}, proximity(:,1)) );
Here player.row and player.col are integers, and proximity's first column is the aforementioned cell array of grid locations.
The error I am receiving is:
??? Error using ==> cell.ismember at 28
Input must be cell arrays of strings.
Unfortunately, I have not been able to find any information regarding using ismember() in this fashion, only with cell arrays as strings or with single integers in each cell rather than vectors.
I have considered converting using num2str() and str2num(), but since I must perform calculations between the conversions, and due to the number of iterations the code will be looped for (10,000 loops, 4 conversions per loop), this method seems prohibitive.
Any help here would be greatly appreciated, thank you
EDIT: Why does ismember() return this error? Does it treat all vectors in a cell array as string arrays?
EDIT: Would there be a better / more efficient method of determining if a 1 is in the returned vector than
ismember( 1, ismember(...))?

I'm short of time at the moment (being Chrissy eve and all), so this is going to have to be a very quick answer.
As I understand it, the problem is to find if an x y coordinate lies in a sequence of many x y coordinates, and if so, the index of where it lies. If this is the case, and if you're interested in efficiency, then it is wasteful to mess around with strings or cell arrays. You should be using numeric matrices/vectors for this.
So, my suggestion: Convert the first column of your cell array to a numeric matrix. Then, compare your x y coordinates to the rows of this numerical matrix. Because you only want to know when both coordinates match a row of the numerical matrix, use the 'rows' option of ismember: it returns true only when an entire row matches, rather than a single element.
Some example code that will hopefully help follows:
%# Build an example cell array with coordinates in the first column, and random strings in the second column
CellOfLoc = {[1 2], 'hello'; [3 4], 'world'; [5 6], '!'};
%# Convert the first column of the cell array to a numerical matrix
MatOfLoc = cell2mat(CellOfLoc(:, 1));
%# Build an example x y coordinate location to test
LocToTest = [5 6];
%# Call ismember, being sure to use the rows option
Index = ismember(MatOfLoc, LocToTest, 'rows');
Note, if the indices in your cell array are in string form, then obviously you'll also need a call to str2num in there somewhere before you call ismember.
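For the second EDIT in the question, the logical vector returned by ismember can be reduced with any, and find gives the matching position. A minimal sketch, assuming the variable names from the question (player, proximity, in_range) with made-up values; matches and matchIndex are illustrative names:

```matlab
% Hypothetical stand-ins for the question's variables
proximity = {[1 2], 'hello'; [3 4], 'world'; [5 6], '!'};
player = struct('row', 5, 'col', 6);

% Stack the 1x2 location vectors into an N-by-2 numeric matrix
locations = cell2mat(proximity(:, 1));

% One logical per row of locations: true where the whole row matches
matches = ismember(locations, [player.row, player.col], 'rows');

% True if the player's location appears anywhere in the list
in_range = any(matches);

% Row number(s) of the match, empty if there is none
matchIndex = find(matches);
```

If the list of locations does not change between iterations, the cell2mat conversion can be done once outside the loop, so that each iteration only pays for the ismember call.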
One other thing, I notice you're a new member, so welcome to the site. If you think this response satisfactorily answered your question, then please mark the question answered by clicking the tick mark next to this response.

What's an appropriate data structure for a matrix with random variable entries?

I'm currently working in an area that is related to simulation and trying to design a data structure that can include random variables within matrices. To motivate this let me say I have the following matrix:
[a b; c d]
I want to find a data structure that will allow for a, b, c, d to either be real numbers or random variables. As an example, let's say that a = 1, b = -1, c = 2 but let d be a normally distributed random variable with mean 0 and standard deviation 1.
The data structure that I have in mind will give no value to d. However, I also want to be able to design a function that can take in the structure, simulate a uniform(0,1), obtain a value for d using an inverse CDF and then spit out an actual matrix.
I have several ideas to do this (all related to the MATLAB icdf function) but would like to know how more experienced programmers would do this. In this application, it's important that the structure is as "lean" as possible since I will be working with very very large matrices and memory will be an issue.
EDIT #1:
Thank you all for the feedback. I have decided to use a cell structure and store random variables as function handles. To save some processing time for large scale applications, I have decided to reference the location of the random variables to save time during the "evaluation" part.

One solution is to create your matrix initially as a cell array containing both numeric values and function handles to functions designed to generate a value for that entry. For your example, you could do the following:
generatorMatrix = {1 -1; 2 @randn};
Then you could create a function that takes a matrix of the above form, evaluates the cells containing function handles, then combines the results with the numeric cell entries to create a numeric matrix to use for further calculations:
function numMatrix = create_matrix(generatorMatrix)
  index = cellfun(@(c) isa(c,'function_handle'),...  %# Find function handles
                  generatorMatrix);
  generatorMatrix(index) = cellfun(@feval,...        %# Evaluate functions
                                   generatorMatrix(index),...
                                   'UniformOutput',false);
  numMatrix = cell2mat(generatorMatrix);  %# Change from cell to numeric matrix
end
Some additional things you can do would be to use anonymous functions to do more complicated things with built-in functions or create cell entries of varying size. This is illustrated by the following sample matrix, which can be used to create a matrix with the first row containing a 5 followed by 9 ones and the other 9 rows containing a 1 followed by 9 numbers drawn from a uniform distribution between 5 and 10:
generatorMatrix = {5 ones(1,9); ones(9,1) @() 5*rand(9)+5};
And each time this matrix is passed to create_matrix it will create a new 10-by-10 matrix where the 9-by-9 submatrix will contain a different set of random values.
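To check what the block example produces, the same steps can be run inline without defining a function. This is a sketch only; isHandle and realized are illustrative names, and the logic simply mirrors the function-handle evaluation described above.

```matlab
% Block layout: a scalar, a 1x9 row, a 9x1 column and a 9x9 random block
generatorMatrix = {5, ones(1, 9); ones(9, 1), @() 5*rand(9) + 5};

% Locate the cells that hold function handles
isHandle = cellfun(@(c) isa(c, 'function_handle'), generatorMatrix);

% Replace each handle by the value it generates
realized = generatorMatrix;
realized(isHandle) = cellfun(@feval, generatorMatrix(isHandle), ...
                             'UniformOutput', false);

% Stitch the blocks together into one 10-by-10 numeric matrix
numMatrix = cell2mat(realized);
randomBlock = numMatrix(2:end, 2:end);   % the 9x9 part drawn from [5, 10]
```

Because generatorMatrix itself is left untouched, repeating the last three statements draws a fresh random block each time.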
An alternative solution...
If your matrix can be easily broken into blocks of submatrices (as in the second example above) then using a cell array to store numeric values and function handles may be your best option.
However, if the random values are single elements scattered sparsely throughout the entire matrix, then a variation similar to what user57368 suggested may work better. You could store your matrix data in three parts: a numeric matrix with placeholders (such as NaN) where the randomly-generated values will go, an index vector containing linear indices of the positions of the randomly-generated values, and a cell array of the same length as the index vector containing function handles for the functions to be used to generate the random values. To make things easier, you can even store these three pieces of data in a structure.
As an example, the following defines a 3-by-3 matrix with 3 random values stored in indices 2, 4, and 9 and drawn respectively from a normal distribution, a uniform distribution from 5 to 10, and an exponential distribution:
matData = struct('numMatrix',[1 nan 3; nan 2 4; 0 5 nan],...
                 'randIndex',[2 4 9],...
                 'randFcns',{{@randn , @() 5*rand+5 , @() -log(rand)/2}});
And you can define a new create_matrix function to easily create a matrix from this data:
function numMatrix = create_matrix(matData)
  numMatrix = matData.numMatrix;
  numMatrix(matData.randIndex) = cellfun(@feval,matData.randFcns);
end

If you were using NumPy, then masked arrays would be the obvious place to start, but I don't know of any equivalent in MATLAB. Cell arrays might not be compact enough, and if you did use a cell array, then you would have to come up with an efficient way to find the non-real entries and replace them with a sample from the right distribution.
Try using a regular or sparse matrix to hold the real values, and leave it at zero wherever you want a random variable. Then alongside that store a sparse matrix of the same shape whose non-zero entries correspond to the random variables in your matrix. If you want, the value of the entry in the second matrix can be used to indicate which distribution (ie. 1 for uniform, 2 for normal, etc.).
Whenever you want to get a purely real matrix to work with, you iterate over the non-zero values in the second matrix to convert them to samples, and then add that matrix to your first.
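The two-matrix idea can be sketched roughly as follows, using the 2-by-2 example from the question. The coding (1 for uniform, 2 for normal) follows the suggestion above; all variable names are illustrative, and the sampling is vectorized per distribution rather than iterating entry by entry.

```matlab
% Real-valued part, with 0 left wherever a random variable belongs
baseValues = [1 -1; 2 0];

% Same-shaped sparse matrix of distribution codes:
% 1 = uniform(0,1), 2 = standard normal, 0 = no random variable here
distCode = sparse([0 0; 0 2]);

% Positions and codes of the random entries
[rowIdx, colIdx, codes] = find(distCode);

% Draw one sample per random entry, according to its code
samples = zeros(size(codes));
samples(codes == 1) = rand(nnz(codes == 1), 1);
samples(codes == 2) = randn(nnz(codes == 2), 1);

% Scatter the samples back to their positions and add the real part
sampleMatrix = sparse(rowIdx, colIdx, samples, ...
                      size(distCode, 1), size(distCode, 2));
realized = baseValues + full(sampleMatrix);
```

Only the positions and codes of the random entries are stored, so the extra memory scales with the number of random variables rather than with the size of the matrix.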

Extracting data points from a matrix and saving them in different matrixes in MATLAB

I have a 2D matrix of coordinates, for example Data(X,Y):
45.987543423,5.35000964
52.987544223,5.98765234
I also have an array of integers >= 0, for example Cluster(M):
2,0,3,1
Each number in this array corresponds to a row of the 2D matrix above. For example, row one (a coordinate) of the Data matrix belongs to cluster 2, the second row belongs to cluster 0, and so on.
Now I want the data points of each cluster in a separate matrix: the data points belonging to cluster 1 in one matrix, those of cluster 2 in another, and so on.
I can do this manually, but the extraction has to be automatic, because the number of clusters (the range of the numbers in the cluster array) varies in each run. So I need a general algorithm that does this extraction for me. Can someone help me, please? Thanks.

Instead of dynamically creating a bunch of matrices, I would create a cell array with each matrix in a separate cell. Here's one way to do this, using the functions SORT and MAT2CELL:
[cluster,sortIndex] = sort(cluster); %# Sort cluster and get sorting index
data = data(sortIndex,:); %# Apply the same sorting to data
%# Find the size of each cluster
clusterCounts = diff([0 find(diff(cluster)) numel(cluster)]);
%# Break up data into matrices, each in a separate cell
cellArray = mat2cell(data,clusterCounts,2);
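Tracing the steps on a small made-up data set shows what each intermediate value looks like. This is a sketch assuming cluster is a row vector (the concatenation inside diff relies on that); sortedCluster and sortedData are illustrative names used so the inputs are not overwritten.

```matlab
% Five made-up data points and their cluster labels (row vector)
data    = [10 11; 20 21; 30 31; 40 41; 50 51];
cluster = [2 0 3 0 2];

% Sort the labels so that members of a cluster become adjacent
[sortedCluster, sortIndex] = sort(cluster);   % [0 0 2 2 3]
sortedData = data(sortIndex, :);

% Positions where the label changes mark the end of each cluster,
% so the differences between those positions are the cluster sizes
clusterCounts = diff([0 find(diff(sortedCluster)) numel(sortedCluster)]);

% Split the sorted rows into one cell per cluster
cellArray = mat2cell(sortedData, clusterCounts, 2);
```

Here clusterCounts is [2 2 1], and cellArray{1}, cellArray{2} and cellArray{3} hold the rows of clusters 0, 2 and 3 respectively. The cells follow the sorted label order, so unique(cluster) gives the label that belongs to each cell.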

You can use ARRAYFUN to distribute the coordinates among different cell arrays.
%# create sample data
clusterIdx = [2,0,3,1,1,1,3,2];
coordinates = rand(8,2);
%# first you get a list of unique cluster indices
clusterIdxUnique = unique(clusterIdx);
%# then you use arrayfun to distribute the coordinates
clusterCell = arrayfun(@(x)coordinates(clusterIdx==x,:),clusterIdxUnique,'UniformOutput',false);
The first element of clusterCell contains the coordinates corresponding to the first entry in clusterIdxUnique, etc.

I guess this is the solution:
data(cluster == i, :)
where i is the index of the cluster. Your index matrix is converted to a boolean matrix and then used to index the rows and each selected row is completely added to the resulting matrix.
If this is not what you're looking for, please specify your needs more clearly.

Thanks everyone, I managed to make it work with this code:
noOfClusters = max(cluster); %without noise
for i=1:noOfClusters
C(i,1) = {numData(cluster==i,:)}
end
I assume your code is much faster, because you don't use for loops.

I would either create a 3 dimensional array or table. That way the cluster index would be associated with the cluster. Something like the following construct:
xData = Data(:,1);
yData = Data(:,2);
clusterTable = table(Cluster, xData, yData);
This creates a table with column names and each row having a cluster index and a set of coordinates.
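Once the data is in a table, the rows of any one cluster can be pulled out with logical indexing on the Cluster variable. A brief sketch with made-up values; Cluster must be a column vector with one entry per row of Data, and rowsInCluster2 and pointsInCluster2 are illustrative names:

```matlab
% Made-up coordinates and cluster labels (one label per row)
Data    = [45.98 5.35; 52.98 5.98; 47.10 6.02; 51.30 5.40];
Cluster = [2; 0; 3; 2];

% Build the table as described above
xData = Data(:, 1);
yData = Data(:, 2);
clusterTable = table(Cluster, xData, yData);

% Select the rows that belong to cluster 2
rowsInCluster2 = clusterTable(clusterTable.Cluster == 2, :);

% Convert the coordinates of that cluster back to a plain matrix
pointsInCluster2 = [rowsInCluster2.xData, rowsInCluster2.yData];
```

This keeps everything in one variable regardless of how many clusters a run produces.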

Using matlab and Time Series object (fints), how can I make an array of them?

I am getting stock prices from yahoo, and want to have each stock have its own time series data structure, but also don't want to have hundreds of variables, so naturally I would want to have an array, but when I do something like array = [stock1 stock2]; it actually merges the series together. How can I make a real array?
Thanks,
CP

The [x x] notation in MATLAB does not create an array of separate objects; it concatenates its contents, on the assumption that what you're putting together belongs together. What you probably want is a cell array, which is indexed with curly braces, i.e. myArray{1} = stock1; myArray{2} = stock2;.
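A short sketch of the cell array approach follows. Plain numeric vectors are used as hypothetical stand-ins for the fints objects so the example runs without any toolbox; a cell can hold objects of any class, including series of different lengths.

```matlab
% Hypothetical stand-ins for two time series objects
stock1 = [10.1; 10.4; 10.2];
stock2 = [55.0; 54.2; 56.1; 57.3];

% Each series goes into its own cell, so nothing is merged
stocks = {stock1, stock2};

% More series can be appended by assigning to the next cell
stocks{3} = [7.7; 7.9];

% Curly braces return the stored series itself
secondSeries = stocks{2};
numSeries = numel(stocks);
```

Looping over 1:numel(stocks) then processes every series without needing hundreds of separately named variables.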

Ah, since you have row vectors, [stock1 stock2] is a concatenation. If you want to create a 2-by-x array instead, do something like this [stock1; stock2], which will place one array above the other.

Joining vectors using [x y] has different results depending on whether your vectors are rows or columns. If rows, then joining them with [x y] makes a longer row vector, but if columns, you'll get an Nx2 matrix. You should probably convert them to column vectors using the TRANSPOSE operator thus: [x' y']. Although you should check whether transpose means the same thing with Time Series objects as it does with regular vectors.
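The difference is easy to see with ordinary numeric vectors. The sketch below uses plain doubles only; whether time series objects behave the same way under these operators is not verified here.

```matlab
% Two row vectors of equal length
x = [1 2 3];
y = [4 5 6];

% Horizontal concatenation of rows gives one longer row
rowJoin = [x y];      % 1-by-6

% A semicolon stacks the rows on top of each other
stacked = [x; y];     % 2-by-3

% Transposing first gives columns, which sit side by side
colJoin = [x' y'];    % 3-by-2
```

Both stacked and colJoin require x and y to have the same length, which is one more reason a cell array is the safer container for series of differing lengths.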
