Numeric and alphabetic symbols in the same matrix - MATLAB

I'm working on a model that uses MATLAB as a graphical representation for another model. I'd therefore like to have a matrix that can be updated with both letters and numbers. Numbers will represent a speed, while a character such as '-' may represent an empty section. In the MATLAB documentation and on the internet I found a lot of interesting tips, but not what I need.
Thanks in advance!

You cannot store data of numeric type (integers/floating points) and data of char type together in one matrix. However, you can use cell arrays, which are similar to matrices but can hold a different data type in each cell. Here's an example.
A={[1 2 3],'hello';'world',[4,5,6]'}
A =
[1x3 double] 'hello'
'world' [3x1 double]
Here the first cell contains a row vector, the second and third cells contain strings, and the fourth cell contains a column vector. Indexing into a cell array is similar to indexing into an ordinary array, with one minor difference: use {} instead of () around the indices to get a cell's contents. For example, to access the element in the second row, first column, do
A{2,1}
ans =
world
You can also access an element of an array inside a cell like
A{2,2}(2)
ans =
5
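For the use case in the question (speeds mixed with '-' for empty sections), a minimal sketch could look like the following. The variable names and values are made up for illustration.

```matlab
% Hypothetical track layout: numbers are speeds, '-' marks an empty section.
track = {50, '-', 30; '-', 45, '-'};

% Update a section: the empty section at (1,2) now has speed 60.
track{1,2} = 60;

% () returns a 1x1 cell, {} returns the stored value itself.
asCell  = track(1,1);   % a cell
asValue = track{1,1};   % a double

% Collect all numeric entries (in column-major order) for calculations.
isNum  = cellfun(@isnumeric, track);
speeds = [track{isNum}];
```

Here speeds ends up as a plain numeric row vector, so functions such as max or mean can be applied to it directly.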

If you're wanting to store mixtures of numeric and character type data, yoda has the correct suggestion: use cell arrays.
However, based on the example you described you may have another option. If the character entries in your matrix are there for the purpose of identifying "missing data", it may make more sense to use a purely numeric matrix containing unique values like NaN or Inf to identify data points that are empty or where data is not available.
When performing operations on your matrix, you would then have to index only elements that are finite (using, for example, ISFINITE) and perform your calculations on them. There are even some functions in the Statistics Toolbox that will perform operations ignoring NaN values. This may be a cleaner way to go since you can keep your matrix as a numeric type ('single' or 'double' precision) instead of having to mess with cell arrays.
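As a rough sketch of this idea, assuming a small made-up speed grid in which NaN marks an empty section:

```matlab
% Hypothetical speed grid: NaN marks an empty section.
speeds = [12 NaN 30; NaN 25 18];

% Logical mask of the entries that hold real data.
valid = isfinite(speeds);

% Compute statistics on the valid entries only.
meanSpeed = mean(speeds(valid));

% Updating an entry is a plain numeric assignment.
speeds(1,2) = 40;
numEmpty = sum(isnan(speeds(:)));
```

The matrix stays of class double throughout, so it can be passed to plotting or linear algebra functions without any conversion.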

Related

When to use a cell, matrix, or table in Matlab

I am fairly new to MATLAB and I am trying to figure out when it is best to use cells, tables, or matrices to store sets of data and then work with the data.
What I want is to store data that has multiple lines that include strings and numbers and then work with the numbers.
For example, a line would look like
'string 1', time, number1, number2
I know a matrix works best if all elements are numbers, but when I use a cell I keep having to convert the numbers or strings to a matrix in order to work with them. I am running MATLAB 2012, so maybe that is part of the problem. Any help is appreciated. Thanks!
Use a matrix when:
the tabular data has a uniform type (all are floating points like double, or integers like int32);
& either the amount of data is small, or is big and has static (predefined) size;
& you care about the speed of accessing data, or you need matrix operations performed on data, or some function requires the data organized as such.
Use a cell array when:
the tabular data has heterogeneous type (mixed element types, "jagged" arrays etc.);
| there's a lot of data and it has dynamic size;
| you only need to index the data numerically (no algebraic operations);
| a function requires the data as such.
Same argument for structs, only the indexing is by name, not by number.
Not sure about tables; I don't think they are offered by the language itself. They might be a UDT that I don't know of...
Later edit
These three types may be combined, in the sense that cell arrays and structs may have matrices, cell arrays and structs as elements (because they're heterogeneous containers). In your case, you might have 2 approaches, depending on how you need to access the data:
if you access the data mostly by row, then an array of N structs (one struct per row) with 4 fields (one field per column) would be the most effective in terms of performance;
if you access the data mostly by column, then a single struct with 4 fields (one field per column) would do; the first field would be a cell array of strings for the first column, the second field would be a cell array of strings or a 1D matrix of doubles depending on how you want to store your dates, and the rest of the fields would be 1D matrices of doubles.
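The two layouts might be sketched as follows. The field names (name, time, number1, number2) and the values are assumptions chosen to mirror the example line in the question.

```matlab
% Layout 1: a struct array, one struct per row.
rows = struct('name',    {'string 1', 'string 2'}, ...
              'time',    {0.5, 1.5}, ...
              'number1', {10, 20}, ...
              'number2', {3, 4});
secondRow  = rows(2);          % everything about one row
allNumber1 = [rows.number1];   % gather one field across all rows

% Layout 2: a single struct, one field per column.
cols.name    = {'string 1'; 'string 2'};
cols.time    = [0.5; 1.5];
cols.number1 = [10; 20];
cols.number2 = [3; 4];
total = sum(cols.number1 .* cols.number2);   % vectorized column math
```

With the second layout the numeric columns are already ordinary vectors, which avoids the repeated cell-to-matrix conversions mentioned in the question.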
Concerning tables: I always used matrices or cell arrays until I had to do database-related things such as joining datasets by a unique key; the only way I found to do this was by using tables. It takes a while to get used to them, and it's a bit annoying that some functions that work on cell arrays don't work on tables and vice versa. MATLAB could have done a better job explaining when to use one or the other, because it's not super clear from the documentation.
The situation that you describe seems to be as follows:
You have several columns. Entire columns consist of 1 datatype each, and all columns have an equal number of rows.
This seems to match exactly with the recommended situation for using a table. From the documentation:
T = table(var1,...,varN) creates a table from the input variables,
var1,...,varN . Variables can be of different sizes and data types,
but all variables must have the same number of rows.
Actually I don't have much experience with tables, but if you can't figure it out you can always switch to using 1 cell array for the first column, and a matrix for all others (in your example).
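For reference, a small sketch of the table approach, assuming a MATLAB release that includes the table type (the release mentioned in the question may not). The variable names and values are invented for illustration.

```matlab
% Column data; each variable has the same number of rows.
label   = {'string 1'; 'string 2'};
time    = [0.5; 1.5];
number1 = [10; 20];
number2 = [3; 4];

% The table picks up its column names from the input variable names.
T = table(label, time, number1, number2);

% Numeric columns come back as ordinary vectors.
total = T.number1 + T.number2;

% Rows can be selected with logical indexing.
late = T(T.time > 1, :);
```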

Storing arrays of different length in one matrix Matlab

I have a number of arrays of different sizes, e.g.
A=1:10; B=1:9 etc.
Now I want to save these arrays into one big matrix. In this example I would want it to be 2x10, with NaN for the remaining spot not filled by array B. I know how to preallocate this matrix with NaN(size), but my question here is how to get these arrays in with their different lengths. It must be a super simple command, but I just can't seem to think of it!
You need to specify the column indices:
>> BigMat = NaN(2,10);
>> BigMat(1, 1:numel(A) ) = A;
>> BigMat(2, 1:numel(B) ) = B;
Also take a look at cell arrays. They can contain a variety of different data types. For example
BigCell{1}=A;
BigCell{2}=B;
BigCell{3}='Some text string'
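The column-index approach extends to any number of arrays if they are first collected in a cell array. This is only a sketch; the third array is added here just to show the padding.

```matlab
% Arrays of different lengths, collected in a cell array.
arrays = {1:10, 1:9, 1:4};

% Preallocate with NaN, using the longest array as the width.
maxLen = max(cellfun(@numel, arrays));
BigMat = NaN(numel(arrays), maxLen);

% Copy each array into the leading columns of its row.
for k = 1:numel(arrays)
    BigMat(k, 1:numel(arrays{k})) = arrays{k};
end
```

Any position not covered by an array keeps its NaN value.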

How can I convert double values into integers for indices to create a sparse matrix in MATLAB?

I am using MATLAB to load a text file that I want to make a sparse matrix out of. The columns in the text file refer to the row and column indices and are loaded as double. I need them to be integers to be able to use them as indices for rows and columns. I tried using uint8, int32 and int64 to convert them to integers before building the sparse matrix, but the call fails with:
??? Undefined function or method 'sparse' for input
arguments of type 'int64'.
Error in ==> make_network at 5
graph =sparse(int64(listedges(:,1)),int64(listedges(:,2)),ones(size(listedges,1),1));
How can I convert the text file entries loaded as double so as to be used by the sparse function?
There is no need for any conversion, keep the indices double:
r = round(listedges);
graph = sparse(r(:, 1), r(:, 2), ones(size(listedges, 1), 1));
There are two reasons why one might want to convert to int:
The first, because you have data type restrictions.
The second, your inputs may contain fractions and are unfit to be used as indices.
If you want to convert because of the first reason - then there's no need to: Matlab works with double type by default and often treats doubles as ints (for example, when used as indices).
However, if you want to convert to integers because of the second reason (numbers may be fractional), then you should use round(), ceil() or floor(), whichever suits your purpose best.
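A small self-contained sketch of this, using a made-up edge list in place of the loaded file:

```matlab
% Hypothetical edge list as it might be loaded from a text file (class double).
listedges = [1 2; 2 3; 3 1; 1 2];

% Round in case the file contains values such as 2.0000001.
r = round(listedges);

% Indices must be positive; check before building the matrix.
assert(all(r(:) >= 1), 'Indices must be positive integers.');

% Double-valued indices are accepted directly by sparse.
n = max(r(:));
graph = sparse(r(:,1), r(:,2), ones(size(r,1), 1), n, n);
```

Note that sparse adds up the values of repeated index pairs, so the edge (1,2), which appears twice above, ends up with the value 2.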
There is another very good reason (and really the primary one) why one may want to convert indices of any structure (array, matrix, etc.) to int.
If you ever program in any language other than Matlab, you would be familiar with wanting to save memory space, especially with large structures. Being able to address elements in such structures with indices other than double is key.
One major issue with Matlab is the inability to control the size of multidimensional structures more finely in this way. There are sparse matrix solutions, but those are not adequate for many cases. Cell arrays preserve the data types upon access, but their per-element overhead is extremely wasteful (113 bytes for a single uint8 encapsulated in a cell).

Using ismember() with cell arrays containing vectors

I am using a cell array to contain 1x2 vectors of grid locations in the form [row, col].
I would like to check if another grid location is included in this cell array.
Unfortunately, my current code results in an error, and I cannot quite understand why:
in_range = ismember( 1, ismember({[player.row, player.col]}, proximity(:,1)) );
where player.row and player.col are integers, and proximity's first column is the aforementioned cell array of grid locations
the error I am receiving is:
??? Error using ==> cell.ismember at 28
Input must be cell arrays of strings.
Unfortunately, I have not been able to find any information regarding using ismember() in this fashion, only with cell arrays as strings or with single integers in each cell rather than vectors.
I have considered converting using num2str() and str2num(), but since I must perform calculations between the conversions, and due to the number of iterations the code will be looped for (10,000 loops, 4 conversions per loop), this method seems prohibitive.
Any help here would be greatly appreciated, thank you
EDIT: Why does ismember() return this error? Does it treat all vectors in a cell array as string arrays?
EDIT: Would there be a better / more efficient method of determining if a 1 is in the returned vector than
ismember( 1, ismember(...))?
I'm short of time at the moment (being Chrissy eve and all), so this is going to have to be a very quick answer.
As I understand it, the problem is to find if an x y coordinate lies in a sequence of many x y coordinates, and if so, the index of where it lies. If this is the case, and if you're interested in efficiency, then it is wasteful to mess around with strings or cell arrays. You should be using numeric matrices/vectors for this.
So, my suggestion: Convert the first column of your cell array to a numeric matrix. Then, compare your x y coordinates to the rows of this numerical matrix. Because you only want to know when both coordinates match a row of the numerical matrix, use the 'rows' option of ismember - it will return true only on matching an entire row rather than matching a single element.
Some example code that will hopefully help follows:
%# Build an example cell array with coordinates in the first column, and random strings in the second column
CellOfLoc = {[1 2], 'hello'; [3 4], 'world'; [5 6], '!'};
%# Convert the first column of the cell array to a numerical matrix
MatOfLoc = cell2mat(CellOfLoc(:, 1));
%# Build an example x y coordinate location to test
LocToTest = [5 6];
%# Call ismember, being sure to use the rows option
Index = ismember(MatOfLoc, LocToTest, 'rows');
Note, if the indices in your cell array are in string form, then obviously you'll also need a call to str2num in there somewhere before you call ismember.
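If only a yes/no answer (and perhaps the matching row number) is needed, the arguments can be swapped so that ismember returns a single logical value. The following is a sketch using the same example values as above:

```matlab
% Same example locations as above, already in numeric form.
MatOfLoc  = [1 2; 3 4; 5 6];
LocToTest = [5 6];

% Ask whether the test location is one of the rows of MatOfLoc.
% inRange is a scalar logical; rowIdx is the matching row (0 if none).
[inRange, rowIdx] = ismember(LocToTest, MatOfLoc, 'rows');

% Equivalent check based on the per-row result.
inRangeAlt = any(ismember(MatOfLoc, LocToTest, 'rows'));
```

Either form replaces the nested ismember(1, ismember(...)) construction from the question.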
One other thing, I notice you're a new member, so welcome to the site. If you think this response satisfactorily answered your question, then please mark the question answered by clicking the tick mark next to this response.

What's an appropriate data structure for a matrix with random variable entries?

I'm currently working in an area that is related to simulation and trying to design a data structure that can include random variables within matrices. To motivate this let me say I have the following matrix:
[a b; c d]
I want to find a data structure that will allow for a, b, c, d to either be real numbers or random variables. As an example, let's say that a = 1, b = -1, c = 2 but let d be a normally distributed random variable with mean 0 and standard deviation 1.
The data structure that I have in mind will give no value to d. However, I also want to be able to design a function that can take in the structure, simulate a uniform(0,1), obtain a value for d using an inverse CDF and then spit out an actual matrix.
I have several ideas to do this (all related to the MATLAB icdf function) but would like to know how more experienced programmers would do this. In this application, it's important that the structure is as "lean" as possible since I will be working with very very large matrices and memory will be an issue.
EDIT #1:
Thank you all for the feedback. I have decided to use a cell structure and store random variables as function handles. To save some processing time for large scale applications, I have decided to reference the location of the random variables to save time during the "evaluation" part.
One solution is to create your matrix initially as a cell array containing both numeric values and function handles to functions designed to generate a value for that entry. For your example, you could do the following:
generatorMatrix = {1 -1; 2 @randn};
Then you could create a function that takes a matrix of the above form, evaluates the cells containing function handles, then combines the results with the numeric cell entries to create a numeric matrix to use for further calculations:
function numMatrix = create_matrix(generatorMatrix)
  index = cellfun(@(c) isa(c,'function_handle'),...  %# Find function handles
                  generatorMatrix);
  generatorMatrix(index) = cellfun(@feval,...        %# Evaluate functions
                                   generatorMatrix(index),...
                                   'UniformOutput',false);
  numMatrix = cell2mat(generatorMatrix);  %# Change from cell to numeric matrix
end
Some additional things you can do would be to use anonymous functions to do more complicated things with built-in functions or create cell entries of varying size. This is illustrated by the following sample matrix, which can be used to create a matrix with the first row containing a 5 followed by 9 ones and the other 9 rows containing a 1 followed by 9 numbers drawn from a uniform distribution between 5 and 10:
generatorMatrix = {5 ones(1,9); ones(9,1) @() 5*rand(9)+5};
And each time this matrix is passed to create_matrix it will create a new 10-by-10 matrix where the 9-by-9 submatrix will contain a different set of random values.
An alternative solution...
If your matrix can be easily broken into blocks of submatrices (as in the second example above) then using a cell array to store numeric values and function handles may be your best option.
However, if the random values are single elements scattered sparsely throughout the entire matrix, then a variation similar to what user57368 suggested may work better. You could store your matrix data in three parts: a numeric matrix with placeholders (such as NaN) where the randomly-generated values will go, an index vector containing linear indices of the positions of the randomly-generated values, and a cell array of the same length as the index vector containing function handles for the functions to be used to generate the random values. To make things easier, you can even store these three pieces of data in a structure.
As an example, the following defines a 3-by-3 matrix with 3 random values stored in indices 2, 4, and 9 and drawn respectively from a normal distribution, a uniform distribution from 5 to 10, and an exponential distribution:
matData = struct('numMatrix',[1 nan 3; nan 2 4; 0 5 nan],...
                 'randIndex',[2 4 9],...
                 'randFcns',{{@randn , @() 5*rand+5 , @() -log(rand)/2}});
And you can define a new create_matrix function to easily create a matrix from this data:
function numMatrix = create_matrix(matData)
  numMatrix = matData.numMatrix;
  numMatrix(matData.randIndex) = cellfun(@feval,matData.randFcns);
end
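Since the question mentions driving the random entries from a uniform(0,1) draw through an inverse CDF, the stored handles could also take that draw as an argument. The sketch below is one possible variation, not part of the approach above: the names normInv, expInv and invFcns are hypothetical, and the inverse CDFs are written with base functions (erfinv, log) rather than icdf.

```matlab
% Inverse CDFs as handles that map a uniform(0,1) draw to a sample.
normInv = @(u) sqrt(2) * erfinv(2*u - 1);   % standard normal
expInv  = @(u) -log(1 - u) / 2;             % exponential with rate 2

% Numeric matrix with NaN placeholders at the random positions.
numMatrix = [1 NaN; 2 NaN];
randIndex = [3 4];                % linear indices of the placeholders
invFcns   = {normInv, expInv};    % one inverse CDF per placeholder

% Draw one uniform number per placeholder and push it through its inverse CDF.
u = rand(1, numel(randIndex));
numMatrix(randIndex) = cellfun(@(f, ui) f(ui), invFcns, num2cell(u));
```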
If you were using NumPy, then masked arrays would be the obvious place to start, but I don't know of any equivalent in MATLAB. Cell arrays might not be compact enough, and if you did use a cell array, then you would have to come up with an efficient way to find the non-real entries and replace them with a sample from the right distribution.
Try using a regular or sparse matrix to hold the real values, and leave it at zero wherever you want a random variable. Then alongside that store a sparse matrix of the same shape whose non-zero entries correspond to the random variables in your matrix. If you want, the value of the entry in the second matrix can be used to indicate which distribution (i.e. 1 for uniform, 2 for normal, etc.).
Whenever you want to get a purely real matrix to work with, you iterate over the non-zero values in the second matrix to convert them to samples, and then add that matrix to your first.
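A minimal sketch of this two-matrix idea, using the 2-by-2 example from the question. The coding of distributions (1 for uniform, 2 for standard normal) and the variable names are assumptions made for illustration.

```matlab
% Deterministic entries; positions of random variables are left at zero.
realPart = [1 -1; 2 0];

% Assumed coding: 1 = uniform(0,1), 2 = standard normal.
distCode = sparse([0 0; 0 2]);

% Visit only the non-zero entries of the code matrix.
[r, c, code] = find(distCode);
code = full(code);
samples = zeros(size(code));
for k = 1:numel(code)
    switch code(k)
        case 1
            samples(k) = rand;
        case 2
            samples(k) = randn;
    end
end

% Place the samples at their positions and add them to the real part.
[m, n] = size(distCode);
result = realPart + full(sparse(r, c, samples, m, n));
```

Only the positions marked in distCode are sampled, so the cost of each evaluation scales with the number of random entries rather than with the size of the matrix.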