vector of variable length vectors in MATLAB - matlab

I want to sum up several vectors of different size in an array. Each time one of the vectors drops out of my program, I want to append it to my array. Like this:
array = [array, vector];
In the end I want to let this array be the output of a function. But it gives me wrong results. Is this possible with MATLAB?
Thanks and kind regards,
Damian

Okay, given that we're dealing with column vectors of different size, you can't put them all in a numerical array, since a numerical array has to be rectangular. If you really wanted to put them in the numerical array, then the column length of the array will need to be the length of the longest vector, and you'll have to pad out the shorter vectors with NaNs.
Given this, a better solution would be, as chaohuang hinted at in the comments, to use a cell array, and store one vector in each cell. The problem is that you don't know beforehand how many vectors there will be. The usual approach that I'm aware of for this problem is as follows (but if someone has a better idea, I'm keen to learn!):
UpperBound = SomeLargeNumber;
Array = cell(1, UpperBound);
Counter = 0;
while SomeCondition
Counter = Counter + 1;
if Counter > UpperBound
error('You did not choose a large enough upper bound!');
end
%#Create your vector here
Array{1, Counter} = YourVectorHere;
end
Array = Array(1, 1:Counter);
In other words, choose some upper bound beforehand that you are sure you won't go above in the loop, and then cut your cell array down to size once the loop is finished. Also, I've put in an error trap in case you're choice of upper bound turns out to be too small!
Oh, by the way, I just noted in your question the words "sum up several vectors". Was this a figure of speech or did you actually want to perform a sum operation somewhere?

Related

Build a matrix starting from instances of structure fields in MATLAB

I'm really sorry to bother so I hope it is not a silly or repetitive question.
I have been scraping a website, saving the results as a collection in MongoDB, exporting it as a JSON file and importing it in MATLAB.
At the end of the story I obtained a struct object organised
like this one in the picture.
What I'm interested in are the two last cell arrays (which can be easily converted to string arrays with string()). The first cell array is a collection of keys (think unique products) and the second cell array is a collection of values (think prices), like a dictionary. Each field is an instance of possible values for a set of this keys (think daily prices). My goal is to build a matrix made like this:
KEYS VALUES_OF_FIELD_1 VALUES_OF_FIELD2 ... VALUES_OF_FIELDn
A x x x
B x z NaN
C z x y
D NaN y x
E y x z
The main problem is that, as shown in the image and as I tried to explain in the example matrix, I don't always have a value for all the keys in every field (as you can see sometimes they are 321, other times 319 or 320 or 317) and so the key is missing from the first array. In that case I should fill the missing value with a NaN. The keys can be ordered alphabetically and are all unique.
What would you think would be the best and most scalable way to approach this problem in MATLAB?
Thank you very much for your time, I hope I explained myself clearly.
EDIT:
Both arrays are made of strings in my case, so types are not a problem (I've modified the example). The main problem is that, since the keys vary in each field, firstly I have to find all the (unique) keys in the structure, to build the rows, and then for each column (field) I have to fill the values putting NaN where the key is missing.
One thing to remember you can't simply use both strings and number in one matrix. So, if you combine them together they can be either all strings or all numbers. I think all strings will work for you.
Before make a matrix make sure that all the cells have same element.
new_matrix = horzcat(keys,values1,...valuesn);
This will provide a matrix for each row (according to your image). Now you can use a for loop to get matrices for all the rows.
For now, I've solved it by considering the longest array of keys in the structure as the complete set of keys, let's call it keys_set.
Then I've created for each field in the structure a Map object in this way:
for i=1:length(structure)
structure(i).myMap = containers.Map(structure(i).key_field, structure(i).value_field);
end
Then I've built my matrix (M) by checking every map against the keys_set array:
for i=1:length(keys_set)
for j=1:length(structure)
if isKey(structure(j).myMap,char(keys_set(i)))
M(i,j) = string(structure(j).myMap(char(keys_set(i))));
else
M(i,j) = string('MISSING');
end
end
end
This works, but it would be ideal to also be able to check that keys_set is really complete.
EDIT: I've solved my problem by using this function and building the correct set of all the possible keys:
%% Finding the maximum number of keys in all the fields
maxnk = length(structure(1).key_field);
for i=2:length(structure)
if length(structure(i).key_field) > maxnk
maxnk = length(structure(i).key_field);
end
end
%% Initialiting the matrix containing all the possibile set of keys
keys_set=string(zeros(maxnk,length(structure)));
%% Filling the matrix by putting "0" if the dimension is smaller
for i=1:length(structure)
d = length(string(structure(i).key_field));
if d == maxnk
keys_set(:,i) = string(structure(i).key_field);
else
clear tmp
tmp = [string(structure(i).key_field); string(zeros(maxnk-d,1))];
keys_set(:,i) = tmp;
end
end
%% Merging without duplication and removing the "0" element
keys_set = union_several(keys_set);
keys_set = keys_set(keys_set ~= string(0));

How to perform operations along a certain dimension of an array?

I have a 3D array containing five 3-by-4 slices, defined as follows:
rng(3372061);
M = randi(100,3,4,5);
I'd like to collect some statistics about the array:
The maximum value in every column.
The mean value in every row.
The standard deviation within each slice.
This is quite straightforward using loops,
sz = size(M);
colMax = zeros(1,4,5);
rowMean = zeros(3,1,5);
sliceSTD = zeros(1,1,5);
for indS = 1:sz(3)
sl = M(:,:,indS);
sliceSTD(indS) = std(sl(1:sz(1)*sz(2)));
for indC = 1:sz(1)
rowMean(indC,1,indS) = mean(sl(indC,:));
end
for indR = 1:sz(2)
colMax(1,indR,indS) = max(sl(:,indR));
end
end
But I'm not sure that this is the best way to approach the problem.
A common pattern I noticed in the documentation of max, mean and std is that they allow to specify an additional dim input. For instance, in max:
M = max(A,[],dim) returns the largest elements along dimension dim. For example, if A is a matrix, then max(A,[],2) is a column vector containing the maximum value of each row.
How can I use this syntax to simplify my code?
Many functions in MATLAB allow the specification of a "dimension to operate over" when it matters for the result of the computation (several common examples are: min, max, sum, prod, mean, std, size, median, prctile, bounds) - which is especially important for multidimensional inputs. When the dim input is not specified, MATLAB has a way of choosing the dimension on its own, as explained in the documentation; for example in max:
If A is a vector, then max(A) returns the maximum of A.
If A is a matrix, then max(A) is a row vector containing the maximum value of each column.
If A is a multidimensional array, then max(A) operates along the first array dimension whose size does not equal 1, treating the elements as vectors. The size of this dimension becomes 1 while the sizes of all other dimensions remain the same. If A is an empty array whose first dimension has zero length, then max(A) returns an empty array with the same size as A.
Then, using the ...,dim) syntax we can rewrite the code as follows:
rng(3372061);
M = randi(100,3,4,5);
colMax = max(M,[],1);
rowMean = mean(M,2);
sliceSTD = std(reshape(M,1,[],5),0,2); % we use `reshape` to turn each slice into a vector
This has several advantages:
The code is easier to understand.
The code is potentially more robust, being able to handle inputs beyond those it was initially designed for.
The code is likely faster.
In conclusion: it is always a good idea to read the documentation of functions you're using, and experiment with different syntaxes, so as not to miss similar opportunities to make your code more succinct.

Mathematica Table function to Matlab

I need to convert this to Matlab code, and am struggling without the "table" function.
Table[{i,1000,ability,savingsrate,0,RandomInteger[{15,30}],1,0},{i,nrhhs}];
So basically, these values are all just numbers, and I think I need to use a function handle, or maybe a for loop. I'm no expert, so I really need some help?
I'm not an expert in Mathematics (just used it long time ago). According to this documentation for Table function, you are using this form:
Table[expr, {i, imax}]
generates a list of the values of expr when i runs from 1 to imax.
It looks like your statement will produce list duplicating the list in first argument increasing i from 1 to nrhhs and using different random number.
In MATLAB the output can be equivalent to a matrix or a cell array.
To create a matrix with rows as your lists you can do:
result = [ (1:nrhhs)', repmat([1000,ability,savingsrate,0],nrhhs,1), ...
randi([15 30],nrhhs,1), repmat([1,0],nrhhs,1) ];
You can convert the above matrix to a cell array:
resultcell = cell2mat(result, ones(nrhhs,1));
The "Table" example you gave creates a list of nrhhs sub-lists, each of which contains 8 numbers (i, 1000, ability, savingsrate, 0, a random integer between 15 and 30 inclusive, 1, and 0). This is essentially (though not exactly) the same as an nrhhs x 8 matrix.
Assuming you do just want a matrix out, though, an analogous for loop in Matlab would be:
result = zeros(nrhhs,8); % preallocate memory for the result
for i = 1:nrhhs
result(i,:) = [i 1000 ability savingsrate 0 randi([15 30]) 1 0];
end
This method is likely slower than yuk's answer (which makes much more efficient use of vectors to avoid the for loop), but might be a little easier to pick apart depending on how familiar you are with Matlab.

Using ismember() with cell arrays containing vectors

I am using a cell array to contain 1x2 vectors of grid locations in the form [row, col].
I would like to check if another grid location is included in this cell array.
Unfortunately, my current code results in an error, and I cannot quite understand why:
in_range = ismember( 1, ismember({[player.row, player.col]}, proximity(:,1)) );
where player.row and player.col are integers, and proximity's first column is the aforementioned cell array of grid locations
the error I am receiving is:
??? Error using ==> cell.ismember at 28
Input must be cell arrays of strings.
Unfortunately, I have not been able to find any information regarding using ismember() in this fashion, only with cell arrays as strings or with single integers in each cell rather than vectors.
I have considered converting using num2str() and str2num(), but since I must perform calculations between the conversions, and due to the number of iterations the code will be looped for (10,000 loops, 4 conversions per loop), this method seems prohibitive.
Any help here would be greatly appreciated, thank you
EDIT: Why does ismember() return this error? Does it treat all vectors in a cell array as string arrays?
EDIT: Would there be a better / more efficient method of determining if a 1 is in the returned vector than
ismember( 1, ismember(...))?
I'm short of time at the moment (being Chrissy eve and all), so this is going to have to be a very quick answer.
As I understand it, the problem is to find if an x y coordinate lies in a sequence of many x y coordinates, and if so, the index of where it lies. If this is the case, and if you're interested in efficiency, then it is wasteful to mess around with strings or cell arrays. You should be using numeric matrices/vectors for this.
So, my suggestion: Convert the first row of your cell array to a numeric matrix. Then, compare your x y coordinates to the rows of this numerical matrix. Because you only want to know when both coordinates match a row of the numerical matrix, use the 'rows' option of ismember - it will return a true only on matching an entire row rather than matching a single element.
Some example code that will hopefully help follows:
%# Build an example cell array with coordinates in the first column, and random strings in the second column
CellOfLoc = {[1 2], 'hello'; [3 4], 'world'; [5 6], '!'};
%# Convert the first column of the cell array to a numerical matrix
MatOfLoc = cell2mat(CellOfLoc(:, 1));
%# Build an example x y coordinate location to test
LocToTest = [5 6];
%# Call ismember, being sure to use the rows option
Index = ismember(MatOfLoc, LocToTest, 'rows');
Note, if the indices in your cell array are in string form, then obviously you'll also need a call to str2num in there somewhere before you call ismember.
One other thing, I notice you're a new member, so welcome to the site. If you think this response satisfactorily answered your question, then please mark the question answered by clicking the tick mark next to this response.

Compact MATLAB matrix indexing notation

I've got an n-by-k sized matrix, containing k numbers per row. I want to use these k numbers as indexes into a k-dimensional matrix. Is there any compact way of doing so in MATLAB or must I use a for loop?
This is what I want to do (in MATLAB pseudo code), but in a more MATLAB-ish way:
for row=1:1:n
finalTable(row) = kDimensionalMatrix(indexmatrix(row, 1),...
indexmatrix(row, 2),...,indexmatrix(row, k))
end
If you want to avoid having to use a for loop, this is probably the cleanest way to do it:
indexCell = num2cell(indexmatrix, 1);
linearIndexMatrix = sub2ind(size(kDimensionalMatrix), indexCell{:});
finalTable = kDimensionalMatrix(linearIndexMatrix);
The first line puts each column of indexmatrix into separate cells of a cell array using num2cell. This allows us to pass all k columns as a comma-separated list into sub2ind, a function that converts subscripted indices (row, column, etc.) into linear indices (each matrix element is numbered from 1 to N, N being the total number of elements in the matrix). The last line uses these linear indices to replace your for loop. A good discussion about matrix indexing (subscript, linear, and logical) can be found here.
Some more food for thought...
The tendency to shy away from for loops in favor of vectorized solutions is something many MATLAB users (myself included) have become accustomed to. However, newer versions of MATLAB handle looping much more efficiently. As discussed in this answer to another SO question, using for loops can sometimes result in faster-running code than you would get with a vectorized solution.
I'm certainly NOT saying you shouldn't try to vectorize your code anymore, only that every problem is unique. Vectorizing will often be more efficient, but not always. For your problem, the execution speed of for loops versus vectorized code will probably depend on how big the values n and k are.
To treat the elements of the vector indexmatrix(row, :) as separate subscripts, you need the elements as a cell array. So, you could do something like this
subsCell = num2cell( indexmatrix( row, : ) );
finalTable( row ) = kDimensionalMatrix( subsCell{:} );
To expand subsCell as a comma-separated-list, unfortunately you do need the two separate lines. However, this code is independent of k.
Convert your sub-indices into linear indices in a hacky way
ksz = size(kDimensionalMatrix);
cksz = cumprod([ 1 ksz(1:end-1)] );
lidx = ( indexmatrix - 1 ) * cksz' + 1; #'
% lindx is now (n)x1 linear indices into kDimensionalMatrix, one index per row of indexmatrix
% access all n values:
selectedValues = kDimensionalMatrix( lindx );
Cheers!