Using matlab and Time Series object (fints), how can I make an array of them? - matlab

I am getting stock prices from yahoo, and want to have each stock have its own time series data structure, but also don't want to have hundreds of variables, so naturally I would want to have an array, but when I do something like array = [stock1 stock2]; it actually merges the series together. How can I make a real array?
Thanks,
CP

[x x] notation in matlab is not an array, it is a vector. It is assumed that what you're putting together belongs together. What you probably want is a cell array which is indexed with a curly brace, ie myArray{1} = stock1; myArray{2} = stock2;. Reference here.

Ah, since you have row vectors, [stock1 stock2] is a concatenation. If you want to create a 2-by-x array instead, do something like this [stock1; stock2], which will place one array above the other.

Joining vectors using [x y] has different results depending on whether your vectors are rows or columns. If rows, then joining them with [x y] makes a longer row vector, but if columns, you'll get a Nx2 matrix. You should probably convert them to column vectors using the TRANSPOSE operator thus: [x' y']. Although you should check if transpose means the same thing with Time Series objects as at does with regular vectors.

Related

Build a matrix starting from instances of structure fields in MATLAB

I'm really sorry to bother so I hope it is not a silly or repetitive question.
I have been scraping a website, saving the results as a collection in MongoDB, exporting it as a JSON file and importing it in MATLAB.
At the end of the story I obtained a struct object organised
like this one in the picture.
What I'm interested in are the two last cell arrays (which can be easily converted to string arrays with string()). The first cell array is a collection of keys (think unique products) and the second cell array is a collection of values (think prices), like a dictionary. Each field is an instance of possible values for a set of this keys (think daily prices). My goal is to build a matrix made like this:
KEYS VALUES_OF_FIELD_1 VALUES_OF_FIELD2 ... VALUES_OF_FIELDn
A x x x
B x z NaN
C z x y
D NaN y x
E y x z
The main problem is that, as shown in the image and as I tried to explain in the example matrix, I don't always have a value for all the keys in every field (as you can see sometimes they are 321, other times 319 or 320 or 317) and so the key is missing from the first array. In that case I should fill the missing value with a NaN. The keys can be ordered alphabetically and are all unique.
What would you think would be the best and most scalable way to approach this problem in MATLAB?
Thank you very much for your time, I hope I explained myself clearly.
EDIT:
Both arrays are made of strings in my case, so types are not a problem (I've modified the example). The main problem is that, since the keys vary in each field, firstly I have to find all the (unique) keys in the structure, to build the rows, and then for each column (field) I have to fill the values putting NaN where the key is missing.
One thing to remember you can't simply use both strings and number in one matrix. So, if you combine them together they can be either all strings or all numbers. I think all strings will work for you.
Before make a matrix make sure that all the cells have same element.
new_matrix = horzcat(keys,values1,...valuesn);
This will provide a matrix for each row (according to your image). Now you can use a for loop to get matrices for all the rows.
For now, I've solved it by considering the longest array of keys in the structure as the complete set of keys, let's call it keys_set.
Then I've created for each field in the structure a Map object in this way:
for i=1:length(structure)
structure(i).myMap = containers.Map(structure(i).key_field, structure(i).value_field);
end
Then I've built my matrix (M) by checking every map against the keys_set array:
for i=1:length(keys_set)
for j=1:length(structure)
if isKey(structure(j).myMap,char(keys_set(i)))
M(i,j) = string(structure(j).myMap(char(keys_set(i))));
else
M(i,j) = string('MISSING');
end
end
end
This works, but it would be ideal to also be able to check that keys_set is really complete.
EDIT: I've solved my problem by using this function and building the correct set of all the possible keys:
%% Finding the maximum number of keys in all the fields
maxnk = length(structure(1).key_field);
for i=2:length(structure)
if length(structure(i).key_field) > maxnk
maxnk = length(structure(i).key_field);
end
end
%% Initialiting the matrix containing all the possibile set of keys
keys_set=string(zeros(maxnk,length(structure)));
%% Filling the matrix by putting "0" if the dimension is smaller
for i=1:length(structure)
d = length(string(structure(i).key_field));
if d == maxnk
keys_set(:,i) = string(structure(i).key_field);
else
clear tmp
tmp = [string(structure(i).key_field); string(zeros(maxnk-d,1))];
keys_set(:,i) = tmp;
end
end
%% Merging without duplication and removing the "0" element
keys_set = union_several(keys_set);
keys_set = keys_set(keys_set ~= string(0));

Efficient way to delete multiple elements from a vector in matlab

I have looked around for an answer to this but could not see anything relevant.
I am timing some (Matlab) code and there is one part which takes up a large amount of time. I have a vector, V, of integers (that may contain repeats) and I have a much smaller vector, v, contain indices of elements that I wish to delete. I am currently implementing this as:
V(v) = [];
'V' is something that I am selecting indices from at random and without replacement. 'v' must not contain duplicates.
Any suggestions?

Using ismember() with cell arrays containing vectors

I am using a cell array to contain 1x2 vectors of grid locations in the form [row, col].
I would like to check if another grid location is included in this cell array.
Unfortunately, my current code results in an error, and I cannot quite understand why:
in_range = ismember( 1, ismember({[player.row, player.col]}, proximity(:,1)) );
where player.row and player.col are integers, and proximity's first column is the aforementioned cell array of grid locations
the error I am receiving is:
??? Error using ==> cell.ismember at 28
Input must be cell arrays of strings.
Unfortunately, I have not been able to find any information regarding using ismember() in this fashion, only with cell arrays as strings or with single integers in each cell rather than vectors.
I have considered converting using num2str() and str2num(), but since I must perform calculations between the conversions, and due to the number of iterations the code will be looped for (10,000 loops, 4 conversions per loop), this method seems prohibitive.
Any help here would be greatly appreciated, thank you
EDIT: Why does ismember() return this error? Does it treat all vectors in a cell array as string arrays?
EDIT: Would there be a better / more efficient method of determining if a 1 is in the returned vector than
ismember( 1, ismember(...))?
I'm short of time at the moment (being Chrissy eve and all), so this is going to have to be a very quick answer.
As I understand it, the problem is to find if an x y coordinate lies in a sequence of many x y coordinates, and if so, the index of where it lies. If this is the case, and if you're interested in efficiency, then it is wasteful to mess around with strings or cell arrays. You should be using numeric matrices/vectors for this.
So, my suggestion: Convert the first row of your cell array to a numeric matrix. Then, compare your x y coordinates to the rows of this numerical matrix. Because you only want to know when both coordinates match a row of the numerical matrix, use the 'rows' option of ismember - it will return a true only on matching an entire row rather than matching a single element.
Some example code that will hopefully help follows:
%# Build an example cell array with coordinates in the first column, and random strings in the second column
CellOfLoc = {[1 2], 'hello'; [3 4], 'world'; [5 6], '!'};
%# Convert the first column of the cell array to a numerical matrix
MatOfLoc = cell2mat(CellOfLoc(:, 1));
%# Build an example x y coordinate location to test
LocToTest = [5 6];
%# Call ismember, being sure to use the rows option
Index = ismember(MatOfLoc, LocToTest, 'rows');
Note, if the indices in your cell array are in string form, then obviously you'll also need a call to str2num in there somewhere before you call ismember.
One other thing, I notice you're a new member, so welcome to the site. If you think this response satisfactorily answered your question, then please mark the question answered by clicking the tick mark next to this response.

MATLAB: apply a function to every n items in a vector

This related question How can I apply a function to every row/column of a matrix in MATLAB? seems to indicate one way to do this is using num2cell, which I kind of want to stay away from.
Here's what I want to do. I've got an index list for a triangle mesh, the indices index the vertex list.
I want to run func(a,b,c) on the first 3 indices, then the next three indices, and so on.
So I could reshape(idxs,3,[]) so now i've got my data into triplets as column vectors. But arrayfun does not do what I want it to do.
Looking for something like a column-map operator.
First, get your func properly vectorized, if necessary, such that the arguments can be lists of equal length:
vec_func = #(a,b,c)(arrayfun(#func,a,b,c))
Then, you can directly access every third element of idxs:
vec_func( idxs(1:3:end), idxs(2:3:end), idxs(3:3:end) )

What's an appropriate data structure for a matrix with random variable entries?

I'm currently working in an area that is related to simulation and trying to design a data structure that can include random variables within matrices. To motivate this let me say I have the following matrix:
[a b; c d]
I want to find a data structure that will allow for a, b, c, d to either be real numbers or random variables. As an example, let's say that a = 1, b = -1, c = 2 but let d be a normally distributed random variable with mean 0 and standard deviation 1.
The data structure that I have in mind will give no value to d. However, I also want to be able to design a function that can take in the structure, simulate a uniform(0,1), obtain a value for d using an inverse CDF and then spit out an actual matrix.
I have several ideas to do this (all related to the MATLAB icdf function) but would like to know how more experienced programmers would do this. In this application, it's important that the structure is as "lean" as possible since I will be working with very very large matrices and memory will be an issue.
EDIT #1:
Thank you all for the feedback. I have decided to use a cell structure and store random variables as function handles. To save some processing time for large scale applications, I have decided to reference the location of the random variables to save time during the "evaluation" part.
One solution is to create your matrix initially as a cell array containing both numeric values and function handles to functions designed to generate a value for that entry. For your example, you could do the following:
generatorMatrix = {1 -1; 2 #randn};
Then you could create a function that takes a matrix of the above form, evaluates the cells containing function handles, then combines the results with the numeric cell entries to create a numeric matrix to use for further calculations:
function numMatrix = create_matrix(generatorMatrix)
index = cellfun(#(c) isa(c,'function_handle'),... %# Find function handles
generatorMatrix);
generatorMatrix(index) = cellfun(#feval,... %# Evaluate functions
generatorMatrix(index),...
'UniformOutput',false);
numMatrix = cell2mat(generatorMatrix); %# Change from cell to numeric matrix
end
Some additional things you can do would be to use anonymous functions to do more complicated things with built-in functions or create cell entries of varying size. This is illustrated by the following sample matrix, which can be used to create a matrix with the first row containing a 5 followed by 9 ones and the other 9 rows containing a 1 followed by 9 numbers drawn from a uniform distribution between 5 and 10:
generatorMatrix = {5 ones(1,9); ones(9,1) #() 5*rand(9)+5};
And each time this matrix is passed to create_matrix it will create a new 10-by-10 matrix where the 9-by-9 submatrix will contain a different set of random values.
An alternative solution...
If your matrix can be easily broken into blocks of submatrices (as in the second example above) then using a cell array to store numeric values and function handles may be your best option.
However, if the random values are single elements scattered sparsely throughout the entire matrix, then a variation similar to what user57368 suggested may work better. You could store your matrix data in three parts: a numeric matrix with placeholders (such as NaN) where the randomly-generated values will go, an index vector containing linear indices of the positions of the randomly-generated values, and a cell array of the same length as the index vector containing function handles for the functions to be used to generate the random values. To make things easier, you can even store these three pieces of data in a structure.
As an example, the following defines a 3-by-3 matrix with 3 random values stored in indices 2, 4, and 9 and drawn respectively from a normal distribution, a uniform distribution from 5 to 10, and an exponential distribution:
matData = struct('numMatrix',[1 nan 3; nan 2 4; 0 5 nan],...
'randIndex',[2 4 9],...
'randFcns',{{#randn , #() 5*rand+5 , #() -log(rand)/2}});
And you can define a new create_matrix function to easily create a matrix from this data:
function numMatrix = create_matrix(matData)
numMatrix = matData.numMatrix;
numMatrix(matData.randIndex) = cellfun(#feval,matData.randFcns);
end
If you were using NumPy, then masked arrays would be the obvious place to start, but I don't know of any equivalent in MATLAB. Cell arrays might not be compact enough, and if you did use a cell array, then you would have to come up with an efficient way to find the non-real entries and replace them with a sample from the right distribution.
Try using a regular or sparse matrix to hold the real values, and leave it at zero wherever you want a random variable. Then alongside that store a sparse matrix of the same shape whose non-zero entries correspond to the random variables in your matrix. If you want, the value of the entry in the second matrix can be used to indicate which distribution (ie. 1 for uniform, 2 for normal, etc.).
Whenever you want to get a purely real matrix to work with, you iterate over the non-zero values in the second matrix to convert them to samples, and then add that matrix to your first.