The following data was imported by left-clicking the file in the folder pane to bring up the import window, and was imported as a cell array. Each column is one of my variables (K = 1st column, etc.).
StrikePrice UnderlyingPrice mT Rf DividendRate Volatility
47 45 4 0.02 0.5 0.2
50 55 20 0.03 0.1 0.35
I am using a function, first written by Mark Hoyle (2016), that prices American calls:
function LSMAmCallContDiv(S0,K,D,r,sigma,T,NSteps,NSims)
To price the first row of my data, I call it as:
LSMAmCallContDiv(45, 47, 0.5, 0.02, 0.2, 4, 500, 100)
Is there any way I can call this function without manually changing the values for the second row of my cell array? (In reality I'm dealing with a lot of rows.) I achieved this when pricing puts in RStudio with the following code, but I am a complete beginner in MATLAB.
jpmitmput30results = apply(jpmitmput30full, 1, function(x) AmerPutLSM(Spot = x['UnderlyingPrice'], sigma = x['Volatility'],
n=500, m=100, Strike = x['StrikePrice'],
r = x['Rf'], dr = x['DividendRate'],
mT = x['mT']))
Given you have a cell array, I presume it looks like this:
data = {47 45 4 0.02 0.5 0.2
50 55 20 0.03 0.1 0.35};
To get one value out, you can index as data{row,column}, for example data{1,3} returns 4.
Now all you need is a loop to repeatedly call your function with the right value in the right order:
for ii=1:size(data,1)
LSMAmCallContDiv(data{ii,2},data{ii,1},data{ii,5},data{ii,4},data{ii,6},data{ii,3},500,100)
end
Since the function has no output arguments, we cannot collect its results in an array. You will have to copy-paste them from the Command Window. If you decide to modify the function to return values, then you can collect them. First modify the first line of the function to read as follows (Price and StdErr must be the names of the variables that the function body actually computes):
function [Price,StdErr] = LSMAmCallContDiv(S0,K,D,r,sigma,T,NSteps,NSims)
and then in your own code:
Price = zeros(size(data,1),1);
StdErr = zeros(size(data,1),1);
for ii=1:size(data,1)
[Price(ii),StdErr(ii)] = LSMAmCallContDiv(data{ii,2},data{ii,1},data{ii,5},data{ii,4},data{ii,6},data{ii,3},500,100);
end
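If you would rather refer to columns by name, as in the R call, one option is to convert the cell array to a table first. This is only a minimal sketch: the column names are taken from the header row in the question, and pricer is a hypothetical stand-in (a plain intrinsic-value formula) used so the example runs without the real LSMAmCallContDiv.

```matlab
data = {47 45 4 0.02 0.5 0.2
        50 55 20 0.03 0.1 0.35};
names = {'StrikePrice','UnderlyingPrice','mT','Rf','DividendRate','Volatility'};
tbl = cell2table(data, 'VariableNames', names);

% Hypothetical stand-in with the same argument order as LSMAmCallContDiv.
% Replace it with the real function once that returns a value.
pricer = @(S0,K,D,r,sigma,T,NSteps,NSims) max(S0 - K, 0);

Price = zeros(height(tbl), 1);   % one result per row
for ii = 1:height(tbl)
    Price(ii) = pricer(tbl.UnderlyingPrice(ii), tbl.StrikePrice(ii), ...
        tbl.DividendRate(ii), tbl.Rf(ii), tbl.Volatility(ii), tbl.mT(ii), 500, 100);
end
```

Named columns make it harder to pass the arguments in the wrong order, which is easy to do with numeric column indices.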
I'm not sure about that function in particular, but many MATLAB functions accept vectorized input, which is a really useful feature. Where functions in other languages take single-value inputs, MATLAB treats everything as an array, so a function whose body uses element-wise operations can be passed whole vectors and returns one result per element. A user-written function only behaves this way if it was written with that in mind.
For instance,
frame = [47, 45, 4, 0.02, 0.5, 0.2; 50, 55, 20, 0.03, 0.1, 0.35];
out = LSMAmCallContDiv(frame(:,2), frame(:,1), frame(:,5), frame(:,4), frame(:,6), frame(:,3), 500, 100);
should give you a column vector with all the outputs you want, provided the function returns a value and its body is written to accept vector inputs. To be clear, the : in (:,2) refers to every row, and the 2 refers to the second column. So frame(:,2) refers to the entire second column.
This might not be the case for this function (I cannot find any documentation on it), in which case you will have to take a more programmatic approach and call it once per row:
out = zeros(size(frame,1), 1); % preallocate one result per row
i = 1;
while i <= size(frame,1)
out(i) = LSMAmCallContDiv(frame(i,2), frame(i,1), frame(i,5), frame(i,4), frame(i,6), frame(i,3), 500, 100);
i = i+1;
end
This is the more standard approach that one might see in any other language (and it is usually slower than a truly vectorized call in MATLAB), but it is a perfectly good way to do it.
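For a function that only accepts scalars, arrayfun over the row indices is another way to write the same per-row call. The sketch below uses a hypothetical stand-in (pricer) rather than the real LSMAmCallContDiv, whose internals are not shown here.

```matlab
frame = [47, 45, 4, 0.02, 0.5, 0.2; 50, 55, 20, 0.03, 0.1, 0.35];

% Hypothetical scalar-only pricing function (intrinsic value), for illustration
pricer = @(S0,K,D,r,sigma,T,NSteps,NSims) max(S0 - K, 0);

% Wrap the call so that it depends only on the row index k
priceRow = @(k) pricer(frame(k,2), frame(k,1), frame(k,5), frame(k,4), ...
                       frame(k,6), frame(k,3), 500, 100);

% arrayfun calls priceRow once per index and collects the scalar results
out = arrayfun(priceRow, (1:size(frame,1)).');
```

arrayfun is essentially a loop in disguise, so expect convenience rather than speed.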
I am trying to calculate the variance of a European option price using repeated trials (instead of a single trial). I want to compare the variance obtained with the standard randn function against that obtained with sobolset, but I'm not quite sure how to draw repeated samples from the latter.
Generating from randn is easy:
num_steps = 100;
num_paths = 10;
z = randn(num_steps, num_paths); % 100 steps for each of 10 paths
Once I have this, I can loop through all 10 columns of the z matrix, and I can also repeat the experiment many times, since randn returns a new set of random variates every time.
for exp_num = 1: 20
for col = 1: 10
price_vec = z(:, col);
end
end
I'm not quite sure how to do this with the sobolset. I understand I can create a matrix of fixed dimensions to start with (say 100 x 10), and for the first experiment I can loop through all the columns as above. However, when I run the next experiment (#2), the loop starts from the beginning and all the numbers are the same, so I get no variation in my pricing. It seems I need a way to randomize the column selection at the start of every experiment. Is there a better way to do this?
data1 = sobolset(1000, 'Skip', 1000, 'Leap', 100);
data2 = net(data1, 10);
for exp_num = 1: 20
% how do I change the start of the column selection here, so that the next data3 is different from
% the one in the previous exp_num?
for col = 1:10
data3(:, col) = data2(:, col);
% perform calculations
end
end
I hope this is making sense....
Thanks for the help!
Update: 8/21
I tried the following:
num_runs = 100
num_samples = 1000
for j = 1: num_runs
for i = 1 : num_samples
sobol_set = sobolset(num_samples,'Skip',j*50,'Leap',1e2);
sobol_set = net(sobol_set, 5);
sobol_seq = sobol_set(:, i)';
z_uncorr = norminv(sobol_seq, 0, 1)
% do pricing with z_uncorr through some function F
end
end
After generating 100 prices (through some function F, mentioned above), I find that the variance of the 100 prices is higher than what I get from standard pseudo-random numbers. This should not be the case, so I think I'm still not sampling correctly from the sobolset. Any advice would be appreciated.
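One commonly used way to get independent replicates from a Sobol sequence is to apply a fresh random scramble for each experiment, instead of shifting Skip. The following is only a sketch under stated assumptions: it needs the Statistics and Machine Learning Toolbox, the model parameters (S0, K, r, sigma, Tmat) and the simple European call payoff are illustrative choices that do not come from the question, and each Sobol point is treated as one whole path, so the sequence dimension equals the number of time steps.

```matlab
% Requires the Statistics and Machine Learning Toolbox
num_steps = 100;    % time steps per path = dimension of each Sobol point
num_paths = 1024;   % points drawn per experiment
num_runs  = 20;     % independent randomizations

% Illustrative model parameters (assumptions, not from the question)
S0 = 100; K = 100; r = 0.02; sigma = 0.2; Tmat = 1;
dt = Tmat / num_steps;

price = zeros(num_runs, 1);
for k = 1:num_runs
    p = sobolset(num_steps);
    p = scramble(p, 'MatousekAffineOwen');  % new random scramble on every run
    u = net(p, num_paths);                  % num_paths-by-num_steps, values in [0,1)
    z = norminv(u);                         % map uniforms to standard normals
    ST = S0 * exp((r - 0.5*sigma^2)*Tmat + sigma*sqrt(dt)*sum(z, 2));
    price(k) = exp(-r*Tmat) * mean(max(ST - K, 0));
end
price_var = var(price);   % variance across the independent randomizations
```

The same loop with z = randn(num_paths, num_steps) gives the pseudo-random baseline to compare price_var against.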
I am wondering if it is possible to use a vector to access data within a cell array. I am hoping to accomplish this using a vectorized approach rather than a for-loop.
I'm attempting to run a simple microsimulation in MATLAB. I have a simulated cohort that is initially healthy, but some are at low risk for a particular disease while others are at high risk. Thus, I have an array (Starting_Cohort) that indicates each patient's risk level (first column) and their initial status (second column). In addition, I have a cell array (pstar) that indicates each patient's likelihood of transitioning between two hypothetical health states (i.e., "healthy" and "sick").
What I would like to accomplish is the following:
1) During each period of the simulation (t = 1:T), use the first column of the starting cohort to determine the patient's risk level (i.e., 1 or 2).
2) Use the risk level to access a specific row (dependent on their current health status) within a specific cell (dependent on their risk level) of the cell array.
3) Compare the resultant vector against a random draw from the uniform distribution (contained in the array "r"), and select the column number associated with the first value larger than that draw (the column number determines their health state in the subsequent period).
HOWEVER, I want to avoid doing this for one patient at a time (i.e., introducing a nested loop), as this increases the execution time of the code by an order of magnitude (the actual cohort consists of approximately 20000 patients). I've been trying to accomplish this through vectorization - that is, running the simulation over the entire patient cohort concurrently - but I hit a roadblock when trying to access data from the cell array described above.
Starting_Cohort = [1 1; 1 1; 2 1; 2 1;];
[Cohort_Size, ~] = size(Starting_Cohort);
pstar = cell(2, 1);
pstar{1, 1} = [0.75 1.00; 0.15 1.00]; pstar{2, 1} = [0.65 1.00; 0.25 1.00];
rng(1234, 'twister'); T = 5; r = rand(Cohort_Size, T);
Sim_Results = [Starting_Cohort zeros(Cohort_Size, T)];
for t = 1:T
[~, Sim_Results(:, t+2)] = max(pstar{Sim_Results(:, 1), 1} ...
(Sim_Results(:, t+1), :) > r(:, t), [], 2);
end
When I run the above code, I obtain the error "Expected one output from a curly brace or dot indexing expression, but there were 4 results." I take this to mean that my approach to extracting information from the cell array is inappropriate, although I'm not sure whether I can address this or how. I would be deeply appreciative for any assistance rendered!
UPDATE 070619: I did eventually get this to work, using the code below. Effectively, I created a string array containing the expression I wanted to apply to each row. The expression is identical for every row EXCEPT in that it contains the row index. I can then use arrayfun and evalin to produce results similar to those I was looking for. Unfortunately, my own problem involves sparse arrays, so I could not actually solve my original problem. However, I'm hoping this information may nonetheless be useful for others.
Starting_Cohort = [1 1; 1 1; 2 1; 2 1;];
[Cohort_Size, ~] = size(Starting_Cohort);
pstar = cell(2, 1);
pstar{1, 1} = [0.75 1.00; 0.15 1.00];
pstar{2, 1} = [0.65 1.00; 0.25 1.00];
rng(1234, 'twister'); T = 5; r = rand(Cohort_Size, T);
Sim_Results = [Starting_Cohort zeros(Cohort_Size, T)];
for i = 1:Cohort_Size
TEST(i, 1) = strcat("max(pstar{Sim_Results(", string(i), ", 1), 1}",...
"(Sim_Results(", string(i), ", t+1), :) > ", ...
"r(", string(i), ", t), [], 2)");
end
for t = 1:T
[~, Sim_Results(:, t+2)] = arrayfun(@(x) evalin('base', x), TEST);
end
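The curly-brace error arises because pstar{v, 1} with a vector v returns a comma-separated list of several cells rather than a single matrix. One way around this without evalin is to stack the cells into a 3-D array and pick each patient's row by linear index. This is a sketch that assumes every cell holds a dense matrix of the same size; it does not help with sparse matrices, since MATLAB sparse arrays are limited to two dimensions.

```matlab
Starting_Cohort = [1 1; 1 1; 2 1; 2 1];
Cohort_Size = size(Starting_Cohort, 1);
pstar = {[0.75 1.00; 0.15 1.00]; [0.65 1.00; 0.25 1.00]};
rng(1234, 'twister'); T = 5; r = rand(Cohort_Size, T);
Sim_Results = [Starting_Cohort zeros(Cohort_Size, T)];

P = cat(3, pstar{:});      % P(currentState, nextState, riskLevel)
nStates = size(P, 2);
risk = Sim_Results(:, 1);
for t = 1:T
    state = Sim_Results(:, t+1);
    % Gather each patient's row of cumulative probabilities
    cumP = zeros(Cohort_Size, nStates);
    for s = 1:nStates
        col = repmat(s, Cohort_Size, 1);
        cumP(:, s) = P(sub2ind(size(P), state, col, risk));
    end
    % First column whose cumulative probability exceeds the draw
    [~, Sim_Results(:, t+2)] = max(double(bsxfun(@gt, cumP, r(:, t))), [], 2);
end
```

The inner loop runs over health states, not patients, so its cost does not grow with the cohort size.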
I seem to currently be banging my head against a brick wall as try as I might, I can not see my error here.
I am attempting to write a for loop in MATLAB that uses the equation below (adiabatic compression) to calculate the new pressure after one degree of crankshaft rotation in a four stroke engine cycle.
P2 = P1 * (V2 / V1)^1.35
I am using the calculated volume from the crank-slider model as an input. I have tried this in Excel and it works as expected and gives the overall max output correctly.
The for loop in question is below;
Cyl_P = ones(720,1)
for i = (2:1:length(Cyl_V))'
Cyl_P(i,:) = Cyl_P(i-1,:) .* (Cyl_V(i,:) ./ Cyl_V(i-1,:)).^1.35
end
My aim is to use the first element of the vector Cyl_P, which is equal to one, as an input to the equation above, multiplying it by the ratio of the second element of Cyl_V to the first, with that ratio raised to the power of 1.35. That should calculate the second element of Cyl_P. I would then like to feed that value back into the same equation to calculate the third element, and so on.
What am I missing?
I've put the full code below
Theta = deg2rad(1:1:720)'
Stroke = 82 / 1000
R = Stroke / 2
L = 90.5 / 1000
Bore = 71.9 / 1000
d_h = (R+L) - (R.*cos(Theta)) - sqrt(L.^2 - (R.*sin(Theta)).^2)
Pist_h = d_h
figure
plot(Pist_h)
Bore_A = (pi*Bore^2)/4
Swept_V = (Pist_h .* Bore_A)
Clear_V = max(Swept_V) / 10
Total_V = max(Swept_V) + Clear_V
Cyl_V = (Swept_V + Clear_V)
figure
plot(Cyl_V)
for ii = (2:1:length(Cyl_V))'
div_V(ii,:) = (Cyl_V(ii) ./ Cyl_V(ii-1,:)).^1.35
end
Cyl_P = ones(720,1)
for i = (2:1:length(Cyl_V))'
Cyl_P(i,:) = Cyl_P(i-1,:) .* (Cyl_V(i,:) ./ Cyl_V(i-1,:)).^1.35
end
figure
plot(Cyl_P)
Your problem is that you transpose the ranges you feed to the for loops. MATLAB's for iterates over the columns of its argument, so when you feed it a column vector the loop body runs only once, with the loop variable set to the entire column. General comments:
' is the complex conjugate transpose, .' is the regular transpose.
i is the imaginary unit in MATLAB; it's common practice not to use it as a variable name.
2:1:4 does the same as 2:4, as 1 is the default step size.
Please use semicolons, ;, after each line, so as to prevent MATLAB from echoing the result of each line to the command window. This makes the script easier to run, and if you have matrices with >1M entries, echoing the contents might even crash the program altogether. Even in this case, once the loop is fixed you would be echoing all 720 entries of Cyl_P on every iteration. For checking variable contents, just break the script where necessary (or run it in parts) and examine the content where warranted (e.g. Cyl_P(1:3) would suffice here to check whether the loop fills the vector as intended).
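Putting those points together, the loop might look like the sketch below. It keeps the volume ratio exactly as written in the question; note that the textbook adiabatic relation is P2 = P1 * (V1 / V2)^gamma, so swap the ratio if that is what is intended. Because the per-step ratios telescope, the same vector can also be computed without a loop, which gives a quick consistency check.

```matlab
Theta  = deg2rad(1:720).';
Stroke = 82 / 1000;  R = Stroke / 2;  L = 90.5 / 1000;  Bore = 71.9 / 1000;
Pist_h = (R + L) - R.*cos(Theta) - sqrt(L.^2 - (R.*sin(Theta)).^2);
Bore_A  = pi * Bore^2 / 4;
Swept_V = Pist_h .* Bore_A;
Clear_V = max(Swept_V) / 10;
Cyl_V   = Swept_V + Clear_V;

Cyl_P = ones(numel(Cyl_V), 1);
for k = 2:numel(Cyl_V)   % row-vector range: one scalar k per iteration
    Cyl_P(k) = Cyl_P(k-1) * (Cyl_V(k) / Cyl_V(k-1))^1.35;
end

% The product of successive ratios telescopes to a single ratio
Cyl_P_vec = (Cyl_V ./ Cyl_V(1)).^1.35;
```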
I am new to MATLAB, and I am trying to loop over fifty elements at a time instead of one element at a time. For example, I have a list of 1000 elements, and I would like to compute the sum of every fifty elements. Rather than calling sum on manually indexed ranges, I think it would be much faster with a loop. How would I go about doing this?
I.e. [1, ..., 50th element, 51st element, ..., 100, ...]
The output would be the sums of elements 1:50, 51:100, 101:150, and so on.
Thanks in advance
I'm not really sure what you mean by "a sum function through indexing", but there are various ways to do this. In general I try to avoid explicit loops in Matlab and let MathWorks functions do their magic.
results = zeros(20,1);
for i = 1:20
results(i) = sum(1 + (50 * (i - 1)):50 + 50 * (i - 1));
end
Another option is to do something like arrayfun.
sIndex = 1:50:951;
eIndex = 50:50:1000;
result = arrayfun(@(x, y) sum(x:y), sIndex, eIndex);
You could also use reshape and sum to do it one shot.
numbers = 1:1000;
numbers2 = reshape(numbers, 50, []);
result = sum(numbers2);
This last method is what I personally would say is a Matlab way of doing it. arrayfun is basically a wrapper around a loop and the loop is...well a loop.
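If you have R2016a or later you can also use movsum, which returns the sum over every sliding window; keeping every 50th window then gives the block sums:
array = 1:1000;
win = 50; % window size
msum = movsum(array,win,'Endpoints','discard'); % 951 sliding-window sums
blockSums = msum(1:win:end); % sums of 1:50, 51:100, ...
In the same way, you can use: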
movmax Moving maximum
movmean Moving mean
movmedian Moving median
movmin Moving minimum
movstd Moving standard deviation
movvar Moving variance
Using cumsum and diff you can obtain the desired result.
C = [0 cumsum(a)]; % a is the 1-by-1000 row vector of data
out = diff(C(1:50:end));
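As a quick check that the reshape and cumsum/diff approaches agree, here is a small self-contained comparison. It also sketches an accumarray variant for the case where the length is not a multiple of the block size, which reshape cannot handle directly; the variable names and the 1020-element example are illustrative.

```matlab
a   = 1:1000;   % example data
blk = 50;       % block size

% reshape: one block per column, then sum down the columns
s_reshape = sum(reshape(a, blk, []), 1);

% cumsum/diff: difference of the running total at block boundaries
C = [0 cumsum(a)];
s_cumsum = diff(C(1:blk:end));

% accumarray: label each element with its block number, then sum per label.
% This also works when the length is not a multiple of blk.
b   = 1:1020;
grp = ceil((1:numel(b)).' / blk);
s_accum = accumarray(grp, b(:)).';   % 21 sums, the last over 20 elements
```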
I have a matrix time-series data for 8 variables with about 2500 points (~10 years of mon-fri) and would like to calculate the mean, variance, skewness and kurtosis on a 'moving average' basis.
Let's say frames = [100 252 504 756]. I would like to calculate the four statistics above over each of the (time-)frames, on a daily basis, so the result for day 300 in the case of the 100-day frame would be [mean variance skewness kurtosis] for the period day 201 to day 300 (100 days in total), and so on.
I know this means I would get an array output, and that the first frame-length days would be NaNs, but I can't figure out the required indexing to get this done...
This is an interesting question because I think the optimal solution is different for the mean than it is for the other sample statistics.
I've provided a simulation example below that you can work through.
First, choose some arbitrary parameters and simulate some data:
%#Set some arbitrary parameters
T = 100; N = 5;
WindowLength = 10;
%#Simulate some data
X = randn(T, N);
For the mean, use filter to obtain a moving average:
MeanMA = filter(ones(1, WindowLength) / WindowLength, 1, X);
MeanMA(1:WindowLength-1, :) = nan;
I had originally thought to solve this problem using conv as follows:
MeanMA = nan(T, N);
for n = 1:N
MeanMA(WindowLength:T, n) = conv(X(:, n), ones(WindowLength, 1), 'valid');
end
MeanMA = (1/WindowLength) * MeanMA;
But as @PhilGoddard pointed out in the comments, the filter approach avoids the need for the loop.
Also note that I've chosen to make the dates in the output matrix correspond to the dates in X so in later work you can use the same subscripts for both. Thus, the first WindowLength-1 observations in MeanMA will be nan.
For the variance, I can't see how to use either filter or conv or even a running sum to make things more efficient, so instead I perform the calculation manually at each iteration:
VarianceMA = nan(T, N);
for t = WindowLength:T
VarianceMA(t, :) = var(X(t-WindowLength+1:t, :));
end
We could speed things up slightly by exploiting the fact that we have already calculated the mean moving average. Simply replace the within loop line in the above with:
VarianceMA(t, :) = (1/(WindowLength-1)) * sum((bsxfun(@minus, X(t-WindowLength+1:t, :), MeanMA(t, :))).^2);
However, I doubt this will make much difference.
If anyone else can see a clever way to use filter or conv to get the moving window variance I'd be very interested to see it.
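One possibility, offered as a sketch rather than a definitive answer, is to run filter on both X and X.^2 and use the identity that the sample variance equals (sum of squares - (sum)^2 / W) / (W - 1). The names S1, S2 and VarFilt are illustrative. Be aware that this formula can lose precision through cancellation when the mean is large relative to the standard deviation.

```matlab
rng(0);                       % fixed seed so the example is reproducible
T = 100; N = 5; WindowLength = 10;
X = randn(T, N);

k  = ones(1, WindowLength);
S1 = filter(k, 1, X);         % moving sum of X
S2 = filter(k, 1, X.^2);      % moving sum of X squared
VarFilt = (S2 - S1.^2 / WindowLength) / (WindowLength - 1);
VarFilt(1:WindowLength-1, :) = nan;   % incomplete windows

% Reference: the explicit loop, for comparison
VarLoop = nan(T, N);
for t = WindowLength:T
    VarLoop(t, :) = var(X(t-WindowLength+1:t, :));
end
maxDiff = max(max(abs(VarFilt(WindowLength:T, :) - VarLoop(WindowLength:T, :))));
```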
I leave the case of skewness and kurtosis to the OP, since they are essentially just the same as the variance example, but with the appropriate function.
A final point: if you were converting the above into a general function, you could pass in an anonymous function as one of the arguments, then you would have a moving average routine that works for arbitrary choice of transformations.
Final, final point: For a sequence of window lengths, simply loop over the entire code block for each window length.
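The general routine mentioned above could be sketched as follows. movingStat is a hypothetical name; it assumes the supplied function reduces a WindowLength-by-N block to a 1-by-N row, as mean and var do (skewness and kurtosis from the Statistics Toolbox behave the same way).

```matlab
rng(0);
X = randn(100, 5);
W = 10;

% Apply f to every full window and pad the first W-1 rows with NaN
movingStat = @(f, X, W) [nan(W-1, size(X, 2)); ...
    cell2mat(arrayfun(@(t) f(X(t-W+1:t, :)), (W:size(X, 1)).', ...
                      'UniformOutput', false))];

MeanMA  = movingStat(@mean, X, W);
VarMA   = movingStat(@var, X, W);
RangeMA = movingStat(@(x) max(x) - min(x), X, W);   % any custom statistic
```

For several window lengths, call movingStat once per length inside a loop.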
I have managed to produce a solution, which only uses basic functions within MATLAB and can also be expanded to include other functions, (for finance: e.g. a moving Sharpe Ratio, or a moving Sortino Ratio). The code below shows this and contains hopefully sufficient commentary.
I am using a time series of Hedge Fund data, with ca. 10 years worth of daily returns (which were checked to be stationary - not shown in the code). Unfortunately I haven't got the corresponding dates in the example so the x-axis in the plots would be 'no. of days'.
% start by importing the data you need - here it is a selection out of an
% excel spreadsheet
returnsHF = xlsread('HFRXIndices_Final.xlsx','EquityHedgeMarketNeutral','D1:D2742');
% two years to be used for the moving average. (250 business days in one year)
window = 500;
% create zero-vectors to fill with the MA values at each point in time
% (one entry per full window, i.e. length(returnsHF)-window+1 entries).
mean_avg = zeros(length(returnsHF)-window+1,1);
st_dev = zeros(length(returnsHF)-window+1,1);
skew = zeros(length(returnsHF)-window+1,1);
kurt = zeros(length(returnsHF)-window+1,1);
% Now work through the time-series with each of the functions (one can add
% any other functions required), assigning the values to the zero-vectors
for count = window:length(returnsHF)
% This is the most tricky part of the script, the indexing in this section
% The TwoYearReturn is what is shifted along one period at a time with the
% for-loop.
TwoYearReturn = returnsHF(count-window+1:count);
mean_avg(count-window+1) = mean(TwoYearReturn);
st_dev(count-window+1) = std(TwoYearReturn);
skew(count-window+1) = skewness(TwoYearReturn);
kurt(count-window +1) = kurtosis(TwoYearReturn);
end
% Plot the MAs
subplot(4,1,1), plot(mean_avg)
title('2yr mean')
subplot(4,1,2), plot(st_dev)
title('2yr stdv')
subplot(4,1,3), plot(skew)
title('2yr skewness')
subplot(4,1,4), plot(kurt)
title('2yr kurtosis')