MATLAB, graphing functions

I have a homework problem. I think I did it correctly, but I want to be 100% sure. Can anyone check it for me before I hand it in?
Thank you.
Question:
Plot the function given by f(x) = 2 sin(2x) − 3 cos(x/2) over the interval [0, 2π] using steps of length 0.001 (How?). Use the commands max and min to estimate the maximum and minimum points. Include the maximum and minimum points as tick marks on the x-axis and the maximum and minimum values as tick marks on the y-axis.
My code:
x=linspace(0,2*pi,6280);
f=@(x)...
2.*sin(2.*x)-3.*cos(x./2);
%f = @(x)2.*sin(2.*x)-3.*cos(x./2)
g=@(x)...
-1*(2.*sin(2.*x)-3.*cos(x./2));
%g = @(x)-1*(2.*sin(2.*x)-3.*cos(x./2))
[x3,y5]=fminbnd(g,0,2*pi);
%x3 = 4.0968
%y3 = -3.2647
[x2,y4]=fminbnd(f,0,2*pi);
%x2 =2.1864
%y2 = -3.2647
y2=max(f(x));
y3=min(f(x));
plot(x,f(x));
set(gca,'XTick',[x2 x3]);
set(gca,'YTick',[y2 y3]);
(*after I paste this code here, it appeared not as nice as I had it in my program, don't know why)

To create a vector with a given step, use the colon operator:
x=0:0.001:2*pi;
Why do you have the g(x) function, and why are you using fminbnd? Use min and max instead: both can also return the index of the value they find, and you can use that index to look up the corresponding x value.
[ymin, minindex] = min(f(x));
xmin = x(minindex);
In the general case, if the min/max value occurs more than once, the index will contain only the first occurrence. Instead you can do:
y = f(x);
minindex = find(y==ymin);
Or, for real values, to avoid precision errors:
minindex = find(abs(y-ymin)<=eps);
Also, your last statement returns the error "Values must be monotonically increasing". To avoid it, sort your tick values:
set(gca,'XTick',sort([xmin xmax]));
set(gca,'YTick',sort([ymin ymax]));
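Putting the pieces of this answer together, a minimal sketch might look like the following. The function and the 0.001 step come from the question; the names imax, imin, xmax and ymax are illustrative and do not appear in the original post.

```matlab
% Function from the question, written as an anonymous function.
f = @(x) 2.*sin(2.*x) - 3.*cos(x./2);

% Grid over [0, 2*pi] with step 0.001 (the last point falls just short of 2*pi).
x = 0:0.001:2*pi;
y = f(x);

% max and min return the extreme value and the index where it first occurs.
[ymax, imax] = max(y);
[ymin, imin] = min(y);
xmax = x(imax);
xmin = x(imin);

% Plot, then place ticks at the extrema; tick vectors must be increasing.
plot(x, y);
set(gca, 'XTick', sort([xmin xmax]));
set(gca, 'YTick', sort([ymin ymax]));
```

Because the extrema are read off a grid, xmin and xmax are only accurate to about the step length.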

Related

MATLAB - Create Array Variable in For Loop and Plot

I will preface this post with the obvious fact that I'm not very experienced in MATLAB and this post may be somewhat confusing. Any help is appreciated!
I need to store data inside two parameters, but I am unsure how to do it. The number of "x" values is known, but it is entered by the user, so it cannot be hard coded. The same goes for the "y" values. Here's a simplified example of what I think I need (numbers are hard coded here for the sake of the example).
Then, the final figure should have multiple plots on it. Each "x" variable is its own "output" that needs to be plotted. In the end I need "x" number of plots with "z" and "y" being the (X,Y) coordinates for each "x" plot, respectively.
EDIT: Updated example code.
list = [.0025, .005, .0075];
x = input('How many? ');
y = linspace(2.4*10^9, 5.0*10^9, 1000);
z = zeros(x, length(y));
for i = x
time = list(i)/(3*10^8);
for j = y
z(i,j) = (time * j);
end
end
for i = x
plot(z(i,j));
end
I get the following error:
Requested 3x2400000000 (53.6GB) array exceeds maximum array size preference. Creation of arrays greater
than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or
preference panel for more information.
The example that I provided could be totally wrong but I hope I have explained enough for someone to provide feedback.
Create the z-Array beforehand to your needs: https://uk.mathworks.com/help/matlab/ref/zeros.html
Then you can fill it element by element using parentheses for indexing, e.g. z(x,y) = x+y
HTH
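As a hedged illustration of that advice applied to the question's code: the error appears because `for j = y` iterates over the values of y (around 2.4e9), which are then used as column indices. The sketch below loops over indices instead. The hard-coded x = 3 is an assumption standing in for the input() call, and the plot follows the question's request that z and y be the (X,Y) coordinates.

```matlab
list = [.0025, .005, .0075];
x = 3;                              % assumption: replaces input('How many? ')
y = linspace(2.4e9, 5.0e9, 1000);

% Preallocate one row per "x" value and one column per "y" value.
z = zeros(x, length(y));

for i = 1:x                         % iterate over row indices 1..x
    time = list(i) / 3e8;
    for j = 1:length(y)             % iterate over column indices, not y values
        z(i, j) = time * y(j);
    end
end

% One curve per row of z, all on the same axes.
figure;
hold on;
for i = 1:x
    plot(z(i, :), y);
end
hold off;
```

The inner loop could also be replaced by the vectorized assignment z(i, :) = time * y.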

How can I sort the values in the jumping columns?

I have a 25x600 matrix, and some columns contain both positive and negative values. I want output like [+ + - -] (four values: two positive and two negative). I am guaranteed to always have two positive values immediately before the transition and two negative values immediately after.
My attempt was as follows:
clc;
clear all;
close all;
%%
data=[-0.0059972;-0.004994;-0.0029881;2.0868e-05;
0.0030299;0.013059;0.033115;0.063196;0.093273;0.1935;0.39385;0.69423;0.99448;1.9950;3.99550;6.99550;9.9957;19.9961;39.99620;69.9960;
99.99530;199.99810;399.99140;699.98860;1000.03130]
for r=1:600
lam=data(:,r);
N_lam = length(lam);
%%
for j=1:N_lam
kk=0;
r1=0;
if(sign(lam(j))==1)
kk=kk+1;
lampos(kk)=lam(j);
if (length(lampos(kk))>3 &length(lamneg(r1))>2)
break
end
else
r1=r1+1;
lamneg(r1)=lam(j);
end
end
cc{r}=[lampos lamneg];
end
Any help would be greatly appreciated.
The find function may be useful here, since it can locate the positions where the data changes sign. The following finds the index ind of the last negative value of data (assumed here to be a 1-D vector) before it rises above zero:
num_rises = 1;
ind = find(data(1:end-1)<=0 & data(2:end)>0, num_rises, 'first')
As such, for each column, you would be interested in the values at ind-1 and ind for the negative values, and at ind+1 and ind+2 for the positive values.
It was also unclear to me how many sets of these 4 values were of interest to you. To find more places where the data rises above zero, change the value of num_rises to suit your needs.
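A small sketch of this idea, using the first few values of the question's sample column. The name vals and the bounds check are assumptions added for illustration.

```matlab
% First six values of the sample column from the question.
data = [-0.0059972; -0.004994; -0.0029881; 2.0868e-05; 0.0030299; 0.013059];

% Index of the last non-positive value before the data rises above zero.
num_rises = 1;
ind = find(data(1:end-1) <= 0 & data(2:end) > 0, num_rises, 'first');

% Keep the crossing only if two samples exist on each side of it.
ind = ind(ind >= 2 & ind + 2 <= numel(data));

% Two values before the sign change and two after (empty if no crossing).
vals = data(ind-1 : ind+2);
```

For a transition from positive to negative, as described in the question, the two comparisons would be swapped.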

How to get cumulative distribution functions of a vector in Matlab using cumsum?

I want to get the probability of obtaining a value X greater than or equal to x_i, that is P(X>=x_i), which is what I mean here by the cumulative distribution function (CDF).
I've tried to do it in Matlab with this code.
Let's assume the data is in the column vector p1.
xp1 = linspace(min(p1), max(p1)); %range of bins
histp1 = histc(p1(:), xp1); %histogram of data
probp1 = histp1/sum(histp1); %PDF (probability distribution function)
figure;plot(probp1, 'o')
Now I want to calculate the CDF,
sorncount = flipud(histp1);
cumsump1 = cumsum(sorncount);
normcumsump1 = cumsump1/max(cumsump1);
cdf = flipud(normcumsump1);
figure;plot(xp1, cdf, 'ok');
Can anyone tell me whether this is correct or whether I'm doing something wrong?
Your code works correctly, but is a bit more complicated than it could be. Since probp1 has been normalized to have sum equal to 1, the maximum of its cumulative sum is guaranteed to be 1, so there is no need to divide by this maximum. This shortens the code a bit:
xp1 = linspace(min(p1), max(p1)); %range of bins
histp1 = histc(p1(:), xp1); %count for each bin
probp1 = histp1/sum(histp1); %PDF (probability distribution function)
cdf = flipud(cumsum(flipud(probp1))); %CDF (unconventional, of P(X>=a) kind)
As Raab70 noted, most of the time CDF is understood as P(X<=a), in which case you don't need flipud: taking cumsum(probp1) is all that's needed.
Also, I would probably use probp1(end:-1:1) instead of flipud(probp1), so that the vector is flipped no matter if it's a row or column.
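For comparison, here is a short sketch that computes both kinds of cumulative distribution from the normalized bin probabilities. The random p1 is stand-in data, and cdfLE and cdfGE are illustrative names. histc is kept to match the code above, although newer MATLAB releases recommend histcounts instead.

```matlab
% Assumption: stand-in data; in practice p1 is your own vector.
p1 = randn(1000, 1);

xp1 = linspace(min(p1), max(p1));   % 100 bin edges spanning the data
histp1 = histc(p1(:), xp1);         % count for each bin
probp1 = histp1 / sum(histp1);      % bin probabilities, summing to 1

% Conventional kind: probability accumulated from the smallest bin upward.
cdfLE = cumsum(probp1);

% P(X>=a) kind: probability accumulated from the largest bin downward.
cdfGE = flipud(cumsum(flipud(probp1)));

figure;
plot(xp1, cdfLE, 'o', xp1, cdfGE, 'ok');
```

Since probp1 already sums to 1, neither curve needs a further division by its maximum.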

A moving average with different functions and varying time-frames

I have a matrix of time-series data for 8 variables with about 2500 points (~10 years of Mon-Fri) and would like to calculate the mean, variance, skewness and kurtosis on a 'moving average' basis.
Let's say frames = [100 252 504 756]. I would like to calculate the four functions above over each of the (time-)frames, on a daily basis, so the result for day 300 in the case of the 100-day frame would be [mean variance skewness kurtosis] for the period day 201 to day 300 (100 days in total), and so on.
I know this means I would get an array output, and that the first frame-length days would be NaNs, but I can't figure out the indexing required to get this done...
This is an interesting question because I think the optimal solution is different for the mean than it is for the other sample statistics.
I've provided a simulation example below that you can work through.
First, choose some arbitrary parameters and simulate some data:
%#Set some arbitrary parameters
T = 100; N = 5;
WindowLength = 10;
%#Simulate some data
X = randn(T, N);
For the mean, use filter to obtain a moving average:
MeanMA = filter(ones(1, WindowLength) / WindowLength, 1, X);
MeanMA(1:WindowLength-1, :) = nan;
I had originally thought to solve this problem using conv as follows:
MeanMA = nan(T, N);
for n = 1:N
MeanMA(WindowLength:T, n) = conv(X(:, n), ones(WindowLength, 1), 'valid');
end
MeanMA = (1/WindowLength) * MeanMA;
But as @PhilGoddard pointed out in the comments, the filter approach avoids the need for the loop.
Also note that I've chosen to make the dates in the output matrix correspond to the dates in X so in later work you can use the same subscripts for both. Thus, the first WindowLength-1 observations in MeanMA will be nan.
For the variance, I can't see how to use either filter or conv or even a running sum to make things more efficient, so instead I perform the calculation manually at each iteration:
VarianceMA = nan(T, N);
for t = WindowLength:T
VarianceMA(t, :) = var(X(t-WindowLength+1:t, :));
end
We could speed things up slightly by exploiting the fact that we have already calculated the mean moving average. Simply replace the within loop line in the above with:
VarianceMA(t, :) = (1/(WindowLength-1)) * sum((bsxfun(@minus, X(t-WindowLength+1:t, :), MeanMA(t, :))).^2);
However, I doubt this will make much difference.
If anyone else can see a clever way to use filter or conv to get the moving window variance I'd be very interested to see it.
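One possible way to do this, offered as a sketch rather than as part of the original answer, uses the identity that the sample variance of a window equals (sum of squares minus squared sum divided by W) divided by (W-1). Both running sums can be obtained with filter. The names W, S1, S2, VarFilt and VarLoop are illustrative.

```matlab
% Simulated data, as in the answer above.
T = 100; N = 5;
W = 10;                          % window length
X = randn(T, N);

% Running sums over the last W rows of each column.
k = ones(W, 1);
S1 = filter(k, 1, X);            % running sum of x
S2 = filter(k, 1, X.^2);         % running sum of x.^2

% Sample variance from the two running sums.
VarFilt = (S2 - S1.^2 / W) / (W - 1);
VarFilt(1:W-1, :) = nan;         % window not yet full

% Reference: the explicit loop, for comparison.
VarLoop = nan(T, N);
for t = W:T
    VarLoop(t, :) = var(X(t-W+1:t, :));
end
maxDiff = max(max(abs(VarFilt(W:T, :) - VarLoop(W:T, :))));
```

This formula subtracts two similar quantities, so it can lose precision when the mean of the data is large relative to its spread. Centering the data first would reduce that risk.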
I leave the case of skewness and kurtosis to the OP, since they are essentially just the same as the variance example, but with the appropriate function.
A final point: if you were converting the above into a general function, you could pass in an anonymous function as one of the arguments, then you would have a moving average routine that works for arbitrary choice of transformations.
Final, final point: For a sequence of window lengths, simply loop over the entire code block for each window length.
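The last two points can be sketched together as follows, assuming the statistics are passed as function handles that work column-wise. The names frames, stats, Out and block, and the short window lengths, are illustrative.

```matlab
% Assumptions: simulated data and short window lengths, for illustration.
T = 100; N = 5;
X = randn(T, N);
frames = [10 20];

% Column-wise statistics as function handles. @skewness and @kurtosis could
% be appended here if the Statistics Toolbox is available.
stats = {@mean, @var};

% Out(t, n, s, w) holds statistic s of series n over the window of length
% frames(w) ending at row t. It stays NaN until the window is full.
Out = nan(T, N, numel(stats), numel(frames));

for w = 1:numel(frames)
    L = frames(w);
    for t = L:T
        block = X(t-L+1:t, :);               % rows of the current window
        for s = 1:numel(stats)
            Out(t, :, s, w) = stats{s}(block);
        end
    end
end
```

Keeping the output aligned with the rows of X means the same subscripts can be used for both, at the cost of leading NaNs.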
I have managed to produce a solution, which only uses basic functions within MATLAB and can also be expanded to include other functions, (for finance: e.g. a moving Sharpe Ratio, or a moving Sortino Ratio). The code below shows this and contains hopefully sufficient commentary.
I am using a time series of Hedge Fund data, with ca. 10 years worth of daily returns (which were checked to be stationary - not shown in the code). Unfortunately I haven't got the corresponding dates in the example so the x-axis in the plots would be 'no. of days'.
% start by importing the data you need - here it is a selection out of an
% excel spreadsheet
returnsHF = xlsread('HFRXIndices_Final.xlsx','EquityHedgeMarketNeutral','D1:D2742');
% two years to be used for the moving average. (250 business days in one year)
window = 500;
% create zero-matrices to fill with the MA values at each point in time.
mean_avg = zeros(length(returnsHF)-window+1,1);
st_dev = zeros(length(returnsHF)-window+1,1);
skew = zeros(length(returnsHF)-window+1,1);
kurt = zeros(length(returnsHF)-window+1,1);
% Now work through the time-series with each of the functions (one can add
% any other functions required), assigning the values to the zero-matrices
for count = window:length(returnsHF)
% This is the most tricky part of the script, the indexing in this section
% The TwoYearReturn is what is shifted along one period at a time with the
% for-loop.
TwoYearReturn = returnsHF(count-window+1:count);
mean_avg(count-window+1) = mean(TwoYearReturn);
st_dev(count-window+1) = std(TwoYearReturn);
skew(count-window+1) = skewness(TwoYearReturn);
kurt(count-window +1) = kurtosis(TwoYearReturn);
end
% Plot the MAs
subplot(4,1,1), plot(mean_avg)
title('2yr mean')
subplot(4,1,2), plot(st_dev)
title('2yr stdv')
subplot(4,1,3), plot(skew)
title('2yr skewness')
subplot(4,1,4), plot(kurt)
title('2yr kurtosis')

Finding the highest peak above threshold only

if (pbcg(k+M) > pbcg(k-1+M) && pbcg(k+M) > pbcg(k+1+M) && pbcg(k+M) > threshold)
peaks_y(Counter) = pbcg(k+M);
peaks_x(Counter) = k + M;
py = peaks_y(Counter);
px = peaks_x(Counter);
plot(px,py,'ro');
Counter = (Counter + 1)-1;
fid = fopen('y1.txt','a');
fprintf(fid, '%d\t%f\n', px, py);
fclose(fid);
end
end
Previously this code had no problem finding the peak.
The condition that is supposed to make it find only one peak is this:
if (pbcg(k+M) > pbcg(k-1+M) && pbcg(k+M) > pbcg(k+1+M) && pbcg(k+M) > threshold)
but right now it shows me every peak above the threshold instead of only the highest peak.
UPDATE: What if the highest peak consists of 4 nodes that all have the same value?
EDIT:
If multiple peaks with the same value show up, I will take the one in the middle and plot it.
What I mean by that is, for example, [1,1,1,4,4,4,2,2,2]
I will take the '4' at the 5th position, so the plotted point will be at the center of the graph you see.
It will be much faster and much more readable to use the built-in max function, and then test if the max value is larger than the threshold.
[C,I] = max(pbcg);
if C > threshold
...
%// I is the index of the maximal value, and C is the maximal value.
end
As an alternative solution, you may consider the built-in function findpeaks, which offers several ways to detect peaks in a given signal. Among those, you may call
findPeaks = findpeaks(data,'threshold',threshold_resolution);
The only limit I see is that findpeaks is only available with the Signal Processing Toolbox.
EDIT
In case of multiple peaks over the defined threshold, I would just call max to figure the highest peak, as follows
max(findPeaks);
Assuming you have a vector with peaks pbcg
Here is how you can get the middle one:
highestPeakValue = max(pbcg)
f = find(pbcg == highestPeakValue);
middleHighestPeakLocation = f(ceil(length(f)/2))
Note that you can still make it more robust for cases where there are no peaks, and you can adjust it to behave differently when the number of tied peaks is even and there are two middle ones (as written, ceil picks the first of the two).
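Applied to the example from the question, and combined with the threshold test, this might look like the sketch below. The threshold value of 3 and the names px and py for the result are assumptions for illustration.

```matlab
% Example signal from the question.
pbcg = [1, 1, 1, 4, 4, 4, 2, 2, 2];
threshold = 3;                     % assumption: illustrative value

px = [];                           % stays empty if nothing exceeds threshold
py = [];

highestPeakValue = max(pbcg);
if highestPeakValue > threshold
    % All positions that share the highest value.
    f = find(pbcg == highestPeakValue);
    % Middle position; for an even count this is the first of the two middles.
    px = f(ceil(length(f) / 2));
    py = pbcg(px);
    plot(1:numel(pbcg), pbcg, '-', px, py, 'ro');
end
```

This treats all samples equal to the maximum as one group, so it assumes they form a single plateau. Separate peaks of equal height would need to be split into runs first.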