How to convert/get the value of integer expression in cplex? - type-conversion

I am new to CPLEX. I need to compute the overlap between several intervals at the same time. For that, I compute the overlap_length between two intervals at a time, and I save the max and min of their start/end points.
For example, let's consider three interval variables I1, I2, I3. The objective is to maximize the overlap between the three.
The code is then as follows:
over1=mdl.overlap_length(I1,I2)
start1=max(mdl.start_of(I1),mdl.start_of(I2))
end1=min(mdl.end_of(I1),mdl.end_of(I2))
over1=mdl.overlap_length(I3,(start1,end1))
Then I maximize over1.
With that, I get the following error:
AssertionError: To express a fixed interval, 'interval2' should be a tuple of two integers
In fact, start1 and end1 are CP integer expressions, and I didn't find a way to convert them or to get their values!
Does anyone have an idea how to do that?
Thanks,

In your last line, start1 and end1 need to be bounds: they need to be values, not decision variables. So as a workaround, you could rely on an artificial interval.
Since OPL is very close to docplex but, IMHO, easier, let me show you the way in OPL.
using CP;
dvar interval I1 in 0..20;
dvar interval I2 in 0..20;
dvar interval I3 in 0..20;
dvar int over1;
dvar int start1;
dvar int end1;
dvar interval artificialInterval;
maximize over1;
subject to
{
    over1==overlapLength(I1,I2);
    start1==maxl(startOf(I1),startOf(I2));
    end1==minl(endOf(I1),endOf(I2));
    startOf(artificialInterval)==start1;
    endOf(artificialInterval)==end1;
    over1==overlapLength(I3,artificialInterval);
}
This works fine. The same workaround carries over to docplex: declare an extra interval variable, constrain its start_of and end_of to equal start1 and end1, and pass that interval as the second argument of overlap_length.

Related

How to compare two double values in matlab?

I want to compare two double values. I know the value of MinimumValue, which is 3.5261e+04; I got this MinimumValue from an array e. My code should print the first statement, 'recognized face', because both values are the same, but instead it displays the second statement, 'unrecognized face'.
What is the mistake in my code?
MinimumValue = min(e)
theta = 3.5261e+04;
if (MinimumValue == theta)
    fprintf('recognized face\n');
else
    fprintf('unrecognized face\n');
end
There are two approaches:
Replace if MinimumValue == theta with if MinimumValue <= theta. This is the easier but probably poorer approach for your problem.
Chances are, MinimumValue differs from theta by a very small amount. If you did some calculations by hand to determine that theta = 3.5261e+04 and believe there are more decimal places, you should use format long to determine the actual value of theta to 15 significant figures. After that, you can use if abs(MinimumValue - theta) <= eps. (Edit: as patrick has noted in the comment below, you should compare against some user-defined tolerance or eps instead of realmin('double').)
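For illustration, here is a minimal sketch of the tolerance approach; the tol value is an assumed user-defined tolerance, and theta should first be pinned down to full precision with format long:
MinimumValue = min(e);              % e is the array from the question
theta = 3.5261e+04;                 % replace with the full-precision value
tol = 1e-6;                         % assumed user-defined tolerance
if abs(MinimumValue - theta) <= tol
    fprintf('recognized face\n');
else
    fprintf('unrecognized face\n');
end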

Error in data source: correct iteratively the vector without for loop?

Hello everyone, I have a new small problem:
The data I am using have a weird trading time that runs from 17:00 of one day to 16:15 of the day after.
That means that, e.g., for the day 09-27-2013, the source I am using registers the transactions that occurred as follows:
DATE , TIME , PRICE
09/27/2013,17:19:42,3225.00,1 %# first obs of the vector
09/27/2013,18:37:59,3225.00,1 %# second obs of the vector
09/27/2013,08:31:32,3200.00,1
09/27/2013,08:36:17,3203.00,1
09/27/2013,09:21:34,3210.50,1 %# fifth obs of the vector
Now, the first and second observations are incorrect for me: they belong to the 9/27 trading day, but they were executed on 9/26. Since I am working on some MATLAB functions that rely on non-decreasing times, I need to solve this issue. The date format I am using is actually MATLAB's datenum format, so I am trying to solve the problem by just subtracting one from the incorrect observations:
%# calling the time vector 'time', I can identify the 'incorrect' observations
idx=find(diff(time)<0);
time(idx)=time(idx)-1;
It is easy to tell that this will only fix the 'last' incorrect observations of a series; in the previous example it would only correct the second element. I would have to run the code several times (I thought about a while loop) until idx is empty. This is not a big issue when working with small series, but I have up to 20 million observations and probably hundreds of thousands of consecutively incorrect ones.
Is there a way to fix this in a vectorized way?
idx = find(diff(time)<0);
while ~isempty(idx)
    time(idx) = time(idx) - 1;
    idx = find(diff(time)<0);
end
However, given that the computation is not so complex, I thought that a for loop could solve the issue efficiently, and my idea was the following:
N = size(time,1);
for i = N:-1:1
    if diff(time(i,:)<0)
        time(i,:) = time(i,:)-1;
    end
end
Sadly, it does not seem to work.
Here is an example of data I am actually using.
735504.591157407
735507.708030093 %# I made this up to give you an example of two consecutively wrong observations
735507.708564815 %# This is an incorrect observation
735507.160138889
735507.185358796
735507.356562500
Thanks everyone in advance
Sensible version -
for count = 1:numel(time)
    dtime = diff([0; time]);
    ind1 = find(dtime<0,1,'last')-1;
    time(ind1) = time(ind1)-1;
end
Faster-but-crazier version -
dtime = diff([0; time]);
for count = 1:numel(time)
    ind1 = find(dtime<0,1,'last')-1;
    time(ind1) = time(ind1)-1;
    dtime(ind1+1) = 0;
    dtime(ind1) = dtime(ind1)-1;
end
Even crazier version -
dtime = diff([0; time]);
ind1 = numel(dtime);
for count = 1:numel(time)
    ind1 = find(dtime(1:ind1)<0,1,'last')-1;
    time(ind1) = time(ind1)-1;
    dtime(ind1) = dtime(ind1)-1;
end
Some average computation runtimes for these versions with various data sizes -
Datasize 1: 3432 elements
Version 1 - 0.069 sec
Version 2 - 0.042 sec
Version 3 - 0.034 sec
Datasize 2: 20 Million elements
Version 1 - 37029 sec
Version 2 - 23303 sec
Version 3 - 20040 sec
So apparently I had 3 other different problems in the data source that I think could have stuck the routine Divakar proposed. Anyway, I thought it was too slow, so I looked for another solution and came up with a super quick vectorized one.
Given that the observations I want to modify fall within a known interval of time, the function just looks for every observation falling in that interval and modifies it as I want (subtracting 1, in my case).
function [ datetime ] = correct_date( datetime, starttime, endtime )
%# datetime is my vector of dates and times in MATLAB numerical format
%# starttime is the starting hour of the interval expressed in datestr format, e.g. '17:00:00'
%# endtime is the ending hour of the interval expressed in datestr format, e.g. '23:59:59'
if (nargin < 1) || (nargin > 3)
    error('Requires 1 to 3 input arguments.')
end
% default values
if nargin == 1
    starttime = '17:00';
    endtime = '23:59:59';
elseif nargin == 2
    endtime = '23:59:59';
end
tvec = [datenum(starttime) datenum(endtime)];
tvec = tvec - floor(tvec); %# as I am working on multiple days I need to isolate only HH:MM:SS for my interval limits
temp = datetime - floor(datetime); %# same motivation as in the previous line
idx = find(temp >= tvec(1) & temp <= tvec(2)); %# find the indices
datetime(idx) = datetime(idx) - 1; %# modify them as I want
clear tvec temp idx
end
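Hypothetically, for the trading hours described above, the function would be called on the datenum vector like this:
time = correct_date(time, '17:00:00', '23:59:59');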

Implied Volatility in Matlab

I'm trying to calculate the implied volatility using the Black-Scholes formula in Matlab (2012b), but somehow have problems with some strike prices.
For instance blsimpv(1558,1440,0.0024,(1/12),116.4) will return NaN.
I thought it was probably a problem with the function, and therefore searched the internet for other MATLAB scripts and customized them to my needs, but unfortunately I am still not able to return a valid implied volatility.
function sigma=impvol(C,S,K,r,T)
%F=S*exp((r).*T);
%G=C.*exp(r.*T);
%alpha=log(F./K)./sqrt(T);
%beta=0.5*sqrt(T);
%a=beta.*(F+K);
%b=sqrt(2*pi)*(0.5*(F-K)-G);
%c=alpha.*(F-K);
%disc=max(0,b.^2-4*a.*c);
%sigma0=(-b+sqrt(disc))./(2*a);
i=-1000;
while i<=5000
    sigma0=i/1000;
    sigma=NewtonMethod(sigma0);
    if sigma<=10 && sigma>=-10
        fprintf('This is sigma %f',sigma)
    end
    i=i+1;
end
    function s1=NewtonMethod(s0)
        % nested function, so it can see C, S, K, r, T from impvol
        s1=s0;
        count=0;
        f=@(x) call(S,K,r,x,T)-C;
        fprime=@(x) call_vega(S,K,r,x,T);
        max_count=1e4;
        while max(abs(f(s1)))>1e-7 && count<max_count
            count=count+1;
            s0=s1;
            s1=s0-f(s0)./fprime(s0);
        end
    end
end
function d=d1(S,K,r,sigma,T)
d=(log(S./K)+(r+sigma.^2*0.5).*(T))./(sigma.*sqrt(T));
end
function d=d2(S,K,r,sigma,T)
d=(log(S./K)+(r-sigma.^2*0.5).*(T))./(sigma.*sqrt(T));
end
function p=Phi(x)
p=0.5*(1.+erf(x/sqrt(2)));
end
function p=PhiPrime(x)
p=exp(-0.5*x.^2)/sqrt(2*pi);
end
function c=call(S,K,r,sigma,T)
c=S.*Phi(d1(S,K,r,sigma,T))-K.*exp(-r.*(T)).*Phi(d2(S,K,r,sigma,T));
end
function v=call_vega(S,K,r,sigma,T)
v=S.*PhiPrime(d1(S,K,r,sigma,T)).*sqrt(T);
end
Running impvol(116.4,1558,1440,0.0024,(1/12)) will, however, unfortunately return the value Inf. There is somehow a problem with the Newton-Raphson method not converging, but I am kind of clueless about how to solve this. Does anyone know how to solve this problem, or know some other way to calculate the implied volatility?
Thank you in advance for your help!
Kind regards,
Henk
I would definitely suggest this code: Fast Matrixwise Black-Scholes Implied Volatility.
It is able to compute the entire surface in one shot and, in my experience, I found it much more reliable than blsimpv() or impvol(), which are other functions implemented in MATLAB.
The Newton-Raphson method does not work well for implied volatility. You should use the bisection method (I am not sure how it is invoked in Matlab). It is described in http://en.wikipedia.org/wiki/Bisection_method. For completeness, it works this way:
1) Pick an arbitrarily high (impossible) volatility, like high = 200%/year.
2) Pick the lowest possible volatility (low = 0%).
2a) Calculate the option premium for 0% volatility; if the actual premium is lower than that, it implies negative volatility (which is "impossible").
3) While the implied volatility is not found:
3.1) If "high" and "low" are very near (e.g. equal up to the 5th decimal), either one is your implied volatility. If not...
3.2) Calculate the average of "high" and "low": avg = (high + low)/2.
3.3) Calculate the option premium for avg volatility.
3.4) If the actual premium is higher than p(avg), set low = avg, because the implied volatility must lie between avg and high.
3.4a) If the actual premium is lower than p(avg), set high = avg, because the implied volatility must lie between low and avg.
The main disadvantage of bisection is that you have to pick a maximum value, so your function won't find implied volatilities bigger than that. But something like 200%/year should be high enough for real-world usage. A sketch of this search is given below.
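For illustration, here is a minimal MATLAB sketch of the bisection search described above. It assumes the Financial Toolbox function blsprice for the call premium (any pricer with the same signature would do); the function name impvol_bisect, the bounds and the tolerance are illustrative choices, not an existing API:
function sigma = impvol_bisect(premium, S, K, r, T)
% Bisection search for a Black-Scholes implied volatility.
% A call premium at or below its zero-volatility value S - K*exp(-r*T)
% admits no implied volatility at all.
if premium <= max(S - K*exp(-r*T), 0)
    sigma = NaN;
    return
end
low  = 0;                       % lowest possible volatility
high = 2;                       % arbitrary "impossible" upper bound (200%/year)
while high - low > 1e-5         % stop when the bounds are very near
    avg = (low + high) / 2;     % midpoint volatility
    if blsprice(S, K, r, T, avg) < premium
        low = avg;              % implied vol must lie between avg and high
    else
        high = avg;             % implied vol must lie between low and avg
    end
end
sigma = (low + high) / 2;
end
Incidentally, this explains the NaN in the question: for premium 116.4 with S = 1558, K = 1440, r = 0.0024 and T = 1/12, the zero-volatility value S - K*exp(-r*T) is about 118.3, already above the quoted premium, so no implied volatility exists and any root finder has to fail.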
I use yet another method, more like Newton's method (hence not limited to a range, since vega is the derivative), but with a "linearization" fix to avoid hunting and failure due to small vegas:
def implied_volatility(type, premium, S, K, r, s_dummy, t):
    if S <= 0.000001 or K <= 0.000001 or t <= 0.000001 or premium <= 0.000001:
        return 0.0
    s = 0.35
    for cycle in range(0, 120):
        ext_premium = type(S, K, r, s, t)
        if abs(premium - ext_premium) < 0.005:
            return s
        ext_vega = type.vega(S, K, r, s, t)
        # print S, K, r, s, t, premium, ext_premium, ext_vega
        if ext_vega < 0.0000001:
            # Avoids zero division if stuck
            ext_vega = 0.0000001
        s_new = s - (ext_premium - premium) / ext_vega
        if abs(s_new - s) > 1.00:
            # estimated s is too different from previous;
            # it is better to go linearly, since
            # vega is too small to give a good guess
            if s_new > s:
                s += 1.0
            else:
                s -= 1.0
        else:
            s = s_new
        if s < 0.0:
            # No volatility < 0%
            s = 0.0001
        if s > 99.99:
            # No point calculating volatilities > 9999%/year
            return 100.0
    return 0.0
Still, I think that bisection is your best bet.
I created a simple function that conducts a sort of trial-and-error calculation if the output from blsimpv is NaN. This slows down the computation significantly for me, but it always gives me a desirable result.
The way the function is used is shown below:
BSIVC(t,i) = blsimpv(S(t,i),K,r,tau(t),HestonCiter(t,i));
if isnan(BSIVC(t,i))
    BSIVC(t,i) = secondIVcalc(HestonCiter(t,i),S(t,i),K,r,q,tau(t));
end
The function itself is described below:
function IV = secondIVcalc(HestonC,S,K,r,q,T)
lowCdif = 1;
a = 0;
while lowCdif > 0.0001
    a = a + 0.00001;
    lowCdif = HestonC - BSCprice(S,K,r,q,a,T);
end
IV = a;
end
Please note that BSCprice is not a built-in function in MATLAB.
Just to make the code clearer:
BSCprice has the signature BSCprice(underlying asset price, strike price, interest rate, dividend yield, implied vol, time to maturity).
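For reference, here is a minimal sketch of what such a helper could look like, using the standard Black-Scholes call formula with a continuous dividend yield; this is an assumed implementation matching the argument order above, not the author's actual code (normcdf requires the Statistics Toolbox):
function c = BSCprice(S, K, r, q, sigma, T)
% European call price under Black-Scholes with continuous dividend yield q
d1 = (log(S./K) + (r - q + 0.5*sigma.^2).*T) ./ (sigma.*sqrt(T));
d2 = d1 - sigma.*sqrt(T);
c = S.*exp(-q.*T).*normcdf(d1) - K.*exp(-r.*T).*normcdf(d2);
end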

MATLAB: n-minute/hour/day averages of a time-series

This is a follow-up to an earlier question of mine posted here. Based on Oleg Komarov's answer I wrote a little tool to get daily, hourly, etc. averages or sums of my data that uses accumarray() and datevec()'s output structure. Feel free to have a look at it here (it's probably not written very well, but it works for me).
What I would like to do now is add the functionality to calculate n-minute, n-hour, n-day, etc. statistics instead of the 1-minute, 1-hour, 1-day, etc. ones my function computes. I have a rough idea that simply loops over my time vector t (which is pretty much what I would have done anyway if I hadn't learnt about the beautiful accumarray()), but that means I have to do a lot of error-checking for data gaps, uneven sampling times, etc.
I wonder if there is a more elegant/efficient approach that lets me re-use/extend my old function posted above, i.e. something that still makes use of accumarray() and datevec(), since this makes working with gaps very easy.
You can download some sample data taken from my last question here. These were sampled at 30 min intervals, so a possible example of what I want to do would be to calculate 6 hour averages without relying on the assumption that they are free of gaps and/or always sampled at exactly 30 min.
This is what I have come up with so far, and it works reasonably well, apart from a small but easily fixed problem with the time stamps (e.g. 0:30 is representative of the interval from 0:30 to 0:45 -- my old function suffers from the same problem, though):
[ ... see my answer below ...]
Thanks to woodchips for inspiration.
The linked method of using accumarray seems overkill and too complex to me if you start with evenly spaced measurements without any gaps. I have the following function in my private toolbox for calculating an N-point average of vectors:
function y = blockaver(x, n)
% y = blockaver(x, n)
% input points are averaged over n points
% always returns column vector
if n == 1
    y = x(:);
else
    nblocks = floor(length(x) / n);
    y = mean(reshape(x(1:n * nblocks), n, nblocks), 1).';
end
Works pretty well for quick and dirty decimating by a factor N, but note that it does not apply proper anti-alias filtering. Use decimate if that is important.
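For the 30-minute samples from the question, hypothetical 6-hour averages would then be obtained with n = 12:
y6h = blockaver(x, 12); % 12 half-hour samples per 6-hour block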
I guess I figured it out using parts of @Bas Swinckels' answer and @woodchips' code linked above. It is not exactly what I would call good code, but it works and is reasonably fast.
function [ t_acc, x_acc, subs ] = ts_aggregation( t, x, n, target_fmt, fct_handle )
% t is time in datenum format (i.e. days)
% x is whatever variable you want to aggregate
% n is the number of minutes, hours, days
% target_fmt is 'minute', 'hour' or 'day'
% fct_handle can be an arbitrary function (e.g. @sum)
t = t(:);
x = x(:);
switch target_fmt
    case 'day'
        t_factor = 1;
    case 'hour'
        t_factor = 1 / 24;
    case 'minute'
        t_factor = 1 / ( 24 * 60 );
end
t_acc = ( t(1) : n * t_factor : t(end) )';
subs = ones(length(t), 1);
for i = 2:length(t_acc)
    subs(t > t_acc(i-1) & t <= t_acc(i)) = i;
end
x_acc = accumarray( subs, x, [], fct_handle );
end
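As a hypothetical usage example, the 6-hour averages mentioned in the question, with t and x holding the sampled times (datenums) and values, would be:
[t6h, x6h] = ts_aggregation(t, x, 6, 'hour', @mean);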
/edit: Updated to a much shorter function that does use a loop, but appears to be faster than my previous solution.

Looping over matrix elements more efficiently in Matlab

I am writing some MATLAB code and have written an algorithm that works, but I don't think it is particularly efficient. Since I am trying to improve my programming skills, I would like to know if there is a more efficient way of doing this.
I have a reasonably large (~1e7 elements) matrix of values which are unordered but fall within the range [-100, 100]. I want to create a second matrix based on the first by using the following rules:
If the value of the point is > 70, then the value of the point should be set to 70.
If the value of the point is < -70, then the value of the point should be set to -70.
All other values should be rounded to the nearest multiple of 5.
Here is what I am currently doing:
data = 100*(-1+2*rand(1,10000000)); % create random dataset for stackoverflow
new_data = zeros(1,length(data));
for i = 1:length(data)
    if (data(i) > 70)
        new_data(i) = 70;
    elseif (data(i) < -70)
        new_data(i) = -70;
    else
        new_data(i) = round(data(i)/5.0)*5.0;
    end
end
Is there a more efficient method? I think there should be a way to do this using logical indexing, but that is a new discovery for me...
You do not need a loop at all:
data = 100*(-1+2*rand(1,10000000)); % create random dataset for stackoverflow
new_data = zeros(1,length(data)); % note that this memory allocation is not necessary at this point
new_data = round(data/5.0)*5.0;
new_data(data>70) = 70;
new_data(data<-70) = -70;
Even easier is to use max and min and do it in one simple line:
new_data = round(max(-70,min(70,data))/5)*5;
The two answers by H.Muster and woodchips are of course the way to do it, but there are still small improvements to be found. If you are after performance, you might want to exploit specifics of your problem. For example, your output data are integers in the range -100 <= x <= 100, which obviously qualifies for the 8-bit signed integer data type. This code (note the explicit cast to int8 from arbitrary double-precision data)
% your double precision input data
data = 100*(-1+2*rand(1,10000000));
% cast to int8 - matlab does usual round here
data = int8(data);
new_data = 5*(max(-70,min(70,data))/5);
is the fastest for two reasons:
One data element takes 1 byte, not 8. Memory bandwidth is a limiting factor here, so you get a lot of improvement.
round is no longer necessary.
Here are some timings from the codes of H.Muster, woodchips, and my small modification:
H.Muster Elapsed time is 0.235885 seconds.
woodchips Elapsed time is 0.167659 seconds.
my code Elapsed time is 0.023061 seconds.
The difference is quite striking. Although MATLAB uses doubles everywhere by default, you should try to use integer data types when possible. A hypothetical way to reproduce such timings is sketched below.
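If you want to reproduce such measurements, a hypothetical harness using timeit (built into MATLAB since R2013b) could look like the following; the variable names are illustrative:
% compare the double-precision and int8 variants on the same data
data  = 100*(-1+2*rand(1,1e7));  % double input
data8 = int8(data);              % 1-byte integer input (the cast rounds)
t_double = timeit(@() round(max(-70,min(70,data))/5)*5);
t_int8   = timeit(@() 5*(max(-70,min(70,data8))/5));
fprintf('double: %.4f s, int8: %.4f s\n', t_double, t_int8);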
Edit: This works because of how MATLAB implements integer arithmetic. Unlike in C, casting a double to an integer type implies a round operation:
a = 0.1;
int8(a)
ans =
0
a = 0.9;
int8(a)
ans =
1