How to custom VaR at 5% and max return in Portfolio Analytics - performanceanalytics

For Asset Allocation in R using Portfolio Analytics, is there a way to set risk as constant number, then optimize portfolio returns? For example, to maintain VaR always at 5% (conservative), how do weights of 8 assets change in portfolio to max return? In contrast, how do the weights change compared to a risky (VaR =20%) portfolio? In the Portfolio Analytics package, we can only set min risk as objective, but not set risk as a constant number. (Different from Equal Risk Contribution)

Is this what you are looking for?
add.objective(portfolio = base_pf, type = 'VaR', name = 'var', target=0.05)
or
add.objective(portfolio = base_pf, type = 'VaR', name = 'var', target=0.2)

Related

Looking for advice on improving a custom function in AnyLogic

I'm estimating last mile delivery costs in an large urban network using by-route distances. I have over 8000 customer agents and over 100 retail store agents plotted in a GIS map using lat/long coordinates. Each customer receives deliveries from its nearest store (by route). The goal is to get two distance measures in this network for each store:
d0_bar: the average distance from a store to all of its assigned customers
d1_bar: the average distance between all customers common to a single store
I've written a startup function with a simple foreach loop to assign each customer to a store based on by-route distance (customers have a parameter, "customer.pStore" of Store type). This function also adds, in turn, each customer to the store agent's collection of customers ("store.colCusts"; it's an array list with Customer type elements).
Next, I have a function that iterates through the store agent population and calculates the two average distance measures above (d0_bar & d1_bar) and writes the results to a txt file (see code below). The code works, fortunately. However, the problem is that with such a massive dataset, the process of iterating through all customers/stores and retrieving distances via the openstreetmap.org API takes forever. It's been initializing ("Please wait...") for about 12 hours. What can I do to make this code more efficient? Or, is there a better way in AnyLogic of getting these two distance measures for each store in my network?
Thanks in advance.
//for each store, record all customers assigned to it
for (Store store : stores)
{
distancesStore.print(store.storeCode + "," + store.colCusts.size() + "," + store.colCusts.size()*(store.colCusts.size()-1)/2 + ",");
//calculates average distance from store j to customer nodes that belong to store j
double sumFirstDistByStore = 0.0;
int h = 0;
while (h < store.colCusts.size())
{
sumFirstDistByStore += store.distanceByRoute(store.colCusts.get(h));
h++;
}
distancesStore.print((sumFirstDistByStore/store.colCusts.size())/1609.34 + ",");
//calculates average of distances between all customer nodes belonging to store j
double custDistSumPerStore = 0.0;
int loopLimit = store.colCusts.size();
int i = 0;
while (i < loopLimit - 1)
{
int j = 1;
while (j < loopLimit)
{
custDistSumPerStore += store.colCusts.get(i).distanceByRoute(store.colCusts.get(j));
j++;
}
i++;
}
distancesStore.print((custDistSumPerStore/(loopLimit*(loopLimit-1)/2))/1609.34);
distancesStore.println();
}
Firstly a few simple comments:
Have you tried timing a single distanceByRoute call? E.g. can you try running store.distanceByRoute(store.colCusts.get(0)); just to see how long a single call takes on your system. Routing is generally pretty slow, but it would be good to know what the speed limit is.
The first simple change is to use java parallelism. Instead of using this:
for (Store store : stores)
{ ...
use this:
stores.parallelStream().forEach(store -> {
...
});
this will process stores entries in parallel using standard Java streams API.
It also looks like the second loop - where avg distance between customers is calculated doesn't take account of mirroring. That is to say distance a->b is equal to b->a. Hence, for example, 4 customers will require 6 calculations: 1->2, 1->3, 1->4, 2->3, 2->4, 3->4. Whereas in case of 4 customers your second while loop will perform 9 calculations: i=0, j in {1,2,3}; i=1, j in {1,2,3}; i=2, j in {1,2,3}, which seems wrong unless I am misunderstanding your intention.
Generally, for long running operations it is a good idea to include some traceln to show progress with associated timing.
Please have a look at above and post results. With more information additional performance improvements may be possible.

Transferring arrays/classes/records between locales

In a typical N-Body simulation, at the end of each epoch, each locale would need to share its own portion of the world (i.e. all bodies) to the rest of the locales. I am working on this with a local-view approach (i.e. using on Loc statements). I encountered some strange behaviours that I couldn't make sense out of, so I decided to make a test program, in which things got more complicated. Here's the code to replicate the experiment.
proc log(args...?n) {
writeln("[locale = ", here.id, "] [", datetime.now(), "] => ", args);
}
const max: int = 50000;
record stuff {
var x1: int;
var x2: int;
proc init() {
this.x1 = here.id;
this.x2 = here.id;
}
}
class ctuff {
var x1: int;
var x2: int;
proc init() {
this.x1 = here.id;
this.x2 = here.id;
}
}
class wrapper {
// The point is that total size (in bytes) of data in `r`, `c` and `a` are the same here, because the record and the class hold two ints per index.
var r: [{1..max / 2}] stuff;
var c: [{1..max / 2}] owned ctuff?;
var a: [{1..max}] int;
proc init() {
this.a = here.id;
}
}
proc test() {
var wrappers: [LocaleSpace] owned wrapper?;
coforall loc in LocaleSpace {
on Locales[loc] {
wrappers[loc] = new owned wrapper();
}
}
// rest of the experiment further down.
}
Two interesting behaviours happen here.
1. Moving data
Now, each instance of wrapper in array wrappers should live in its locale. Specifically, the references (wrappers) will live in locale 0, but the internal data (r, c, a) should live in the respective locale. So we try to move some from locale 1 to locale 3, as such:
on Locales[3] {
var timer: Timer;
timer.start();
var local_stuff = wrappers[1]!.r;
timer.stop();
log("get r from 1", timer.elapsed());
log(local_stuff);
}
on Locales[3] {
var timer: Timer;
timer.start();
var local_c = wrappers[1]!.c;
timer.stop();
log("get c from 1", timer.elapsed());
}
on Locales[3] {
var timer: Timer;
timer.start();
var local_a = wrappers[1]!.a;
timer.stop();
log("get a from 1", timer.elapsed());
}
Surprisingly, my timings show that
Regardless of the size (const max), the time of sending the array and record strays constant, which doesn't make sense to me. I even checked with chplvis, and the size of GET actually increases, but the time stays the same.
The time to send the class field increases with time, which makes sense, but it is quite slow and I don't know which case to trust here.
2. Querying the locales directly.
To demystify the problem, I also query the .locale.id of some variables directly. First, we query the data, which we expect to live in locale 2, from locale 2:
on Locales[2] {
var wrappers_ref = wrappers[2]!; // This is always 1 GET from 0, okay.
log("array",
wrappers_ref.a.locale.id,
wrappers_ref.a[1].locale.id
);
log("record",
wrappers_ref.r.locale.id,
wrappers_ref.r[1].locale.id,
wrappers_ref.r[1].x1.locale.id,
);
log("class",
wrappers_ref.c.locale.id,
wrappers_ref.c[1]!.locale.id,
wrappers_ref.c[1]!.x1.locale.id
);
}
And the result is:
[locale = 2] [2020-12-26T19:36:26.834472] => (array, 2, 2)
[locale = 2] [2020-12-26T19:36:26.894779] => (record, 2, 2, 2)
[locale = 2] [2020-12-26T19:36:27.023112] => (class, 2, 2, 2)
Which is expected. Yet, if we query the locale of the same data on locale 1, then we get:
[locale = 1] [2020-12-26T19:34:28.509624] => (array, 2, 2)
[locale = 1] [2020-12-26T19:34:28.574125] => (record, 2, 2, 1)
[locale = 1] [2020-12-26T19:34:28.700481] => (class, 2, 2, 2)
Implying that wrappers_ref.r[1].x1.locale.id lives in locale 1, even though it should clearly be on locale 2. My only guess is that by the time .locale.id is executed, the data (i.e. the .x of the record) is already moved to the querying locale (1).
So all in all, the second part of the experiment lead to a secondary question, whilst not answering the first part.
NOTE: all experiment are run with -nl 4 in chapel/chapel-gasnet docker image.
Good observations, let me see if I can shed some light.
As an initial note, any timings taken with the gasnet Docker image should be taken with a grain of salt since that image simulates the execution across multiple nodes using your local system rather than running each locale on its own compute node as intended in Chapel. As a result, it is useful for developing distributed memory programs, but the performance characteristics are likely to be very different than running on an actual cluster or supercomputer. That said, it can still be useful for getting coarse timings (e.g., your "this is taking a much longer time" observation) or for counting communications using chplvis or the CommDiagnostics module.
With respect to your observations about timings, I also observe that the array-of-class case is much slower, and I believe I can explain some of the behaviors:
First, it's important to understand that any cross-node communications can be characterized using a formula like alpha + beta*length. Think of alpha as representing the basic cost of performing the communication, independent of length. This represents the cost of calling down through the software stack to get to the network, putting the data on the wire, receiving it on the other side, and getting it back up through the software stack to the application there. The precise value of alpha will depend on factors like the type of communication, choice of software stack, and physical hardware. Meanwhile, think of beta as representing the per-byte cost of the communication where, as you intuit, longer messages necessarily cost more because there's more data to put on the wire, or potentially to buffer or copy, depending on how the communication is implemented.
In my experience, the value of alpha typically dominates beta for most system configurations. That's not to say that it's free to do longer data transfers, but that the variance in execution time tends to be much smaller for longer vs. shorter transfers than it is for performing a single transfer versus many. As a result, when choosing between performing one transfer of n elements vs. n transfers of 1 element, you'll almost always want the former.
To investigate your timings, I bracketed your timed code portions with calls to the CommDiagnostics module as follows:
resetCommDiagnostics();
startCommDiagnostics();
...code to time here...
stopCommDiagnostics();
printCommDiagnosticsTable();
and found, as you did with chplvis, that the number of communications required to localize the array of records or array of ints was constant as I varied max, for example:
locale
get
execute_on
0
0
0
1
0
0
2
0
0
3
21
1
This is consistent with what I'd expect from the implementation: That for an array of value types, we perform a fixed number of communications to access array meta-data, and then communicate the array elements themselves in a single data transfer to amortize the overheads (avoid paying multiple alpha costs).
In contrast, I found that the number of communications for localizing the array of classes was proportional to the size of the array. For example, for the default value of 50,000 for max, I saw:
locale
get
put
execute_on
0
0
0
0
1
0
0
0
2
0
0
0
3
25040
25000
1
I believe the reason for this distinction relates to the fact that c is an array of owned classes, in which only a single class variable can "own" a given ctuff object at a time. As a result, when copying the elements of array c from one locale to another, you're not just copying raw data, as with the record and integer cases, but also performing an ownership transfer per element. This essentially requires setting the remote value to nil after copying its value to the local class variable. In our current implementation, this seems to be done using a remote get to copy the remote class value to the local one, followed by a remote put to set the remote value to nil, hence, we have a get and put per array element, resulting in O(n) communications rather than O(1) as in the previous cases. With additional effort, we could potentially have the compiler optimize this case, though I believe it will always be more expensive than the others due to the need to perform the ownership transfer.
I tested the hypothesis that owned classes were resulting in the additional overhead by changing your ctuff objects from being owned to unmanaged, which removes any ownership semantics from the implementation. When I do this, I see a constant number of communications, as in the value cases:
locale
get
execute_on
0
0
0
1
0
0
2
0
0
3
21
1
I believe this represents the fact that once the language has no need to manage the ownership of the class variables, it can simply transfer their pointer values in a single transfer again.
Beyond these performance notes, it's important to understand a key semantic difference between classes and records when choosing which to use. A class object is allocated on the heap, and a class variable is essentially a reference or pointer to that object. Thus, when a class variable is copied from one locale to another, only the pointer is copied, and the original object remains where it was (for better or worse). In contrast, a record variable represents the object itself, and can be thought of as being allocated "in place" (e.g., on the stack for a local variable). When a record variable is copied from one locale to the other, it's the object itself (i.e., the record's fields' values) which are copied, resulting in a new copy of the object itself. See this SO question for further details.
Moving on to your second observation, I believe that your interpretation is correct, and that this may be a bug in the implementation (I need to stew on it a bit more to be confident). Specifically, I think you're correct that what's happening is that wrappers_ref.r[1].x1 is being evaluated, with the result being stored in a local variable, and that the .locale.id query is being applied to the local variable storing the result rather than the original field. I tested this theory by taking a ref to the field and then printing locale.id of that ref, as follows:
ref x1loc = wrappers_ref.r[1].x1;
...wrappers_ref.c[1]!.x1.locale.id...
and that seemed to give the right result. I also looked at the generated code which seemed to indicate that our theories were correct. I don't believe that the implementation should behave this way, but need to think about it a bit more before being confident. If you'd like to open a bug against this on Chapel's GitHub issues page, for further discussion there, we'd appreciate that.

Date logic in DAX

I am trying to compute a percentage difference between two values - market index levels separated by a period of time (the period will be determined by user input in a Power BI Slicer tool). I don't understand how I can cross reference values DAX uses by the associated date.
Value % difference from Value =
VAR __BASELINE_VALUE = SUM('Equity Markets (2)'[Value])
VAR __VALUE_TO_COMPARE = SUM('Equity Markets (2)'[Value])
RETURN
IF(
NOT ISBLANK(__VALUE_TO_COMPARE),
DIVIDE(__VALUE_TO_COMPARE - __BASELINE_VALUE, __BASELINE_VALUE)
)
"Value" is a column in a table "Equity Markets (2)" the table also includes a "Date" column.
What is the syntax for selecting a value from Value based on an associated date?
Apologies for asking such a basic question - feels like 30 sec of googling should have done it for me.
The slicer is engaging with the bar graph correctly - I know becouse I'm measuring the levels. I think all the % changes are zero because I'm evaluating x/x -1
percentage change =
VAR
__EarliestValue = CALCULATE(SUM('Equity Markets (2)'[Value]),
FIRSTDATE('Equity Markets (2)'[Date]))
VAR __LastDateValue = CALCULATE(SUM('Equity Markets (2)'[Value]),
LASTDATE('Equity Markets (2)'[Date]))
RETURN
CALCULATE(
DIVIDE(__LastDateValue,__EarliestValue)-1)

How to check via callbacks if alpha is decreasing? + How to load all cores during training?

I'm training doc2vec, and using callbacks trying to see if alpha is decreasing over training time using this code:
class EpochSaver(CallbackAny2Vec):
'''Callback to save model after each epoch.'''
def __init__(self, path_prefix):
self.path_prefix = path_prefix
self.epoch = 0
os.makedirs(self.path_prefix, exist_ok=True)
def on_epoch_end(self, model):
savepath = get_tmpfile(
'{}_epoch{}.model'.format(self.path_prefix, self.epoch)
)
model.save(savepath)
print(
"Model alpha: {}".format(model.alpha),
"Model min_alpha: {}".format(model.min_alpha),
"Epoch saved: {}".format(self.epoch + 1),
"Start next epoch"
)
self.epoch += 1
def train():
workers = multiprocessing.cpu_count()*4
model = Doc2Vec(
DocIter(),
vec_size=600, alpha=0.03, min_alpha=0.00025, epochs=20,
min_count=10, dm=1, hs=1, negative=0, workers=workers,
callbacks=[EpochSaver("./checkpoints")]
)
print(
"HS", model.hs, "Negative", model.negative, "Epochs",
model.epochs, "Workers: ", model.workers, "Model alpha:
{}".format(model.alpha)
)
And while training I see that alpha is not changing over time. On each callback I see alpha = 0.03.
Is it possible to check if alpha is decreasing? Or it really not decreasing at all during training?
One more question:
How can I benefit from all my cores while training doc2vec?
As we can see, each core is not loaded more than +-30%.
The model.alpha property only holds the initially-configured starting-alpha – it's not updated to the effective learning-rate through training.
So, even if the value is being decreased properly (and I expect that it is), you wouldn't see it in the logging you've added.
Separate observations about your code:
in gensim versions at least through 3.5.0, maximum training throughput is most often reached with some value for workers between 3 and the number of cores – but usually not the full number of cores (if it's higher than 12) or larger. So workers=multiprocessing.cpu_count()*4 is likely going to much slower than what you could achieve with a lower number.
if your corpus is large enough to support 600-dimensional vectors, and discarding words with fewer than min_count=10 examples, negative sampling may work faster and get better results than the hs mode. (The pattern in published work seems to be to prefer negative-sampling with larger corpuses.)

How are min and max of cumulative variables assigned?

I created a routing problem and added some dimension to it. A solution assignment is found and I want to know the cumulative value at each index. I noticed that the CumulVar of an assignment does not only have a Value method but also Min and Max methods. Apparently the cumulative variables are implemented in such a way that they can represent intervals. I can see how setting
slack_max>0
fix_start_cumul_to_zero=False
introduces an ambiguity for the cumulative variables as their is a choice in how to start and how much slack to add at each stop. But
Question: How are the Min and Max at each index computed?
You can get the Min and Max range of a given node index from solution.Min(dimension.Cumulvar(index))
Note you'll get Min and Max exactly the same when slack_max=0 unless you know something I don't ;)
Assuming you are using an output solution object solution and a time dimension time_dimension, this will store em as a dict with min-max tuples, you may wish to adapt the output format however you wish:
time_dict = {}
for vehicle_id in range(num_vehicles):
vehicle_time_dict={}
index = routing.Start(vehicle_id)
start_time = solution.Min(time_dimension.CumulVar(index))
vehicle_time_dict[index]=(index_min,index_max)
while not routing.isEnd(index):
previous_index = index
index = solution.Value(routing.NextVar(index))
index_min = solution.Min(time_dimension.CumulVar(index))
index_max = solution.Max(time_dimension.CumulVar(index))
vehicle_time_dict[index]=(index_min,index_max)
time_dict[vehicle_id]=vehicle_time_dict
routing.IsEnd(index) returns True if it's the last index of that vehicle's route (or anywhere after the last index, so if it's 10 nodes long:
routing.IsEnd(8) will return False,
routing.IsEnd(9) will return True,
routing.IsEnd(10) will also return True, etc)