Estimating Noise Budget without Secret Key - seal

In the latest version of SEAL, the Simulation class has been removed.
Has it been shifted to some other file?
If it has been completely removed from the library, then how can I estimate noise growth, and choose appropriate parameters for my application?

Simulation class and related tools are indeed removed from the most recent SEAL. There were several reasons for this:
updating them was simply too much work with all the recent API changes;
the parameter selection and noise growth estimation tools worked well only in extremely restricted settings, e.g. very shallow computations;
the design of these tools didn't necessary correspond very well to how we envision parameter selection to happen in the future.
Obviously it's really important to have developer tools in SEAL (e.g. parameter selection, circuit analysis/optimization, etc.) so these will surely be coming back in some form.
Even without explicit tools for parameter selection you can still test different parameters and see how much noise budget you have left after the computation. Repeat this experiment multiple times to account for probabilistic effects and convince yourself that the parameters indeed work.

Related

integrate Modelica variable without influencing state selection

I want to integrate a Modelica variable over time, just for convenience in plotting and post-processing. The variable I want to integrate over time is the power of a compressor so that I get the total energy. The first idea would be to add these lines:
Modelica.Units.SI.Power P_comp;
Modelica.Units.SI.Energy E_comp;
equation
P_comp = der(E_comp);
Is that the recommended way, or are there (better?) alternatives? Is it expected to influence the selection of dynamic states?
Assuming that those two lines are the only ones using E_comp that should work.
Basically E_comp will be part of its own separate state-selection block and changes there shouldn't influence anything else.
However, state selection consists of a number of algorithms and heuristics so it is difficult to formally guarantee that any change does not influence it.
I could imagine some strange possibilities that would break this, but I don't think anyone has implemented them - and I don't see a use-case for them (except to mess up cases like this).
And if you instead of integrating want to differentiate a signal it is a lot messier.

checking for convergence in complex hierarchical models JAGS

I have estimated a complex hierarchical model with many random effects, but don't really know what the best approach is to checking for convergend. I have complex longitudinal data from a few hundred individuals and estimate quite a few parameters for every individual. Because of that, I have way to many traceplots to inspect visually. Or should I really spend a day going through all the traceplots? What would be a better way to check for convergence? Do I have to calculate Gelman and Rubin's Rhat for every parameter on the person level? And when can I conclude that the model converged? When absolutely all of the thousends of parameters reached convergence? Is it even sensible to expect that? Or is there something like "overall convergence"? And what does it mean when some person-level parameters did not converge? Does it make sense to use autorun.jags from the R2jags package with such a model or will it just run for ever? I know, these are a lot of question, but I just don't know how to approach that.
The measure I am using for convergence is a potential scale reduction factor (psrf)* using the gelman.diag function from the R package coda.
But nevertheless, I am also quickly visually inspecting all the traceplots, even though I also have tens/hundreds of them. It can be really fast if you put them in PNG files and then quickly go through them using e.g. IrfanView (let me know if you need me to expand on this).
The reason you should inspect the traceplots is pretty well described by an example from Marc Kery (author of great Bayesian books): see "Never blindly trust Rhat for convergence in a Bayesian analysis", here I include a self explanatory image from this email:
This is related to Rhat statistics while I use psrf, but it's pretty likely that psrf suffers from this too... and better to check the chains.
*) Gelman, A. & Rubin, D. B. Inference from iterative simulation using multiple sequences. Stat. Sci. 7, 457–472 (1992).

How to ensure the output of _best_programs of SymbolicTransformer of gplearn is different?

I am using the SymbolicTransformer of gplearn to generate some automated features. The issue is, when I inspect the expression of the features via looking at _best_programs after fitting, I find that most of the features have the same expression. I am wondering whether there is a way to ensure that we output different features using SymbolicTransformer after fitting?
I don't know if there is a way to explicitly enforce this but you can probably try to enforce more diverse populations each generation in the hopes that this leads to a a collection of more diverse _best_programs. In my opinion a few parameters you could look into are:
p_crossover
p_subtree_mutation
p_hoise_mutation
p_point_mutation
p_point_replace
If you increase the chance of crossover or mutation you will increase your expected diversity but you must not overdue it. There is a balance between a diverse population and an accurate one. The higher the crossover or mutation the more likely that you will take a strong individual candidate and change it into something meaningless.

How Finagle aperture algorithm chooses "non overlapping" subsets?

I have been reading about Finagle and trying to understand the code to figure out how Aperture's subset choice works.
I have seen that ApertureLeastLoaded has a "useDeterministicOrdering" and an "EndpointFactory" which I guess should be the key points to make the decision of which clients to take in the subset.
While reading the "deterministic subsetting" section of Google SRE's book, I understood that the best way to pick a subset of servers from the client point of view, is to know the total number of clients, and a unique sequential identifier of the current client, that can be used as seed of the subset generator.
In Finagle I can't understand how this process is done (I'm not super familiar with Scala) and the documentation both on the website and in the code, explain just how the aperture paradigm works, but not very clear how the initial subset is chosen
I hope somebody can enlighten me
One of the unique properties of Aperture is that its window is sized dynamically based on a clients offered load. That is, clients have a built in controller which can expand or shrink their window at runtime. This property is important as it allows clients to operate more efficiently and better adapt to a changing environment, but it does make it more complex to achieve a uniform load distribution across servers.
To contrast, the subsetting algorithm, as proposed by the Google SRE book, suggests that operators choose a static subset size which allows a uniform load distribution to be calculated analytically but introduces another static configuration that needs to be revisited as a system evolves.
Deterministic Aperture is, to the best of our knowledge, a novel algorithm for achieving a uniform load distribution while maintaining the dynamic properties of the window sizing mentioned above. From a high level, clients construct a topology of their peer cluster (which gives them a sense of ordering and proximity) and then derive a unique per-client permutation of the servers from the topology such that each server is uniformly represented across the permutations.
We are still in the early stages of testing this in production at Twitter, but early results look very promising. After we gather more empirical results, we hope to publish some more detailed content on how the algorithm works and its properties.

How to remove nodes from TensorFlow graph?

I need to write a program where part of the TensorFlow nodes need to keep being there storing some global information(mainly variables and summaries) while the other part need to be changed/reorganized as program runs.
The way I do now is to reconstruct the whole graph in every iteration. But then, I have to store and load those information manually from/to checkpoint files or numpy arrays in every iteration, which makes my code really messy and error prone.
I wonder if there is a way to remove/modify part of my computation graph instead of reset the whole graph?
Changing the structure of TensorFlow graphs isn't really possible. Specifically, there isn't a clean way to remove nodes from a graph, so removing a subgraph and adding another isn't practical. (I've tried this, and it involves surgery on the internals. Ultimately, it's way more effort than it's worth, and you're asking for maintenance headaches.)
There are some workarounds.
Your reconstruction is one of them. You seem to have a pretty good handle on this method, so I won't harp on it, but for the benefit of anyone else who stumbles upon this, a very similar method is a filtered deep copy of the graph. That is, you iterate over the elements and add them in, predicated on some condition. This is most viable if the graph was given to you (i.e., you don't have the functions that built it in the first place) or if the changes are fairly minor. You still pay the price of rebuilding the graph, but sometimes loading and storing can be transparent. Given your scenario, though, this probably isn't a good match.
Another option is to recast the problem as a superset of all possible graphs you're trying to evaluate and rely on dataflow behavior. In other words, build a graph which includes every type of input you're feeding it and only ask for the outputs you need. Good signs this might work are: your network is parametric (perhaps you're just increasing/decreasing widths or layers), the changes are minor (maybe including/excluding inputs), and your operations can handle variable inputs (reductions across a dimension, for instance). In your case, if you have only a small, finite number of tree structures, this could work well. You'll probably just need to add some aggregation or renormalization for your global information.
A third option is to treat the networks as physically split. So instead of thinking of one network with mutable components, treat the boundaries between fixed and changing pieces are inputs and outputs of two separate networks. This does make some things harder: for instance, backprop across both is now ugly (which it sounds like might be a problem for you). But if you can avoid that, then two networks can work pretty well. It ends up feeling a lot like dealing with a separate pretraining phase, which you many already be comfortable with.
Most of these workarounds have a fairly narrow range of problems that they work for, so they might not help in your case. That said, you don't have to go all-or-nothing. If partially splitting the network or creating a supergraph for just some changes works, then it might be that you only have to worry about save/restore for a few cases, which may ease your troubles.
Hope this helps!