Preferred way to set SO timeout - httpclient

Are the following two do the same thing?
cloudSolrClient.getLbClient().getHttpClient().getParams()
.setParameter(CoreConnectionPNames.SO_TIMEOUT, 50000);
and
cloudSolrClient.getLbClient().setSoTimeout(50000);
If they are, then which one is preferable to use (except the fact that the first one is deprecated)?

Related

State Machine - What is the best way to define boolean for different states?

I have got the following question which is pondering me for some time. I'm new to state machine modeling so would really appreciate your help, ideas, and suggestions.
Let's say I have a "Valve" which can be in state "opened" or "closed". Now when I model the state machine.
Should I define two booleans for each state?
bool opened;
bool closed;
Hence, I should use both booleans for each state?
Example: State "opened" will have the booleans--> opened = 1 and closed = 0;
OR
Simply can I define only one boolean variable?
bool opened;
Example: State "opened" will have only one boolean-->opened = 1 and in the state "closed" it will have boolean opened = 0;
What is the best practice here? Any advantages of using two booleans over one boolean? I can imagine in this case too many variables have to been defined and reset every time state transitions into another state.
Thank you in advance
I think you could consider not using boolean at all to represent object state. Most objects will have more than two states, and if you use boolean flags you'll end up with qute a lot of them. This can make it quite hard to test and verify that the code works as expected all the time. I worked with someone who had 22 boolean flags in one class. This meant that the class has over 4 million possible states.
I usually use an enum to represent class state. The Valve could be Open or Closed, but what if it became defective, and impossible to operated? I can easily add more states to the enum increasing the number of states by 1, but if I use an boolean I will exponentially increase the number of possible states when adding more flags.
I'd also recommend using a state machine library, instead of manually handling state in your own code. There are many state machine libraries available, I use (and contribute to) the stateless state machine library on Github.
If the two states are mutually exclusive, there's no point in having a redundant state variable, which you'd need to maintain. A single boolean provides you with two possible states already.
Depending on the language, you might be able to introduce an alias such that you'd be able to refer to the same state under different names, purely for aesthetic purposes, and the compiler would remove the redundancy. But again, it may be more of a nuisance than something truly useful.
You want to have additional state variables when you need to deal with independent states or when you want to describe substates.

minisat randomize variable selection is not working on gcloud

I want to get different solution every time I run minisat on the same problem. I can do that with using "rnd-seed" parameter of minisat. It simply randomize the variable selection so each time I can get different solution. Even though this parameter works perfectly on my machine (Ubuntu16) it does not work on gcloud (Google Cloud) running on Ubuntu machine.
I think I am missing a small part but I can not figure out what that is.
Note: I do not want to feed negotation of a solution to minisat to get different solution. I actually need to randomize the variable selection.
Edit: Let me explain why I need randomized solution. I solve lots of SAT problems and usually these SAT problems look like each other a lot. So, if I can not randomized variable selection, I get most of the time very similar solutions which I do not want. Therefore, I actually do not run minisat on the same problem.
Edit-2: #sascha wanted me to explain what I mean by "works" and "not works". When I run a cnf file on my PC, each time I get different solution. However, when I run the same cnf file on gcloud machine, I get always same solution.
The option -rnd-seed doesn't randomize branch variable selection. Rather, it allows you to set a seed for the pseudo-random number generator that Minisat uses.
Variable selection for branching does not involve randomness unless the -rnd-freq option is used. Pass in a floating point value between 0 and 1. 0 means no randomness, 1 means try to use a random variable every branch. The code only makes one attempt to choose a variable randomly, presumably because searching for an unset variable in an arbitrarily large priority queue can get pretty expensive. If that one attempt fails, Minisat branches using the normal priority queue.

Which is better in PHP: suppress warnings with '#' or run extra checks with isset()?

For example, if I implement some simple object caching, which method is faster?
1. return isset($cache[$cls]) ? $cache[$cls] : $cache[$cls] = new $cls;
2. return #$cache[$cls] ?: $cache[$cls] = new $cls;
I read somewhere # takes significant time to execute (and I wonder why), especially when warnings/notices are actually being issued and suppressed. isset() on the other hand means an extra hash lookup. So which is better and why?
I do want to keep E_NOTICE on globally, both on dev and production servers.
I wouldn't worry about which method is FASTER. That is a micro-optimization. I would worry more about which is more readable code and better coding practice.
I would certainly prefer your first option over the second, as your intent is much clearer. Also, best to keep away edge condition problems by always explicitly testing variables to make sure you are getting what you are expecting to get. For example, what if the class stored in $cache[$cls] is not of type $cls?
Personally, if I typically would not expect the index on $cache to be unset, then I would also put error handling in there rather than using ternary operations. If I could reasonably expect that that index would be unset on a regular basis, then I would make class $cls behave as a singleton and have your code be something like
return $cls::get_instance();
The isset() approach is better. It is code that explicitly states the index may be undefined. Suppressing the error is sloppy coding.
According to this article 10 Performance Tips to Speed Up PHP, warnings take additional execution time and also claims the # operator is "expensive."
Cleaning up warnings and errors beforehand can also keep you from
using # error suppression, which is expensive.
Additionally, the # will not suppress the errors with respect to custom error handlers:
http://www.php.net/manual/en/language.operators.errorcontrol.php
If you have set a custom error handler function with
set_error_handler() then it will still get called, but this custom
error handler can (and should) call error_reporting() which will
return 0 when the call that triggered the error was preceded by an #.
If the track_errors feature is enabled, any error message generated by
the expression will be saved in the variable $php_errormsg. This
variable will be overwritten on each error, so check early if you want
to use it.
# temporarily changes the error_reporting state, that's why it is said to take time.
If you expect a certain value, the first thing to do to validate it, is to check that it is defined. If you have notices, it's probably because you're missing something. Using isset() is, in my opinion, a good practice.
I ran timing tests for both cases, using hash keys of various lengths, also using various hit/miss ratios for the hash table, plus with and without E_NOTICE.
The results were: with error_reporting(E_ALL) the isset() variant was faster than the # by some 20-30%. Platform used: command line PHP 5.4.7 on OS X 10.8.
However, with error_reporting(E_ALL & ~E_NOTICE) the difference was within 1-2% for short hash keys, and up 10% for longer ones (16 chars).
Note that the first variant executes 2 hash table lookups, whereas the variant with # does only one lookup.
Thus, # is inferior in all scenarios and I wonder if there are any plans to optimize it.
I think you have your priorities a little mixed up here.
First of all, if you want to get a real world test of which is faster - load test them. As stated though suppressing will probably be slower.
The problem here is if you have performance issues with regular code, you should be upgrading your hardware, or optimize the grand logic of your code rather than preventing proper execution and error checking.
Suppressing errors to steal the tiniest fraction of a speed gain won't do you any favours in the long run. Especially if you think that this error may keep happening time and time again, and cause your app to run more slowly than if the error was caught and fixed.

Is Either the equivalent to checked exceptions?

Beginning in Scala and reading about Either I naturally comparing new concepts to something I know (in this case from Java). Are there any differences from the concept of checked exceptions and Either?
In both cases
the possibility of failure is explicitly annotated in the method (throws or returning Either)
the programmer can handle the error case directly when it occurs or move it up (returning again an Either)
there is a way to inform the caller about the reason of the error
I suppose one uses for-comprehensions on Either to write code as there would be no error similar to checked exceptions.
I wonder if I am the only beginner who has problems to see the difference.
Thanks
Either can be used for more than just exceptions. For example, if you were to have a user either type input for you or specify a file containing that input, you could represent that as Either[String, File].
Either is very often used for exception handling. The main difference between Either and checked exceptions is that control flow with Either is always explicit. The compiler really won't let you forget that you are dealing with an Either; it won't collect Eithers from multiple places without you being aware of it, everything that is returned must be an Either, etc.. Because of this, you use Either not when maybe something extraordinary will go wrong, but as a normal part of controlling program execution. Also, Either does not capture a stack trace, making it much more efficient than a typical exception.
One other difference is that exceptions can be used for control flow. Need to jump out of three nested loops? No problem--throw an exception (without a stack trace) and catch it on the outside. Need to jump out of five nested method calls? No problem! Either doesn't supply anything like this.
That said, as you've pointed out there are a number of similarities. You can pass back information (though Either makes that trivial, while checked exceptions make you write your own class to store any extra information you want); you can pass the Either on or you can fold it into something else, etc..
So, in summary: although you can accomplish the same things with Either and checked exceptions with regards to explicit error handling, they are relatively different in practice. In particular, Either makes creating and passing back different states really easy, while checked exceptions are good at bypassing all your normal control flow to get back, hopefully, to somewhere that an extraordinary condition can be sensibly dealt with.
Either is equivalent to a checked exception in terms of the return signature forming an exclusive disjunction. The result can be a thrown exception X or an A. However, throwing an exception isn't equivalent to returning one – the first is not referentially transparent.
Where Scala's Either is not (as of 2.9) equivalent is that a return type is positively biased, and requires effort to extract/deconstruct the Exception, Either is unbiased; you need to explicitly ask for the left or right value. This is a topic of some discussion, and in practice a bit of pain – consider the following three calls to Either producing methods
for {
a <- eitherA("input").right
b <- eitherB(a).right
c <- eitherC(b).right
} yield c // Either[Exception, C]
you need to manually thread through the RHS. This may not seem that onerous, but in practice is a pain and somewhat surprising to new-comers.
Yes, Either is a way to embed exceptions in a language; where a set of operations that can fail can throw an error value to some non-local site.
In addition to the practical issues Rex mentioned, there's some extra things you get from the simple semantics of an Either:
Either forms a monad; so you can use monadic operations over sets of expressions that evaluate to Either. E.g. for short circuiting evaluation without having to test the result
Either is in the type -- so the type checker alone is sufficient to track incorrect handling of the value
Once you have the ability to return either an error message (Left s) or a successful value Right v, you can layer exceptions on top, as just Either plus an error handler, as is done for MonadError in Haskell.

Implementing snapshot in FRP

I'm implementing an FRP framework in Scala and I seem to have run into a problem. Motivated by some thinking, this question I decided to restrict the public interface of my framework so Behaviours could only be evaluated in the 'present' i.e.:
behaviour.at(now)
This also falls in line with Conal's assumption in the Fran paper that Behaviours are only ever evaluated/sampled at increasing times. It does restrict transformations on Behaviours but otherwise we find ourselves in huge problems with Behaviours that represent some input:
val slider = Stepper(0, sliderChangeEvent)
With this Behaviour, evaluating future values would be incorrect and evaluating past values would require an unbounded amount of memory (all occurrences used in the 'slider' event would have to be stored).
I am having trouble with the specification for the 'snapshot' operation on Behaviours given this restriction. My problem is best explained with an example (using the slider mentioned above):
val event = mouseB // an event that occurs when the mouse is pressed
val sampler = slider.snapshot(event)
val stepper = Stepper(0, sampler)
My problem here is that if the 'mouseB' Event has occurred when this code is executed then the current value of 'stepper' will be the last 'sample' of 'slider' (the value at the time the last occurrence occurred). If the time of the last occurrence is in the past then we will consequently end up evaluating 'slider' using a past time which breaks the rule set above (and your original assumption). I can see a couple of ways to solve this:
We 'record' the past (keep hold of all past occurrences in an Event) allowing evaluation of Behaviours with past times (using an unbounded amount of memory)
We modify 'snapshot' to take a time argument ("sample after this time") and enforce that that time >= now
In a more wacky move, we could restrict creation of FRP objects to the initial setup of a program somehow and only start processing events/input after this setup is complete
I could also simply not implement 'sample' or remove 'stepper'/'switcher' (but I don't really want to do either of these things). Has anyone any thoughts on this? Have I misunderstood anything here?
Oh I see what you mean now.
Your "you can only sample at 'now'" restriction isn't tight enough, I think. It needs to be a bit stronger to avoid looking into the past. Since you are using an environmental conception of now, I would define the behavior construction functions in terms of it (so long as now cannot advance by the mere execution of definitions, which, per my last answer, would get messy). For example:
Stepper(i,e) is a behavior with the value i in the interval [now,e1] (where e1 is the
time of first occurrence of e after now), and the value of the most recent occurrence of e afterward.
With this semantics, your prediction about the value of stepper that got you into this conundrum is dismantled, and the stepper will now have the value 0. I don't know whether this semantics is desirable to you, but it seems natural enough to me.
From what I can tell, you are worried about a race condition: what happens if an event occurs while the code is executing.
Purely functional code does not like to have to know that it gets executed. Functional techniques are at their finest in the pure setting, such that it does not matter in what order code is executed. A way out of this dilemma is to pretend that every change happened in one sensitive (internal, probably) piece of imperative code; pretend that any functional declarations in the FRP framework happen in 0 time so it is impossible for something to change during their declaration.
Nobody should ever sleep, or really do anything time sensitive, in a section of code that is declaring behaviors and things. Essentially, code that works with FRP objects ought to be pure, then you don't have any problems.
This does not necessarily preclude running it on multiple threads, but to support that you might need to reorganize your internal representations. Welcome to the world of FRP library implementation -- I suspect your internal representation will fluctuate many times during this process. :-)
I'm confused about your confusion. The way I see is that Stepper will "set" the behavior to a new value whenever the event occurs. So, what happens is the following:
The instant in which the event mouseB occurs, the value of the slider behavior will be read (snapshot). This value will be "set" into the behavior stepper.
So, it is true that the Stepper will "remember" values from the past; the point is that it only remembers the latest value from the past, not everything.
Semantically, it is best to model Stepper as a function like luqui proposes.