Are the function bodies passed to Future.onComplete(), and their closures, discarded and so garbage collected after they are called?
I ask because I'm writing an unbounded sequence of Future instances. Each Future has an .onComplete { case Failure(t)...} that refers to the previous known-good value from a previous Future. What I want to avoid is the total history of all Future results being kept in the JVM's memory because of references in closure bodies.
Perhaps Scala is more clever than this, but skimming the code related to execution contexts and futures isn't yielding much.
Thanks.
The class that normally implements Future and that you want to look at is DefaultPromise.
It contains mutable state that is being updated as the Future completes.
If you call onComplete and it has already been completed then it just schedules your callback immediately with the result. The callback is not recorded anywhere.
If you call onComplete while the result is not yet available, the callback is added to a list of "listeners".
When the result becomes available (someone calls complete on the promise), all listeners are scheduled to run with that result, and the list of listeners is discarded (the internal state changes to "completed with this result").
This means that your callback chain only builds up while the "upstream future" is still incomplete. Once it completes, everything gets resolved and becomes eligible for garbage collection.
"List of listeners" above is a bit of a simplification. Special care is taken that these listeners do not end up referring to each other, specifically to break reference chains that would prevent garbage collection from working when futures are constructed recursively. Apparently this was indeed a problem in earlier versions. From the comments in the source:
The problem of leaks is solved by automatically breaking these chains of promises, so that promises don't refer to each other in a long chain. This
allows each promise to be individually collected. The idea is to "flatten"
the chain of promises, so that instead of each promise pointing to its
neighbour, they instead point directly the promise at the root of the
chain. This means that only the root promise is referenced, and all the
other promises are available for garbage collection as soon as they're no
longer referenced by user code.
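The listener handling described above can be modelled in a few lines. This is only a sketch under loud assumptions: MiniPromise and MiniPromiseDemo are made-up names, the callbacks run inline instead of being scheduled on an ExecutionContext, and none of the chain-flattening logic of the real DefaultPromise is reproduced. It only shows that the callback list lives in the pending state and is dropped when the state is swapped for the result.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec
import scala.collection.mutable.ListBuffer
import scala.util.{Success, Try}

// Hypothetical toy model, NOT the real scala.concurrent implementation.
final class MiniPromise[T] {
  // Left  = still pending, holds the registered callbacks.
  // Right = completed, holds only the result (the callbacks are gone).
  private val state =
    new AtomicReference[Either[List[Try[T] => Unit], Try[T]]](Left(Nil))

  @tailrec
  def onComplete(cb: Try[T] => Unit): Unit = state.get() match {
    case Right(result) => cb(result) // already completed: run now, store nothing
    case pending @ Left(cbs) =>
      // Retry if another thread changed the state in the meantime.
      if (!state.compareAndSet(pending, Left(cb :: cbs))) onComplete(cb)
  }

  @tailrec
  def tryComplete(result: Try[T]): Boolean = state.get() match {
    case Right(_) => false // somebody else completed it first
    case pending @ Left(cbs) =>
      if (state.compareAndSet(pending, Right(result))) {
        cbs.reverse.foreach(cb => cb(result)) // run in registration order
        true
      } else tryComplete(result)
  }

  def pendingCallbacks: Int = state.get() match {
    case Left(cbs) => cbs.size
    case Right(_)  => 0
  }
}

object MiniPromiseDemo {
  // Returns (callbacks held before completion, held after, what the callbacks saw).
  def demo(): (Int, Int, List[String]) = {
    val p = new MiniPromise[Int]
    val seen = ListBuffer.empty[String]
    p.onComplete(r => seen += s"early: $r")
    val before = p.pendingCallbacks
    p.tryComplete(Success(7))
    val after = p.pendingCallbacks
    p.onComplete(r => seen += s"late: $r") // runs immediately, is never stored
    (before, after, seen.toList)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

In this model a closure is reachable from the promise only between registration and completion, which is the property the question is asking about.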
I can't seem to find anywhere whether complete and tryComplete are atomic operations on Promises in Scala. Promises are only supposed to be written to once, but if two tryCompletes happen concurrently in two different callbacks for example could something go wrong? Or are we assured that tryComplete is atomic?
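First, a quick note: success(...) is equivalent to calling complete(Success(...)), and complete(...) is tryComplete(...) plus a check. complete throws an IllegalStateException when tryComplete returns false, whereas tryComplete itself only reports the outcome as a Boolean.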
In the docs it says
As mentioned before, promises have single-assignment semantics. As such, they can be completed only once. Calling success on a promise that has already been completed (or failed) will throw an IllegalStateException.
A promise can only be completed once. Digging into the source code, DefaultPromise extends AtomicReference (i.e. it is thread safe), so the write that completes it is atomic. This means that if two threads try to complete a promise, only one of them can ever succeed, and it will be whichever gets there first. For the other one, complete (and success or failure) throws an IllegalStateException, while tryComplete simply returns false.
Here's a small example of what happens when you try and complete a promise twice.
https://scastie.scala-lang.org/hTYBqVywSQCl8bFSgQI0Sg
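In case the link goes stale, here is a minimal sketch of the same idea. The object name CompleteTwice and its methods are made up for this example; the Promise calls themselves (tryComplete, trySuccess, complete) are the standard library ones. The first method completes a promise repeatedly from one thread, the second lets several threads race.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Promise
import scala.util.{Failure, Success, Try}

object CompleteTwice {
  // Returns (first tryComplete, second tryComplete,
  //          whether complete threw IllegalStateException, final value).
  def sequential(): (Boolean, Boolean, Boolean, Option[Try[Int]]) = {
    val p = Promise[Int]()
    val first = p.tryComplete(Success(1))
    val second = p.tryComplete(Success(2)) // promise is already completed
    val threw = Try(p.complete(Success(3))) match {
      case Failure(_: IllegalStateException) => true
      case _                                 => false
    }
    (first, second, threw, p.future.value)
  }

  // Starts `threads` threads that all try to complete the same promise
  // and returns how many of them were told they succeeded.
  def race(threads: Int): Int = {
    val p = Promise[Int]()
    val wins = new AtomicInteger(0)
    val workers = (1 to threads).map { i =>
      new Thread(new Runnable {
        def run(): Unit = if (p.trySuccess(i)) { wins.incrementAndGet(); () }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    wins.get()
  }

  def main(args: Array[String]): Unit = {
    println(sequential())
    println(race(8))
  }
}
```

Only the first write sticks: the later tryComplete reports false, complete throws, and in the race exactly one thread wins.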
Though apparently it seems one can circumvent the immutability of a Future by doing a bunch of weird casting acrobatics.
https://contributors.scala-lang.org/t/defaultpromise-violates-encapsulation/3440
One should probably avoid that.
As indexWhere() searches through potentially large Lists to find the matching index, is this an async operation?
And if so is there some way to await it? e.g. similar to Future.forEach
Since the method returns an int rather than a Future, it is synchronous:
https://api.dart.dev/stable/2.7.1/dart-core/List/indexWhere.html
I also want to point out that asynchronous code in Dart is not the same as code running in another thread. Since we are talking about finding an index inside a List, you can be fairly sure that this operation runs on a single thread, because the List must not be modified while the search is in progress.
I'm having a difficult time trying to understand how the control flow in Protractor works in relation to how the JS event loop works. Here is what I know so far:
Protractor control flow stores commands that return promises in a queue. The first command will be at the front of the queue and the last command will be at the back. No command will be executed until the command in front of it has its promise resolved.
JS event loop stores asynchronous task (callbacks to be specific). Callbacks are not executed until all functions in the stack have completed and the stack is empty. Before running each callback, there is a check on whether the stack is empty or not.
So let's take this code as an example. The code clicks a search button, which triggers an API request. Then, after the data is returned, it checks whether the field that stores the returned data exists.
elem('#searchButton').click(); // will execute an API call to retrieve data
browser.wait(ExpectedConditions.presenceOf(elem('#resultDataField')), 3000);
expect(elem('#resultDataField').isPresent()).toBeTruthy();
So with this code, I'm able to get it to work. But I don't know how it does it. How is the event loop applied in this scenario?
The core of the ControlFlow implementation is in runEventLoop_ (in Selenium's promise.js implementation).
As I understand it, the ControlFlow registers a call to runEventLoop_ with the JS event loop (e.g., with a 0-second timeout or somesuch). The call to runEventLoop_ can be thought of as a single iteration of a normal event loop. It registers code to actually run a scheduled task (i.e., actually do the work you queued up during your it). Once that task completes or fails (e.g., by hooking its async promise callbacks) the next iteration of runEventLoop_ is scheduled (see the calls to scheduleEventLoop in runEventLoop_).
There is some complexity when a callback ends up registering new promises: those need to be "inserted" before the old next event, which is accomplished by creating a "nested" control flow. Mostly you should never have to know this.
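The queueing idea (no command starts until the one in front of it has resolved) can be sketched with plain futures. This is an analogy written in Scala, not Selenium's promise.js code: ToyControlFlow, schedule and the demo commands are all made-up names, and nested control flows are not modelled.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical toy queue: each scheduled command is chained onto the
// future of the previously scheduled one, so commands run strictly in order.
final class ToyControlFlow(implicit ec: ExecutionContext) {
  private var tail: Future[Unit] = Future.successful(())

  def schedule(command: () => Future[Unit]): Future[Unit] = synchronized {
    tail = tail.flatMap(_ => command()) // starts only after the previous one resolves
    tail
  }
}

object ToyControlFlowDemo {
  // Returns the order in which the commands actually finished.
  def demo(): List[String] = {
    import ExecutionContext.Implicits.global
    val log = new ConcurrentLinkedQueue[String]()
    val flow = new ToyControlFlow()

    // A fake asynchronous command that takes `millis` to finish.
    def command(name: String, millis: Long): () => Future[Unit] =
      () => Future { Thread.sleep(millis); log.add(name); () }

    flow.schedule(command("click", 60)) // slowest command is queued first
    flow.schedule(command("wait", 30))
    val last = flow.schedule(command("expect", 0))

    Await.result(last, 5.seconds)
    log.toArray(new Array[String](0)).toList
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

Even though the first command is the slowest, the log comes out in the order the commands were scheduled, because scheduling only extends the chain and the actual work is driven by the completion of the previous future.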
While re-reading scala-lang.org's page detailing Future, I stumbled upon the following sentence:
In the event that some of the callbacks never complete (e.g. the callback contains an infinite loop), the other callbacks may not be executed at all. In these cases, a potentially blocking callback must use the blocking construct (see below).
Why may the other callbacks not be executed at all? I may install a number of callbacks for a given Future. The thread that completes the Future may or may not execute the callbacks. But just because one callback is not playing nicely, the rest should not be penalized, I think.
One possibility I can think of is the way the ExecutionContext is configured. If it is configured with a single thread, then this may happen, but that is a specific behaviour and not a generally expected one.
Am I missing something obvious here?
Callbacks are called within an ExecutionContext that ultimately has a limited number of threads: if the limit is not imposed by the specific context implementation, then it is imposed by the underlying operating system and/or the hardware itself.
Let's say your system's limit is OS_LIMIT threads. You create OS_LIMIT + 1 callbacks. From those, OS_LIMIT callbacks immediately get a thread each - and none ever terminate.
How can you guarantee that the remaining 1 callback ever gets a thread?
Sure, there could be some detection mechanisms built into the Scala library, but it's not possible in the general case to make an optimal implementation: maybe you want the callback to run for a month.
Instead (and this seems to be the approach in the Scala library), you could provide facilities for handling situations that you, the developer, know are risky. This removes the element of surprise from the system.
Perhaps most importantly - it enables the developer to "bake in" the necessary information about handler/task characteristics directly into his/her program, rather than relying on some obscure piece of language functionality (which may change from version to version).
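The argument above can be reproduced at small scale by shrinking the thread limit to one. The following is a sketch, with StarvedCallback and its latch names made up for the example: a single-threaded ExecutionContext gets two callbacks on the same Future, and the first one blocks until it is released.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, Future}

object StarvedCallback {
  // Returns (did the second callback run while the first was blocked,
  //          did it run after the first was released).
  def demo(): (Boolean, Boolean) = {
    val pool = Executors.newFixedThreadPool(1) // the "limit" is one thread
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

    val started = new CountDownLatch(1)   // first callback has begun
    val release = new CountDownLatch(1)   // lets the first callback finish
    val secondRan = new CountDownLatch(1) // second callback has run

    val f = Future.successful(42)
    f.onComplete { _ => started.countDown(); release.await() } // hogs the only thread
    f.onComplete { _ => secondRan.countDown() }

    started.await()
    // While the first callback is blocked, the second one cannot get a thread.
    val ranWhileBlocked = secondRan.await(200, TimeUnit.MILLISECONDS)
    release.countDown()
    val ranAfterRelease = secondRan.await(5, TimeUnit.SECONDS)

    pool.shutdown()
    (ranWhileBlocked, ranAfterRelease)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

If the first callback never returned, the second would wait forever. With more threads the same thing happens once every thread is occupied by a callback that never finishes.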
I have a fairly good idea of what the Subject class does and when to use it, but I've just been looking through the language reference on MSDN and see that there are various other ISubject implementations, such as:
AsyncSubject
BehaviorSubject
ReplaySubject
As the documentation is pretty thin on the ground, what's the point of each of these types, and in what situations would you use them?
These subjects all share a common property - they take some (or all) of what gets posted to them via OnNext and record it and play it back to you - i.e. they take a Hot Observable and make it Cold. This means, that if you Subscribe to any of these more than once (i.e. Subscribe => Unsubscribe => Subscribe again), you'll see at least one of the same value again.
ReplaySubject: Every time you subscribe to the Subject, you get the entire history of what has been posted replayed back to you, as fast as possible (or a subset, like the last n items)
AsyncSubject: Always plays back the last item posted and completes, but only after the source has completed. This Subject is awesome for async functions, since you can write them without worrying about race conditions: even if someone Subscribes after the async method completes, they get the result.
BehaviorSubject: Kind of like ReplaySubject but with a buffer of one, so you always get the last thing that was posted. You also can provide an initial value. Always provides one item instantly on Subscribe.
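To make the three replay behaviours above concrete, here is a toy model of them. It is deliberately not the Rx API: the Toy* classes are made-up, single-threaded stand-ins written in Scala, with no error handling, no unsubscription and no schedulers, and they only mimic what a subscriber gets to see.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical toy models for illustration only; NOT the Rx classes.

// Replays the whole history to every new subscriber, then stays live.
final class ToyReplaySubject[A] {
  private val history = ArrayBuffer.empty[A]
  private val observers = ArrayBuffer.empty[A => Unit]
  def onNext(a: A): Unit = { history += a; observers.foreach(o => o(a)) }
  def subscribe(f: A => Unit): Unit = { history.foreach(f); observers += f }
}

// Remembers only the latest value (starting from an initial one).
final class ToyBehaviorSubject[A](initial: A) {
  private var latest = initial
  private val observers = ArrayBuffer.empty[A => Unit]
  def onNext(a: A): Unit = { latest = a; observers.foreach(o => o(a)) }
  def subscribe(f: A => Unit): Unit = { f(latest); observers += f }
}

// Emits nothing until completion, then hands out the last value only.
final class ToyAsyncSubject[A] {
  private var last: Option[A] = None
  private var done = false
  private val observers = ArrayBuffer.empty[A => Unit]
  def onNext(a: A): Unit = if (!done) last = Some(a)
  def onCompleted(): Unit = {
    done = true
    last.foreach(v => observers.foreach(o => o(v)))
    observers.clear()
  }
  def subscribe(f: A => Unit): Unit =
    if (done) last.foreach(f) else observers += f
}

object ToySubjectsDemo {
  // Returns what a subscriber saw for (replay, behavior, async, async late subscriber).
  def demo(): (List[Int], List[Int], List[Int], List[Int]) = {
    val replaySeen, behaviorSeen, asyncSeen, lateSeen = ArrayBuffer.empty[Int]

    val replay = new ToyReplaySubject[Int]
    replay.onNext(1); replay.onNext(2)
    replay.subscribe(v => replaySeen += v) // gets 1 and 2 replayed
    replay.onNext(3)

    val behavior = new ToyBehaviorSubject[Int](0)
    behavior.onNext(1); behavior.onNext(2)
    behavior.subscribe(v => behaviorSeen += v) // gets only the latest, 2
    behavior.onNext(3)

    val async = new ToyAsyncSubject[Int]
    async.onNext(1)
    async.subscribe(v => asyncSeen += v) // nothing yet
    async.onNext(2); async.onNext(3)
    async.onCompleted()                  // now the last value is delivered
    async.subscribe(v => lateSeen += v)  // late subscriber still gets it

    (replaySeen.toList, behaviorSeen.toList, asyncSeen.toList, lateSeen.toList)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```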
In light of the latest version (v1.0.2856.0) and to keep this question up to date, there has been a new set of subject classes:
FastSubject, FastBehaviorSubject, FastAsyncSubject and FastReplaySubject
As per the release notes, they
are much faster than regular subjects
but:
don’t decouple producer and consumer by an IScheduler (effectively limiting them to ImmediateScheduler);
don’t protect against stack overflow;
don’t synchronize input messages.
Fast subjects are used by Publish and Prune operators if no scheduler is specified.
In regards to AsyncSubject
This code:
var s = new AsyncSubject<int>();
s.OnNext(1);
s.Subscribe(Console.WriteLine);
s.OnNext(2);
s.OnNext(3);
s.OnCompleted();
prints a single value, 3. It prints the same if the subscription is moved to after completion. So it plays back the last item rather than the first, it plays it only after completion (until then it produces no values), and it does not behave like Subject before completion.
See this Prune discussion for more info (AsyncSubject is basically the same as Prune)
Paul's answer pretty much nails it. There's a few things worth adding, though:
AsyncSubject works as Paul says, but only after the source completes. In the builds I originally tested, it worked like Subject before that (subscribers received "live" values).
Update: AsyncSubject has changed since I last ran tests against it. It no longer acts as a live subject before completion, but waits for completion before it emits a value. And, as Sergey mentions, it returns the last value, not the first (though I should have caught that, as that has always been the case).
AsyncSubject is used by Prune, FromAsyncPattern, ToAsync and probably a few others
BehaviorSubject is used by overloads of Publish that accept an initial value
ReplaySubject is used by Replay
NOTE: All operator references above refer to the publishing set of operators as they were before being replaced with the generalised publish operators in rev 2838 (Christmas '10), since it has been mentioned that the original operators will be re-added.