Asyncio queue stores items by reference: entries get duplicated when added in a loop

I have implemented an async paginated API requester with a timeout, a queue, and multiple tokens (workers).
I have 3 workers (requesters) that consume elements from the queue. If the result of a request tells me that it is page 1 and there is more than one page in total, my code puts the already-used arguments back on the queue with a different page parameter (from 2 to the total number of pages):
if result['page'] < result['total_pages'] and result['page'] == 1:
    for page in range(2, result['total_pages']+1):
        request_kwargs['params']['page'] = page
        queue.put_nowait(request_kwargs)
Everything works until new arguments are added to the queue.
After the first iteration of the for loop, queue._queue (in the debugger) looks like:
deque([{'method': 'get', 'url': 'some_url', 'params': {'count': 100, 'page': 2}}])
But on the second iteration something goes wrong: adding the new arguments overwrites the previous entry, so the queue holds the same arguments twice:
deque([{'method': 'get', 'url': 'some_url', 'params': {'count': 100, 'page': 3}},{'method': 'get', 'url': 'some_url', 'params': {'count': 100, 'page': 3}}])
...
After the 6th iteration I have 6 identical elements in the queue, all with {'page': 6}.
Where am I going wrong? Why does putting a new request_kwargs overwrite the existing ones?
Adding new elements works fine when I add just the page number instead of the dict that contains it.
This works fine:
if result['page'] < result['total_pages'] and result['page'] == 1:
    for page in range(2, result['total_pages']+1):
        queue.put_nowait(page)
I think that adding a new element to the queue stores a reference to that element, not a copy. I have tried assigning it to an additional variable before putting it on the queue and deleting that variable afterwards. Nothing changed.

From the context you are providing, you are actually putting the same object on the queue over and over again:
queue.put_nowait(request_kwargs)
You are not creating a separate request_kwargs object on each pass through the loop. It is the same mutable object every time the for loop iterates.
Moreover, you explicitly mutate that object on each pass:
request_kwargs['params']['page'] = page
This way, you are:
changing all previous queue items by updating the page, because each queue item is actually the same object
pushing the same queue item again
I think the following change should resolve your issue:
import copy
# ...
if result['page'] < result['total_pages'] and result['page'] == 1:
    for page in range(2, result['total_pages']+1):
        request_kwargs = copy.deepcopy(request_kwargs)
        request_kwargs['params']['page'] = page
        queue.put_nowait(request_kwargs)
That said, needing deepcopy always makes me question whether the code is properly designed, and here it may not be.
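To see the aliasing in isolation, here is a minimal sketch that puts the buggy loop next to one possible fix. The helper names (enqueue_shared, enqueue_copies, drain_pages, demo) are made up for this illustration, and the fix rebuilds the two dict levels by hand instead of calling copy.deepcopy, which is a swapped-in alternative rather than the change suggested above.

```python
import asyncio
import copy


def enqueue_shared(queue, request_kwargs, total_pages):
    """Buggy variant: mutates and re-enqueues the same dict every time."""
    for page in range(2, total_pages + 1):
        request_kwargs["params"]["page"] = page
        queue.put_nowait(request_kwargs)


def enqueue_copies(queue, request_kwargs, total_pages):
    """Fixed variant: builds a new outer dict and a new 'params' dict per page."""
    for page in range(2, total_pages + 1):
        page_kwargs = {
            **request_kwargs,
            "params": {**request_kwargs["params"], "page": page},
        }
        queue.put_nowait(page_kwargs)


def drain_pages(queue):
    """Return the 'page' value of every queued item, in queue order."""
    pages = []
    while not queue.empty():
        pages.append(queue.get_nowait()["params"]["page"])
    return pages


async def demo():
    # Hypothetical request arguments, shaped like the ones in the question.
    template = {"method": "get", "url": "some_url", "params": {"count": 100, "page": 1}}

    shared = asyncio.Queue()
    # Work on a copy so the buggy variant does not disturb the template.
    enqueue_shared(shared, copy.deepcopy(template), total_pages=4)

    copied = asyncio.Queue()
    enqueue_copies(copied, template, total_pages=4)

    return drain_pages(shared), drain_pages(copied)
```

Running asyncio.run(demo()) returns ([4, 4, 4], [2, 3, 4]): the first queue holds three references to one dict, the second holds three independent dicts. The hand-built copy only works because the nesting is two levels deep; for deeper structures copy.deepcopy is the simpler choice.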

Related

How to abort requests with Axios AbortController?

I'm trying to abort any/all previous Axios requests, using AbortController():
https://axios-http.com/docs/cancellation
FAILS: In my testing, previous queries do not get aborted.
The search experience still works as expected, but every request is still fully processed when the user rapidly changes filters. Instead, I want all previous requests to simply be aborted.
I want to avoid building logic that stores or tracks requests, tokens and/or promises. I'm familiar with this stuff and could build it, but I want to avoid all that.
Is AbortController support in Axios intended for this purpose?
UPDATE (WORKS): Thanks to @Oluwafemi, my setup is working.
Two things had to be changed:
Set a new instance of AbortController() directly after the abort.
The signal needs to go in the third parameter (the config object) of the Axios function, not in the payload (unlike what you see in some material online).
Side note: In addition, not included here is a debouncer wrapping my query function (in my app), which alongside this AbortController, makes for a good multi-layer management of outgoing/incoming comms with the API server.
(I redacted a bunch of methods/lines that aren't relevant)
export default class MySearch {
    constructor() {
        // ONE-TIME SETUP
        this.payload = null
        this.active = {
            q: "", // (Query) string e.g. "apples"
            facets: {}, // Objects, each with array of options e.g. { 'size': [ '2 x 2 in', '3 x 3 in' ]}, { 'artists': [ 'mike', 'john', 'jane' ] }
            page: null, // number e.g. 3
            sortBy: null // string, one of: "default" | "newest" | "price_asc" | "price_desc"
        }
        // Declaring this here. Good/bad?
        this.AxiosSearchController = new AbortController()
    }
    async query() {
        return new Promise( async (resolve, reject) => {
            // Abort any previous Axios request
            this.AxiosSearchController.abort()
            // Reinstantiate another instance of AbortController()
            this.AxiosSearchController = new AbortController()
            this.transformURL()
            let requestParams = {
                "page": this.active.page,
                "sortBy": this.active.sortBy,
                "filter": this.active.facets,
            }
            // Here we tell Axios to associate the request with the controller.
            let AxiosSignal = {
                signal: this.AxiosSearchController.signal
            }
            axios.post('/api/search/' + this.active.q, requestParams, AxiosSignal)
                .then( response => {
                    this.payload = response.data
                    return resolve(response)
                })
                .catch( error => {
                    console.error(error)
                    return reject(error)
                })
        })
    }
}
Where AxiosSearchController is initialized depends on whether you want multiple instances of MySearch to share the same search state or to each maintain their own.
When it is initialized in the constructor, as in your snippet, each instance of MySearch has its own search state.
1. Instance 1 initialized
2. Instance 2 initialized
3. Instance 3 initialized
4. Instance 1 performs request
5. Instance 2 performs request
6. Instance 3 performs request
7. Instance 1 aborts request
8. Instance 2 continues request till fulfillment
9. Instance 3 continues request till fulfillment
When it is initialized outside of the constructor, all instances of MySearch share the same search state.
1. Instance 1 initialized
2. Instance 2 initialized
3. Instance 3 initialized
4. Instance 1 performs request
5. Instance 2 performs request
6. Instance 1 has request aborted
7. Instance 3 performs request
8. Instance 2 has request aborted
Providing the signal property in the config argument (the third argument to axios.post) is the proper way to set the signal for a request in the axios library.
However, when a previous request is aborted, AxiosSearchController.signal.aborted is set to true.
An AbortController cannot be reset, so any further request that uses the same signal is aborted immediately.
You therefore need to create a new AbortController after aborting the request for the previous search.
this.AxiosSearchController.abort();
this.AxiosSearchController = new AbortController();

Replay subject subscription behaviour

The following code works as expected but I am puzzled by the way it behaves when I uncomment the line 'o.OnCompleted();'
The code joins all subscribers to the result of a single long operation and caches the result for further subscribers for 2 seconds. Any subscription after this time starts the process again.
Subscriptions will come from other threads (simulated with the thread pool).
var obs = Observable.Create((IObserver<Guid> o) =>
    {
        Console.WriteLine("Start");
        Thread.Sleep(1000); // process
        Console.WriteLine("End");
        o.OnNext(Guid.NewGuid());
        //o.OnCompleted(); // <-- uncomment this
        return Disposable.Empty;
    })
    .Replay(TimeSpan.FromSeconds(2))
    .RefCount()
    .Take(1);
ThreadPool.QueueUserWorkItem(delegate
{
    // simulate request from threadpool
    obs.Subscribe(x => Console.WriteLine($"1: {x}"), () => Console.WriteLine($"1: complete"));
});
ThreadPool.QueueUserWorkItem(delegate
{
    obs.Subscribe(x => Console.WriteLine($"2: {x}"), () => Console.WriteLine($"2: complete"));
});
Thread.Sleep(4000);
ThreadPool.QueueUserWorkItem(delegate
{
    obs.Subscribe(x => Console.WriteLine($"3: {x}"), () => Console.WriteLine($"3: complete"));
});
Here is the result:
Start
End
1: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
2: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
1: complete
2: complete
Start
End
3: 1214DC63-F688-475A-9CB7-C3784054A4AC
3: complete
The odd behaviour is if I uncomment the line 'o.OnCompleted()' the result changes to this:
Start
End
1: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
2: 255BEFDC-2F14-40AD-AE77-2B005C5A3AA9
1: complete
2: complete
Start
End
3: complete
The 3rd subscriber causes another subscription to the root observable but the result is missing. It appears the ReplaySubject caches the result of the previous observable having completed but still causes a new subscription. This seems unintuitive. I would like to understand why it doesn't work.
Note: I originally tried this using Defer instead of Create, which had the same result as the second run above (for obvious reasons).
When you use the Replay/RefCount pair you create an observable that shares a common subscription to the source observable.
From the source:
Returns a connectable observable sequence that shares a single subscription to the underlying sequence replaying all notifications.
Now, it's important to remember that an observable produces a series of zero or more values, followed by either a complete or error signal. It cannot produce values after a complete or error is produced.
Since you are sharing a common subscription to the source, once the source produces a completion it cannot produce any more values. Calling o.OnCompleted() does exactly that.
Also, as a side note, you should avoid writing return Disposable.Empty; inside a Create. It means you're creating an observable that can complete before the subscription has returned, and that can lead to race conditions.
The way to write your code without it is:
var obs =
    Observable
        .Defer(() => Observable.Return(Guid.NewGuid()).Concat(Observable.Never<Guid>()))
        .Replay(TimeSpan.FromSeconds(2.0))
        .RefCount()
        .Take(1);
But this is the same as not calling o.OnCompleted().

RxJS combineLatest: how to get emit after just one value changes?

I'm trying to learn the RxJS library. One of the cases I don't quite understand is described in this jsfiddle (code also below).
var A= new Rx.Subject();
var B= new Rx.Subject();
A.onNext(0);
// '.combineLatest' needs all the dependency Observables to get emitted, before its combined signal is emitted.
//
// How to have a combined signal emitted when any of the dependencies change (using earlier given values for the rest)?
//
A.combineLatest( B, function (a,b) { return a+b; } )
.subscribe( function (v) { console.log( "AB: "+ v ); } );
B.onNext("a");
A.onNext(1);
I'd like to get two emits to the "AB" logging. One from changing B to "a" (A already has the value 0). Another from changing A to 1.
However, only changes that occur after a subscribe seem to matter (even though A has a value and thus the combined result could be computed).
Should I use "hot observables" for this, or some other method than .combineLatest?
My problem in the actual code (bigger than this sample) is that I need to make separate initialisations after the subscribes, which cuts stuff in two separate places instead of having the initial values clearly up front.
Thanks
I think you have misunderstood how Subjects work. Subjects are hot Observables. They do not hold on to values, so if they receive an onNext while they have no subscribers, then that value is lost to the world.
What you are looking for is either the BehaviorSubject or the ReplaySubject, both of which hold on to past values and re-emit them to new subscribers. In the former case you always construct it with an initial value:
//All subscribers will receive 0
var subject = new Rx.BehaviorSubject(0);
//All subscribers will receive 1
//Including all future subscribers
subject.onNext(1);
In the latter you set the number of values to be replayed for each subscription:
var subject = new Rx.ReplaySubject(1);
//All new subscribers will receive 0 until the subject receives its
//next onNext call
subject.onNext(0);
Rewriting your example it could be:
var A= new Rx.BehaviorSubject(0);
var B= new Rx.Subject();
// '.combineLatest' needs all the dependency Observables to get emitted, before its combined signal is emitted.
//
// How to have a combined signal emitted when any of the dependencies change (using earlier given values for the rest)?
//
A.combineLatest( B, function (a,b) { return a+b; } )
.subscribe( function (v) { console.log( "AB: "+ v ); } );
B.onNext("a");
A.onNext(1);
//AB: 0a
//AB: 1a
On another note, realizing of course that this is all new to you, in most cases you should not need to use a Subject directly as it generally means that you are trying to wrangle Rx into the safety of your known paradigms. You should ask yourself, where is your data coming from? How is it being created? If you ask those questions enough, following your chain of events back up to the source, 9 out of 10 times you will find that there is probably an Observable wrapper for it.
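To make the difference concrete without any library, here is a toy model in Python. It is not RxJS: Subject, BehaviorSubject and combine_latest below are simplified stand-ins written for this illustration, with only the behaviour described above (a plain subject forgets values nobody heard, a behavior subject replays its latest value on subscribe).

```python
class Subject:
    """Toy hot subject: forwards values only to current subscribers."""

    def __init__(self):
        self._observers = []

    def subscribe(self, on_next):
        self._observers.append(on_next)

    def on_next(self, value):
        for observer in list(self._observers):
            observer(value)


class BehaviorSubject(Subject):
    """Toy subject that remembers the latest value and replays it on subscribe."""

    def __init__(self, initial):
        super().__init__()
        self._latest = initial

    def subscribe(self, on_next):
        super().subscribe(on_next)
        on_next(self._latest)

    def on_next(self, value):
        self._latest = value
        super().on_next(value)


_MISSING = object()


def combine_latest(a, b, combine, emit):
    """Call emit(combine(x, y)) on every change, once both sources have a value."""
    latest = {"a": _MISSING, "b": _MISSING}

    def update(key, value):
        latest[key] = value
        if all(v is not _MISSING for v in latest.values()):
            emit(combine(latest["a"], latest["b"]))

    a.subscribe(lambda value: update("a", value))
    b.subscribe(lambda value: update("b", value))


def run_example(a):
    """Replay the question's sequence: subscribe, then B gets "a", then A gets 1."""
    emitted = []
    b = Subject()
    combine_latest(a, b, lambda x, y: f"{x}{y}", emitted.append)
    b.on_next("a")
    a.on_next(1)
    return emitted


plain = Subject()
plain.on_next(0)  # nobody is subscribed yet, so this value is lost
plain_result = run_example(plain)
behavior_result = run_example(BehaviorSubject(0))
```

Here plain_result is ['1a'] (the early 0 was dropped, so nothing is emitted when B changes), while behavior_result is ['0a', '1a'], matching the two "AB" lines in the rewritten example.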

RXJS : Idiomatic way to create an observable stream from a paged interface

I have a paged interface. Given a starting point, a request will produce a list of results and a continuation indicator.
I've created an observable that is built by constructing and flat mapping an observable that reads the page. The result of this observable contains both the data for the page and a value to continue with. I pluck the data and flat map it to the subscriber. Producing a stream of values.
To handle the paging I've created a subject for the next page values. It's seeded with an initial value then each time I receive a response with a valid next page I push to the pages subject and trigger another read until such time as there is no more to read.
Is there a more idiomatic way of doing this?
function records(start = 'LATEST', limit = 1000) {
    let pages = new rx.Subject();
    this.connect(start)
        .subscribe(page => pages.onNext(page));
    let records = pages
        .flatMap(page => {
            return this.read(page, limit)
                .doOnNext(result => {
                    let next = result.next;
                    if (next === undefined) {
                        pages.onCompleted();
                    } else {
                        pages.onNext(next);
                    }
                });
        })
        .pluck('data')
        .flatMap(data => data);
    return records;
}
That's a reasonable way to do it. It has a couple of potential flaws (that may or may not impact you depending on your use case):
You provide no way to observe any errors that occur in this.connect(start).
Your observable is effectively hot. If the caller does not immediately subscribe to the observable (perhaps they store it and subscribe later), then they'll miss the completion of this.connect(start) and the observable will appear to never produce anything.
You provide no way to unsubscribe from the initial connect call if the caller changes its mind and unsubscribes early. Not a real big deal, but usually when one constructs an observable, one should try to chain the disposables together so it all cleans up properly if the caller unsubscribes.
Here's a modified version:
It passes errors from this.connect to the observer.
It uses Observable.create to create a cold observable that only starts its business when the caller actually subscribes, so there is no chance of missing the initial page value and stalling the stream.
It combines the this.connect subscription disposable with the overall subscription disposable.
Code:
function records(start = 'LATEST', limit = 1000) {
    return Rx.Observable.create(observer => {
        let pages = new Rx.Subject();
        let connectSub = new Rx.SingleAssignmentDisposable();
        let resultsSub = new Rx.SingleAssignmentDisposable();
        let sub = new Rx.CompositeDisposable(connectSub, resultsSub);
        // Make sure we subscribe to pages before we issue this.connect()
        // just in case this.connect() finishes synchronously (possible if it caches values or something?)
        let results = pages
            .flatMap(page => this.read(page, limit))
            // The continuation value lives on the page result (r.next), not on `this`
            .doOnNext(r => r.next !== undefined ? pages.onNext(r.next) : pages.onCompleted())
            .flatMap(r => r.data);
        resultsSub.setDisposable(results.subscribe(observer));
        // now query the first page
        connectSub.setDisposable(this.connect(start)
            .subscribe(p => pages.onNext(p), e => observer.onError(e)));
        return sub;
    });
}
Note: I've not used the ES6 syntax before, so hopefully I didn't mess anything up here.
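For comparison, the same "follow the continuation value until there is none" loop can be sketched as a pull-based Python generator. This is an analogy, not RxJS: read_page, FAKE_PAGES and fake_read are hypothetical stand-ins for this.read and a paged backend, and a page is assumed to be a (data, next_token) pair with None marking the last page.

```python
from itertools import islice
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Assumed page shape: (records on this page, continuation token or None).
Page = Tuple[List[str], Optional[str]]


def records(read_page: Callable[[str], Page], start: str = "LATEST") -> Iterator[str]:
    """Yield records page by page, following continuation tokens lazily."""
    token: Optional[str] = start
    while token is not None:
        data, token = read_page(token)
        yield from data


# Hypothetical in-memory backend used only for the demonstration.
FAKE_PAGES: Dict[str, Page] = {
    "LATEST": (["a", "b"], "p2"),
    "p2": (["c"], "p3"),
    "p3": (["d", "e"], None),
}
calls: List[str] = []


def fake_read(token: str) -> Page:
    """Record which pages were requested, then return the canned page."""
    calls.append(token)
    return FAKE_PAGES[token]
```

list(records(fake_read)) gives ['a', 'b', 'c', 'd', 'e'], and list(islice(records(fake_read), 2)) requests only the first page. Like the cold observable above, nothing is read until a consumer starts pulling, and stopping early means later pages are never fetched.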

rx reactive extension: how to have each subscriber get a different value (the next one) from an observable?

Using reactive extension, it is easy to subscribe 2 times to the same observable.
When a new value is available in the observable, both subscribers are called with this same value.
Is there a way to have each subscriber get a different value (the next one) from this observable ?
Example of what I'm after:
source sequence: [1,2,3,4,5,...] (infinite)
The source is constantly adding new items at an unknown rate.
I'm trying to execute a lengthy async action for each item using N subscribers.
1st subscriber: 1,2,4,...
2nd subscriber: 3,5,...
...
or
1st subscriber: 1,3,...
2nd subscriber: 2,4,5,...
...
or
1st subscriber: 1,3,5,...
2nd subscriber: 2,4,6,...
I would agree with Asti.
You could use Rx to populate a queue (a BlockingCollection) and then have competing consumers read from the queue. This way, if one consumer is faster for some reason, it can pick up the next item while the other consumer is still busy.
However, if you want to do it, against good advice :), then you could just use the Select overload that provides you with the index of each element. You can then pass that down to your subscribers and they can filter on a modulus. (Yuck! Leaky abstractions, magic numbers, potentially blocking, potential side effects to the source sequence, etc.)
var source = Observable.Interval(TimeSpan.FromSeconds(1))
    .Select((element, index) => new { Index = index, Element = element });
var subscription1 = source.Where(x => x.Index % 2 == 0).Subscribe(x => DoWithThing1(x.Element));
var subscription2 = source.Where(x => x.Index % 2 == 1).Subscribe(x => DoWithThing2(x.Element));
Also remember that if the work done in the OnNext handler is blocking, it will still block the scheduler that it runs on. This could affect the speed of your source/producer. That is another reason why Asti's answer is the better option.
Ask if that is not clear :-)
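The queue-with-competing-consumers idea from the start of this answer can be sketched as follows. It uses Python's asyncio.Queue rather than Rx and BlockingCollection, so treat it as an analogy; worker and run_competing_consumers are names invented for this illustration, and the sleep stands in for the lengthy action.

```python
import asyncio


async def worker(name, queue, results):
    """Take the next available item, process it, and record who handled it."""
    while True:
        item = await queue.get()
        try:
            await asyncio.sleep(0.01 * item)  # stand-in for the lengthy action
            results.append((name, item))
        finally:
            queue.task_done()


async def run_competing_consumers(items, worker_count=2):
    """Feed items into one queue and let worker_count consumers compete for them."""
    queue = asyncio.Queue()
    results = []
    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, results))
        for i in range(1, worker_count + 1)
    ]
    for item in items:
        queue.put_nowait(item)
    await queue.join()  # wait until every item has been processed
    for task in workers:
        task.cancel()  # the workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Each item is handled by exactly one worker, and which worker gets which item depends on who becomes free first, which is the distribution asked for in the question.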
How about:
IObservable<TRet> SomeLengthyOperation(T input)
{
    return Observable.Defer(() => Observable.Start(() => {
        return someCalculatedValueThatTookALongTime;
    }, TaskPoolScheduler.Default));
}
someObservableSource
    .SelectMany(x => SomeLengthyOperation(x))
    .Subscribe(x => Console.WriteLine("The result was {0}", x));
You can even limit the number of concurrent operations:
someObservableSource
    .Select(x => SomeLengthyOperation(x))
    .Merge(4 /* at a time */)
    .Subscribe(x => Console.WriteLine("The result was {0}", x));
For the Merge(4) to work, it's important that the Observable returned by SomeLengthyOperation is a cold Observable, which is what the Defer does here: it makes the Observable.Start not happen until someone subscribes.
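The "at most 4 at a time" idea can also be sketched outside Rx. The version below uses asyncio with a semaphore, which is a different mechanism from Merge(4) and only illustrates the same limit; lengthy_operation, run_limited and the tracker dict are hypothetical names for this example.

```python
import asyncio


async def lengthy_operation(value, tracker):
    """Stand-in for slow work; tracks how many operations run concurrently."""
    tracker["running"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["running"])
    try:
        await asyncio.sleep(0.01)
        return value * value
    finally:
        tracker["running"] -= 1


async def run_limited(values, max_concurrent=4):
    """Run lengthy_operation for every value, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tracker = {"running": 0, "peak": 0}

    async def guarded(value):
        # The operation only starts once a semaphore slot is free.
        async with semaphore:
            return await lengthy_operation(value, tracker)

    results = await asyncio.gather(*(guarded(v) for v in values))
    return list(results), tracker["peak"]
```

asyncio.run(run_limited(range(10))) returns the squares in input order together with the peak concurrency, which never exceeds max_concurrent. The work inside guarded does not begin until a slot is acquired, which plays a role loosely similar to the deferred (cold) start described above.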