Goal:
I use the Bloomberg Java API's subscription service to monitor bond prices in real time (subscribing to the ASK/BID real-time fields). However, in the RESPONSE messages, Bloomberg does not provide the associated yield for the given price. I need a way to calculate the yields.
Attempt:
Here's what I've tried:
Within the code that processes Events coming back from a real-time subscription, when I get a BID or ASK response, I extract the price from the message element and then initiate a new synchronous reference data request, using overrides to get YAS_BOND_YLD by providing YAS_BOND_PX and setting the override flag.
Problem:
This seems very slow and cumbersome. Is there a better way other than having to calculate yields myself?
In my code, I seem to be able to process real-time prices if they are sent to me slowly. If a few bonds' prices are updated at the same time (say, in MSG1 pricing), I seem to capture only one of these updates; it feels like I'm missing the other events. Is this because I cannot use a synchronous reference data request while the subscription is still alive?
Thanks.
> Bloomberg does not provide the associated yield for the given price
Have you tried retrieving the ASK_YIELD and BID_YIELD fields? They may be what you are looking for.
> Problem: This seems very slow and cumbersome.
Synchronous one-off requests are slower than a real-time subscription. Unless you need real-time data on the yield, you could queue the requests and send them all at once every x seconds, for example. The time to get 100 yields is probably not that different from the time to get 1, and certainly not 100 times slower.
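To illustrate the queueing idea, here is a minimal sketch in Python rather than the Java API. It keeps only the latest price per security and hands the whole batch over in one go. `fetch_yields` is a hypothetical callable that you would supply; how it maps onto actual Bloomberg requests is not shown here.

```python
import threading
from typing import Callable, Dict


class YieldRequestBatcher:
    """Collects the latest price per security and flushes them as one batch."""

    def __init__(self, fetch_yields: Callable[[Dict[str, float]], Dict[str, float]]):
        # fetch_yields is a hypothetical callable (not part of any Bloomberg API)
        # that resolves yields for every security in the batch it is given.
        self._fetch_yields = fetch_yields
        self._pending: Dict[str, float] = {}
        self._lock = threading.Lock()

    def on_price(self, security: str, price: float) -> None:
        # Called from the subscription handler; a newer price replaces an older one.
        with self._lock:
            self._pending[security] = price

    def flush(self) -> Dict[str, float]:
        # Swap the pending map out under the lock, then do the slow call outside it.
        with self._lock:
            batch, self._pending = self._pending, {}
        return self._fetch_yields(batch) if batch else {}
```

You would call `flush()` from a timer every x seconds. Because a newer price overwrites an older one for the same bond, a burst of updates turns into at most one yield lookup per bond per flush.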
> In my code, I seem to be able to process real-time prices if they are sent to me slowly. If a few bonds' prices are updated at the same time (say, in MSG1 pricing), I seem to capture only one of these updates; it feels like I'm missing the other events. Is this because I cannot use a synchronous reference data request while the subscription is still alive?
You should not miss items just because you are sending a synchronous request. You may get a "Slow consumer warning" but that's about it. It's difficult to say more without seeing your code. However, if you want to make sure your real time data is not delayed by your synchronous requests, you should use two separate Sessions.
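Whichever way you split the sessions, the handler that receives the real-time events should not block on the slow request. A rough sketch of that hand-off, in Python rather than the Java API: the handler only puts `(security, price)` on a queue, and a worker thread does the slow call. `compute_yield` and `on_yield` are hypothetical callables standing in for the synchronous request and for whatever you do with the result.

```python
import queue
import threading


def start_yield_worker(price_queue, compute_yield, on_yield):
    """Start a thread that drains price_queue and resolves yields off the handler thread.

    compute_yield(security, price) and on_yield(security, yield_value) are
    hypothetical callables supplied by the caller. Putting None on the queue
    stops the worker.
    """

    def run():
        while True:
            item = price_queue.get()
            if item is None:  # sentinel: shut down
                break
            security, price = item
            on_yield(security, compute_yield(security, price))

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

The subscription handler then just calls `price_queue.put((security, price))` and returns immediately.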
I'm running several batch updates (with about 200 entities each) and I have a Quantum Leap subscribed to the CB to capture the historical data. I would expect each batch update to generate 200 different notifications, but if I have set a throttle of "1", does this mean I will only receive the first notification and lose the other 199? Is this right? (Just watching the QL, it seems to me that I'm losing many notifications.)
Entities are processed one by one during a batch update request. Thus, if you have 200 entities and the update to each one triggers a subscription, then 200 notifications will be sent.
The throttling effect depends on the case. For instance:
If each entity triggers a different subscription, then throttling has no effect (as the throttling is evaluated per subscription).
If each entity triggers the same subscription, and assuming that all 200 notifications are sent very fast (let's say, in less than 1 second), then only the first one will be sent and the remaining 199 will be lost.
In general, we don't recommend using throttling because of this (and some other :) problems. It is usually better to implement traffic flow control in the receiver.
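The two cases above can be modelled with a few lines of Python. This is a simplified model of per-subscription throttling, not the broker's actual implementation; the function name and the `(subscription_id, time)` representation are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def apply_throttling(
    notifications: List[Tuple[str, float]], throttling: float
) -> List[Tuple[str, float]]:
    """Return the notifications that would actually be sent.

    notifications: (subscription_id, time_in_seconds) pairs, in time order.
    throttling: minimum number of seconds between two notifications
                of the same subscription.
    """
    last_sent: Dict[str, float] = {}
    sent = []
    for sub_id, t in notifications:
        previous = last_sent.get(sub_id)
        if previous is None or t - previous >= throttling:
            sent.append((sub_id, t))
            last_sent[sub_id] = t
        # Otherwise the notification is dropped, not queued for later.
    return sent
```

With 200 notifications for one subscription inside the same second and `throttling=1`, only the first survives; with 200 different subscriptions, all 200 do.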
I would like to count the number of messages in the last hour (last hour referring to a timestamp field in the message data).
I currently have code that counts the messages synchronously (I am using Google Cloud Pub/Sub synchronous pull), but I noticed it takes quite a long time.
My code repeatedly polls the subscription a predefined number of times (I set it to 100+) so that I can be sure no more messages from the last hour are still arriving out of order.
This is not an acceptable design, because it means the user has to wait 5-10 minutes for the service to count the messages whenever they want the metric!
Are there best practices in Pub/Sub design for solving this kind of problem?
This seems like a simple problem to solve (count the number of events in the last X timeframe) so I thought there might be.
Will asynchronous design help? How would an async design work? I am not too sure about the async and Python future concept (I am using GCP Pub/Sub's Python client library).
I would try to capture the messages differently. My solution is based on logging and BigQuery. The idea is to write a log entry (for example, message received with timestamp xxxxx), filter on this log pattern, and sink the result into BigQuery.
Then, when a user asks, you simply have to query BigQuery and count the messages in the desired time window. You also get the advantage of being able to change the time frame, keep a history, and so on.
There are 2 ways to write this log:
Cheaper but not really recommended: the process that consumes the message logs it when it processes it. However, you are then dependent on an external service, and this service has 2 responsibilities: its own work, and this log (for metrics). Not SOLID. Maybe this could be the role of the publisher, with a log like this: message published at XXXX. However, this implies that all the publishers or all the subscribers are on GCP.
Better: plug in a function, the cheapest one (128 MB of memory), that simply handles the message and writes the log.
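A minimal sketch of such a function, assuming a Pub/Sub-triggered background function with the `(event, context)` signature and a JSON message body. The `timestamp` field name and the `log_type` marker are assumptions made for the example; use whatever your messages and your log sink filter actually contain.

```python
import base64
import json


def log_message_received(event, context):
    """Pub/Sub-triggered function: write one structured log line per message."""
    # Pub/Sub delivers the message body base64-encoded in event["data"].
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    entry = {
        "log_type": "message_received",  # hypothetical marker for the log sink filter
        "message_timestamp": payload["timestamp"],  # assumed field name
    }
    # One JSON object per line on stdout, so the entry can be filtered and sunk.
    print(json.dumps(entry))
    return entry  # returned only to make the function easy to test
```

Counting the messages of the last hour is then a query on `message_timestamp` in the BigQuery table the sink writes to.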
I'm working with a REST API that requires an incrementing parameter to be sent with each request. I use Unix milliseconds as the nonce, and originally I naively sent requests one after another, but even if I send one message before another, they can arrive in reversed order, which results in an error.
One solution could be to send the next request only after the previous one has come back, but that would be too slow. I'm thinking about a less strict solution, like measuring latency over the last 10 requests and waiting for x% of that latency before sending the next message. I feel like this problem should already have been solved, but I can't find any good reference. I would appreciate any advice.
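For concreteness, the idea described above might look like the following sketch. Both class names are made up for the example. Note that pacing by a fraction of recent latency only makes reordering less likely; it does not rule it out.

```python
import threading
import time
from collections import deque


class NonceGenerator:
    """Millisecond-based nonce that is strictly increasing, even within one millisecond."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock
        self._last = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            # Never hand out a value that is not greater than the previous one.
            self._last = max(self._clock(), self._last + 1)
            return self._last


class LatencyPacer:
    """Suggests a delay equal to a fraction of the mean of the last `window` latencies."""

    def __init__(self, fraction: float = 0.5, window: int = 10):
        self._fraction = fraction
        self._latencies = deque(maxlen=window)

    def record(self, latency_seconds: float) -> None:
        self._latencies.append(latency_seconds)

    def delay(self) -> float:
        if not self._latencies:
            return 0.0
        return self._fraction * sum(self._latencies) / len(self._latencies)
```

Before each request I would sleep for `pacer.delay()`, take `nonce.next()`, send, and then `record()` the measured round-trip time.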
I am building a securities trading simulator in Scala/Akka. Each TraderActor has a var wealth that fluctuates over time as the actor trades via the market.
At various time intervals, I would like to query all of the TraderActors to get the current value of their respective 'wealth' and store all of the results in a database for later analysis. How might I accomplish this?
Querying millions of actors to retrieve the value they hold is not a good idea, because:
By the time you have the entire aggregated value, those values will be stale.
You cannot have a real-time report.
So you need some kind of distributed eventing system, like Kafka, and have the actors push their value to it upon any change. Then you can define a Kafka consumer that subscribes to it, receives the events, and aggregates or visualises them, etc.
This way you will have a live reporting system without setting up any cron job that periodically goes through the actors and retrieves their state.
I would send a StoreMessage that would tell the TraderActors to send their wealth value to a StoreController actor ref through some StoreData message.
The StoreController would then receive the StoreData messages and either store their content as they are received, or route them to a StoreWorker that would store them as they are received (making StoreController a router), or stack them before writing them, or any other strategy that suits your needs.
The way you want the StoreController to handle the received wealth mostly depends on your database, the number of TraderActors, how often you would like to store the values, etc.
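To show the message flow, here is a language-neutral sketch in plain Python rather than Akka (no actor library involved). `StoreData` and `StoreController` follow the names above; `Trader`, `write_rows` and `batch_size` are hypothetical, and the sketch uses the "stack them before writing" strategy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoreData:
    trader_id: str
    wealth: float


class StoreController:
    """Stacks StoreData messages and writes them in batches."""

    def __init__(self, write_rows: Callable[[List[StoreData]], None], batch_size: int = 100):
        self._write_rows = write_rows  # hypothetical database writer
        self._batch_size = batch_size
        self._buffer: List[StoreData] = []

    def receive(self, msg: StoreData) -> None:
        self._buffer.append(msg)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            rows, self._buffer = self._buffer, []
            self._write_rows(rows)


class Trader:
    def __init__(self, trader_id: str, wealth: float):
        self.trader_id = trader_id
        self.wealth = wealth

    def on_store_message(self, controller: StoreController) -> None:
        # Reaction to StoreMessage: report the current wealth to the controller.
        controller.receive(StoreData(self.trader_id, self.wealth))
```

In Akka, `on_store_message` and `receive` would be cases in each actor's receive block, and the scheduler would send the StoreMessage at each interval.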
I think the event bus implementation that comes with Akka is there for this very purpose.
Rx has a great function, Observable.Buffer. But there is a problem with it in real life.
Scenario: an application sends a stream of events to a database. Inserting events one by one is expensive, so we need to batch them. I want to use Observable.Buffer for this. But inserting into the DB has a small probability of failure (deadlocks, timeouts, downtime, etc.).
I can add some retry logic to the batching function itself, but that would go against the Rx idea of composability. Observable.Retry does not cut it, because it will re-subscribe to the "hot" source, which means that the failed batch will be lost.
Are there functions that I can compose to achieve the desired effect, or do I need to implement my own extension? I would like something like this:
_inputBuffer = new BufferBlock<int>();
_inputBuffer.AsObservable().
    Buffer(TimeSpan.FromSeconds(10), 1000).
    Do(batch => SqlSaveBatch(batch)).
    {Retry???}.
    Subscribe();
To make it perfect, I would like to control what happens when OnComplete is called while the retry buffer still holds unsaved batches, and be able to perform some actions (send an error email, save the data to the local file system, etc.).
When a save to the database fails and needs to be retried, it's not really the stream or the events that are in error; it's an action taken against an event.
I would structure your code more like this:
IDisposable subscription =
    _inputBuffer.AsObservable().
        Buffer(TimeSpan.FromSeconds(10), 1000).
        Subscribe(
            // Retry lives inside the save call, not in the stream.
            batch => SqlSaveBatchWithRetryLogic(batch),
            // Runs once, when the source completes.
            () => YourOnCompleteAction());
You can provide the retry logic inside of SqlSaveBatchWithRetryLogic()
Handle OnComplete of the events inside YourOnCompleteAction()
You can elect to dispose the subscription from within SqlSaveBatchWithRetryLogic() if you fail to save a batch.
This also removes the Do side effect.
I would be careful about this approach though - you need to watch the retry logic. You have no back-pressure (way to slow down the input). So if you have any kind of back-off/retry you are risking the queue backing up and filling memory. If you start seeing batches consistently at the count limit, you are probably in trouble! You may want to implement a counter to monitor the outstanding items.
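What the retry logic might look like, sketched in Python rather than C# to keep it short. It makes a bounded number of attempts with exponential back-off between them, so a failing database cannot stall the stream forever. `save_batch` is a hypothetical callable that raises on failure, and the default values are arbitrary.

```python
import time


def save_with_retry(batch, save_batch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Try to save a batch up to max_attempts times.

    Returns True on success and False once all attempts have failed, so the
    caller can decide what to do with the batch (log it, write it to disk, ...).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            save_batch(batch)
            return True
        except Exception:  # in real code, catch only the transient errors you expect
            if attempt < max_attempts:
                # Back off: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    return False
```

Bounding the attempts is what keeps the back-pressure problem manageable. A `False` return is the point where you would dispose the subscription or park the batch somewhere else.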