Does the QuickFix engine provide recovery by default?

I am using a QuickFix engine based Initiator application. Will that application handle recovery by default, or do I need to provide a MessageStoreFactory? If it is needed, I will use "FileStoreFactory".
Also, I assume the default "MessageStoreFactory" and "LogFactory" must be doing their work synchronously, and that will affect latency? Is my assumption correct?
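For reference, a minimal sketch of wiring a FileStoreFactory (and FileLogFactory) into a SocketInitiator with QuickFix/J. The settings file name is a placeholder, ApplicationAdapter is only a no-op stand-in for your own Application, and whether the resulting latency is acceptable is something you would have to measure:
import quickfix.*;

public class InitiatorMain {
    public static void main(String[] args) throws Exception {
        // FileStorePath, FileLogPath, session definitions, etc. come from the settings file.
        SessionSettings settings = new SessionSettings("initiator.cfg"); // placeholder file name

        Application application = new ApplicationAdapter(); // replace with your own Application

        // FileStoreFactory persists sequence numbers and outgoing messages to disk,
        // which is what allows resend/recovery after a restart.
        MessageStoreFactory storeFactory = new FileStoreFactory(settings);
        LogFactory logFactory = new FileLogFactory(settings);
        MessageFactory messageFactory = new DefaultMessageFactory();

        Initiator initiator = new SocketInitiator(
                application, storeFactory, settings, logFactory, messageFactory);
        initiator.start();
    }
}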

Related

How does Cadence handle faults in various failure conditions?

Cadence is a fault-tolerant, stateful code platform. How does Cadence handle faults in various failure conditions?
There are all kinds of failures in distributed systems, and Cadence provides various options to handle them.
Here is my list. It may not be complete, but I will try to add more as I think of them.
Activity
Activity failure and retry. See https://cadenceworkflow.io/docs/concepts/activities/#timeouts
Also note that a long-running activity can recover from checkpoints via "heartbeat" details.
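A hedged sketch of how this is typically set up with the Cadence Java client; the timeouts and retry limits below are made-up illustrative values, and inside the activity implementation you would call Activity.heartbeat(progress) so a retry can resume from the last reported checkpoint:
import com.uber.cadence.activity.ActivityOptions;
import com.uber.cadence.common.RetryOptions;
import java.time.Duration;

public class ActivityRetryExample {
    // Options a workflow would pass when creating its activity stub,
    // e.g. Workflow.newActivityStub(MyActivities.class, activityOptions()).
    static ActivityOptions activityOptions() {
        return new ActivityOptions.Builder()
                .setScheduleToCloseTimeout(Duration.ofMinutes(30)) // illustrative overall timeout
                .setHeartbeatTimeout(Duration.ofSeconds(30))       // activity must heartbeat within this window
                .setRetryOptions(new RetryOptions.Builder()
                        .setInitialInterval(Duration.ofSeconds(1))
                        .setBackoffCoefficient(2.0)
                        .setMaximumAttempts(5)                     // illustrative retry limit
                        .build())
                .build();
    }
}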
Workflow
By design of the event-sourcing model, a workflow can recover from the point where it left off when a worker crashes. See https://cadenceworkflow.io/docs/concepts/workflows/#state-recovery-and-determinism
A workflow can also have a retry policy, like an activity, to retry on failure automatically: https://cadenceworkflow.io/docs/concepts/workflows/#workflow-retries
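A similar hedged sketch for the workflow side, again with the Cadence Java client; the task list name, timeout, and retry limit are placeholders for illustration:
import com.uber.cadence.client.WorkflowOptions;
import com.uber.cadence.common.RetryOptions;
import java.time.Duration;

public class WorkflowRetryExample {
    // Options a client would pass when creating a workflow stub,
    // e.g. workflowClient.newWorkflowStub(MyWorkflow.class, workflowOptions()).
    static WorkflowOptions workflowOptions() {
        return new WorkflowOptions.Builder()
                .setTaskList("my-task-list")                          // placeholder task list
                .setExecutionStartToCloseTimeout(Duration.ofHours(1)) // illustrative timeout
                .setRetryOptions(new RetryOptions.Builder()
                        .setInitialInterval(Duration.ofSeconds(10))
                        .setMaximumAttempts(3)                        // illustrative retry limit
                        .build())
                .build();
    }
}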
In certain scenarios the failure is caused by a bad code change that leads to wrong states. Cadence provides a "reset" tool to reset a workflow to any point in time.
See https://cadenceworkflow.io/docs/cli/#reset-and-restart
On top of reset, Cadence also allows you to reset by deployment. This is useful for resetting a large number of workflows (e.g., millions).
Cadence server cluster
Both activity and workflow workers are stateless.
The Cadence server is a highly available and scalable service that provides the durability.
The durability comes from the underlying design and the persistence storage (Cassandra, MySQL, or Postgres).
In a single-cluster setup, the Cadence service runs as a set of independent shards. The whole cluster consists of multiple hosts, and any failed host can be replaced by another.
Cadence also provides cross-data-center replication for much higher availability: https://cadenceworkflow.io/docs/concepts/cross-dc-replication/#global-domains-architecture

Getting Beyond 50 Replica Set Members in MongoDB

I'm looking to build a distributed Access Control system for a microservice platform. I'm considering using MongoDB as my database technology. My system design objectives are as follows:
Policy Enforcement should be distributed - If any given Policy Enforcement Point (PEP) experiences downtime, only the application that the PEP serves should be affected.
Policy Decisions should be distributed - We don't want the whole platform to experience downtime because a central Policy Decision Point (PDP) is experiencing downtime. We only want it to affect the application that it serves.
Policy Administration should be centralized - Creating a centralized policy administration interface provides the ability for any system (including a UI) to understand what rights an individual has, and by establishing a common interface it allows us to more easily audit changes to access across a whole platform.
Policy Information (context) is distributed - We don't get to choose this if we are building a distributed microservice platform. We can centralize the retrieval of additional context by aggregating data that is needed to make access control decisions into a single place, but the data sources are still distributed.
I'm considering building a system like the one shown below. The idea is that Access Policies are administered by a central Policy Admin API. This API manages Policies that are persisted to a MongoDB cluster backed by a 3-member replica set. I would like other APIs in the platform to have a dedicated policy-query-api (Policy Decision Point) deployed alongside them to make Access Control decisions pertinent to that API. The idea is that if any one of the policy-query-apis goes down, only the API that it serves will be affected.
I want changes to Policies to be governed by the Policy Admin API, and I would like those changes to be replicated to each mongo instance used by each of the policy-query-apis. I don't want the mongo replicas for each policy-query-api to affect a write to the primary.
I also don't need immediate data consistency (less than 5 seconds of latency is fine), but I would like the data replication to be handled at the database layer if possible. The technology is already built to handle this, and I don't want to reinvent the wheel at the application layer if possible.
I've looked at the documentation on Replica Set Members, and I've pretty thoroughly reviewed the documentation on Replica Sets in Mongo. It seems like having a Hidden Member or Delayed Member would be a good fit for my use case. Do you agree? Also, I'm concerned about the 50-member replica set limit. Since each one of these replicas would serve an API in my platform, if there were more than 50 microservices (which is quite likely), how would I manage replication like this?
Just so that I understand, you are asking about:
one standalone node per application (? your picture suggests a standalone, but you are asking about the 50-node RS limit), with data mirrored to the standalone from the master RS
the application only queries its local standalone
MongoDB provides read preference nearest for the use case of reading data from local nodes. Importantly the nearest read preference still provides availability if your local node is unavailable - the next closest (roughly) node will be used in this case. Your proposed architecture would take the application down every time its local database node needs to be restarted for version upgrades.
You may also look into tag sets.
Additionally, MongoDB allows specifying priorities on nodes for election purposes. If you put all of your MongoDB nodes into the same RS, you can use priorities to have one of the 3 designated "main" servers be elected primary if any of them is available.
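To illustrate the nearest read preference and tag sets mentioned above, here is a hedged sketch with the MongoDB Java driver; the connection string, database, collection, and tag names are placeholders, and the "dc" tag would have to be configured on the replica set members themselves:
import com.mongodb.ReadPreference;
import com.mongodb.Tag;
import com.mongodb.TagSet;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import java.util.Arrays;

public class PolicyQueryExample {
    public static void main(String[] args) {
        // Placeholder connection string listing the replica set members.
        MongoClient client = MongoClients.create(
                "mongodb://mongo-a:27017,mongo-b:27017,mongo-c:27017/?replicaSet=policyRS");
        MongoDatabase db = client.getDatabase("policies");

        // Read from the closest member; if the local node is down, the driver
        // falls back to the next closest one instead of failing the application.
        MongoCollection<Document> nearestReads =
                db.getCollection("policy").withReadPreference(ReadPreference.nearest());

        // Alternatively, prefer members carrying a specific replica set tag.
        MongoCollection<Document> taggedReads =
                db.getCollection("policy").withReadPreference(
                        ReadPreference.nearest(new TagSet(Arrays.asList(new Tag("dc", "east")))));

        System.out.println("visible policies: " + nearestReads.countDocuments());
        client.close();
    }
}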

When to use polling and streaming in LaunchDarkly

I have started using LaunchDarkly (LD) recently, and I was exploring how LD updates its feature flags.
As mentioned here, there are two ways:
Streaming
Polling
I was wondering which implementation would be better in which cases. After a little research on streaming vs. polling, I found that streaming has the following advantages over polling:
Faster than polling
Receives only the latest data instead of all the data (most of which is the same as before)
Avoids periodic requests
I am pretty sure all of the above advantages come at a cost. So:
Are there any downsides of using streaming over polling?
In what scenarios should polling be preferred, or the other way around?
On what factors should I decide whether to stream or poll?
Streaming
Streaming requires your application to be always alive. This might not be the case in a serverless environment. Furthermore, a streaming solution usually relies on a connection that is always open in the background. This might be costly, so feature flag providers tend to limit the number of concurrent connections you can keep open to their infrastructure. This might not be a problem if you use feature flags only in a few application instances, but you will easily reach the limit if you want to stream feature flag updates to mobile apps or a ton of microservices.
Polling
Polling sounds less fancy, but it's a reliable & robust old-school pattern that will work in almost all environments.
Webhooks
There is a third option too: webhooks. The basic idea is that you create an HTTP endpoint on your end and the feature flag service calls that endpoint whenever a feature flag value update happens. This way you get a "notification" about feature flag value changes. For example, ConfigCat supports this model. ConfigCat can notify your infrastructure by calling your webhooks and (optionally) pushing new values to your end. Webhooks have the advantage over streaming that they are cheap to maintain, so feature flag service providers don't limit them as much (for example, ConfigCat gives you unlimited webhooks).
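As an illustration only, a webhook receiver can be as small as one Spring controller that triggers a refresh of locally cached flags; the endpoint path, the payload handling, and the FeatureFlagCache type are all hypothetical, not any provider's actual contract:
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical local cache of flag values used by the application.
interface FeatureFlagCache {
    void refresh();
}

@RestController
public class FlagWebhookController {

    private final FeatureFlagCache cache;

    public FlagWebhookController(FeatureFlagCache cache) {
        this.cache = cache;
    }

    // The feature flag service calls this endpoint whenever a flag value changes.
    @PostMapping("/webhooks/feature-flags")
    public void onFlagChange(@RequestBody(required = false) String payload) {
        // Re-fetch (or apply pushed values) so local evaluations see the update.
        cache.refresh();
    }
}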
How to decide
How I would use the above 3 options really depends on your use case. A general rule of thumb: use polling by default, and add quasi real-time notifications (by streaming or by webhooks) to the components where it's critical to know about feature flag value updates.
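For LaunchDarkly specifically, the choice is a client configuration option. A hedged sketch with the server-side Java SDK (the SDK key and poll interval are placeholders; builder names have changed across SDK major versions, so check the version you are on):
import com.launchdarkly.sdk.server.Components;
import com.launchdarkly.sdk.server.LDClient;
import com.launchdarkly.sdk.server.LDConfig;
import java.time.Duration;

public class LaunchDarklySetup {
    public static void main(String[] args) throws Exception {
        // Streaming (the SDK default): a long-lived connection pushes flag changes.
        LDConfig streamingConfig = new LDConfig.Builder()
                .dataSource(Components.streamingDataSource())
                .build();

        // Polling: periodically fetch the flag payload instead of keeping a connection open.
        LDConfig pollingConfig = new LDConfig.Builder()
                .dataSource(Components.pollingDataSource()
                        .pollInterval(Duration.ofSeconds(60))) // placeholder interval
                .build();

        try (LDClient client = new LDClient("sdk-key-placeholder", pollingConfig)) {
            // Evaluate flags as usual; only the update mechanism differs.
        }
    }
}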
In addition to Zoltan's answer, I found the following in LaunchDarkly's Effective Feature Management e-book (page 36).
In any networked system there are two methods to distribute information.
Polling is the method by which the endpoints (clients or servers) periodically ask for updates. Streaming, the second method, is when the central authority pushes the new values to all the endpoints as they change. Both options have pros and cons.
However, in a poll-based system, you are faced with an unattractive trade-off: either you poll infrequently and run the risk of different parts of your application having different flag states, or you poll very frequently and shoulder high costs in system load, network bandwidth, and the necessary infrastructure to support the high demands.
A streaming architecture, on the other hand, offers speed advantages and consistency guarantees. Streaming is a better fit for large-scale and distributed systems. In this design, each client maintains a long-running connection to the feature management system, which instantly sends down any changes as they occur to all clients.
Polling Pros:
Simple
Easily cached
Polling Cons:
Inefficient. All clients need to connect momentarily, regardless of whether there is a change.
Changes require roughly twice the polling interval to propagate to all clients.
Because of long polling intervals, the system could create a “split brain” situation, in which both new flag and old flag states exist at the same time.
Streaming Pros:
Efficient at scale. Each client receives messages only when necessary.
Fast Propagation. Changes can be pushed out to clients in real time.
Streaming Cons:
Requires the central service to maintain connections for every client
Assumes a reliable network
For my use case, I have decided to use polling in places where I don't need to update the flags often (long polling interval) and don't care about inconsistencies (split brain),
and streaming for applications that need immediate flag updates and where consistency is important.

How to tune Netflix Eureka self-preservation to handle autoscaling?

The self-preservation feature, under which instances never expire, does not look friendly to cluster auto-scaling.
When we scale down our services after reduced load, those shut-down instances could trigger self-preservation.
As I understand it, self-preservation tries to tolerate short-term network issues. But there already exist settings that allow us to tune the tolerance window:
eureka.instance.lease-expiration-duration-in-seconds = 90
eureka.instance.lease-renewal-interval-in-seconds = 30
I have seen advice not to turn self-preservation off, but it seems to bring more pain than gain. Am I missing something?
First, you need to distinguish between a normal shutdown and an unclean termination of the Eureka client. Self-preservation mode only cares about unclean termination.
Namely, when you scale down your servers, if you make your application shut down normally (unregister), self-preservation mode will not be activated.
If you're using a Spring Cloud based Eureka client, this normal shutdown is done when the application shuts down. The problem is that some Spring Cloud releases have an issue with sending the shutdown (Eureka unregister) message. So if you want to make sure, send unregister messages for the scaled-down instances via the Eureka server's REST API right after scaling down.
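A hedged sketch of that manual unregister call using Spring's RestTemplate; the Eureka URL, application name, and instance id are placeholders, and DELETE /eureka/apps/{appName}/{instanceId} is the documented Eureka REST operation for cancelling an instance:
import org.springframework.web.client.RestTemplate;

public class EurekaUnregister {
    public static void main(String[] args) {
        RestTemplate restTemplate = new RestTemplate();

        String eurekaUrl = "http://eureka-host:8761/eureka"; // placeholder Eureka server URL
        String appName = "MY-SERVICE";                       // placeholder application name
        String instanceId = "my-service-host-1:8080";        // placeholder instance id

        // DELETE /eureka/apps/{appName}/{instanceId} removes the instance from the
        // registry, so the scale-down does not look like an unclean termination.
        restTemplate.delete(eurekaUrl + "/apps/{app}/{instance}", appName, instanceId);
    }
}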
Another possible approach is just decreasing the threshold for self-preservation:
eureka:
  server:
    renewal-percent-threshold: 0.50
One more thing.
You need to be careful when changing the eureka.instance.leaseRenewalIntervalInSeconds value. The original Eureka server source code assumes this value is 30 seconds when it calculates the threshold for self-preservation mode. I'm not sure whether this hard-coded part still exists in the latest Spring Cloud release; you need to double-check.

JavaFX interactivity with Spring MVC Restful

I am building a JavaFX client application communicating with a Spring MVC RESTful server (Spring Boot 1.4.1) application, which works as expected.
Some features require fast interaction with the server to validate limits and availability before proceeding to the next input, for example checking whether the member number entered is valid and whether it has exceeded its insert limit, during the accumulation of records (each confirmed record is temporarily stored in a TableView before being sent to the server for storage) and before the records are actually saved.
Within the scope of JavaFX and the Spring framework (on both frontend and backend), how can such features be made to look more interactive (or live) than the normal "let-me-wait-for-response" approach?
If the question is not clear, just ask; otherwise I think it is.
It appears that the only interaction you have between the client (JavaFX) and the server (Spring Boot) is through a REST API. This will make short bursts of data (such as validation) take longer.
Switching to another communication mechanism (for example gRPC or Netty with MsgPack) could help. Note that once you open the door to non-REST calls, it will make you re-think the use of REST in the first place.
Non-REST communication may not be an option depending on your requirements (firewalls, etc.) or may need additional setup in order to surmount other obstacles; in other words, there's no free lunch.