Does it make sense: a protocol designed for speed as well as resiliency that eliminates the FIX layer for high-performance order execution?
The FAST protocol is intended to be a "faster" version of the FIX protocol. The extra processing it requires, however, means that it is only faster "on the wire", so it will not be very effective for those with boxes at the exchange. #dumbcoder is, as usual, correct that optimization and high-powered machines are the best way of reducing latencies. It is also important to note that FIX isn't inherently slow; it depends on your implementation. Sell-side and HFT implementations are much faster than the cheaper ones used by the hedgies and investors.
Are you asking whether exchanges should adopt non-FIX protocols for receiving market messages? Some already have alternatives (e.g. NASDAQ's ITCH and OUCH). But they don't 'eliminate' the FIX layer - they still provide the same function, they just go about it in a different way.
FIX actually doesn't have to be all that slow - if you treat messages as byte arrays (instead of one big string) and then extract only exactly what you need (which, for order acceptances, fills, etc., can be very few tags), then FIX is really not that bad.
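A minimal sketch of that byte-array approach in Swift (the helper function here is hypothetical, but the SOH delimiter and tags 35 (MsgType) and 39 (OrdStatus) are standard FIX):

// FIX fields are "tag=value" pairs separated by the SOH byte (0x01).
// Scan the raw bytes once and pull out only the tags we care about,
// without ever materialising the whole message as one big string.
func extractTags(_ message: [UInt8], wanted: Set<Int>) -> [Int: String] {
    var result: [Int: String] = [:]
    for field in message.split(separator: 0x01) {
        guard let eq = field.firstIndex(of: UInt8(ascii: "=")),
              let tag = Int(String(decoding: field[..<eq], as: UTF8.self)),
              wanted.contains(tag) else { continue }
        result[tag] = String(decoding: field[(eq + 1)...], as: UTF8.self)
    }
    return result
}

// 35 = MsgType ("8" = execution report), 39 = OrdStatus ("2" = filled).
let raw = Array("8=FIX.4.2\u{01}35=8\u{01}39=2\u{01}".utf8)
print(extractTags(raw, wanted: [35, 39])) // [35: "8", 39: "2"] (order may vary)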
The key selling point of FIX is that it is an industry standard. Exchanges are free to develop their own proprietary protocols which can be higher performance, but the fact that everyone can write to a single protocol is a big deal (even if it is not always implemented in the most efficient manner).
Another angle which should be explored is whether the protocol has a separate communication channel or is implemented as a wrapper on top of another.
Working with FIX also has the definite advantage that an implementation remains portable across exchanges with only slight modification.
With the ZeroMQ PUB/SUB model, is it possible for a subscriber to filter based on the contents of more than just the first frame?
For example, if we have a multi-frame message that contains three frames: 1) data type, 2) instrument, and 3) the actual data, is it possible to subscribe to a specific (data type, instrument) pair?
(All of the examples I've seen only show filtering based off of the first frame of a multipart message.)
Different modes of ZeroMQ PUB/SUB filtering
The initial ZeroMQ model used subscriber-side, subscription-based filtering.
That was easier for the publisher, as it did not need any subscriber-specific handling and simply fed all data to all SUBs (yes, at the cost of wasted network traffic, and of the SUB-side workload to process all incoming BLOBs all the way up from the PHY-media, even for topics it never subscribed to. Yes, pretty expensive in low-latency designs).
PUB-side filtering was initially only proposed for later implementation;
still, this mode does not allow your idea to fly just by using the PUB/SUB Scalable Formal Communication Pattern.
ZeroMQ's design strives to inspire users to achieve just-enough-designed distributed-system behaviour.
As understood from your 1-2-3 frame intention, a bit of custom logic will be just enough to achieve the desired processing.
So, how to approach the solution?
Do not hesitate to set up several messaging/control relays in parallel in your application domain-specific solution. That works much better for any tailor-made solution, and it is a way safer than trying to "bend" a library's primitive archetype (in this case the trivial PUB/SUB topic-based filtering) into something a bit different from the use case it was originally designed for.
This is all the more true if your domain-specific use is FOREX/equity trading, where latency is your worst enemy, so a successful approach has to minimise stream decoding and alternative branching as much as possible.
nanoseconds do matter
If one reads into the details of multiframe composition (a sender-side BLOB assembly), there is no latency-wise advantage, as the whole BLOB gets onto the wire only after it has been completed - so there is no advantage for your SUB side if your idea was to handle the initial frame's content as early signalling, because the whole BLOB arrives "together".
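That said, if you control the publisher, one common workaround (my assumption here, not part of the answer above) is to fold both fields into the first frame, so that ZeroMQ's ordinary prefix subscription can select the (data type, instrument) pair. A minimal sketch of the scheme, independent of any particular ZeroMQ binding:

// Hypothetical topic scheme: "<dataType>|<instrument>|" as frame 0,
// with the actual data as frame 1.
func makeTopic(dataType: String, instrument: String) -> String {
    return dataType + "|" + instrument + "|"
}

// This mirrors what ZMQ_SUBSCRIBE does internally: byte-wise prefix
// matching against the first frame of each multipart message.
func matches(subscription: String, firstFrame: String) -> Bool {
    return firstFrame.hasPrefix(subscription)
}

let frame0 = makeTopic(dataType: "TRADE", instrument: "AAPL")
print(matches(subscription: "TRADE|AAPL|", firstFrame: frame0)) // true: exact pair
print(matches(subscription: "TRADE|", firstFrame: frame0))      // true: all instruments
print(matches(subscription: "QUOTE|", firstFrame: frame0))      // false

Subscribing with the trailing delimiter included ("TRADE|AA|") prevents a short symbol from accidentally prefix-matching a longer one ("TRADE|AAPL|").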
What to prefer in which situation: Single function handling events or a function for each event?
Here is a basic code example:
Option 1
enum Notification {
    case A
    case B
    case C
}

protocol One {
    func consumer(consumer: Consumer, didReceiveNotification notification: Notification)
}
or
Option 2
protocol Two {
    func consumerDidReceiveA(consumer: Consumer)
    func consumerDidReceiveB(consumer: Consumer)
    func consumerDidReceiveC(consumer: Consumer)
}
Background
Apple uses both options. E.g., for NSStreamDelegate we have the first option, while in CoreBluetooth (e.g. CBCentralManagerDelegate) we see option two.
One big difference I see is that Swift does not support optional protocol methods nicely (only via protocol extensions or the @objc keyword).
What would you prefer? What are the advantages and disadvantages?
In terms of achieving the loosest form of coupling and highest degree of cohesion, naturally the choice would sway towards individual events, not this kind of multi-event bundle of responsibilities.
Yet there are a lot of practical concerns that might move you towards favoring the opposite, coarser way of dealing with events instead of a separate function per granular event.
Here are some possible ones (not listed in any specific order).
Boilerplate
While it's not the biggest thing to worry about, writing a bunch of functions tends to take a bit more effort than writing a bunch of if/else statements or switch cases within one. More important, however, is the code needed to connect/disconnect event-handling slots to event-handling signals. Avoiding the need to write that subscription/unsubscription kind of code for every single teeny event handled can save considerably on the amount of code to maintain.
Performance
It might seem counter-intuitive that performance can favor the coarser multi-event handler. After all, the granular event-handler requires less branching (one dynamic dispatch to get to the precise event handler), while the coarser one requires twice as much (one dynamic dispatch to get to a coarse event-handling site, and another local series of branches to get to the precise event-handling code).
Yet the cost of dynamic dispatch leans heavily on branch prediction. If you're branching into coarser event handlers, then often you're branching more often into the same set of instructions, and that can be an optimization strategy. To have two sets of more predictable branches can often produce more optimal results than one less predictable branch.
Moreover, coarser event-handling typically implies fewer aggregates, fewer lists of functions to call on the side of those triggering events. And that can translate to reduced memory usage and improved locality of reference.
On the flip side, branching into coarser event handlers often means branching more often. For example, some site might only be interested in push-type input events, not resize events. If we lump all these together into a coarse event handler without some filtering mechanism on top, then typically we have to pay the cost of dynamic dispatch even for a resize event that isn't handled at a particular site.
Yet I've found that branching needlessly into the same coarse functions is often better than I thought it would be (most likely because the branch predictor succeeds), compared to branching into a wide variety of disparate functions only as needed.
So there's a balancing act here, and even performance doesn't clearly side with one strategy over the other. It still varies case-by-case.
Nevertheless, lacking measurements and very detailed data about the critical code paths, it's typically safer from a performance perspective to err on the side of these coarser multi-event handlers. After all, even if that proves to be the wrong decision from a performance standpoint, it's easier to optimize from coarse to fine (we can even do so very non-intrusively by keeping the coarse and using fine-grained event-handling in cases that benefit most from it) than vice versa.
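As a minimal sketch of that non-intrusive coarse-to-fine migration (hypothetical names, not code from the question): the coarse entry point stays the public contract, while hot events get split into fine-grained private methods, so no subscription code ever changes:

enum WidgetEvent {
    case buttonPushed
    case resized(width: Int, height: Int)
    case closed
}

protocol WidgetEventHandler: AnyObject {
    // The coarse entry point remains the only subscription surface...
    func handle(_ event: WidgetEvent)
}

final class PushButtonSite: WidgetEventHandler {
    func handle(_ event: WidgetEvent) {
        // ...while hot paths are split into fine-grained methods later,
        // without touching any subscribe/unsubscribe code.
        switch event {
        case .buttonPushed:
            didPushButton()
        case .resized(let width, let height):
            didResize(width: width, height: height)
        case .closed:
            break // not interested: a cheap, predictable branch
        }
    }

    private func didPushButton() { print("pushed") }
    private func didResize(width: Int, height: Int) { print("\(width)x\(height)") }
}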
Event Subscription/Unsubscription
This can likewise swing one way or the other, but in my experience (from team settings), most of the human errors associated with event handling do not occur within the event-handling code, but outside. The most common source of errors I see relates to failing to subscribe to events and, most commonly, failing to unsubscribe when the events are no longer of interest.
When events are handled at a coarser level, there's typically less of that error-prone subscription/unsubscription code involved (this relates to the boilerplate concerns above, but this is an unusual kind of boilerplate in that it can be quite error-prone and not merely tedious to write).
This is also very case-by-case. In the systems I've often been involved in, there was a common need for entities that unsubscribed from certain events prematurely to continue to exist. Those premature cases often required the unsubscription code to be written manually, as it could not be tied to an entity's lifetime. That may have pointed to design issues elsewhere, but in that scenario, the number of mistakes made team-wide went down with coarser event handling.
Type Safety
While not shown in the examples here, coarser event-handling typically brings a need to squeeze more disparate types of data through more generic parameters. In an extreme scenario, like in C, that might translate to squeezing more data through void pointers and more dangerous pointer type casts. With that, compile-time type safety is obliterated, and we could start seeing a whole new source of human error.
In higher-level languages, this might translate to more downcasts or things of that sort when we cannot model the signature of a delegate to perfectly fit the parameters passed in when an event is triggered.
I've found that this typically isn't the biggest source of confusion and bugs, provided that there is at least some form of runtime type safety when casting or unboxing these parameters. But it is a con on the side of coarser event-handling.
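To make that con concrete (a hypothetical sketch, not from the answer above): a coarse handler that squeezes every payload through a generic parameter forces runtime casts, while per-event signatures keep the compiler involved:

// Coarse and type-erased: the payload's real type is only known at runtime.
protocol CoarseDelegate: AnyObject {
    func didReceive(event name: String, payload: Any)
}

final class PriceLogger: CoarseDelegate {
    func didReceive(event name: String, payload: Any) {
        guard name == "priceUpdate" else { return }
        // A wrong cast compiles fine and only fails (yields nil) at runtime.
        guard let price = payload as? Double else { return }
        print("price:", price)
    }
}

// Granular: the signature itself documents and enforces the payload type.
protocol GranularDelegate: AnyObject {
    func didReceivePriceUpdate(_ price: Double)
}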
Intellectual Overhead
This might vary per individual but I tend to look at systems from a very administrative/overview kind of standpoint and specifically with respect to control flow. It's because I tend to work in lower-level portions of the system, including things like proprietary UI toolkits.
In those cases, when a button is pushed, what functions are called? In a large-scale codebase composed of hundreds of thousands of little functions, it turns into a mystery unless you trace into the code actually invoked when a button is pushed and see each and every function that is called.
That's an inevitability with an event-driven paradigm and something I never became 100% comfortable with, but I find it alleviates some of that explosive complexity I perceive in my personal mental model (something resembling a very complex graph) when there's less code decentralization. With coarser event handlers come fewer, more centralized functions to branch into throughout a system on such a button push, and that helps me increase my familiarity when there are fewer but bigger functions involved in my mental graph.
There is a very simple practical benefit here: if you want to find out how a specific entity responds to a series of events, you can simply put a breakpoint on this one coarse event-handling site (while still being able to drill down to a specific event for that specific entity by putting a breakpoint in a local branch of code).
Of course, I might be an exception there working in these low-level systems that everyone uses. It seems a lot of people are comfortable with the idea of just subscribing to a button push event in their code without worrying about all the other subscribers to the same event.
From my kind of holistic control flow view of the system, it helps me to absorb the complexity more easily when there are fewer but coarser event-handling sites in the codebase even though I normally otherwise find monolithic functions to be a burden. Especially in a debugging context where I face a concern like, "What caused this to happen?", that combined with the event-handling concern of "What functions are actually going to be called when this happens?" can really multiply the complexity. With fewer potential target sites where events are handled, the latter concern is mitigated.
Conclusion
So these are some factors that might sway you to choose one design strategy over another. I find myself somewhere in the middle. I generally don't choose a design as coarse as, say, wndproc on Windows, which wants to associate a single, ultra-coarse event handler with every single window event imaginable. Yet I might favor designing at a coarser event-handling level than some, just to alleviate this kind of mental complexity, reduce code decentralization, and possibly improve performance (always with a profiler in hand).
And then there are times when I choose to design at a very granular level when the complexity isn't that great (typically when the package triggering events isn't that central), when performance isn't a concern or performance actually favors this route, and for the improved type safety. It's all case-by-case.
Can anyone explain if there are any significant advantages or disadvantages when choosing to implement features such as authentication or caching etc using hooks as opposed to using middleware?
For instance - I can implement a translation feature by obtaining the request object through custom middleware and setting an app language variable that can be used to load the correct translation file when the app executes. Or I can add a hook before the routing and read the request variable and then load the correct file during the app execution.
Is there any obvious reason I am missing that makes one choice better than the other?
Super TL/DR; (The very short answer)
Use middleware when first starting some aspect of your application (e.g. routers, the boot process, login confirmation), and use hooks everywhere else (e.g. in components or in microservices).
TL/DR; (The short answer)
Middleware is used when the order of execution matters. Because of this, middleware is often added to the execution stack in various places in the code (often during boot, when adding a logger, auth, etc.). In most implementations, each middleware function then decides whether execution continues or not.
However, using middleware when the order of execution does not matter tends to lead to bugs: a middleware that gets added forgets to continue execution by mistake, the intended order gets shuffled, or someone simply forgets where or why a middleware was added, because it can be added almost anywhere. These bugs can be difficult to track down.
Hooks are generally not aware of the execution order; each hooked function is simply executed, and that is all that is guaranteed (i.e. adding a hook after another hook does not guarantee the second hook always executes second, only that it will be executed). The choice to perform its task is left up to the function itself (reaching out to state to halt execution). Most people feel this is much simpler and has fewer moving parts, so statistically it yields fewer bugs. However, to detect whether it should run or not, it can be important to include additional state in hooks, so that the hook does not reach out into the app and couple itself with things it is not inherently concerned with (this takes discipline to reason about well, but is usually simpler). Also, because of their simplicity, hooks tend to be added at certain named points in the code, yielding fewer areas where hooks can exist (often a single place).
Generally, hooks are easier to reason about and store because their order is neither guaranteed nor thought about. And because a hook can choose to do nothing, hooks are computationally equivalent to middleware, making middleware only a form of coding style or shorthand for common cases.
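A minimal sketch of the two shapes (hypothetical names, not tied to any framework). Middleware forms an ordered chain where each link explicitly passes control onward; hooks are a flat list of callbacks that are all invoked, each deciding internally whether to act:

final class Request {
    var path: String
    var language: String?
    init(path: String) { self.path = path }
}

// Middleware: ordered; each one must explicitly continue the chain.
typealias Middleware = (Request, () -> Void) -> Void

func run(_ middlewares: [Middleware], on request: Request) {
    var index = 0
    func next() {
        guard index < middlewares.count else { return }
        let current = middlewares[index]
        index += 1
        current(request, next)
    }
    next()
}

// Hooks: simply invoked in storage order; no continuation to forget.
typealias Hook = (Request) -> Void

func fire(_ hooks: [Hook], on request: Request) {
    for hook in hooks { hook(request) }
}

let request = Request(path: "/home")
// Order matters: the language must be set before routing runs.
run([
    { r, next in r.language = "en"; next() },              // forgetting next() halts the chain
    { r, _ in print("route", r.path, r.language ?? "?") }  // end of chain
], on: request)
// Order does not matter: each hook is independent.
fire([
    { r in print("audit:", r.path) },
    { _ in print("metrics tick") }
], on: request)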
Deep dive
Middleware is generally thought of today by architects as a poor choice. It can lead to debugging nightmares, and the added effort in debugging is rarely outweighed by whatever shorthand it achieves.
Middleware and Hooks (along with Mixins, Layered-config, Policy, Aspects and more) are all part of the "strategy" type of design pattern.
Strategy patterns, because they come into play whenever code branching is involved, are probably one of the most often used software design patterns, if not the most used.
Knowledge and use of strategy patterns is probably the easiest way to gauge the skill level of a developer.
A strategy pattern is used whenever you need to apply "if...then" type of logic (optional execution/branching).
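As a minimal illustration (hypothetical names): a strategy turns an inline "if...then" branch into an injected behaviour:

// Inline branch version:
func price(forUserType userType: String, base: Double) -> Double {
    if userType == "member" { return base * 0.9 }
    return base
}

// Strategy version: the branch becomes an injected behaviour.
protocol PricingStrategy {
    func price(base: Double) -> Double
}
struct MemberPricing: PricingStrategy {
    func price(base: Double) -> Double { base * 0.9 }
}
struct GuestPricing: PricingStrategy {
    func price(base: Double) -> Double { base }
}

func price(using strategy: PricingStrategy, base: Double) -> Double {
    strategy.price(base: base)
}

print(price(using: MemberPricing(), base: 100)) // 90.0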
The more computational thought experiments you run on a piece of software, the more branches can be mentally reduced and subsequently refactored away. This is essentially "aspect algebra": constructing the "bones" of the issue, or thinking through what is happening over and over, reducing the procedure to its fundamental concepts/first principles. When refactoring, these thought experiments are where an architect spends the most time: finding common aspects and reducing unnecessary complexity.
At the destination of complexity reduction lie emergence (in systems-theory vernacular; with software specifically, applying configuration in special layers instead of writing software in the first place) and monads.
Monads tend to abstract away what is being done to a level that can lead to increased code execution time if a developer is not careful.
Both Monads and Emergence tend to abstract the problem away so that the parts can be universally applied using fundamental building blocks. Using Monads (for the small) and Emergence (for the large), any piece of complex software can be theoretically constructed from the least amount of parts possible.
After all, in refactoring: "the easiest code to maintain is code that no longer exists."
Functors and mapping functions
A great way to continually reduce complexity is applying functors and mapping functions. Functors are also usually the fastest possible way to implement a branch and let the compiler see into the problem deeply so it can optimize things in the best way possible. They are also extremely easy to reason with and maintain, so there is rarely harm in leaving your work for the day and committing your changes with a partially refactored application.
Functors get their name from mathematics (specifically category theory, in which a functor is a mapping between two categories). In computation, however, functors are generally just objects that map the problem-space in one way or another.
There is great debate over what is or is not a functor in computer science, but in keeping with the definition, you only need to be concerned with the act of mapping out your problem, using the "functor" as a temporary thought scaffold that lets you abstract the issue away until it becomes configuration or a factor of implementation instead of code.
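For a concrete (if loose) computational reading of "functor", the map operation on Swift containers is the usual example: a structure-preserving way to apply a function inside a context, which removes the branch from the calling code:

// Array's map applies a function to each element, preserving the shape.
let prices = [100.0, 250.0, 80.0]
let discounted = prices.map { $0 * 0.9 }
print(discounted) // [90.0, 225.0, 72.0]

// Optional's map applies the function only if a value is present, so the
// "is there a value?" if/else disappears from the calling code.
let input: String? = "42"
let doubled = input.flatMap { Int($0) }.map { $0 * 2 }
print(doubled ?? -1) // 84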
As far as I can say, middleware is perfect for routing work, and hooks are best for doing anything application-wide. For your case, I think it would be better to use hooks than middleware.
This question relates to project design. The project takes an electrical system and defines its function programmatically. Now that I'm knee-deep in defining the system, I'm incorporating a significant amount of interaction which causes the system to configure itself appropriately. Example: the system opens and closes electrical contactors when certain events occur. Because this system is on an airplane, it relies on air/ground logic and thus incorporates two different behaviors depending on where it is.
I give all of this explanation to demonstrate the level of complexity that this application contains. As I have continued in my design, I have employed if/else constructs as a means of extrapolating the proper configurations in this electrical system. However, the deeper I get into the coding, the more if/else constructs are required. I feel that I have reached a point where I am programming this system inefficiently.
For those who have tackled projects like this before, I ask: am I treading a well-known path (when it comes to defining EVERY possible scenario that could occur) and should I continue to persevere, or can I employ some other strategies to accomplish the task of defining a real-world system's behavior?
At this point, I have little to no experience using delegates, but I wonder if I could utilize some observers or other "cocoa-ey" goodness for checking scenarios in lieu of endless if/else blocks.
Since you are trying to model a real-world system, I would suggest creating a concrete object-oriented design that clearly defines the is-a and has-a relationships, and applying good old-fashioned object-oriented design to break the real-world system into a functional decomposition.
I would suggest you look into defining protocols that handle the generic case, and using them on specific cases.
For example, you can have many types of events adhering to an ElectricalEvent protocol; depending on the type, an ElectricalContactor can discriminate between a GeneralElectricEvent and a SpecializedElectricEvent using the isKindOfClass: selector.
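A minimal Swift sketch of that idea (using Swift's is/as type-check patterns rather than isKindOfClass:; the busVoltage payload is hypothetical, the type names are the ones suggested above):

protocol ElectricalEvent {}

struct GeneralElectricEvent: ElectricalEvent {}
struct SpecializedElectricEvent: ElectricalEvent {
    let busVoltage: Double // hypothetical payload
}

final class ElectricalContactor {
    var isClosed = false

    func handle(_ event: ElectricalEvent) {
        switch event {
        case let special as SpecializedElectricEvent:
            // Specialized events carry data the contactor can act on.
            isClosed = special.busVoltage > 28.0
        case is GeneralElectricEvent:
            isClosed = false
        default:
            break // unknown event types are ignored
        }
    }
}

let contactor = ElectricalContactor()
contactor.handle(SpecializedElectricEvent(busVoltage: 115.0))
print(contactor.isClosed) // true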
If you can define all the states in advance, you're best off implementing this as a finite state machine. This allows you to define the state-dependent logic clearly in one central place.
There are some implementations that you could look into:
SCM allows you to generate state machine code for Objective-C
OFC implements them as DFSM
Of course you can also roll your own customized implementation if that suits you better.
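A minimal sketch of such a state machine for the air/ground case (all names hypothetical, just to show the shape): states, events, and one central transition switch:

enum FlightState { case onGround, inAir }
enum FlightEvent { case weightOnWheels, weightOffWheels }

final class AirGroundStateMachine {
    private(set) var state: FlightState = .onGround

    // All state-dependent logic lives in this one central place.
    func handle(_ event: FlightEvent) {
        switch (state, event) {
        case (.onGround, .weightOffWheels):
            state = .inAir
            openContactorsForFlight()   // hypothetical side effect
        case (.inAir, .weightOnWheels):
            state = .onGround
            closeContactorsForGround()  // hypothetical side effect
        default:
            break // all other (state, event) pairs are no-ops
        }
    }

    private func openContactorsForFlight() { print("flight config") }
    private func closeContactorsForGround() { print("ground config") }
}

let machine = AirGroundStateMachine()
machine.handle(.weightOffWheels) // flight config
machine.handle(.weightOnWheels)  // ground config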
An initial draft of the requirements specification has been completed, and now it is time to take stock and review it. Part of this process is making sure that there are no sizeable gaps in the specification. Needless to say, gaps lead to highly inaccurate estimates, inevitable scope creep later in the project, and ultimately to a death march.
What are the good, efficient techniques for pinpointing missing and implicit requirements?
This question is about practical techniques, not general advice, principles or guidelines.
Missing requirements are anything crucial for the completeness of the product or service that was not thought of or was forgotten about.
Implicit requirements are things that users or customers naturally assume will be a standard part of the software without having to ask for them explicitly.
I am happy to re-visit the accepted answer, as long as someone submits a better, more comprehensive solution.
Continued, frequent, frank, and two-way communication with the customer strikes me as the main 'technique' as far as I'm concerned.
It depends.
It depends on whether you're being paid to deliver what you said you'd deliver or to deliver high quality software to the client.
If the former, simply eliminate ambiguity from the specifications and then build what you agreed to. Try to stay away from anything not measurable (like "fast", "cool", "snappy", etc...).
If the latter, what Galwegian said + time or simply cut everything not absolutely drop-dead critical and build that as quickly as you can. Production has a remarkable way of illuminating what you missed in Analysis.
evaluate the lifecycle of the elements of the model with respect to a generic/overall model such as
acquisition --> stewardship --> disposal
do you know where every entity comes from and how you're going to get it into your system?
do you know where every entity, once acquired, will reside, and for how long?
do you know what to do with each entity when it is no longer needed?
for a more fine-grained analysis of the lifecycle of the entities in the spec, make a CRUDE matrix for the major entities in the requirements. This is a matrix with the operations/applications as the rows and the entities as the columns. In each cell, put a C if the application Creates the entity, R for Reads, U for Updates, D for Deletes, or E for "Edits"; 'E' encompasses C, R, U, and D (most 'master table maintenance' apps will be Es). Then check each column for C, R, U, and D (or E); if one is missing (except E), figure out if it is needed. The rows and columns of the matrix can be rearranged (manually or using affinity analysis) to form cohesive groups of entities and applications which generally correspond to subsystems; this may assist with physical system distribution later.
It is also useful to add a "User" entity column to the CRUDE matrix and specify for each application (or feature or functional area or whatever you want to call the processing/behavioral aspects of the requirements) whether it takes Input from the user, produces Output for the user, or Interacts with the user (I use I, O, and N for this, and always make the User the first column). This helps identify where user-interfaces for data-entry and reports will be required.
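A tiny hypothetical example of such a matrix (entities and applications invented purely for illustration):

                    User   Customer   Order   Product
Order Entry          I        R         C        R
Fulfillment          N        R         U        R
Product Maint.       I        -         -        E
Reporting            O        R         R        R

Reading the columns: Product is fully covered by the E; Order has C, R, and U but no D, so ask whether orders are ever archived or deleted; Customer has only Rs, so something is missing - where do customers get created and maintained?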
the goal is to check the completeness of the specification; the techniques above are useful for checking whether the life-cycles of the entities are 'closed' with respect to the entities and applications identified
Here's how you find the missing requirements.
Break the requirements down into tiny little increments. Really small. Something that can be built in two weeks or less. You'll find a lot of gaps.
Prioritize those: what would be best to have first, what comes next, down to what doesn't really matter very much. You'll find that some of the gap-fillers didn't matter. You'll also find that some of the original "requirements" are merely desirable.
Debate the differences of opinion as to what's most important to the end users and why. Two users will have three opinions. You'll find that some users have no clue, and none of their "requirements" are required. You'll find that some people have no spine, and things they aren't brave enough to say out loud are "required".
Get a consensus on the top two or three only. Don't argue out every nuance. It isn't possible to envision software. It isn't possible for anyone to envision what software will be like and how they will use it. Most people's "requirements" are descriptions of how they struggle to work around the inadequate business processes they're stuck with today.
Build the highest-priority, most important part first. Give it to users.
GOTO 1 and repeat the process.
"Wait," you say, "What about the overall budget?" What about it? You can never know the overall budget. Do the following.
Look at each increment defined in step 1. Provide a price-per-increment. In priority order. That way someone can pick as much or as little as they want. There's no large, scary "Big Budgetary Estimate With A Lot Of Zeroes". It's all negotiable.
I have been using a modeling methodology called Behavior Engineering (bE) that uses the original specification text to create the resulting model. Once you have the model, it is easier to identify missing or incomplete sections of the requirements.
I have used the methodology on about six projects so far, ranging from fewer than a hundred requirements to over 1300 requirements. If you want to know more, I would suggest going to www.behaviorengineering.org - there are some really good papers regarding the methodology.
The company I work for has created a tool to perform the modeling. The work rate to actually create the model is about 5 requirements an hour for a novice and about 13 an hour for an expert. The cool thing about the methodology is that you don't really need to know anything about the domain the specification is written for. Using just the user text, such as nouns and verbs, the modeller will find gaps in the model in a very short period of time.
I hope this helps
Michael Larsen
How about building a prototype?
While reading tons of literature about software requirements, I found these two interesting books:
Problem Frames: Analysing & Structuring Software Development Problems by Michael Jackson (not the singer! :-).
Practical Software Requirements: A Manual of Content and Style by Benjamin Kovitz.
These two authors really stand out from the crowd because, in my humble opinion, they are making a really good attempt to turn development of requirements into a very systematic process - more like engineering than art or black magic. In particular, Michael Jackson's definition of what requirements really are - I think it is the cleanest and most precise that I've ever seen.
I wouldn't do these authors a good service by trying to describe their approach in a short posting here, so I am not going to do that. But I will try to explain why their approach seems extremely relevant to your question: it allows you to boil down most (not all, but most!) of your requirements development work to processing a bunch of check-lists telling you what requirements you have to define to cover all important aspects of the entire customer's problem. In other words, this approach is supposed to minimize the risk of missing important requirements (including those that often remain implicit).
I know it may sound like magic, but it isn't. It still takes a substantial mental effort to come to those "magic" check-lists: you have to articulate the customer's problem first, then analyze it thoroughly, and finally dissect it into so-called "problem frames" (which come with those magic check-lists only when they closely match a few typical problem frames defined by the author). Like I said, this approach does not promise to make everything simple. But it definitely promises to make the requirements development process as systematic as possible.
If requirements development in your current project is already quite far from the very beginning, it may not be feasible to try to apply the Problem Frames approach at this point (although it greatly depends on how your current requirements are organized). Still, I highly recommend reading those two books - they contain a lot of wisdom that you may still be able to apply to the current project.
My last important notes about these books:
As far as I understand, Mr. Jackson is the original author of the idea of "problem frames". His book is quite academic and theoretical, but it is very, very readable and even entertaining.
Mr. Kovitz's book tries to demonstrate how Mr. Jackson's ideas can be applied in real practice. It also contains tons of useful information on writing and organizing the actual requirements and requirements documents.
You can probably start with Kovitz's book (and refer to Mr. Jackson's book only if you really need to dig deeper on the theoretical side). But I am sure that, at the end of the day, you should read both books, and you won't regret it. :-)
HTH...
I agree with Galwegian. The technique described is far more efficient than the "wait for customer to yell at us" approach.