Fix Message ER Rejected No Route Defined - quickfix

I'm trying to send a New Order Single message, but I'm getting an ER Rejected message saying that the order reject reason is UNKNOWN ORDER and "No Route Defined". I couldn't find any explanation for this error.
If anyone knows what "No Route Defined" means I would be grateful, thanks.

Typically this means the FIX engine and transaction infrastructure at the other end (usually a broker you are connecting to with FIX) does not know what to do with the order you are sending. Specifically, it does not know which exchange or other handling venue to route it to. Hence 'no route'.
This may be because it does not recognize the instrument in your order, or because some combination of parameters on your order is invalid, although typically you would get a more informative error message if this were the cause.
Another possible cause is that the broker's connection to a downstream handling system (e.g. exchange, trading desk) has been interrupted. Sometimes this is a transient situation: a service interruption or a time-of-day issue outside regular trading hours (RTH).
In any case, the message indicates that a valid servicing destination for your order cannot be found at this moment.

Related

How does the FIX protocol handle a message sequence number overflow?

We are currently incorporating a FIX engine (using QuickFixJ) in our application. We will be the initiator and use trade capture reports to get informed on all trades happening on the platform.
The trading (and thus the FIX session) will be running 24/7 and we are currently looking into ways to handle this properly. Our concern is that at some point we will need to reset the message sequence numbers to avoid an overflow. We would ideally not want to reset the sequence number as we need to be sure that we catch every single trade. We are worried about the following scenario:
1. We send a SequenceReset message
2. Our system crashes due to unrelated reasons
3. The acceptor side sends us one or more TradeCaptureReport messages
4. Only now does the acceptor side receive our SequenceReset message
5. Our system has recovered and sends a ResendRequest message, with BeginSeqNo equal to 1 (because we have reset the message sequence number)
6. We do not get the TradeCaptureReport messages from (3.)
However, we have noticed that in case of a message sequence number overflow, neither our engine nor the acceptor side seems to be troubled by it.
The example I have tested is simply sending heartbeats which will overflow the sequence number:
8=FIXT.1.1|9=131|35=A|34=1|49=INITIATOR|50=INITIATOR|52=20220901-15:26:03.403|56=ACCEPTOR|98=0|108=10|141=Y|553=INITIATOR|554=password|1137=9|10=224|
8=FIXT.1.1|9=000102|35=A|49=ACCEPTOR|56=INITIATOR|34=1|57=INITIATOR|52=20220901-15:26:03.654|98=0|108=10|141=Y|1409=0|1137=9|10=212|
8=FIXT.1.1|9=90|35=4|34=2|49=INITIATOR|50=INITIATOR|52=20220901-15:26:03.718|56=ACCEPTOR|36=2147483646|123=Y|10=038|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=2|57=INITIATOR|52=20220901-15:26:13.792|10=009|
8=FIXT.1.1|9=79|35=0|34=2147483646|49=INITIATOR|50=INITIATOR|52=20220901-15:26:13.789|56=ACCEPTOR|10=044|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=3|57=INITIATOR|52=20220901-15:26:23.852|10=008|
8=FIXT.1.1|9=79|35=0|34=2147483647|49=INITIATOR|50=INITIATOR|52=20220901-15:26:23.850|56=ACCEPTOR|10=035|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=4|57=INITIATOR|52=20220901-15:26:33.896|10=018|
8=FIXT.1.1|9=80|35=0|34=-2147483648|49=INITIATOR|50=INITIATOR|52=20220901-15:26:33.892|56=ACCEPTOR|10=080|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=5|57=INITIATOR|52=20220901-15:26:43.933|10=012|
8=FIXT.1.1|9=80|35=0|34=-2147483647|49=INITIATOR|50=INITIATOR|52=20220901-15:26:43.932|56=ACCEPTOR|10=075|
Is this a feature of the FIX protocol or is it undefined behaviour (and just works coincidentally)? And if this doesn't work (or is discouraged), is there a best way to handle ongoing FIX sessions? We have not found any usable information and most exchanges we have seen simply reset once a day.
I think the title of the question should rather be "how does a FIX engine handle message sequence number overflow".
As per the FIX spec, the sequence number is always positive (FIX datatypes):
Sequence of character digits without commas or decimals. Value must be positive.
I can only speak for QuickFIX/J: internally the sequence number is of type java.lang.Integer which means its maximum positive value is 2147483647.
If QuickFIX/J (or any other engine) accepts or uses negative sequence numbers, that is clearly a bug.
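The jump from 2147483647 to -2147483648 in the question's log is what ordinary 32-bit signed overflow looks like. The following is a minimal sketch that imitates this arithmetic in Python; the function names are made up for illustration and are not part of QuickFIX/J.

```python
INT_MAX = 2**31 - 1  # 2147483647, the largest 32-bit signed value


def next_seq_as_int32(seq: int) -> int:
    """Increment seq the way a 32-bit signed integer would, wrapping on overflow."""
    return (seq + 1 + 2**31) % 2**32 - 2**31


def is_valid_fix_seqnum(seq: int) -> bool:
    """The FIX SeqNum datatype requires a positive value."""
    return seq > 0


# Start just below the limit, as the SequenceReset in the log does.
seq = INT_MAX - 1
observed = []
for _ in range(3):
    observed.append(seq)
    seq = next_seq_as_int32(seq)

print(observed)  # [2147483646, 2147483647, -2147483648]
print([is_valid_fix_seqnum(s) for s in observed])  # [True, True, False]
```

The third value is no longer a valid SeqNum, which is why a session that keeps running past this point is relying on both engines tolerating out-of-spec numbers.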
Maybe you should ask your exchange how other clients handle this. I think at some point they have a time window where sequence numbers can (and should) be reset.
I guess the exchange handles it like outlined here: FIX session 24-hour connectivity

Should it matter if a call to a private REST API returns 400 or 500?

We have a private REST API that is locked down and only ever called by software we control, not the public. Many endpoints take a JSON payload. If deserialising the JSON payload fails (eg. the payload has an int where a Guid is expected), an exception is thrown and the API is returning a 500 Internal Server Error. Technically, it should return a 400 Bad Request in this circumstance.
Without knowing how much effort is required to ensure a 400 is returned in this circumstance, is there benefit in changing the API to return a 400? The calling software and QA are the only entities that see this error, and it only occurs if the software is sending data that doesn't match the expected model which is a critical defect anyway. I see this as extra effort and maintenance for no gain.
Am I missing something here that the distinction between 400 and 500 would significantly help with?
From a REST perspective:
If you want to follow strict REST principles, you should return 4xx, as the problem is with the data being sent and not with the server program.
5xx are reserved for server errors. For example if the server was not able to execute the method due to site outage or software defect. 5xx range status codes SHOULD NOT be utilized for validation or logical error handling.
From a technical perspective:
The reported error does not convey useful information if tomorrow another programmer/team works on the issue
If tomorrow you have to log your errors in a central error log, you will pollute it with wrong status codes
As a consequence, if QA decides to run reports/metrics on errors, they will be erroneous
You may be increasing your technical debt, which can impact your productivity in the future.
The least you can do is to log this issue or create a ticket if you use a tool like JIRA.
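As a rough illustration of the distinction, here is a minimal sketch of a handler that reports a malformed payload as 400 and reserves 500 for failures in the server's own logic. The endpoint shape (`{"id": "<guid>"}`), `handle_request`, and the `process` callback are hypothetical and not taken from the API in the question.

```python
import json
import uuid


def handle_request(body, process=lambda item_id: f"processed {item_id}"):
    """Return (status_code, message) for a hypothetical endpoint expecting {"id": "<guid>"}."""
    # Anything wrong with the payload itself is the client's fault: 400.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return 400, f"Bad Request: invalid JSON ({exc})"
    if not isinstance(payload, dict) or not isinstance(payload.get("id"), str):
        return 400, "Bad Request: 'id' must be a GUID string"
    try:
        item_id = uuid.UUID(payload["id"])
    except ValueError:
        return 400, "Bad Request: 'id' is not a valid GUID"

    # Failures past this point are the server's own: 500.
    try:
        return 200, process(item_id)
    except Exception:
        return 500, "Internal Server Error"


print(handle_request('{"id": 42}')[0])  # 400: an int where a GUID is expected
print(handle_request('{"id": "12345678-1234-5678-1234-567812345678"}')[0])  # 200
```

With this split, a 500 in the logs points at the server code, while a 400 points at the caller.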
Should it matter if a call to a private REST API returns 400 or 500?
A little bit.
The status code is metadata:
The status-code element is a 3-digit integer code describing the result of the server's attempt to understand and satisfy the client's corresponding request. The rest of the response message is to be interpreted in light of the semantics defined for that status code.
Because we have a shared understanding of the status codes, general purpose clients can use that metadata to understand the broad meaning of the response, and take sensible actions.
The primary difference between 4xx and 5xx is the general direction of the problem. 4xx indicates a problem in the request, and by implication with the client.
The 4xx (Client Error) class of status code indicates that the client seems to have erred.
5xx indicates a problem at the server.
The 5xx (Server Error) class of status code indicates that the server is aware that it has erred or is incapable of performing the requested method
So imagine, if you would, a general purpose reverse proxy acting as a load balancer. How might the proxy take advantage of the ability to discriminate between 4xx and 5xx?
Well... 5xx suggests that the query itself might be fine. So the proxy could try routing the request to another healthy instance in the cluster, to see if a better response is available. It could look at the pattern of 5xx responses from a specific member of the cluster, and judge whether that instance is healthy or unhealthy. It could then evict that unhealthy instance and provision a replacement.
On the other hand, with a 4xx status code, none of those mitigations make any sense - we know instead that the problem is with the client, and that forwarding the request to another instance isn't going to make things any better.
Even if you aren't going to automatically mitigate the server errors, it can still be useful to discriminate between the two error codes, for internal alarms and reporting.
(In the system I maintain, we're using general purpose monitoring that distinguishes 4xx and 5xx responses, with different thresholds to determine if I should be paged. As you might imagine, I'm rather invested in having that system be well tuned.)
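The proxy behaviour described above can be sketched roughly as follows. This is an illustrative toy, not how any particular load balancer is implemented: `route` and the idea of modelling each instance as a callable that returns a status code are assumptions made for the example, and a real proxy would also have to consider whether the request is safe to retry.

```python
def is_server_error(status: int) -> bool:
    """True for the 5xx class: the server erred or cannot perform the request."""
    return 500 <= status <= 599


def route(request, instances):
    """Try each instance in turn, moving on to the next one only after a 5xx."""
    last_status = 503  # reported if there is no instance to try at all
    for send in instances:
        status = send(request)
        if not is_server_error(status):
            # Success or 4xx: another instance would not give a better answer.
            return status
        last_status = status
    return last_status


# Hypothetical instances, modelled as functions returning a status code.
failing = lambda request: 500
healthy = lambda request: 200
rejecting = lambda request: 400

print(route("GET /orders", [failing, healthy]))    # 200: retried on another instance
print(route("GET /orders", [rejecting, healthy]))  # 400: not retried
```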

QuickFIX/J initiator disconnecting due to seqnum too low

My QuickFIX/J initiator is getting "Disconnecting: Encountered END_OF_STREAM" while trying to log on to the acceptor. We are using a vendor's FIX engine as the acceptor, and the feedback from the acceptor is that the logon request for xxxx was not accepted: incoming too small, expect 305, received 27.
I read the QuickFIX documentation but didn't understand exactly what the proper solution for a sequence number mismatch is. I understand that if I am disconnected, my initiator will send a 35=2 (ResendRequest) with the initiator-side seqnum, asking the acceptor to resend the messages and fill the gap.
But in what case will an initiator that sends a lower seqnum be rejected by the acceptor, so that the connection is refused?
And what's the proper procedure to handle this kind of rejection and reconnect? In order not to lose any message, how should both sides do the reset and fill the gap?
In case there is a break between the initiator and the acceptor, what's the recommended solution to keep the messages in sync without losing any?
Because of the first sentence of your question, I would like to point you to an answer about the same error message, Disconnecting: Encountered END_OF_STREAM. It quotes a blog post by bhageera:
In the end the reason was pretty silly… the counterparty I was connecting to allows only 1 connection per user/password (i.e. session with those credentials) at a time. As it turns out there was another application using the same credentials against the same TargetCompID. As soon as that application was killed off, the current one logged in fine.
I searched for the cause of the bug for a while, until I realized that I had two initiators with the same credentials running on two different test environments.
According to the default logic in QuickFIX/J:
QuickFIX/J maintains two sequence numbers: the next one it expects to receive (expectedSeqNum, the target sequence number) and the next one to send (nextSeqNumber).
It checks the next expected target SeqNum against the received SeqNum. If a mismatch is detected, it applies the following logic:
if lower than the expected SeqNum, log out
if higher, send a resend request
In your case the received number was lower than expected, so the session gets disconnected.
Reason for receiving a higher-than-expected SeqNum:
The receiver missed some messages, so this can be a normal scenario.
Reason for a lower-than-expected SeqNum (your case):
One of the counterparties reset its sequence number unexpectedly; a reset should be agreed on by both counterparties.
In a normal scenario, whenever you miss a message you will receive a higher number, and QuickFIX/J will handle it.
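The decision described in this answer can be written down as a small function. This is a simplified sketch of the rule as stated here, not QuickFIX/J's actual code; `check_incoming_seqnum` and its return values are invented for the example, and special cases such as possible-duplicate messages are ignored.

```python
def check_incoming_seqnum(expected: int, received: int) -> str:
    """Decide what to do with an incoming message based on its sequence number."""
    if received < expected:
        # The counterparty is behind what we have already seen: refuse the session.
        return "logout"
    if received > expected:
        # We missed messages in between: ask for them again.
        return "resend_request"
    return "accept"


# The situation from the question: the acceptor expected 305 but received 27.
print(check_incoming_seqnum(expected=305, received=27))   # logout
print(check_incoming_seqnum(expected=305, received=310))  # resend_request
print(check_incoming_seqnum(expected=305, received=305))  # accept
```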

How old orders should be treated with a resend request

Occasionally my quickfixN engine loses connection with the exchange, and when it reconnects the exchange realises there are missing messages and asks for a resend. My engine then sends the messages.
However, the orders are often old, and often I will have subsequently sent an order cancellation request. Nevertheless, because the exchange executes the resent messages in order, there's a good chance the orders will be filled.
What is the correct way to deal with this problem? ie, how can I tell the exchange not to execute these orders, or alternatively, how can I stop quickfixN from resending old orders?
I don't know if there is a universally "correct" way to handle this issue.
In our system, we always, always respond with a Gap Fill, i.e.
Exchange: "Hey, we're missing sequences 537 through 542!"
Us: "Don't worry about it. Expect sequence 545 next."
The 545 is not a typo—we may have already sent 543 and 544 while their Resend Request was in transmission.
This technique is expressly to avoid the kind of dilemmas you're facing. By refusing to send old messages, you at the very least retain control over your executions.
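For illustration, the "Expect sequence 545 next" reply corresponds to a SequenceReset message (35=4) with GapFillFlag (123=Y) and NewSeqNo (36) set to 545. Below is a minimal sketch that assembles such a message by hand, including BodyLength and CheckSum. The helper `build_gap_fill`, the CompIDs, the timestamp and the FIX.4.4 BeginString are assumptions for the example; in practice the FIX engine builds this message, and this is not the answerer's actual code.

```python
SOH = "\x01"  # FIX field delimiter


def build_gap_fill(begin_seq_no, new_seq_no, sender, target, sending_time,
                   begin_string="FIX.4.4"):
    """Build a SequenceReset-GapFill covering begin_seq_no up to new_seq_no - 1."""
    body_fields = [
        ("35", "4"),                # MsgType = SequenceReset
        ("34", str(begin_seq_no)),  # MsgSeqNum = first sequence number of the gap
        ("49", sender),             # SenderCompID
        ("56", target),             # TargetCompID
        ("52", sending_time),       # SendingTime
        ("43", "Y"),                # PossDupFlag, since this answers a resend request
        ("122", sending_time),      # OrigSendingTime, expected alongside PossDupFlag=Y
        ("123", "Y"),               # GapFillFlag
        ("36", str(new_seq_no)),    # NewSeqNo = next sequence number to expect
    ]
    body = "".join(f"{tag}={value}{SOH}" for tag, value in body_fields)
    head = f"8={begin_string}{SOH}9={len(body)}{SOH}"
    # CheckSum is the byte sum of everything before the 10= field, modulo 256.
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


msg = build_gap_fill(537, 545, "US", "EXCHANGE", "20240102-09:30:00.000")
print(msg.replace(SOH, "|"))
```

The counterparty then treats 537 through 544 as administratively filled and expects 545 next, so no stale order is replayed.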
To illustrate a larger perspective, what we do is, when we initiate any action on an order, we flag the order as "in progress," meaning it cannot be actioned further in any way (amended/CFO'ed or cancelled). Only when we receive an ACK, i.e. an Execution Report, do we remove this state. So if the exchange misses a message pertaining to that order, that order simply ends up "stuck" (and gets highlighted as such on the front-end). Not ideal, but again, at least it's not out of control. The trader then simply re-enters the desired order. (Note that it's the very guarantee that we won't resend messages that enables a trader to safely re-enter orders.) With our system it's just try-again-and-move-on, without need for complex sequence-scenario resolving.
Source: Work on an order entry system connecting to >10 Canadian exchanges, used by >50 Canadian brokers.

FIX protocol sequence number

I have a few questions about FIX protocol sequence numbers:
1. What is the benefit of setting ResetOnLogon=N?
2. Can both the initiator and the acceptor send a Resend Request?
3. How do message sequence numbers help in session recovery/error handling?
1. With ResetOnLogon=Y, sequence numbers are reset by the protocol on a logon message. This keeps sequence numbers low, which can be useful. With N, they are not reset at logon and continue from the stored values. The sell side usually defines whether this should be done or not.
2. Yes. As long as the engine thinks that a message may have been lost, because the sequence numbers are out of sync, it may request a resend.
3. If the sequence numbers of a message and its predecessor are out of sync, and the number is higher than expected, then the engine may assume that some messages have been lost in the connection. This means that it needs to recover these messages.
If you have any more questions or want more information I would be happy to reply.
1. ResetOnLogon determines whether sequence numbers should be reset when receiving a logon request. (Please find the documentation here: http://www.quickfixengine.org/quickfix/doc/html/configuration.html)
2. Yes, both can send a Resend Request, but you must follow the specs agreed between your side and the counterparty.
3. The message sequence numbers show that no messages were lost during the current session. If there is a mismatch, actions must be taken to re-establish the correct sync between the two sides.