QuickFixN disconnects during the day and cannot reconnect

We use QuickFixN to send orders to an exchange and to receive execution reports.
If the VPN to the exchange drops during the day, QuickFixN cannot reconnect until the next day, even though ResetOnLogon and ResetOnDisconnect are both set to N.
We do not understand the cause. Is it the sequence numbers, or something else?
Event log:
20171217-12:15:39.122 : Created session
20171217-12:15:39.129 : Connecting to 172.16.105.151 on port 10060
20171217-12:15:39.399 : Connection succeeded
20171217-12:15:39.423 : Initiated logon request
20171217-12:15:39.680 : Session FIX.4.2:NOOR->MBS disconnecting: System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host
at QuickFix.SocketInitiatorThread.ReadSome(Byte[] buffer, Int32 timeoutMilliseconds)
at QuickFix.SocketInitiatorThread.Read()
20171217-12:15:41.140 : Connecting to 172.16.105.151 on port 10060
20171217-12:15:41.398 : Connection succeeded
20171217-12:15:41.399 : Initiated logon request
20171217-12:15:41.654 : Session FIX.4.2:NOOR->MBS disconnecting: System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host
at QuickFix.SocketInitiatorThread.ReadSome(Byte[] buffer, Int32 timeoutMilliseconds)
at QuickFix.SocketInitiatorThread.Read()
Requests to the exchange:
20171217-12:15:39.423 : 8=FIX.4.2|9=65|35=A|34=7304|49=NOOR|52=20171217-12:15:39.415|56=MBS|98=0|108=30|10=192|
20171217-12:15:41.398 : 8=FIX.4.2|9=65|35=A|34=7305|49=NOOR|52=20171217-12:15:41.398|56=MBS|98=0|108=30|10=196|
20171217-12:15:43.397 : 8=FIX.4.2|9=65|35=A|34=7306|49=NOOR|52=20171217-12:15:43.397|56=MBS|98=0|108=30|10=198|
20171217-12:15:45.398 : 8=FIX.4.2|9=65|35=A|34=7307|49=NOOR|52=20171217-12:15:45.398|56=MBS|98=0|108=30|10=202|
20171217-12:15:47.399 : 8=FIX.4.2|9=65|35=A|34=7308|49=NOOR|52=20171217-12:15:47.399|56=MBS|98=0|108=30|10=206|
20171217-12:15:49.400 : 8=FIX.4.2|9=65|35=A|34=7309|49=NOOR|52=20171217-12:15:49.400|56=MBS|98=0|108=30|10=192|

The error means your counterparty is most likely sending you a TCP reset (RST) packet. It looks like the connection is established, but the logon attempt then fails and the counterparty answers with the reset.
Is your username / password correct?

First of all, thank you @rupweb for your reply and your willingness to help. There were two causes of this problem:
1. The client system that connects to the remote party must log out before it disconnects.
2. The client executable must be run from the same physical location each time, because FIX generates a file on my machine that keeps the sequence numbers, and it uses that file for the handshake the next time it connects.
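The role of the sequence numbers can be seen in the Logon lines quoted in the question. The following is a minimal Python sketch, not QuickFixN code: it parses those log lines and extracts MsgSeqNum (tag 34) from each Logon (35=A). The helper names parse_fix and logon_seq_nums are made up for this illustration, and the "|" separator is the one used in the log excerpt.

```python
def parse_fix(raw, sep="|"):
    """Split a delimiter-separated FIX message into a tag -> value mapping."""
    return dict(part.split("=", 1) for part in raw.strip().split(sep) if part)


def logon_seq_nums(lines):
    """Return MsgSeqNum (tag 34) of every Logon (35=A) found in the log lines."""
    seqs = []
    for line in lines:
        # Each log line looks like "<timestamp> : <message>".
        _, _, raw = line.partition(" : ")
        msg = parse_fix(raw)
        if msg.get("35") == "A":
            seqs.append(int(msg["34"]))
    return seqs


# First three outgoing Logon lines copied from the question.
LOG_LINES = [
    "20171217-12:15:39.423 : 8=FIX.4.2|9=65|35=A|34=7304|49=NOOR|52=20171217-12:15:39.415|56=MBS|98=0|108=30|10=192|",
    "20171217-12:15:41.398 : 8=FIX.4.2|9=65|35=A|34=7305|49=NOOR|52=20171217-12:15:41.398|56=MBS|98=0|108=30|10=196|",
    "20171217-12:15:43.397 : 8=FIX.4.2|9=65|35=A|34=7306|49=NOOR|52=20171217-12:15:43.397|56=MBS|98=0|108=30|10=198|",
]

print(logon_seq_nums(LOG_LINES))  # [7304, 7305, 7306]
```

Every failed attempt uses up one more outgoing sequence number, so the number the client sends at the next logon depends on the state kept in its local file.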

Related

QuickFIX Initiator sends logout request and does not reconnect

I am using a FIX session to get TradeCaptureReports. When the connection is established, I get responses to my TradeCaptureRequest. After logon, heartbeat messages are sent and received.
But then the FIX initiator sends a logout request and does not reconnect, even though ReconnectInterval is set to 1 in the session config.
Event log:
08:23:56 : Initiated logon request
08:23:56 : Logon contains ResetSeqNumFlag=Y, reseting sequence numbers to 1
08:23:56 : Received logon response
08:25:42 : Initiated logout request
I need to keep the QuickFIX connection alive and keep sending scheduled TradeCaptureRequests. Do you have any idea what could cause this logout?
Message log after logon request and response:
8=FIX.4.4|9=56|35=0|34=3|49=**|52=20151203-08:24:56.310|56=***|10=169|
8=FIX.4.4|9=56|35=0|49=***|56=**|34=3|52=20151203-08:24:55.771|10=179|
8=FIX.4.4|9=56|35=0|34=4|49=**|52=20151203-08:25:26.313|56=***|10=171|
8=FIX.4.4|9=56|35=0|49=***|56=**|34=4|52=20151203-08:25:25.772|10=179|
8=FIX.4.4|9=56|35=5|34=5|49=**|52=20151203-08:25:42.338|56=***|10=182|
Session Config:
HeartBtInt=30
ReconnectInterval=1
ResetOnLogon=Y
StartTime=00:00:00
EndTime=00:00:00
I won't be able to test this until Monday, but I don't think you can set your StartTime to be equal to your EndTime. That would explain why it's disconnecting, because it doesn't think it's time for the session to be up.
QuickFix does support week long sessions, if you use the StartDay and EndDay parameters, in conjunction with the StartTime and EndTime ones.

Fiddler 2 error: SecureClientPipeDirect failed: System.IO.IOException Unable to read data from the transport connection

I am trying to decrypt HTTPS traffic with Fiddler2, which has just been upgraded.
What causes this error?
17:27:45:6821 !SecureClientPipeDirect failed: System.IO.IOException Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. < A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond on pipe to (CN=192.168.0.100, O=DO_NOT_TRUST, OU=Created by http://www.fiddler2.com)
Thanks
The error message indicates that the client failed to complete the HTTPS handshake. What was the client? This message typically indicates that the client isn't configured to trust Fiddler's Root Certificate.
What, if any, other messages are shown on the Log tab?

socket programming for bad network

client:
    socket(), connect() and then
    for (1 to 1024) {
        write(1024 bytes)
    }
    exit(0);
server:
    socket(), bind(), listen()
    while (1) {
        accept()
        while ((n = read()) != 0) {
            if (n == -1) abort(); /* never happened */
            total_read += n
        }
        close()
    }
The client runs on a Mac behind NAT, and the server runs on my VPS (abroad).
Generally it works fine: the client sends all the data and exits, and the server receives all of it.
However, if the network breaks for a couple of minutes while the client is running and then comes back, the client still has not exited after a very long time. If I kill it with Ctrl+C and run it again, the server no longer seems to read the data (the client keeps running).
Here is what netstat shows:
client:
tcp4 0 130312 192.168.1.254.58573 A.B.C.D.8888 ESTABLISHED
server:
tcp 0 0 A.B.C.D:8888 a.b.c.d:54566 ESTABLISHED 10970/a.out
tcp 102136 0 A.B.C.D:8888 a.b.c.d:60916 ESTABLISHED -
A.B.C.D is my VPS address
a.b.c.d is my public client address
My questions are:
1. Why does this happen?
2. The server works fine again after a restart. How can I write the code so that it recovers without restarting?
In TCP, there's no way to tell that a connection has failed unless you try to send something on the connection. TCP doesn't perform active monitoring of the connection (actually, there are optional "keepalive" packets, but these are not normally sent until the connection has been idle for a couple of hours). When you send something, you'll eventually get an error if there's a timeout waiting for the other machine to return an acknowledgement. But if you're just reading data without sending, you can't tell that the connection has failed -- it just looks like the sender doesn't have anything to send.
You can resolve this by designing your application so that the client is required to send something every N seconds. Then set a timer in the server that detects that you haven't received anything for more than N seconds (you should add a little extra time to allow for transient delays).
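The server-side timer described above can be sketched in Python as follows. This is only an illustration under assumptions: the function name read_until_idle and the returned "idle"/"eof" labels are invented here, and idle_seconds stands for N plus the extra allowance.

```python
import socket


def read_until_idle(conn, idle_seconds):
    """Read from conn until EOF or until nothing arrives for idle_seconds.

    Returns (total_bytes_read, reason) where reason is "eof" or "idle".
    """
    conn.settimeout(idle_seconds)  # recv() now raises socket.timeout when idle
    total = 0
    while True:
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            # The peer sent nothing within the allowed window: treat as dead.
            return total, "idle"
        if not chunk:
            # Orderly shutdown by the peer.
            return total, "eof"
        total += len(chunk)
```

A server loop would call this after accept() and close the connection in both cases, so a silently dead client no longer keeps the server stuck in read.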
When the network is broken, your client keeps sending data and at some point the socket send buffer fills up (from what you show, you are sending 1024 bytes 1024 times, 1 MB in total). The default send buffer could be 16 KB (surely less than 1 MB). When the client then tries to write, it blocks for as long as the buffer stays full.
While answering, I realize I don't know whether TCP eventually gives up after a number of timeouts and closes the socket, so that the socket call returns with an error. I think that's not happening. So connect fails if there is a problem in the network, but write and read do not fail.
On the server side, the server blocks in read because it never receives the EOF.
Solution:
On the client side, use non-blocking sockets. If the network is broken, at some point write will fail with EWOULDBLOCK, which tells you the send buffer is full. At that point you could close the connection and try to connect again. If the network is still broken, connect will return an error.
On the server side, also use non-blocking sockets together with select() and a timeout. After a few timeouts you may decide there is a problem with the connection and close it.
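A minimal Python sketch of the client-side idea, assuming a stall means "the socket did not become writable for stall_seconds". The function name send_with_stall_detection and the stall_seconds parameter are invented for this example. In Python, EWOULDBLOCK on a non-blocking socket surfaces as BlockingIOError.

```python
import select
import socket


def send_with_stall_detection(sock, data, stall_seconds):
    """Send data on a non-blocking socket; stop early if the send buffer
    stays full for stall_seconds. Returns the number of bytes handed to
    the kernel, which is less than len(data) when a stall was detected."""
    sock.setblocking(False)
    view = memoryview(data)
    sent = 0
    while sent < len(data):
        # Wait until the kernel has room in the send buffer, or time out.
        _, writable, _ = select.select([], [sock], [], stall_seconds)
        if not writable:
            break  # nothing is draining the buffer: peer or network problem
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            continue  # buffer filled up again between select() and send()
    return sent
```

If the return value is smaller than len(data), the caller can close the socket and reconnect instead of waiting in a blocked write.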

PostgreSQL frontend unexpectedly closes connection

I'm a little bit confused with the following case.
I've got a Postgres server running on host A, and a java based client running on host B. The client uses org.postgresql.Driver JDBC driver (version 9.1-901.jdbc3).
Sometimes, while executing a long-running stored procedure, I get the exception "java.net.SocketException: Socket closed". I'm using org.apache.commons.dbcp.BasicDataSource to obtain connections.
The DBCP pool is configured with default options.
I took a tcpdump to figure out on which side (client or server) the socket is being closed.
Here is what I've got:
1. Client B sends a test query ("Select 1") when it tries to borrow a connection from the DBCP pool.
2. Server A sends a successful response back (Type: Command completion, Ready for query).
3. Client B sends an ACK for server A's response (item 2).
4. Client B sends the query message to server A.
5. Server A sends an ACK for the client's query message (item 4).
6. Client B sends a terminating message (Type: Termination) after some time has passed (from 3 to 10 minutes, sometimes even more).
7. Client B sends a FIN, ACK message to the server.
8. Server A sends back an ACK for the termination message.
9. Server A sends an ACK for the (FIN, ACK) message (item 7).
10. Server A sends back a response to the client query from item 4 (Type: Row description, Columns: 40).
11. Client B sends an RST (reset) message.
12. Server A continues sending the query response (Type: Data row, Length: 438, Columns: 40) and so on.
13. Client B sends an RST (reset) message again.
14. Server A continues sending the query response (Type: Data row, Length: 438, Columns: 40) and so on.
15. Client B sends an RST (reset) message.
After that, communication seems to be finished.
After item 6, my client logs contain an exception like the following:
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:145)
at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:114)
at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:274)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1661)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
Could you please help me figure out the reason for this failure? (It happens about once per 10 successful cases.)
We had a similar problem, and it was caused by a firewall or connection tracking router between the server and the client.
I am guessing you took the tcpdump on the server side. The query runs for a considerable time with no traffic on the connection. The firewall has a timer on the open connection; it expires and the firewall closes the connection towards the server, and also back towards the client. On the capture at the server side, it looks like the client is closing the connection.
You could verify this by capturing on the client side at the same time as on the server side. On the client side it will look like the server has closed the connection, while on the server side it looks like the client is closing the connection. In reality the firewall is closing it in both directions.
To prevent this, you can set tcp_keepalives_idle, tcp_keepalives_interval and/or tcp_keepalives_count (if your OS supports TCP Keepalives). Alternatively, you will have to change the settings on the firewall.
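For illustration only, here is what TCP keepalive looks like at the socket level in Python. This is not how PostgreSQL or the JDBC driver is configured; it is a sketch of the OS mechanism behind the settings named above. The function name enable_keepalive and the default values are assumptions, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT constants are not available on every platform, so the sketch skips any that are missing.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for sock.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    count:    unanswered probes before the connection is dropped
    (The three tuning options are applied only where the platform has them.)
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tuning:
        option = getattr(socket, name, None)  # None where not supported
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With an idle value shorter than the firewall's connection timer, the probes count as traffic and should keep the firewall's entry for the connection from expiring during a long query.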

QuickFix/N sending repeated logons

I use QuickFixN to connect to 2 liquidity providers.
One is connecting and working fine. The other isn't showing any error message, seems to be connecting, but Logon isn't succeeding.
In the messages log: I am sending the Logon request (message type 'A'), and receiving back another message type A, but then nothing happens. 30secs later this happens again. It has many repeats looking like this:
20131118-20:11:32.422 : 8=FIX.4.4|9=115|35=A|34=1|49=XXXX|50=XXXX|52=20131118-20:11:32.408|56=XXXX|57=XXXX|98=0|108=30|141=Y|10=152|
20131118-20:11:32.795 : 8=FIX.4.4|9=115|35=A|34=1|49=XXXX|50=XXXX|52=20131118-20:11:32.619|56=XXXX|57=XXXX|98=0|108=30|141=Y|10=156|
....same again every 30secs....
the event log looks like this:
20131118-20:11:32.023 : Connecting to AA.AAA.AAA.AAA on port BBBB
20131118-20:11:32.395 : Connection succeeded
20131118-20:11:32.408 : Session reset: ResetOnLogon
20131118-20:11:32.422 : Session reset: ResetSeqNumFlag
20131118-20:11:32.422 : Initiated logon request
20131118-20:11:32.796 : Message 1 Rejected: 9
20131118-20:11:32.798 : Verify failed: Tried to send a reject while not logged on
20131118-20:11:32.798 : Session FIX.4.4:XXXX->YYYY disconnecting: Verify failed: Tried to send a reject while not logged on
Within my application, on the QuickFix.Application interface, OnCreate is being called for this session, and so is OnLogout, but OnLogon is not. Neither FromAdmin nor FromApp receives any messages from this session.
What could I be doing wrong?
The "Message 1 Rejected: 9" phrase is saying the message with sequence number 1 (the Logon message) was rejected for reason 9. The reason is a FIX Session Reject Reason and 9 indicates a CompID problem. Double-check your sender and target CompIDs in the message to be sure they are correct for your counterparty. Note that your side of the session is rejecting their login so it could be an issue with configuration of your session. The "Verify failed" message is logged because QuickFIX/n is apparently trying to send a reject message before the session is logged in.
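The CompID check suggested above can be done by hand on a logged message. The Python sketch below is not QuickFIX/n code and makes some assumptions: the helper names parse_fix and compid_problems are invented, the message uses "|" as a printable separator, and the IDs "ME" and "LP" are placeholders. It relies on the FIX convention that an incoming message's SenderCompID (49) must equal your configured TargetCompID and its TargetCompID (56) must equal your SenderCompID.

```python
def parse_fix(raw, sep="|"):
    """Split a delimiter-separated FIX message into a tag -> value mapping."""
    return dict(part.split("=", 1) for part in raw.strip().split(sep) if part)


def compid_problems(incoming, our_sender, our_target):
    """Compare an incoming message's CompIDs with our session settings.

    Returns a list of human-readable mismatches (empty if they line up).
    """
    problems = []
    if incoming.get("49") != our_target:
        problems.append(
            f"SenderCompID (49) is {incoming.get('49')!r}, expected {our_target!r}"
        )
    if incoming.get("56") != our_sender:
        problems.append(
            f"TargetCompID (56) is {incoming.get('56')!r}, expected {our_sender!r}"
        )
    return problems


# Hypothetical Logon reply from a counterparty "LP" addressed to "ME".
reply = parse_fix("8=FIX.4.4|35=A|34=1|49=LP|56=ME|98=0|108=30|")
print(compid_problems(reply, our_sender="ME", our_target="LP"))  # []
```

Running it against the counterparty's Logon reply from the message log, with the values from your session config, shows which of the two IDs does not match.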