Will PutRecords resend data when network is terrible - aws-sdk-go

I send data to aws kinesis via PutRecords. And all the data is sent successfully.
But some record is received twice in consumer side. I checked the sent data and received data, they are same.
I think PutRecords will resend data when it doesn't receive response from kinesis server(response lost due to the terrible network station). But the data is receive by kinesis server.
So, I get same record twice on the consumer side.
Is my assumption correct?

What version of the SDK are you using? The SDK does not retry on timeouts. So I do not think that is the reason.
Whatever the reason may be for the duplicates, there is a good post about dealing with duplicates here.

Related

Is it possible for POST requests sent sequentially to be received out of order?

I'm a bit new to restful API usage.
suppose that I am sequentially sending out 1000 post requests in python, is it possible that my server could receive these requests out of order? (for example post request 5 is somehow received before post request 4)
If it is not possible for requests to be handled out of order, how do web application frameworks know what order to handle the requests in? is there some sort of timestamp/metadata that indicates priority of the request? For example, does the oldest request HAVE to be fulfilled as soon as possible?
I am curious because I am using the POST api to populate my database and I noticed that my sequentially sent data NEVER has come out of order. I don't want to be dependent on this assumption that sent data will always be consistently in the proper order, so I thought I should ask here
If you are sending requests and wait for a response before sending the next one, then everything will happen in order, guaranteed.
If you are sending many requests in parallel, there's generally no guarantee which will arrive first and get processed first.
It's pretty reasonable to assume that the request that arrives first also gets processed first, and over a local fast network that may usually be the case; but over the internet specific packets can get delayed for lots of reasons. Even wifi might cause one of your requests to have to be resent if it wasn't received well the first time.
tl;dr: the only way you can guarantee order is if you wait for a response before sending the next request.

Ensure at-most-once semantic with SendGrid Mail API

I have an [Azure Storage] queue where I put e-mail messages to be sent. Then, there is a separate service which monitors that queue and send e-mails using some service. In this particular case I'm using SendGrid.
So, theoretically, if the sender crashes right after a successful call to SendGrid Mail Send API (https://sendgrid.com/docs/API_Reference/Web_API_v3/Mail/index.html), the message will be returned to the queue and retried later. This may result in the same e-mail being delivered more than once, which could be really annoying for some type of e-mail.
The normal way to avoid this situation would be to provide some sort of idempotency key to Send API. Then the side being called can make sure the operation is performed at most once.
After careful reading of SendGrid documentation and googling, I could not find any way to achieve what I'm looking for here (at most once semantic). Any ideas?
Without support for an idempotency key in the API itself your options are limited I think.
You could modify your email sending service to dequeue and commit before calling the Send API. That way if the service fails to send the message will not be retried as it has already been removed from the queue, it will be sent at most once.
Additionally, you could implement some limited retries on particular http responses (e.g. 429 & 5xx) from SendGrid where you are certain the message was not sent and retrying might be useful - this would maintain "at most once" whilst lowering the failure rate. Probably this should include some backoff time between each attempt.

What is the flow of a request to a server with a queue in the middle?

I'm trying very hard to understand the flow of a web request to a server which has a queue or message broker in the middle, but I can't find information about when and where the reply is given.
Imagine this use case:
Client A:
sends a invoice order request
the invoice is enqueued
the request is processed and dequeued.
at which time the client will receive a response?
right after the message is received by the queue?
right after the message is processed and dequeued? Other?
I'm asking because if the reply only comes after the message being processed the client might wait a long time. Imagine the message takes 3 minutes to process, would the client need to keep requesting the server to see if it is processed? or a connection is maintained using something like long polling?
I'm interested in scenarios using RabbitMq and kafka.
Advantage of having a messaging system is to ensure the frontend webserver and backend processing is decoupled. Best practice is Web server should publish the message and just wait for the messaging system to acknowledge receiving the message.

What are the problems in 3-way Message passing Reliable IPC protocol?

Here, at the end of this page. last paragraph ,
They mentioned some problems that occurs in This protocol.
i am unable to understand what are these problems. ?
for example. He told. "If a request processing long time"
I am unable to understand this statement. Where is the request which processing taking long time, on client ? or on server ?
Or i am unable to understand where is the Clock(time) ? is it on Client side or Server Side? because here mentioned in the end of 2 point. "if the reply is not received within the time period , the kernel of the client machine re-transmits the request message."
Consider this:
The client sends a message. If it doesn't get a reply from the server within - say - 1 minute it will transmit the message again.
When the server receives a message, it only sends a reply after having generated a full response to the message that the client sent.
No suppose you, as client, send a message to the server. The server receives your message, and starts processing it. At this time, you, the client, have no idea of whether the server got the message or not. Assume you send a complicated task to the server, which takes it 1 minute and 5 seconds to complete. After 1 minute (ignoring transmission times), the server is still busy doing your work, but you as the client don't know of any of this and send your message again.
Now, depending on the actual protocol implementation, there are a few potential issues:
It's possible that by sending the message again, you increase some sequence count and are therefore unable to receive the reply to the original message afterwards.
It's possible that the server isn't able to determine whether a message that arrives is the first message or a message that had to be send again. So it could be doing work that it already did, leading either to needless processing or in the worst case to (business) logic errors.
Additionally, by sending both the message and the reply possibly needless more than once, you increase the amount of total data transmitted, without gaining anything from it.
To "solve" this, you could increase the waiting time before the client sends its message again. This will "fix" the issue with long running tasks on the server, but will also hurt in case the message actually got lost on the way, because you're waiting longer to even send a new message.
The "real" solution here is to have the server acknowledge as soon as it receives a message from the client, just as saying "i got your message, i'll send the reply soon!" before even starting to actually process the message.

Quickfix engine - does it persist messages before the start time on the server side

If a quick fix session is created by server(acceptor) at say 9AM, but the StartTime is at 11AM. This means the session exists but not active.
If the server receives an unsolicited message from an exchange that it needs to send on this session, will it persist this if I have configuration PersistMessages=Y and sends it to the client(initiator) when it connects after 11AM?
No, it would not persist messages received before start time and would send you a reject message. The message will be rejected at the interface itself, message isn't handled. You would have to resend it to get a response.
QuickFIX does persist (but not send) messages before a session is connected. The sequence numbers are updated and when the session is connected and the first message is sent, the counterparty FIX engine will see the gap in the sequence numbers and request a resend. QuickFIX will then resend the persisted messages. However, depending on your QuickFIX configuration, the outgoing messages might be considered to be too old and rejected locally.
As I understand, these are kept to take into account timings under which corresponding exchange would accept the orders.
Application or its sub-modules do not need to maintain timings and take some action on closing the fix session. Rather, QuickFix shall automatically deactivate the session.
Persistence of the message or re-sesnding when the session becomes active does not look desirable to me.
You can rather maintain some kind of queue to buffer such messages in sending application, and send them only when the time matches with active session timings.
That's my thoughts, hope that helps.