Reduce client wait time during traffic surge

I am trying to understand how client response time behaves during a traffic surge. Let's say we have a concert ticket website (nginx with HTTP/2, and a CDN provider handling requests) that will release special tickets at a predefined time of day. Naturally, at that moment there will be a huge surge of traffic from ticket buyers. Is there a way for a client (ticket buyer) to ensure it is always first in the queue and gets the ticket? Wouldn't constantly requesting the ticket webpage constitute a DoS (denial-of-service) attack?
I tried to read about the M/M/1 queue, but I don't think it's used in practice.
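
For reference, the M/M/1 model mentioned above has closed-form steady-state results, which at least show why wait times blow up as a surge pushes the arrival rate toward the service rate. This is a minimal sketch, assuming a single server with Poisson arrivals and exponentially distributed service times; a real nginx and CDN setup only roughly fits that model. The function name `mm1_metrics` and the example rates are illustrative, not taken from the question.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state M/M/1 metrics; both rates are in requests per second."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # With arrivals at or above capacity the queue grows without bound.
        raise ValueError("unstable queue: arrival rate must be below service rate")
    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "mean_time_in_system": 1.0 / (service_rate - arrival_rate),  # seconds
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),   # seconds
        "mean_requests_in_system": rho / (1.0 - rho),
    }


if __name__ == "__main__":
    # Assumed capacity of 1000 req/s; watch the wait grow as load approaches it.
    for load in (500, 900, 990):
        print(load, mm1_metrics(load, 1000))
```

The model says nothing about which client is served first; it only describes average waiting under the stated assumptions.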

Related

Rate Limiting by Request in Haproxy

My goal is to limit the number of requests (not connections) for the API per backend server, so that a server doesn't get overloaded with too many requests.
I can do this in the middleware of each server, but the problem is that if a server gets stuck and can't act on a request, the request will sit waiting, and that will impact the client.
So I want to do this in HAProxy, so that HAProxy, based on the number of requests on each server, transfers the request to the next available node.
I read the HAProxy documentation.
But it offers connection-based rate limiting on each server, or a total request rate limiter on the frontend. The problem with the latter is that if the number of servers increases, the number of allowed requests should increase too, and it doesn't limit requests on each server; it limits them for the service as a whole.
https://www.haproxy.com/blog/four-examples-of-haproxy-rate-limiting/
Any help will be appreciated
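
The behaviour asked for here (cap the requests in flight on each server and hand the next request to another node once a server is full) can be sketched as a small dispatcher. This is not HAProxy configuration, only an illustration of the logic under the assumption that "limit" means a cap on concurrent requests per server; the class `PerServerLimiter` and the server names are hypothetical.

```python
from typing import Dict, List, Optional


class PerServerLimiter:
    """Tracks in-flight requests per server and enforces a per-server cap."""

    def __init__(self, servers: List[str], max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight
        self.in_flight: Dict[str, int] = {name: 0 for name in servers}

    def acquire(self) -> Optional[str]:
        """Return the least-loaded server below its cap, or None if all are full."""
        server = min(self.in_flight, key=self.in_flight.get)
        if self.in_flight[server] >= self.max_in_flight:
            return None  # caller can queue the request or reject it
        self.in_flight[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when the server has finished (or failed) the request."""
        if self.in_flight[server] > 0:
            self.in_flight[server] -= 1


if __name__ == "__main__":
    limiter = PerServerLimiter(["app1", "app2"], max_in_flight=2)
    print([limiter.acquire() for _ in range(5)])
```

Because the cap is per server, adding a server raises the total capacity without changing any limit, which is the property the frontend-wide limiter lacks.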

Is there a way of handling Page Expired errors in PingFederate?

The SSO product PingFederate produces a "Page Expired" when it cannot find the request in its table of recent requests. They state, in a manner reminiscent of "640K ought to be enough for anybody.", that
This is unlikely since PingFederate's state table handles up to 10000 requests by default.
Well, guess what, this PingFederate server is kind of busy (producing 10MB of logs per minute), so if the user should wait for, say, an hour, 10K requests have been produced and that state table no longer contains the lookup key (of cookie+nonce).
So, apart from trying to keep the user from lingering on the logon screen, is there a way I can instruct PF to "redirect the user back to the logon screen in case of Page Expired"?
Logout requests have exactly this feature, through the InErrorResource parameter, so the opposite seems likely to exist.
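
The eviction behaviour described above can be pictured with a toy bounded table. This is a sketch of the general idea only, assuming the table drops its oldest entry once it is full; it is not PingFederate's actual implementation, and `BoundedStateTable` and the keys are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class BoundedStateTable:
    """Keeps at most `capacity` entries, evicting the oldest one first."""

    def __init__(self, capacity: int = 10000) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def put(self, key: str, value: str) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)  # newest entries live at the end
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def get(self, key: str) -> Optional[str]:
        return self._entries.get(key)


if __name__ == "__main__":
    table = BoundedStateTable(capacity=3)
    table.put("slow-user", "login state")
    for i in range(3):
        table.put(f"other-{i}", "login state")
    print(table.get("slow-user"))  # the slow user's entry has been evicted
```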

PingFederate does support an InErrorResource parameter for both the IdP-initiated and SP-initiated SSO endpoints. That said, I doubt the InErrorResource value will be kept once the state is dropped: PingFederate might end up with no knowledge of the user or the request, resulting in the same error.
If the environment is as busy as you say, it would make more sense to adjust the size limits so that the state is not lost. The documentation explains how these can be configured and what each limit controls. Note that increasing these limits will increase memory usage, so handle with care.

Two channels for one API

We have a SaaS product. It consists of a single-page application (client), a Gateway, Data Service 1, Data Service 2 and a Notification Service.
The client talks to the Gateway (using REST), and the Gateway either routes the request to the appropriate Data Service (1 or 2) or does its own calculations.
One request from the client can be split into multiple requests at the Gateway. The result is an aggregation of the responses from the sub-services.
The Notification Service pushes information about changes made by other users to the client, using MQ and a WebSocket connection. A notification can be published by any service.
We had a discussion with the engineers about how the process could be optimized.
Currently, the problem is that the Gateway spends a lot of time just waiting for responses from the Data Services.
One of the proposals is to let the Gateway respond with 200 OK as soon as the message has been pushed to the Data Service, and let the client follow the operation's progress through the notification channel (the WebSocket connection).
This means that the client always sends an HTTP request for an operation and gets confirmation that the operation was executed over WebSocket, from a different endpoint.
This scheme can be hidden behind a JS client library that conceals all the internal complexity.
I think something is wrong with this approach. I have never seen such a design. But I don't have strong arguments against it, except complexity and two points of failure (instead of one).
What do you think about this design approach?
Do you see any potential problems with it?
Do you know of any public solutions that use such an approach?

Since your service is slow, it might make sense to treat it more like a batch job:
Client sends a job request to Gateway.
Gateway returns a job ID immediately after accepting it from the Client.
Client periodically polls the Gateway for results for that job ID.
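
The three steps above can be sketched in memory as follows. This is a minimal illustration, assuming the slow work runs in a thread pool inside one process; a real Gateway would expose `submit` and `poll` over HTTP and keep job state somewhere durable. The names `JobGateway` and `wait_for_result` are hypothetical.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobGateway:
    """Accepts work, returns a job ID at once, and answers status polls."""

    def __init__(self, workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, work: Callable[[], Any]) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(work)  # runs in the background
        return job_id

    def poll(self, job_id: str) -> Dict[str, Any]:
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "done", "result": future.result()}


def wait_for_result(gateway: JobGateway, job_id: str,
                    interval: float = 0.05, timeout: float = 5.0) -> Dict[str, Any]:
    """Client side: poll periodically until the job leaves the pending state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        reply = gateway.poll(job_id)
        if reply["status"] != "pending":
            return reply
        time.sleep(interval)
    return {"status": "timeout"}


if __name__ == "__main__":
    gateway = JobGateway()
    job = gateway.submit(lambda: sum(range(1000)))
    print(job, wait_for_result(gateway, job))
```

Compared with the WebSocket proposal, the client here uses a single channel, at the cost of extra polling requests.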

Increase Batch Quota in Google Core Reporting API

Does anyone know if there is a way to increase the quota limit of 10 queries when batching calls to the core reporting API?
This question/answer mentions the limit of 10: How can I combine/speed up multiple API calls to improve performance?
If I try to add more than 10 queries to the batch, only the first ten are processed; each one after that returns a 403 quota exceeded error.
Is there a pay option? Would love to speed up the process of reporting on GA data for a bunch of URLs. I looked in my Google Developer's Console under the Analytics API where there is an option to increase the per-user limit and a link to request additional quota but I don't need total quota to increase, only allowed batch requests.
Thanks!

Quota is the number of requests you are allowed to make to a Google API without asking for permission to make more. Most Google APIs have a free quota. There are project-based quotas and user-based quotas.
Unless stated otherwise, API quotas are project-based, not user-based.
User quota example:
Per-user limit: 10 requests/second/user
Some quotas are user-based; a user is normally the person who has authenticated the request. Every request sent to Google contains information about who is making the request, in the form of the IP address the request came from. If your code runs on a server, the IP address is the same all the time, so Google sees it as the same user. You can get around this by adding a random quotaUser to your request, which makes Google attribute the request to a different user.
If you send too many requests too fast from the same user, you will see the following error:
userRateLimitExceeded The request failed because a per-user rate limit has been reached.
The best way to get around this is to use quotaUser in all of your requests to identify different users to Google. Sending a random number every time should also work.
Answer: You can't apply for an extension of the flood-protection user rate limit, but you can get around it by using quotaUser.
More information on quotas can be found in the Google Developers Console, under APIs.
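
A minimal sketch of how this might look on the client side: split the queries into groups of at most ten (the limit reported in the question) and attach a `quotaUser` value to each query. It only prepares the parameters and makes no API call. The helper names and the shape of a query dict are hypothetical, and whether `quotaUser` lifts the batch limit itself is not confirmed by the answer above.

```python
import hashlib
from typing import Dict, Iterator, List, Sequence

BATCH_LIMIT = 10  # batch size limit reported in the question


def batches(queries: Sequence[Dict[str, str]],
            size: int = BATCH_LIMIT) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive groups of at most `size` queries."""
    for start in range(0, len(queries), size):
        yield list(queries[start:start + size])


def with_quota_user(query: Dict[str, str], end_user_id: str) -> Dict[str, str]:
    """Return a copy of the query with a stable, opaque quotaUser value."""
    token = hashlib.sha256(end_user_id.encode("utf-8")).hexdigest()[:32]
    return {**query, "quotaUser": token}


if __name__ == "__main__":
    # Hypothetical query dicts, one per URL to report on.
    queries = [{"path": f"/page-{i}"} for i in range(23)]
    for group in batches(queries):
        prepared = [with_quota_user(q, "customer-42") for q in group]
        print(len(prepared), prepared[0])
```

Hashing an end-user identifier keeps the value stable per user; the answer's alternative of a random number per request would work with the same helper.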

How many parallel requests can be made using a single session token in a REST API

I am working on an application which is going to be heavily dependent on Sabre API. The critical factor for the application is going to be performance when around a million users are accessing the API simultaneously.
After speaking to Sabre API support, all they told me is that they will provide a maximum of 50 session tokens at a time and that we have to manage sessions at our end.
This leaves my question unanswered - will they be able to handle a million parallel requests?
So basically will we be able to make multiple requests using the same session token unless it expires?
Please help me understand their response. Below is the email conversation I had with Sabre API support.
Hello Karam,
The limit will be the simultaneous sessions that is setup for your PCC. By default you can create up to 50 simultaneous tokens in CERT (50 simultaneous sessions) but the answer to your question is no, processing time from our side will not be impacted.
Regards,
Hello Sebastian
Thank you very much for being with me and helping me out with this.
So, as you have mentioned that we can have 50 session tokens at a time, is it possible to make more than one simultaneous request (asynchronous requests) using a single session token?
For example, we get a session token, store it at our end and use it to make multiple requests.
I ask this because, if not, it would mean we can only make 50 parallel requests at a time (1 request per session token).
And if that is true then we might have to implement a request queue which will delay the responses for the end users.
Thanks
Karam
Hello Karam,
Please see below my answers to your inquiries:
So, as you have mentioned that we can have 50 session tokens at a time, is it possible to make more than one simultaneous request (asynchronous requests) using a single session token?
For example, we get a session token, store it at our end and use it to make multiple requests.
It is not possible. This is actually not a Sabre Web Services behavior but how the Sabre host works. Sabre is a synchronous system: once a request has been sent, you need to wait until you receive a response before running a second call. Otherwise you will receive a message like “PREVIOUS ENTRY ACTIVE” or similar.
I ask this because, if not, it would mean we can only make 50 parallel requests at a time (1 request per session token).
And if that is true then we might have to implement a request queue which will delay the responses for the end users.
It will depend on the session manager and the customer’s needs but most of our customers don’t need to consume 1000 simultaneous sessions. In any case, once you are a webservices subscriber you can define and request to your account executive the amount of tokens that best meets your needs.
Hope this helps!
Best regards,

That is correct: you cannot use the same session/token for multiple parallel requests. (Sabre keeps the session state, and that affects the result of your next request.)
What they recommend is to create a session manager, so that you have your own queue of sessions and can use them and "ignore" them as needed. That way you can have sessions for queries only and sessions for touching a PNR, and you can also manage your own expiration time or "keep alive" routine.
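
A session manager of this kind can be sketched as a pool that lends each token to one request at a time. This is a minimal illustration, assuming the tokens have already been obtained elsewhere; it does not call any Sabre API, and `SessionPool` and the token strings are hypothetical. Expiration and keep-alive handling are left out.

```python
import queue
from contextlib import contextmanager
from typing import Iterable, Iterator


class SessionPool:
    """Lends out session tokens so that each token serves one request at a time."""

    def __init__(self, tokens: Iterable[str]) -> None:
        self._idle: "queue.Queue[str]" = queue.Queue()
        for token in tokens:
            self._idle.put(token)

    @contextmanager
    def session(self, timeout: float = 30.0) -> Iterator[str]:
        # Blocks until a token is free; raises queue.Empty after `timeout` seconds.
        token = self._idle.get(timeout=timeout)
        try:
            yield token
        finally:
            self._idle.put(token)  # hand the token back for the next request

    def idle_count(self) -> int:
        return self._idle.qsize()


if __name__ == "__main__":
    pool = SessionPool(f"token-{i}" for i in range(50))
    with pool.session() as token:
        print("sending one request with", token, "idle:", pool.idle_count())
    print("idle after release:", pool.idle_count())
```

Requests beyond the number of tokens wait in `session()` rather than reusing a busy token, which is the request queue the question anticipates.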
