GetCustomerAccounts slow - intuit-partner-platform

I have the dotnetAggCatSample as well as my own app running with the aggCat API.
When I execute GetCustomerAccounts in Dev mode, there's a 30 second or so lag before I get a response. Is this lag part of some kind of Dev mode throttling, or can I expect this behavior in Production as well?
There's no real lag in GetAccountTransactions though.

Related

Locust response time high latencies at the start

I'm doing some load testing on a microservice application. I collected the percentile statistics and plotted them. The application is running in a shared K8s cluster. What I don't quite understand is why there is a latency spike at the start. Is this an issue with a cold boot?
[Figure: Locust plot showing response time (RT) over time]
Is this an issue with a cold boot?
Yes, this is the most likely explanation. There's no way of knowing without digging into your application and its logs though.
Most applications, especially ones that do automatic scaling, perform very poorly when suddenly hit with a large amount of load. If your actual expected user load does not have this behaviour, then maybe a slower ramp-up is more appropriate.
If you haven't already read this, then maybe have a look at https://github.com/locustio/locust/wiki/FAQ#increase-my-request-raterps
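The slower ramp-up suggested above can be sketched as a simple linear schedule. This is only an illustration: the function name `users_at` and the numbers (200 users reached over 300 seconds) are assumptions, not values from the question. In Locust such a schedule could be approximated with a low spawn rate or fed into a custom load shape, but that wiring is not shown here.

```python
def users_at(elapsed_s: float, target_users: int = 200, ramp_s: float = 300.0) -> int:
    """Number of simulated users that should be active after elapsed_s seconds.

    Grows linearly from 0 to target_users over ramp_s seconds, then stays flat.
    All defaults are hypothetical and should be tuned to the expected real load.
    """
    if elapsed_s <= 0:
        return 0
    if elapsed_s >= ramp_s:
        return target_users
    return int(target_users * elapsed_s / ramp_s)


# Print the schedule at a few points in time.
for t in (0, 60, 150, 300, 600):
    print(t, users_at(t))
```

If the latency spike disappears with a gentle ramp like this, that would point toward warm-up or autoscaling lag rather than a steady-state problem.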

Zero downtime deployment of Slack bot

We are developing a bot with BotKit and are now trying to minimize deployment downtime.
There is a server with a Docker container running on it. Inside the container runs a bot-app instance connected to the RTM server (Slack).
When I deploy a new version (v2) of bot-app, I want zero downtime: users should not see "bot is offline".
The deploy script runs a second Docker container with the new version of bot-app, and that bot-app connects to the RTM server too. As a result, for a few seconds both apps are running, connected to the RTM server, and responding to user commands (so a user will see two answers to one command).
What is the best option, given that on the one hand we want zero downtime and on the other hand we want to prevent the user from interacting with two instances at the same time?
Decision 1:
Accept a small chance of a collision, where both instances respond to the same user command.
Decision 2:
Abandon zero-downtime deployment. In this case, the deploy script first stops the first Docker container, then starts another one. The app will not respond to user commands sent between stopping the current version and the new version being fully started.
Decision 3:
Run the current and new versions in parallel and have them coordinate, either by talking to each other or through mutexes. General scheme:
1) The current version of the app is running
2) The deploy script starts the new version of the app
3) When the new version of the app is almost up and ready to connect to the RTM server, it sends the current version a command to close its RTM connection
4) The current version of the app closes its RTM connection
5) The new version of the app opens its RTM connection
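The ordering in steps 3 to 5 can be sketched as below. This is a toy model only: `BotInstance`, `open_rtm`, `close_rtm`, and `hand_off` are hypothetical names, and a real implementation would need some actual channel (a signal, an HTTP call, a shared lock) for the new container to reach the old one.

```python
class BotInstance:
    """Toy model of one bot-app container; it only tracks connection state."""

    def __init__(self, version):
        self.version = version
        self.connected = False

    def open_rtm(self):
        self.connected = True

    def close_rtm(self):
        self.connected = False


def hand_off(current, new):
    """Steps 3-5: the old instance disconnects first, then the new one connects.

    Returns the (current.connected, new.connected) pair observed at each stage.
    """
    states = [(current.connected, new.connected)]      # before the handoff
    current.close_rtm()                                # step 4
    states.append((current.connected, new.connected))  # short gap, nobody connected
    new.open_rtm()                                     # step 5
    states.append((current.connected, new.connected))
    return states


v1 = BotInstance("v1")
v1.open_rtm()
v2 = BotInstance("v2")
states = hand_off(v1, v2)
print(states)
```

The middle state shows the trade-off: with this ordering the two versions never answer at the same time, but there is a brief window in which neither is connected.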
I think there are other good solutions.
How would you have solved this problem in your application?
(Sorry for the second reply; had another idea.)
The approach I described earlier would be pretty disruptive to your existing code, since you'd probably need to stop using botkit (or at least not use it to do the RTM API communication). An approach that may be less disruptive would be to use some sort of external way to signal that a given message has already been processed.
For example, using Redis, have the bot run the following command when a message comes in:
SET message:<message timestamp> 1 NX PX 30000
The NX option means this command will only succeed if the key doesn't already exist. So the first instance of the bot that manages to execute this will succeed, and the other instance will fail. The bot should only process the message and respond if this command succeeded.
(The PX 30000 sets a 30-second expiration so Redis doesn't get full of these keys.)
This should let you do your zero-downtime upgrades via overlapping the running bot instances without having to worry about a message being processed twice.
Note that it's still possible in this scheme for a message to be dropped altogether if a bot is shut down in a non-graceful way. (It could die just after calling the SET command but before it's actually dealt with the message.) A real queue with a two-phase "get/delete" would be better, but then you're back to my other answer. :-)
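A minimal sketch of that check, assuming a client whose `set()` accepts `nx` and `px` keyword arguments and returns `True` on success and `None` otherwise (as redis-py's `Redis.set` does). So that the example runs without a server, `FakeRedis` below is a hypothetical in-memory stand-in; `claim_message` and `handle` are illustrative names as well.

```python
import time


class FakeRedis:
    """In-memory stand-in for a Redis client, for illustration only.

    Mimics just the part of SET used here: with nx=True it returns True when
    the key was created and None when a live key already exists.
    """

    def __init__(self):
        self._expiry = {}  # key -> expiry time in monotonic seconds

    def set(self, name, value, nx=False, px=None):
        now = time.monotonic()
        expires_at = self._expiry.get(name)
        if nx and expires_at is not None and expires_at > now:
            return None  # key exists and has not expired yet
        ttl_s = px / 1000.0 if px is not None else float("inf")
        self._expiry[name] = now + ttl_s
        return True


def claim_message(client, message_ts, ttl_ms=30000):
    """Return True if this instance won the right to handle the message."""
    return bool(client.set(f"message:{message_ts}", 1, nx=True, px=ttl_ms))


def handle(client, instance_name, message_ts):
    """Reply only if the claim succeeded; otherwise stay silent."""
    if claim_message(client, message_ts):
        return f"{instance_name} replies to {message_ts}"
    return None  # another instance already claimed this message


shared = FakeRedis()
print(handle(shared, "bot-v1", "1700000000.000100"))
print(handle(shared, "bot-v2", "1700000000.000100"))
```

With a real Redis server both bot versions would share one instance, so whichever calls `SET ... NX` first is the only one that replies.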
One idea I would consider is separating into two components:
A component that keeps a WebSocket connected to the Slack RTM API. This component simply reads messages from the API and puts them on to a queue. (Let's call this the "queuer.")
The actual "bot," which reads messages from the queue and responds as needed.
Depending on how your bot behaves, it can use the Web API directly or perhaps put its own messages on an outbound queue which the "queuer" can send via the RTM API.
This architecture probably solves your problem... you can now either take the bot down briefly while upgrading—responses will just be delayed until the new version is running—or you can run two versions of the bot at the same time and rely on the semantics of the queue to prevent both versions from responding to the same message.
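The queue semantics described above can be illustrated in a single process. This is a sketch under loud assumptions: `queue.Queue` stands in for a real external queue (the two bot versions would be separate containers, so an in-process queue would not work in practice), and `queuer`, `bot_worker`, and `run_demo` are hypothetical names.

```python
import queue
import threading

inbound = queue.Queue()        # stand-in for an external message queue
replies = []                   # (bot version, message) pairs, in handling order
replies_lock = threading.Lock()


def queuer(messages):
    """Stand-in for the component that holds the RTM WebSocket."""
    for msg in messages:
        inbound.put(msg)


def bot_worker(version):
    """Stand-in for one running bot version; stops when it reads None."""
    while True:
        msg = inbound.get()
        if msg is None:
            break
        with replies_lock:
            replies.append((version, msg))


def run_demo(messages):
    """Run two bot versions against one queue; return the handled messages."""
    replies.clear()
    workers = [threading.Thread(target=bot_worker, args=(v,)) for v in ("v1", "v2")]
    for w in workers:
        w.start()
    queuer(messages)
    for _ in workers:
        inbound.put(None)      # one stop marker per worker
    for w in workers:
        w.join()
    return sorted(msg for _, msg in replies)


print(run_demo(["hello", "status", "deploy"]))
```

Each message is taken off the queue by exactly one worker, so overlapping versions cannot both answer the same message; which version handles which message is not deterministic.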

Google Cloud SQL very slow from time to time

It's been almost 3 months since I switched my platform to Google Cloud (Compute Engine + Cloud SQL + Cloud Storage).
I am very happy with it, but from time to time I notice high latency on the Cloud SQL server. My Compute Engine VMs and my Cloud SQL instance are all in the same datacenter location (us-1).
Since my Java backend makes a lot of SQL queries to generate a server response, the response times may vary from 250-300ms (normal) up to 2s!
In the console, I notice absolutely nothing: no CPU peaks, no read/write peaks, no backup running, nothing. No alert. Last time it happened, it lasted for a few days and then the response times went suddenly better than ever.
I am pretty sure Google works on the infrastructure behind the scenes... But no way to point that out.
So here are my questions:
Has anybody else ever noticed the same kind of problem?
It is really annoying for me because my web pages get very slow and I have absolutely no control over it. Plus I lose a lot of time because I generally never first suspect a hardware problem / maintenance but instead something that we introduced in our app. Is it normal or do I have a problem on my SQL instance?
Is there anywhere I can get visibility into what Google is doing on the hardware? I know there are maintenance alerts, but for my zone the list always seems to be empty when it happens.
The only option I have for now is to wait and that is really not acceptable.
I suspect that Google does some sort of IO throttling and their algorithm is not very sophisticated. We have a build server which slows down to a crawl if we do more than two builds within an hour. The build that normally takes 15 minutes will run for more than an hour and we usually terminate it and re-run manually later. This question describes a similar problem and the recommended solution is to use larger volumes as they come with more IO allowance.

Azure Mobile Services Latency

I am noticing latency in REST data the first time I visit a web site that is being served via Azure Mobile Services. Is there a cache or timeout of a connection after a set amount of time, because I am worried about user experience while waiting 7-8 seconds for the data to load (and there is not a lot of data, as I am testing 10 records returned). Once the first connection is made, subsequent visits appear to load quickly... but if I don't visit the site for a while, we are back to 7-8 seconds on first load.
Reason: The reason for this latency is the "shared" mode. When the first call to the service is made, it performs a "cold start" (initializing and starting the virtual server etc)
As you described in your question, after a while when the service is not used, it is put into the "sleep mode" again.
Solution: If you do not want this waiting time, you can set your service to "reserved" mode, which keeps the service active all the time, even when you do not access it for a while. But be aware that this requires you to pay some extra fees.

What are the limitations of the flask built-in web server

I'm a newbie in web server administration. I've read multiple times that flask built-in web server is not designed for "production", and must be used only for tests and debug...
But what if my app reaches only a thousand users who occasionally send data to the server?
If it works, when will I have to bother with the configuration of a more sophisticated web server? (I am looking for approximate metrics.)
In a nutshell, I would love to find what the built-in web server can do (with approximate thresholds) and what it cannot.
Thanks a lot!
There isn't one right answer to this question, but here are some things to keep in mind:
With the right amount of horizontal scaling, it is quite possible you could keep scaling out use of the debug server forever. When exactly you would need to start scaling (or switch to using a "real" web server) would also depend on the environment you are hosting in, the expectations of the users, etc.
The main issue you would probably run into is that the server is single-threaded. This means that it will handle requests one at a time, serially. So if you are trying to serve more than one request (including favicons, static items like images, CSS and JavaScript files, etc.) the requests will take longer. If any given request happens to take a long time (say, 20 seconds) then your entire application is unresponsive for that time (20 seconds). This is only the default, of course: you could bump the thread counts (or have requests be handled in other processes), which might alleviate some issues. But once again, it can still be slow under a "high" load. What is considered a "high" load will be dependent on your application and the expectations of a maximum acceptable response time.
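To make the serial-handling effect concrete, here is a small arithmetic sketch. It does not use Flask at all: `finish_times` is a hypothetical helper that models requests all arriving at time 0 and being handled in order by a fixed number of handlers.

```python
def finish_times(durations, workers=1):
    """Completion time of each request when `workers` handlers share the load.

    All requests are assumed to arrive at t=0; each one goes to the handler
    that becomes free first (the lowest-numbered one on ties).
    """
    free_at = [0.0] * workers
    done = []
    for d in durations:
        i = free_at.index(min(free_at))  # earliest-free handler
        free_at[i] += d
        done.append(free_at[i])
    return done


# One slow request (20 s) followed by three quick ones (1 s each).
requests = [20.0, 1.0, 1.0, 1.0]
print(finish_times(requests, workers=1))  # serial: everything waits behind the slow one
print(finish_times(requests, workers=4))  # four handlers: quick requests finish in 1 s
```

With a single handler the quick requests complete at 21, 22, and 23 seconds; with four handlers they complete after 1 second each, which is the difference described above.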
Another issue is security: if you are concerned at ALL about security (and not just the security of the data in the application itself, but the security of the box that will be running it as well) then you should not use the development server. It is not ready to withstand any sort of attack.
Finally, the development server could just fail outright. It is not designed to be used as a long-running process (days, weeks, months), and so it has not been well tested to work in this capacity.
So, yes, it has limitations. Yes, you could still conceivably use it in production. And yes, I would still recommend using a "real" web server. If you don't like the idea of needing to install something like Apache or Nginx, you can still go with a solution that is still as easy as "run a python script" by using some of the WSGI Standalone servers, which can run a server that is designed to be in production with something just as simple as running python run_app.py in the command line. You typically just need to create a 4-5 line python script to import and create the server object, point it to your Flask app, and run it.
gunicorn could be run with only the following on the command line, no extra script needed:
gunicorn myproject:app
...where "myproject" is the Python package that contains the app Flask object. Keep in mind that one of the developers of gunicorn would probably recommend against this approach. See https://serverfault.com/questions/331256/why-do-i-need-nginx-and-something-like-gunicorn.
The OP has long since moved on, but for those who encounter this question in the future I would just add that setting up an Apache server, even on a laptop, is free and pretty easy. It can be readily configured for as few or as many features as you want just by uncommenting or commenting out lines in the config file. There might be an even easier GUI method for doing that nowadays, but just editing the configs is simple.