How to resolve degradation / high response time on an AEM Author server

When the team runs a performance test against the AEM pre-prod author servers, the author server load immediately spikes and response times increase sharply. We need a way to reduce the author server's response time while these test cases are executing.

Add a dispatcher in front of the author server. Though usually intended for publish instances, dispatchers can be set up for authors as well, and they alleviate load on the server by serving common files from the cache. Of course, make sure to configure it properly to avoid caching editable content.

Related

How appropriate is it to use SAML_login with AEM with more than 1m users?

I am investigating slow login times and some profile synchronisation problems on a large enterprise AEM project. The system has around 1.5m users, and the website is served by 10 publishers.
The way this project is built is that SAML_login is enabled for all of these end users, and there is a third-party IDP which I assume SAML_login talks to. I'm no expert on SSO or the SAML_login process, so as a first step I'm trying to understand whether this is the correct way to go.
Because of this setup and the number of users, a SAML_login call takes 15 seconds on average. This is becoming more unacceptable day by day as the user count rises. Even more importantly, the synchronisation between the 10 publishers fails occasionally, so some users sometimes can't use the system as expected.
Because the users are stored in the JCR for SAML_login, you cannot even go and check the /home/users folder from the CRX browser; it times out, as it is impossible to show 1.5m rows at once. My educated guess is that this is also why the SAML_login call takes so long.
I've come across articles that describe how to set up SAML_login on AEM, which makes this usage sound legitimate. But in my opinion this is a poor setup, as the JCR is not designed as a quick-access data store for this kind of usage scenario.
My understanding so far is that this approach might work well with a limited number of users, but with this many users it is not a workable approach. So my first question would be: am I right? :)
If I'm not, there is certainly a bottleneck somewhere that I'm not aware of yet; what could that bottleneck be, so that it can be improved upon?
The AEM SAML Authentication handler has some performance limitations in its default configuration. When your browser makes an HTTP POST request to AEM under /saml_login, it includes a base64-encoded "SAMLResponse" request parameter. AEM processes that response directly and does not contact any external systems.
Even though the SAML response is processed on AEM itself, the bottlenecks of the /saml_login call are the following:
Initial login, where AEM creates the user node for the first time. You can mitigate this by creating the nodes ahead of time, for example with a script that pre-creates the SAML user nodes under /home/users.
During each login, when the session is first created, a token node is created under the user node at /home/users/.../{usernode}/.tokens. This can be avoided by enabling the encapsulated token feature.
Finally, the last bottleneck occurs when AEM saves the SAMLResponse XML under the user node (for later use by SAML-based logout). This can be avoided by not implementing SAML-based logout. The latest com.adobe.granite.auth.saml bundle supports turning off the saving of the SAML response; service packs AEM 6.4.8 and AEM 6.5.4 include this feature. To enable it, set the OSGi configuration properties storeSAMLResponse=false and handleLogout=false, and the SAML response will not be stored.
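A minimal sketch of applying that configuration through the Felix Web Console's configuration manager endpoint, using Python's requests library. The host, credentials, and especially the configuration PID below are assumptions; look up the actual PID of your SAML authentication handler configuration under /system/console/configMgr before using anything like this.

    # Sketch: set storeSAMLResponse=false and handleLogout=false on the SAML
    # authentication handler via the Felix Web Console config manager.
    # ASSUMPTIONS: author instance at localhost:4502, admin credentials, and the
    # PID below is a placeholder -- verify the real (factory) PID in
    # /system/console/configMgr before running.
    import requests

    AEM = "http://localhost:4502"
    PID = "com.adobe.granite.auth.saml.SamlAuthenticationHandler"  # placeholder

    response = requests.post(
        f"{AEM}/system/console/configMgr/{PID}",
        auth=("admin", "admin"),
        data={
            "apply": "true",
            "action": "ajaxConfigManager",
            "propertylist": "storeSAMLResponse,handleLogout",
            "storeSAMLResponse": "false",
            "handleLogout": "false",
        },
    )
    response.raise_for_status()
    print("OSGi configuration updated:", response.status_code)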

What are the limitations of the Flask built-in web server

I'm a newbie at web server administration. I've read multiple times that Flask's built-in web server is not designed for "production" and must be used only for testing and debugging...
But what if my app only touches a thousand users who occasionally send data to the server?
If it works, when will I have to bother with configuring a more sophisticated web server? (I am looking for approximate metrics.)
In a nutshell, I would love to know what the built-in web server can do (with approximate thresholds) and what it cannot.
Thanks a lot!
There isn't one right answer to this question, but here are some things to keep in mind:
With the right amount of horizontal scaling, it is quite possible you could keep scaling out the debug server indefinitely. Exactly when you would need to start scaling (or switch to a "real" web server) also depends on the environment you are hosting in, the expectations of your users, etc.
The main issue you would probably run into is that the server is single-threaded. This means it handles each request one at a time, serially, so if you are trying to serve more than one request (including favicons, static items like images, CSS and JavaScript files, etc.) the requests will take longer. If any given request happens to take a long time (say, 20 seconds), then your entire application is unresponsive for that time (20 seconds). This is only the default, of course: you could bump the thread count (or have requests handled in other processes), which might alleviate some issues. But once again, it can still be slow under a "high" load. What is considered a "high" load will depend on your application and the maximum acceptable response time.
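For example, here is a minimal sketch of opting the development server into threads or processes; the threaded and processes options are passed through to Werkzeug's run_simple, and the tiny app below is just a stand-in for your own.

    # Sketch: handle requests on the development server in threads (or processes).
    # The Flask app here is a placeholder for your own application object.
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def index():
        return "hello"

    if __name__ == "__main__":
        # Handle each incoming request in a new thread...
        app.run(threaded=True)
        # ...or, as an alternative, fork up to 4 worker processes:
        # app.run(processes=4)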
Another issue is security: if you are concerned at ALL about security (and not just the security of the data in the application itself, but the security of the box that will be running it as well) then you should not use the development server. It is not ready to withstand any sort of attack.
Finally, the development server could just fail outright. It is not designed to be used as a long-running process (days, weeks, months), and so it has not been well tested to work in this capacity.
So, yes, it has limitations. Yes, you could still conceivably use it in production. And yes, I would still recommend using a "real" web server. If you don't like the idea of needing to install something like Apache or Nginx, you can still go with a solution that is as easy as "run a Python script" by using one of the standalone WSGI servers, which can run a production-grade server with something as simple as python run_app.py on the command line. You typically just need to create a 4-5 line Python script that imports and creates the server object, points it to your Flask app, and runs it.
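As a minimal sketch of such a script, assuming the Waitress package and a Flask object named app in a module called myproject (both names are placeholders for your own):

    # run_app.py -- minimal standalone WSGI server sketch using Waitress.
    # "myproject" and "app" are placeholder names for your own module and object.
    from waitress import serve

    from myproject import app

    if __name__ == "__main__":
        serve(app, host="0.0.0.0", port=8080)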
gunicorn could be run with only the following on the command line, no extra script needed:
gunicorn myproject:app
...where "myproject" is the Python package that contains the app Flask object. Keep in mind that one of developers of gunicorn would probably recommend against this approach. See https://serverfault.com/questions/331256/why-do-i-need-nginx-and-something-like-gunicorn.
The OP has long-since moved on, but for those who encounter this question in the future I would just add that setting up an Apache server, even on a laptop, is free and pretty easy. It can be readily configured for as few or as many features as you want just by uncomment in or commenting out lines in the config file. There might be an even easier GUI method for doing that nowdays, but just editing the configs is simple.

Testing a Product that Includes Syncing and other Network Requests

I am nearing the release of an iOS app that syncs and otherwise interacts with a server. I am struggling to come up with a testing procedure that can cover most or all possible situations. I don't have any experience with automated testing, so I have been doing everything manually so far with the iPhone simulator and a physical device.
How would I start designing automated tests that can help me get better coverage of possible situations and also serve me well in the future as I make changes and add new features?
You probably need to be more specific in your question, i.e. outline how you communicate with your server, what technology is being employed, etc.
But as a general approach, the first thing I would do is look for a way to get reproducible results from the server. For example, if I send a message asking for a record with an id of 'x', then the server will always return the same record with the same data. There are several ways to do this: one would be to load a set of test data into your server. Another would be to create a local test server and talk to that instead. Another option is to avoid the server altogether in your automated tests and mock out the communication classes in your app. It totally depends on what you are trying to test and how.
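For the "local test server" option, here is a minimal sketch of a fixture server that always returns the same data for a given id. Flask is just an illustrative choice here, and the /records/<id> endpoint and record fields are made up for the example.

    # Sketch of a local test server returning fixed, predictable data.
    # The endpoint path and record fields are illustrative only.
    from flask import Flask, abort, jsonify

    app = Flask(__name__)

    # Fixture data: every test run sees exactly the same records.
    RECORDS = {
        "x": {"id": "x", "name": "Known record", "version": 1},
        "y": {"id": "y", "name": "Another record", "version": 1},
    }

    @app.route("/records/<record_id>")
    def get_record(record_id):
        record = RECORDS.get(record_id)
        if record is None:
            abort(404)
        return jsonify(record)

    if __name__ == "__main__":
        app.run(port=5000)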
Once you have your back end dealt with you can then look into automating the tests. This very much depends on how you have dealt with the server. For example, if you are performing an integration style test where you actually talk to a server, then the test might take the form:
Reset or clear the server data.
Load it with predictable data.
Run the iOS app using some testing framework and verify any data sent from the server.
Access the server and verify any changes made there.

REST: how to tell the server to do some background process

I am building a client-side product with REST. All user interaction will be done with a browser (the config stuff will be on a server running on localhost). I want everything to be REST compliant, even though the application will be running on a client's machine on localhost and will never be accessible from the outside.
The commands are pretty simple:
update
restart
sync
Here's what I've come up with:
POST to / with 'action' parameter (JSON) detailing specifics
PUT a new resource
subsequent GET requests will return the status
when the command is complete, the resource is deleted
What would be the most RESTful way to implement this?
Note:
I'm not asking for scrutiny of my software architecture. I have reasons for choosing a REST interface instead of a Unix domain socket, a CLI interface, or even a regular GUI interface. The justification would overcomplicate the question and make it too localized.
I have had the same need on a couple of different projects (both client-only and server), and I am looking for community input on best practices.
I would POST to a /process resource with the parameters necessary to start the process, and have it return a Location header pointing to a resource that represents the process status (/process/123). You can then GET that resource to get the latest information about the process.
I would not automatically delete the process, because if you do, the client will not know whether the process finished properly or not, only that it finished (well, stopped running).
That said, the client can certainly DELETE the resource when it is done with it, or you can clean it up later, after some reasonable time when whoever was interested in it is likely not to be any more.
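Here is a minimal sketch of that pattern, assuming Flask; the /process resource, the in-memory status dict, and the fake background work are all illustrative, and a real implementation would likely hand the work to a proper worker rather than a thread.

    # Sketch: POST /process -> 202 Accepted + Location, GET /process/<id> for
    # status, DELETE /process/<id> to clean up. Everything here is illustrative.
    import threading
    import time
    import uuid

    from flask import Flask, abort, jsonify, request, url_for

    app = Flask(__name__)
    processes = {}  # process id -> status record (in-memory, for the sketch)

    def run_command(process_id, action):
        processes[process_id]["status"] = "running"
        time.sleep(5)  # stand-in for the real update/restart/sync work
        processes[process_id]["status"] = "finished"

    @app.route("/process", methods=["POST"])
    def start_process():
        action = (request.get_json(silent=True) or {}).get("action", "sync")
        process_id = str(uuid.uuid4())
        processes[process_id] = {"action": action, "status": "queued"}
        threading.Thread(target=run_command, args=(process_id, action)).start()
        response = jsonify(processes[process_id])
        response.status_code = 202  # Accepted: work continues in the background
        response.headers["Location"] = url_for("get_process", process_id=process_id)
        return response

    @app.route("/process/<process_id>", methods=["GET"])
    def get_process(process_id):
        if process_id not in processes:
            abort(404)
        return jsonify(processes[process_id])

    @app.route("/process/<process_id>", methods=["DELETE"])
    def delete_process(process_id):
        processes.pop(process_id, None)
        return "", 204

    if __name__ == "__main__":
        app.run(port=5000)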

GWT Server side entry point

I followed these instructions
http://code.google.com/webtoolkit/usingeclipse.html
There appears to be no entry-point function for the server. How do I run background threads, or code not related to the RPC services that the server exports?
For example, what if some embedded database needs to be updated every 5 minutes? A background thread would then fetch the new data and do the updating.
GWT is a client-side technology and has nothing to do with the server side. You can use any server-side technology with it. If you use Java/servlets, then you can optionally use GWT-RPC, which is nice but not required.
Web applications are built around a request-reply paradigm: when a request comes in, they handle it and send back a reply. Servlets are designed around this paradigm. They are used at some of the largest sites and are not just a toy (as you suggested in another comment).
When you need something to run periodically, this is usually a job for a job scheduler. I'd recommend Quartz, which has great documentation; there is also an example of how to initialize it in a servlet environment.
That's not how web applications are supposed to work. Read http://code.google.com/intl/de-AT/webtoolkit/doc/latest/tutorial/clientserver.html
If you want to run some processing when a request comes in, and potentially include some dynamic parts, you can just make your pages JSPs or servlets. GWT does not have to be hosted in plain HTML files; the page served by the server just needs to be HTML. So something like a server-side entry point is either a JSP or a servlet; otherwise you need to use RPC. And if you needed to run RPC for every page load, you could consider the tip of embedding the RPC payload in the base response.