CQ5 Dispatcher: is it a must or optional? - AEM

We are having a lot of problems with the Dispatcher. According to the CQ5 documentation, the Dispatcher is a caching and/or load-balancing tool, so from my analysis we could also go without it. Am I correct? I want to integrate a Squid or Varnish web cache with my Apache server and shut down the Dispatcher. Would that be a good option?
Any views/help is appreciated.

Yes, it's perfectly possible to run a website without the Dispatcher in front. Your options would then seem to come down to:
No caching
Implementing a cache in front of the Publish instance (e.g. Squid/Varnish, as you mentioned; configuration required)
Integrating a caching solution in Java that you can apply to parts of your templates/components individually (development required; see the sketch after this list)
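As a rough illustration of that third option, here is a minimal in-memory TTL cache for rendered fragments. Every name here is invented, and you would have to wire it into your own component rendering yourself:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Minimal TTL cache for rendered component markup (illustrative only).
public class FragmentCache {

    private record Entry(String html, Instant expires) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Duration ttl;

    public FragmentCache(Duration ttl) {
        this.ttl = ttl;
    }

    // Return cached markup for 'key', re-rendering it once the TTL has passed.
    public String get(String key, Supplier<String> render) {
        Entry e = cache.get(key);
        if (e == null || Instant.now().isAfter(e.expires())) {
            e = new Entry(render.get(), Instant.now().plus(ttl));
            cache.put(key, e); // benign race: worst case, a fragment renders twice
        }
        return e.html();
    }
}
```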
Also, you'd need to check with Adobe what level of support they'd give you for any of the above solutions before undertaking them. If you like, you could post specific questions to SO around the problems you're facing with the Dispatcher and you may get some resolutions too.

I was told that you should use Dispatcher servers in front of your Publish instances, because it really helps loading times. There was also documentation with a table showing how much performance is affected depending on the number of documents served.
To avoid caching problems, you can specify files, folders or file types which should never be cached. You can also specify caching behaviour in the source code of the pages. In addition, making changes to content on your Author instance triggers a flush on the Dispatcher for the affected content, to make sure that no stale cached version is being served.
Last but not least, using an Apache server also lets you handle virtual hosts and rewrite rules easily.
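For illustration, the "never cache" rules mentioned above live in the Dispatcher's dispatcher.any configuration; a minimal sketch with invented globs (check the Dispatcher documentation for your version before copying anything):

```
/cache
  {
  /docroot "/opt/dispatcher/cache"
  /rules
    {
    # cache everything by default ...
    /0000 { /glob "*" /type "allow" }
    # ... but never cache dynamic JSON responses (invented example)
    /0001 { /glob "*.servlet.json" /type "deny" }
    }
  }
```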

It's a must.
If you are having problems with the Dispatcher, that could be a sign that you are using the wrong platform for your development needs, seeing as you feel the need to fall back on technologies that AEM doesn't require.

Related

What is the maximum versions a page can have in AEM?

Is there a limit to the number of versions a content item can have in AEM? I want to retain all the versions of my page; that is, an unlimited number.
I want to know whether AEM has an internal limit after which it automatically removes older versions.
Appreciate any thoughts on this.
Although this is not recommended, you can disable version purging by setting versionmanager.purgingEnabled to false. You will need to configure this as described in the document below:
https://docs.adobe.com/docs/en/aem/6-3/deploy/configuring/version-purging.html#Version Manager
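Sketched as an OSGi configuration (e.g. a file named com.day.cq.wcm.core.impl.VersionManagerImpl.config under /apps/system/config), per the linked documentation; verify the PID and property name against your AEM version:

```
versionmanager.purgingEnabled=B"false"
```

This keeps every version forever, so the caveats below about repository growth apply in full.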
Retaining lots of versions will gradually slow down your instance and result in poor authoring performance as the storage (Tar or Mongo) will grow large with stale data.
It is normally recommended to retain versions by a fixed number of days or fixed number of version counts.
For performance reasons, it is better to back up your AEM instance to archive older versions and rely on a restore function to access those versions.
I once asked Adobe Daycare this question and received a response similar to the one above: it is possible to disable purging of page versions, but it comes with the risk of authoring performance issues; pages can start loading very slowly.
The solutions that were suggested (depending on the requirements):
backing up the instance, which is not ideal if you need to be able to retrieve or compare old content at any time and recover it if needed; the disadvantage is that a full copy of the instance needs to be stored, and the backup needs to be repeated from time to time (whenever you notice performance issues)
designing and implementing a custom solution with an additional instance responsible for storing these versions; I don't have many details on that solution, but as I understood it, it would require deep analysis of how it could be done
if access to previous content is needed only for historical reasons (no need to retrieve and publish it again), making use of a page-to-PDF extraction mechanism and storing the history in the DAM or another place; you could then also consider saving a full-page PDF screenshot with the design (not just the content), covering different browser breakpoints, annotations, etc., depending on requirements

Redis / Memcached ReST caching for an external service

Question here about caching data from calls to an external ReST API.
There is currently a ReST service set up to generate and retrieve some specific types of reports that the UI must consume. However, this service is not meant for high-volume usage or to be exposed to the public, and the reports are fairly static, possibly changing only every 10-20 minutes. The web application resides on a separate server.
What I would like to do, using Memcached or Redis, is this: when a request for data comes in from the UI to the web back end, the back end makes a call to the report server to get the specified report, transforms the data into the format the UI consumes, caches it with a timestamp, and returns it to the UI. Subsequent requests are then served from memory on the web application's back end without re-requesting from the report server. I would also need to check the timestamp and make a new request if the cached report has been held longer than the specified time. The data to be cached is fairly minuscule: just some smallish JSON objects with only a handful of values holding the information the UI needs. There are not many of these objects either; I would not be surprised if they could all easily be held in memory at once, so timestamping is the only invalidation that should be necessary.
I have almost zero experience when it comes to caching / Memcached / Redis. Are there advantages to one or the other? Is something like this possible? How would I go about implementing it? Are there other options?
Appreciate the help!
Server-caching these kinds of RESTful query responses is very possible and quite common.
With any server based caching, you should also think hard about whether you really need it, as it does add complexity. It can certainly make a huge improvement, but since your usage volume is low, it might actually be overkill. You may also be able to use HTTP caching protocols to avoid the need for caching on the server. If the data doesn't change very often and you use eTags or modified dates correctly, along with an intermediary proxy like AWS CloudFront, users will rarely experience that delay.
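If you go down the HTTP-caching route, here is a minimal sketch of ETag revalidation on the back end, in servlet style for illustration; buildReport() stands in for the real report lookup:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Answer 304 Not Modified when the client already holds the current report,
// so neither the body nor the upstream call is repeated unnecessarily.
public class ReportServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String body = buildReport();
        String etag = "\"" + Integer.toHexString(body.hashCode()) + "\"";
        if (etag.equals(req.getHeader("If-None-Match"))) {
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        }
        resp.setHeader("ETag", etag);
        resp.getWriter().write(body);
    }

    private String buildReport() {
        return "{\"status\":\"ok\"}"; // placeholder for the real report
    }
}
```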
Also, if you are finding your database to be a bottleneck, you might be able to get away with just configuring it to cache more aggressively.
Assuming you do want to cache in memory ...
For server-side caching, the normal approach is to cache results for some time period or to clear them from the cache manually. A more modern and, IMO, better approach is Russian-doll caching, where you key items by the time their inputs last changed. Then you never need to worry about clearing them manually; you just make sure the timestamps are correct and synchronised.
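For the simple time-period approach, here is a minimal sketch using Redis via the Jedis client, letting Redis's own key expiry do the timestamp check for you; the host, key naming, TTL, and fetchReport() are all assumptions:

```java
import redis.clients.jedis.Jedis;

// Cache external report responses in Redis with a 15-minute expiry.
public class ReportCache {

    private static final int TTL_SECONDS = 15 * 60; // assumed refresh window

    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getReport(String reportId) {
        String key = "report:" + reportId;
        String cached = jedis.get(key);
        if (cached != null) {
            return cached; // served from memory, no upstream call
        }
        String fresh = fetchReport(reportId);  // call the external ReST service
        jedis.setex(key, TTL_SECONDS, fresh);  // Redis evicts the key after the TTL
        return fresh;
    }

    private String fetchReport(String reportId) {
        return "{\"id\":\"" + reportId + "\"}"; // placeholder for the HTTP call
    }
}
```

The same pattern works with Memcached, since both support per-key expiry; only the client library changes.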
Memcached versus Redis versus something else? For this usage, Memcached is probably best, as it's extremely simple and you don't have to worry about persistence (persistence is a big advantage of Redis over Memcached, just not one you need here). Redis is well engineered and would work fine too, but I don't see the benefit of using something considerably more feature-rich and complex when you don't need it and there's a good alternative. That said, the one big advantage of Redis is that it now has excellent built-in clustering support, so it's easy to scale and keep online; but that would be overkill for your use case.
Something else? There are plenty of other in-memory databases, but I think Memcached and Redis are the safest bets if you want to avoid the problems of relying on cutting-edge tools without much support behind them. However, there is one more option: boring old files. If you're generating reports, you might consider just writing them out as temporary files; if your OS is doing its job, they will end up cached in memory anyway.

AEM6 / CQ: how to handle component development and content authoring happening at the same time?

I just started at my new job and found myself right in the middle of a big project using Adobe AEM CQ, which I've never used before. Currently there are developers creating and tweaking components while content authors are busy authoring about 65 pages of content using those components.
Obviously, every time a component changes, someone needs to update all the authored content to match. This is a huge time-waster, as it seems the only way to do it is through a custom-made script that looks for nodes in the XML files and tries to convert them to the new component spec. Sometimes this is not even possible, and authors need to re-author tons of content and lose lots of time.
Can anyone with AEM experience please let me know if:
1) There is a more painless way to migrate authored content to new components?
2) There is a better way to have developers and authors work simultaneously?
I know that the ideal way is to develop components first, and then author on top of those but it seems unrealistic especially with a big client project where things change all the time.
Thanks
Firstly, this sounds like a business-process problem. The components should be fully developed and fully tested before content is added by the authors. If the edits to components are so drastic that you're having this problem, I would recommend having functional and technical requirements written before the build starts.
With that said, the Groovy console for AEM is an excellent tool for updating nodes and content within an AEM site. Take a look at it here: https://github.com/Citytechinc/cq-groovy-console
I would not agree that content production should happen only after all the components have been developed. It's beneficial, especially when content production will take a lot of time, to start it while development is happening.
On the other hand, I completely agree with the rest of the answer: the Groovy Console is the way to go when dealing with content migration (both before go-live and afterwards, as part of the BAU process). The ideal situation is one where all the current content can be mapped to data in the new version of the component; then you should be able to migrate all the content with scripts. If that's not the case, you can't get away from authors entering the content manually.
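For illustration, a Groovy Console migration script might look roughly like this; the resource types and property names are invented, so treat it as a sketch and dry-run it on a non-production copy first:

```groovy
// Re-point instances of an old component at its replacement and rename a property.
def query = session.workspace.queryManager.createQuery(
    "SELECT * FROM [nt:unstructured] WHERE [sling:resourceType] = 'myproject/components/oldteaser'",
    'JCR-SQL2')

query.execute().nodes.each { node ->
    node.setProperty('sling:resourceType', 'myproject/components/teaser')
    if (node.hasProperty('jcr:title')) {
        node.setProperty('title', node.getProperty('jcr:title').string) // new property name
        node.getProperty('jcr:title').remove()                          // drop the old one
    }
}
session.save()
```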
Definitely, components should be fully developed before they are used.
But if you want to change something in a component that stays the same across the entire website, such as a logo or header component, you can look into the Design Dialog.
The advantage of this is:
If you have already authored n pages, a change made to the component through the Design Dialog is automatically reflected on all the pages where the component is used.
AEM is a CMS where, to put it in simpler terms, content is your data. If your development process is such that the data is inconsistent with the UI after every release, then your delivery process might be at fault. You can use the following approaches to make things better:
Make components backward compatible with the data
Make components version-able, i.e. new versions of components work with new models of data and it's left to the user to use new versions.
Provision for data or component migration in your project plan.
In practice, most AEM implementations make components backward compatible and provide an upgrade path to new versions. This is not a technical problem; it's more of a project-governance issue.
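Here is a minimal sketch of the backward-compatibility idea as a Sling Model that prefers the new property name but still reads the legacy one; the class and property names are invented:

```java
import javax.inject.Inject;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.models.annotations.Model;
import org.apache.sling.models.annotations.Optional;

// Old content (with 'title') keeps rendering after the release that renamed
// the property to 'headline'; authors can migrate at their own pace.
@Model(adaptables = Resource.class)
public class TeaserModel {

    @Inject @Optional
    private String headline; // new property name

    @Inject @Optional
    private String title;    // legacy property name

    public String getHeadline() {
        return headline != null ? headline : title;
    }
}
```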
This post keeps resurfacing, so I don't want people to get the wrong idea from the current state of the answers (some of which should really be comments, IMHO): in general, the approach to dealing with components and releases is not a technical problem of the platform.

How to implement continuous migration for large website?

I am working on a website of 3,000+ pages that is updated on a daily basis. It's already built on an open-source CMS. However, we cannot simply continue to apply hot fixes on a regular basis. We need to replace the entire system, and I anticipate needing to replace it again every 1-2 years. We don't have the staff to work on a replacement system while the current one is being maintained, as that results in duplicated effort. We also cannot have a "code freeze" while we work on the new site.
So, this amounts to changing the tire while driving. Or fixing the wings while flying. Or all sorts of analogies.
This brings me to a concept called "continuous migration." I read this article here: https://www.acquia.com/blog/dont-wait-migrate-drupal-continuous-migration
The writer's suggestion is to use a CDN like Fastly. The idea is that a CDN allows you to switch between a legacy system and a new system on a per-URL basis. In theory, this sounds like an idea that would work. The article claims that you can do this with Varnish, but that Fastly makes the job easier. I don't work much with Varnish, so I can't verify its claims.
I also don't know if this is a good idea or whether there are better alternatives. I looked at Fastly's pricing scheme, and I simply cannot translate it to a specific price point. I don't understand these cryptic cloud-service pricing plans; they don't make sense to me. I also don't know what kind of bandwidth the website uses, as another agency manages the website's servers.
Can someone help me understand whether using an online CDN would be better than using something like Varnish? Are there free or cheaper solutions? Can someone tell me what this amounts to, approximately, on a monthly or annual basis? Are there other, better ways to roll out a new website on a phased basis for a large site?
Thanks!
I may not have exact answers to your question, but maybe my answer helps a little.
I don't think the CDN itself is what gives you the advantage; the advantage is that you have more than one system.
Changes to the code
In professional environments I'm used to having three different CMS installations. The first is the development system, usually on my PC. That system is used to develop extensions, fix bugs and so on, supported by unit tests. The code is committed to a revision control system (like SVN, CVS or Git). A continuous integration system checks the commits to the RCS. When a feature is implemented (or some bugs are fixed), a named tag is created. Then this tagged version is installed on a test system where developers, customers and users can test the implementation. After a successful test, exactly this tagged version is installed on the production system.
At first sight this looks time-consuming. But it isn't, because most of the steps can be automated. And the biggest advantage is that the customer can test the change on a test system, and it is very unlikely that an error occurs only on your production system. (A precondition is that your systems are built on similar/identical environments.)
Changes to the content
If your code changes the way your content is processed, it is an advantage when your CMS has strong workflow support. Then you can easily add a step to your workflow which decides whether the content is old and has to be migrated for the current document. This way you have a continuous migration of the content.
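Here is a minimal sketch of such a workflow step, assuming documents carry a schema-version property; the property names and the example migration are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Migrate a document in place if it predates the current content schema.
public class ContentMigrationStep {

    private static final int CURRENT_VERSION = 2;

    public void process(Map<String, Object> document) {
        int version = (int) document.getOrDefault("contentVersion", 1);
        if (version < CURRENT_VERSION) {
            // example migration: version 2 renamed 'teaserText' to 'summary'
            if (document.containsKey("teaserText")) {
                document.put("summary", document.remove("teaserText"));
            }
            document.put("contentVersion", CURRENT_VERSION);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>(Map.of("teaserText", "old field"));
        new ContentMigrationStep().process(doc);
        System.out.println(doc); // prints the migrated document
    }
}
```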
HTH
Varnish is a cache rather than a CDN. It intercepts page requests and delivers a cached version if one exists.
A CDN will serve up contents (images, JS, other resources etc) from an off-server location, typically in the cloud.
Pricing for cloud-based solutions is often very cryptic, as it's quite complicated technology.
I would be careful with continuous migration. I've done both methods in the past (continuous and full migrations) and I have to say, continuous is a pain. It means double the admin time for everything, and assumes your requirements are the same at all points in time.
Unfortunately, I would say you're better off with a proper rebuild on a 1-2 year basis than with continuous migration, but obviously you know your situation best.
I would suggest you maybe also consider a hybrid approach: build yourself an export tool that keeps all of your content in a transferable format like CSV/XML/JSON, so you can just import it into a new system when ready. This means you can incorporate new build requirements when you need them in a new system (what's the point of a new system if it does exactly the same as the old one?) and you get to keep all your content. Plus, you don't need to build and maintain two CMSes all the time.
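A minimal sketch of such an export tool using Jackson; fetchAllPages() is a stand-in for whatever query API your CMS exposes:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.util.List;
import java.util.Map;

// Dump every page as a JSON record so it can be re-imported into whatever
// system eventually replaces the current CMS.
public class ContentExporter {

    public static void main(String[] args) throws Exception {
        List<Map<String, Object>> pages = fetchAllPages();
        new ObjectMapper().writerWithDefaultPrettyPrinter()
                          .writeValue(new File("content-export.json"), pages);
    }

    private static List<Map<String, Object>> fetchAllPages() {
        // placeholder: pull pages from the CMS's own API or database here
        return List.of(Map.of("path", "/about",
                              "title", "About us",
                              "body", "<p>...</p>"));
    }
}
```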

How to manage multiple clients with slightly different business rules? [closed]

We have written a software package for a particular niche industry. This package has been pretty successful, to the extent that we have signed up several different clients in the industry, who use us as a hosted solution provider, and many others are knocking on our doors. If we achieve the kind of success that we're aiming for, we will have literally hundreds of clients, each with their own web site hosted on our servers.
Trouble is, each client comes in with their own little customizations and tweaks that they need for their own local circumstances and conditions, often (but not always) based on local state or even county legislation or bureaucracy. So while probably 90-95% of the system is the same across all clients, we're going to have to build and support these little customizations.
Moreover, the system is still very much a work in progress. There are enhancements and bug fixes happening continually on the core system that need to be applied across all clients.
We are writing code in .NET (ASP, C#), MS-SQL 2005 is our DB server, and we're using SourceGear Vault as our source control system. I have worked with branching in Vault before, and it's great if you only need to keep 2 or 3 branches synchronized - but we're looking at maintaining hundreds of branches, which is just unthinkable.
My question is: How do you recommend we manage all this?
I expect answers will be addressing things like object architecture, web server architecture, source control management, developer teams etc. I have a few ideas of my own, but I have no real experience in managing something like this, and I'd really appreciate hearing from people who have done this sort of thing before.
Thanks!
I would recommend against maintaining separate code branches per customer. It is a nightmare to keep working code against your core that way.
I do recommend implementing the Strategy pattern and covering your "customer customizations" with automated tests (e.g. unit and functional) whenever you change your core; a sketch follows below.
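Here is a minimal sketch of that pattern, with configuration choosing the implementation per client. The domain, class names, and rates are invented, and since your stack is .NET, read the Java below as pseudocode for the shape rather than a drop-in:

```java
import java.util.Map;

// Each customization lives behind an interface; configuration, not a source
// branch, decides which implementation a given client gets.
interface TaxStrategy {
    double tax(double amount);
}

class DefaultTax implements TaxStrategy {
    public double tax(double amount) { return amount * 0.07; }
}

class CountySurchargeTax implements TaxStrategy {
    public double tax(double amount) { return amount * 0.085; }
}

public class TaxStrategies {

    // In practice this mapping would come from per-client configuration.
    private static final Map<String, TaxStrategy> BY_CLIENT = Map.of(
            "clientA", new DefaultTax(),
            "clientB", new CountySurchargeTax());

    public static TaxStrategy forClient(String clientId) {
        return BY_CLIENT.getOrDefault(clientId, new DefaultTax());
    }

    public static void main(String[] args) {
        System.out.println(forClient("clientB").tax(100.0)); // 8.5
    }
}
```

The core calls forClient(...) and never branches on client names itself, which is what keeps a single code line testable across hundreds of clients.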
UPDATE:
I recommend that, before you get too many customers, you establish a system for creating and updating each of their websites. How involved you get will of course be balanced against your current revenue stream, but you should have an end state in mind.
For example, when you sign up Customer X (hopefully entirely via the web), their website is created within XX minutes and the customer is sent an email stating it's ready.
You definitely want to setup a Continuous Integration (CI) environment. TeamCity is a great tool, and free.
With this in place, you'll be able to check your updates in a staging environment and can then apply those patches across your production instances.
Bottom Line: Once you get over a handful of customers, you need to start thinking about automating your operations and your deployment as yet another application to itself.
UPDATE: This post highlights the negative effects of branching per customer.
Our software has very similar requirements and I've picked up a few things over the years.
First of all, such customizations will cost you in both the short and the long term. If you have control over it, put some checks and balances in place so that sales & marketing do not over-zealously sell customizations.
I agree with the other posters that say NOT to use source control to manage this. It should be built into the project architecture wherever possible. When I first began working for my current employer, source control was being used for this and it quickly became a nightmare.
We use a separate database for each client, mainly because for many of our clients, the law or the client themselves require it due to privacy concerns, etc...
I would say that the business-logic differences have probably been the least difficult part of the experience for us (your mileage may vary depending on the nature of the customizations required). For us, most variations in business logic can be broken down into a set of configuration values, which we store in an XML file that is modified upon deployment (if machine-specific) or stored in a client-specific folder and kept in source control (explained below). The business logic obtains these values at runtime and adjusts its execution appropriately. You can use this in concert with various strategy and factory patterns as well; config fields can contain the names of strategies, etc. Also, unit testing can be used to verify that you haven't broken things for other clients when you make changes. Currently, adding most new clients to the system involves simply mixing and matching the appropriate config values (as far as business logic is concerned).
More of a problem for us is managing the content of the site itself, including the pages, style sheets, text strings and images, all of which our clients often want customized. The approach I've taken is to create a folder tree for each client that mirrors the main site. This tree is rooted at a folder named "custom" that is located in the main site folder and deployed with the site. Content placed in the client-specific set of folders either overrides or merges with the default content (depending on file type), and at runtime the correct file is chosen based on the current context (user, language, etc.). The site can be made to serve multiple clients this way. Efficiency may also be a concern; you can use caching and the like to make it faster (I use a custom VirtualPathProvider).
The largest problem we run into is the burden of visually testing all of these pages when we need to make changes: to be 100% sure you haven't broken something in a client's custom setup when you have changed a shared stylesheet, image, etc., you would have to visually inspect every single page after any significant design change. I've developed some "feel" over time for which changes can be made comfortably without breaking things, but it's still not a foolproof system by any means.
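A minimal sketch of that override lookup, assuming a plain file-system layout; the folder names are illustrative, and the original uses a custom VirtualPathProvider in ASP.NET rather than direct file access:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Prefer custom/<client>/<path> when it exists, else fall back to the shared file.
public class ClientContentResolver {

    private final Path siteRoot;

    public ClientContentResolver(Path siteRoot) {
        this.siteRoot = siteRoot;
    }

    public Path resolve(String client, String relativePath) {
        Path custom = siteRoot.resolve("custom").resolve(client).resolve(relativePath);
        return Files.exists(custom) ? custom : siteRoot.resolve(relativePath);
    }
}
```

With a layout like site/custom/clientA/styles/main.css, clientA gets its own stylesheet while every other client keeps the shared default.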
In my case, I also have no control, other than offering my opinion, over which visual/code customizations are sold, so MANY more of them than I would like have been sold and implemented.
This is not something that you want to solve with source control management, but within the architecture of your application.
I would come up with some sort of plugin like architecture. Which plugins to use for which website would then become a configuration issue and not a source control issue.
This allows you to use branches etc. for the purpose they are intended for: parallel development of code between (or maybe even across) releases. Each plugin becomes a separate project (or sub-project) within your source code system. This also allows you to combine all plugins and your main application into one Visual Studio solution to help with dependency analysis, etc.
Loosely coupling the various components in your application is the best way to go.
As mentioned before, source control does not sound like a good solution to your problem. To me it sounds better to have a single code base using a multi-tenant architecture. This way you get a lot of benefits in terms of managing your application, load on the service, scalability, etc.
Our product uses this approach: we have a lot of core functionality that is the same for all clients, plus custom modules that are used by one or more clients, and at the core the "customization" is a simple workflow engine that uses different workflows for different clients. Each client gets the core functionality, its own workflow(s), and an extended set of modules that are either client-specific or generalized for more than one client.
Here's something to get you started on multi-tenancy architecture:
Multi-Tenant Data Architecture
SaaS database tenancy patterns
Without more info, such as types of client specific customization, one can only guess how deep or superficial the changes are. Some simple/standard approaches to consider:
If you can keep a central config specifying the uniqueness from client to client
If you can centralize the business rules in one class or group of classes
If you can store the business rules in the database and pull them out based on the client
If the business rules can all be DB/SQL based (each client having their own DB)
Overall, hard-coding differences based on client name/ID is very problematic, and keeping different code bases per client is costly (think of the complete testing/retesting time required for the 90% that doesn't change). I think more info is required to answer properly (give some specifics).
Layer the application. One of those layers contains the customizations and should be able to be pulled out at any time without affecting the rest of the system. Application- and DB-level "triggers" (quoted because they may or may not employ actual DB triggers) that call customer-specific code, or that are parameterized with customer keys, are very helpful.
Core should never be customized, but you must layer it in somewhere, even if it is simplistic web filtering.
What we have is a core database with the functionality that all clients get. Then each client has a separate database that contains the customizations for that client. This is expensive in terms of maintenance. The other problem is that when two clients ask for similar functionality, it is often done differently by the two separate teams. There is currently little done to share customizations between clients and fold common ones into the core application. Each client has their own application portal, so we don't have to worry about a change to one client affecting another.
Right now we are looking at moving to a process using a rules engine, but there is some concern that the performance won't be there for the number of records we need to process. However, in your circumstances, this might be a viable alternative.
I've used some applications that offered the following customizations:
Web pages were configurable - we could drag fields out of view, position them where we wanted with our own name for the field label.
Add our own views or stored procedures and use them in: data grids (along with an update proc) and reports. Each client would need their own database.
Custom mapping of Excel files to import data into system.
Add our own calculated fields.
Ability to run custom scripts on forms during various events.
Identify our own custom fields.
If your clients are larger companies, you're almost certainly going to need your own SDK, APIs, etc.