JCR API or Apache Sling - aem

I have read a lot of articles comparing JCR and Apache Sling, and I'm confused about which to use. Some authors advise using the JCR API because it performs better; others side with Apache Sling because it is faster to write and far more readable and maintainable in the long run. So I have some questions:
Which practice is better from your point of view?
What is more often used in production projects?

I think Maciej Matuszewski covered this subject thoroughly in his presentation "JCR, Sling or AEM? Which API should I use and when?".
In most cases it is recommended to use Apache Sling as the higher-level API, whereas JCR is needed when performance has to be taken into account. It is important, however, to know where the border between these two scenarios lies.
Maciej notes that the difference is around 1 ms for opening a regular AEM page, without taking any caching into account. Worrying about performance is unnecessary in that case. We should instead focus on writing code that is readable, understandable and reduced to the minimum, and that reuses existing APIs, frameworks and utility classes that are already covered by unit tests and peer reviewed, rather than reinventing the wheel. Based on that, we should also prefer the AEM layer over the Sling layer.
From my experience, JCR should be used in only a few scenarios, mainly when traversing a large amount of data in the CRX repository that cannot be reached through any search API.
The difference is like the one between C# and C++ for game development: in some cases it is enough to stay at the higher-level API for development convenience, while in others you have to go lower and start using C++ pointers.
However, the most important thing is not to mix both abstraction layers in your implementation.

To start with a very typical answer, 'IT DEPENDS'.
Consider the following scenarios for your understanding:
Scenario 1: Read the title of the page which is containing the current resource.
Approach 1: Leverage the Sling APIs to work with the available context objects such as currentPage, resource, pageManager, wcmmode and many more in your Java controller (Sling Model / WCMUse class).
// get the page that contains this resource.
// If the resource is a page the resource is returned. Otherwise it
// walks up the parent resources until a page is found.
Page page = pageManager.getContainingPage(resource);
// Check if the returned page object isn't null
if (page != null) {
    return page.getTitle();
}
Approach 2: Use the JCR APIs:
// assign the current resource node to parentNode to check
// whether the current resource is itself a page
Node parentNode = currentNode;
// compare strings with equals(), not !=; the property name is jcr:primaryType
while (!"cq:Page".equals(parentNode.getProperty("jcr:primaryType").getString())) {
    parentNode = parentNode.getParent();
}
// The page title
String pageTitle = null;
// find the jcr:content node of the page and return the
// jcr:title property of that node
if (parentNode.hasNode("jcr:content")) {
    Node jcrContentNode = parentNode.getNode("jcr:content");
    pageTitle = jcrContentNode.getProperty("jcr:title").getValue().getString();
}
return pageTitle;
In this scenario the Sling APIs obviously win by a huge margin on
ease of access and usability. I have never experienced any
performance issue with the Sling APIs in comparison to the JCR APIs.
Scenario 2: Change the title of the first level page nodes (considering /content/mywebsite/en to be level ZERO) of your website to Upper Case letters.
Approach: For a requirement like this, where you need to make one-time changes to your JCR repository, use the JCR APIs from a standalone Java application. The alternative would be to create an unnecessary component, its controller and a page to hold it, and then use the Sling APIs in the controller to perform the task.
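// imports: javax.jcr.*, org.apache.jackrabbit.commons.JcrUtils
// Create a connection to the CQ repository running on localhost
Repository repository = JcrUtils.getRepository("http://localhost:4502/crx/server");
// Create a session
Session session = repository.login(new SimpleCredentials("username", "password".toCharArray()), "crx.default");
// Get the root node
Node root = session.getRootNode();
// Get the level ZERO node (Node.getNode takes a relative path, so no leading slash)
Node homePageNode = root.getNode("content/mywebsite/en");
NodeIterator iter = homePageNode.getNodes();
while (iter.hasNext()) {
    Node child = iter.nextNode();
    // if the next node is of primary type cq:Page,
    // get its jcr:content node and
    // set its jcr:title property to uppercase letters
    if (child.isNodeType("cq:Page") && child.hasNode("jcr:content")) {
        Node content = child.getNode("jcr:content");
        if (content.hasProperty("jcr:title")) {
            content.setProperty("jcr:title", content.getProperty("jcr:title").getString().toUpperCase());
        }
    }
}
// Persist the changes and release the session
session.save();
session.logout();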
Rule of Thumb:
If you want access to your AEM repository from within the AEM
application, use the Sling APIs over the JCR APIs. They:
are higher-level APIs than JCR (with a lot of predefined methods that do a lot of the work)
provide access to all the global context objects inside the controller
are very easy to use
But if you need to access your repository for large-scale operations
(generally one-time changes), choose to work with a standalone Java
application using the JCR APIs.

Related

bulk GET using HATEOAS

I've seen many examples of HATEOAS where every resource has links to related resources. With an API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS. For example:
GET city/documents:
[{
    id: 1,
    city: {
        self: 'http://service.com/cities?filter=id==1'
    },
    document: { ... }
    ...
}, {
    id: 2,
    city: {
        self: 'http://service.com/cities?filter=id==2'
    },
    document: { ... }
    ...
}]
FYI, the query parameter uses the FIQL syntax to define the filters.
Now, if the client were to fetch the city details for each document (to show in the UI), it would probably need N additional calls. However, in my case the /cities API can additionally take multiple city ids, like this: /cities?filter=id=in=(1,2), which can reduce N calls to one. Is there a way to articulate something like this using HATEOAS? I've read about templates, but I'm not sure what the template should look like or how the client would consume it.
I've seen many examples of HATEOAS where every resource has links to related resources. With an API that returns N items of a certain resource per page, the client would probably need N calls to fetch any nested resource by consuming HATEOAS.
Yes, although this is less true in a world with Server Push, where the server can proactively provide multiple resources in response to a query. If you imagine asking for a web page and getting the HTML, and then also the images and the JavaScript resources, you've got the right sort of idea.
API can additionally take multiple city ids like this: /cities?filter=id=in=(1,2) that can reduce N calls to one. Is there a way to articulate something like this using HATEOAS?
Yes.
Let's walk through it carefully. What you've done here is introduce a new resource, with the identifier /cities?filter=id=in=(1,2). You might have another resource /cities?filter=id=in=(1,20) and another resource /cities?filter=id=in=(1,2000). In your implementation, these might be a "single endpoint" that extracts parameters from the identifier and uses them to generate the correct representation.
So what you get is something like a data transfer object - a large grained resource fetched in a single go.
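As a rough sketch of how such a coarse-grained link might be built once the city ids on a page are known: the class name BulkCityLink and the method citiesUri are hypothetical, and only the /cities?filter=id=in=(...) shape comes from the question. In a hypermedia API the server would normally emit this link (or a template for it) in the page representation rather than leave the client to assemble it.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

public class BulkCityLink {

    // Builds one link covering every distinct city id, in first-seen order.
    // Hypothetical helper; the FIQL "=in=" filter shape is taken from the question.
    public static String citiesUri(String baseUri, Collection<Integer> cityIds) {
        String ids = new LinkedHashSet<>(cityIds).stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return baseUri + "/cities?filter=id=in=(" + ids + ")";
    }

    public static void main(String[] args) {
        // Two documents share city 1, so the id appears only once in the link.
        System.out.println(citiesUri("http://service.com", List.of(1, 2, 1)));
    }
}
```

Deduplicating the ids keeps the link short when many documents on a page point at the same city.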
I've read about templates, but I'm not sure what the template should look like or how the client would consume it.
The simplest example, which you have likely seen already, is a web form. You allow the client to provide the start and end elements, and the form processing takes that information and creates the specified URI from it.
/filtered-cities?start=1&end=2000
So the client needs to understand what the form is for, and how to identify the semantics of the different elements in the form. The agent needs to understand the processing rules that transfer the form data into the URI.
URI Templates are the same basic idea; they give you a domain-agnostic language with which to describe where the parameters go in a resource identifier. The basic pattern is the same: there needs to be agreement about the semantics of the parameters, the server provides a URI template, the client provides a parameter map, and generic code can take care of the merge:
uri = template.apply(parameterMap)
URI Templates aren't quite as powerful as forms; with a form, you can introduce a default value for a parameter, but there is no analogous capability in URI templates.
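To make the merge step concrete, here is a deliberately simplified sketch that handles only plain {name} variables. It is not a full URI Template implementation (no operators, lists or prefix modifiers), and the class and method names are hypothetical. It reuses the /filtered-cities example from above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UriTemplateSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\{([A-Za-z0-9_]+)\\}");

    // Replaces each {name} with the encoded value from the map.
    // Variables missing from the map expand to the empty string.
    public static String expand(String template, Map<String, String> parameters) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            String value = parameters.getOrDefault(matcher.group(1), "");
            // URLEncoder applies form encoding (a space becomes '+'),
            // which is close enough for a query-string sketch.
            String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
            matcher.appendReplacement(result, Matcher.quoteReplacement(encoded));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String template = "/filtered-cities?start={start}&end={end}";
        // prints /filtered-cities?start=1&end=2000
        System.out.println(expand(template, Map.of("start", "1", "end", "2000")));
    }
}
```

The client only needs to know what "start" and "end" mean; where they end up in the identifier is decided by the template the server hands out.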
HAL-Forms may give you a better sense of how a form based approach might work in JSON.

Multiple GET Rest APIs on different fields of a table or One Rest API in DDD

I want my service to let clients get data based on different fields, or sometimes on a combination of fields. E.g.
getByA
getByB
getByC
getByAandB
getByAandC
In domain-driven design, when designing the GET APIs, which of the following two options should I choose?
Should I create an individual GET API for each of the functionalities I want to provide?
Should I create one GET API that covers all the possible lookups by taking these fields as query parameters? E.g.
get?A=?&B=?&C=?
Which one is the better way to do this? Any suggestions on best practice?
There is a middle path between using individual GET APIs for each of these queries and creating one GET API.
You could use the Specification pattern to expose one GET API, but translate it into a Domain Specification Object before passing it on to the Domain layer for querying. You typically do this transformation in your View Controller, before invoking the Application Service.
Martin Fowler and Eric Evans have published a great paper on using Specifications: https://martinfowler.com/apsupp/spec.pdf
As the paper states, The central idea of Specification is to separate the statement of how to match a candidate, from the candidate object that it is matched against.
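A minimal sketch of that middle path, assuming a record Item with fields a, b and c standing in for the table from the question: the single GET endpoint hands its query parameters to fromQuery, which translates them into a composed Specification. All names here are hypothetical, and the in-memory filter only stands in for whatever the domain layer would really do with the specification.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SpecificationSketch {

    // Hypothetical entity with the three queryable fields from the question.
    public record Item(String a, String b, String c) {}

    // States how to match a candidate, separately from the candidate itself.
    public interface Specification<T> {
        boolean isSatisfiedBy(T candidate);

        default Specification<T> and(Specification<T> other) {
            return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
        }
    }

    // Controller-side translation: query parameters in, one specification out.
    public static Specification<Item> fromQuery(Map<String, String> query) {
        Specification<Item> spec = item -> true;
        String a = query.get("A");
        String b = query.get("B");
        String c = query.get("C");
        if (a != null) spec = spec.and(item -> a.equals(item.a()));
        if (b != null) spec = spec.and(item -> b.equals(item.b()));
        if (c != null) spec = spec.and(item -> c.equals(item.c()));
        return spec;
    }

    // Stand-in for the domain query: filter candidates by the specification.
    public static List<Item> find(List<Item> items, Specification<Item> spec) {
        return items.stream().filter(spec::isSatisfiedBy).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item("x", "y", "z"), new Item("x", "q", "z"));
        // get?A=x&B=y matches only the first item
        System.out.println(find(items, fromQuery(Map.of("A", "x", "B", "y"))));
    }
}
```

With this shape, getByA, getByAandB and the other combinations all become one endpoint, and adding a new field means adding one more clause in fromQuery.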
Note:
You are fine if you use this pattern for the query side, as outlined in your question, and avoid reusing it in different contexts. For example, DO NOT use a specification object on both the query side and the command side if you are using (or plan to use) CQRS. You would be creating a central dependency between two parts that NEED to be kept separate.
Specifications are handy when you want to represent a domain concept. Evaluate your queries (getByAandB and getByAandC) to draw out the question you are asking to the domain (For ex., ask your domain expert to describe the data he is trying to fetch).

Renewing instances in Autofac

I know that the context of this issue is a bit specific, but I'll do my best to explain it. I'm performing a fairly large import from one ecommerce platform into nopCommerce.
nopCommerce uses Autofac as its dependency injection container. Importing one product into nopCommerce involves some queries over nopCommerce tables and finally an insert into the products table. These steps are repeated many times, and the Entity Framework context keeps growing, because it has to track more and more entities, detect changes and figure out which objects have to be persisted.
What I want to do is renew the context in every iteration of the loop, so that it only tracks the entities associated with the current iteration. Obviously I want to achieve this while modifying the nopCommerce core as little as possible. The container configuration explicitly registers the EF context per HTTP request (something I want to avoid, as I need a new instance per iteration).
An easy way to do it would be:
foreach job in jobs
    Eject all instances in container
    service1 = Container.RequestInstance<SomeServiceINeed>
    service2 = Container.RequestInstance<SomeServiceINeed2>
    DoTheJob
The thing is, I don't know how to accomplish this with Autofac. I have tried creating a new ContainerBuilder and updating the existing container, but _context.GetHashCode always returns the same value, so I am still getting the same instance.
Any idea about the best way to do it?
EDIT:
As it was suggested in the comments, I've tried to get the instances inside a lifetime scope. Basically:
using (var lifeTime = EngineContext.Current.ContainerManager.Container.BeginLifetimeScope())
{
    service1 = lifeTime.Resolve<SomeServiceINeed>();
    service2 = lifeTime.Resolve<SomeServiceINeed2>();
    ..............
}
But I get this exception:
No scope with a Tag matching 'AutofacWebRequest' is visible from the scope in
which the instance was requested. This generally indicates that a component
registered as per-HTTP request is being requested by a SingleInstance() component
(or a similar scenario.) Under the web integration always request dependencies from
the DependencyResolver.Current or ILifetimeScopeProvider.RequestLifetime,
never from the container itself.
The services I'm trying to resolve obviously also depend on a lot of different repositories and other services that are already defined in the container wiring (at app start). Some of them are configured as 'PerHttpRequest'.
Thanks a lot!

How should I deal with object hierarchies in a RESTful API?

I am currently designing the API for an existing PHP application, and to this end am investigating REST as a sensible architectural approach.
I believe I have a reasonable grasp of the key concepts, but I'm struggling to find anybody that has tackled object hierarchies and REST.
Here's the problem...
In the [application] business object hierarchy we have:
Users
  L which have one-to-many Channel objects
  L which have one-to-many Member objects
In the application itself we use a lazy load approach to populate the User object with arrays of these objects as required. I believe in OO terms this is object aggregation, but I have seen various naming inconsistencies and do not care to start a war about the precise naming convention </flame war>.
For now, consider I have some loosely coupled objects that I may / may not populate depending on application need.
From a REST perspective, I am trying to ascertain what the approach should be. Here is my current thinking (considering GET only for the time being):
Option 1 - fully populate the objects:
GET api.example.com/user/{user_id}
Read the User object (resource) and return the User object with all possible Channel and Member objects pre-loaded and encoded (JSON or XML).
PROS: reduces the number of objects, no traversal of object hierarchies required
CONS: objects must be fully populated (expensive)
Option 2 - populate the primary object and include links to the other object resources:
GET api.example.com/user/{user_id}
Read the User object (resource) and return the User object with the User data populated, plus two lists.
Each list references the appropriate (sub) resource i.e.
api.example.com/channel/{channel_id}
api.example.com/member/{member_id}
I think this is close to (or exactly) the implications of hypermedia - the client can get the other resources if it wants (as long as I tag them sensibly).
PROS: client can choose to load the subordinates or otherwise, better separation of the objects as REST resources
CONS: further trip required to get the secondary resources
Option 3 - enable recursive retrieves
GET api.example.com/user/{user_id}
Read the User object and include links to lists of the sub-objects i.e.
api.example.com/user/{user_id}/channels
api.example.com/user/{user_id}/members
the /channels call would return a list of channel resources in the form (as above):
api.example.com/channel/{channel_id}
PROS: primary resources expose where to go to get the subordinates but not what they are (more RESTful?), no requirement to get the subordinates up front, the subordinate list generators (/channels and /members) provide interfaces (method like) making the response more service like.
CONS: three calls now required to fully populate the object
Option 4 - (re)consider the object design for REST
I am re-using the [existing] application object hierarchy and trying to apply it to REST - or perhaps more directly, provide an API interface to it.
Perhaps the REST object hierarchy should be different, or perhaps the new RESTful thinking is exposing limitations of the existing object design.
Any thoughts on the above welcomed.
There's no reason not to combine these.
api.example.com/user/{user_id} – return a user representation
api.example.com/channel/{channel_id} – return a channel representation
api.example.com/user/{user_id}/channels – return a list of channel representations
api.example.com/user/{user_id}/channel_list – return a list of channel ids (or links to their full representations, using the above links)
When in doubt, think about how you would display the data to a human user without "API" concerns: a user wants both index pages ({user_id}/channel_list) and full views ({user_id}/channels).
Once you have that, just support JSON instead of (or in addition to) HTML as the representation format, and you have REST.
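A small sketch of the index view versus the full view, assuming a hypothetical Channel record and leaving serialization to whatever JSON library is in use. Only the URL shapes come from the list above; everything else is illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChannelRepresentations {

    // Hypothetical domain object; the real Channel will have more fields.
    public record Channel(int id, String name) {}

    // Index view for /user/{user_id}/channel_list: links only.
    public static List<String> channelList(String baseUrl, List<Channel> channels) {
        List<String> links = new ArrayList<>();
        for (Channel channel : channels) {
            links.add(baseUrl + "/channel/" + channel.id());
        }
        return links;
    }

    // Full view for /user/{user_id}/channels: each channel plus its own link.
    public static List<Map<String, Object>> channels(String baseUrl, List<Channel> channels) {
        List<Map<String, Object>> representations = new ArrayList<>();
        for (Channel channel : channels) {
            Map<String, Object> representation = new LinkedHashMap<>();
            representation.put("id", channel.id());
            representation.put("name", channel.name());
            representation.put("self", baseUrl + "/channel/" + channel.id());
            representations.add(representation);
        }
        return representations;
    }

    public static void main(String[] args) {
        List<Channel> channels = List.of(new Channel(1, "news"), new Channel(2, "sport"));
        System.out.println(channelList("http://api.example.com", channels));
        System.out.println(channels("http://api.example.com", channels));
    }
}
```

Both views are built from the same objects; which one a client gets depends only on the resource it asks for.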
The best advice I can give is to try and avoid thinking about your REST api as exposing your objects. The resources you create should support the use cases you need. If necessary you might create resources for all three options:
api.example.com/completeuser/{id}
api.example.com/linkeduser/{id}
api.example.com/lightweightuser/{id}
Obviously my names are a bit goofy, but it really doesn't matter what you call them. The idea is that you use the REST api to present data in the most logical way for the particular usage scenario. If there are multiple scenarios, create multiple resources, if necessary. I like to think of my resources more like UI models rather than business entities.
I would recommend Restful Objects, which is a standard for exposing a domain model through a RESTful interface.
The idea of Restful Objects is to provide a standard, generic RESTful interface for domain object models, exposing representations of their structure using JSON and enabling interactions with domain object instances using HTTP GET, POST, PUT and DELETE.
According to the standard, the URIs will be like:
api.example.com/object/user/31
api.example.com/object/user/31/properties/username
api.example.com/object/user/31/collections/channels
api.example.com/object/user/31/collections/members
api.example.com/object/user/31/actions/someFunction
api.example.com/object/user/31/actions/someFunction/invoke
There are also other resources
api.example.com/services
api.example.com/domain-types
The specification defines a few primary representations:
object (which represents any domain object or service)
list (of links to other objects)
property
collection
action
action result (typically containing either an object or a list, or just feedback messages)
and also a small number of secondary representations such as home, and user
This is interesting as you’ll see that representations are fully self-describing, opening up the possibility of generic viewers to be implemented if required.
Alternatively, the representations can be consumed directly by a bespoke application.
Here are my conclusions after many hours of searching and with input from the responders here:
Where I have an object that is effectively a multi-part object, I need to treat it as a single resource. Thus if I GET the object, all the subordinates should be present. This is required so that the resource is cacheable. If I partially load the object (and provide an ETag stamp), other requestors may receive a partial object when they expected a full one. Conclusion: objects should be fully populated if they are being made available as resources.
Associated object relationships should be made available as links to other (primary) resources. In this way the objects are discoverable by traversing the API.
Also, the object hierarchy that made sense for the main application site may not appear to be what you need in order to act in a RESTful manner, but this is more likely revealing problems with the existing hierarchy. Having said this, the API may require more specialised use cases than had previously been envisaged, and specialised resources may be required.
Hope that helps someone

RESTful POSTS, do you POST objects to the singular or plural Uri?

Which one of these URIs would be more 'fit' for receiving POSTs (adding product(s))? Are there any best practices available or is it just personal preference?
/product/ (singular)
or
/products/ (plural)
Currently we use /products/?query=blah for searching and /product/{productId}/ for GETs PUTs & DELETEs of a single product.
Since POST is an "append" operation, it might be more Englishy to POST to /products, as you'd be appending a new product to the existing list of products.
As long as you've standardized on something within your API, I think that's good enough.
Since REST APIs should be hypertext-driven, the URI is relatively inconsequential anyway. Clients should be pulling URIs from returned documents and using those in subsequent requests; typically applications and people aren't going to need to guess or visually interpret URIs, since the application will be explicitly instructing clients what resources and URIs are available.
Typically you use POST to create a resource when you don't know the identifier of the resource in advance, and PUT when you do. So you'd POST to /products, or PUT to /products/{new-id}.
With both of these you'll return 201 Created, and with the POST additionally return a Location header containing the URL of the newly created resource (assuming it was successfully created).
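The POST-versus-PUT rule can be sketched without any HTTP framework. The in-memory ProductCollectionSketch below is hypothetical and only models status codes and the Location header. Returning 204 when a PUT replaces an existing product is one common choice that this sketch assumes; it is not something stated above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProductCollectionSketch {

    // Minimal stand-in for an HTTP response: status code plus optional Location.
    public record Response(int status, String location) {}

    private final Map<String, String> products = new LinkedHashMap<>();
    private int nextId = 1;

    // POST /products: the server chooses the identifier and reports it in Location.
    public Response post(String body) {
        while (products.containsKey(String.valueOf(nextId))) {
            nextId++; // skip ids already taken by a client PUT
        }
        String id = String.valueOf(nextId++);
        products.put(id, body);
        return new Response(201, "/products/" + id);
    }

    // PUT /products/{id}: the client chooses the identifier.
    // 201 when the product is created, 204 (assumed here) when it is replaced.
    public Response put(String id, String body) {
        boolean created = products.put(id, body) == null;
        return new Response(created ? 201 : 204, null);
    }

    public static void main(String[] args) {
        ProductCollectionSketch api = new ProductCollectionSketch();
        System.out.println(api.post("widget"));        // status 201, Location /products/1
        System.out.println(api.put("gadget-7", "v1")); // status 201, no Location
        System.out.println(api.put("gadget-7", "v2")); // status 204, no Location
    }
}
```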
In RESTful design, there are a few patterns around creating new resources. The pattern that you choose largely depends on who is responsible for choosing the URL for the newly created resource.
If the client is responsible for choosing the URL, then the client should PUT to the URL for the resource. In contrast, if the server is responsible for the URL for the resource then the client should POST to a "factory" resource. Typically the factory resource is the parent resource of the resource being created and is usually a collection which is pluralized.
So, in your case I would recommend using /products
You POST or GET a single thing: a single PRODUCT.
Sometimes you GET with no specific product (or with query criteria). But you still say it in the singular.
You rarely work with plural forms of names. If you have a collection (a Catalog of products), it's one Catalog.
I would only post to the singular /product. It's just too easy to mix up the two URLs and get confused or make mistakes.
As many have said, you can probably choose any style you like as long as you are consistent. However, I'd like to point out some arguments on both sides; I'm personally biased towards singular.
In favor of plural resource names:
simplicity of the URL scheme, as you know the resource name is always plural
many consider this convention similar to how database tables are addressed and consider this an advantage
seems to be more widely adopted
In favor of singular resource names (this doesn't exclude plurals when working on multiple resources):
the URL scheme is more complex but you gain more expressivity
you always know whether you are dealing with one or more resources based on the resource name, as opposed to checking whether the resource has an additional id path component
plural is sometimes harder for non-native speakers (when it is not simply an "s")
the URL is longer
the "s" seems to be redundant from a programmer's standpoint
it is just awkward to consider the path parameter as a sub-resource of the collection, as opposed to considering it for what it is: simply an ID of the resource it identifies
you can apply the filtering parameters only where they are needed (endpoint with plural resource name)
You could use the same URL for all of them and use the MessageContext to determine what type of action the caller of the web service wanted to perform.
No language was specified, but in Java (JAX-WS) you can do something like this.
// imports: javax.annotation.Resource, javax.xml.ws.WebServiceContext,
//          javax.xml.ws.handler.MessageContext
@Resource
WebServiceContext ws_ctx; // injected by the JAX-WS runtime
// inside the web method:
MessageContext ctx = ws_ctx.getMessageContext();
String action = (String) ctx.get(MessageContext.HTTP_REQUEST_METHOD);
if ("GET".equals(action)) {
    // do something
} else if ("POST".equals(action)) {
    // do something
}
That way you can check the type of request that was sent to the web service and perform the appropriate action based on the request method.