We are using AEM 6.3 and we have need to implement Content search functionality in our project.We implemented it using Search API provided but issue is that Search API take only request parameter and hence we are not able to cache the search result page.
Did try to use selector or set request attributes (searchTerm and Tags)and than create Search Client instance and call getResult method but it doesn't return any results.
As we need to do content search across pages and mutilple properties can we use QueryBuilder API here and achieve the same result provided by Search API
Search API is highly performant and the caching is not the best strategy for using searches as you might get stale results. In practice, you end up reducing the cache lifetime and end up at the same problem.
You should look more into optimising your searches with proper indexes over targeted content etc.
However, if you really want to cache the search results you could look into 3rd party solutions but I would highly discourage it in the context of AEM as there are better solutions like:
Offloading searches to a dedicated publisher. You can do it via your LB or dispatcher rules.
Optimise searches by optimising indexes. Remember, indexes don't hit your repository.
Worst case if you really struggle with performance, look into AEM Solr integration as Solr has good caching. You can also achieve same with ElasticSearch or other DB. Just be warned that plumbing and TCO is not free for this.
Related
I have setup my application for REST access as per documentation. The default routes are working well. I am able to retrieve, update and delete records, however, I am not sure how I could filter data sending parameters to the controller. I wonder if I can do that using querystring or if there is a better way to accomplish that. Please can someone give me directions?
Reads about the Request object in the manual. And use the Search Plugin for filtering.
The search plugin comes with a lot of documentation that explains how to use it as well.
Your question is so generic that a proper answer would end up in a whole article - which I'm obviously not going to write, there is enough information available on HTTP requests and query params. Use Google or read these links:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages
https://www.w3.org/Protocols/HTTP/Request.html
Is there any way to create index through API? I could not find any documentation about index creation using S&P API.
I do not want to let S&P to crawl my site to create index rather I would prefer to use APIs to do it.
No, there is no API for S&P that can be leveraged for the indexing purposes because S&P is basically a crawling engine.
For the sake of simplicity, think of it as a private search engine crawler where you can configure several aspects of crawling like content type, frequency, paths etc. using an interface.
You can, however, use the scripted index features to allow incremental indexing of your site or provide more finer control on crawling. This will not eliminate crawling as the S&P needs to crawl the site in order to extract and index the information. More details about this feature can be found here:
https://marketing.adobe.com/resources/help/en_US/snp/c_about_scripted_index.html
Please note that this is more of a configuration and set of commands rather than an API.
I am very confused about the correct or recommended mechanism to use for accessing google fusion tables APIs in app scripts. There seem to be two methods with examples but no discussion about which is preferred or why. Is one of these interfaces newer and preferred while the other is dying? Is one obsolete or more restricted in what it can do?
Method 1 is the REST API described here
https://developers.google.com/fusiontables/docs/v2/sql-reference#Select
Method 2 is a set of library functions sort of described here under the Apps Script/Google Advanced Services:
https://developers.google.com/apps-script/advanced/fusion-tables
For example, using the REST api to do a dql query, we end up with something like this:
function runSQL(sql){
var getDataURL = 'https://www.googleapis.com/fusiontables/v1/query?sql='+sql;
var dataResponse = UrlFetchApp.fetch(getDataURL,getUrlFetchOptions()).getContentText();
return dataResponse;
}
And using the advanced API we use something like this:
result = FusionTables.Query.sql(sql, { hdrs: false });
The REST API seems much harder to use, requireing complex oAuth and developer keys to be configured in advance and coded into the application while the Advanced Services API harvests all this behind the scenes and makes for simple API calls like I show here.
I have seen numerous examples using each of the above with no hint as to why one author chose her mechanism instead of the other.
Your help is greatly appreciated.
The service within app-script is a work in progress, so the full functionality of the API might not be fully supported at the moment. As you mentioned though, the big advantage of the service over the REST API is that you do not have to handle the OAuth flow, as you only need to enable it on your script (as stated here).
The Apps Script "advanced service" implementation still lacks some advanced functionality (like alt=media format queries or multipart / resumable uploads) -- if it actually has those features, it lacks extremely basic documentation of them, to the point that the Apps Script editor autocomplete is unaware of them. The tradeoff of these functionality gaps is that you don't need to handle keys, request building, etc.
So, if you're doing simple sql select / importRows work, the Advanced Service should be able to cover almost all your needs. If you need to delete from your FusionTables, you might want to consider setting up the REST API - because deleting is 1 record per query, the better way to delete is to instead "download what you want to keep, then re-upload it back via replaceRows."
(This worked for me for a while, but eventually what I was keeping outgrew the Apps Script service's limitations and I began receiving Empty Response errors from the call to replaceRows. My remedy was to perform my record maintenance tasks via the REST API, where I can specify resumable uploads, timeouts, etc., while more "normal" interactions are done through the Advanced Service.)
We're currently using MarkLogic's dls functions to handle document versioning, and are trying to switch over to use the REST API. The document endpoint doesn't use versioning by default, and I can't figure out a way to get it to. I'm referring to the dls functions for keeping multiple document versions, btw, not the new "content versioning" the REST API documentation mentions. In fact, the only reference to document versions in the REST API docs seems to be a line saying that content versioning isn't the same thing.
The only solution we've been able to come up with is to write a custom endpoint that duplicates everything the existing document endpoint's PUT does, plus document management. I'd rather avoid that if possible, especially when looking at MarkLogic 7's partial document updates. We're using MarkLogic 6 now, if it matters, but it doesn't look like 7 has any new features related to this.
Is there a way to do this using MarkLogic's existing endpoints?
You can write a REST API extension that automates the DLS operations. See http://docs.marklogic.com/guide/rest-dev/extensions. You will largely end up duplicating a lot of the same things, but this will plug into the existing endpoints.
Yes, MarkLogic 7 added content versioning to make refreshing of caches easier. And unfortunately, the DLS library hasn't been integrated into the REST api so far. You can file a feature request at support if you like.
In the mean time, the best suggestion I can give is use a separate route to do document updates using DLS (your current route or a limited custom endpoint that only supports the DLS functions you need for doc updates), and do anything else (as far as possible) using the existing REST api. You can look at this other stackoverflow question to see how to limit searches to the latest doc versions:
Marklogic REST API search for latest document version
HTH!
A member of MarkLogic has put together a REST extension to provide better DLS support in the REST-api. Hopefully that makes working with DLS over the MarkLogic REST-api a lot easier:
https://github.com/sanjuthomas/marklogic-dls-rest-extension
HTH!
What are the options in a web api to indicate that the returned data is paged and there is further data available.
ASP.Net Web API with OData uses a syntax similar to the following:
{
"odata.metadata":"http://myapi.com/api/$metadata#MyResource","value":[
{
"ID":1,"Name":"foo"
},
...
{
"ID":100,"Name":"bar"
}
],"odata.nextLink":"http://myapi.com/api/MyResource?$skip=20"
}
Are there any other ways to indicate the link to the next/previous 'page' of data without using a metadata wrapper around the results. Can this be achieved by using custom response headers instead?
Let's take a step back and think about WebAPI. WebAPI in essence is a raw data delivery mechanism. It's great for making an API and it elevates separation of concerns to a pretty good height (specifically eliminating UI concerns).
Using Web API, however, doesn't really change core of the issue you are facing. You're asking "how do I want to query my data store in an performant manner and return the data to the client efficiently?" Your decisions here really parallel the same question when building a more traditional web app.
As you noted, oData is one method to return this information. The benefit here is it's well known and well defined. The body of questions/blogs/articles on the topic is growing rapidly. The wrapper doesn't add any meaningful overhead.
Yet, oData is by no means the only way you can do this. We've had to cope with this since software has been displaying search results. It's tough to give you specific advice without really understanding your scenario. Here are some questions that bubbled up as I read your question :
Are your results sets huge but users only see the first one or two
pages?
Or do user tend to page through all of the results?
Are pages of results limited (like 20 or 50 per page) or 100's/ 1000's ?
Does the data set shift rapidly, so records are added as the user is
paging?
Are your result sets short and adding columns that repeat tolerable?
Do you have enough control over the client do do something out of band -- like custom HTTP headers, or a separate HTTP request that just asks for a query summary?
There really are hundreds of options depending on your needs. I don't know what you're using as a data store, but I wrote a post on getting row count efficiently. The issues there are very germane here, albeit from the DB perspective. It might help you get some perspective.