I'm trying to set up autocomplete/suggestions on my site's search form, using Elasticsearch's completion suggester feature.
I have a list of products, which are grouped by categories (on multiple levels). The search feature should be able to suggest category names, which are of more interest to users than direct products.
Several of these categories have the same name but a different parent (e.g. 'milk' under parent category 'dairy products' and 'milk' under category 'baby'). When the user selects a category suggestion, she's redirected to another page, with more specific results than a plain search would return.
Additional metadata (url to redirect to, parent category id/name) are added in the payload field.
I use the output field to return normalized suggestions for different inputs. As stated in the documentation:
"The result is de-duplicated if several documents have the same output,
i.e. only one is returned as part of the suggest result."
But as explained, my outputs may have the same value, while being different results, as they will link to different pages. It is also possible in the future that different category levels will yield different actions.
I am reluctant to differentiate things by adding the full string (i.e. "milk in dairy products") as the output, because:
1. The parent category is conceptually not the output itself but a related metadata.
2. I intend to apply some formatting to the results, and this would force me to parse the output string to add HTML tags to it.
So, is it possible to deactivate the de-duplication?
One workaround I'm thinking of, if it's not possible, is to store a stringified JSON object in the output, with all the data I'll need: both what is displayed in the search form and the metadata currently in the payload. But I'd rather look into existing configuration before resorting to that.
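The stringified-JSON workaround described above could look roughly like the sketch below. It only shows how the output string would be built and read back; the field names (`name`, `parent`, `url`) and the example URLs are hypothetical, and nothing here talks to Elasticsearch.

```python
import json


def encode_output(name, parent, url):
    """Pack the display text and its metadata into one string for `output`."""
    # sort_keys keeps the string stable, so truly identical suggestions still
    # de-duplicate, while 'milk' under two parents gives two distinct outputs.
    return json.dumps({"name": name, "parent": parent, "url": url}, sort_keys=True)


def decode_output(output):
    """Recover the structured suggestion on the client side."""
    return json.loads(output)


# Hypothetical category data, for illustration only.
dairy_milk = encode_output("milk", "dairy products", "/categories/dairy-products/milk")
baby_milk = encode_output("milk", "baby", "/categories/baby/milk")
```

Because the two strings differ, they would no longer be treated as the same output, and the front end can format `name` and `parent` separately without parsing a display string.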
Related
We are migrating from MarkLogic DHF4 to DHF5 (Data Hub Framework)
We have a scenario where, for one entity and depending on certain criteria, we need to create more than one harmonized document from a single input document. This was possible in the legacy flow, where we called the write:write method.
But in the DHF5 implementation we are supposed to return only one content object for each input document processed in the main module, main.sjs.
Is there a way to create multiple harmonized documents from one input document in DHF5?
We expect that one input document should be able to produce more than one harmonized document in a single step (in main.sjs).
Your custom module is expected to return a Content object or an array of them. Each Content object must have uri, value, and context properties and may also specify provenance.
Based on this, your one input can be turned into multiple outputs.
See the "Required Outputs" section on this page for more details.
Your module must return the following:
For a Custom-Ingestion step, a Content object.
For all other custom steps, a Content object or an array of Content objects.
Documentation and Code Sample
In the documentation above there are only two parameters.
However, in the code example they are using fields as parameters.
I tried searching the docs but I'm still unclear on how fields and params are different.
Are they completely interchangeable or are there specific times to use each?
Fields are the specific data elements you can request about an object.
A user’s e-mail address, a post’s message, a page’s cover photo – those are fields.
Parameters allow you to limit the selection of data, based on specific criteria.
You request a page’s feed, but you only want posts from a specific time frame - then you use parameters like since and until, for example.
If you are familiar with basic SQL, you could use this as an analogy: Fields would be the column names you specify after SELECT; Parameters would be the WHERE clause.
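To make the SELECT/WHERE analogy concrete, here is a minimal sketch that builds (but does not send) a feed request URL. The page ID, the chosen fields, and the date values are placeholders, and the helper name `build_feed_url` is made up for this example.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_feed_url(page_id, fields, since=None, until=None):
    """Build a feed URL: `fields` picks the columns, since/until filter the rows."""
    # Fields: what to return about each post (the SELECT part of the analogy).
    query = {"fields": ",".join(fields)}
    # Parameters: which posts to return at all (the WHERE part of the analogy).
    if since is not None:
        query["since"] = since
    if until is not None:
        query["until"] = until
    # safe="," keeps the comma-separated field list readable in the URL.
    return f"{GRAPH_BASE}/{page_id}/feed?{urlencode(query, safe=',')}"


# Placeholder page ID and dates, for illustration only.
url = build_feed_url("12345", ["id", "message"], since="2015-01-01", until="2015-01-31")
```

Both kinds of input end up in the query string, which is probably why they look interchangeable at first glance; the difference is in what each one controls.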
I reached out to Facebook Support and this was their answer:
Parameters are inputs to an API that specify constraints on the range
of data that will be returned (time ranges, specific ids etc.). Fields
are what are returned by the API, if you want specific fields to be
returned, these can be specified by adding something like
"fields=id,name..." to an API.
Parameters and fields are not interchangeable.
Suppose there's USERS and ORDERS
for a specific user's order list
You could do
/user/3/order_list
/order/?user=3
Which one is preferred and why?
Optional parameters tend to be easier to put in the query string.
If you want to return a 404 error when the parameter value does not correspond to an existing resource then I would tend towards a path segment parameter. e.g. /customer/232 where 232 is not a valid customer id.
If, however, you want to return an empty list when the parameter value is not found, then use a query string parameter, e.g. /contacts?name=dave
If a parameter affects an entire URI structure then use a path e.g. a language parameter /en/document/foo.txt versus /document/foo.txt?language=en
Unique identifiers tend to belong in a path rather than in a query parameter.
Paths are friendlier for search engines, browser history, and navigation.
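The 404-versus-empty-list distinction above can be sketched with a toy request handler. Everything here is an assumption made for illustration: the in-memory data, the `handle_get` function, and the `(status, body)` return shape are not tied to any particular framework.

```python
from urllib.parse import urlsplit, parse_qs

# Toy in-memory data, for illustration only.
CUSTOMERS = {"1": {"name": "alice"}}
CONTACTS = [{"name": "alice"}, {"name": "bob"}]


def handle_get(url):
    """Return (status, body) for a GET request against the toy data above."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    # Path segment parameter: an unknown id means the resource does not exist.
    if len(segments) == 2 and segments[0] == "customer":
        customer = CUSTOMERS.get(segments[1])
        if customer is None:
            return 404, None
        return 200, customer

    # Query string parameter: the collection exists, the filter may match nothing.
    if segments == ["contacts"]:
        wanted = parse_qs(parts.query).get("name")
        matches = [c for c in CONTACTS if wanted is None or c["name"] in wanted]
        return 200, matches

    return 404, None
```

With this data, `/customer/232` yields a 404, while `/contacts?name=dave` yields a 200 with an empty list.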
When I started to create an API, I was thinking about the same question.
A video from Apigee helped me a lot.
In a nutshell when you decide to build an API, you should decide which entity is independent and which is only related to someone.
For example, if you have a specific endpoint for orders with create/update/delete operations, then it will be fine to use a second approach /order/?user=3.
On the other hand, if orders have only one representation, depend on a user, and don't have any special interactions of their own, then you could use the first approach.
There is also a nice article about best practices.
The whole point of REST is resources. You should try and map them as closely as possible to the actual requests you're going to get. I'd definitely not call it order_list because that looks like an action (you're "listing" the orders, while GET should be enough to tell you that you're getting something)
So, first of all, I think you should have /users instead of /user. Then consider it as a tree structure:
A seller (for lack of a better name) can have multiple users
A user can have multiple orders
An order can have multiple items
So, I'd go for something like:
The seller can see its users with yourdomain.com/my/users
The details of a single user can be seen with yourdomain.com/my/users/3
The orders of a single user can be seen with yourdomain.com/my/users/3/orders
The items of a single order can be seen with yourdomain.com/my/users/3/orders/5
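The tree of URLs above could be expressed as a small route table. This is only a sketch using regular expressions; the handler names (`list_users`, `show_user`, and so on) are hypothetical labels, not part of any framework.

```python
import re

# One pattern per level of the tree; handler names are hypothetical labels.
ROUTES = [
    (re.compile(r"^/my/users$"), "list_users"),
    (re.compile(r"^/my/users/(?P<user_id>\d+)$"), "show_user"),
    (re.compile(r"^/my/users/(?P<user_id>\d+)/orders$"), "list_orders"),
    (re.compile(r"^/my/users/(?P<user_id>\d+)/orders/(?P<order_id>\d+)$"),
     "list_order_items"),
]


def resolve(path):
    """Return (handler_name, path_params) for a path, or None if nothing matches."""
    for pattern, name in ROUTES:
        match = pattern.match(path)
        if match:
            return name, match.groupdict()
    return None
```

Each deeper level simply extends the path of its parent, so the URL itself shows which user an order belongs to.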
GET https://api.website.com/v1/project/employee;company-id={company-id},title={title-id}?non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
where:
company-id is mandatory,
title is optional
name/value can be any filter criteria.
Is there a better way to define the interface?
This API is not supposed to create an employee object. It is for getting an array of employee objects that belong to a particular company and have a particular title, plus any other filter criteria.
I don't know if there is a better way, because it depends often on the technology you use and its idioms.
However, here are two different URI designs that I like (and why):
#1 GET https://api.website.com/v1/project/employee/{company-id}?title={title-id}&non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
#2 GET https://api.website.com/v1/project/company/{company-id}/employee?title={title-id}&non-smoker={true|false}&<name1>=<value1>&<name2>=<value2>&<name3>=<value3>
As you can see, in both examples I extracted company-id from the query string. I prefer to put mandatory parameters in the path info to distinguish them. Then, in the second URI, the employee resource is nested in the company. That way you can easily guess that you can retrieve all employees of a specific company, which is not obvious in the first example.
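As a rough sketch of design #2, a client-side helper could put the mandatory company id in the path and everything optional in the query string. The helper name `employees_url` and the sample values are made up for this example.

```python
from urllib.parse import urlencode, quote

BASE = "https://api.website.com/v1/project"


def employees_url(company_id, filters=None):
    """Build the design #2 URI: mandatory id in the path, optional filters in the query."""
    # The company id is a required positional argument, mirroring its place in the path.
    path = f"{BASE}/company/{quote(str(company_id), safe='')}/employee"
    # Optional filters (title, non-smoker, ...) are skipped when not provided.
    query = urlencode({k: v for k, v in (filters or {}).items() if v is not None})
    return f"{path}?{query}" if query else path


# Placeholder ids and filter values, for illustration only.
all_employees = employees_url(42)
filtered = employees_url(42, {"title": "7", "non-smoker": "true"})
```

Calling the helper without filters gives the plain collection URI, which is what makes "all employees of a company" easy to guess in this design.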
This api is supposed to GET employee objects that satisfy the given criteria of belonging to a particular company, having particular job title and some other filter criteria.
Personally I would just design your URI as http://acme.com/employee/?company=X&title=Y&non-smoker=Z&T=U. I wouldn't write "in stone" that the company is mandatory: your API will be easier to change.
However, you should consider that a few "big" requests are far faster than plenty of small ones. Moreover, URI representations can be effectively cached. Therefore it is often better to have URIs based on IDs (since there is a better chance that they will be requested again).
So you could get the complete employee list of a company (plus other data about the company itself) with http://acme.com/company/X and then filter it client-side.
Are you creating a new employee object? If so then a POST (create) is more appropriate. A good clue is all the data you're pushing in the URL. All that should be in the body of the POST object.
I want to be able to grab data from multiple tags/folders in a user's Google Reader.
I know how to do one http://www.google.com/reader/atom/user/-/label/SOMELABEL but how would you do two or three or ten?
Doesn't look like you can get multiple tags/folders in one request. If it's feasible you should iterate over the different tags/folders and aggregate them in your application.
[edit]
Since it looks like you have a large list of tags/folders you need to query, an alternative is to get the full list of entries, then sort out the ones the user wants. It looks like each entry has a category element that will tell you what tag is associated with it. This might be feasible in your case.
(Source: http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI)
(Source: http://www.google.com/reader/atom/user/-/state/com.google/starred)
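The "fetch everything, then filter by category" idea could be sketched as follows. The sample feed is invented for illustration, and the exact shape of the category values (`user/-/label/...` in a `term` attribute) is an assumption, so check it against a real response before relying on it.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Invented sample feed; the category term format is an assumption.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>A</title><category term="user/-/label/news"/></entry>
  <entry><title>B</title><category term="user/-/label/sports"/></entry>
  <entry><title>C</title><category term="user/-/label/news"/>
         <category term="user/-/label/tech"/></entry>
</feed>"""


def titles_with_labels(feed_xml, wanted_terms):
    """Return titles of entries carrying at least one of the wanted category terms."""
    wanted = set(wanted_terms)
    root = ET.fromstring(feed_xml)
    titles = []
    for entry in root.findall(f"{ATOM}entry"):
        terms = {c.get("term") for c in entry.findall(f"{ATOM}category")}
        if terms & wanted:  # keep the entry if any of its tags was requested
            titles.append(entry.findtext(f"{ATOM}title"))
    return titles


selected = titles_with_labels(SAMPLE_FEED, ["user/-/label/sports", "user/-/label/tech"])
```

The aggregation happens in the application, so the number of tags the user picks does not change the number of requests.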
I think you cannot get aggregated data the way you hope to. If you think about it, even Google lets you browse folders or tags one at a time, and does not aggregate a subset of them.
You can choose to have a list of all the items (for each one of their available statuses) or a list of a particular tag/folder.
You could do it in 2 requests. First you need to perform a GET request to http://www.google.com/reader/stream/items/ids. It supports several parameters like
s (required parameter; stream id to fetch; may be defined more than one time),
n (required; number of items to fetch)
r for ranking (optional)
and others (see more under /ids section)
And then you should perform a POST request (this is because there could be a lot of ids, and therefore the request could be cut off) to http://www.google.com/reader/api/0/stream/items/contents. The required parameter is i which holds the feed item identifier (could be defined more than once).
This should return data from several feeds (as returned for me).
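The two-request flow described above could be prepared like this. The sketch only builds the GET URL and the POST body from the parameters named in the answer (`s`, `n`, `i`); it sends nothing, and the stream and item ids are placeholders.

```python
from urllib.parse import urlencode

IDS_URL = "http://www.google.com/reader/stream/items/ids"
CONTENTS_URL = "http://www.google.com/reader/api/0/stream/items/contents"


def build_ids_request(stream_ids, count):
    """Step 1: GET URL listing item ids; `s` is repeated once per stream."""
    params = [("s", stream_id) for stream_id in stream_ids] + [("n", str(count))]
    return f"{IDS_URL}?{urlencode(params)}"


def build_contents_body(item_ids):
    """Step 2: form-encoded POST body for CONTENTS_URL; `i` is repeated once per item."""
    return urlencode([("i", item_id) for item_id in item_ids])


# Placeholder stream and item ids, for illustration only.
ids_url = build_ids_request(["user/-/label/a", "user/-/label/b"], 20)
body = build_contents_body(["1", "2"])
```

Passing a list of pairs to `urlencode` is what allows the same key to appear several times, which is how more than one tag or folder ends up in a single request.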