OR query in Algolia Web Interface - algolia

The documentation states that
To OR refinements, you must use nested arrays. For example, to refine on “Business & Investing” books written by Jason Fried or David Heinemeier Hansson:
I can't get OR-queries to work in the Algolia web interface. Whenever I add a second parameter I get no results.
E.g. I want to have all entries with difficulties easy and medium:
difficulty: medium, difficulty: easy
Is it currently not supported or do I use the wrong syntax?
Thank you!

Currently you cannot use disjunctive facets (OR filters) directly from Algolia web interface; the syntax you're referring to is for conjunctive facets (AND filters).

Related

How do I use the Github Search API to look for repositories, using multiple queries?

As per the official documentation - "*For example, if you want to search for popular Tetris repositories written in assembly code, your query might look like this:
q=tetris+language:assembly&sort=stars&order=desc
This query searches for repositories with the word tetris in the name, the description, or the README. The results are limited to repositories where the primary language is assembly. The results are sorted by stars in descending order, so that the most popular repositories appear first in the search results.*"
However, I'd like to use more than one keyword to avoid false positives. For instance I want to count repositories that make use of the TensorFlow framework for meta-learning. However, I don't see there is any code example for this. Could someone please help with that?
Here is what I did using the THUNDER CLIENT extension in VSCode, but I am not sure if it's correct -
https://api.github.com/search/repositories?q=Tensorflow, meta-learning+language:python
I also found this question, where the poster says to use "+". However, I don't see any official documentation anywhere.

Describing "greater than"-filters search in a REST API URL?

I'm designing a REST API where the /widgets endpoint can be filtered to only show widgets with a certain number of connections. This seems like a natural design:
/widgets?connections=4
I also want to allow filtering for widgets using lesser than and greater than, however. These URL designs seem wrong as they don't follow the classic query string pattern or appear misleading:
/widgets?connections>2
/widgets?connections=>2
What is the normal way of designing this kind of filter? I also need to be able to combine filters, e.g. "more than two connections and exactly one screen".
I've read this related question: REST URL design for greater than, less than operations, but it is not the same as it relates to pagination and ID, and does not contain a neat answer for combined filters.
REST does not give you an exact solution, it just says that your should use standards to build an uniform interface if there are available standards. If not, then it is up to you, anyways it must be documented for the client developers.
Here what you are doing is developing a complete query language for the URI. It would be good to check what exactly you need, because if there is a query language standard, then supporting it completely is just too much work. Afaik. Odata has something you need and there are other conventions, for example RQL is a very old one. With a little search there are other ones too: w x y z. I guess there are many others too. I would choose one of these and implement only what I need from it or look for an existing implementation.

Watson Assistant: Can I define Intent using Entities in the Examples?

How to I create an #Intent which looks something like this:
How much is a #ProductType?
Whereas the #ProductType is an simple Entity which consists of:
Soft Drinks: Coke, Pepsi, Sprite, Fanta
Fruits: Apple, Banana, Watermelon
I tried adding an Intent with above settings, but it doesn't seem to work. Is such ability natively supported in IBM Watson? Or otherwise, do I need to manually handle in the Dialog, using Conditions and stuffs? Please kindly advise.
The training is based on regular language and typical sentences or phrases. So #ProductType is not what you want in the phrase, but any of the fruits or drinks.
By defining the entities, Watson Assistant later learns the connection and to identify the entities and intents.
To get started, you define the intents and entities. Both can be imported from lists. Then you add the dialog which references the different types.
This blog should give insight to all the ways to train an entity and how it is used within intents.
https://medium.com/ibm-watson/all-about-entities-dictionaries-and-patterns-with-watson-assistant-part-1-5ef7254df76b
There are a number of possible pipelines you can choose from.
1. Indirect references: this is the preferred method.
Use natural language in your intent training data. "I want to buy a pear"
Watson will automatically see the other values you have related to pear and use those as intent training as well. This will be the fastest and simplest way to manage your data
2. Direct references: this should only be used if absolutely necessary
Directly reference the entity in your intent data. "I want to buy an #pear"
Nothing is done in the UI to tell you this works, but it does. This tells Watson the entity is a very important term and will increase the weight, as well as reference all synonyms with high weight. This is more effort for you to go through your entire workspace and relabel everything this way, hence why it is not recommended unless absolutely necessary. By doing this, you also tell watson that when the system sees various fruits without the # symbol, to ignore them as entities which is not ideal
3. Contextual entities. This is highlighting them like in your screenshot.
Note the UI has been updated so there is no an annotation mode instead of just highlighting. This builds a model around the entity, and is good for things like names or locations, but not necessary for a small list of items like crayons in a box, or fruit in a store. This will ignore all of the dictionary values youve created and only look at the model. It should be used according to the blog above when the use case is ideal.
What #data_henrik answered was partially correct. But it doesn't seem like Watson Assistant "automatically" learns the preferred #Entity just by simply inputting the pure (plain-text) Examples into the #Intent. In fact, that step was required. But we still need to do one more step.
After keying in the good plain-text Examples into the #Intent, we then still need to "right click" on the text-string of the possible #Entity entry, and then choose (teach Watson) the correct #Entity name from the dropdown list appeared.
Only then Watson starts to understand such; this #Intent uses that #Entity, I suppose.
Thank you #data_henrik, and appreciate your hint.

Standard format for REST pagination, field selection, querying?

When designing a REST API, following guidance such as 10 Best Practices for Better RESTful API, there seem to be all sorts of ways to provide a query syntax, pagination, selecting fields to return, etc.
For example, some ways to do pagination:
/orders?max=20&start=100
/orders?per_page=20&page=5
Some ways to provide a query interface:
/orders?q=value>20
/orders?q={'value': 'gt 20'}
Are there any standards for how to design an API that offers these features? If not, standards in development or best practice guidelines would be useful.
When researching this for the Watson Discovery and Assistant APIs, we weren't able to find any widely adopted conventions for filtering or paging, although there are many different conventions.
Some considerations for which convention you use:
Do you need compound clauses in your query? If you want to be able to express a > 10 || b < 10, then you need a string syntax or structured JSON structure to represent the more complicated queries, which will likely be a usability challenge for your users, and so is preferable to avoid if you don't really need the flexibility. In general, the simpler you can keep the requirements, the easier the API will be to learn and use, while potentially at the expense of flexibility. For example, if it turns out that the created date is the only field that users actually care about doing inequality filtering on, you could have explicit begin_date and end_date filter parameters instead of allowing inequality comparisons on all fields.
For pagination, do you have frequently changing data? If so, paging by offset may give you unstable results. For example, paging through logs that are actively being created, sorted by most recent, would cause you to see duplicate items. To avoid this, the server can return a token that represents the next page. This token can either be a lookup value or directly encode the information necessary to identify the values of the next item in the potentially changing list. Microsoft's API guidelines contain examples of both token and offset based paging, and are one of many sets of conventions to follow: https://github.com/Microsoft/api-guidelines/blob/vNext/Guidelines.md#98-pagination

Lucene.NET faceted search

I found a great tutorial on performing a faceted search.
http://www.devatwork.nl/articles/lucenenet/faceted-search-and-drill-down-lucenenet/
This article does not explain how to retrieve the narrowed available attributes to filter from (for further drill down).
Lets say I am looking for planners that are red. When I perform the faceted search, I want to return all available attributes to filter from that are red. Then when I add a "weekly format" filter, I want the attribute list to get even smaller, containing only filters available for the segmented group.
I want love to use Solr/SolrNET but I am in a shared hosting situation with limited access to the actual server.
I am fairly new to lucene.net, so examples are much appreciated.
IIUC, you get a BitArray containing the list of the filtered results. In the tutorial's example, you will have combinedResults as this list. If you want to further narrow this down, you need to reiterate the process: run another searchQuery and intersect the results with the BitArray you have for combinedResults.
I want love to use Solr/SolrNET but I am in a shared hosting situation with limited access to the actual server.
You can always use an off-site, hosted Solr solution. See this question for more information.