GitHub API field descriptions - github

I'm toying with the GitHub search API (v3) and can't find a description of the fields it returns. Most of them are obvious, but a few, like score, aren't. Does anyone know what score means, and does a field reference exist?

The score attribute is the search score of that document for a particular query, and is used for Best Match sorting. In other words, it's used for ranking search results, but it isn't shown in search results on github.com.
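As a rough illustration of how score relates to Best Match ordering, the sketch below sorts a trimmed-down search response by that field. The payload is invented for illustration (the repository names and score values are hypothetical); only the items array and the per-item score field are taken from the actual response shape.

```python
import json

# Hypothetical, trimmed-down search response; names and scores are invented.
sample_response = json.loads("""
{
  "total_count": 3,
  "items": [
    {"full_name": "example/alpha", "score": 12.5},
    {"full_name": "example/beta", "score": 48.0},
    {"full_name": "example/gamma", "score": 30.1}
  ]
}
""")


def rank_by_score(response):
    """Return item names ordered from highest to lowest score."""
    items = sorted(response["items"], key=lambda item: item["score"], reverse=True)
    return [item["full_name"] for item in items]


ranked = rank_by_score(sample_response)
print(ranked)
```

When no explicit sort is requested, the API is expected to return items in this order already, so client-side sorting like this is mainly useful for inspecting the values.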

Related

Azure Search querying json metadata

I have indexed documents with a "MetadataFields" key whose values are json.
"MetadataFields": "{"continent":"north america","country":"united states","region":"x34","tagnumber":"abc-123"}
I'm able to search for a specific match in Search Explorer using
MetadataFields/tagnumber:abc-123
But I cannot figure out how to find documents where this attribute is null or missing. Is this possible? What is the proper syntax?
In Azure Cognitive Search, you can use filtering to find null items.
In Search Explorer, the syntax would look like this:
$filter=MetadataFields/tagnumber eq null
To do this, you would need to have the field tagnumber marked as filterable in your index definition.
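Outside Search Explorer, the same filter can be sent as a query-string parameter to the REST endpoint. The sketch below only builds the URL; the service name, index name, and api-version value are assumptions for illustration, and a real request would also need an api-key header.

```python
from urllib.parse import quote, urlencode


def build_null_filter_url(service, index, field_path, api_version="2020-06-30"):
    """Build a GET URL asking for documents where field_path is null."""
    params = {
        "api-version": api_version,  # assumed version; use the one your service supports
        "search": "*",               # match everything, then narrow with the filter
        "$filter": f"{field_path} eq null",
    }
    # Keep "$" and "/" readable; everything else is percent-encoded.
    query = urlencode(params, quote_via=quote, safe="$/")
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"


# "my-service" and "my-index" are placeholder names.
url = build_null_filter_url("my-service", "my-index", "MetadataFields/tagnumber")
print(url)
```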

Cloudant distinct operator

I am new to Cloudant. In my current assignment I want to find all distinct values of a field.
I have documents which have domain as an attribute, and I want all the unique domains present in my db. Below is an example:
documentNo1-{"domain":"gmail.com"}
documentNo2-{"domain":"ymail.com"}
documentNo3-{"domain":"gmail.com"}
The expected result is that the API returns only the unique domain names, like below:
[gmail.com,ymail.com]
I can't find an operator in Cloudant that achieves this; the only solution I have is to retrieve all the documents and build the unique domain list myself.
I'm looking for any good approach/solution for the above scenario.
You can use Cloudant Search to create a faceted index.
See https://console.bluemix.net/docs/services/Cloudant/api/search.html#faceting
This would allow you to essentially group documents by domain, creating the unique list you need.
There is a good video tutorial showing this technique:
https://www.youtube.com/watch?v=9er3XI150VM
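Once the index marks domain as a facet and the search request asks for counts on it, the response carries a counts section keyed by field name. A minimal sketch of reading the unique values out of that section follows; the sample response is hypothetical and simply mirrors the three example documents from the question.

```python
def unique_values_from_counts(search_response, field):
    """Return the sorted distinct values of a faceted field from the 'counts' section."""
    counts = search_response.get("counts", {}).get(field, {})
    return sorted(counts)


# Hypothetical response for the three example documents in the question.
sample = {"total_rows": 3, "counts": {"domain": {"gmail.com": 2, "ymail.com": 1}}}

domains = unique_values_from_counts(sample, "domain")
print(domains)
```

The per-value counts come for free with this approach, which the client-side deduplication mentioned in the question would have to compute separately.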

MongoDB - Tag based search with autocomplete

I am looking to implement a tag search feature and was looking for some advice in terms of efficiency. I am new to MongoDB so I am unsure of best practices for performance.
Okay, so I want to create a link sharing app in which users tag links based on their content. For instance, a funny dog image would be tagged with "funny" and "dog". A link would have a:
title,
url,
user_id,
tags: array of tags
Now in order for me to allow users to search for links I need a list of all the tags used. For usability this needs to have auto-complete functionality. So I researched a bit and tested out using a collection of tags where I index the tag value e.g. "funny" and then use a regex.
db.tags.find({value:/^search/})
With a collection of 600,000 documents it searched for all documents beginning with "s" in 63 milliseconds. As the length of the search term increases the execution time decreases.
Now comes the part I'm unsure of. Say, for instance, I want to find all the links which have both the tags "funny" and "dog" (an intersection). How should I store the tags? Should I store the object id of each tag? Can I index these object ids? Is there another way to structure the whole database?
Also, I'd like to be able to suggest tags based on the tags a user has already entered. I was thinking of just having a related field in the tag document, for instance:
tag
----
id
value
related: [{
tag_id
count
}]
(again unsure as it would suggest tags that could be related to one of the already entered tags and not to another. With an intersect this would return no results.)
Any advice would be much appreciated.
Edit: mistake
Create a text index on the tag array. This will enable you to search quickly for funny, dog, and funny or dog.
https://docs.mongodb.com/manual/core/index-text/
db.tags.createIndex( { tags: "text" }, {background:true} )
As to the related tags, I don't think that you want to reference the _id values. You can probably embed an array of related tags such as:
relatedTags: [{tag1}, {tag2}]
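To make the "funny or dog" versus "funny and dog" distinction concrete, here is a small sketch of the two filter documents as plain Python dicts. The first uses the text index from the answer; the second uses the $all array operator, which is a different technique not mentioned in the answer and is shown only as one possible way to express the intersection. The field name tags follows the question's schema.

```python
def any_tag_text_filter(tags):
    """$text filter: space-separated terms match documents containing any of them."""
    return {"$text": {"$search": " ".join(tags)}}


def all_tags_filter(tags):
    """$all filter: matches documents whose tags array contains every listed tag."""
    return {"tags": {"$all": list(tags)}}


either = any_tag_text_filter(["funny", "dog"])
both = all_tags_filter(["funny", "dog"])
print(either)
print(both)
```

Either dict can be passed as the filter argument of a find call; the $all form does not need a text index, though an ordinary index on the tags array would likely help it.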

MongoEngine search index

I'm trying to implement an inverted-index search engine with MongoDb (MongoEngine) where terms in Posts are assigned weights and then used as embedded documents like such:
class Term(db.EmbeddedDocument):
    t = db.StringField()
    weight = db.FloatField()

class Post(db.Document):
    terms = db.ListField(db.EmbeddedDocumentField(Term))
Then given a term, I can find the Posts that contain the term using this query:
post_list = Post.objects(terms__t=term)
However, this returns a list of Posts, but how can I find the weight of the term for each returned Post without having to iterate through the list of embedded terms looking for the term? Is there a way to query the Posts to automatically return the weight for any returned Posts as well?
Also would appreciate if anyone has any better methods of implementing a search engine in MongoDB?
Thanks!
MongoDB supports a basic text index; see http://docs.mongodb.org/manual/core/index-text/. This is a better way to store and search against documents, especially if you want a score for the match.
You'd have to call the command manually, as it's not currently implemented in MongoEngine.
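As a sketch of what "a score for the match" looks like in practice, the helper below builds the filter, projection, and sort arguments for a text search that returns MongoDB's relevance score with each document. It assumes a text index already exists on the collection; the output field name score is an arbitrary choice, and the score is MongoDB's own relevance value rather than the custom weight stored in the Term documents.

```python
def text_score_query(term):
    """Build (filter, projection, sort) arguments for a scored text search."""
    filter_doc = {"$text": {"$search": term}}
    # Expose the relevance score under a field named "score" (name is arbitrary).
    projection = {"score": {"$meta": "textScore"}}
    # Order results from most to least relevant.
    sort_spec = [("score", {"$meta": "textScore"})]
    return filter_doc, projection, sort_spec


filter_doc, projection, sort_spec = text_score_query("mongodb")
print(filter_doc, projection, sort_spec)
```

With a driver such as PyMongo, these would typically be used as collection.find(filter_doc, projection).sort(sort_spec), bypassing the MongoEngine document classes.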

Lucene.Net: give one field more weight than another

I know I've seen this and I just can't seem to find it.
I have 2 fields that I am searching on: Name and Tags. I want results that are based on a match on the "Name" field to have a higher score than those based on the "Tags" field.
How can I do this?
Thanks
Along with boosting during search, you can also boost fields differently during indexing. This means that a general search for a term that could show up in either field would still give a better score for those that match your preferred field without overtly stating where you're looking for the term.
You can use the boost operator:
title_wa:something^4
If the title matches 'something', then its score will be boosted according to the factor.
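Applied to the Name and Tags fields from the question, the boost operator can be assembled into a query string like this. The helper is a hypothetical illustration, not part of Lucene.Net; it assumes a single search term with no characters that need escaping, and the factor 4 is an arbitrary choice.

```python
def build_boosted_query(term, field_boosts):
    """Build a Lucene query string searching term in each field with its boost."""
    clauses = []
    for field, boost in field_boosts.items():
        clause = f"{field}:{term}"
        if boost != 1:
            # A boost of 1 is the default, so the operator is only added otherwise.
            clause += f"^{boost}"
        clauses.append(clause)
    return " ".join(clauses)


query = build_boosted_query("dog", {"Name": 4, "Tags": 1})
print(query)
```

The resulting string can be handed to the query parser; a match in Name then contributes more to the score than a match in Tags.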