total number of comparisons required by merge sort and selection sort - mergesort

I am working on a merge sort program from the Scratch website for my class pupils, using 4 lists each with 10 items, but I cannot get it to work correctly. I would also like to know: how many comparisons will selection sort require to sort 4 lists, each of length 10?

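For merge sort, the number of comparisons depends on the values of the list elements, so there is no fixed number for a given list length. The standard selection sort, by contrast, always performs n(n-1)/2 comparisons on a list of length n, regardless of the values: 45 for a list of 10 items, so 180 for 4 such lists. To see the actual numbers for your data, implement both sorting methods and count the comparisons performed for each list.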
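As a rough illustration of counting comparisons, here is a minimal Python sketch (not Scratch). The function names are made up for this example, and the merge sort shown is one common top-down variant; other variants may count slightly differently.

```python
def selection_sort_comparisons(items):
    """Sort a copy with selection sort; return (sorted_list, comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1  # one comparison per inner-loop step
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


def merge_sort_comparisons(items):
    """Sort a copy with top-down merge sort; return (sorted_list, comparisons)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_count = merge_sort_comparisons(a[:mid])
    right, right_count = merge_sort_comparisons(a[mid:])
    comparisons = left_count + right_count
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1  # one comparison per element taken while both halves remain
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # Whatever is left over is appended without further comparisons.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons
```

Running both functions on each of the 4 lists and adding up the counts gives the totals for a particular data set; the selection sort count will be the same for every list of 10 items, while the merge sort count will vary with the values.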

Related

Firestore 1 global index vs 1 index per query what is better?

I'm working on my app and I just ran into a dilemma regarding the best way to handle indexes for Firestore.
I have a query that searches for publications in a specific community that contain at least one of the tags and fall within a geohash range. The index for that query looks like this:
community Ascending tag Ascending location.geohash Ascending
Now if my user doesn't need to filter by tag, I run the query without the arrayContains(tag), which prompts me to create another index:
community Ascending location.geohash Ascending
My question is: is it better to create that second index, or to just use the first one and specify all possible tags in arrayContains in the query when the user wants no filter on tag?
Neither is inherently better; it's a typical space vs. time tradeoff.
Adding the extra tags in the query adds some overhead there, but it saves you the (storage) cost for the additional index. So you're trading some small amount of runtime performance for a small amount of space/cost savings.
One thing to check is whether the query with tags can actually run on just the second index, as Firestore may be able to do a zigzag merge join. In that case you could keep only the second, smaller index and avoid the runtime cost of adding extra clauses, at the price of a (similarly small) performance difference on the query where you do specify one or more tags.

Extensive filtering

Example:
{
shortName: "KITT",
longName: "Knight Industries Two Thousand",
fromZeroToSixty: 2,
year: 1982,
manufacturer: "Pontiac",
/* 25 more fields */
}
I need the ability to query by at least 20 fields, which means that only 10 fields are left unindexed.
There are 3 fields (all numeric) that could be used for sorting (both ways).
This leaves me wondering how sites with lots of searchable fields do it, e.g. real estate or car sale sites where you can filter by every small detail and choose between several sort options.
How could I pull this off with MongoDB? How should I index that kind of collection?
I'm aware that there are databases made specifically for searching, but there must be general rules of thumb for doing this (even if less performant) in any database. I'm sure not everybody uses Elasticsearch or similar.
---
Optional reading:
My reasoning is that index could be huge but the index order matters. You'll always make sure that fields that return the least results are first and most generic fields are last in index. However, what if user chooses only generic fields? Should I include non-generic fields to query anyway? How to solve ordering in both ways? Or index intersection saves the day and I should just add 20 different indexes?
A text index is your friend.
Read up on it here: https://docs.mongodb.com/v3.2/core/index-text/
In short, it's a way to tell MongoDB that you want full text search over a specific field, multiple fields, or all fields (yay!)
To allow text indexing of all fields, use the special symbol $** and define it with type 'text':
db.collection.createIndex( { "$**": "text" } )
You can also configure it for case insensitivity or diacritic insensitivity, and more.
To perform text searches using the index, use the $text query helper, see: https://docs.mongodb.com/v3.2/reference/operator/query/text/#op._S_text
Update:
To let the user select specific fields to search on, it's possible to use weights when creating the text index: https://docs.mongodb.com/v3.2/core/index-text/#specify-weights
If you carefully select your fields' weights, for example using only distinct prime numbers, and then add the $meta text score to your results, you may be able to figure out from the "textScore" which field was matched by the query, and so filter out the results that didn't get a hit from a selected search field.
Read more here: https://docs.mongodb.com/v3.2/tutorial/control-results-of-text-search/

In Algolia, how do you construct records to allow for alphabetical sorting of query results?

As far as I know, you can only sort on numeric fields in Algolia, so how do you efficiently set up your records to allow for results to be returned alphabetically based on a specific string field?
For example, let's say each record in an index has a field called "title" that contains an arbitrary string value. How would you create a sibling field called "title_sort" that contains a number that allows the results to be sorted such that the records come out in alphabetical order by "title"? Is there a particularly well-accepted algorithm for creating such a number from the string in "title"?
If you have a static dataset, then you can just sort your data and store each record's position as the index. This works as long as you re-sort the data every time you update your indices.
I'm also thinking that if you can deal with a partial sorting, meaning that you can accept orc < orb but you need or < os, then you could derive a number from the leading characters of the string, treating them as base64 digits, and use that as the index. You can then sort to as many characters as you have precision for. It's only a partial sorting, but it might be acceptable for your use case. You just need to choose your base64 -> base10 mapping so that it preserves the sort order.
Additionally, if you don't care about the difference between capital and lowercase letters, then you can do base26 -> base10. The more I think about this the more limited it is, but it might work for your use case.
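A minimal Python sketch of the case-insensitive variant follows. The function name `title_sort_key` and the `precision` parameter are made up for illustration, and two choices here are assumptions rather than part of the suggestion above: non-letter characters are simply dropped, and base 27 is used instead of base 26 so that digit 0 can stand for "no character", which makes a shorter title sort before any longer title it is a prefix of.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a'..'z' map to digits 1..26; 0 means "no character"


def title_sort_key(title, precision=8):
    """Map the first `precision` letters of `title` to an integer that sorts alphabetically."""
    # Assumption: ignore case and drop anything that is not an ASCII letter.
    letters = [c for c in title.lower() if c in ALPHABET][:precision]
    key = 0
    for position in range(precision):
        if position < len(letters):
            digit = ALPHABET.index(letters[position]) + 1
        else:
            digit = 0  # pad short titles so every key has the same number of digits
        key = key * 27 + digit
    return key
```

Titles that share their first `precision` letters get the same key, which is the partial sorting described above: with `precision=2`, "orb" and "orc" are tied, but both still sort before "os".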

Using Mongo: should we create an index tailored to each type of high-volume query?

We have two types of high-volume queries. One looks for docs involving 5 attributes: a date (lte), a value stored in an array, a value stored in a second array, one integer (gte), and one float (gte).
The second includes these five attributes plus two more.
Should we create two compound indices, one for each query? Assume each attribute has a high cardinality.
If we do, because each query involves multiple arrays, it doesn't seem like we can create an index because of Mongo's restriction. How do people structure their Mongo databases in this case?
We're using MongoMapper.
Thanks!
For indexes, after the first range in the query, the value of additional index fields drops significantly.
Conceptually, I find it best to think of the additional fields in the index as pruning ever smaller sub-trees from the query. The first range chops off a large branch, the second a smaller one, the third a smaller one still, etc. My general rule of thumb is that only the first range from the query is of value in the index.
The caveat to that rule is that additional fields in the index can be useful to aid sorting of the returned results.
For the first query I would create an index on the two array values and then whichever of the ranges will exclude the most documents. The date field is unlikely to provide high exclusion unless you can close the range (lte and gte). For the integer and float it is hard to tell without knowing the domain.
If the second query's two additional attributes also use ranges in the query and do not have a significantly higher exclusion value then I would just work with the one index.
Rob.

How can I filter by the length of an embedded document in MongoDB?

For example given the BlogPost/Comments schema here:
http://mongoosejs.com/
How would I find all posts with more than five comments? I have tried something along the lines of
where('comments').size.gte(5)
But I'm getting tripped up with the syntax
MongoDB doesn't support range queries with the $size operator. The documentation recommends creating a separate field that holds the size of the list and incrementing it yourself:
You cannot use $size to find a range of sizes (for example: arrays with more than 1 element). If you need to query for a range, create an extra size field that you increment when you add elements.
Note that for some queries, it may be feasible to just list all the counts you want included or excluded using $or/$nor conditions.
In your example, the following query will give all documents with more than 5 comments (using standard mongodb syntax, not mongoose):
db.col.find({"comments":{"$exists":true}, "$nor":[{"comments":{"$size":5}}, {"comments":{"$size":4}}, {"comments":{"$size":3}}, {"comments":{"$size":2}}, {"comments":{"$size":1}}, {"comments":{"$size":0}}]})
Obviously, this is very repetitive, so it only makes sense for small boundaries, if at all. Keeping a separate count variable, as recommended in the mongodb docs, is usually the better solution.
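To make the two approaches concrete, here is a small Python sketch that only builds the filter and update documents as plain dicts; they could then be passed to a driver, for example PyMongo's `find` and `update_one`. The helper names and the `comment_count` field are hypothetical, chosen for this example.

```python
def size_greater_than_filter(field, n):
    """Filter for documents whose array `field` has more than n elements, via $nor + $size."""
    return {
        field: {"$exists": True},
        # Exclude every size from 0 up to and including n.
        "$nor": [{field: {"$size": k}} for k in range(n + 1)],
    }


def add_comment_update(comment):
    """Update document that appends a comment and bumps the (hypothetical) counter field."""
    return {"$push": {"comments": comment}, "$inc": {"comment_count": 1}}


def more_comments_than(n):
    """Range filter on the counter field; this kind of query can use an ordinary index."""
    return {"comment_count": {"$gt": n}}
```

The counter approach only stays correct if every write that adds a comment goes through an update like `add_comment_update` (and removals decrement the counter in the same way).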
It's slow, but you could also use the $where clause:
db.Blog.find({$where:"this.comments.length > 5"}).exec(...);