QuickSort with 3 partitions - quicksort

Suppose we modified Quicksort to have three partitions instead of two. The left partition has values < pivot. The middle partition has values = pivot. The right partition has values > pivot. We then recurse on the left and right partitions. How much time will this 3-way partitioning take?
I saw this in an interview question, where the answer was given as O(n). But normal quicksort, which splits into two partitions, is O(n log n).
Please help me understand why it is O(n).

It's only O(n) when all values are the same. The first partition step finds that every value == pivot (regardless of which value is chosen as the pivot), and since there are no values < or > pivot, no recursion occurs.
For normal data, the time complexity remains O(n log(n)) in the average case and O(n^2) in the worst case.
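A minimal sketch of the 3-way scheme described above, assuming a middle-element pivot and an in-place Python list (the name quicksort3 and the returned pass count are illustrative, not from the question). It returns how many partition passes ran, so the all-equal case can be checked directly:

```python
def quicksort3(a, lo=0, hi=None):
    """Sort list a in place with 3-way partitioning.

    Returns the number of partition passes performed (illustrative counter).
    """
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    pivot = a[lo + (hi - lo) // 2]  # assumed pivot choice: middle element
    lt, i, gt = lo, lo, hi
    # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    # Recurse only on the < and > partitions; the == partition is finished.
    return 1 + quicksort3(a, lo, lt - 1) + quicksort3(a, gt + 1, hi)


data = [5, 3, 5, 1, 9, 3, 5]
quicksort3(data)
print(data)                      # [1, 3, 3, 5, 5, 5, 9]
print(quicksort3([7] * 1000))    # 1: a single O(n) pass, no recursion
```

With all values equal, both recursive calls receive empty ranges, which is the only situation where the whole sort finishes in one linear pass.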

Related

quick sort is slower than merge sort

I think quick sort is less efficient when sorting an array with duplicate data, right? When the data type is char, the bigger the array (over 100000 elements), the closer it gets to order n^2.
And assuming there is no duplicate data, to get the best case of a quick sort that uses the first element as the pivot, I think we can take an already sorted array and, dividing it recursively like a merge sort, swap the first and middle elements of each part. Right? Is there a general best case?
Lomuto partition scheme, which scans from one end to the other during partition, is slower with duplicates. If all the values are the same, then each partition step splits it into sizes 1 and n-1, a worst case scenario.
Hoare partition scheme, which scans from both ends towards each other until the indexes (or iterators or pointers) cross, is usually faster with duplicates. Even though duplicates result in more swaps, each swap occurs just after the two values have been read and compared to the pivot, so they are still in the cache for the swap (assuming object size is not huge). As the number of duplicates increases, the splitting improves towards the ideal case where each partition step splits the data into two equal halves. I ran a benchmark sorting 16 million 64 bit integers: with random data, it took about 1.37 seconds, improving with duplicates, and with all values the same, it took about 0.288 seconds.
Another alternative is a 3 way partition, which splits a partition into elements < pivot, elements == pivot, elements > pivot. If all the elements are the same, it's done in O(n) time. For n elements with only k possible values, the time complexity is O(n ⌈log3(k)⌉), and since k is constant, the time complexity is still O(n).
Wiki links:
https://en.wikipedia.org/wiki/Quicksort#Repeated_elements
https://en.wikipedia.org/wiki/Dutch_national_flag_problem
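To make the difference on duplicates concrete, here is a small sketch of one partition step under each scheme, run on an array where all values are equal. The function names and the pivot choices (last element for Lomuto, middle element for Hoare) are assumptions for illustration, not taken from the benchmark above:

```python
def lomuto_partition(a, lo, hi):
    """One Lomuto step; returns the pivot's final index p.
    Sub-arrays to recurse on are a[lo..p-1] and a[p+1..hi]."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def hoare_partition(a, lo, hi):
    """One Hoare step; returns index j.
    Sub-arrays to recurse on are a[lo..j] and a[j+1..hi]."""
    pivot = a[lo + (hi - lo) // 2]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


n = 8
p = lomuto_partition([5] * n, 0, n - 1)
print(p, n - 1 - p)        # 0 7: nothing on one side, n-1 on the other
j = hoare_partition([5] * n, 0, n - 1)
print(j + 1, n - 1 - j)    # 4 4: two equal halves
```

Lomuto peels off only the pivot at each level, which leads to the quadratic behaviour, while Hoare's swaps of equal elements push the two indexes to meet in the middle.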

Quicksort, given 3 values, how can I get to 9 operations?

Well, I want to use Quick sort on given 3 values, doesn't matter what values, how can I get to the worst case which is 9 operations?
Can anyone draw a tree and show how it gives n log n and n^2 operations? I've tried to find one on the internet, but I still didn't manage to draw one properly to show that.
The worst case complexity of quick sort depends on the chosen pivot. If the pivot chosen is the leftmost or the rightmost element, then the worst case will occur in the following cases:
1) Array is already sorted in same order.
2) Array is already sorted in reverse order.
3) All elements are same (special case of case 1 and 2).
Since these cases occur very frequently, the pivot is often chosen randomly. Choosing the pivot randomly reduces the chances of hitting the worst case.
The analysis of the quicksort algorithm is explained in this blogpost by Khan Academy.
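As a rough illustration, the sketch below counts element-to-pivot comparisons for a leftmost-pivot quicksort (the function name and the choice to count only comparisons are assumptions). The 9 in the question is presumably n^2 with n = 3 taken literally; big-O notation drops constant factors, so the actual count in the worst case is n(n-1)/2, which is 3 for three values:

```python
def count_comparisons(values):
    """Count element-to-pivot comparisons made by a leftmost-pivot quicksort.

    Each partition step compares every non-pivot element to the pivot once.
    """
    if len(values) <= 1:
        return 0
    pivot, rest = values[0], values[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + count_comparisons(left) + count_comparisons(right)


print(count_comparisons([1, 2, 3]))        # 3: sorted input, tree is a chain 3 -> 2 -> 1
print(count_comparisons([3, 2, 1]))        # 3: reverse sorted, same chain
print(count_comparisons([2, 1, 3]))        # 2: pivot splits evenly, tree has depth 1
print(count_comparisons(list(range(100)))) # 4950 = 100 * 99 / 2
```

In the worst cases the recursion tree degenerates into a chain of sizes n, n-1, ..., 1, which sums to order n^2; with balanced splits the tree has about log n levels of about n comparisons each.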

Hash tables - complexity of insert, search, and delete

I've been given two homework problems on the complexity of hash tables, but I'm struggling to understand the difference between them.
They are as follows:
Consider a hash function which is to take n inputs and map them to a table of size m.
Write the complexity of insert, search, and deletion for the hash function which distributes all n inputs evenly over the buckets of the hash table.
Write the complexity of insert, search, and deletion for the (supposedly perfect but unrealistic) hash function which will never hash two items to the same bucket, i.e. this hash function will never result in a collision.
These two questions seem quite similar to me and I'm not really sure of their differences.
For question one, since the n inputs are distributed evenly we can assume there will be zero or one items in each bucket, so all of insert, search and delete will be O(1). Is this correct?
How then does question two differ in any way? If the function never results in a collision then all the items will be spread evenly so wouldn't this result in O(1) for each operation?
Is my thinking correct for these problems or am I missing something?
EDIT:
I believe I've identified where I've gone wrong. O(1) is correct for every operation in the second question because the hash function is ideal and never results in a collision.
However, for the first question, the items are spread evenly, BUT that DOES NOT MEAN there is only 1 item in each bucket; every bucket could have 20 items in a linked list, for example. So insertion would be O(1).
But what about search? It would be O(1) + cost of searching the linked list. But we don't know the size, only know it's spread evenly. Can we get an expression for the length in terms of n (number of inputs) and m (size of table)?
Your edit is on the right track.
Can we get an expression for the length in terms of n (number of inputs) and m (size of table)?
For 1, if the hash table's resizing is inhibited in some way, so that the load factor n/m (i.e. the average number of items per bucket) is greater than 1 and neither constant nor within constant bounds, then you can postulate a relationship m = f(n). The load factor is then n / f(n), so the complexity will be O(n/f(n)) too.
In the second case, the complexity is always O(1).
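A small sketch of the first case, assuming separate chaining with Python lists standing in for linked lists and key % m as the evenly distributing hash (both are illustrative assumptions). With n = 200 keys and m = 10 buckets, every chain has n/m = 20 items, so a search inspects up to 20 items:

```python
def build_table(keys, m):
    """Insert integer keys into m chained buckets; each insert is O(1)."""
    buckets = [[] for _ in range(m)]
    for key in keys:
        buckets[key % m].append(key)  # no duplicate check, so no chain scan
    return buckets


def search(buckets, key):
    """Return (found, items_inspected) for a lookup in the chained table."""
    chain = buckets[key % len(buckets)]
    for steps, item in enumerate(chain, start=1):
        if item == key:
            return True, steps
    return False, len(chain)


table = build_table(range(200), 10)
print([len(chain) for chain in table])  # [20, 20, 20, 20, 20, 20, 20, 20, 20, 20]
print(search(table, 195))               # (True, 20): last item in its chain
print(search(table, 205))               # (False, 20): whole chain scanned
```

The chain length here is exactly the load factor n/m, which is why search and delete cost O(1 + n/m) in this setup rather than O(1).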

What is the runtime for initializing a hash table with n elements?

Is it O(n) or O(n logn)? I have n elements that I need to setup in a hash table, what is the worst-case and average runtime?
Worst case is unlimited. You need to calculate hash codes and may have to compare elements, and the time for that is not limited.
Assuming that calculating hashes and comparing elements take constant time, the worst case for inserting n elements is O(n^2). What saves you is the fact that the worst case would be exceedingly rare, assuming a halfway decent hash function. Average time for a decent implementation is O(n).
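The gap between the two cases can be seen by counting key comparisons while building a chained table. This is only a sketch: build_and_count, the duplicate check on insert, and the two hash functions are assumptions chosen to show the extremes, not a description of any particular library:

```python
def build_and_count(keys, m, hash_fn):
    """Insert keys into m chained buckets, checking for duplicates first.

    Returns the total number of key comparisons made during the build.
    """
    buckets = [[] for _ in range(m)]
    comparisons = 0
    for key in keys:
        chain = buckets[hash_fn(key) % m]
        found = False
        for item in chain:
            comparisons += 1
            if item == key:
                found = True
                break
        if not found:
            chain.append(key)
    return comparisons


n = 100
# Degenerate hash: every key collides, so insert i scans i earlier keys.
print(build_and_count(range(n), n, lambda k: 0))  # 4950 = n * (n - 1) / 2
# Well-spread hash: no collisions, so no comparisons at all.
print(build_and_count(range(n), n, lambda k: k))  # 0
```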

Time complexity of QuickSort+Insertion sort hybrid algorithm?

I am implementing an algorithm that performs quick sort with leftmost pivot selection up to a certain limit, and when the list becomes almost sorted, I will use insertion sort to sort those elements.
For leftmost pivot selection, I know the average case complexity of quick sort is O(n log n) and the worst case complexity, i.e. when the list is almost sorted, is O(n^2). On the other hand, insertion sort is very efficient on an almost sorted list of elements, with a complexity of O(n).
So I think the complexity of this hybrid algorithm should be O(n). Am I correct?
The most important thing for the performance of qsort is picking a good pivot. This means choosing an element that's as close to the median of the elements you're sorting as possible.
The worst case of O(n^2) in qsort comes about from consistently choosing 'bad' pivots on every partition pass. This causes the partitions to be extremely lopsided rather than balanced, e.g. a 1 : n-1 element partition ratio.
I don't see how adding insertion sort into the mix as you've described would help or mitigate this problem.
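A hedged sketch of one common form of such a hybrid, assuming a size cutoff of 16 below which a sub-array is handed to insertion sort (CUTOFF, hybrid_sort and the comparison counter are illustrative assumptions, not the asker's code). It returns the number of pivot comparisons made in the quicksort phase, which shows that leftmost-pivot partitioning on sorted input is still quadratic before insertion sort is ever reached:

```python
CUTOFF = 16  # assumed threshold for switching to insertion sort


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place."""
    for i in range(lo + 1, hi + 1):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def hybrid_sort(a, lo=0, hi=None):
    """Leftmost-pivot quicksort that falls back to insertion sort on small
    sub-arrays. Returns the pivot comparisons made while partitioning."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(a, lo, hi)
        return 0
    pivot = a[lo]
    i = lo
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]  # move pivot to its final position
    return (hi - lo) + hybrid_sort(a, lo, i - 1) + hybrid_sort(a, i + 1, hi)


n = 200
already_sorted = list(range(n))
print(hybrid_sort(already_sorted) > n * n // 4)  # True: partitioning alone is quadratic
```

The cutoff only trims the bottom of the recursion; the lopsided 1 : n-1 splits higher up are untouched, so the worst case stays O(n^2).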