Merge sort performance compared to insertion sort

For any array of length greater than 10, is it safe to say that merge sort performs fewer comparisons among the array's elements than insertion sort does on the same array, because the best case for the run time of merge sort is O(N log N) while for insertion sort it's O(N)?

My take on this. First off, you are talking about comparisons, but there are swaps as well that matter.
In insertion sort, the worst case (an array sorted in the opposite order) costs n(n - 1)/2 comparisons and the same number of shifts, so n^2 - n actions in total (11^2 - 11 = 121 - 11 = 110 for 11 elements, for example). But if the array is already partially sorted (many elements are at, or not far from, their final positions), the number of comparisons and shifts drops significantly: the right position for each element is found quickly, so far fewer actions are needed than for an array sorted in the opposite order. For arr2, which is almost sorted, the count becomes linear in the input size: 13 comparisons and 3 shifts.
var arr1 = [11,10,9,8,7,6,5,4,3,2,1];
var arr2 = [1,2,3,4,5,6,7,8,11,10,9];
function InsertionSort(arr) {
  var compNum = 0, swapNum = 0;
  for(var i = 1; i < arr.length; i++) {
    var temp = arr[i], j = i - 1;
    while(j >= 0) {
      compNum++; // count every comparison, including the one that stops the scan
      if(temp < arr[j]) { arr[j + 1] = arr[j]; swapNum++; } else break;
      j--;
    }
    arr[j + 1] = temp;
  }
  console.log(arr, "Number of comparisons: " + compNum, "Number of swaps: " + swapNum);
}
InsertionSort(arr1); // worst case, 55 comparisons + 55 shifts = 11^2 - 11 = 110 actions
InsertionSort(arr2); // almost sorted array, 13 comparisons + 3 shifts
In merge sort we always do approximately n * log2(n) actions, whatever the order of the input array. The counter below counts the element moves made while merging, and both of our arrays get sorted in 39 of them:
var arr1 = [11,10,9,8,7,6,5,4,3,2,1];
var arr2 = [1,2,3,4,5,6,7,8,11,10,9];
var actions = 0;
function mergesort(arr, left, right) {
if(left >= right) return;
var middle = Math.floor((left + right)/2);
mergesort(arr, left, middle);
mergesort(arr, middle + 1, right);
merge(arr, left, middle, right);
}
function merge(arr, left, middle, right) {
var l = middle - left + 1, r = right - middle, temp_l = [], temp_r = [];
for(var i = 0; i < l; i++) temp_l[i] = arr[left + i];
for(var i = 0; i < r; i++) temp_r[i] = arr[middle + i + 1];
var i = 0, j = 0, k = left;
while(i < l && j < r) {
if(temp_l[i] <= temp_r[j]) {
arr[k] = temp_l[i]; i++;
} else {
arr[k] = temp_r[j]; j++;
}
k++; actions++;
}
while(i < l) { arr[k] = temp_l[i]; i++; k++; actions++;}
while(j < r) { arr[k] = temp_r[j]; j++; k++; actions++;}
}
mergesort(arr1, 0, arr1.length - 1);
console.log(arr1, "Number of actions: " + actions); // 39; 11 * log2(11) is approx. 38
actions = 0;
mergesort(arr2, 0, arr2.length - 1);
console.log(arr2, "Number of actions: " + actions); // 39 again
So, answering your question:
For any array of length greater than 10, is it safe to say that merge sort performs fewer comparisons among the array's elements than does insertion sort on the same array
I would say that no, it isn't safe to say so. Merge sort can perform more actions than insertion sort in some cases. The size of the array isn't what matters here. What matters when comparing insertion sort with merge sort is how far your array is from the sorted state. I hope it helps :)
BTW, merge sort and insertion sort have been united in a hybrid stable sorting algorithm called Timsort to get the best from both of them. Check it out if interested.
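Since the question asks specifically about comparisons, here is a rough sketch that counts only element comparisons for both algorithms. The function names are made up for this illustration, and the merge sort splits the array the same way as the code above (the left half gets the extra element).

```javascript
// Count element comparisons only (moves are ignored) for both algorithms.
function insertionSortComparisons(input) {
  const arr = input.slice();          // work on a copy
  let comparisons = 0;
  for (let i = 1; i < arr.length; i++) {
    const temp = arr[i];
    let j = i - 1;
    while (j >= 0) {
      comparisons++;                  // temp is compared against arr[j]
      if (temp >= arr[j]) break;      // temp is already in place
      arr[j + 1] = arr[j];
      j--;
    }
    arr[j + 1] = temp;
  }
  return comparisons;
}

function mergeSortComparisons(input) {
  let comparisons = 0;
  function sort(arr) {
    if (arr.length <= 1) return arr;
    const mid = Math.ceil(arr.length / 2);
    const left = sort(arr.slice(0, mid));
    const right = sort(arr.slice(mid));
    const out = [];
    let i = 0, j = 0;
    while (i < left.length && j < right.length) {
      comparisons++;                  // one comparison per merged element here
      out.push(left[i] <= right[j] ? left[i++] : right[j++]);
    }
    while (i < left.length) out.push(left[i++]);   // leftovers need no comparison
    while (j < right.length) out.push(right[j++]);
    return out;
  }
  sort(input);
  return comparisons;
}

const sortedInput = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];
const reversedInput = sortedInput.slice().reverse();
console.log(insertionSortComparisons(sortedInput), mergeSortComparisons(sortedInput));     // 10 22
console.log(insertionSortComparisons(reversedInput), mergeSortComparisons(reversedInput)); // 55 17
```

So on an already sorted array of 11 elements insertion sort needs fewer comparisons than merge sort, while on the reversed array it needs far more.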

Related

Is there a better way to calculate the moving sum of a list in flutter

Is there a better way to calculate a moving sum of a list?
List<double?> rollingSum({int window = 3, List data = const []}) {
List<double?> sum = [];
int i = 0;
int maxLength = data.length - window + 1;
while (i < maxLength) {
List tmpData = data.getRange(i, i + window).toList();
double tmpSum = tmpData.reduce((a, b) => a + b);
sum.add(tmpSum);
i++;
}
// filling the first n values with null
i = 0;
while (i < window - 1) {
sum.insert(0, null);
i++;
}
return sum;
}
Well, the code is already clean for what you need. Maybe just some improvements like:
Use a for loop
You can use the method sublist, which returns the range as a List directly, so the extra toList() call is not needed. (Note that it still copies the elements; getRange is the one that returns a lazy view.)
To insert values at the left of a list, build the padding with List.filled and insert it at index 0 with insertAll. (padLeft exists on String, not on List.) For example, to put X nulls in front of a list, use insertAll(0, List.filled(X, null)).
List<double?> rollingSum({int window = 3, List data = const []}) {
  List<double?> sum = [];
  for (int i = 0; i < data.length - window + 1; i++) {
    List tmpData = data.sublist(i, i + window);
    double tmpSum = tmpData.reduce((a, b) => a + b);
    sum.add(tmpSum);
  }
  // fill the first window - 1 values with null
  sum.insertAll(0, List<double?>.filled(window - 1, null));
  return sum;
}
If I understand your problem correctly, you can compute everything in a single loop: on each iteration, add the current element to a running sum, and once the window is full, record the sum and subtract the element at index i - (window - 1), which is about to leave the window.
so for an input like this
data = [1,2,3,4,5,6]
window = 3
the below code will result in [6,9,12,15]
double sum = 0; // double, so it can be added to List<double> below
List<double> res = [];
for (int i = 0;i<data.length;i++) {
sum += data[i];
if (i < window - 1) {
continue;
}
res.add(sum);
sum -= data[i - (window - 1)]; // remove element that got out of the window size
}
This way you don't have to call getRange, sublist, or reduce for every window, so the work is O(n) overall instead of O(n * window), and no temporary lists are allocated.
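For comparison, here is the same running-sum idea as a small self-contained sketch. It is written in JavaScript rather than Dart, the function name is only illustrative, and it keeps the leading nulls that the code in the question produces.

```javascript
// Rolling sum with a running total: O(n) time, no per-window slicing.
// The first (window - 1) entries are null, as in the question's version.
function rollingSum(data, window = 3) {
  const result = [];
  let sum = 0;
  for (let i = 0; i < data.length; i++) {
    sum += data[i];
    if (i >= window) sum -= data[i - window];   // drop the element that left the window
    result.push(i >= window - 1 ? sum : null);  // window not full yet -> null
  }
  return result;
}

console.log(rollingSum([1, 2, 3, 4, 5, 6], 3)); // [ null, null, 6, 9, 12, 15 ]
```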

Peculiar issue with quicksort partition

Today, while trying quicksort, instead of taking the last element as the pivot and partitioning, I took the first element as the pivot, but it does not produce the correct partitioned output.
int pivot = ar[0];
int pindex = 0;
for(int i = 0;i < ar.size();i++)
{
if(ar[i] <= pivot)
{
swap(ar[i],ar[pindex]);
pindex++;
}
}
swap(ar[pindex],ar[ar.size()-1]);
I can't understand why. I always use this for partitioning, but it doesn't work when I take the first element as the pivot.
But this worked even when I took the first element as the pivot:
int i, j, pivot, temp;
pivot = ar[0];
i = 0;
j = ar.size()-1;
while(1)
{
while(ar[i] < pivot && ar[i] != pivot)
i++;
while(ar[j] > pivot && ar[j] != pivot)
j--;
if(i < j)
{
temp = ar[i];
ar[i] = ar[j];
ar[j] = temp;
}
else
{
break;
}
}
What are the differences between them?
I finally found out that the second method is Hoare's partition scheme, whereas the typical quicksort method we all follow is Lomuto's partition scheme.
See this wiki page, it has all the details: https://en.wikipedia.org/wiki/Quicksort
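As far as I can tell, the first snippet fails for two reasons: the scan starts at index 0, so the pivot is compared with itself, and the final swap exchanges ar[pindex] with the last element, which only makes sense when the pivot is stored there. Below is a sketch of a Lomuto-style partition adapted to a first-element pivot. It is in JavaScript, and the function name and the lo/hi parameters are my own additions.

```javascript
// Lomuto-style partition using the FIRST element as the pivot.
// Returns the final index of the pivot.
function partitionFirstPivot(ar, lo = 0, hi = ar.length - 1) {
  const pivot = ar[lo];
  let pindex = lo + 1;                 // next slot for an element <= pivot
  for (let i = lo + 1; i <= hi; i++) { // skip the pivot itself
    if (ar[i] <= pivot) {
      [ar[i], ar[pindex]] = [ar[pindex], ar[i]];
      pindex++;
    }
  }
  // ar[lo + 1 .. pindex - 1] are <= pivot, so the pivot belongs at pindex - 1
  [ar[lo], ar[pindex - 1]] = [ar[pindex - 1], ar[lo]];
  return pindex - 1;
}

const ar = [5, 8, 1, 9, 3, 7, 2];
const p = partitionFirstPivot(ar);
console.log(ar, p); // [ 2, 1, 3, 5, 8, 7, 9 ] 3
```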

speed up prime number generating

I have written a program that generates prime numbers. It works well, but I want to speed it up, as it takes quite a while to generate all the prime numbers up to 10000.
var list = [2,3]
var limitation = 10000
var flag = true
var tmp = 0
for (var count = 4 ; count <= limitation ; count += 1 ){
while(flag && tmp <= list.count - 1){
if (count % list[tmp] == 0){
flag = false
}else if ( count % list[tmp] != 0 && tmp != list.count - 1 ){
tmp += 1
}else if ( count % list[tmp] != 0 && tmp == list.count - 1 ){
list.append(count)
}
}
flag = true
tmp = 0
}
print(list)
Two simple improvements that will make it fast up through 100,000 and maybe 1,000,000.
All primes except 2 are odd
Start the loop at 5 and increment by 2 each time. This isn't going to speed it up a lot because you are finding the counter example on the first try, but it's still a very typical improvement.
Only search through the square root of the value you are testing
The square root is the point at which you halve the factor space: any factor less than the square root is paired with a factor above the square root, so you only have to check one side. There are far fewer numbers below the square root, so you should check only the values less than or equal to the square root.
Take 10,000 for example. The square root is 100. For this you only have to look at values less than the square root, which in terms of primes is roughly 25 values instead of over 1000 checks for all primes less than 10,000.
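A minimal sketch of both improvements together, in JavaScript rather than Swift (the function name is just for illustration): only odd candidates are tested, and each one is divided only by the primes found so far that do not exceed its square root.

```javascript
// Trial division against known primes, stopping at sqrt(n), odd candidates only.
function primesUpTo(limit) {
  const primes = limit >= 2 ? [2] : [];
  for (let n = 3; n <= limit; n += 2) {   // all primes except 2 are odd
    let isPrime = true;
    for (const p of primes) {
      if (p * p > n) break;               // no need to look past sqrt(n)
      if (n % p === 0) { isPrime = false; break; }
    }
    if (isPrime) primes.push(n);
  }
  return primes;
}

console.log(primesUpTo(10000).length); // 1229
```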
Doing it even faster
Try another method altogether, like a sieve. These methods are much faster but have a higher memory overhead.
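For reference, a sieve of Eratosthenes could look roughly like the sketch below (JavaScript, illustrative names, limit assumed to be a non-negative integer). The memory overhead mentioned above is the one-byte-per-number flag array.

```javascript
// Sieve of Eratosthenes: mark multiples of each prime as composite.
function sieve(limit) {
  const composite = new Uint8Array(limit + 1); // 0 = not marked yet
  const primes = [];
  for (let n = 2; n <= limit; n++) {
    if (composite[n]) continue;
    primes.push(n);
    // smaller multiples of n were already marked by smaller primes
    for (let m = n * n; m <= limit; m += n) composite[m] = 1;
  }
  return primes;
}

console.log(sieve(1000000).length); // 78498
```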
In addition to what Nick already explained, you can also easily take advantage of the following property: all primes greater than 3 are congruent to 1 or -1 mod 6.
Because you've already included 2 and 3 in your initial list, you can therefore start with count = 6, test count - 1 and count + 1 and increment by 6 each time.
Below is my first attempt ever at Swift, so pardon the syntax which is probably far from optimal.
var list = [2,3]
var limitation = 10000
var flag = true
var tmp = 0
var max = 0
for(var count = 6 ; count <= limitation ; count += 6) {
for(var d = -1; d <= 1; d += 2) {
max = Int(floor(sqrt(Double(count + d))))
for(flag = true, tmp = 0; flag && list[tmp] <= max; tmp++) {
if((count + d) % list[tmp] == 0) {
flag = false
}
}
if(flag) {
list.append(count + d)
}
}
}
print(list)
I've tested the above code on iswift.org/playground with limitation = 10,000, 100,000 and 1,000,000.

The call stack size of quick sort

I read this answer and found an implementation of Quicksort here. It's still unclear to me why Quicksort requires O(log n) extra space.
I understand what a call stack is. I applied the implementation stated above to an array of random numbers and saw n - 1 calls of quickSort.
public static void main(String[] args) {
Random random = new Random();
int num = 8;
int[] array = new int[num];
for (int i = 0; i < num; i++) {
array[i] = random.nextInt(100);
}
System.out.println(Arrays.toString(array));
quickSort(array, 0, array.length - 1);
System.out.println(Arrays.toString(array));
}
static int partition(int arr[], int left, int right) {
int i = left, j = right;
int tmp;
int pivot = arr[(left + right) / 2];
while (i <= j) {
while (arr[i] < pivot)
i++;
while (arr[j] > pivot)
j--;
if (i <= j) {
tmp = arr[i];
arr[i] = arr[j];
arr[j] = tmp;
i++;
j--;
}
}
return i;
}
static void quickSort(int arr[], int left, int right) {
System.out.println("quickSort. left = " + left + " right = " + right);
int index = partition(arr, left, right);
if (left < index - 1)
quickSort(arr, left, index - 1);
if (index < right)
quickSort(arr, index, right);
}
The output I saw:
[83, 65, 68, 91, 43, 45, 58, 82]
quickSort. left = 0 right = 7
quickSort. left = 0 right = 6
quickSort. left = 0 right = 4
quickSort. left = 0 right = 3
quickSort. left = 0 right = 2
quickSort. left = 0 right = 1
quickSort. left = 5 right = 6
[43, 45, 58, 65, 68, 82, 83, 91]
That makes 7 (n - 1) calls. So why does quickSort require O(log n) space for its call stack if the number of calls depends on n, not log n?
I think I understand why the stack size of Quicksort is O(n) in the worst case.
One part of the array (suppose left) to be sorted consists of one element, and the other part (right) consists of n - 1 elements. The size of the left part is always 1, and the size of the right part decrements by 1 every time.
Thus, we initially call Quicksort and then call it n - 1 times for the right part recursively. So extra space for the call stack is O(n). And since the partitioning procedure takes O(n) for every recursive call, the time complexity is O(n^2).
As for the average case, I don't yet know how to prove O(n * log n) for the time complexity and O(log n) for the extra space. But I know that if each partition divides the array into two almost equal parts, the nested calls for the left part go only about log2 n levels deep, because the part being sorted halves at every level. And the right part is sorted using tail recursion, which doesn't add to the call stack.
https://en.wikipedia.org/wiki/Quicksort
So the extra space needed for Quicksort is O(log n) in this case: the stack only holds the calls that are active at the same time, not all n - 1 calls ever made.
Since the partitioning at each of those levels takes O(n) in total, the time complexity is O(n * log n).
Please correct me if my assumptions are wrong. I'm ready to read and accept your answer.
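One way to see the difference between the number of calls and the stack depth is to measure both. The sketch below is JavaScript, not the Java from the question, and quickSortStats and its counters are names invented for this illustration. It reuses the partition scheme from the question, recurses only into the smaller part and handles the larger part in a loop, which is a common way to keep the depth at about log2 n whatever the pivots are.

```javascript
// Quicksort instrumented to report the number of recursive calls
// and the maximum number of calls that are on the stack at once.
function quickSortStats(arr) {
  let calls = 0, maxDepth = 0;

  function partition(left, right) {        // same scheme as in the question
    const pivot = arr[Math.floor((left + right) / 2)];
    let i = left, j = right;
    while (i <= j) {
      while (arr[i] < pivot) i++;
      while (arr[j] > pivot) j--;
      if (i <= j) {
        [arr[i], arr[j]] = [arr[j], arr[i]];
        i++;
        j--;
      }
    }
    return i;                              // parts: [left, i - 1] and [i, right]
  }

  function sort(left, right, depth) {
    calls++;
    maxDepth = Math.max(maxDepth, depth);
    while (left < right) {
      const index = partition(left, right);
      if (index - left < right - index + 1) {
        // left part is smaller: recurse into it, keep looping on the right part
        if (left < index - 1) sort(left, index - 1, depth + 1);
        left = index;
      } else {
        // right part is smaller (or equal): recurse into it, loop on the left
        if (index < right) sort(index, right, depth + 1);
        right = index - 1;
      }
    }
  }

  sort(0, arr.length - 1, 1);
  return { calls, maxDepth };
}

const sample = [83, 65, 68, 91, 43, 45, 58, 82];
console.log(quickSortStats(sample), sample);
```

The calls counter grows with n, but maxDepth is what determines the extra space, because a stack frame is released as soon as its call returns.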

Carefully deleting N items from a "circular" vector (or perhaps just an NSMutableArray)

Imagine a std::vector, say, with 100 things in it (indices 0 to 99) currently. You are treating it as a loop, so the 105th item is index 4, and going forward 7 from index 98 gives index 5.
You want to delete N items after index position P.
So, delete 5 items after index 50; easy.
Or 5 items after index 99: as you delete 0 five times, or 4 through 0, noting that position at 99 will be erased from existence.
Worst, 5 items after index 97 - you have to deal with both modes of deletion.
What's the elegant and solid approach?
Here's a boring routine I wrote
-(void)knotRemovalHelper:(NSMutableArray*)original
after:(NSInteger)nn howManyToDelete:(NSInteger)desired
{
#define ORCO ((NSInteger)[original count])
static NSInteger kount, howManyUntilLoop, howManyExtraAferLoop;
if ( ... our array is NOT a loop ... )
// trivial, if messy...
{
for ( kount = 1; kount<=desired; ++kount )
{
if ( (nn+1) >= ORCO )
return;
[original removeObjectAtIndex:( nn+1 )];
}
return;
}
else // our array is a loop
// messy, confusing and inelegant. how to improve?
// here we go...
{
howManyUntilLoop = (ORCO-1) - nn;
if ( howManyUntilLoop > desired )
{
for ( kount = 1; kount<=desired; ++kount )
[original removeObjectAtIndex:( nn+1 )];
return;
}
howManyExtraAferLoop = desired - howManyUntilLoop;
for ( kount = 1; kount<=howManyUntilLoop; ++kount )
[original removeObjectAtIndex:( nn+1 )];
for ( kount = 1; kount<=howManyExtraAferLoop; ++kount )
[original removeObjectAtIndex:0];
return;
}
#undef ORCO
}
Update!
InVariant's second answer leads to the following excellent and very simple solution. "Starting with" is much better than "starting after", so the routine now uses "start with".
N times do if P < currentsize remove P else remove 0
-(void)removeLoopilyFrom:(NSMutableArray*)ra
startingWithThisOne:(NSInteger)removeThisOneFirst
howManyToDelete:(NSInteger)countToDelete
{
// exception if removeThisOneFirst > ra highestIndex
// exception if countToDelete is > ra size
// so easy thanks to Invariant:
for ( NSInteger k = 0; k < countToDelete; ++k )
{
if ( removeThisOneFirst < (NSInteger)[ra count] )
[ra removeObjectAtIndex:removeThisOneFirst];
else
[ra removeObjectAtIndex:0];
}
}
Update!
Toolbox has pointed out the excellent idea of working to a new array - super KISS.
Here's an idea off the top of my head.
First, generate an array of integers representing the indices to remove. So "remove 5 from index 97" would generate [97,98,99,0,1]. This can be done with the application of a simple modulus operator.
Then, sort this array descending giving [99,98,97,1,0] and then remove the entries in that order.
Should work in all cases.
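A quick sketch of this idea, in JavaScript with a plain array standing in for the vector or NSMutableArray (the function name is made up, and count is clamped to the array length as an assumption of mine):

```javascript
// Build the wrapped indices, sort them descending, then remove in that order
// so that earlier removals never shift the indices still to be removed.
function removeCircular(arr, start, count) {
  const n = arr.length;
  count = Math.min(count, n);                     // cannot remove more than exists
  const indices = [];
  for (let k = 0; k < count; k++) indices.push((start + k) % n);
  indices.sort((a, b) => b - a);                  // e.g. [99, 98, 97, 1, 0]
  for (const idx of indices) arr.splice(idx, 1);
  return arr;
}

const ring = Array.from({ length: 100 }, (_, i) => i);
removeCircular(ring, 97, 5);
console.log(ring.length, ring[0], ring[ring.length - 1]); // 95 2 96
```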
This solution seems to work, and it copies all remaining elements in the vector only once (to their final destination).
Assume kNumElements, kStartIndex, and kNumToRemove are defined as const size_t values.
vector<int> my_vec(kNumElements);
for (size_t i = 0; i < my_vec.size(); ++i) {
my_vec[i] = i;
}
for (size_t i = 0, cur = 0; i < my_vec.size(); ++i) {
// What is the "distance" from the current index to the start, taking
// into account the wrapping behavior?
size_t distance = (i + kNumElements - kStartIndex) % kNumElements;
// If it's not one of the ones to remove, then we keep it by copying it
// into its proper place.
if (distance >= kNumToRemove) {
my_vec[cur++] = my_vec[i];
}
}
my_vec.resize(kNumElements - kNumToRemove);
There's nothing wrong with two loop solutions as long as they're readable and don't do anything redundant. I don't know Objective-C syntax, but here's the pseudocode approach I'd take:
if (Len <= after + howManyToDelete) //will have a second loop
    firstpass = Len - after - 1; //the items between 'after' and the end go in the first loop, the rest in the second
else
    firstpass = howManyToDelete; //the first loop will get them all
for (kount = 0; kount < firstpass; kount++)
    remove after+1
for ( ; kount < howManyToDelete; kount++) //if firstpass < howManyToDelete, clean up leftovers
    remove 0
This solution doesn't use mod, does the limit calculation outside the loop, and touches the relevant samples once each. The second for loop won't execute if all the samples were handled in the first loop.
The common way to do this in DSP is with a circular buffer. This is just a fixed length buffer with two associated counters:
//make sure BUFSIZE is a power of 2 for quick mod trick
#define BUFSIZE 1024
int CircBuf[BUFSIZE];
int InCtr, OutCtr;
void PutData(int *Buf, int count) {
int srcCtr;
int destCtr = InCtr & (BUFSIZE - 1); // if BUFSIZE is a power of 2, equivalent to and faster than destCtr = InCtr % BUFSIZE
for (srcCtr = 0; (srcCtr < count) && (destCtr < BUFSIZE); srcCtr++, destCtr++)
CircBuf[destCtr] = Buf[srcCtr];
for (destCtr = 0; srcCtr < count; srcCtr++, destCtr++)
CircBuf[destCtr] = Buf[srcCtr];
InCtr += count;
}
void GetData(int *Buf, int count) {
int srcCtr = OutCtr & (BUFSIZE - 1);
int destCtr = 0;
for (destCtr = 0; (srcCtr < BUFSIZE) && (destCtr < count); srcCtr++, destCtr++)
Buf[destCtr] = CircBuf[srcCtr];
for (srcCtr = 0; destCtr < count; srcCtr++, destCtr++) // wrapped part: continue until count items are copied
Buf[destCtr] = CircBuf[srcCtr];
OutCtr += count;
}
int BufferOverflow() {
return ((InCtr - OutCtr) > BUFSIZE);
}
This is pretty lightweight, but effective. And aside from the ctr = BigCtr & (SIZE-1) stuff, I'd argue it's highly readable. The only reason for the & trick is in old DSP environments, mod was an expensive operation so for something that ran often, like every time a buffer was ready for processing, you'd find ways to remove stuff like that. And if you were doing FFT's, your buffers were probably a power of 2 anyway.
These days, of course, you have 1 GHz processors and magically resizing arrays. You kids get off my lawn.
Another method:
N times do {remove entry at index P mod max(ArraySize, P)}
Example:
N=5, P=97, ArraySize=100
1: max(100, 97)=100 so remove at 97%100 = 97
2: max(99, 97)=99 so remove at 97%99 = 97 // array size is now 99
3: max(98, 97)=98 so remove at 97%98 = 97
4: max(97, 97)=97 so remove at 97%97 = 0
5: max(96, 97)=97 so remove at 97%97 = 0
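The rule above amounts to "remove at P while P is still a valid index, otherwise remove at 0". A small JavaScript sketch of it, with an invented function name and an extra guard of mine against emptying the array:

```javascript
// N times: remove at index P if it still exists, otherwise at index 0.
function removeLoopily(arr, p, n) {
  for (let k = 0; k < n && arr.length > 0; k++) {
    arr.splice(p < arr.length ? p : 0, 1);
  }
  return arr;
}

const items = Array.from({ length: 100 }, (_, i) => i);
removeLoopily(items, 97, 5);   // removes 97, 98, 99, 0, 1
console.log(items.length, items[0], items[items.length - 1]); // 95 2 96
```

Note that every removal shifts the following elements, so for large arrays a range-based removal is likely cheaper.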
I don't program for the iPhone, so I'll assume a std::vector; with it, this is quite easy, simple and elegant enough:
#include <iostream>
using std::cout;
#include <vector>
using std::vector;
#include <cassert> //no need for using, assert is macro
template<typename T>
void eraseCircularVector(vector<T> & vec, size_t position, size_t count)
{
assert(count <= vec.size());
if (count > 0)
{
position %= vec.size(); //normalize position
size_t positionEnd = (position + count) % vec.size();
if (positionEnd <= position) // range wraps around; equality means count == vec.size(), so erase everything
{
vec.erase(vec.begin() + position, vec.end());
vec.erase(vec.begin(), vec.begin() + positionEnd);
}
else
vec.erase(vec.begin() + position, vec.begin() + positionEnd);
}
}
int main()
{
vector<int> values;
for (int i = 0; i < 10; ++i)
values.push_back(i);
cout << "Values: ";
for (vector<int>::const_iterator cit = values.begin(); cit != values.end(); cit++)
cout << *cit << ' ';
cout << '\n';
eraseCircularVector(values, 5, 1); //remains 9: 0,1,2,3,4,6,7,8,9
eraseCircularVector(values, 16, 5); //remains 4: 3,4,6,7
cout << "Values: ";
for (vector<int>::const_iterator cit = values.begin(); cit != values.end(); cit++)
cout << *cit << ' ';
cout << '\n';
return 0;
}
However, you might consider:
creating a new loop_vector class, if you use this kind of functionality often enough
using a list if you perform many deletions (or even a few deletions on a large array, unless they are at the end, which is a simple pop_back)
If your container (NSMutableArray or whatever) is not a list but a vector (i.e. a resizable array), you most definitely don't want to delete items one by one, but as a whole range (e.g. std::vector's erase(begin, end))!
Edit: reacting to a comment, to fully realize what a vector must do when you erase an element other than the last one: it must move every element after it (e.g. with 1000 items in the array, erasing the first one means moving 999 items, which is very costly).
Example:
#include <iostream>
#include <vector>
#include <ctime>
using namespace std;
int main()
{
clock_t start, end;
vector<int> vec;
const int items = 64 * 1024;
cout << "using " << items << " items in vector\n";
for (size_t i = 0; i < items; ++i) vec.push_back(i);
start = clock();
while (!vec.empty()) vec.erase(vec.begin());
end = clock();
cout << "Inefficient method took: "
<< (end - start) * 1.0 / CLOCKS_PER_SEC << " s\n"; // clock ticks / CLOCKS_PER_SEC gives seconds
for (size_t i = 0; i < items; ++i) vec.push_back(i);
start = clock();
vec.erase(vec.begin(), vec.end());
end = clock();
cout << "Efficient method took: "
<< (end - start) * 1.0 / CLOCKS_PER_SEC << " s\n";
return 0;
}
Produces output:
using 65536 items in vector
Inefficient method took: 1.705 s
Efficient method took: 0 s
Note that it's very easy to end up with inefficient code; have a look at e.g. http://www.cplusplus.com/reference/stl/vector/erase/