For clarification, the first(where:) method iterates through the sequence until it finds an element satisfying the predicate and returns it.
Based on that, I would assume it is not O(n) (linear time), because in many cases it doesn't have to iterate through the whole sequence to its end.
You could check: What is the difference between filter(_:).first and first(where:)?
I'm not sure if it could be something related to O(log n); AFAIK that has something to do with splitting into halves...
It would be great if someone could describe how we can determine the time complexity for such a process.
We are usually interested in the worst case running time of a program. Based on that, it should be O(n) as the worst case is when it iterates through all the elements.
On average you'll only have to check 1/2 of the values, so you'd think first(where:) would be O(1/2 N). But O() notation ignores constants. O(N) means it grows linearly as the number of elements grows. For 10 items, you'd check 5 on average, for 100, you'd check 50, for 1000, you'd check 500 on average. Connect the points (10,5), (100,50), (1000, 500). That's a straight line.
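A quick Swift sketch of that behavior: first(where:) stops at the first match, but if no element matches (or the match is last) it still has to visit every element, which is why the worst case is O(n).

```swift
let numbers = [3, 7, 42, 5, 42]

// Stops scanning as soon as the predicate is satisfied; the second 42 is never visited.
let firstBig = numbers.first(where: { $0 > 10 })   // Optional(42)

// No element matches, so every element is visited before returning nil -> O(n) worst case.
let none = numbers.first(where: { $0 > 100 })      // nil
```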
Related
I know that when you resize an array by a multiplicative factor (like doubling the length of the array, then copying all elements into the new, bigger array), the amortized time complexity is O(1).
But why is it the case that when you grow it by a constant amount (say, resizing it by +10 each time) it is not O(1) as well?
Edit: https://www.cs.utexas.edu/~slaberge/docs/topics/amortized/dynamic_arrays/ this site seems to explain it, but I am very confused by the math. Where does big $N$ come from? I thought we were dealing with $k$?
If every k-th insertion costs as much as the number of elements already in the array (denote it n + N*k, where n is the initial size of the array and N counts how many expensive steps have happened so far), then you get sequences of this type:
n O(1) Operations
Expensive operation of O(n)
k O(1) Operations
Expensive operation of O(n+k)
k O(1) Operations
Expensive operation of O(n+2k)
k O(1) Operations
Expensive operation of O(n+3k)
See where this is going? Each expensive insertion happens every k insertions (except the first time) and costs as much as the current number of elements.
This means that after, let's simplify, n + A*k insertions we had A copies of the original n elements, plus A-1 copies of the first set of k elements, A-2 copies of the second set of k elements, and so on...
This sums up to O(An + A^2 * k) total copy work. And because we did n + A*k insertions, we can divide to get the amortized cost per insertion.
This gives us (An + A^2 * k) / (n + A*k) = O(A).
So the amortized cost grows with the NUMBER OF INSERTIONS, which is bad: we can't state that this array does a constant amount of work per insertion on average.
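A small counting sketch (the names are mine, not from the linked page) makes the difference concrete: count how many element copies each growth policy performs over the same number of insertions.

```swift
// Hypothetical helper: counts total element copies for a given growth policy,
// assuming a full copy of the existing elements whenever capacity is exceeded.
func totalCopies(insertions: Int, grow: (Int) -> Int) -> Int {
    var capacity = 1, size = 0, copies = 0
    for _ in 0..<insertions {
        if size == capacity {
            copies += size            // copy every existing element into new storage
            capacity = grow(capacity)
        }
        size += 1
    }
    return copies
}

let n = 100_000
let doubling = totalCopies(insertions: n) { $0 * 2 }   // ~2n copies     -> O(1) amortized
let plusTen  = totalCopies(insertions: n) { $0 + 10 }  // ~n^2/20 copies -> O(n) amortized
print(doubling, plusTen)
```

Doubling keeps the total copy work proportional to n, while growing by a fixed +10 makes it quadratic, which matches the A-dependent amortized cost derived above.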
I think quicksort is less efficient when sorting an array with duplicate data, right? When the datatype is char, the bigger the array (over 100,000 elements), the closer it gets to n^2 behavior.
And assuming there is no duplicate data, to get the best case of a quicksort where the first element is used as the pivot, I think we can recursively swap the first and middle elements, dividing the already sorted array like a merge sort, right? Is there a general best case?
Lomuto partition scheme, which scans from one end to the other during partition, is slower with duplicates. If all the values are the same, then each partition step splits it into sizes 1 and n-1, a worst case scenario.
Hoare partition scheme, which scans from both ends towards each other until the indexes (or iterators or pointers) cross, is usually faster with duplicates. Even though duplicates result in more swaps, each swap occurs just after reading and comparing two values to the pivot, so they are still in the cache for the swap (assuming object size is not huge). As the number of duplicates increases, the splitting improves towards the ideal case where each partition step splits the data into two equal halves. I ran a benchmark sorting 16 million 64-bit integers: with random data it took about 1.37 seconds, improving with duplicates, and with all values the same it took about 0.288 seconds.
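For reference, here is a minimal sketch of the Hoare scheme described above, following the standard textbook formulation (not the exact benchmark code): the indexes stop on elements equal to the pivot, which is what spreads duplicates across both halves.

```swift
// Hoare partition: scan from both ends until the indexes cross.
func hoarePartition<T: Comparable>(_ a: inout [T], _ lo: Int, _ hi: Int) -> Int {
    let pivot = a[lo + (hi - lo) / 2]
    var i = lo - 1
    var j = hi + 1
    while true {
        repeat { i += 1 } while a[i] < pivot
        repeat { j -= 1 } while a[j] > pivot
        if i >= j { return j }
        a.swapAt(i, j)
    }
}

func quicksortHoare<T: Comparable>(_ a: inout [T], _ lo: Int, _ hi: Int) {
    guard lo < hi else { return }
    let p = hoarePartition(&a, lo, hi)
    quicksortHoare(&a, lo, p)
    quicksortHoare(&a, p + 1, hi)
}

var xs = [5, 1, 5, 5, 3, 5, 2, 5]
quicksortHoare(&xs, 0, xs.count - 1)
print(xs)  // [1, 2, 3, 5, 5, 5, 5, 5]
```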
Another alternative is a 3 way partition, which splits a partition into elements < pivot, elements == pivot, elements > pivot. If all the elements are the same, it's done in O(n) time. For n elements with only k possible values, then time complexity is O(n ⌈log3(k)⌉), and since k is constant, the time complexity is still O(n).
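And a minimal sketch of the 3-way (Dutch national flag) partition: elements equal to the pivot end up in the middle band and are never recursed on, so an all-equal array finishes in a single O(n) pass.

```swift
func quicksort3<T: Comparable>(_ a: inout [T], _ lo: Int, _ hi: Int) {
    guard lo < hi else { return }
    let pivot = a[lo]
    var lt = lo        // a[lo..<lt]      < pivot
    var gt = hi        // a[(gt+1)...hi]  > pivot
    var i = lo         // a[lt..<i]      == pivot, a[i...gt] not yet examined
    while i <= gt {
        if a[i] < pivot {
            a.swapAt(lt, i); lt += 1; i += 1
        } else if a[i] > pivot {
            a.swapAt(i, gt); gt -= 1
        } else {
            i += 1
        }
    }
    quicksort3(&a, lo, lt - 1)   // strictly smaller part
    quicksort3(&a, gt + 1, hi)   // strictly larger part
}

var data = [5, 3, 5, 5, 1, 5, 9, 5]
quicksort3(&data, 0, data.count - 1)
print(data)  // [1, 3, 5, 5, 5, 5, 5, 9]
```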
Wiki links:
https://en.wikipedia.org/wiki/Quicksort#Repeated_elements
https://en.wikipedia.org/wiki/Dutch_national_flag_problem
Is it O(n) or O(n log n)? I have n elements that I need to insert into a hash table; what are the worst-case and average runtimes?
Worst case is unlimited. You need to calculate hash codes and may have to compare elements, and the time for that is not limited.
Assuming that calculating hashes and comparing elements is constant time, for insertion the worst case is O(n^2). What saves you is the fact that the worst case would be exceedingly rare, assuming a halfway decent hash function. Average time for a decent implementation is O(n).
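To illustrate the two extremes, here's a contrived Swift sketch (BadKey is my own example type, not anything from the question): a hash function that maps everything to the same bucket forces each insert to compare against the elements already stored, which is where the O(n^2) worst case for n insertions comes from.

```swift
// Deliberately terrible Hashable conformance: every key collides.
struct BadKey: Hashable {
    let value: Int
    static func == (lhs: BadKey, rhs: BadKey) -> Bool { lhs.value == rhs.value }
    func hash(into hasher: inout Hasher) {
        hasher.combine(0)    // constant hash -> all keys land in one bucket
    }
}

var bad = Set<BadKey>()
for i in 0..<10_000 {
    bad.insert(BadKey(value: i))   // each insert may compare against everything so far
}

// With a decent hash (e.g. the standard one for Int), each insert is expected O(1),
// so inserting n elements averages out to O(n).
var good = Set<Int>()
for i in 0..<10_000 {
    good.insert(i)
}
```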
It was a question on my final I took earlier and I had no idea how to answer it.
Well it was
What is Merge sort's worst case runtime but MORE IMPORTANTLY, why?
The divide-and-conquer contributes a log(n) factor. You divide the array in half log(n) times, and each time you do, for each segment, you have to do a merge on two sorted array. Merging two sorted arrays is O(n). The algorithm is just to walk up the two arrays, and walk up the one that's lagging.
The recurrence you get is $r(n) = O(n) + r(\lceil n/2 \rceil) + r(\lfloor n/2 \rfloor)$.
The problem is that you can't use the Master Theorem to solve this due to the rounding. Hence you can either do the math or use a little hack-like solution: if your input size isn't a power of two, just "blow it up" to the next power of two. Then you can use the Master Theorem on $r(n) = O(n) + 2r(n/2)$. Obviously this leads to O(n log n). The function merge() itself is O(n), because in the worst case you need n-1 comparisons.
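A minimal Swift sketch of the algorithm the recurrence describes: two recursive calls on the halves plus an O(n) merge, giving O(n log n) overall.

```swift
// Split, sort each half recursively, then merge the two sorted halves.
func mergeSort<T: Comparable>(_ a: [T]) -> [T] {
    guard a.count > 1 else { return a }
    let mid = a.count / 2
    let left = mergeSort(Array(a[..<mid]))
    let right = mergeSort(Array(a[mid...]))
    return merge(left, right)
}

// Merging two sorted arrays walks up both, always taking the smaller head:
// at most left.count + right.count - 1 comparisons, i.e. O(n).
func merge<T: Comparable>(_ left: [T], _ right: [T]) -> [T] {
    var result: [T] = []
    result.reserveCapacity(left.count + right.count)
    var i = 0, j = 0
    while i < left.count && j < right.count {
        if left[i] <= right[j] {
            result.append(left[i]); i += 1
        } else {
            result.append(right[j]); j += 1
        }
    }
    result.append(contentsOf: left[i...])
    result.append(contentsOf: right[j...])
    return result
}

print(mergeSort([5, 2, 8, 1, 9, 3]))  // [1, 2, 3, 5, 8, 9]
```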
So I have an array of numbers that look something like
1,708,234
2,802,532
11,083,432
5,098,123
5,777,111
I want to find out when two numbers are within a certain distance of each other (say 1,500,000) so I can group them into the same location and have just one UI element represent both at the zoom level I am looking at. How would one go about doing this smartly or efficiently? I'm thinking I would just start with the first entry, loop through all the elements, and if one was close to another, flag those two and put them in a dictionary of some sort. That would be my brute force method, but I'm thinking there has to be a better way.
I'm coding in obj-c btw if that makes or breaks any design decisions.
How many numbers are we dealing with here? If it's small enough:
Sort the numbers (generally n-log-n)
Run through each number, n, and compare it to its bigger neighbor, n+1, to see if it's within your range.
Repeat for n+2, n+3, until the number is no longer within your range.
Your brute force method there is roughly n^2/2 comparisons, i.e. O(n^2). This method brings it down to O(n + n log n), or O(n log n) in the average case.
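A rough Swift sketch of that sort-then-scan approach (the question is Objective-C, but the structure carries over; the grouping rule here chains each value to its predecessor, which is one reasonable reading of "within range of its neighbor"):

```swift
let threshold = 1_500_000
let values = [1_708_234, 2_802_532, 11_083_432, 5_098_123, 5_777_111]

let sorted = values.sorted()                      // O(n log n)
var groups: [[Int]] = []

for v in sorted {                                 // single O(n) pass
    if let prev = groups.last?.last, v - prev <= threshold {
        groups[groups.count - 1].append(v)        // close enough: join the current group
    } else {
        groups.append([v])                        // too far away: start a new group
    }
}

print(groups)
// [[1708234, 2802532], [5098123, 5777111], [11083432]]
```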