I would like to filter a lazy structure and then reduce it using Swift language.
func main() -> () {
    let result = (1...)
        .lazy
        .filter { $0 < 3 }
        .reduce(0, { $0 + $1 })
    return print(
        result
    )
}
main()
This code compiles; however, the program doesn't run properly (it takes far too long to finish). The result printed on the screen should be 3.
Is there a way to accomplish this goal?
The issue is not that your sequence is lazy, it's that your sequence is infinite. You might be looking for the sequence method. Example:
let s = sequence(first: 0) {
    $0 > 3 ? nil : $0 + 1
}
let result = s.reduce(0) { $0 + $1 }
print(result) // 10
s is lazy and is potentially infinite, but not actually infinite, because the closure that generates the next item in the series exits early by returning nil once the sequence passes 3. This is similar to your filter except that it is not a filter, it's a stopper. You can use any condition you like here to generate the stopper.
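For the original goal (summing the values below 3), a minimal sketch of the same stopper idea, assuming Swift 3 or later where sequence(first:next:) is available:
// Sketch: stop generating once the next value would reach 3,
// so only 1 and 2 are produced; their sum is 3.
let belowThree = sequence(first: 1) { $0 + 1 < 3 ? $0 + 1 : nil }
print(belowThree.reduce(0, +)) // 3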
To get the program to terminate, you would need to add an upper limit on your original sequence (1...).
As it's written, you have an infinite sequence of numbers, starting at 1. The following operators — the filter, in particular — have no way of "knowing" that they can discard the rest of the sequence once they pass 3. They have to process the entire infinite sequence, filtering out all but the first two elements, before your reduce can produce a final result and you can print it out. (In practice, you'll eventually overflow Int, so the program would terminate then, but that's not really a good thing to rely on.)
If you don't want to change the original (1...), you can approximate the same behavior by swapping out your filter with a prefix. A filter has to look at every element; a prefix can "know" that it stops after a certain number of elements. For example, this runs very quickly and prints out 3:
let result = (1...)
    .lazy
    .prefix(2)
    .reduce(0) { $0 + $1 }
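If you'd rather keep the original < 3 condition instead of hard-coding a count of 2, a lazy prefix(while:) should work as the "stopper" too (a sketch, assuming Swift 3.1 or later, where prefix(while:) exists):
// Sketch: prefix(while:) stops at the first element that fails
// the predicate, so the original condition is kept verbatim.
let sum = (1...)
    .lazy
    .prefix(while: { $0 < 3 })
    .reduce(0, +)
print(sum) // 3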
This might be a bad question but I am curious.
I was following some data structures and algorithms courses online, and I came across algorithms such as selection sort, insertion sort, bubble sort, merge sort, quick sort, heap sort.. They almost never get close to O(n) when the array is reverse-sorted.
I was wondering one thing: why are we not using space in return of time?
When I organise something, I pick an item up and put it where it belongs. So I thought that if we have an array of items, we could just put each value at the index equal to that value.
Here is my implementation in Swift 4:
let simpleArray = [5,8,3,2,1,9,4,7,0]
let maxSpace = 20
func spaceSort(array: [Int]) -> [Int] {
    guard array.count > 1 else {
        return array
    }

    var realResult = [Int]()
    var result = Array<Int>(repeating: -1, count: maxSpace)

    // Mark each value at its own index.
    for i in 0..<array.count {
        if result[array[i]] != array[i] {
            result[array[i]] = array[i]
        }
    }

    // Collect the marked indices in order.
    for i in 0..<result.count {
        if result[i] != -1 {
            realResult.append(i)
        }
    }
    return realResult
}
var spaceSorted = [Int]()
var execTime = BenchTimer.measureBlock {
    spaceSorted = spaceSort(array: simpleArray)
}
print("Average execution time for simple array: \(execTime)")
print(spaceSorted)
Does this sorting algorithm exist already?
Is this a bad idea because it only takes unique values and loses the duplicates? Or could there be uses for it?
And why can't I use Int.max for the maxSpace?
Edit:
I get the error below when I use let maxSpace = Int.max:
error: Execution was interrupted.
MyPlayground(6961,0x7000024af000) malloc: Heap corruption detected, free list is damaged at 0x600003b7ebc0
*** Incorrect guard value: 0
MyPlayground(6961,0x7000024af000) malloc: *** set a breakpoint in malloc_error_break to debug
Thanks for the answers
This is an extreme version of radix sort. Quoted from Wikipedia:
radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.
In this case you choose your radix as maxSpace, and so you don't have any "elements with more than one significant digit" (from quote above).
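As an aside, the duplicate-losing limitation from your second question disappears if you store counts instead of flags, which is the classic counting sort formulation. A hedged sketch (my code, assuming non-negative values below maxValue):
// Sketch: counting sort. A count per value preserves duplicates.
func countingSort(_ array: [Int], maxValue: Int) -> [Int] {
    var counts = Array(repeating: 0, count: maxValue)
    for value in array { counts[value] += 1 }
    var result = [Int]()
    result.reserveCapacity(array.count)
    for value in 0..<maxValue {
        result.append(contentsOf: repeatElement(value, count: counts[value]))
    }
    return result
}
print(countingSort(simpleArray, maxValue: maxSpace))
// [0, 1, 2, 3, 4, 5, 7, 8, 9]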
Now, if you used a hash set data structure instead of an array, you would not actually need to allocate space for the whole range. You would still keep all the loop iterations (from 0 to maxSpace), checking whether the hash set contains the value of i (the loop variable) and, if so, outputting it.
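A rough sketch of that idea in Swift (Set is hash-based, so the full range is never preallocated, though the 0..<maxSpace loop remains):
// Sketch: Set membership replaces the preallocated marker array.
func spaceSortUsingSet(_ array: [Int]) -> [Int] {
    let present = Set(array)
    var result = [Int]()
    for i in 0..<maxSpace where present.contains(i) {
        result.append(i)
    }
    return result
}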
This can only be an efficient algorithm if maxSpace has the same order of magnitude as the number of elements in your input array. Other sorting algorithms achieve O(n log n) time complexity, so for cases where maxSpace is much greater than n log n, the algorithm is not that compelling.
New to Scala. I'm iterating a for loop 100 times. 10 times I want condition 'a' to be met and 90 times condition 'b'. However I want the 10 a's to occur at random.
The best way I can think is to create a val of 10 random integers, then loop through 1 to 100 ints.
For example:
val z = List.fill(10)(100).map(scala.util.Random.nextInt)
z: List[Int] = List(71, 5, 2, 9, 26, 96, 69, 26, 92, 4)
Then something like:
for (i <- 1 to 100) {
    whenever i == to a number in z: 'Condition a met: do something'
    else {
        'condition b met: do something else'
    }
}
I tried using contains and == and != but nothing seemed to work. How else can I do this?
Your generation of random numbers could yield duplicates... is that OK? Here's how you can easily generate 10 unique numbers from 1-100 (by generating a randomly shuffled sequence of 1-100 and taking the first ten):
val r = scala.util.Random.shuffle(1 to 100).toList.take(10)
Now you can simply partition the range 1-100 into those that are contained in your randomly generated list and those that are not:
val (listOfA, listOfB) = (1 to 100).partition(r.contains(_))
Now do whatever you want with those two lists, e.g.:
println(listOfA.mkString(","))
println(listOfB.mkString(","))
Of course, you can always simply go through the list one by one:
(1 to 100).map {
    case i if r.contains(i) => println("yes: " + i) // or whatever
    case i => println("no: " + i)
}
What you consider to be a simple for-loop actually isn't one. It's a for-comprehension, syntax sugar that desugars into chained calls of map, flatMap and filter. Yes, it can be used in the same way as you would use a classical for-loop, but this is only because List is in fact a monad. Without going into too much detail, if you want to do things the idiomatic Scala way (the "functional" way), you should avoid writing classical iterative for-loops and prefer getting a collection of your data and then mapping over its elements to perform whatever it is that you need. Note that collections have a really rich library behind them which allows you to invoke cool methods such as partition.
EDIT (for completeness):
Also, you should avoid side-effects, or at least push them as far down the road as possible. I'm talking about the second example from my answer. Let's say you really need to log that stuff (you would be using a logger, but println is good enough for this example). Doing it like this is bad. Btw note that you could use foreach instead of map in that case, because you're not collecting results, just performing the side effects.
Good way would be to compute the needed stuff by modifying each element into an appropriate string. So, calculate the needed strings and accumulate them into results:
val results = (1 to 100).map {
    case i if r.contains(i) => "yes: " + i // or whatever
    case i => "no: " + i
}
// do whatever with results, e.g. print them
Now results contains a list of a hundred "yes x" and "no x" strings, but you didn't do the ugly thing and perform logging as a side effect in the mapping process. Instead, you mapped each element of the collection into a corresponding string (note that original collection remains intact, so if (1 to 100) was stored in some value, it's still there; mapping creates a new collection) and now you can do whatever you want with it, e.g. pass it on to the logger. Yes, at some point you need to do "the ugly side effect thing" and log the stuff, but at least you will have a special part of code for doing that and you will not be mixing it into your mapping logic which checks if number is contained in the random sequence.
(1 to 100).foreach { x =>
    if (z.contains(x)) {
        // do something
    } else {
        // do something else
    }
}
or you can use a partial function, like so:
(1 to 100).foreach {
    case x if z.contains(x) => // do something
    case _ => // do something else
}
What I want to do is populate an Array (sequence) by appending the elements of another Array (availableExercises), one by one. I want to do it one by one because the sequence has to hold a given number of items. The available exercises list is finite by nature, and I want to reuse its elements as many times as needed, so the sequence length need not be a multiple of the available list's length.
The current code included does exactly that and works. It is possible to just paste that in a Playground to see it at work.
My question is: Is there a better Swift 3 way to achieve the same result? Although the code works, I'd like not to need the variable i. Swift 3 allows for constructs like closures, and I'm failing to see how I could use them better. It seems to me there should be a better structure for this which is just out of reach at the moment.
Here's the code:
import UIKit
let repTime = 20 //seconds
let restTime = 10 //seconds
let woDuration = 3 //minutes
let totalWOTime = woDuration * 60
let sessionTime = repTime + restTime
let totalSessions = totalWOTime / sessionTime
let availableExercises = ["push up","deep squat","burpee","HHSA plank"]
var sequence = [String]()
var i = 0
while sequence.count < totalSessions {
    if i < availableExercises.count {
        sequence.append(availableExercises[i])
        i += 1
    }
    else { i = 0 }
}
sequence
You can get rid of i by using the remainder sequence.count % availableExercises.count as the index, like this:
var sequence = [String]()
while sequence.count < totalSessions {
    let currentIndex = sequence.count % availableExercises.count
    sequence.append(availableExercises[currentIndex])
}
print(sequence)
//["push up", "deep squat", "burpee", "HHSA plank", "push up", "deep squat"]
You can condense your logic by using map(_:) and the remainder operator %:
let sequence = (0..<totalSessions).map {
    availableExercises[$0 % availableExercises.count]
}
map(_:) will iterate from 0 up to (but not including) totalSessions, and for each index, the corresponding element in availableExercises will be used in the result, with the remainder operator allowing you to 'wrap around' once you reach the end of availableExercises.
This also has the advantage of preallocating the resultant array (which map(_:) will do for you), preventing it from being needlessly re-allocated upon appending.
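With the question's numbers (totalSessions evaluates to 6), this should produce the same result as the while-loop version:
print(sequence)
// ["push up", "deep squat", "burpee", "HHSA plank", "push up", "deep squat"]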
Personally, I think Nirav's solution is probably the best, but I can't help offering this one, particularly because it demonstrates (pseudo-)infinite lazy sequences in Swift:
Array(
    repeatElement(availableExercises, count: .max)
        .joined()
        .prefix(totalSessions))
If you just want to iterate over this, you of course don't need the Array(), you can leave the whole thing lazy. Wrapping it up in Array() just forces it to evaluate immediately ("strictly") and avoids the crazy BidirectionalSlice<FlattenBidirectionalCollection<Repeated<Array<String>>>> type.
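For instance, a sketch of iterating lazily without ever materializing the Array:
// Sketch: the flattened, prefixed sequence is computed on demand.
let plan = repeatElement(availableExercises, count: .max)
    .joined()
    .prefix(totalSessions)
for exercise in plan {
    print(exercise)
}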
I am currently taking an online algorithms course in which the teacher doesn't give code to solve the algorithm, but rather rough pseudo code. So before taking to the internet for the answer, I decided to take a stab at it myself.
In this case, the algorithm that we were looking at is the merge sort algorithm. After being given the pseudo code we also dove into analyzing the algorithm for run times against n number of items in an array. After a quick analysis, the teacher arrived at 6n log2(n) + 6n as an approximate run time for the algorithm.
The pseudo code given was for the merge portion of the algorithm only and was given as follows:
C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1

for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
    end
end
He basically did a breakdown of the above, arriving at 4n+2 (2 for the declarations of i and j, and 4 for the operations performed per pass: the for, the if, the array-position assignment, and the increment). He simplified this, I believe for the sake of the class, to 6n.
This all makes sense to me; my question arises from the implementation that I am performing, how it affects the algorithm, and some of the tradeoffs/inefficiencies it may add.
Below is my code in swift using a playground:
func mergeSort<T: Comparable>(_ array: [T]) -> [T] {
    guard array.count > 1 else { return array }

    let lowerHalfArray = array[0..<array.count / 2]
    let upperHalfArray = array[array.count / 2..<array.count]

    // Recurse on each half (the parameter is unlabeled, so no `array:` label).
    let lowerSortedArray = mergeSort(Array(lowerHalfArray))
    let upperSortedArray = mergeSort(Array(upperHalfArray))

    return merge(lhs: lowerSortedArray, rhs: upperSortedArray)
}
func merge<T: Comparable>(lhs: [T], rhs: [T]) -> [T] {
    guard lhs.count > 0 else { return rhs }
    guard rhs.count > 0 else { return lhs }

    var i = 0
    var j = 0
    var mergedArray = [T]()

    let loopCount = lhs.count + rhs.count

    for _ in 0..<loopCount {
        // Take from lhs when rhs is exhausted, or when lhs's current element is smaller.
        if j == rhs.count || (i < lhs.count && lhs[i] < rhs[j]) {
            mergedArray.append(lhs[i])
            i += 1
        } else {
            mergedArray.append(rhs[j])
            j += 1
        }
    }
    return mergedArray
}
let values = [5,4,8,7,6,3,1,2,9]
let sortedValues = mergeSort(values)
My questions for this are as follows:
Do the guard statements at the start of the merge<T:Comparable> function actually make it more inefficient? Considering we are always halving the array, the only time that it will hold true is for the base case and when there is an odd number of items in the array.
To me this seems like it adds more processing for minimal return, since it only triggers once we have halved the array to the point where one side has no items.
Concerning my if statement in the merge: since it checks more than one condition, does this affect the overall efficiency of the algorithm I have written? If so, the effect seems to vary based on when it breaks out of the if statement (e.g., at the first condition or the second).
Is this something that is considered heavily when analyzing algorithms, and if so how do you account for the variance when it breaks out from the algorithm?
Any other analysis/tips you can give me on what I have written would be greatly appreciated.
You will very soon learn about Big-O and Big-Theta where you don't care about exact runtimes (believe me when I say very soon, like in a lecture or two). Until then, this is what you need to know:
Yes, the guards take some time, but it is the same amount of time in every call. So if each call takes X amount of time without the guard and you do n function calls, then it takes X*n amount of time in total. Now add in the guards, which take Y amount of time per call. You now need (X+Y)*n time in total. This is a constant factor, and when n becomes very large the (X+Y) factor becomes negligible compared to the n factor. That is, if you can reduce a function from X*n to (X+Y)*(log n), then it is worthwhile to add the Y amount of work, because you do fewer iterations in total. For instance, with X = 5, Y = 1 and n = 1,000,000, X*n is 5,000,000 while (X+Y)*log2(n) is only about 120.
The same reasoning applies to your second question. Yes, checking "if X or Y" takes more time than checking "if X" but it is a constant factor. The extra time does not vary with the size of n.
In some languages you only check the second condition if the first fails. How do we account for that? The simplest solution is to realize that the upper bound of the number of comparisons will be 3, while the number of iterations can be potentially millions with a large n. But 3 is a constant number, so it adds at most a constant amount of work per iteration. You can go into nitty-gritty details and try to reason about the distribution of how often the first, second and third condition will be true or false, but often you don't really want to go down that road. Pretend that you always do all the comparisons.
So yes, adding the guards might be bad for your runtime if you do the same number of iterations as before. But sometimes adding extra work in each iteration can decrease the number of iterations needed.
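A toy illustration of that last point (my own example, not from the course): each binary search iteration does strictly more comparisons and arithmetic than a linear scan iteration, yet it wins by doing roughly log2(n) iterations instead of n (it does assume the array is sorted).
// Linear scan: cheap iterations, up to n of them.
func linearContains(_ x: Int, in a: [Int]) -> Bool {
    for e in a where e == x { return true }
    return false
}

// Binary search: pricier iterations, but only ~log2(n) of them.
func binaryContains(_ x: Int, in a: [Int]) -> Bool {
    var lo = 0
    var hi = a.count - 1
    while lo <= hi {
        let mid = (lo + hi) / 2
        if a[mid] == x { return true }
        if a[mid] < x { lo = mid + 1 } else { hi = mid - 1 }
    }
    return false
}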
Which loop should I use when I have to be extremely aware of the time it takes to iterate over a large array?
Short answer
Don’t micro-optimize like this – any difference there is could be far outweighed by the speed of the operation you are performing inside the loop. If you truly think this loop is a performance bottleneck, perhaps you would be better served by something like the Accelerate framework – but only if profiling shows you that the effort is truly worth it.
And don’t fight the language. Use for…in unless what you want to achieve cannot be expressed with for…in. These cases are rare. The benefit of for…in is that it’s incredibly hard to get it wrong. That is much more important. Prioritize correctness over speed. Clarity is important. You might even want to skip a for loop entirely and use map or reduce.
Longer Answer
For arrays, if you try them without the fastest compiler optimization, they perform identically, because they essentially do the same thing.
Presumably your for ;; loop looks something like this:
var sum = 0
for var i = 0; i < a.count; ++i {
    sum += a[i]
}
and your for…in loop something like this:
for x in a {
    sum += x
}
Let’s rewrite the for…in to show what is really going on under the covers:
var g = a.generate()
while let x = g.next() {
    sum += x
}
And then let’s rewrite that for what a.generate() returns, and something like what the let is doing:
var g = IndexingGenerator<[Int]>(a)
var wrapped_x = g.next()
while wrapped_x != nil {
    let x = wrapped_x!
    sum += x
    wrapped_x = g.next()
}
Here is what the implementation for IndexingGenerator<[Int]> might look like:
struct IndexingGeneratorArrayOfInt {
    private let _seq: [Int]
    var _idx: Int = 0

    init(_ seq: [Int]) {
        _seq = seq
    }

    // next(), not generate(): this is the method the while loop above calls.
    mutating func next() -> Int? {
        if _idx != _seq.endIndex {
            return _seq[_idx++]
        }
        else {
            return nil
        }
    }
}
Wow, that’s a lot of code, surely it performs way slower than the regular for ;; loop!
Nope. Because while that might be what it is logically doing, the compiler has a lot of latitude to optimize. For example, note that IndexingGeneratorArrayOfInt is a struct, not a class. This means it has no overhead over declaring the two member variables directly. It also means the compiler might be able to inline the code in next – there is no indirection going on here, no overloaded methods and vtables or objc_msgSend. Just some simple pointer arithmetic and dereferencing. If you strip away all the syntax for the structs and method calls, you’ll find that what the for…in code ends up being is almost exactly the same as what the for ;; loop is doing.
for…in helps avoid performance errors
If, on the other hand, for the code given at the beginning, you switch compiler optimization to the faster setting, for…in appears to blow for ;; away. In some non-scientific tests I ran using XCTestCase.measureBlock, summing a large array of random numbers, it was an order of magnitude faster.
Why? Because of the use of count:
for var i = 0; i < a.count; ++i {
    //             ^-- calling a.count every time...
    sum += a[i]
}
Maybe the optimizer could have fixed this for you, but in this case it hasn’t. If you pull the invariant out, it goes back to being the same as for…in in terms of speed:
let count = a.count
for var i = 0; i < count; ++i {
    sum += a[i]
}
“Oh, I would definitely do that every time, so it doesn’t matter”. To which I say, really? Are you sure? Bet you forget sometimes.
But you want the even better news? Doing the same summation with reduce was (in my, again not very scientific, tests) even faster than the for loops:
let sum = a.reduce(0, +)
But it is also so much more expressive and readable (IMO), and allows you to use let to declare your result. Given that this should be your primary goal anyway, the speed is an added bonus. But hopefully the performance will give you an incentive to do it regardless.
This is just for arrays, but what about other collections? Of course this depends on the implementation, but there’s good reason to believe it would be faster for other collections too, such as dictionaries and custom user-defined collections.
My reason for this would be that the author of the collection can implement an optimized version of generate, because they know exactly how the collection is being used. Suppose subscript lookup involves some calculation (such as pointer arithmetic in the case of an array – you have to multiply the index by the element size and then add that to the base pointer). In the case of generate, you know what is being done is a sequential walk of the collection, and therefore you can optimize for this (for example, in the case of an array, hold a pointer to the next element, which you increment each time next is called). The same goes for specialized member versions of reduce or map.
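To illustrate (a sketch in current Swift spelling, with made-up types, not code from the standard library): a linked list whose subscript would cost O(n) can still offer O(1)-per-step sequential walks, because its iterator just keeps a reference to the next node:
final class Node<T> {
    let value: T
    var next: Node<T>?
    init(_ value: T, next: Node<T>? = nil) {
        self.value = value
        self.next = next
    }
}

struct ListSequence<T>: Sequence, IteratorProtocol {
    var current: Node<T>?
    mutating func next() -> T? {
        defer { current = current?.next }  // advance after yielding
        return current?.value
    }
}

let list = Node(1, next: Node(2, next: Node(3)))
print(ListSequence(current: list).reduce(0, +)) // 6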
This might even be why reduce is performing so well on arrays – who knows (you could stick a breakpoint on the function passed in if you wanted to try and find out). But it’s just another justification for using the language construct you should probably be using regardless.
Famously stated: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil" (Donald Knuth). It seems unlikely that you are in the 3%.
Focus on the bigger problem at hand. After it is working, if it needs a performance boost, then worry about for loops. But I guarantee you, in the end, bigger structural inefficiencies or poor algorithm choice will be the performance problem, not a for loop.
Worrying about for loops is oh so 1960s.
FWIW, a rudimentary playground test shows map() is about 10 times faster than for enumeration:
class SomeClass1 {
    let value: UInt32 = arc4random_uniform(100)
}

class SomeClass2 {
    let value: UInt32
    init(value: UInt32) {
        self.value = value
    }
}

var someClass1s = [SomeClass1]()
for _ in 0..<1000 {
    someClass1s.append(SomeClass1())
}

var someClass2s = [SomeClass2]()
let startTimeInterval1 = CFAbsoluteTimeGetCurrent()
someClass1s.map { someClass2s.append(SomeClass2(value: $0.value)) }
println("Time1: \(CFAbsoluteTimeGetCurrent() - startTimeInterval1)") // "Time1: 0.489435970783234"

var someMoreClass2s = [SomeClass2]()
let startTimeInterval2 = CFAbsoluteTimeGetCurrent()
for item in someClass1s { someMoreClass2s.append(SomeClass2(value: item.value)) }
println("Time2: \(CFAbsoluteTimeGetCurrent() - startTimeInterval2)") // "Time2: 4.81457495689392"
The for loop (with a counter) just increments a counter, which is very fast. The for-in loop uses an iterator (calling an object to produce the next element), which is slower. But in both cases you ultimately access the element, so in the end it makes no real difference.