Haskell-like range in PureScript? - purescript

Is there a native function that behaves like Haskell's range?
I found out that 2..1 returns a list [2, 1] in PureScript, unlike Haskell's [2..1] returning an empty list []. After Googling around, I found the behavior is written in the Differences from Haskell documentation, but it doesn't give a rationale behind.
In my opinion, this behavior is somewhat inconvenient/unintuitive since 0 .. (len - 1) doesn't give an empty list when len is zero, and this could possibly lead to cryptic bugs.
Is there a way to obtain the expected array (i.e. range of length len incrementing from 0) without handling the length == 0 case every time?
Also, why did PureScript decide to make range behave like that?
P.S. How I ran into this question: I wanted to write a getLocalStorageKeys function, which gets all keys from the local storage. My implementation gets the number of keys using the length function, creates a range from 0 to numKeys - 1, and then traverses it with the key function. However, the range didn't behave as I expected.

How about just make it yourself?
indicies :: Int -> Array Int
indicies n = if n <= 0 then [] else 0..(n-1)
As far as "why", I can only speculate, and my speculation is that the idea was to avoid this kind of iffy logic for creating "reverse" ranges - which is something that does come up for me in Haskell once in a while.

Related

Replace with .lengthCompare warning

IntelliJ keep suggesting to replace .length == X with .lengthCompare(X) == 0. Why is that better? Don't quite get it, since the suggested changes are more verbose.
It is more efficient.
Since length is a linear operation on some collections like List, doing x.length == 3 would need to compute the length first and then compare it with the value. On the other hand .lengthCompare would terminate computing the length once it finds that the comparison is wrong already.
In Scala 2.13 we have lengthIs method which might be used to compare length of some collection just as length in this use case but with lengthCompare under the hood! So it is both efficient and readable. E.g.:
val list = List(1,2,3)
list.lengthIs > 2 // true
https://www.scala-lang.org/api/2.13.4/scala/collection/Seq.html#lengthIs:scala.collection.IterableOps.SizeCompareOps

Ambiguous use of 'lazy'

I have no idea why this example is ambiguous. (My apologies for not adding the code here, it's simply too long.)
I have added prefix (_ maxLength) as an overload to LazyDropWhileBidirectionalCollection. subscript(position) is defined on LazyPrefixCollection. Yet, the following code from the above example shouldn't be ambiguous, yet it is:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)[0]) // Ambiguous use of 'lazy'
It is my understanding that an overload that's higher up in the protocol hierarchy will get used.
According to the compiler it can't choose between two types; namely LazyRandomAccessCollection and LazySequence. (Which doesn't make sense since subscript(position) is not a method of LazySequence.) LazyRandomAccessCollection would be the logical choice here.
If I remove the subscript, it works:
print(Array([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2))) // [0, 1]
What could be the issue?
The trail here is just too complicated and ambiguous. You can see this by dropping elements. In particular, drop the last subscript:
let z = [0, 1, 2].lazy.drop(while: {_ in false}).prefix(2)
In this configuration, the compiler wants to type z as LazyPrefixCollection<LazyDropWhileBidirectionalCollection<[Int]>>. But that isn't indexable by integers. I know it feels like it should be, but it isn't provable by the current compiler. (see below) So your [0] fails. And backtracking isn't powerful enough to get back out of this crazy maze. There are just too many overloads with different return types, and the compiler doesn't know which one you want.
But this particular case is trivially fixed:
print([0, 1, 2].lazy.drop(while: {_ in false}).prefix(2).first!)
That said, I would absolutely avoid pushing the compiler this hard. This is all too clever for Swift today. In particular overloads that return different types are very often a bad idea in Swift. When they're simple, yes, you can get away with it. But when you start layering them on, the compiler doesn't have a strong enough proof engine to resolve it. (That said, if we studied this long enough, I'm betting it actually is ambiguous somehow, but the diagnostic is misleading. That's a very common situation when you get into overly-clever Swift.)
Now that you describe it (in the comments), the reasoning is straightforward.
LazyDropWhileCollection can't have an integer index. Index subscripting is required to be O(1). That's the meaning of the Index subscript versus other subscripts. (The Index subscript must also return the Element type or crash; it can't return an Element?. That's way there's a DictionaryIndex that's separate from Key.)
Since the collection is lazy and has an arbitrary number of missing elements, looking up any particular integer "count" (first, second, etc.) is O(n). It's not possible to know what the 100th element is without walking through at least 100 elements. To be a collection, its O(1) index has to be in a form that can only be created by having previously walked the sequence. It can't be Int.
This is important because when you write code like:
for i in 1...1000 { print(xs[i]) }
you expect that to be on the order of 1000 "steps," but if this collection had an integer index, it would be on the order of 1 million steps. By wrapping the index, they prevent you from writing that code in the first place.
This is especially important in highly generic languages like Swift where layers of general-purpose algorithms can easily cascade an unexpected O(n) operation into completely unworkable performance (by "unworkable" I mean things that you expected to take milliseconds taking minutes or more).
Change the last row to this:
let x = [0, 1, 2]
let lazyX: LazySequence = x.lazy
let lazyX2: LazyRandomAccessCollection = x.lazy
let lazyX3: LazyBidirectionalCollection = x.lazy
let lazyX4: LazyCollection = x.lazy
print(lazyX.drop(while: {_ in false}).prefix(2)[0])
You can notice that the array has 4 different lazy conformations - you will have to be explicit.

Should I count up in Perl 6 with a sequence or a range?

Perl 6 has lazy lists, but it also has unbounded Range objects. Which one should you choose for counting up by whole numbers?
And there's unbounded Range with two dots:
0 .. *
There's the Seq (sequence) with three dots:
0 ... *
A Range generates lists of consecutives thingys using their natural order. It inherits from Iterable, but also Positional so you can index a range. You can check if something is within a Range, but that's not part of the task.
A Seq can generate just about anything you like as long as it knows how to get to the next element. It inherits from Iterable, but also PositionalBindFailover which fakes the Positional stuff through a cache and list conversion. I don't think that a big deal if you're only moving from one element to the next.
I'm going back and forth on this. At the moment I'm thinking it's Range.
Both 0 .. * and 0 ... * are fine.
Iterating over them, for example with a for loop, has exactly the same effect in both cases. (Neither will leak memory by keeping around already iterated elements.)
Assigning them to a # variable produces the same lazy Array.
So as long as you only want to count up numbers to infinity by a step of 1, I don't see a downside to either.
The ... sequence construction operator is more generic though, in that it can also be used to
count with a different step (1, 3 ... *)
count downwards (10 ... -Inf)
follow a geometric sequence (2, 4, 8 ... *)
follow a custom iteration formula (1, 1, *+* ... *)
so when I need to do something like that, then I'd consider using ... for any nearby and related "count up by one" as well, for consistency.
On the other hand:
A Range can be indexed efficiently without having to generate and cache all preceding elements, so if you want to index your counter in addition to iterating over it, it is preferable. The same goes for other list operations that deal with element positions, like reverse: Range has efficient overloads for them, whereas using them on a Seq has to iterate and cache its elements first.
If you want to count upwards to a variable end-point (as in 1 .. $n), it's safer to use a Range because you can be sure it'll never count downwards, no matter what $n is. (If the endpoint is less than the startpoint, as in 1 .. 0, it will behave as an empty sequence when iterated, which tends to get edge-cases right in practice.)
Conversely, if you want to safely count downwards ensuring it will never unexpectedly count upwards, you can use reverse 1 .. $n.
Lastly, a Range is a more specific/high-level representation of the concept of "numbers from x to y", whereas a Seq represents the more generic concept of "a sequence of values". A Seq is, in general, driven by arbitrary generator code (see gather/take) - the ... operator is just semantic sugar for creating some common types of sequences. So it may feel more declarative to use a Range when "numbers from x to y" is the concept you want to express. But I suppose that's a purely psychological concern... :P
Semantically speaking, a Range is a static thing (a bounded set of values), a Seq is a dynamic thing (a value generator) and a lazy List a static view of a dynamic thing (an immutable cache for generated values).
Rule of thumb: Prefer static over dynamic, but simple over complex.
In addition, a Seq is an iterable thing, a List is an iterable positional thing, and a Range is an ordered iterable positional thing.
Rule of thumb: Go with the most generic or most specific depending on context.
As we're dealing with iteration only and are not interested in positional access or bounds, using a Seq (which is essentially a boxed Iterator) seems like a natural choice. However, ordered sets of consecutive integers are exactly what an integer Range represents, and personally that's what I would see as most appropriate for your particular use case.
When there is no clear choice, I tend to prefer ranges for their simplicity anyway (and try to avoid lazy lists, the heavy-weight).
Note that the language syntax also nudges you in the direction of Range, which are rather heavily Huffman-coded (two-char infix .., one-char prefix ^).
There is a difference between ".." (Range) and "..." (Seq):
$ perl6
> 1..10
1..10
> 1...10
(1 2 3 4 5 6 7 8 9 10)
> 2,4...10
(2 4 6 8 10)
> (3,6...*)[^5]
(3 6 9 12 15)
The "..." operator can intuit patterns!
https://docs.perl6.org/language/operators#index-entry-..._operators
As I understand, you can traverse a Seq only once. It's meant for streaming where you don't need to go back (e.g., a file). I would think a Range should be a fine choice.

Need an explanation for a confusing way the AND boolean works

I am tutoring someone in basic search and sorts. In insertion sort I iterate negatively when I have a value that is greater than the one previous to it in numerical terms. Now of course this approach can cause issues because there is a check which calls for array[-1] which does not exist.
As underlined in bold below, adding the and x > 0 boolean prevents the index issue.
My question is how is this the case? Wouldn't the call for array[-1] still be made to ensure the validity of both booleans?
the_list = [10,2,4,3,5,7,8,9,6]
for x in range(1,len(the_list)):
value = the_list[x]
while value < the_list[x-1] **and x > 0**:
the_list[x] = the_list[x-1]
x=x-1
the_list[x] = value
print the_list
I'm not sure I completely understand the question, and I don't know what programming language this is, but most modern programming languages use so-called short-circuit Boolean evaluation by default so that the logical expression isn't evaluated further once the outcome is known.
You can use that to guard against range overflow, like this:
while x > 0 and value < the_list[x-1]
but the check of x's range here must come before the use.
AND operation returns true if and only if both arguments are true, so if one of arguments is false there's no point of checking others as the final value is already known at that point. As for your example, usually evaluation goes from left to right but it is not a principle and it looks the language you used is not following that rule (othewise it still should crash on array lookup). But ut may be, this particular implementation optimizes this somehow (which IMHO is not good idea) and evaluates "simpler" things first (like checking if x > 0) before it look up the array. check the specs why this exact order works for you as in most popular languages you would still crash if test x > 0 wouldn't be evaluated before lookup

Why do Scala's index methods return -1 instead of None if the element is not found?

I've always been wondering why in Scala the various index methods for determining the position of an element in a collection (e.g. List.indexOf, List.indexWhere) return -1 to indicate the absence of the given element in the collection, instead of a more idiomatic Option[Int]. Is there some particular advantage to returning -1 instead of None, or is this just for historical reasons?
It is just for historical reasons, but then one wants to know what the historical reasons are: what was the history, and why did it turn out that way?
The immediate history is the java.lang.String.indexOf method, which returns the index, or -1 if no matching character is found. But this is hardly new; the Fortran SCAN function returns 0 if no character is found in a string, which is the same thing given that Fortran uses 1-indexing.
The reason to do this is that strings have only positive length, so any negative length can be used as a None value without any overhead of boxing. -1 is the most convenient negative number, so that's it.
And this can add up if the compiler isn't smart enough to realize that all the boxing and unboxing and everything is irrelevant. In particular, an object creation tends to take 5-10 ns, while a function call or comparison typically takes more like 1-2 ns, so if the collection is short, creating a new object can have a sizable fractional penalty (more so if your memory is already taxed and the GC has a lot of work to do).
If Scala had initially had an amazing optimizer, then the choice probably would have been different, as one would just write things with options, which is safer and less of a special case, and then trust the compiler to convert it into appropriately high-performance code.
Speed? (not sure)
def a(): Option[Int] = Some(Math.random().toInt)
def b(): Int = Math.random().toInt
val t0 = System.nanoTime; (0 to 1000000).foreach(_ => a()); println("" + (System.nanoTime - t0))
// 53988000
val t0 = System.nanoTime; (0 to 1000000).foreach(_ => b()); println("" + (System.nanoTime - t0))
// 49273000
And you also should always check for index < 0 in Some(index)
There is also the benefit that just returning an Int can use Java's built-in types, whereas Option[Int] would need to wrap the integer in an Object. This means both worse speed (as indicated by #idonnie) but also more memory usage.
While Option is great as a general tool (and I use it a lot) also other non-value presentations s.a. Double.NaN or an empty string are perfectly valid, and useful.
One of the benefits of using Option is the ability to pass it to for loops etc. as a collection. If you are not likely to do that, checking for -1 or NaN may be more concise than doing matches for None/Some.