Replace with .lengthCompare warning

Replace with .lengthCompare warning - scala

IntelliJ keep suggesting to replace .length == X with .lengthCompare(X) == 0. Why is that better? Don't quite get it, since the suggested changes are more verbose.

It is more efficient.
Since length is a linear operation on some collections like List, doing x.length == 3 would need to compute the length first and then compare it with the value. On the other hand .lengthCompare would terminate computing the length once it finds that the comparison is wrong already.

In Scala 2.13 we have lengthIs method which might be used to compare length of some collection just as length in this use case but with lengthCompare under the hood! So it is both efficient and readable. E.g.:
val list = List(1,2,3)
list.lengthIs > 2 // true
https://www.scala-lang.org/api/2.13.4/scala/collection/Seq.html#lengthIs:scala.collection.IterableOps.SizeCompareOps

Related

Scala Array.view memory usage

I'm learning Scala and have been trying some LeetCode problems with it, but I'm having trouble with the memory limit being exceeded. One problem I have tried goes like this:
A swap is defined as taking two distinct positions in an array and swapping the values in them.
A circular array is defined as an array where we consider the first element and the last element to be adjacent.
Given a binary circular array nums, return the minimum number of swaps required to group all 1's present in the array together at any location.
and my attempted solution looks like
object Solution {
def minSwaps(nums: Array[Int]): Int = {
val count = nums.count(_==1)
if (count == 0) return 0
val circular = nums.view ++ nums.view
circular.sliding(count).map(_.count(_==0)).min
}
}
however, when I submit it, I'm hit with Memory Limit Exceeded for one of the test case where nums is very large.
My understanding is that, because I'm using .view, I shouldn't be allocating over O(1) memory. Is that understanding incorrect? To be clear, I realise this is the most time efficient way of solving this, but I didn't expect it to be memory inefficient.
The version used is Scala 2.13.7, in case that makes a difference.
Update
I did some inspection of the types and it seems circular is only a View unless I replace ++ with concat which makes it IndexedSeqView, why is that, I thought ++ was just an alias for concat?
If I make the above change, and replace circular.sliding(count) with (0 to circular.size - count).view.map(i => circular.slice(i, i + count)) it "succeeds" in hitting the time limit instead, so I think sliding might not be optimised for IndexedSeqView.

Scala spark: Efficient check if condition is matched anywhere?

What I want is roughly equivalent to
df.where(<condition>).count() != 0
But I'm pretty sure it's not quite smart enough to stop once it finds any such violation. I would expect some sort of aggregator to be able to do this, but I haven't found one? I could do it with a max and some sort of conversion, but again I don't think it would necessarily know to quit (not being specific to bool, I'm not sure if understands no value is larger than true).
More specifically, I want to check if a column contains only a single element. Right now my best idea is to do this is by grabbing the first value and comparing everything.

I would try this option, it should be much faster:
df.where(<condition>).head(1).isEmpty
You can also try to define your conditions on a row together with scala's exists (which stops at the first occurence of true):
df.mapPartitions(rows => if(rows.exists(row => <condition>)) Iterator(1) else Iterator.empty).isEmpty
At the end you should benchmark the alternatives

Haskell-like range in PureScript?

Is there a native function that behaves like Haskell's range?
I found out that 2..1 returns a list [2, 1] in PureScript, unlike Haskell's [2..1] returning an empty list []. After Googling around, I found the behavior is written in the Differences from Haskell documentation, but it doesn't give a rationale behind.
In my opinion, this behavior is somewhat inconvenient/unintuitive since 0 .. (len - 1) doesn't give an empty list when len is zero, and this could possibly lead to cryptic bugs.
Is there a way to obtain the expected array (i.e. range of length len incrementing from 0) without handling the length == 0 case every time?
Also, why did PureScript decide to make range behave like that?
P.S. How I ran into this question: I wanted to write a getLocalStorageKeys function, which gets all keys from the local storage. My implementation gets the number of keys using the length function, creates a range from 0 to numKeys - 1, and then traverses it with the key function. However, the range didn't behave as I expected.

How about just make it yourself?
indicies :: Int -> Array Int
indicies n = if n <= 0 then [] else 0..(n-1)
As far as "why", I can only speculate, and my speculation is that the idea was to avoid this kind of iffy logic for creating "reverse" ranges - which is something that does come up for me in Haskell once in a while.

Why do Scala's index methods return -1 instead of None if the element is not found?

I've always been wondering why in Scala the various index methods for determining the position of an element in a collection (e.g. List.indexOf, List.indexWhere) return -1 to indicate the absence of the given element in the collection, instead of a more idiomatic Option[Int]. Is there some particular advantage to returning -1 instead of None, or is this just for historical reasons?

It is just for historical reasons, but then one wants to know what the historical reasons are: what was the history, and why did it turn out that way?
The immediate history is the java.lang.String.indexOf method, which returns the index, or -1 if no matching character is found. But this is hardly new; the Fortran SCAN function returns 0 if no character is found in a string, which is the same thing given that Fortran uses 1-indexing.
The reason to do this is that strings have only positive length, so any negative length can be used as a None value without any overhead of boxing. -1 is the most convenient negative number, so that's it.
And this can add up if the compiler isn't smart enough to realize that all the boxing and unboxing and everything is irrelevant. In particular, an object creation tends to take 5-10 ns, while a function call or comparison typically takes more like 1-2 ns, so if the collection is short, creating a new object can have a sizable fractional penalty (more so if your memory is already taxed and the GC has a lot of work to do).
If Scala had initially had an amazing optimizer, then the choice probably would have been different, as one would just write things with options, which is safer and less of a special case, and then trust the compiler to convert it into appropriately high-performance code.

Speed? (not sure)
def a(): Option[Int] = Some(Math.random().toInt)
def b(): Int = Math.random().toInt
val t0 = System.nanoTime; (0 to 1000000).foreach(_ => a()); println("" + (System.nanoTime - t0))
// 53988000
val t0 = System.nanoTime; (0 to 1000000).foreach(_ => b()); println("" + (System.nanoTime - t0))
// 49273000
And you also should always check for index < 0 in Some(index)

There is also the benefit that just returning an Int can use Java's built-in types, whereas Option[Int] would need to wrap the integer in an Object. This means both worse speed (as indicated by #idonnie) but also more memory usage.
While Option is great as a general tool (and I use it a lot) also other non-value presentations s.a. Double.NaN or an empty string are perfectly valid, and useful.
One of the benefits of using Option is the ability to pass it to for loops etc. as a collection. If you are not likely to do that, checking for -1 or NaN may be more concise than doing matches for None/Some.

comparing an element with all elements in a list

I'm learning Scala now, and I have a scenario where I have to compare an element (say num) with all the elements in a list.
Assume,
val MyList = List(1, 2, 3, 4)
If num is equal to anyone the elements in the list, I need to return true. I know to do it recursively using the head and tail functions, but is there a simpler way to it (I think I'll be able to do it using foreach, but I'm not sure how to implement it exactly)?

There is number of possibilities:
val x = 3
MyList.contains(x)
!MyList.forall(y => y != x) // early exit, basically the same as .contains
If you plan to do it frequently, you may consider to convert your list to Set, cause every .contains lookup on list in worst case is proportional to number of elements, whereas on Set it is effectively constant
val mySet = MyList.toSet
mySet.contains(x)
or simply:
mySet(x)

A contains method is pretty standard for lists in any language. Scala's List has it too:
http://www.scala-lang.org/api/current/scala/collection/immutable/List.html

As others have answered, the contains method on the list will do exactly this, and it's the most understandable/performant way.
Looking at your closing comments though, you wouldn't be able to do it (in an elegant fashion) with foreach, since that returns Unit. Foreach "does" something for each element, but you don't get any result back. It's useful for logging/println statements, but it doesn't act as a transformation.
If you want to run a function on every element individually, you would use map, which returns a List of the results of applying the function. So assuming num = 3, then MyList.map(_ == num) would return List(false, false, true, false). Since you're looking for a single result, and not a list of results, then this is not what you're after.
In order to collapse a sequence of things into a single result, you would use a fold over the data. Folding involves a function that takes two arguments (the result so far, and the current thing in the list) and returns the new running result. So that this can work on the very first element, you also need to provide the initial value to use for the ongoing result (usually some sort of zero).
In your particular case, then, you want a Boolean answer at the end - "was an element found that was equal to num". So the running result would be "have I seen an element so far that was equal to num". Which means the initial value is false. And the function itself should return true if an element has already been seen, or if the current element is equal to num.
Putting this together, it would look like this:
MyList.foldLeft(false) { case (runningResult, listElem) =>
// return true if runningResult is true, or if listElem is the target number
runningResult || listElem == num
}
This doesn't have the nice aspect of stopping as soon as the target value has been found - and it's nowhere near as concise as calling MyList.contains. But as an instructional example, this is how you could implement this yourself from the primitive functional operations on a list.

List has a method for that:
val found = MyList.contains(num)

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Replace with .lengthCompare warning - scala

IntelliJ keep suggesting to replace .length == X with .lengthCompare(X) == 0. Why is that better? Don't quite get it, since the suggested changes are more verbose.

Related

Scala Array.view memory usage

Scala spark: Efficient check if condition is matched anywhere?

Haskell-like range in PureScript?

Why do Scala's index methods return -1 instead of None if the element is not found?

comparing an element with all elements in a list

Categories

Resources