Can Lazy modify elements in Array in Scala? - scala

I want to define an array in which each element is a data set read from a certain path in the file system. Reading the data is costly, and only a sparse set of positions in the array will ever be visited, so I want to use the lazy modifier so that a data set is not read until it is visited. How can I define this kind of array?

Yes, you can do this with the view method.
Instead of materializing the array with (0 to 10).toArray:
scala> val a=(0 to 10).toArray
a: Array[Int] = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
A view does not materialize the collection; each element is computed only when it is accessed:
scala> val a=(0 to 10).view
a: scala.collection.SeqView[Int,scala.collection.immutable.IndexedSeq[Int]] = SeqView(...)
scala> for (x <- a ){
| println(x)}
0
1
2
3
4
5
6
7
8
9
10
I hope this answered your question
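One point worth checking before relying on a view for costly reads: a view defers the work, but as far as I know it does not cache results, so an element is recomputed on every access. The sketch below illustrates this with a counter; `load` and `viewReads` are hypothetical names standing in for the real file read.

```scala
// Counts how often the (hypothetical) costly load actually runs.
var viewReads = 0

// Hypothetical stand-in for reading a data set from the file system.
def load(i: Int): String = { viewReads += 1; s"data-$i" }

// Building the mapped view does not call load yet.
val lazyView = (0 to 10).view.map(load)
val readsBeforeAccess = viewReads   // still 0 here

// Each access runs load again; nothing is remembered between calls.
val first = lazyView(3)
val second = lazyView(3)
val readsAfterTwoAccesses = viewReads
```

If each data set should be read at most once, some form of caching on top of the view (or instead of it) is still needed.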

No, you cannot use lazy on the array to make the elements lazy. The most natural thing to do would be to use a caching library like ScalaCache.
You can also make a kind of wrapper class with a lazy field as you suggested in your comment. However, I would prefer not to expose the caching to the clients of the array. You should be able to just write myArray(index) to access an element.
If you don't want to use a library, here is another option (without lazy) that gives you an array-like object with caching:
class CachingArray[A](size: Int, getElement: Int => A) {
  val elements = scala.collection.mutable.Map[Int, A]()
  def apply(index: Int) = elements.getOrElseUpdate(index, getElement(index))
}
Just initialize it with the size and a function that computes an element at a given index.
If you like, you can make it extend IndexedSeq[A] so it can be used more like a real array. Just implement length like this:
override def length: Int = size
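As a usage sketch, the caching class can be exercised with a counter to confirm that each index is loaded at most once. The bounds check, the `loads` counter, and the string contents are assumptions added for illustration; they are not part of the answer above.

```scala
import scala.collection.mutable

// Same idea as the CachingArray above, with an added bounds check (an assumption).
class CachingArray[A](size: Int, getElement: Int => A) {
  private val elements = mutable.Map[Int, A]()

  // Computes the element on first access and returns the cached value afterwards.
  def apply(index: Int): A = {
    require(index >= 0 && index < size, s"index $index out of bounds")
    elements.getOrElseUpdate(index, getElement(index))
  }
}

// Hypothetical loader: counts how many times a data set is actually read.
var loads = 0
val data = new CachingArray[String](1000, i => { loads += 1; s"contents of data set $i" })

val a = data(42)   // first visit: runs the loader
val b = data(42)   // second visit: served from the cache
val c = data(7)    // a different index: runs the loader once more
```

Only the two visited indices trigger a read; the other 998 are never loaded.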

Related

How can i make the following function more efficient?

Here is a function that builds a map from a given list, where each key is an integer and the value is the frequency of that integer in the list.
I need to find the key with the maximum frequency. If two keys have the same frequency, I need to take the smaller key.
This is what I have written:
def findMinKeyWithMaxFrequency(arr: List[Int]): Int = {
  val ansMap:scala.collection.mutable.Map[Int,Int] = scala.collection.mutable.Map()
  arr.map(elem=> ansMap+=(elem->arr.count(p=>elem==p)))
  ansMap.filter(_._2==ansMap.values.max).keys.min
}
val arr = List(1, 2 ,3, 4, 5, 4, 3, 2, 1, 3, 4)
val ans=findMinKeyWithMaxFrequency(arr) // output:3
How can I make it more efficient? It gives me the right answer, but I don't think it's the most efficient way to solve the problem.
In the given example, 3 and 4 both have frequency 3, so the answer should be 3, as 3 is smaller than 4.
Edit 1:
This is what I have done to make it a bit more efficient: converting arr into a Set, since we only need the frequency of each unique element.
def findMinKeyWithMaxFrequency(arr: List[Int]): Int = {
  val ansMap=arr.toSet.map{ e: Int =>(e,arr.count(x=>x==e))}.toMap
  ansMap.filter(_._2==ansMap.values.max).keys.min
}
Can it be more efficient? Is this the most functional way of writing the solution to the given problem?
def findMinKeyWithMaxFrequency(arr: List[Int]): Int =
arr.groupBy(identity).toSeq.maxBy(p => (p._2.length,-p._1))._1
Use groupBy() to get an effective count for each element then, after flattening to a sequence of tuples, code the required rules to determine the maximum.
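For comparison, here is a sketch of a variant that counts with foldLeft instead of groupBy, so only the counts are stored rather than lists of equal elements. This is a swapped-in technique, not the answer's one-liner; the function name and the non-empty precondition are assumptions.

```scala
// Single pass to count occurrences, then pick the entry with the highest
// count, breaking ties in favour of the smaller key.
def minKeyWithMaxFrequency(arr: List[Int]): Int = {
  require(arr.nonEmpty, "arr must not be empty")

  // Map from value to number of occurrences.
  val counts: Map[Int, Int] =
    arr.foldLeft(Map.empty[Int, Int]) { (acc, x) =>
      acc.updated(x, acc.getOrElse(x, 0) + 1)
    }

  // Tuples compare lexicographically: higher count first, then larger -key,
  // i.e. the smaller key. (Negation assumes keys other than Int.MinValue.)
  counts.maxBy { case (key, count) => (count, -key) }._1
}

val sample = List(1, 2, 3, 4, 5, 4, 3, 2, 1, 3, 4)
val answer = minKeyWithMaxFrequency(sample)
```

Unlike the original arr.count-per-element approach, the list is traversed once for counting and the map once for the maximum.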

The easiest way to write {1, 2, 4, 8, 16 } in Scala

I was advertising Scala to a friend (who uses Java most of the time) and he gave me a challenge: what is the way to write an array {1, 2, 4, 8, 16} in Scala?
I don't know functional programming that well, but I really like Scala. This is an iterative array formed by (n*(n-1)), but how do I keep track of the previous step? Is there a way to do it easily in Scala, or do I have to write more than one line of code to achieve this?
Array.iterate(1, 5)(2 * _)
or
Array.iterate(1, 5)(n => 2 * n)
Elaborating on this, as requested in the comments. I am not sure which part you want elaborated, so I hope you will find what you need.
This is the function iterate(start, len)(f) on the object Array (scaladoc). It would be a static method in Java.
The point is to fill an array of len elements, starting from the value start and always computing the next element by passing the previous one to the function f.
A basic implementation would be
import scala.reflect.ClassTag
def iterate[A: ClassTag](start: A, len: Int)(f: A => A): Array[A] = {
  val result = new Array[A](len)
  if (len > 0) {
    var current = start
    result(0) = current
    for (i <- 1 until len) {
      current = f(current)
      result(i) = current
    }
  }
  result
}
(The actual implementation, which is not much different, can be found here. It differs mostly because the same code is used for different data structures, e.g. List.iterate.)
Besides that, the implementation is very straightforward. The syntax may need some explanation:
def iterate[A](...): Array[A] makes it a generic method, usable for any type A. That would be public <A> A[] iterate(...) in Java.
ClassTag is just a technicality: in Scala, as in Java, you normally cannot create an array of a generic type (Java's new E[]). The : ClassTag context bound asks the compiler to add some magic, very similar to adding a Class<A> clazz parameter to the method declaration in Java and passing it at the call site, which can then be used to create the array by reflection. If you use e.g. List.iterate rather than Array.iterate, it is not needed.
Perhaps more surprising are the two parameter lists: one with start and len, and then, in separate parentheses, one with f. Scala allows a method to have several parameter lists. Here the reason is the particular way Scala does type inference: looking at the first parameter list, it determines what A is from the type of start. Only afterwards does it look at the second list, and by then it knows what type A is. Otherwise, it would need to be told. So if there had been only one parameter list, def iterate[A: ClassTag](start: A, len: Int, f: A => A),
then the call would have to be one of
Array.iterate(1, 5, (n: Int) => 2 * n)
Array.iterate[Int](1, 5, n => 2 * n)
Array.iterate(1, 5, 2 * (_: Int))
Array.iterate[Int](1, 5, 2 * _)
making Int explicit one way or another. So it is common in Scala to put function arguments in a separate parameter list. The type might be much longer to write than just 'Int'.
A => A is just syntactic sugar for type Function1[A,A]. Obviously a functional language has functions as (first class) values, and a typed functional language has types for functions.
In the call iterate(1, 5)(n => 2 * n), n => 2 * n is the function value. A more complete declaration would be {n: Int => 2 * n}, but one may dispense with Int for the reason stated above. Scala syntax is rather flexible, and one may also use curly braces in place of the parentheses, so it could be iterate(1, 5){n => 2 * n}. The braces allow a full block with several statements, which is not needed here.
As for immutability, Array is fundamentally mutable; there is no way to put a value into an array except by changing the array at some point. My implementation (and the one in the library) also uses a mutable var (current) and a side-effecting for loop. Neither is strictly necessary: a (tail-)recursive implementation would be only a little longer to write, and just as efficient. But a mutable local does not hurt much, and we are already dealing with a mutable array anyway.
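To make the tail-recursive alternative concrete, here is a minimal sketch. The name iterateRec and the inner helper loop are hypothetical; the array itself is still filled by mutation, only the var and the for loop are gone.

```scala
import scala.annotation.tailrec
import scala.reflect.ClassTag

// Tail-recursive variant of the basic iterate above (hypothetical name).
def iterateRec[A: ClassTag](start: A, len: Int)(f: A => A): Array[A] = {
  val result = new Array[A](len)

  // Writes `current` at position i, then recurses with the next value.
  // f is applied only when another slot remains to be filled.
  @tailrec
  def loop(i: Int, current: A): Unit =
    if (i < len) {
      result(i) = current
      if (i + 1 < len) loop(i + 1, f(current))
    }

  loop(0, start)
  result
}

val powers = iterateRec(1, 5)(2 * _)
```

The @tailrec annotation makes the compiler reject the method if the recursive call is not in tail position, so the loop compiles down to the same kind of iteration as the var-based version.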
There is always more than one way to do it in Scala:
scala> (0 until 5).map(1<<_).toArray
res48: Array[Int] = Array(1, 2, 4, 8, 16)
or
scala> (for (i <- 0 to 4) yield 1<<i).toArray
res49: Array[Int] = Array(1, 2, 4, 8, 16)
or even
scala> List.fill(4)(1).scanLeft(1)(2*_+0*_).toArray
res61: Array[Int] = Array(1, 2, 4, 8, 16)
The other answers are fine if you happen to know in advance how many entries will be in the resulting list. But if you want to take all of the entries up to some limit, you should create an Iterator, use takeWhile to get the prefix you want, and create an array from that, like so:
scala> Iterator.iterate(1)(2*_).takeWhile(_<=16).toArray
res21: Array[Int] = Array(1, 2, 4, 8, 16)
It all boils down to whether what you really want is more correctly stated as
the first 5 powers of 2 starting at 1, or
the powers of 2 from 1 to 16
For non-trivial functions you almost always want to specify the end condition and let the program figure out how many entries there are. Of course your example was simple, and in fact the real easiest way to create that simple array is just to write it out literally:
scala> Array(1,2,4,8,16)
res22: Array[Int] = Array(1, 2, 4, 8, 16)
But presumably you were asking for a general technique you could use for arbitrarily complex problems. For that, Iterator and takeWhile are generally the tools you need.
You don't have to keep track of the previous step. Also, each element is not formed by n * (n - 1). You probably meant f(n) = f(n - 1) * 2.
Anyway, to answer your question, here's how you do it:
(0 until 5).map(math.pow(2, _).toInt).toArray

Replace an element in a sorted set in scala

How do I replace the first element in a sorted set in Scala? Is there an analogous function to 'patch' for Sorted Sets? Is it even possible?
val a = SortedSet(1,5,6)
val b = a.patch(0, seq[2], 1)
println(b)
Result should be:
TreeSet(2, 5, 6)
How about this:
scala> val a = SortedSet(1,5,6)
a: scala.collection.SortedSet[Int] = TreeSet(1, 5, 6)
scala> val b = a.drop(1) + 2
b: scala.collection.SortedSet[Int] = TreeSet(2, 5, 6)
Note: You're not really replacing anything here (at least not like an array.) What you are doing is taking a SortedSet and using drop to remove the first element (which happens to be the lowest value in sorted order in this case) and then you are adding another element to the set. 2 is only in the first position because that is where it is supposed to be in sorted order.
scala> a.drop(1) + 10
res21: scala.collection.SortedSet[Int] = TreeSet(5, 6, 10)
As you see, if you add 10, it also takes its place in sorted order which is at the end.
Furthermore, because sets cannot contain duplicates, doing something like:
scala> a.drop(1) + 6
res22: scala.collection.SortedSet[Int] = TreeSet(5, 6)
removes the first element and leaves you with only two elements in the set. This is because 6 was already in the set, so it is not added (again, a property of a set is that it does not contain duplicates.)
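The drop-then-add idiom can be wrapped in a small helper, sketched below. The name replaceSmallest is hypothetical, and the sketch assumes an immutable SortedSet; as the answer explains, the new element lands wherever the ordering puts it, not necessarily in first position.

```scala
import scala.collection.immutable.SortedSet

// Hypothetical helper: removes the smallest element (if any) and adds x.
// The set keeps its own ordering, so no Ordering parameter is needed here.
def replaceSmallest[A](s: SortedSet[A], x: A): SortedSet[A] =
  s.drop(1) + x

val a = SortedSet(1, 5, 6)
val withTwo = replaceSmallest(a, 2)   // 2 sorts first
val withTen = replaceSmallest(a, 10)  // 10 sorts last
val withSix = replaceSmallest(a, 6)   // 6 is already present, so the set shrinks
```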

Constructing BitSets in Scala from a predicate?

Suppose I want to construct a BitSet containing all integers from 0 until n satisfying some predicate f: Int => Boolean.
I could write something like
BitSet((0 until n):_*).filter(f)
which of course works. But it feels rather inefficient! I'm planning on doing this inside a pretty tight loop, and would like suggestions for more efficient ways.
This is the best I could come up with at the moment
BitSet((0 until n).view.filter(f):_*)
The view part makes the filter method lazy. This makes sure that when the BitSet is created from the given sequence, it will filter on the fly. Your original suggestion creates a new BitSet after the first one is created.
If performance is truly your major concern, the best option is probably to use a mutable.BitSet and a while loop, and then call toImmutable on the result.
val bitSet = {
  val tmp = new scala.collection.mutable.BitSet(n)
  var i = 0
  while (i < n) {
    if (f(i)) {
      tmp += i
    }
    i = i + 1
  }
  tmp.toImmutable
}
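Packaged as a reusable function, the while-loop approach might look like the sketch below. The name bitSetWhere and the non-negative check on n are assumptions; the result is compared against the straightforward filter version from the question only as a sanity check, not as a benchmark.

```scala
import scala.collection.immutable.BitSet
import scala.collection.mutable

// Hypothetical helper: all i in [0, n) with f(i) true, as an immutable BitSet.
def bitSetWhere(n: Int)(f: Int => Boolean): BitSet = {
  require(n >= 0, "n must be non-negative")
  val tmp = new mutable.BitSet(n)   // pre-sized for values below n
  var i = 0
  while (i < n) {
    if (f(i)) tmp += i
    i += 1
  }
  tmp.toImmutable
}

val multiplesOfThree = bitSetWhere(10)(_ % 3 == 0)

// The simple version from the question, for comparison.
val reference = BitSet((0 until 10): _*).filter(_ % 3 == 0)
```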
I think the most efficient "functional" way is to use foldLeft:
(0 until n).foldLeft(BitSet())((s,i) => if (f(i)) s + i else s)
It doesn't create an intermediate collection but constructs the result from scratch while filtering.
The first thing I thought is to use breakOut, but it doesn't work for filter:
scala> val set: BitSet = (0 until 10).filter(f)(collection.breakOut)
<console>:11: error: polymorphic expression cannot be instantiated to expected type;
found : [From, T, To]scala.collection.generic.CanBuildFrom[From,T,To]
required: Int
val set: BitSet = (0 until 10).filter(f)(collection.breakOut)
^
scala> val set: BitSet = (0 until 10).map(_+1)(collection.breakOut)
set: scala.collection.immutable.BitSet = BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
breakOut also avoids creating an intermediate collection, but because filter doesn't have a second parameter list, it can't be used there.

Scala: what is the most appropriate data structure for sorted subsets?

Given a large collection (let's call it 'a') of elements of type T (say, a Vector or List) and an evaluation function 'f' (say, (T) => Double) I would like to derive from 'a' a result collection 'b' that contains the N elements of 'a' that result in the highest value under f. The collection 'a' may contain duplicates. It is not sorted.
Maybe leaving the question of parallelizability (map/reduce etc.) aside for a moment, what would be the appropriate Scala data structure for compiling the result collection 'b'? Thanks for any pointers / ideas.
Notes:
(1) I guess my use case can be most concisely expressed as
val a = Vector( 9,2,6,1,7,5,2,6,9 ) // just an example
val f : (Int)=>Double = (n)=>n // evaluation function
val b = a.sortBy( f ).take( N ) // sort, then clip
except that I do not want to sort the entire set.
(2) one option might be an iteration over 'a' that fills a TreeSet with 'manual' size bounding (reject anything worse than the worst item in the set, don't let the set grow beyond N). However, I would like to retain duplicates present in the original set in the result set, and so this may not work.
(3) if a sorted multi-set is the right data structure, is there a Scala implementation of this? Or a binary-sorted Vector or Array, if the result set is reasonably small?
You can use a priority queue:
def firstK[A](xs: Seq[A], k: Int)(implicit ord: Ordering[A]) = {
  val q = new scala.collection.mutable.PriorityQueue[A]()(ord.reverse)
  val (before, after) = xs.splitAt(k)
  q ++= before
  after.foreach(x => q += ord.max(x, q.dequeue))
  q.dequeueAll
}
We fill the queue with the first k elements and then compare each additional element to the head of the queue, swapping as necessary. This works as expected and retains duplicates:
scala> firstK(Vector(9, 2, 6, 1, 7, 5, 2, 6, 9), 4)
res14: scala.collection.mutable.Buffer[Int] = ArrayBuffer(6, 7, 9, 9)
And it doesn't sort the complete list. I've got an Ordering in this implementation, but adapting it to use an evaluation function would be pretty trivial.
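One way the adaptation to an evaluation function might look is sketched below. The name topNBy, the Vector result type, and the handling of n <= 0 are assumptions; the queue is ordered so that its head is always the lowest-scoring element currently kept.

```scala
import scala.collection.mutable

// Hypothetical adaptation: keeps the n elements with the highest score,
// retaining duplicates, and returns them in ascending order of score.
def topNBy[A](xs: Seq[A], n: Int)(score: A => Double): Vector[A] =
  if (n <= 0) Vector.empty
  else {
    // Reversed comparison: the queue's "largest" element is the lowest score.
    val lowestFirst: Ordering[A] = new Ordering[A] {
      def compare(a: A, b: A): Int = java.lang.Double.compare(score(b), score(a))
    }
    val q = new mutable.PriorityQueue[A]()(lowestFirst)

    xs.foreach { x =>
      if (q.size < n) q += x                   // still filling up
      else if (score(x) > score(q.head)) {     // better than the current worst
        q.dequeue()
        q += x
      }
    }

    // Drain from lowest to highest score.
    val out = Vector.newBuilder[A]
    while (q.nonEmpty) out += q.dequeue()
    out.result()
  }

val best = topNBy(Vector(9, 2, 6, 1, 7, 5, 2, 6, 9), 4)(_.toDouble)
```

This calls score more than once per element; if the evaluation function is expensive, storing (score, element) pairs in the queue would avoid the repeated calls.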