Transforming arrays in-place with parallel collections (Scala)

When you have an array of objects, it is often desirable (e.g. for performance reasons) to update (replace) some of the objects in place. For example, if you have an array of integers, you might want to replace the negative integers with their absolute values:
// Faster for primitives
var i = 0
while (i < a.length) {
  if (a(i) < 0) a(i) = -a(i)
  i += 1
}
// Fine for objects, often okay for primitives
for (i <- a.indices) if (a(i) < 0) a(i) = -a(i)
What is the canonical way to perform a modification like this using the parallel collections library?

As far as parallel arrays are concerned, it's an oversight. A parallel transform for parallel arrays will probably be included in the next release.
You can, however, do it using a parallel range:
for (i <- (0 until a.length).par) a(i) = computeSomething(i)
Note that not all mutable collections are modifiable in place this way. In general, if you wish to modify something in place, you have to make sure it's properly synchronized. This is not a problem for arrays in this case, since different indices will modify different array elements (and the changes are visible to the caller at the end, since the completion of a parallel operation guarantees that all writes become visible to the caller).
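Putting the pieces together, a minimal self-contained sketch of this approach (assuming a Scala version where .par is available directly; in 2.13+ it requires the separate scala-parallel-collections module):
val a = Array(3, -1, 4, -1, -5)
// Each index is handled by exactly one task, so no extra synchronization is needed.
for (i <- a.indices.par) if (a(i) < 0) a(i) = -a(i)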

Sequential mutable collections have methods like transform which work in-place.
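For example, a quick sketch of transform on a sequential mutable buffer (assuming Scala 2.12-era APIs; in 2.13 this method was renamed mapInPlace):
import scala.collection.mutable.ArrayBuffer
val buf = ArrayBuffer(1, -2, 3)
buf.transform(x => if (x < 0) -x else x)  // mutates buf in place to ArrayBuffer(1, 2, 3)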
Parallel mutable collections lack these methods, but I'm not sure there is a reason behind it or if it is just an oversight.
My answer is that you're currently out of luck, but you could write it yourself of course.
Maybe it would make sense to file a ticket once this has been discussed a bit more?

How about creating a parallel collection that holds the indices into the array to transform, and then running foreach to mutate one cell of the array per index?
That way you also have more control: for example, you could create four workers that each handle one quarter of the array, because simply flipping a single integer's sign is probably not enough work to justify a parallel computation. A sketch of this idea follows.
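Here is one hypothetical way to realize that suggestion, chunking the array so each parallel task does a meaningful amount of sequential work (the worker count and chunking scheme are illustrative):
val workers = 4
val chunk = (a.length + workers - 1) / workers  // ceiling division
(0 until workers).par.foreach { w =>
  var i = w * chunk
  val end = math.min(i + chunk, a.length)
  while (i < end) {
    if (a(i) < 0) a(i) = -a(i)  // each task owns a disjoint index range
    i += 1
  }
}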

Related

How is Scala so efficient with lists?

It's usually considered bad practice to create unnecessary collections in Java, as it consumes memory and CPU. Scala seems to be pretty efficient in this regard and encourages the use of immutable data structures.
How is Scala so efficient with Lists? What techniques are used to achieve that?
While the comments are correct that the claim that List is particularly efficient is a dubious one, it does much better than making a full copy of the collection for every operation, as you would with Java's standard collections.
The reason for this is that List and the other immutable collections are not just mutable collections whose mutation methods return a copy; they are designed differently, with immutability in mind. They take advantage of something called "structural sharing". If parts of a collection remain the same after a change, then those parts don't need to be copied, and the same objects can be shared across multiple collections. This works because of immutability: there is no chance that they could be altered, so it's safe to share.
Imagine the simplest example, prepending to a list.
You have a List(1,2,3) and you want to prepend 0
val original = List(1,2,3)
val updated = 0 :: original
Your list would then look something like this:
updated original
  \       \
   0 - - - 1 - - - 2 - - - 3
All that's needed is to create a new node and point its tail to the head of your original list. Nothing needs to be copied. Similarly, the tail and drop operations just need to return a reference to the appropriate node, and nothing needs to be copied. This is why List can be quite good at the prepend and tail operations: it doesn't do any copying even though it creates a "new" List.
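You can observe this sharing directly with a reference-equality check (a minimal sketch, relying on :: storing the original list as the new node's tail):
val original = List(1, 2, 3)
val updated = 0 :: original
println(updated.tail eq original)  // true: the tail is the very same object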
Other List operations do require some amount of copying, but always as little as possible. As long as part of the tail of a list is unchanged, it doesn't need to be copied. For example, when concatenating lists, the first list needs to be copied, but then its tail can just point to the head of the second, so the second list doesn't need to be copied at all. This is why, when concatenating a long and a short list, it's better to put the shorter list on the "left", as it is the only one that needs to be copied.
Other types of collections are better at different operations. Vector, for example, can do both prepend and append in amortized constant time, as well as having good random access and update capabilities (though still much worse than a raw mutable array). In most cases it will be more efficient than List while still being immutable. Its implementation is quite complicated: it uses a trie data structure, with many small internal arrays to store the data. The unchanged ones can be shared, and only the ones that need to be altered by an update operation are copied.
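A small sketch of those Vector operations (complexities as described above):
val v = Vector(1, 2, 3)
val v2 = 0 +: v           // prepend: Vector(0, 1, 2, 3)
val v3 = v :+ 4           // append: Vector(1, 2, 3, 4)
val v4 = v.updated(1, 9)  // update: Vector(1, 9, 3); unchanged internal arrays are shared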

Converting a large sequence to a sequence with no repeats in Scala really fast

So I have this large sequence, with a lot of repeats as well, and I need to convert it into a sequence with no repeats. What I have been doing so far has been converting the sequence to a set, and then back to a sequence. Conversion to the set gets rid of the duplicates. However, this is very slow, as I'm given to understand that when converting to a set, every pair of elements is compared, which makes the complexity O(n^2); that is not acceptable. And since I have access to a computer with thousands of cores (through my university), I was wondering whether making things parallel would help.
Initially I thought I'd use Scala Futures to parallelize the code in the following manner: group the elements of the sequence into smaller subgroups by their hash code. That way, I have subcollections of the original sequence such that no element appears in two different subcollections and every element is covered. Now I convert these smaller subcollections to sets, convert back to sequences, and concatenate them. This way I'm guaranteed to get a sequence with no repeats.
But I was wondering if applying the toSet method on a parallel sequence already does this. I thought I'd test this out in the Scala interpreter, but I got roughly the same time for the conversion to a parallel set as for the conversion to a non-parallel set.
I was hoping someone could tell me whether conversion to parallel sets works this way or not. I'd be much obliged. Thanks.
EDIT: Is performing toSet on a parallel collection faster than performing toSet on a non-parallel collection?
.distinct with some of the Scala collection types is O(n) (as of Scala 2.11). It uses a hash map to record what has already been seen. With this, it linearly builds up a list:
def distinct: Repr = {
  val b = newBuilder
  val seen = mutable.HashSet[A]()
  for (x <- this) {
    if (!seen(x)) {
      b += x
      seen += x
    }
  }
  b.result()
}
(newBuilder is like a mutable list.)
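So, assuming your element type hashes well, a plain sequential distinct is already linear on average:
val deduped = Seq(3, 1, 3, 2, 1, 2).distinct  // Seq(3, 1, 2), first occurrences kept in order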
Just thinking outside the box: would it be possible to prevent the creation of these duplicates in the first place, instead of trying to get rid of them afterwards?

Scala List.updated

I am curious about List.updated. What is its runtime? And how does it compare to just changing one element in an ArrayBuffer? Behind the scenes, how does it deal with copying the list? Is this an O(n) procedure? If so, is there an immutable data structure that has an updated-like method without being so slow?
An example is:
val list = List(1, 2, 3)
val list2 = list.updated(2, 5)  // list2 = List(1, 2, 5)
val abuf = ArrayBuffer(1, 2, 3)
abuf(2) = 5                     // abuf = ArrayBuffer(1, 2, 5)
The time and memory complexity of the updated(index, value) method is linear in the index (not in the size of the list): the first index cells are recreated.
Changing an element in an ArrayBuffer has constant time and memory complexity. The backing array is updated in place, no copying occurs.
This updated method is not slow if you update elements near the beginning of the list. For updates further in, Vector shares common parts of the structure in a different way and will probably do less copying.
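A rough sketch of the difference (the point is what gets copied, not exact timings):
val xs = List.tabulate(1000000)(identity)
val ys = xs.updated(999999, -1)  // recreates ~1,000,000 cells
val v = Vector.tabulate(1000000)(identity)
val v2 = v.updated(999999, -1)   // copies only a few small internal arrays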
List.updated is an O(n) operation (linear).
It calls the linear List.splitAt operation to split the list at the index into two lists (before, rest), then builds a new list by appending the elements of before, the updated element, and then the elements of rest.tail.
I'm not sure - this would have to be tested - but it seems that if the updated element is at the start of the list, it may be pretty efficient, as in theory getting rest and appending rest.tail could be done in constant time.
I suppose performance would be O(n), since a List doesn't store an index for each element; it is implemented as a chain of links (el -> el2 -> el3), so only operations at the head, like list.head, are O(1).
You should use an IndexedSeq for that purpose, the most common implementation being Vector.
Vector doesn't copy all of its data on update; only a small number of values are actually rewritten in memory.
In general, Scala's immutable collections don't copy all of their data when you create a modified instance; they share structure with the original. That is a key difference from Java's collections.

Is this scala parallel array code threadsafe?

I want to use parallel arrays for a task, and before I start with the coding, I'd be interested in knowing if this small snippet is threadsafe:
import collection.mutable._
val listBuffer = ListBuffer[String]("one","two","three","four","five","six","seven","eight","nine")
val jSyncList = java.util.Collections.synchronizedList(new java.util.ArrayList[String]())
listBuffer.par.foreach { e =>
  println("processed :" + e)
  // using sleep here to simulate a random delay
  Thread.sleep((scala.math.random * 1000).toLong)
  jSyncList.add(e)
}
jSyncList.toArray.foreach(println)
Are there better ways of processing something with parallel collections, and acumulating the results elsewhere?
The code you posted is perfectly safe; I'm not sure about the premise though: why do you need to accumulate the results of a parallel collection in a non-parallel one? One of the whole points of the parallel collections is that they look like other collections.
I think that parallel collections will also provide a seq method to switch to sequential ones, so you should probably use that!
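For example (a minimal sketch; seq returns the corresponding sequential collection):
val results = listBuffer.par.map(e => "processed :" + e).seq  // back to a sequential collection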
For this pattern to be safe:
listBuffer.par.foreach { e => f(e) }
f has to be able to run concurrently in a safe way. I think the same rules you need for safe multi-threading apply: access to shared state needs to be thread-safe, the order of the f calls for different e won't be deterministic, and you may run into deadlocks as you start synchronizing your statements in f.
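To illustrate what can go wrong, a small hypothetical sketch (not from the question):
var count = 0
listBuffer.par.foreach { _ => count += 1 }  // unsafe: racy read-modify-write, updates can be lost

import java.util.concurrent.atomic.AtomicInteger
val safeCount = new AtomicInteger(0)
listBuffer.par.foreach { _ => safeCount.incrementAndGet() }  // safe: atomic update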
Additionally, I'm not clear what guarantees the parallel collections give you about the underlying collection being modified while it is being processed, so a mutable list buffer, which can have elements added/removed, is possibly a poor choice. You never know when the next coder will call something like foo(listBuffer) before your foreach and pass that reference to another thread, which may mutate the list while it's being processed.
Other than that, I think this is a fine pattern for any f that takes a long time, can be called concurrently, and where e can be processed out of order:
immutCol.par.foreach { e => threadSafeOutOfOrderProcessingOf(e) }
Disclaimer: I have not tried parallel collections myself, but I'm looking forward to having SO questions/answers show us what works well.
The synchronizedList should be safe, though the println may give unexpected results - you have no guarantees of the order in which items will be printed, or even that your printlns won't be interleaved mid-character.
A synchronized list is also unlikely to be the fastest way you can do this; a safer solution is to map over an immutable collection (Vector is probably your best bet here), then print all the lines (in order) afterwards:
val input = Vector("one","two","three","four","five","six","seven","eight","nine")
val output = input.par.map { e =>
  val msg = "processed :" + e
  // using sleep here to simulate a random delay
  Thread.sleep((math.random * 1000).toLong)
  msg
}
println(output mkString "\n")
You'll also note that this code has about as much practical usefulness as your example :)
This code is plain weird -- why add stuff in parallel to something that needs to be synchronized? You'll add contention and gain absolutely nothing in return.
The principle of the thing -- accumulating results from parallel processing -- is better achieved with operations like fold, reduce or aggregate.
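For instance, a sketch of accumulating the same results with aggregate instead of a synchronized list (the input value is illustrative):
val input = Vector("one", "two", "three")
val processed = input.par.aggregate(Vector.empty[String])(
  (acc, e) => acc :+ ("processed :" + e),  // fold one element into a partial result
  (left, right) => left ++ right           // merge partial results from different workers
)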
The code you've posted is safe - there will be no errors due to inconsistent state of your array list, because access to it is synchronized.
However, parallel collections process items concurrently (at the same time) AND out of order. Out-of-order means that the 54th element may be processed before the 2nd element - your synchronized array list will contain items in no predefined order.
In general it's better to use map, filter and other functional combinators to transform a collection into another collection - these will ensure that the ordering guarantees are preserved if a collection has some (like Seqs do). For example:
ParArray(1, 2, 3, 4).map(_ + 1)
always returns ParArray(2, 3, 4, 5).
However, if you need a specific thread-safe collection type such as a ConcurrentSkipListMap or a synchronized collection to be passed to some method in some API, modifying it from a parallel foreach is safe.
Finally, a note - parallel collections provide parallel bulk operations on data. Mutable parallel collections are not thread-safe in the sense that you can add elements to them from different threads. Mutable operations like inserting into a map or appending to a buffer still have to be synchronized.

With parallel collection, does aggregate respect order?

In Scala, I have a parallel Iterable of items and I want to iterate over them and aggregate the results in some way, but in order. I'll simplify my use case and say that we start with an Iterable of integers and want to concatenate their string representations in parallel, with the result in order.
Is this possible with either fold or aggregate? It's unclear from the documentation which methods work in parallel but maintain order.
Yes, order is guaranteed to be preserved for fold/aggregate/reduce operations on parallel collections. This is not very well documented. The trick is that the operation you wish to fold over must be associative (and thus capable of being arbitrarily split up and recombined), but need not be commutative (and so not capable of being safely reordered). String concatenation is a perfect example of an associative, non-commutative operation, so the fold can be done in parallel.
val concat = myParallelList.map(_.toString).reduce(_+_)
For folds: foldRight and foldLeft cannot be processed in parallel, you'll need to use the new fold method (more info there).
Like fold, aggregate can do its work in parallel: it “traverses the elements in different partitions sequentially” (Scaladoc), though it looks like you have no direct influence on how the partitions are chosen.
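A sketch of the same concatenation expressed with aggregate (the seqop folds one element into a partial string, the combop joins partial strings in partition order):
val concatenated = myParallelList.aggregate("")(
  (acc, x) => acc + x.toString,  // fold an element into a partial string
  (a, b) => a + b                // join partial results; associativity makes this safe
)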
I THINK the preservation of 'order' in the sense of the comment on Jean-Philippe Pellet's answer is guaranteed by the way parallel collections are implemented, according to a publication by Odersky (http://infoscience.epfl.ch/record/150220/files/pc.pdf), IFF the part that splits your collection behaves well with respect to order.
i.e., if you have elements a < b < c, and a and c end up in one partition, it follows that b is in the same partition as well.
I don't remember exactly which part is responsible for the splitting, but if you find it, you might find sufficient information in its documentation or source code to answer your question.