Does Kotlin have a functional List data structure?

Scala, for example, has a functional List data structure that adds a new item at the head in O(1) with the cons (::) operator.
In Kotlin, however, the + operator on List creates a new list one element larger, copies all of the original items into it, adds the new item, and returns it. So the addition takes O(N).
Is there any immutable List data structure in Kotlin that can add an item in O(1)?

No, the Kotlin standard library does not have that. You can write your own cons list, though.
But I don't want to write all of the utility functions for it myself, for example reduce/fold/map/take/drop/max/sum...
You can get those functions for free just by implementing Iterable. For a cons list, this is easy to do:
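sealed class ConsList<out T> : Iterable<T> {
    // The empty list: its iterator never yields anything.
    object Nil : ConsList<Nothing>() {
        override fun iterator(): Iterator<Nothing> = object : Iterator<Nothing> {
            override fun hasNext() = false
            override fun next(): Nothing = throw NoSuchElementException()
        }
    }

    // A head element followed by the rest of the list.
    data class Cons<out T>(val element: T, val rest: ConsList<T>) : ConsList<T>() {
        // Explicit return type so the call below resolves to the kotlin.sequences.iterator builder.
        override fun iterator(): Iterator<T> = iterator {
            yield(element)
            yieldAll(rest)
        }
    }
}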
Note that implementing List involves quite a lot more code, especially the listIterator and subList methods. You could use some of the default AbstractList implementations, but it wouldn't be terribly efficient.

Related

Scala: How to create an Array of defs/Objects and then call those defs in a foreach?

I have a bunch of Scala objects with def's that do a bunch of processing
Foo\CatProcessing (def processing)
Foo\DogProcessing (def processing)
Foo\BirdProcessing (def processing)
Then I have my main def that calls the processing def of each individual Foo object, passing in common parameter values and such.
I am trying to put all of the objects into an Array or List and then use a 'foreach' to loop through the list, passing in the parameter values, i.e.
foreach(object in objList){
object.Processing(parameters)
}
Coming from C#, I could do this via binders or the like, so how would I manage this in Scala?
for (obj <- objList) {
obj.processing(parameters) // `object` is a reserved keyword in Scala
}
or
objList.foreach(obj => obj.processing(parameters))
They are actually the same thing, the former being "syntactic sugar" for the latter.
In the second case, you can bind the only parameter of the anonymous function passed to the foreach function to _, resulting in the following
objList.foreach(_.processing(parameters))
for comprehensions in Scala can be quite expressive and go beyond simple iteration. If you're curious, you can read more about them here.
Since you are coming from C#, if by any chance you have had any exposure to LINQ you will find yourself at home with the Scala Collection API. The official documentation is quite extensive in this regard and you can read more about it here.
As came up in the comments following my reply, the objects you want to iterate over also need to:
have a common type that
exposes the processing method
Alternatively, Scala allows structural typing, but that relies on runtime reflection and is unlikely to be something you really need or want in this case.
You can achieve it by having a common trait for your objects, as in the following example:
trait Processing {
  def processing(): Unit
}

final class CatProcessing extends Processing {
  def processing(): Unit = println("cat")
}

final class DogProcessing extends Processing {
  def processing(): Unit = println("dog")
}

final class BirdProcessing extends Processing {
  def processing(): Unit = println("bird")
}

val cat = new CatProcessing
val dog = new DogProcessing
val bird = new BirdProcessing

for (process <- List(cat, dog, bird)) {
  process.processing()
}
You can run the code above and play around with it here on Scastie.
Using a Map instead, you can do it as follows. (I wonder whether this also works with other collection types.)
val test = Map("foobar" -> CatProcessing)
test.values.foreach(
(movie) => movie.processing(spark)
)
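A minimal sketch of the same idea with singleton objects, as in the question, rather than classes. The `Processing` trait, its `String` parameter, the return values, and the `ProcessingDemo` wrapper are assumptions made for illustration; the original objects take other parameters (such as `spark`).

```scala
// Hypothetical common type: every processing object exposes the same method.
trait Processing {
  def processing(param: String): String
}

object CatProcessing extends Processing {
  def processing(param: String): String = s"cat:$param"
}

object DogProcessing extends Processing {
  def processing(param: String): String = s"dog:$param"
}

object ProcessingDemo {
  // Singleton objects are ordinary values, so they can be stored in a Map or a List.
  val byName: Map[String, Processing] =
    Map("cat" -> CatProcessing, "dog" -> DogProcessing)

  // Call the shared method on every stored object with the same argument.
  def runAll(param: String): List[String] =
    byName.values.map(_.processing(param)).toList
}
```

The explicit `Map[String, Processing]` annotation keeps the element type at the common trait, which is what makes `_.processing(param)` compile for every value.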

How to program an iterator in scala without using a mutable variable?

I want to implement the Iterator trait in a functional way, i.e., without using a var. How can I do that?
Suppose I have an external library where I get some elements by calling a function getNextElements(numOfElements: Int):Array[String] and I want to implement an Iterator using that function but without using a variable indicating the "current" array (in my case, the var buffer). How can I implement that in the functional way?
class MyIterator[T](fillBuffer: Int => Array[T]) extends Iterator[T] {
  // The mutable state the question wants to get rid of.
  var buffer: List[T] = fillBuffer(10).toList

  override def hasNext(): Boolean = {
    if (buffer.isEmpty) buffer = fillBuffer(10).toList
    buffer.nonEmpty
  }

  override def next(): T = {
    if (!hasNext()) throw new NoSuchElementException()
    val elem: T = buffer.head
    buffer = buffer.tail
    elem
  }
}

// An object (not a class) so that App provides a runnable entry point.
object Main extends App {
  def getNextElements(num: Int): Array[String] = ???
  val iterator = new MyIterator[String](getNextElements)
  iterator.foreach(println)
}
Iterators are mutable, at least without an interface that also returns a state variable, so you can't in general implement the interface directly without some sort of mutation.
That being said, there are some very useful functions in the Iterator companion object that let you hide the mutation, and make the implementation cleaner. I would implement yours something like:
Iterator.continually(getNextElements(10)).flatten
This calls getNextElements(10) whenever it needs to fill the buffer. The flatten changes it from an Iterator[Array[A]] to an Iterator[A].
Note this returns an infinite iterator. Your question didn't say anything about detecting the end of your source elements, but I would usually implement that using takeWhile. For example, if getNextElements returns an empty array when there are no more elements, you can do:
Iterator.continually(getNextElements(10)).takeWhile(!_.isEmpty).flatten
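As a rough, self-contained sketch of that last line: the `BatchDemo` object and its canned batches are hypothetical stand-ins for the external library, and the batches are `List`s rather than `Array`s purely to keep the example simple. The mutation still exists, but it lives inside the stand-in source rather than in the iterator code.

```scala
object BatchDemo {
  // Hypothetical source: hands out two batches, then empty batches forever.
  private val pending: Iterator[List[String]] =
    Iterator(List("a", "b"), List("c"))

  // Stand-in for the library call; `num` is ignored in this sketch.
  def getNextElements(num: Int): List[String] =
    if (pending.hasNext) pending.next() else Nil

  // Keep asking for batches until an empty one arrives, then flatten them.
  def elements: Iterator[String] =
    Iterator.continually(getNextElements(10)).takeWhile(_.nonEmpty).flatten
}
```

Because the source is consumed as it is read, `BatchDemo.elements` is only meaningful the first time it is traversed.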

Elegant traversal of a source in Scala

As a data scientist, I frequently use the following pattern for data extraction (e.g. from a DB, a file, and other sources):
val source = open(sourceName)
var item = source.getNextItem()
while(item != null){
processItem(item)
item = source.getNextItem()
}
source.close
My (current) dream is to wrap this verbosity into a Scala object "SourceTrav" that would allow this elegance:
SourceTrav(sourceName).foreach(item => processItem(item))
with the same functionality as above, but without running into StackOverflowError, as might happen with the examples in Semantics of Scala Traversable, Iterable, Sequence, Stream and View?
Any idea?
If Scala's standard library (for example scala.io.Source) doesn't suit your needs, you can use different Iterator or Stream companion object methods to wrap manual iterator traversal.
In this case, for example, you can do the following, when you already have an open source:
Iterator.continually(source.getNextItem()).takeWhile(_ != null).foreach(processItem)
If you also want to add automatic opening and closing of the source, don't forget to add try-finally or some other flavor of loan pattern:
case class SourceTrav(sourceName: String) {
  def foreach(processItem: Item => Unit): Unit = {
    val source = open(sourceName)
    try {
      Iterator.continually(source.getNextItem()).takeWhile(_ != null).foreach(processItem)
    } finally {
      source.close()
    }
  }
}
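On Scala 2.13 or later, `scala.util.Using` is one ready-made flavor of the loan pattern. The sketch below assumes the source can implement `AutoCloseable`; `FakeSource` and `UsingDemo` are hypothetical names that stand in for the real source type, which the question does not show.

```scala
import scala.util.Using

// Hypothetical source: returns null once exhausted, as in the question.
final class FakeSource(items: List[String]) extends AutoCloseable {
  private val it = items.iterator
  var closed = false
  def getNextItem(): String = if (it.hasNext) it.next() else null
  def close(): Unit = { closed = true }
}

object UsingDemo {
  // Using.resource closes the source after the body runs, even if the body throws.
  def collect(source: FakeSource): List[String] =
    Using.resource(source) { s =>
      Iterator.continually(s.getNextItem()).takeWhile(_ != null).toList
    }
}
```

The items are materialized with `toList` inside the body on purpose: returning the lazy iterator itself would let it escape after the source has been closed.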

populate immutable sequence with iterator

I'm interoperating with some Java code that uses iterator-like functionality and presumes you will keep testing its .next for null values. I want to put it into immutable Scala data structures to reason about it with functional programming.
Right now, I'm filling mutable data structures and then converting them to immutable data structures. I know there's a more functional way to do this.
How can I refactor the code below to populate the immutable data structures without using intermediate mutable collections?
Thanks
{
val sentences = MutableList[Seq[Token]]()
while(this.next() != null){
val sentence = MutableList[Token]()
var token = this.next()
while(token.next != null){
sentence += token
token = token.next
}
sentences += sentence.to[Seq]
}
sentences.to[Seq]
}
You might try to use the Iterator.iterate method in order to simulate a real iterator, and then use standard collection methods like takeWhile and toSeq. I'm not totally clear on the type of this and next, but something along these lines might work:
Iterator.iterate(this){ _.next }.takeWhile{ _ != null }.map { sentence =>
Iterator.iterate(sentence.next) { _.next }.takeWhile{ _ != null }.toSeq
}.toSeq
You can also extend Iterator by defining your own next and hasNext methods in order to use these standard methods more easily. You might even define an implicit or explicit conversion from these Java types to this new Iterator type; this is the same pattern you see in JavaConversions and JavaConverters.
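To make the Iterator.iterate idea concrete, here is a small sketch on a null-terminated chain. The `Token` case class and the `TokenDemo` object are hypothetical, since the real Java types are not shown in the question.

```scala
object TokenDemo {
  // Hypothetical Java-style node: `next` is null at the end of the chain.
  final case class Token(text: String, next: Token)

  // Follow `next` links lazily and stop at the first null.
  def texts(first: Token): List[String] =
    Iterator.iterate(first)(_.next).takeWhile(_ != null).map(_.text).toList
}
```

`Iterator.iterate` applies the step function only when the next element is requested, so `takeWhile` stops the traversal before `.next` is ever called on a null reference.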

How to efficiently select a random element from a Scala immutable HashSet

I have a scala.collection.immutable.HashSet that I want to randomly select an element from.
I could solve the problem with an extension method like this:
implicit class HashSetExtensions[T](h: HashSet[T]) {
  def nextRandomElement(): Option[T] = {
    val list = h.toList
    list match {
      case null | Nil => None
      case _ => Some(list(Random.nextInt(list.length)))
    }
  }
}
...but converting to a list will be slow. What would be the most efficient solution?
WARNING: This answer is for experimental use only. For a real project you should probably use your own collection types.
I did some research in the HashSet source, and I think there is little opportunity to extract the inner structure of its most important class, HashTrieSet, without violating package visibility.
I did come up with this code, which extends Ben Reich's solution:
package scala.collection

import scala.collection.immutable.HashSet
import scala.util.Random

package object random {
  implicit class HashSetRandom[T](set: HashSet[T]) {
    def randomElem: Option[T] = set match {
      case trie: HashSet.HashTrieSet[T] => {
        trie.elems(Random.nextInt(trie.elems.length)).randomElem
      }
      case _ => Some(set.size) collect {
        case size if size > 0 => set.iterator.drop(Random.nextInt(size)).next
      }
    }
  }
}
The file should be created somewhere in the src/scala/collection/random folder.
Note the scala.collection package: this is what makes the elems member of HashTrieSet visible. It is the only solution I could think of that could run better than O(n). The current version should have complexity O(log n), like any of immutable.HashSet's operations.
Another warning: the private structure of HashSet is not part of Scala's standard library API, so it could change in any version and break this code (though it hasn't changed since 2.8).
Since size is O(1) on HashSet, and iterator is as lazy as possible, I think this solution would be relatively efficient:
implicit class RichHashSet[T](val h: HashSet[T]) extends AnyVal {
  def nextRandom: Option[T] = Some(h.size) collect {
    case size if size > 0 => h.iterator.drop(Random.nextInt(size)).next
  }
}
And if you're trying to get every ounce of efficiency, you could use match instead of the more concise Some/collect idiom used here.
You can look at the mutable HashSet implementation to see the size method. The iterator method defined there basically just calls iterator on FlatHashTable. The same basic efficiencies of these methods apply to immutable HashSet if that's what you're working with. As a comparison, you can see the toList implementation on HashSet is all the way up the type hierarchy at TraversableOnce and uses far more primitive elements which are probably less efficient and (of course) the entire collection must be iterated to generate the List. If you were going to convert the entire set to a Traversable collection, you should use Array or Vector which have constant-time lookup.
You might also note that there is nothing special about HashSet in the above methods, and you could enrich Set[T] instead, if you so chose (although there would be no guarantee that this would be as efficient on other Set implementations, of course).
As a side note, when implementing enriched classes for extension methods, you should always consider making an implicit, user-defined value class by extending AnyVal. You can read about some of the advantages and limitations in the docs, and on this answer.
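The plainer form of the iterator-based approach, without the Some/collect idiom, could look roughly like the sketch below. It is written against `Set[T]` rather than `HashSet[T]`; the `RandomPick` object and the explicit `rng` parameter are assumptions added here so the example is self-contained and reproducible.

```scala
import scala.util.Random

object RandomPick {
  // Skip a random number of elements on the iterator and take the next one.
  // This is still O(n) in the worst case, but it avoids building an intermediate List.
  def randomElement[T](set: Set[T], rng: Random): Option[T] =
    if (set.isEmpty) None
    else Some(set.iterator.drop(rng.nextInt(set.size)).next())
}
```

Passing the `Random` instance in explicitly makes it possible to seed it, which is convenient when the selection needs to be repeatable.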