Scala collection map var to another value - scala

I have a collection col with elements of type Foo.
Foo has a var bar, which I need to change for every element in the collection.
Currently my code looks like this
col.map(baz => {
baz.bar = <something>
baz
})
Is there a nicer way to do this? I feel like this could be done with a one liner.

foreach is designed for side-effects like that
col.foreach(_.bar = <something>)
After this, col will have all of its elements mutated. If you wish to avoid the Unit return type, try chaining:
import util.chaining._
col.map(_.tap(_.bar = <something>))
or the other way around:
col.tap(_.foreach(_.bar = <something>))
The idiomatic approach would be to avoid var, make Foo an immutable case class, and then use copy:
col.map(_.copy(bar = <something>))
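The three variants above can be compared side by side. This is a minimal sketch, assuming a hypothetical Foo with an Int field (the question does not show Foo's definition) and a hypothetical ImmutableFoo case class for the copy variant; scala.util.chaining needs Scala 2.13 or later.

```scala
import scala.util.chaining._ // provides tap; Scala 2.13+

object MutateDemo {
  // Hypothetical mutable class standing in for the question's Foo.
  class Foo(var bar: Int)
  // Hypothetical immutable counterpart used for the copy variant.
  final case class ImmutableFoo(bar: Int)

  // foreach: mutates every element in place and returns Unit.
  def viaForeach(col: List[Foo]): Unit = col.foreach(_.bar = 1)

  // map + tap: mutates in place and returns a list of the same instances.
  def viaTap(col: List[Foo]): List[Foo] = col.map(_.tap(_.bar = 2))

  // copy: leaves the input untouched and returns new instances.
  def viaCopy(col: List[ImmutableFoo]): List[ImmutableFoo] =
    col.map(_.copy(bar = 3))
}
```

Note that viaTap still mutates the original elements; only viaCopy leaves the input collection observably unchanged.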

Related

Scala var best practice - Encapsulation

I'm trying to understand what's the best practice for using vars in scala, for example
class Rectangle() {
var x:Int = 0
}
Or something like:
class Rectangle() {
private var _x:Int = 0
def x:Int = _x
def x_(newX:Int):Unit = _x=newX
}
Which one can be considered as better? and why?
Thank you!
As Luis already explained in the comment, vars should be avoided whenever you can, and a simple case like the one you gave can be better designed using something like this:
// Companion object is not necessary in your case
object Rectangle {
def fromInt(x: Int): Option[Rectangle] = {
if(x > 0) {
Some(Rectangle(x))
} else None
}
}
final case class Rectangle(x: Int)
Situations where you can't avoid using vars in Scala are very rare. The general Scala idiom is: "Make your variables immutable, unless there is a good reason not to"
I'm trying to understand what's the best practice for using vars in scala, […]
Best practice is to not use vars at all.
Which one can be considered as better? and why?
The second one is basically equivalent to what the compiler would generate for the first one anyway, so it doesn't really make sense to use the second one.
It would make sense if you wanted to give different accessibility to the setter and the getter, something like this:
class Rectangle {
private[this] var _x = 0
def x = _x
private def x_=(x: Int) = _x = x
}
As you can see, I am using different accessibility for the setter and the getter, so it makes sense to write them out explicitly. Otherwise, just let the compiler generate them.
Note: I made a few other changes to the code:
I changed the visibility of the _x backing field to private[this].
I changed the name of the setter to x_=. This is the standard naming for setters, and it has the added advantage that it allows you to use someRectangle.x = 42 syntactic sugar to call it, making it indistinguishable from a field.
I added some whitespace to give the code room to breathe.
I removed some return type annotations. (This one is controversial.) The community standard is to always annotate your return types in public interfaces, but in my opinion, you can leave them out if they are trivial. It doesn't really take much mental effort to figure out that 0 has type Int.
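To show why a private setter named x_= is useful, here is a minimal sketch. The widenTo method is a hypothetical addition, not part of the original answer; it only illustrates that code inside the class can use the x = ... sugar while outside code can only read x.

```scala
object SetterDemo {
  class Rectangle {
    private var _x = 0

    // Public getter.
    def x: Int = _x

    // Private setter: only code inside Rectangle can write `x = ...`.
    private def x_=(newX: Int): Unit = _x = newX

    // Hypothetical public method that validates before using the setter.
    // Returns true if the width was changed.
    def widenTo(newX: Int): Boolean =
      if (newX > x) { x = newX; true } else false
  }
}
```

Outside the class, someRectangle.x = 42 would not compile, because the setter is private.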
Note that your first version can also be simplified:
class Rectangle(var x: Int = 0)
However, as mentioned in other answers, you really should make your objects immutable. It is easy to create a simple immutable data object with all the convenience functions generated automatically for you by using a case class:
final case class Rectangle(x: Int = 0)
If you now want to "change" your rectangle, you instead create a new one which has all the properties the same except x (in this case, x is the only property, but there could be more). To do this, Scala generates a nifty copy method for you:
val smallRectangle = Rectangle(3)
val enlargedRectangle = smallRectangle.copy(x = 10)

Accessing values in an Nx2 matrix of Strings in Scala

I have the following:
var data = Array[Array[String]]()
data :+= Array("item1", "")
data :+= Array("item2", "")
data :+= Array("item3", "")
I would like to somehow access the second element (the value lets say) of a certain "key" in data. I have to do this right now:
data(2)(1) = "test"
But I would like to do it without having to worry about indexes, something where I can just call the "key" and modify the value, like data("item1") = "test".
Unfortunately, I do have to use this matrix Array of Strings, which is obviously not optimal, but this is what I have to work with. How do I do this with my data structure?
Also, if I were to change the data structure, which Scala data structure would be the best? (I am considering changing it, but it is unlikely.)
Use Map
import scala.collection.mutable
val items = mutable.Map("foo" -> "bar", "bar" -> "bat")
items.update("foo", "baz")
If you have to stick with array, something like this will work
data.find(_.head == "foo").foreach { _(1) = "baz" }
You can make it look like a function call with a little implicit trickery:
object ArrayAsMap {
implicit class Wrapper(val it:Array[String]) extends AnyVal {
def <<-(foo: String) = it(1) = foo
}
implicit class Wrapper2(val it: Array[Array[String]]) extends AnyVal {
def apply(s: String) = Wrapper(it.find(_.head == s).get)
}
implicit def toarray(w: Wrapper) = w.it
}
Now, you can write that assignment like:
import ArrayAsMap._
data("foo") <<- "bar"
A few things about this:
The approach you are using is really bad: not only do you have to scan the entire array to find the key every time, you are also making implicit assumptions about the contents of the (outer) array, which is especially bad because it is mutable. Something like this data("foo") = Array.empty will cause that code to crash badly at runtime.
You should avoid using mutable state (var) and mutable containers (like Array or mutable.Map ... Arrays are a kind of inevitable legacy from Java, but you should not normally mutate them). Code that uses immutable and referentially transparent statements is much easier to read, maintain and reason about, and much less error-prone. 99% of real-life use cases in Scala do not require mutable state if implemented correctly. So, it might be a good idea for you to just pretend that vars and mutable containers do not exist at all, and that you cannot change any value once it is assigned, until you have learned enough Scala to recognize the 1% of cases where mutation is really necessary.
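If changing the data structure is an option, the immutable route suggested above could look like the following sketch. The names asMap, updated and toRows are hypothetical; the conversion assumes every row has exactly two elements and that keys are unique.

```scala
object KeyedDataDemo {
  // The Nx2 layout from the question.
  val data: Array[Array[String]] = Array(
    Array("item1", ""),
    Array("item2", ""),
    Array("item3", "")
  )

  // Convert each two-element row into a key -> value pair.
  val asMap: Map[String, String] = data.map(row => row(0) -> row(1)).toMap

  // "Updating" returns a new Map; asMap itself is unchanged.
  val updated: Map[String, String] = asMap.updated("item1", "test")

  // Convert back if some other API still needs the array form.
  // An immutable Map does not guarantee the original row order.
  def toRows(m: Map[String, String]): Array[Array[String]] =
    m.toArray.map { case (k, v) => Array(k, v) }
}
```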

Making this code with Map and Set more Scala-ish

How do I make these lines of code more Scala-ish (shorter?). I still get the Java feeling in it (which I want to stay away from). Thanks in advance!
import scala.collection.mutable
val outstandingUserIds: mutable.LinkedHashSet[String] = mutable.LinkedHashSet[String]()
val tweetJson = JacksMapper.readValue[Map[String, AnyRef]](body)
val userObj = tweetJson.get("user")
tweetJson.get("user").foreach(userObj => {
userObj.asInstanceOf[Map[String, AnyRef]].get("id_str").foreach(idStrObj => {
if (outstandingUserIds.exists(outstandingIdStr => outstandingIdStr.equals(idStrObj))) {
outstandingUserIds.remove(idStrObj.asInstanceOf[String])
}
})
})
One thing you want to do in Scala is take advantage of type inference. That way, you don't need to repeat yourself on the LHS:
val outstandingUserIds = mutable.LinkedHashSet[String]()
You also don't need the inner braces after the closure variable userObj =>. Instead, use braces after foreach {} to execute multiple statements:
tweetJson.get("user").foreach { userObj =>
}
In fact, you could use the anonymous variable '_' and say:
tweetJson.get("user").foreach {
_.get("id_str").foreach ...
}
Scala encourages the use of immutable collections. One way to simplify the above even further would be to use collect (instead of exists+delete) which would return a new collection with only the elements you want.
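One possible shape for the immutable version is sketched below. It is an assumption-laden illustration: the Json alias stands in for whatever JacksMapper.readValue returns, userId and remaining are hypothetical helper names, and it uses a for-comprehension with collect on Option plus set removal rather than collect on the set itself.

```scala
object OutstandingIdsDemo {
  // Stand-in for the parsed JSON; the original gets this from JacksMapper.
  type Json = Map[String, AnyRef]

  // Extract user.id_str if both levels exist and have the expected types.
  def userId(tweet: Json): Option[String] =
    for {
      user <- tweet.get("user").collect { case m: Map[_, _] => m.asInstanceOf[Json] }
      id   <- user.get("id_str").collect { case s: String => s }
    } yield id

  // Removing an absent element is a no-op, so no exists check is needed.
  // Returns a new set; the input set is not modified.
  def remaining(outstanding: Set[String], tweet: Json): Set[String] =
    userId(tweet).fold(outstanding)(outstanding - _)
}
```

If insertion order matters, as the LinkedHashSet in the question suggests, an immutable scala.collection.immutable.ListSet could be substituted for Set.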

What are good examples of: "operation of a program should map input values to output values rather than change data in place"

I came across this sentence in Scala in explaining its functional behavior.
operation of a program should map input of values to output values rather than change data in place
Could somebody explain it with a good example?
Edit: Please explain or give an example for the above sentence in its context, and please keep it simple so that it does not add to the confusion.
The most obvious pattern that this is referring to is the difference between how you would write code which uses collections in Java when compared with Scala. If you were writing Scala but in the idiom of Java, then you would be working with collections by mutating data in place. The idiomatic Scala code to do the same would favour the mapping of input values to output values.
Let's have a look at a few things you might want to do to a collection:
Filtering
In Java, if I have a List<Trade> and I am only interested in those trades executed with Deutsche Bank, I might do something like:
for (Iterator<Trade> it = trades.iterator(); it.hasNext();) {
Trade t = it.next();
if (t.getCounterparty() != DEUTSCHE_BANK) it.remove(); // MUTATION
}
Following this loop, my trades collection only contains the relevant trades. But, I have achieved this using mutation - a careless programmer could easily have missed that trades was an input parameter, an instance variable, or is used elsewhere in the method. As such, it is quite possible their code is now broken. Furthermore, such code is extremely brittle for refactoring for this same reason; a programmer wishing to refactor a piece of code must be very careful to not let mutated collections escape the scope in which they are intended to be used and, vice-versa, that they don't accidentally use an un-mutated collection where they should have used a mutated one.
Compare with Scala:
val db = trades filter (_.counterparty == DeutscheBank) //MAPPING INPUT TO OUTPUT
This creates a new collection! It doesn't affect anyone who is looking at trades and is inherently safer.
Mapping
Suppose I have a List<Trade> and I want to get a Set<Stock> for the unique stocks which I have been trading. Again, the idiom in Java is to create a collection and mutate it.
Set<Stock> stocks = new HashSet<Stock>();
for (Trade t : trades) stocks.add(t.getStock()); //MUTATION
In Scala, the idiomatic thing to do is to map the input collection and then convert to a set:
val stocks = (trades map (_.stock)).toSet //MAPPING INPUT TO OUTPUT
Or, if we are concerned about performance:
(trades.view map (_.stock)).toSet
(trades.iterator map (_.stock)).toSet
What are the advantages here? Well:
My code can never observe a partially-constructed result
The application of a function A => B to a Coll[A] to get a Coll[B] is clearer.
Accumulating
Again, in Java the idiom has to be mutation. Suppose we are trying to sum the decimal quantities of the trades we have done:
BigDecimal sum = BigDecimal.ZERO;
for (Trade t : trades) {
sum.add(t.getQuantity()); //MUTATION
}
Again, we must be very careful not to accidentally observe a partially-constructed result! In Scala, we can do this in a single expression:
val sum = (BigDecimal(0) /: trades)(_ + _.quantity) //MAPPING INPUT TO OUTPUT
Or the various other forms:
trades.foldLeft(BigDecimal(0))(_ + _.quantity)
(trades.iterator map (_.quantity)).sum
(trades.view map (_.quantity)).sum
Oh, by the way, there is a bug in the Java implementation! Did you spot it?
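The three Scala operations above (filtering, mapping, accumulating) can be put together in one self-contained sketch. The Trade case class and the sample data are hypothetical; the original does not show how Trade is defined, and counterparties and stocks are plain strings here for brevity.

```scala
object TradesDemo {
  // Hypothetical domain type; the original does not show its definition.
  final case class Trade(counterparty: String, stock: String, quantity: BigDecimal)

  val trades: List[Trade] = List(
    Trade("Deutsche Bank", "VOD", BigDecimal(100)),
    Trade("Other Bank", "BP", BigDecimal(50)),
    Trade("Deutsche Bank", "BP", BigDecimal(25))
  )

  // Filtering: produces a new list; trades is unchanged.
  val db: List[Trade] = trades.filter(_.counterparty == "Deutsche Bank")

  // Mapping: project each trade to its stock, then deduplicate.
  val stocks: Set[String] = trades.map(_.stock).toSet

  // Accumulating: fold from a BigDecimal zero, no mutable accumulator.
  val total: BigDecimal = trades.foldLeft(BigDecimal(0))(_ + _.quantity)
}
```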
I'd say it's the difference between:
var counter = 0
def updateCounter(toAdd: Int): Unit = {
counter += toAdd
}
updateCounter(8)
println(counter)
and:
val originalValue = 0
def addToValue(value: Int, toAdd: Int): Int = value + toAdd
val firstNewResult = addToValue(originalValue, 8)
println(firstNewResult)
This is a gross oversimplification, but fuller examples are things like using a foldLeft to build up a result rather than doing the hard work yourself: foldLeft example
What it means is that if you write pure functions like this you always get the same output from the same input, and there are no side effects, which makes it easier to reason about your programs and ensure that they are correct.
So, for example, the function:
def times2(x:Int) = x*2
is pure, while
import scala.collection.mutable.ListBuffer
def add5ToList(xs: ListBuffer[Int]): Unit = {
xs += 5
}
is impure because it edits data in place as a side effect. This is a problem because that same list could be in use elsewhere in the program and now we can't guarantee the behaviour because it has changed.
A pure version would use immutable lists and return a new list
def add5ToList(xs: List[Int]) = {
5::xs
}
There are plenty of examples with collections, which are easy to come by but might give the wrong impression. This concept works at all levels of the language (it doesn't at the VM level, however). One example is case classes. Consider these two alternatives:
// Java-style
class Person(initialName: String, initialAge: Int) {
def this(initialName: String) = this(initialName, 0)
private var name = initialName
private var age = initialAge
def getName = name
def getAge = age
def setName(newName: String): Unit = { name = newName }
def setAge(newAge: Int): Unit = { age = newAge }
}
val employee = new Person("John")
employee.setAge(40) // we changed the object
// Scala-style
case class Person(name: String, age: Int) {
def this(name: String) = this(name, 0)
}
val employee = new Person("John")
val employeeWithAge = employee.copy(age = 40) // employee still exists!
This concept is applied to the construction of the immutable collections themselves: a List never changes. Instead, new List objects are created when necessary. The use of persistent data structures reduces the copying that would otherwise happen with a mutable data structure.
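A small sketch of what that sharing means for List in practice: prepending allocates a single new cell whose tail is the existing list, so nothing is copied. The object and value names are illustrative only.

```scala
object SharingDemo {
  val xs: List[Int] = List(2, 3)

  // Prepending allocates one new cell that points at xs.
  val ys: List[Int] = 1 :: xs

  // Reference equality: the tail of ys is the very same object as xs.
  val shared: Boolean = ys.tail eq xs
}
```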

Update mutable HashMap value which is a mutable collection

I have a map that looks like this: Map[ A -> Collection[B]]. This map gets updated in a loop - the special thing is, however, that updates mostly just mean adding an element B to the Collection[B] (for some key A).
I am trying to find out if I can get some speedup by changing the type of my Collection from List[ ] to ListBuffer[ ].
Up to now my code looked like this (simplified):
var incoming = new HashMap[A, List[B]]() {
override def default(a: A) = List()
}
..
for(b <- someCollectionOfBs){
..
incoming(b.getA) = b :: incoming(b.getA)
..
}
This works fine. Now, I changed the type of the map so it looks like this:
var incoming = new collection.mutable.HashMap[A, ListBuffer[B]]() {
override def default(a: A) = collection.mutable.ListBuffer()
}
..
for(b <- someCollectionOfBs){
..
incoming(b.getA) += b
..
}
Note the change in how the element B is added to the collection in the 2nd example (no more immutable List, hence we do not need to create and assign new collection...).
But. This does not work: incoming(X) += .. does not update the value of the map for X, actually it does not change anything.
What am I missing here? I thought that I should be able to update the values of a mutable HashMap... So, if my values are mutable collections, why can't I just add elements to those?
The default is returned when the key is not found, but it does not update the map with the default value. You can use getOrElseUpdate for that.
incoming.getOrElseUpdate(b.getA, ListBuffer()) += b
That should do what you want.
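The difference can be seen in a small sketch. It uses withDefault rather than overriding default as in the question (an assumption made to avoid subclassing HashMap); in both cases the default value is returned for a missing key but never stored. The names lossy and kept are illustrative.

```scala
import scala.collection.mutable

object DefaultVsUpdateDemo {
  // A map with a default: lookups of missing keys return a fresh buffer,
  // but that buffer is never put into the map.
  val lossy: mutable.Map[String, mutable.ListBuffer[Int]] =
    mutable.Map.empty[String, mutable.ListBuffer[Int]].withDefault(_ => mutable.ListBuffer[Int]())

  // Appends to a temporary buffer that is then discarded.
  lossy("a") += 1

  val kept: mutable.HashMap[String, mutable.ListBuffer[Int]] =
    mutable.HashMap.empty[String, mutable.ListBuffer[Int]]

  // Stores a new buffer under the key first, then appends to it.
  kept.getOrElseUpdate("a", mutable.ListBuffer[Int]()) += 1

  val lossySize: Int = lossy.size    // the default was never stored
  val keptSize: Int = kept("a").size // the appended element is retained
}
```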
Additional note:
If you're concerned about performance, I don't think replacing List with ListBuffer will buy you much, because you are prepending to a List and that should be very fast. ListBuffer is handy when you want to append to a list. You should look at using java.util.HashMap and see if that helps.