Subtract Seq[A] from Seq[B] - scala

I have two classes A and B. Both of them have the same property: id and many other different properties.
How can I subtract Seq[A] from Seq[B] by matching the id's?

This should work as long as the id fields of both classes have the same type.
val as: Seq[A] = ???
val bs: Seq[B] = ???
val asSet = as.iterator.map(a => a.id).toSet
val subtracted: Seq[B] = bs.filterNot(b => asSet(b.id))
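For reference, here is a self-contained sketch of the same idea. The fields other than id, the object name SubtractById and the parameter name toRemove are hypothetical and only there to make the snippet compile on its own.

```scala
object SubtractById {
  // Hypothetical stand-ins for the asker's classes; only `id` is shared.
  final case class A(id: Int, label: String)
  final case class B(id: Int, weight: Double)

  // Keeps every B whose id does not occur among the As.
  def subtractById(bs: Seq[B], toRemove: Seq[A]): Seq[B] = {
    // Building the Set once makes each membership check effectively constant time.
    val idsToRemove: Set[Int] = toRemove.iterator.map(_.id).toSet
    bs.filterNot(b => idsToRemove(b.id))
  }
}
```

Note that this removes every B with a matching id, including duplicates.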

Another feasible solution:
val seqSub = seqB.filterNot(x => seqA.exists(_.id == x.id))

I couldn't find an answer matching my definition of subtract, where duplicate elements aren't all filtered out (e.g. Seq(1,2,2) subtract Seq(2) = Seq(1,2), whereas det0's definition gives Seq(1)), so I'm posting it here.
trait IntId {
def id: Int
}
case class A(id: Int) extends IntId
case class B(id: Int) extends IntId
val seqB = Seq(B(1),B(4),B(7),B(7),B(7))
val seqA = Seq(A(7))
// BSubtractA = Seq(B(1),B(4),B(7),B(7)), only remove one instance of id 7
val BSubtractA = seqA.foldLeft(seqB){
case (seqBAccumulated, a) =>
val indexOfA = seqBAccumulated.map(_.id).indexOf(a.id)
if(indexOfA >= 0) {
seqBAccumulated.take(indexOfA) ++ seqBAccumulated.drop(indexOfA + 1)
}
else {
seqBAccumulated
}
}
Yes, there are shortcomings to this solution. For example, if seqA is larger than seqB, then it runs into null pointers (and I haven't refactored it into a def). The performance could also be improved to iterate fewer times over the input; however, this satisfied my use case.
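One possible way to iterate fewer times over the input is to count the ids to remove up front and then walk seqB once. This is only a sketch under that assumption, not the original poster's code; the names SubtractOnce, subtractOnce and toRemove are made up for the example.

```scala
object SubtractOnce {
  final case class A(id: Int)
  final case class B(id: Int)

  // Removes at most one B per A with the same id, keeping the order of the Bs.
  def subtractOnce(bs: Seq[B], toRemove: Seq[A]): Seq[B] = {
    // How many occurrences of each id still have to be removed.
    val initial: Map[Int, Int] =
      toRemove.groupBy(_.id).map { case (id, xs) => id -> xs.size }
    val (kept, _) = bs.foldLeft((Vector.empty[B], initial)) {
      case ((acc, pending), b) =>
        pending.get(b.id) match {
          case Some(n) if n > 0 => (acc, pending.updated(b.id, n - 1)) // drop this one
          case _                => (acc :+ b, pending)                 // keep it
        }
    }
    kept
  }
}
```

Ids in toRemove that never occur in bs are simply ignored, so the relative sizes of the two sequences do not matter here.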

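This is cleaner, but note that contains compares whole elements with ==, so it only helps when elements of seqA and seqB can be equal to each other, not when only the ids match: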
val seqSub = seqB.filterNot(x => seqA.contains(x))

Related

How to convert values of a case class into Seq?

I am new to Scala and I am having to provide values extracted from an object/case class into a Seq. I was wondering whether there would be any generic way of extracting values of an object into Seq of those values in order?
Convert the following:
case class Customer(name: Option[String], age: Int)
val customer = Customer(Some("John"), 24)
into:
val values = Seq("John", 24)
A case class extends the Product trait, which provides such a method:
case class Person(age:Int, name:String, lastName:Option[String])
def seq(p:Product) = p.productIterator.toList
val s:Seq[Any] = seq(Person(100, "Albert", Some("Einstein")))
println(s) //List(100, Albert, Some(Einstein))
https://scalafiddle.io/sf/oD7qk8u/0
The problem is that you get an untyped list/array from it. Most of the time this is not the best way of doing things, and you should prefer statically typed solutions where possible.
Scala 3 (Dotty) might give us HList out-of-the-box, which is a way of getting a product's values without losing type information. Given val picard = Customer(Some("Picard"), 75) consider the difference between
val l: List[Any] = picard.productIterator.toList
l(1)
// val res0: Any = 75
and
val hl: (Option[String], Int) = Tuple.fromProductTyped(picard)
hl(1)
// val res1: Int = 75
Note how res1 did not lose type information.
Informally, it might help to think of an HList as making a case class more generic by dropping its name whilst retaining its fields, for example, whilst Person and Robot are two separate models
Robot(name: Option[String], age: Int)
Person(name: Option[String], age: Int)
they could both be represented by a common "HList" that looks something like
(_: Option[String], _: Int) // I dropped the names
If it's enough for you to have a Seq[Any], you can use the productIterator approach proposed by @Scalway. If I understood correctly, you also want to unpack Option fields, but you haven't specified what to do in the None case, e.g. Customer(None, 24).
val values: Seq[Any] = customer.productIterator.map {
case Some(x) => x
case x => x
}.toSeq // List(John, 24)
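Since the question leaves the None case open, here is a sketch of one possible policy: unwrap Some and skip None fields entirely. Skipping is an assumption made for the example, not something the asker requested, and the names CaseClassValues and values are hypothetical.

```scala
object CaseClassValues {
  final case class Customer(name: Option[String], age: Int)

  // Unwraps Some(x) to x and, as an assumed policy, leaves None fields out.
  def values(p: Product): Seq[Any] =
    p.productIterator.collect {
      case Some(x)        => x // unwrap present optional fields
      case x if x != None => x // keep every other field except None
    }.toList
}
```

With this policy the resulting sequence can be shorter than the number of fields, so positions no longer line up with the case class definition.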
A statically typed solution would be to use a heterogeneous collection, e.g. an HList (here from shapeless):
import shapeless._
class Default[A](val value: A)
object Default {
implicit val int: Default[Int] = new Default(0)
implicit val string: Default[String] = new Default("")
//...
}
trait LowPriorityUnpackOption extends Poly1 {
implicit def default[A]: Case.Aux[A, A] = at(identity)
}
object unpackOption extends LowPriorityUnpackOption {
implicit def option[A](implicit default: Default[A]): Case.Aux[Option[A], A] = at {
case Some(a) => a
case None => default.value
}
}
val values: String :: Int :: HNil =
Generic[Customer].to(customer).map(unpackOption) // John :: 24 :: HNil
Generally it would be better to work with Option monadically rather than to unpack them.

Scala case class: avoid recomputation of the property

I am asking this because I have encountered this use case many times.
Let's say we have a case class like this:
case class C(xs: Iterable[Int]) {
val max = xs.max
def ++(that: C) = C(xs ++ that.xs)
}
This works fine, but the ++ operation is inefficient, since the collection is needlessly traversed once more to compute the maximum of the result; since we already know the maximums of both collections, we could reuse that - by using something like this:
def ++(that: C) =
C(xs ++ that.xs, max = math.max(max, that.max))
This is just a simple example to demonstrate the purpose - the computation avoided could be a lot more complex, or maybe even a TCP data fetch.
How to avoid this recomputation (see the second code snippet), keeping the code elegant?
Something like this would work
class C private (val xs: Iterable[Int], val max: Int) {
def ++(that: C) = new C(xs ++ that.xs, math.max(this.max, that.max))
}
object C {
def apply(xs: Iterable[Int]) = new C(xs, xs.max)
}
Note that C is no longer a case class to avoid max and xs becoming inconsistent. If C was a case class, you could call e.g. c.copy(max = -1) and get an inconsistent instance.
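A compilable version of this approach, wrapped in a hypothetical object CachedMax so that the class and its companion live in one scope, might look like this:

```scala
object CachedMax {
  // Not a case class, so there is no generated copy that could desynchronise xs and max.
  final class C private (val xs: Iterable[Int], val max: Int) {
    // Reuses both cached maxima instead of traversing the concatenation again.
    def ++(that: C): C = new C(xs ++ that.xs, math.max(max, that.max))
  }

  object C {
    // The only public entry point; computes max once per freshly supplied collection.
    def apply(xs: Iterable[Int]): C = new C(xs, xs.max)
  }
}
```

Because the constructor is private, callers can only go through C.apply, which is what keeps xs and max consistent.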
case class C(xs: Iterable[Int]) {
private var maxOp = Option.empty[Int]
lazy val max = maxOp getOrElse {
maxOp = Some(xs.max)
maxOp.get
}
def ++(that: C) = {
val res = C(xs ++ that.xs)
res.maxOp = Some(math.max(this.max, that.max))
res
}
}
Since max is already a val (in contrast to a method) you could do it this way:
case class C private (xs: Iterable[Int], max: Int) {
def ++(that: C) = C(xs ++ that.xs, math.max(max, that.max))
def copy(_xs: Iterable[Int] = this.xs) = {
if (_xs == this.xs) {
C(xs, max)
} else {
C(_xs)
}
}
}
object C {
def apply(xs: Iterable[Int]): C = C(xs, xs.max)
}
If you are going to pattern match on this case class, then it depends on your use cases, if you can (or must) pattern match on max as well.
Update 1 As pointed out by Rüdiger, I have added private to the constructor so that xs and max are consistent.
Update 2 As pointed out by som-snytt, the copy method must be handled as well to prevent inconsistency.
sealed trait C {
val xs: Iterable[Int]
val max: Int
def ++(that: C) = ComposedC(this, that)
}
case class ValidatedC(xs: Iterable[Int]) extends C {
val max = xs.max
}
case class ComposedC(a: C, b: C) extends C {
val max = math.max(a.max, b.max)
val xs = a.xs ++ b.xs
}
object C {
def apply(xs: Iterable[Int]) = ValidatedC(xs)
}
A simpler solution (which doesn't enforce correctness) -
Introduce a way to provide pre-computed max and an auxiliary constructor that gets 2 Cs.
case class C(xs: Iterable[Int])(val max: Int = xs.max) {
def this(a: C, b: C) = {
this(a.xs ++ b.xs)(math.max(a.max, b.max))
}
def ++(that: C) = new C(this, that)
}

Two methods in one scala

Starting my first project with Scala: a poker framework.
So I have the following class
class Card(rank1: CardRank, suit1: Suit){
val rank = rank1
val suit = suit1
}
And a Utils object which contains two methods that do almost the same thing: they count number of cards for each rank or suit
def getSuits(cards: List[Card]) = {
def getSuits(cards: List[Card], suits: Map[Suit, Int]): (Map[Suit, Int]) = {
if (cards.isEmpty)
return suits
val suit = cards.head.suit
val value = if (suits.contains(suit)) suits(suit) + 1 else 1
getSuits(cards.tail, suits + (suit -> value))
}
getSuits(cards, Map[Suit, Int]())
}
def getRanks(cards: List[Card]): Map[CardRank, Int] = {
def getRanks(cards: List[Card], ranks: Map[CardRank, Int]): Map[CardRank, Int] = {
if (cards isEmpty)
return ranks
val rank = cards.head.rank
val value = if (ranks.contains(rank)) ranks(rank) + 1 else 1
getRanks(cards.tail, ranks + (rank -> value))
}
getRanks(cards, Map[CardRank, Int]())
}
Is there any way I can "unify" these two methods in a single one with "field/method-as-parameter"?
Thanks
Yes, that would require a higher-order function (that is, a function that takes a function as a parameter) and type parameters (generics):
def groupAndCount[A,B](elements: List[A], toCount: A => B): Map[B, Int] = {
// could be your implementation, just note key instead of suit/rank
// and change val suit = ... or val rank = ...
// to val key = toCount(card.head)
}
then
def getSuits(cards: List[Card]) = groupAndCount(cards, {c : Card => c.suit})
def getRanks(cards: List[Card]) = groupAndCount(cards, {c: Card => c.rank})
You do not need type parameter A, you could force the method to work only on Card, but that would be a pity.
For extra credit, you can use two parameter lists, and have
def groupAndCount[A,B](elements: List[A])(toCount: A => B): Map[B, Int] = ...
This is a small peculiarity of Scala's type inference: with two parameter lists, the type of the first argument is known before the second is checked, so you do not need to annotate the card argument when writing the function:
def getSuits(cards: List[Card]) = groupAndCount(cards)(c => c.suit)
or just
def getSuits(cards: List[Card]) = groupAndCount(cards)(_.suit)
Of course, the library can help you with the implementation
def groupAndCount[A,B](l: List[A])(toCount: A => B) : Map[B, Int] =
l.groupBy(toCount).map{case (k, elems) => (k, elems.length)}
although a hand made implementation might be marginally faster.
A minor note, Card should be declared a case class :
case class Card(rank: CardRank, suit: Suit)
// declaration done, nothing else needed
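Putting the pieces of this answer together, a runnable sketch could look like the following. The question does not show CardRank or Suit, so the definitions here are hypothetical placeholders, as is the wrapper object PokerCounts.

```scala
object PokerCounts {
  // Hypothetical minimal models; the question does not show CardRank or Suit.
  sealed trait Suit
  case object Hearts extends Suit
  case object Spades extends Suit
  final case class CardRank(value: Int)
  final case class Card(rank: CardRank, suit: Suit)

  // Counts how many elements map to each key; the second parameter list
  // lets the compiler infer the argument type of toCount.
  def groupAndCount[A, B](elements: List[A])(toCount: A => B): Map[B, Int] =
    elements.foldLeft(Map.empty[B, Int]) { (counts, element) =>
      val key = toCount(element)
      counts.updated(key, counts.getOrElse(key, 0) + 1)
    }

  def getSuits(cards: List[Card]): Map[Suit, Int] = groupAndCount(cards)(_.suit)
  def getRanks(cards: List[Card]): Map[CardRank, Int] = groupAndCount(cards)(_.rank)
}
```

The fold replaces the explicit recursion and early return of the original methods while producing the same counts.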

Is there such a thing as bidirectional maps in Scala?

I'd like to link 2 columns of unique identifiers and be able to get a first column value by a second column value as well as a second column value by a first column value. Something like
Map(1 <-> "one", 2 <-> "two", 3 <-> "three")
Is there such a facility in Scala?
Actually I need even more: 3 columns to select any in a triplet by another in a triplet (individual values will never be met more than once in the entire map). But a 2-column bidirectional map can help too.
Guava has a bimap that you can use along with
import scala.collection.JavaConversions._
My BiMap approach:
object BiMap {
private[BiMap] trait MethodDistinctor
implicit object MethodDistinctor extends MethodDistinctor
}
case class BiMap[X, Y](map: Map[X, Y]) {
def this(tuples: (X,Y)*) = this(tuples.toMap)
private val reverseMap = map map (_.swap)
require(map.size == reverseMap.size, "no 1 to 1 relation")
def apply(x: X): Y = map(x)
def apply(y: Y)(implicit d: BiMap.MethodDistinctor): X = reverseMap(y)
val domain = map.keys
val codomain = reverseMap.keys
}
val biMap = new BiMap(1 -> "A", 2 -> "B")
println(biMap(1)) // A
println(biMap("B")) // 2
Of course one can add syntax for <-> instead of ->.
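As a rough illustration of that last remark, the arrow can be added with an implicit class that simply builds a pair. Everything named here (BiMapSyntax, BiArrow, byKey, byValue, of) is hypothetical, and the BiMap is a simplified variant with separately named lookups rather than the overloaded apply shown above.

```scala
object BiMapSyntax {
  // Assumed helper: `a <-> b` simply builds the pair (a, b), like `->` does.
  implicit class BiArrow[X](x: X) {
    def <->[Y](y: Y): (X, Y) = (x, y)
  }

  // Minimal two-way lookup with separately named methods,
  // which sidesteps the overload ambiguity when X and Y are the same type.
  final case class BiMap[X, Y](forward: Map[X, Y]) {
    private val backward: Map[Y, X] = forward.map(_.swap)
    require(forward.size == backward.size, "no 1 to 1 relation")
    def byKey(x: X): Option[Y] = forward.get(x)
    def byValue(y: Y): Option[X] = backward.get(y)
  }

  object BiMap {
    def of[X, Y](pairs: (X, Y)*): BiMap[X, Y] = BiMap(pairs.toMap)
  }
}
```

With import BiMapSyntax._ in scope, BiMap.of(1 <-> "one", 2 <-> "two") reads much like the syntax in the question.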
Here's a quick Scala wrapper for Guava's BiMap.
import com.google.common.{collect => guava}
import scala.collection.JavaConversions._
import scala.collection.mutable
import scala.language.implicitConversions
class MutableBiMap[A, B] private (
private val g: guava.BiMap[A, B] = guava.HashBiMap.create[A, B]()) {
def inverse: MutableBiMap[B, A] = new MutableBiMap[B, A](g.inverse)
}
object MutableBiMap {
def empty[A, B]: MutableBiMap[A, B] = new MutableBiMap()
implicit def toMap[A, B] (x: MutableBiMap[A, B]): mutable.Map[A,B] = x.g
}
I have a really simple BiMap in Scala:
import scala.reflect.ClassTag
case class BiMap[A, B](elems: (A, B)*) {
def groupBy[X, Y](pairs: Seq[(X, Y)]) = pairs groupBy {_._1} mapValues {_ map {_._2} toSet}
val (left, right) = (groupBy(elems), groupBy(elems map {_.swap}))
def apply(key: A) = left(key)
def apply[C: ClassTag](key: B) = right(key)
}
Usage:
val biMap = BiMap(1 -> "x", 2 -> "y", 3 -> "x", 1 -> "y")
assert(biMap(1) == Set("x", "y"))
assert(biMap("x") == Set(1, 3))
I don't think it exists out of the box, because the generic behavior is not easy to pin down: how do you handle values matching several keys in a clean API?
However, for specific cases, here is an exercise that might help. It would need to be improved, because no hashing is used, so getting a key or a value is O(n).
The idea is to let you write something similar to what you propose, but using a Seq instead of a Map.
With the help of implicits and traits, plus find, you can emulate what you need with a fairly clean API (fromKey, fromValue).
The limitation is that a value is not supposed to appear in several places, at least in this implementation.
trait BiMapEntry[K, V] {
def key:K
def value:V
}
trait Sem[K] {
def k:K
def <->[V](v:V):BiMapEntry[K, V] = new BiMapEntry[K, V]() { val key = k; val value = v}
}
trait BiMap[K, V] {
def fromKey(k:K):Option[V]
def fromValue(v:V):Option[K]
}
object BiMap {
implicit def fromInt(i:Int):Sem[Int] = new Sem[Int] {
def k = i
}
implicit def fromSeq[K, V](s:Seq[BiMapEntry[K, V]]) = new BiMap[K, V] {
def fromKey(k:K):Option[V] = s.find(_.key == k).map(_.value)
def fromValue(v:V):Option[K] = s.find(_.value == v).map(_.key)
}
}
object test extends App {
import BiMap._
val a = 1 <-> "a"
val s = Seq(1 <-> "a", 2 <-> "b")
println(s.fromKey(2))
println(s.fromValue("a"))
}
Scala's default collections are immutable and values are stored by reference rather than copied, so the memory footprint of a second map is only the extra references. It is therefore better to keep two maps, one keyed by A and mapping to B and one keyed by B and mapping to A, than to swap a map at run time. Swapping also has its own memory footprint: the newly swapped hash map stays in memory until the enclosing call returns and the garbage collector runs. If swapping is needed frequently, you end up using as much memory as the naive two-map implementation, or more.
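A minimal sketch of the two-map idea, where both directions are updated together on insertion so that no swap is ever needed, might look like this. The names TwoMaps, getByLeft and getByRight are invented for the example, and replacing stale pairs on conflict is an assumed policy.

```scala
object TwoMaps {
  // Both directions are stored, so neither lookup needs a run-time swap of the map.
  final class BiMap[A, B] private (val forward: Map[A, B], val backward: Map[B, A]) {
    // Adding a pair updates both maps; stale entries for a or b are dropped first
    // so the relation stays one-to-one (an assumed conflict policy).
    def +(pair: (A, B)): BiMap[A, B] = {
      val (a, b) = pair
      val f = backward.get(b).fold(forward)(oldA => forward - oldA)
      val r = forward.get(a).fold(backward)(oldB => backward - oldB)
      new BiMap(f.updated(a, b), r.updated(b, a))
    }
    def getByLeft(a: A): Option[B] = forward.get(a)
    def getByRight(b: B): Option[A] = backward.get(b)
  }

  object BiMap {
    def empty[A, B]: BiMap[A, B] = new BiMap(Map.empty, Map.empty)
  }
}
```

Each insertion touches both maps once, and the keys and values themselves are shared between the two maps rather than copied.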
One more approach you can try with a single map (it only works for getting a key from a mapped value):
def getKeyByValue[A,B](map: Map[A,B], value: B): Option[A] = map.find { case (_, b) => b == value }.map(_._1)
Code for Scala implementation of find by key:
/** Find entry with given key in table, null if not found.
*/
@deprecatedOverriding("No sensible way to override findEntry as private findEntry0 is used in multiple places internally.", "2.11.0")
protected def findEntry(key: A): Entry =
findEntry0(key, index(elemHashCode(key)))
private[this] def findEntry0(key: A, h: Int): Entry = {
var e = table(h).asInstanceOf[Entry]
while (e != null && !elemEquals(e.key, key)) e = e.next
e
}

How can I extend Scala collections with an argmax method?

I would like to add to all collections where it makes sense, an argMax method.
How to do it? Use implicits?
On Scala 2.8, this works:
val list = List(1, 2, 3)
def f(x: Int) = -x
val argMax = list max (Ordering by f)
As pointed out by mkneissl, this does not return the set of maximum points. Here's an alternate implementation that does, and tries to reduce the number of calls to f. If calls to f don't matter that much, see mkneissl's answer. Also, note that his answer is curried, which provides superior type inference.
def argMax[A, B: Ordering](input: Iterable[A], f: A => B) = {
val fList = input map f
val maxFList = fList.max
input.view zip fList filter (_._2 == maxFList) map (_._1) toSet
}
scala> argMax(-2 to 2, (x: Int) => x * x)
res15: scala.collection.immutable.Set[Int] = Set(-2, 2)
The argmax function (as I understand it from Wikipedia)
def argMax[A,B](c: Traversable[A])(f: A=>B)(implicit o: Ordering[B]): Traversable[A] = {
val max = (c map f).max(o)
c filter { f(_) == max }
}
If you really want, you can pimp it onto the collections
implicit def enhanceWithArgMax[A](c: Traversable[A]) = new {
def argMax[B](f: A=>B)(implicit o: Ordering[B]): Traversable[A] = ArgMax.argMax(c)(f)(o)
}
and use it like this
val l = -2 to 2
assert (argMax(l)(x => x*x) == List(-2,2))
assert (l.argMax(x => x*x) == List(-2,2))
(Scala 2.8)
Yes, the usual way would be to use the 'pimp my library' pattern to decorate your collection. For example (N.B. just as illustration, not meant to be a correct or working example):
trait PimpedList[A] {
val l: List[A]
//example argMax, not meant to be correct
def argMax[T <% Ordered[T]](f:T => T) = {error("your definition here")}
}
implicit def toPimpedList[A](xs: List[A]) = new PimpedList[A] {
val l = xs
}
scala> def f(i:Int):Int = 10
f: (i: Int) Int
scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> l.argMax(f)
java.lang.RuntimeException: your definition here
at scala.Predef$.error(Predef.scala:60)
at PimpedList$class.argMax(:12)
//etc etc...
Nice and easy (this returns the index of the maximum element):
val l = List(1,0,10,2)
l.zipWithIndex.maxBy(x => x._1)._2
You can add functions to an existing API in Scala by using the Pimp my Library pattern. You do this by defining an implicit conversion function. For example, I have a class Vector3 to represent 3D vectors:
class Vector3 (val x: Float, val y: Float, val z: Float)
Suppose I want to be able to scale a vector by writing something like: 2.5f * v. I can't directly add a * method to class Float, of course, but I can supply an implicit conversion function like this:
implicit def scaleVector3WithFloat(f: Float) = new {
def *(v: Vector3) = new Vector3(f * v.x, f * v.y, f * v.z)
}
Note that this returns an object of a structural type (the new { ... } construct) that contains the * method.
I haven't tested it, but I guess you could do something like this:
implicit def argMaxImplicit[A](t: Traversable[A]) = new {
def argMax() = ...
}
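To fill in the elided body, here is one possible sketch using an implicit class instead of a structural type. It assumes argMax should take the scoring function f as a parameter and return all maximising elements as a List; those choices, and the names ArgMaxSyntax and ArgMaxOps, are assumptions for the example.

```scala
object ArgMaxSyntax {
  implicit class ArgMaxOps[A](xs: Iterable[A]) {
    // Returns every element at which f attains its maximum;
    // f is evaluated once per element.
    def argMax[B](f: A => B)(implicit ord: Ordering[B]): List[A] =
      if (xs.isEmpty) Nil
      else {
        val scored = xs.iterator.map(a => (a, f(a))).toList
        val best = scored.map(_._2).max
        scored.collect { case (a, b) if ord.equiv(b, best) => a }
      }
  }
}
```

After import ArgMaxSyntax._, the method is available on any Iterable, for example (-2 to 2).argMax(x => x * x).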
Here's a way of doing so with the implicit builder pattern. It has the advantage over the previous solutions that it works with any Traversable, and returns a similar Traversable. Sadly, it's pretty imperative. If anyone wants to, it could probably be turned into a fairly ugly fold instead.
object RichTraversable {
implicit def traversable2RichTraversable[A](t: Traversable[A]) = new RichTraversable[A](t)
}
class RichTraversable[A](t: Traversable[A]) {
def argMax[That, C](g: A => C)(implicit bf : scala.collection.generic.CanBuildFrom[Traversable[A], A, That], ord:Ordering[C]): That = {
var minimum:C = null.asInstanceOf[C]
val repr = t.repr
val builder = bf(repr)
for(a<-t){
val test: C = g(a)
if(test == minimum || minimum == null){
builder += a
minimum = test
}else if (ord.gt(test, minimum)){
builder.clear
builder += a
minimum = test
}
}
builder.result
}
}
Set(-2, -1, 0, 1, 2).argMax(x=>x*x) == Set(-2, 2)
List(-2, -1, 0, 1, 2).argMax(x=>x*x) == List(-2, 2)
Here's a variant loosely based on @Daniel's accepted answer that also works for Sets.
def argMax[A, B: Ordering](input: GenIterable[A], f: A => B) : GenSet[A] = argMaxZip(input, f) map (_._1) toSet
def argMaxZip[A, B: Ordering](input: GenIterable[A], f: A => B): GenIterable[(A, B)] = {
if (input.isEmpty) Nil
else {
val fPairs = input map (x => (x, f(x)))
val maxF = fPairs.map(_._2).max
fPairs filter (_._2 == maxF)
}
}
One could also do a variant that produces (B, Iterable[A]), of course.
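Such a variant could be sketched as a single fold that tracks the best value seen so far together with the inputs that reach it. The name argMaxWithValue and the choice to return an Option (None for empty input) are assumptions made for this example.

```scala
object ArgMaxWithValue {
  // Returns the maximum of f together with every input that attains it,
  // or None for an empty input. f is evaluated once per element.
  def argMaxWithValue[A, B](input: Iterable[A])(f: A => B)(
      implicit ord: Ordering[B]): Option[(B, List[A])] =
    input.foldLeft(Option.empty[(B, List[A])]) {
      case (None, a) => Some((f(a), List(a)))
      case (Some((best, args)), a) =>
        val b = f(a)
        if (ord.gt(b, best)) Some((b, List(a)))               // new maximum, start over
        else if (ord.equiv(b, best)) Some((best, a :: args))  // ties are accumulated
        else Some((best, args))
    }.map { case (best, args) => (best, args.reverse) }       // restore input order
}
```

For ordered inputs such as a Range or a List, the arguments come back in input order.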
Based on other answers, you can pretty easily combine the strengths of each (minimal calls to f(), etc.). Here we have an implicit conversion for all Iterables (so they can just call .argmax() transparently), and a stand-alone method if for some reason that is preferred. ScalaTest tests to boot.
class Argmax[A](col: Iterable[A]) {
def argmax[B](f: A => B)(implicit ord: Ordering[B]): Iterable[A] = {
val mapped = col map f
val max = mapped max ord
(mapped zip col) filter (_._1 == max) map (_._2)
}
}
object MathOps {
implicit def addArgmax[A](col: Iterable[A]) = new Argmax(col)
def argmax[A, B](col: Iterable[A])(f: A => B)(implicit ord: Ordering[B]) = {
new Argmax(col) argmax f
}
}
class MathUtilsTests extends FunSuite {
import MathOps._
test("Can argmax with unique") {
assert((-10 to 0).argmax(_ * -1).toSet === Set(-10))
// or alternate calling syntax
assert(argmax(-10 to 0)(_ * -1).toSet === Set(-10))
}
test("Can argmax with multiple") {
assert((-10 to 10).argmax(math.pow(_, 2)).toSet === Set(-10, 10))
}
}