How do I remove the proper subsets from a list of sets in Scala? - scala

I have a list of sets of integers as follows: {(1, 0), (0, 1, 2), (1, 2), (1, 2, 3, 4, 5), (3, 4)}.
I want to write a program in Scala that removes every set that is a proper subset of some other set in the list, i.e. the final result would be: {(0,1,2), (1,2,3,4,5)}.
An O(n^2) solution checks each set against the entire list, but that would be very expensive and does not scale well to ~100000 sets. I also thought of generating edges from the sets, removing duplicate edges, and running a DFS, but I have no idea how to do that in idiomatic Scala (rather than a one-to-one translation of Java code).

Individual elements (sets) need only be compared to other elements of the same size or larger.
val ss = List(Set(1, 0), Set(0, 1, 2), Set(1, 2), Set(1, 2, 3, 4, 5), Set(3, 4))
ss.sortBy(- _.size) match {
  case Nil => Nil
  case hd::tl =>
    tl.foldLeft(List(hd)){case (acc, s) =>
      if (acc.exists(s.forall(_))) acc
      else s::acc
    }
}
//res0: List[Set[Int]] = List(Set(0, 1, 2), Set(5, 1, 2, 3, 4))
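The same idea can be packaged as a reusable function. This is only a sketch: the names SubsetFilter and removeSubsets are made up for illustration, and it spells the containment test with subsetOf instead of forall. Like the fold above, it also drops exact duplicates (an equal set counts as a subset), and the worst case is still quadratic when few sets are contained in others.

```scala
object SubsetFilter {
  // Keep only the sets that are not contained in a set that was kept earlier.
  // Sorting by descending size guarantees any superset is seen before its subsets.
  def removeSubsets[A](sets: List[Set[A]]): List[Set[A]] =
    sets.sortBy(-_.size).foldLeft(List.empty[Set[A]]) { (kept, s) =>
      if (kept.exists(big => s.subsetOf(big))) kept // s is covered, skip it
      else s :: kept
    }
}
```

Calling SubsetFilter.removeSubsets(ss) on the list from the question should leave the two sets Set(0, 1, 2) and Set(1, 2, 3, 4, 5).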

Related

Printing specific output in Scala

I have the following array of arrays that represents a cycle in a graph that I want to print in the below format.
scala> result.collect
Array[Array[Long]] = Array(Array(0, 1, 4, 0), Array(1, 5, 2, 1), Array(1, 4, 0, 1), Array(2, 3, 5, 2), Array(2, 1, 5, 2), Array(3, 5, 2, 3), Array(4, 0, 1, 4), Array(5, 2, 3, 5), Array(5, 2, 1, 5))
0:0->1->4;
1:1->5->2;1->4->0;
2:2->3->5;2->1->5;
3:3->5->2;
4:4->0->1;
5:5->2->3;5->2->1;
How can I do this? I tried a for loop with if statements, as in other languages, but Scala's ifs in for loops are filters (guards) and cannot be used as if/else to handle two different cases.
Example Python-style pseudocode:
for (array,i) in enumerate(range(0,result.length)):
    if array[i] == array[i+1]:
        # print thing needed
    else:
        # print other thing
I also tried result.groupBy to make printing easier, but then the arrays print as object references:
Array[(Long, Iterable[Array[Long]])] = Array((4,CompactBuffer([J@3677a08a)), (0,CompactBuffer([J@695fd7e)), (1,CompactBuffer([J@50b0f441, [J@142efc4d)), (3,CompactBuffer([J@1fd66db2)), (5,CompactBuffer([J@36811d3b, [J@61c4f556)), (2,CompactBuffer([J@2eba1b7, [J@2efcf7a5)))
Is there a way to nicely print the output needed in Scala?
This should do it:
result
  .groupBy(_.head)
  .toArray
  .sortBy(_._1)
  .map {
    case (node, cycles) =>
      val paths = cycles.map { cycle =>
        cycle
          .init // drop last node
          .mkString("->")
      }
      s"$node:${paths.mkString(";")}"
  }
  .mkString(";\n")
This is the output for the sample input you provided:
0:0->1->4;
1:1->5->2;1->4->0;
2:2->3->5;2->1->5;
3:3->5->2;
4:4->0->1;
5:5->2->3;5->2->1
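Note that the last line of that output has no trailing semicolon, because mkString(";\n") only places the separator between lines. If every line should end with ";", as in the format requested in the question, one possible variant is sketched below. It assumes the cycles have already been collected into a local Array (rather than the distributed collection in the question), and the names CyclePrinter and formatCycles are hypothetical.

```scala
object CyclePrinter {
  // Group cycles by their start node, drop the repeated last node of each cycle,
  // and end every path (and therefore every line) with ';'.
  def formatCycles(cycles: Array[Array[Long]]): String =
    cycles
      .groupBy(_.head)
      .toSeq
      .sortBy(_._1)
      .map { case (node, group) =>
        group.map(_.init.mkString("->")).mkString(s"$node:", ";", ";")
      }
      .mkString("\n")
}
```

The three-argument mkString(start, sep, end) adds the node prefix and the closing semicolon in one step.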

Efficiently find common values in a map of lists - scala

I asked a similar question already here. However, I misjudged the scale of my specific case. In the example I gave, there were only 4 keys in the map. I am actually dealing with over 10,000 keys, and they are mapped to lists of different sizes. So the solution given was correct, but I am now looking for a more efficient way to do this.
Say I have:
val myMap: Map[Int, List[Int]] = Map(
  1 -> List(1, 10, 12, 76, 105), 2 -> List(2, 5, 10), 3 -> List(10, 12, 76, 5), 4 -> List(2, 4, 5, 10),
  ... -> List(...)
)
Imagine the (...) go on for over 10,000 keys. I want to return a List of tuples, each containing a pair of keys and their shared values, for every pair whose lists have an intersection of size >= 3.
For example:
res0: List[(Int, Int, List[Int])] = List(
  (1, 3, List(10, 12, 76)),
  (2, 4, List(2, 5, 10)),
  (...),
  (...),
)
I've been pretty stuck on this for a couple of days, so any help is genuinely appreciated. Thank you in advance!
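If space is not a concern, you can avoid comparing every pair of keys directly by building a reverse lookup. The cost is then proportional to the number of key co-occurrences across all values, which is close to O(N) (N being the total number of list elements) when each value appears in only a few lists.
Algorithm:
1. Create a reverse lookup map from the input map. The reverse lookup maps each list element to the keys (ids) whose lists contain it.
2. For each key in the input map:
   a. Create a temporary counter map.
   b. Iterate over the key's list, look up each value in the reverse lookup, and increment the count of every id returned.
   c. Every id that occurs 3 or more times (other than the key itself) forms a desired pair with the key.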
Code
import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer

object Application extends App {
  val inputMap = Map(
    1 -> List(1, 2, 3, 4),
    2 -> List(2, 3, 4, 5),
    3 -> List(3, 5, 6, 7),
    4 -> List(1, 2, 3, 6, 7))
  /*
  Expected pairs
  | pair | common elements |
  ---------------------------
  (1, 2) -> 2, 3, 4
  (1, 4) -> 1, 2, 3
  (2, 1) -> 2, 3, 4
  (3, 4) -> 3, 6, 7
  (4, 1) -> 1, 2, 3
  (4, 3) -> 3, 6, 7
  */
  // reverse lookup: list element -> ids of the lists that contain it
  val reverseMap = mutable.Map[Int, ArrayBuffer[Int]]()
  inputMap.foreach {
    case (id, list) => list.foreach(
      o => if (reverseMap.contains(o)) reverseMap(o).append(id) else reverseMap.put(o, ArrayBuffer(id)))
  }
  val result = inputMap.map {
    case (id, list) =>
      // m counts how many elements of this list each other id shares
      val m = mutable.Map[Int, Int]()
      list.foreach(o =>
        reverseMap(o).foreach(k => if (m.contains(k)) m.update(k, m(k)+1) else m.put(k, 1)))
      val res = m.toList.filter(o => o._2 >= 3 && o._1 != id).map(o => (id, o._1))
      res
  }.flatten
  println(result)
}
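The program above prints each pair in both directions and does not include the shared values that the question asks for. A possible immutable variant of the same reverse-lookup idea is sketched below; the names SharedValues, sharedValues and minShared are made up for illustration, and it assumes each pair should be reported once as (smaller key, larger key, shared values). Its cost still grows with the number of keys that share a value, so a value present in most lists would bring back quadratic behaviour.

```scala
object SharedValues {
  def sharedValues(m: Map[Int, List[Int]], minShared: Int = 3): List[(Int, Int, List[Int])] = {
    // Reverse lookup: value -> sorted keys whose list contains that value.
    val keysByValue: Map[Int, List[Int]] =
      m.toList
        .flatMap { case (k, vs) => vs.distinct.map(v => v -> k) }
        .groupBy(_._1)
        .map { case (v, pairs) => v -> pairs.map(_._2).sorted }

    // One entry per (smaller key, larger key) pair and value they have in common.
    val coOccurrences: List[((Int, Int), Int)] =
      keysByValue.toList.flatMap { case (v, keys) =>
        keys.combinations(2).map(p => ((p(0), p(1)), v))
      }

    // Keep the pairs that share at least minShared values.
    coOccurrences
      .groupBy(_._1)
      .toList
      .collect { case ((a, b), hits) if hits.size >= minShared => (a, b, hits.map(_._2).sorted) }
      .sortBy(t => (t._1, t._2))
  }
}
```

For the four-key map in the question this should yield (1, 3, List(10, 12, 76)) and (2, 4, List(2, 5, 10)).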

Dependencies in a ListBuffer[Set[Int]] in Scala

I'm solving a problem and I got this:
ant : scala.collection.mutable.ListBuffer[Set[Int]] = ListBuffer(Set(), Set(0), Set(0), Set(1), Set(2), Set(1), Set(3,4), Set(5, 6), Set(7))
The Sets in the ListBuffer represent dependencies. For example, ant(1) is Set(0), which means that ant(1) depends on ant(0), which is Set(). Likewise, ant(7) is Set(5, 6), which means that ant(7) depends on ant(5) and ant(6).
What I need to obtain is a new ListBuffer[Set[Int]] with all the direct and indirect dependencies of each Set, without repetitions. For example, ant(6) depends on ant(3) and ant(4); ant(3) depends on ant(1) and ant(4) depends on ant(2); and ant(1) and ant(2) depend on ant(0). So the result with all the dependencies for ant(6) is: Set(3,4,1,2,0)
So the result of the initial ListBuffer should be:
solution : scala.collection.mutable.ListBuffer[Set[Int]] = ListBuffer(Set(), Set(0), Set(0), Set(1,0), Set(2,0), Set(1,0), Set(3,4,1,2,0), Set(5,6,1,3,4,0,2), Set(7,5,6,1,0,4,3,2))
Which is the best way to do it?
Thanks.
This is definitely the wrong data structure for what you are trying to represent. To get the result you seek you'll have to go through a tortured sequence of steps even more convoluted than the data structure itself.
So here's where we start.
import collection.mutable.ListBuffer
val ant: ListBuffer[Set[Int]] = ListBuffer(Set(), Set(0), Set(0), Set(1), Set(2),
                                           Set(1), Set(3,4), Set(5, 6), Set(7))
Now we need to add the sub-dependencies to each of the current Sets of dependencies. Since these are Sets of Ints, the order of presentation doesn't matter.
ant.map(_.flatMap(x => ant(x) + x))
// ListBuffer(Set(), Set(0), Set(0), Set(0, 1), Set(0, 2), Set(0, 1), Set(1, 3, 2, 4), Set(5, 1, 6, 3, 4), Set(5, 6, 7))
Now we need to repeat that until the new result is the same as the previous result. Stream.iterate will set up the repetitions, and we'll dropWhile each element is different from the one that follows it.
// a ListBuffer Stream
val lbStrm: Stream[ListBuffer[Set[Int]]] =
  Stream.iterate[ListBuffer[Set[Int]]](ant)(_.map(_.flatMap(x => ant(x) + x)))
// grab the first one after the results settle
lbStrm.zipWithIndex.dropWhile{case (lb,x) => lb != lbStrm(x+1)}.head._1
// ListBuffer(Set(), Set(0), Set(0), Set(0, 1), Set(0, 2), Set(0, 1), Set(0, 1, 2, 3, 4), Set(0, 5, 1, 6, 2, 3, 4), Set(0, 5, 1, 6, 2, 7, 3, 4))
Not pretty, but doable. It would be much better to redesign that starting data structure.
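Stream is deprecated since Scala 2.13 in favour of LazyList, so on newer versions the same fixed point can be reached with a small tail-recursive helper instead. The sketch below is one way to do it, not the approach used above: the names Deps and closure are hypothetical, and it assumes the dependencies are held in an immutable Vector. Each pass expands every set with the current dependencies of its members and stops once a pass changes nothing.

```scala
import scala.annotation.tailrec

object Deps {
  // Expand every set with the dependencies of its members until nothing changes.
  @tailrec
  def closure(deps: Vector[Set[Int]]): Vector[Set[Int]] = {
    val next = deps.map(s => s ++ s.flatMap(i => deps(i)))
    if (next == deps) deps else closure(next)
  }
}
```

Because sets only ever grow and are bounded by the number of entries, the recursion terminates even if the input contains dependency cycles.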

Create and Append list based on other list member in Scala

I have a list of lists of integer pairs like this:
val aRowcol: List[List[List[Int]]] = List(
  List(List(0, 0), List(0, 1), List(0, 2)),
  List(List(1, 0), List(1, 1), List(1, 2)),
  List(List(2, 0), List(2, 1), List(2, 2)),
  List(List(0, 0), List(1, 1), List(2, 2)),
  List(List(2, 0), List(1, 1), List(0, 2)),
  List(List(1, 0), List(0, 1), List(0, 2)),
  List(List(1, 0), List(2, 1), List(2, 2)))
val aAlpha: List[List[String]] = List(
  List("a","b","c","d"),
  List("e","f","g","h"),
  List("i","j","k","l","m"))
val i = 4
val resNum:List[List[Int,String]] = (0 to i) {
  _map => List(
    aRowcol.take(i).head.head,
    aRowcol.take(i).head(1),
    aAlpha(aRowcol.take(i).head.head)(aRowcol.take(i).head(1))}
  .toList
But the result I want for val resNum is:
List(
List(0,0,"a"),
List(1,0,"e"),
List(2,0,"i"),
List(0,0,"a"),
List(2,0,"i"))
(0,0) means first row, first column; we have "a" at that position. The value of i defines how many entries we will have. I think it would be much easier with i++, but you can't do i++ in Scala.
I'm guessing that you want to treat the first element in each "list of lists" in aRowcol as the "coordinates" of a letter in aAlpha, and want to append that letter to each of these "first elements".
If so:
val result: List[List[Any]] = aRowcol.take(5) // only 5 first rows
  .map(_.head) // first List(i, j) only, the rest is ignored
  .map { case List(i, j) => List(i, j, aAlpha(i)(j)) } // append the right letter to list
result.foreach(println)
// List(0, 0, a)
// List(1, 0, e)
// List(2, 0, i)
// List(0, 0, a)
// List(2, 0, i)
If that's not what you meant - please clarify.
EDIT: as for your version, it can work (and achieve the same goal) with a few fixes:
1. list.take(i) doesn't return the i-th element; it returns a list with the first i elements. I think you're trying to use list.apply(i), which returns the i-th element, or its shorthand version: list(i).
2. If you want to map over the numbers 0..4, call map and name the argument of the anonymous function you pass i. Don't use a value declared outside the function and expect it to increment.
With these corrections (and some more), your version becomes:
val resNum: List[List[Any]] = (0 to 4).map { i =>
  List(
    aRowcol(i).head.head,
    aRowcol(i).head(1),
    aAlpha(aRowcol(i).head.head)(aRowcol(i).head(1))) }
  .toList
This works as you expect, but the version above is similar yet simpler.
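Both versions produce List[Any], because each inner list mixes Int and String. If the element types matter later on, a tuple keeps them. The following is only a sketch under that assumption: the names Cells and labelFirstCells are hypothetical, and rows whose first entry is not a two-element list are silently skipped by collect.

```scala
object Cells {
  // For the first n rows, take the leading (row, col) pair and attach the letter at that position.
  def labelFirstCells(
      rows: List[List[List[Int]]],
      grid: List[List[String]],
      n: Int): List[(Int, Int, String)] =
    rows.take(n).collect { case List(r, c) :: _ => (r, c, grid(r)(c)) }
}
```

With the data from the question, Cells.labelFirstCells(aRowcol, aAlpha, 5) should give (0,0,a), (1,0,e), (2,0,i), (0,0,a), (2,0,i).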

Scala: sliding(N,N) vs grouped(N)

Lately I have found myself using sliding(n,n) when I need to iterate over collections in groups of n elements without re-processing any of them. I was wondering whether it would be more correct to iterate over those collections using grouped(n). My question is whether there is a special reason to prefer one over the other for this specific case in terms of performance.
val listToGroup = List(1,2,3,4,5,6,7,8)
listToGroup: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)
listToGroup.sliding(3,3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8))
listToGroup.grouped(3).toList
res1: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8))
The reason to use sliding instead of grouped is really only applicable when you want to have the 'windows' be of a length different than what you 'slide' by (that is to say, using sliding(m, n) where m != n):
listToGroup.sliding(2,3).toList
//returns List(List(1, 2), List(4, 5), List(7, 8))
listToGroup.sliding(4,3).toList
//returns List(List(1, 2, 3, 4), List(4, 5, 6, 7), List(7, 8))
As som-snytt points out in a comment, there's not going to be any performance difference, as both of them are implemented within Iterator as returning a new GroupedIterator. However, it's simpler to write grouped(n) than sliding(n, n), and your code will be cleaner and more obvious in its intended behavior, so I would recommend grouped(n).
As an example for where to use sliding, consider this problem where grouped simply doesn't suffice:
Given a list of numbers, find the sublist of length 4 with the greatest sum.
Now, putting aside the fact that a dynamic programming approach can produce a more efficient result, this can be solved as:
def maxLengthFourSublist(list: List[Int]): List[Int] = {
  list.sliding(4,1).maxBy(_.sum)
}
If you were to use grouped here, you wouldn't get all the sublists, so sliding is more appropriate.
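For reference, the more efficient approach mentioned above can be sketched with a running sum, so each window's total is derived from the previous one instead of being re-added from scratch. This is an illustration rather than part of the original answer: the names Windows and maxWindow are hypothetical, the input is assumed to be a Vector for cheap indexing, and unlike the sliding version it requires at least k elements.

```scala
object Windows {
  // Returns the first length-k window with the greatest sum, updating the sum in O(1) per step.
  def maxWindow(xs: Vector[Int], k: Int): Vector[Int] = {
    require(k > 0 && xs.length >= k, "need at least k elements")
    var sum = xs.take(k).sum
    var best = sum
    var bestStart = 0
    for (i <- k until xs.length) {
      sum += xs(i) - xs(i - k) // add the entering element, drop the leaving one
      if (sum > best) { best = sum; bestStart = i - k + 1 }
    }
    xs.slice(bestStart, bestStart + k)
  }
}
```

For short lists the sliding(4,1).maxBy(_.sum) one-liner is easier to read; the running sum only pays off when the collection or the window is large.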