How to avoid duplicates in List.newBuilder in Scala?

How do I avoid duplicates for this code:
val lastUpdatesBuilder = List.newBuilder[(String, Int)]
val somelist = List("a","a")
for (v <- somelist) {
lastUpdatesBuilder += v -> 1
}
println(lastUpdatesBuilder.result())
Result is List((a,1), (a,1)) and I want it to be List((a,1)) only.

Here you go:
object Demo extends App {
val lastUpdatesBuilder = Set.newBuilder[(String, Int)]
val somelist = List("a","a")
for (v <- somelist) {
lastUpdatesBuilder += v -> 1
}
println(lastUpdatesBuilder.result())
}
Though I would suggest not using a mutable builder; you can do something like this instead:
val ans = somelist.map{ key =>
key -> 1
}.toMap
println(ans)
Or you can first remove the duplicates using distinct and then create a map from the result:
val somelist = List("a","a").distinct
val ans = somelist.map{ key =>
key -> 1
}.toMap

This is what the distinct method does.
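As a minimal sketch of the distinct approach that keeps the result a List, as the question asks (the names DistinctDemo, pairs and deduped are illustrative and not from the original post):

```scala
object DistinctDemo {
  val somelist = List("a", "a")

  // Build the pairs first, then drop repeated pairs while keeping first-seen order.
  val pairs: List[(String, Int)] = somelist.map(v => v -> 1)
  val deduped: List[(String, Int)] = pairs.distinct

  def main(args: Array[String]): Unit =
    println(deduped) // List((a,1))
}
```

Unlike the Set-based version, distinct preserves the order in which elements first appear.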

Related

Reading CSV into Map[String, Array[String]] in Scala

Given a CSV in the format below, what is the best way to load it into Scala as a Map[String, Array[String]], with the keys being the unique values of Col2 and each value an Array[String] of all co-occurring values of Col1?
a,1,
b,2,m
c,2,
d,1,
e,3,m
f,4,
g,2,
h,3,
I,1,
j,2,n
k,2,n
l,1,
m,5,
n,2,
I have tried to use the function below, but I am getting errors when trying to add to the Option type:
+= is not a member of Option[Array[String]]
In addition, I get "overloaded method value ++ with alternatives:"
for the line case None => mapping ++ (linesplit(2) -> Array(linesplit(1)))
def parseCSV() : Map[String, Array[String]] = {
var mapping = Map[String, Array[String]]()
val lines = Source.fromFile("test.csv")
for (line <- lines.getLines) {
val linesplit = line.split(",")
mapping.get(linesplit(2)) match {
case Some(_) => mapping.get(linesplit(2)) += linesplit(1)
case None => mapping ++ (linesplit(2) -> Array(linesplit(1)))
}
}
mapping
}
I am hoping for a Map[String, Array[String]] like the following:
(2 -> Array["b","c","g","j", "k", "n"])
(3 -> Array["e","h"])
(4 -> Array["f"])
(5 -> Array["m"])
You can do the following:
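First, read the file into a List[List[String]] (using is a loan-pattern helper that is not defined here; scala.util.Using.resource in Scala 2.13+ plays the same role):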
val rows: List[List[String]] = using(io.Source.fromFile("test.csv")) { source =>
source.getLines.toList map { line =>
line.split(",").map(_.trim).toList
}
}
Then, because the input has only two values per row, filter the rows to ignore those with only one value:
val filteredRows = rows.filter(row => row.size > 1)
And the last step is to groupBy the first value (which is the second column - the index column is not returned from Source.fromFile):
filteredRows.groupBy(row => row.head).mapValues(_.map(_.last))
This isn't complete, but it should give you an outline of how it might be done.
io.Source
.fromFile("so.txt") //open file
.getLines() //line by line
.map(_.split(",")) //split on commas
.toArray //load into memory
.groupMap(_(1))(_(0)) //Scala 2.13
//res0: Map[String,Array[String]] = Map(4 -> Array(f), 5 -> Array(m), 1 -> Array(a, d, I, l), 2 -> Array(b, c, g, j, k, n), 3 -> Array(e, h))
You'll notice that the file resource isn't closed, and it doesn't handle malformed input. I leave that for the diligent reader.
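One possible way to fill in those two gaps, as a sketch only and assuming Scala 2.13+ (for scala.util.Using and groupMap). The names CsvGrouping, groupLines and parseCsv are hypothetical, and "malformed" is taken here to mean a row with fewer than two fields or an empty Col1/Col2:

```scala
import scala.io.Source
import scala.util.Using

object CsvGrouping {
  // Groups Col1 values by Col2; rows with fewer than two fields
  // or with an empty Col1/Col2 are skipped.
  def groupLines(lines: Iterator[String]): Map[String, Array[String]] =
    lines
      .map(_.split(",").map(_.trim))
      .filter(cols => cols.length >= 2 && cols(0).nonEmpty && cols(1).nonEmpty)
      .toArray
      .groupMap(_(1))(_(0))

  // Using.resource closes the Source even if parsing throws.
  def parseCsv(path: String): Map[String, Array[String]] =
    Using.resource(Source.fromFile(path))(source => groupLines(source.getLines()))
}
```

Keeping the grouping logic separate from the file handling means it can be exercised with in-memory lines, for example CsvGrouping.groupLines(Iterator("a,1,", "b,2,m")).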
For the above code, a mutable Map and ArrayBuffer should be used, since they are updated as the lines are read.
import scala.collection.mutable
import scala.collection.mutable.ArrayBuffer
import scala.io.Source

def parseCSV(): Map[String, Array[String]] = {
  val mapping = mutable.Map[String, ArrayBuffer[String]]()
  val source = Source.fromFile("test.csv")
  try {
    for (line <- source.getLines()) {
      val linesplit = line.split(",")
      // Col2 (index 1) is the key; Col1 (index 0) is the value to collect.
      // getOrElseUpdate creates the buffer the first time a key is seen.
      mapping.getOrElseUpdate(linesplit(1), ArrayBuffer[String]()) += linesplit(0)
    }
  } finally source.close() // always release the file handle
  // Convert to the immutable result type the question asks for.
  mapping.map { case (k, v) => k -> v.toArray }.toMap
}

How to convert a Seq of tuples into sets of individual elements in Scala

We have a sequence of tuples, depTitleSeq, of the form Seq((department, title)), and we would like to extract a Set of departments and a Set of titles. The best we have come up with so far is:
val depTitleSeq = getDepTitleTupleSeq()
var departmentSeq = ArrayBuffer[String]()
var titleSeq = ArrayBuffer[String]()
for (depTitle <- depTitleSeq) yield {
departmentSeq += depTitle._1
titleSeq += depTitle._2
}
val depSet = departmentSeq.toSet
val titleSet = titleSeq.toSet
I'm fairly new to Scala, and I'm sure there are better and more efficient ways to achieve this. If you could point us in the right direction, it would be of great help.
If you have two Seqs of data that you want combined into a Seq of tuples, you can zip them together.
If you have a Seq of tuples and you want the elements separated, then you can unzip them.
val (departmentSeq, titleSeq) = getDepTitleTupleSeq().unzip
val depSet :Set[String] = departmentSeq.toSet
val titleSet :Set[String] = titleSeq.toSet
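A self-contained sketch of the unzip approach. The body of getDepTitleTupleSeq and its sample data are made up for illustration, since the original function is not shown:

```scala
object UnzipDemo {
  // Hypothetical stand-in for getDepTitleTupleSeq(); the data is invented.
  def getDepTitleTupleSeq(): Seq[(String, String)] =
    Seq(("Sales", "Manager"), ("Sales", "Analyst"), ("IT", "Manager"))

  // unzip splits a Seq of pairs into a pair of Seqs in a single pass.
  val (departmentSeq, titleSeq) = getDepTitleTupleSeq().unzip

  // toSet removes the duplicates from each side.
  val depSet: Set[String] = departmentSeq.toSet
  val titleSet: Set[String] = titleSeq.toSet
}
```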
val depTitleSeq = Seq(("x","a"),("y","b"))
val depSet = depTitleSeq.map(_._1).toSet
val titleSet = depTitleSeq.map(_._2).toSet
In Scala REPL:
scala> val depTitleSeq = Seq(("x","a"),("y","b"))
depTitleSeq: Seq[(String, String)] = List((x,a), (y,b))
scala> val depSet = depTitleSeq.map(_._1).toSet
depSet: scala.collection.immutable.Set[String] = Set(x, y)
scala> val titleSet = depTitleSeq.map(_._2).toSet
titleSet: scala.collection.immutable.Set[String] = Set(a, b)
val result:(Set[String], Set[String]) = depTitleSeq.foldLeft((Set[String](), Set[String]())){(a, b) => (a._1 + b._1, a._2 + b._2) }
You can use foldLeft to achieve this.

Scala - Transform an Iterator into a Map

How to get from an Iterator like this
val it = Iterator("one","two","three","four","five")
to a map like
Map(four -> 4, three -> 5, two -> 3, five -> 4, one -> 3)
var m = Map[String, Int]()
while (it.hasNext) {
val cell = it.next()
m += (cell -> cell.length())
}
This is a solution using var, but I'd like to use only immutable values and val.
If I use a for/yield expression, the result is an Iterator[Map], which I do not want:
val m = for(i<- it if it.hasNext) yield Map(i->i.length())
You can just use map:
val m = it.map(c => c -> c.length).toMap
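A small runnable sketch of this one-liner (the object name IteratorToMapDemo is illustrative). It also shows that toMap drains the iterator, so it cannot be traversed again afterwards:

```scala
object IteratorToMapDemo {
  val it = Iterator("one", "two", "three", "four", "five")

  // map is lazy on an Iterator; toMap consumes it and builds an immutable Map.
  val m: Map[String, Int] = it.map(c => c -> c.length).toMap

  def main(args: Array[String]): Unit = {
    println(m("three")) // 5
    println(it.hasNext) // false: the iterator has been consumed
  }
}
```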

Map one value to all values with a common relation in Scala

Having a set of data:
{sentenceA1}{\t}{sentenceB1}
{sentenceA1}{\t}{sentenceB2}
{sentenceA2}{\t}{sentenceB1}
{sentenceA3}{\t}{sentenceB1}
{sentenceA4}{\t}{sentenceB2}
I want to map a sentenceA to all the sentences that have a common sentenceB in Scala so the result will be something like this:
{sentenceA1}->{sentenceA2,sentenceA3,sentenceA4} or
{sentenceA2}->{sentenceA1, sentenceA3}
val lines = List(
"sentenceA1\tsentenceB1",
"sentenceA1\tsentenceB2",
"sentenceA2\tsentenceB1",
"sentenceA3\tsentenceB1",
"sentenceA4\tsentenceB2"
)
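// Split each line into Array(sentenceA, sentenceB).
val afterSplit = lines.map(_.split("\t"))
// sentenceB -> all sentenceAs that appear with it
val ba = afterSplit
.groupBy(_(1))
.mapValues(_.map(_(0)))
// sentenceA -> all sentenceBs that appear with it
val ab = afterSplit
.groupBy(_(0))
.mapValues(_.map(_(1)))
// For each sentenceA, union the sentenceAs of all its sentenceBs, then drop itself.
val result = ab.map { case (a, b) =>
a -> b.foldLeft(Set[String]())(_ ++ ba(_)).diff(Set(a))
}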
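On Scala 2.13, mapValues on a Map is deprecated and returns a lazy view, so the same idea can be sketched with groupMap instead. This is an alternative formulation rather than the answer's own code, and the names RelatedSentences and related are hypothetical:

```scala
object RelatedSentences {
  def related(lines: List[String]): Map[String, Set[String]] = {
    // Each line is "sentenceA<TAB>sentenceB"; lines without exactly two fields are ignored.
    val pairs = lines.map(_.split("\t")).collect { case Array(a, b) => (a, b) }
    val aToBs = pairs.groupMap(_._1)(_._2) // sentenceA -> its sentenceBs
    val bToAs = pairs.groupMap(_._2)(_._1) // sentenceB -> its sentenceAs
    aToBs.map { case (a, bs) =>
      // Every sentenceA sharing any of these sentenceBs, excluding a itself.
      a -> (bs.flatMap(b => bToAs(b)).toSet - a)
    }
  }
}
```

Calling RelatedSentences.related(lines) with the sample lines above maps sentenceA1 to the set of sentenceA2, sentenceA3 and sentenceA4.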

Combining two lists index-wise

First list:
remoteDeviceAndPort===>List(
(1,891w.yourdomain.com,wlan-ap0),
(13,ap,GigabitEthernet0),
(11,Router-3900,GigabitEthernet0/0)
)
Second list:
interfacesList===>List(
(1,UP,,0,0,0,0,UP,4294,other,VoIP-Null0,0,0),
(13,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet6,0,0),
(11,UP,,0,0,0,0,UP,100,vlan,Vlan11,4558687845,1249542878),
(2,UP,,0,0,972,1327,UP,0,Tunnel,Virtual-Access1,0,0),
(4,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet2,0,0),
(6,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet2,0,0)
)
The above are my two lists. Now I have to combine them as shown below.
Expected output:
combineList = List(
(1,UP,,0,0,0,0,UP,4294,other,VoIP-Null0,0,0,891w.yourdomain.com,wlan-ap0),
(13,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet6,0,0,ap,GigabitEthernet0),
(11,UP,,0,0,0,0,UP,100,vlan,Vlan11,4558687845,1249542878,Router-3900,GigabitEthernet0/0),
(2,UP,,0,0,972,1327,UP,0,Tunnel,Virtual-Access1,0,0,empty,empty),
(4,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet2,0,0,empty,empty),
(6,DOWN,,0,0,0,0,UP,100,Ethernet,FastEthernet2,0,0,empty,empty)
)
The similar question here
case class NetworkDeviceInterfaces(index: Int, params: String*)
val remoteDeviceAndPort = List(
(1,"891w.yourdomain.com","wlan-ap0"),
(13,"ap","GigabitEthernet0"),
(11,"Router-3900","GigabitEthernet0/0")
)
val rdapMap = remoteDeviceAndPort.map { case (k, v1, v2) => k -> (v1, v2) }.toMap
val interfacesList = List(NetworkDeviceInterfaces(1,"UP","","0","0","0","0","UP","4294","other","VoIP-Null0","0","0"))
val result = interfacesList map {
interface => {
val (first, second) = rdapMap.getOrElse(interface.index, ("empty", "empty"))
NetworkDeviceInterfaces(interface.index, (interface.params ++ Seq(first, second)):_*)
}
}
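The same lookup-with-default idea as a compact, runnable sketch. To keep it short, the interface rows are reduced to three fields (index, status, name) instead of the thirteen in the question, and the names CombineDemo, interfaces, remotes and remoteByIndex are illustrative:

```scala
object CombineDemo {
  // Simplified rows: (index, status, name) stands in for the 13-field interface tuples.
  val interfaces = List(
    (1, "UP", "VoIP-Null0"),
    (13, "DOWN", "FastEthernet6"),
    (2, "UP", "Virtual-Access1")
  )
  val remotes = List(
    (1, "891w.yourdomain.com", "wlan-ap0"),
    (13, "ap", "GigabitEthernet0")
  )

  // Index the remote list by its key so each lookup avoids rescanning the list.
  val remoteByIndex: Map[Int, (String, String)] =
    remotes.map { case (idx, device, port) => (idx, (device, port)) }.toMap

  // Keep every interface row; fall back to "empty" when no remote entry matches.
  val combined: List[(Int, String, String, String, String)] =
    interfaces.map { case (idx, status, name) =>
      val (device, port) = remoteByIndex.getOrElse(idx, ("empty", "empty"))
      (idx, status, name, device, port)
    }
}
```

This behaves like a left join on the index: the order of the interface list is preserved, and unmatched rows are padded rather than dropped.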