The question is a mouthful, but the idea is pretty simple.
I have 3 lists and a string.
val a = List("x", "y", "z")
val b = List("a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2")
val c = List("1", "1", "2", "2", "3", "3", "4", "4")
val d = "xc1b1"
I need to check whether d contains any element of a. If it does, I find the positions of all the elements of b that occur in d and return the set of elements of c that correspond to those positions.
The result for the given example is
Set("3", "2")
But when I try
if(a.exists(d.contains)) c(b.indexWhere(d.contains))
I only get
Any = 2
which corresponds to the first matching element of b, i.e. b1.
How would I get the set?
if(a.exists(d.contains)) b.zip(c).collect{
case (x, y) if d.contains(x) => y
}
// res1: Any = List(2, 3)
If you need a Set:
if(a.exists(d.contains)) b.zip(c).collect{
case (x, y) if d.contains(x) => y
}.toSet
// res2: Any = Set(2, 3)
I think I've understood what you need to do here, although the question could do with some clarification.
These are the two ways of getting to your set that I found:
if(a.exists(d.contains)) b.collect {
case x if d.contains(x) => c(b.indexOf(x))
}.toSet
if(a.exists(d.contains)) b.filter(d.contains).map(b.indexOf).map(c).toSet
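Both find the elements of b that are in d, then look up their index in b and take the corresponding elements of c. The first way is more explicit about what it's doing, while the second is more concise. Note that b.indexOf returns the first matching index, so both assume the elements of b are unique.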
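The snippets above evaluate to Any because the if has no else branch. A minimal sketch of a variant that returns a Set[String] directly follows; the function name lookup and the choice to return an empty set when d contains no element of a are assumptions, not something the question specifies.

```scala
// Hypothetical helper: returns the elements of c paired with the elements of b found in d.
def lookup(a: List[String], b: List[String], c: List[String], d: String): Set[String] =
  if (a.exists(x => d.contains(x)))
    b.zip(c).collect { case (x, y) if d.contains(x) => y }.toSet
  else
    Set.empty[String] // assumption: no element of a in d means an empty result

val a = List("x", "y", "z")
val b = List("a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2")
val c = List("1", "1", "2", "2", "3", "3", "4", "4")
val d = "xc1b1"

val result = lookup(a, b, c, d)
```

Giving the if an else branch of the same type is what lets the compiler infer Set[String] instead of Any.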
Two newbie questions.
It seems that a for comprehension knows about Options and can automatically skip None and unwrap Some, e.g.
val x = Map("a" -> List(1,2,3), "b" -> List(4,5,6), "c" -> List(7,8,9))
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
map_value <- x get map_key } yield map_value
outputs:
r: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
Where do the Options go? Can someone please shed some light on how this works? Can we always rely on this behaviour?
The second thing is: why does this not compile?
val x = Map("a" -> List(1,2,3), "b" -> List(4,5,6), "c" -> List(7,8,9))
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
map_value <- x get map_key
list_value <- map_value
} yield list_value
It gives
Error:(57, 26) type mismatch;
found : List[Int]
required: Option[?]
list_value <- map_value
^
Looking at the type of the first example, I am not sure why we need to have an Option here?
For comprehensions are translated into a sequence of map and flatMap calls.
Your second for comprehension is roughly equivalent to
List("WRONG_KEY", "a", "b", "c").flatMap(
map_key => x.get(map_key).flatMap(map_value => map_value)
)
flatMap in Option is defined as
@inline final def flatMap[B](f: A => Option[B]): Option[B]
So you cannot pass it a function that returns a List, which is what the compiler is telling you.
I think the difference is due to the way for comprehensions are expanded into calls to the map and flatMap methods of the collections involved.
For conciseness, let's define a variable:
val keys = List("WRONG_KEY", "a", "b", "c")
Your first case is equivalent to:
val r = keys.flatMap(x.get(_))
whereas your second case is equivalent to:
val r = keys.flatMap(x.get(_).flatMap { case y => y })
I think the issue is that the function passed to Option.flatMap must return an Option. That is fine in the first case, where only List.flatMap is involved, but in the second case the function passed to x.get(_).flatMap returns a List[Int].
These for-comprehension translation rules are explained in further detail in chapter 7 of "Programming Scala" by Wampler & Payne.
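As a rough illustration of the first case, the sketch below compares the for comprehension with a hand-written expansion. The names viaFor and viaFlatMap are made up for this example and the map is shortened. List.flatMap accepts a function returning an Option because an Option can be treated as a collection of zero or one elements, so a None simply contributes nothing to the result.

```scala
val x = Map("a" -> List(1, 2, 3), "b" -> List(4, 5, 6))
val keys = List("WRONG_KEY", "a", "b")

// The for comprehension from the question, with two generators.
val viaFor = for { k <- keys; v <- x.get(k) } yield v

// Hand-written expansion: the outer generator becomes flatMap, the last one becomes map.
val viaFlatMap = keys.flatMap(k => x.get(k).map(v => v))
```

Here the inner call is Option.map, not Option.flatMap, so nothing forces the values to be Options themselves.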
Maybe this small difference, setting parenthesis and calling flatten, makes it clear:
val r = for {map_key <- List("WRONG_KEY", "a", "b", "c")
| } yield x get map_key
r: List[Option[List[Int]]] = List(None, Some(List(1, 2, 3)), Some(List(4, 5, 6)), Some(List(7, 8, 9)))
val r = (for {map_key <- List("WRONG_KEY", "a", "b", "c")
| } yield x get map_key).flatten
r: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
That's equivalent to:
scala> List("WRONG_KEY", "a", "b", "c").map (x get _)
res81: List[Option[List[Int]]] = List(None, Some(List(1, 2, 3)), Some(List(4, 5, 6)), Some(List(7, 8, 9)))
scala> List("WRONG_KEY", "a", "b", "c").map (x get _).flatten
res82: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
The intermediate value (map_key) vanished as _ in the second block.
You are mixing up two different monads (List and Option) inside the for statement. This sometimes works as expected, but not always. In any case, you can transform options into lists yourself:
for {
map_key <- List("WRONG_KEY", "a", "b", "c")
list_value <- x get map_key getOrElse Nil
} yield list_value
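A similar sketch, assuming you would rather keep three generators: converting the Option to a List with toList makes every generator a List, so the comprehension type-checks. The names viaFor and viaFlatMap are illustrative only.

```scala
val x = Map("a" -> List(1, 2, 3), "b" -> List(4, 5, 6), "c" -> List(7, 8, 9))
val keys = List("WRONG_KEY", "a", "b", "c")

// Every generator is now a List: a missing key yields an empty List and is skipped.
val viaFor: List[Int] = for {
  key   <- keys
  value <- x.get(key).toList
  item  <- value
} yield item

// The same computation written as explicit flatMap/map calls.
val viaFlatMap: List[Int] =
  keys.flatMap(key => x.get(key).toList.flatMap(value => value.map(item => item)))
```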
If I have a list of strings and I know the numeric value of each string in the list, how do I get the sum of the list?
Example:
I know:
a = 1
b = 2
c = 3
d = 4
e = 5
I am given the following list:
List("a","b","d")
What is the best way of calculating the sum, 7?
Thanks
val a = Map("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)
val b = List("a", "b", "d")
b.map(a.getOrElse(_, 0)).sum
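The getOrElse(_, 0) call silently treats unknown strings as 0. If that is not what you want, here is a minimal sketch of two alternatives; the names values, sumKnown and sumStrict are hypothetical, and which behaviour is right depends on your data.

```scala
val values = Map("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)

// Skips strings that have no known value.
def sumKnown(keys: List[String]): Int =
  keys.flatMap(k => values.get(k)).sum

// Returns None as soon as any string has no known value.
def sumStrict(keys: List[String]): Option[Int] =
  keys.foldLeft(Option(0)) { (acc, k) =>
    for { total <- acc; v <- values.get(k) } yield total + v
  }
```

For List("a", "b", "d") both give 7 (the second wrapped in Some); they only differ when a key is missing.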
If you know that the values are the same as the element position, then you can avoid a map:
object test {
val list = List("a", "b", "c", "d", "e")
def sumThem = (for((letter, value) <- list.zipWithIndex) yield(value + 1)).sum
}
scala> test.sumThem
res2: Int = 15
If you're 100% sure it's only single lowercase letters:
List("a","b","d").foldLeft(0)(_ + _.hashCode - 96)
If not, you can build a map first:
val letters = (1 to 26).map(x => Character.toString((x+96).toChar) -> x).toMap
and use @sheunis's answer:
val input = List("a","b","d")
input.map(letters.getOrElse(_, 0)).sum
I have a Scala List that contains some repeated numbers. I want to count the number of times a specific number will repeat itself. For example:
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val repeats = list.takeWhile(_ == List(3,3)).size
And the val repeats would equal 2.
Obviously the above is pseudo-code, and takeWhile will not find two repeated 3s since _ represents an integer. I tried mixing both takeWhile and take(2) but with little success. I also looked at the code from "How to find count of repeatable elements in scala list", but it appears the author is looking to achieve something different.
Thanks for your help.
This will work in this case:
val repeats = list.sliding(2).count(_.forall(_ == 3))
The sliding(2) method gives you an iterator over lists holding each element and its successor, and then we just count the windows where both are equal to 3.
The question is whether this gives the result you want for List(3, 3, 3): the code above counts 2 there. Do you want that to be 2 repeats or just 1?
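To make the overlapping-window behaviour concrete, here is a small sketch. The helper name windowCount is an assumption, and the size check is an addition: without it, sliding(2) on a one-element list yields a single short window, which would be counted by mistake.

```scala
// Counts the windows of length n in which every element equals target.
def windowCount(xs: List[Int], n: Int, target: Int): Int =
  xs.sliding(n).count(w => w.size == n && w.forall(_ == target))

val list = List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5)
val inQuestionList = windowCount(list, 2, 3)          // two separate pairs of 3s
val inTripleList   = windowCount(List(3, 3, 3), 2, 3) // two overlapping windows
val inSingleton    = windowCount(List(3), 2, 3)       // no full window exists
```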
val repeats = list.sliding(2).toList.count(_==List(3,3))
More generally, the following code returns tuples of (element, repeat count) for all elements:
scala> list.distinct.map(x=>(x,list.sliding(2).toList.count(_.forall(_==x))))
res27: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
This means that the element 3 occurs twice in a row at 2 places, and every other element at 0 places.
If we want elements that occur 3 times in a row, we just need to modify the code as follows:
list.distinct.map(x=>(x,list.sliding(3).toList.count(_.forall(_==x))))
In the Scala REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 5)
scala> list.distinct.map(x=>(x,list.sliding(3).toList.count(_==List(x,x,x))))
res29: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
The window size can also be made a parameter by defining a function:
def repeatsByTimes(list:List[Int],n:Int) =
list.distinct.map(x=>(x,list.sliding(n).toList.count(_.forall(_==x))))
Now in REPL:
scala> val list = List(1,2,3,3,4,2,8,4,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5)
scala> repeatsByTimes(list,2)
res33: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeatsByTimes(list,3)
res34: List[(Int, Int)] = List((1,0), (2,0), (3,3), (4,0), (8,0), (5,0))
We can go further still. Given a list of integers and a maximum number of consecutive repetitions, we may want a list of 3-tuples of (element, run length, number of places where a run of that length occurs). This is more exhaustive information than the above, and it can be obtained with a function like this:
def repeats(list:List[Int],maxRep:Int) =
{ var v:List[(Int,Int,Int)] = List();
for(i<- 1 to maxRep)
v = v ++ list.distinct.map(x=>
(x,i,list.sliding(i).toList.count(_.forall(_==x))))
v.sortBy(_._1) }
In the Scala REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeats(list,3)
res38: List[(Int, Int, Int)] = List((1,1,1), (1,2,0), (1,3,0), (2,1,3),
(2,2,0), (2,3,0), (3,1,9), (3,2,6), (3,3,3), (4,1,3), (4,2,0), (4,3,0),
(5,1,1), (5,2,0), (5,3,0), (8,1,1), (8,2,0), (8,3,0))
These results can be read as follows:
(1,1,1): a run of length 1 of the element 1 occurs at 1 place.
(1,2,0): a run of length 2 of the element 1 occurs at 0 places.
...
(3,2,6): a run of length 2 of the element 3 occurs at 6 places.
...
(3,3,3): a run of length 3 of the element 3 occurs at 3 places.
And so on.
Thanks to Luigi Plinge I was able to use methods in run-length encoding to group together items in a list that repeat. I used some snippets from this page here: http://aperiodic.net/phil/scala/s-99/
var n = 0
runLengthEncode(totalFrequencies).foreach{ o =>
if(o._1 > 1 && o._2==subjectNumber) n+=1
}
n
The method runLengthEncode is as follows:
private def pack[A](ls: List[A]): List[List[A]] = {
if (ls.isEmpty) List(List())
else {
val (packed, next) = ls span { _ == ls.head }
if (next == Nil) List(packed)
else packed :: pack(next)
}
}
private def runLengthEncode[A](ls: List[A]): List[(Int, A)] =
pack(ls) map { e => (e.length, e.head) }
I'm not entirely satisfied that I needed to use the mutable var n to count the number of occurrences but it did the trick. This will count the number of times a number repeats itself no matter how many times it is repeated.
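For what it's worth, the mutable counter can be avoided with count. The following is a minimal sketch, not the code from the page linked above: runLengthEncode is re-implemented here with foldRight so the example is self-contained, and countRepeats is a hypothetical name.

```scala
// Run-length encode: List(1, 3, 3, 2) becomes List((1, 1), (2, 3), (1, 2)).
def runLengthEncode[A](ls: List[A]): List[(Int, A)] =
  ls.foldRight(List.empty[(Int, A)]) {
    case (x, (n, y) :: rest) if x == y => (n + 1, y) :: rest // extend the current run
    case (x, acc)                      => (1, x) :: acc      // start a new run
  }

// Counts the runs of subject that are longer than one element, without a var.
def countRepeats[A](ls: List[A], subject: A): Int =
  runLengthEncode(ls).count { case (len, value) => len > 1 && value == subject }
```

With the list from the question, countRepeats(list, 3) gives 2, and a run such as List(3, 3, 3) counts once regardless of its length.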
If you knew your list was not very long you could do it with Strings.
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val matchList = List(3,3)
(matchList.mkString(",")).r.findAllMatchIn(list.mkString(",")).length
From your pseudocode I got this working:
val pairs = list.sliding(2).toList //create pairs of consecutive elements
val result = pairs.groupBy(x => x).map { case (x, y) => (x, y.size) } // group equal pairs and keep the size, which is the number of occurrences
result will be a Map[List[Int], Int], so you can get the count like this:
result(List(3,3)) // will return 2
I couldn't tell whether you also want to check runs of other sizes; if so, you would need to change the parameter of sliding to the desired size.
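A sketch of that generalisation, assuming the window size is passed in as a parameter; the name windowCounts is made up for this example. Windows shorter than n (which sliding produces when the list itself is shorter than n) are filtered out.

```scala
// Maps every window of length n to the number of times it occurs in xs.
def windowCounts[A](xs: List[A], n: Int): Map[List[A], Int] =
  xs.sliding(n)
    .filter(_.size == n)
    .toList
    .groupBy(identity)
    .map { case (window, occurrences) => window -> occurrences.size }

val list = List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5)
val pairCounts = windowCounts(list, 2)
```

Windows that never occur are simply absent from the map, so pairCounts.getOrElse(List(9, 9), 0) is a safer lookup than pairCounts(List(9, 9)).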
def pack[A](ls: List[A]): List[List[A]] = {
if (ls.isEmpty) List(List())
else {
val (packed, next) = ls span { _ == ls.head }
if (next == Nil) List(packed)
else packed :: pack(next)
}
}
def encode[A](ls: List[A]): List[(Int, A)] = pack(ls) map { e => (e.length, e.head) }
val numberOfNs = list.distinct.map{ n =>
(n -> list.count(_ == n))
}.toMap
val runLengthPerN = encode(list).groupBy(_._2).map { case (n, runs) => n -> runs.map(_._1).max }
val nRepeatedMostInSuccession = runLengthPerN.maxBy(_._2)._1
Here pack and encode are defined as above, from problem 9 and problem 10 of Scala's 99 problems.
Since numberOfNs and runLengthPerN are Maps, you can get the population count of any number in the list with numberOfNs(number) and the length of its longest run with runLengthPerN(number). To get every individual run, use encode(list).map{ t => t._2 -> t._1 } without converting it to a Map.
I'm trying to learn Scala and tried to write a sequence comprehension that extracts unigrams, bigrams and trigrams from a sequence. E.g., [1,2,3,4] should be transformed to (not Scala syntax)
[1; _,1; _,_,1; 2; 1,2; _,1,2; 3; 2,3; 1,2,3; 4; 3,4; 2,3,4]
In Scala 2.8, I tried the following:
def trigrams(tokens : Seq[T]) = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens) {
yield t3
yield (t2,t3)
yield (t1,t2,Some(t3))
t1 = t2
t2 = t3
}
}
But this doesn't compile as, apparently, only one yield is allowed in a for-comprehension (no block statements either). Is there any other elegant way to get the same behavior, with only one pass over the data?
You can't have multiple yields in a for loop because for loops are syntactic sugar for the map (or flatMap) operations:
for (i <- collection) yield( func(i) )
translates into
collection map {i => func(i)}
Without a yield at all
for (i <- collection) func(i)
translates into
collection foreach {i => func(i)}
So the entire body of the for loop is turned into a single closure, and the presence of the yield keyword determines whether the function called on the collection is map or foreach (or flatMap). Because of this translation, the following are forbidden:
Using imperative statements next to a yield to determine what will be yielded.
Using multiple yields
(Not to mention that your proposed version will return a List[Any] because the tuples and the 1-gram are all of different types. You probably want to get a List[List[Int]] instead.)
Try the following instead (which puts the n-grams in the order they appear):
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
for {onegram <- basis
ngram <- slidingIterators if ngram.hasNext}
yield (ngram.next)
or
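import scala.collection.mutable.ListBuffer
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
val first = slidingIterators.head
val buf = new ListBuffer[List[Int]]
// each pass takes the next window from every iterator that still has one
while (first.hasNext)
  for (i <- slidingIterators)
    if (i.hasNext)
      buf += i.next()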
If you prefer the n-grams to be in length order, try:
val basis = List(1,2,3,4)
(1 to 4) flatMap { n => basis.sliding(n).toList }
scala> val basis = List(1, 2, 3, 4)
basis: List[Int] = List(1, 2, 3, 4)
scala> val nGrams = (basis sliding 1).toList ::: (basis sliding 2).toList ::: (basis sliding 3).toList
nGrams: List[List[Int]] = ...
scala> nGrams foreach (println _)
List(1)
List(2)
List(3)
List(4)
List(1, 2)
List(2, 3)
List(3, 4)
List(1, 2, 3)
List(2, 3, 4)
I guess I should have given this more thought.
def trigrams[T](tokens : Seq[T]) : Seq[(Option[T],Option[T],T)] = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens)
yield {
val tri = (t1,t2,t3)
t1 = t2
t2 = Some(t3)
tri
}
}
Then extract the unigrams and bigrams from the trigrams. But can anyone explain to me why 'multi-yields' are not permitted, and if there's any other way to achieve their effect?
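One way to get the effect of several yields per element is to yield a small collection for each element and let flatMap concatenate them. Here is a minimal sketch under a few assumptions: missing predecessors are padded with None as in the question, every n-gram is represented as a List[Option[T]] so the result has a single element type, and the name ngrams is hypothetical.

```scala
// For every token, emits its unigram, bigram and trigram, in that order.
def ngrams[T](tokens: Seq[T]): Seq[List[Option[T]]] = {
  // Two leading Nones stand in for the missing predecessors of the first tokens.
  val padded: Seq[Option[T]] =
    Seq[Option[T]](None, None) ++ tokens.map(t => Some(t): Option[T])
  padded.sliding(3).toSeq.flatMap {
    case Seq(t1, t2, t3) => Seq(List(t3), List(t2, t3), List(t1, t2, t3))
    case _               => Seq.empty[List[Option[T]]] // only hit for empty input
  }
}
```

For List(1, 2, 3, 4) this produces the n-grams in the order listed in the question, starting with List(Some(1)), List(None, Some(1)), List(None, None, Some(1)).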
val basis = List(1, 2, 3, 4)
val nGrams = basis.map(x => (x)) ::: (for (a <- basis; b <- basis) yield (a, b)) ::: (for (a <- basis; b <- basis; c <- basis) yield (a, b, c))
nGrams: List[Any] = ...
nGrams foreach (println(_))
1
2
3
4
(1,1)
(1,2)
(1,3)
(1,4)
(2,1)
(2,2)
(2,3)
(2,4)
(3,1)
(3,2)
(3,3)
(3,4)
(4,1)
(4,2)
(4,3)
(4,4)
(1,1,1)
(1,1,2)
(1,1,3)
(1,1,4)
(1,2,1)
(1,2,2)
(1,2,3)
(1,2,4)
(1,3,1)
(1,3,2)
(1,3,3)
(1,3,4)
(1,4,1)
(1,4,2)
(1,4,3)
(1,4,4)
(2,1,1)
(2,1,2)
(2,1,3)
(2,1,4)
(2,2,1)
(2,2,2)
(2,2,3)
(2,2,4)
(2,3,1)
(2,3,2)
(2,3,3)
(2,3,4)
(2,4,1)
(2,4,2)
(2,4,3)
(2,4,4)
(3,1,1)
(3,1,2)
(3,1,3)
(3,1,4)
(3,2,1)
(3,2,2)
(3,2,3)
(3,2,4)
(3,3,1)
(3,3,2)
(3,3,3)
(3,3,4)
(3,4,1)
(3,4,2)
(3,4,3)
(3,4,4)
(4,1,1)
(4,1,2)
(4,1,3)
(4,1,4)
(4,2,1)
(4,2,2)
(4,2,3)
(4,2,4)
(4,3,1)
(4,3,2)
(4,3,3)
(4,3,4)
(4,4,1)
(4,4,2)
(4,4,3)
(4,4,4)
You could try a functional version without assignments:
def trigrams[T](tokens : Seq[T]) = {
val s1 = tokens.map { Some(_) }
val s2 = None +: s1
val s3 = None +: s2
s1 zip s2 zip s3 map {
case ((t1, t2), t3) => (List(t1), List(t2, t1), List(t3, t2, t1)) // t2 and t3 are the predecessors, so they come first
}
}