How to delete the elements that share the same prefix String (2 to 5 chars) in a List? - scala

There is a List as below:
val a: List[String] = List("aaaaa1", "aaaaa2", "bb", "cc", "dd1", "ee", "dd2", "dd3", "ff", "ggg1", "ggg2", "aaaaa3")
How can I delete the elements that share the same prefix String (2 to 5 chars) with another element of the List?
For example:
"aaaaa1", "aaaaa2", "aaaaa3" have the same prefix String "aaaaa" (5 chars), so delete them.
"dd1", "dd2", "dd3" have the same prefix String "dd" (2 chars), so delete them.
"ggg1", "ggg2" have the same prefix String "ggg" (3 chars), so delete them.
Expected:
val b: List[String] = List("bb", "cc", "ee", "ff")
==========
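Thanks for your ideas. I fixed it like this:
a.foldLeft(scala.collection.mutable.LinkedHashMap[String,Int]().withDefaultValue(0)){
  case (m, e) =>
    val k = e.take(2) // 2-char prefix of the element
    m(k) += 1         // count elements per prefix, keeping insertion order
    m
}.filter(_._2 == 1).keys.toList // prefixes seen exactly once; these equal the elements only when they are 2 chars long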

Try this.
a.groupBy(_.take(2)).values.collect{case x if x.length == 1 => x.head}
// res0: Iterable[String] = List(cc, bb, ee, ff)
The original order is not retained because the collection passes through a Map() phase which, by definition, has no intrinsic order.
update
The original order can be preserved but it requires a two-step procedure.
val uniqPrefix = a.groupBy(_.take(2)).mapValues(_.length == 1)
a.filter(x => uniqPrefix(x.take(2)))
// res0: List[String] = List(bb, cc, ee, ff)

I think this code can be used in this case:
val list = List(
"aaaaa1", "aaaaa2", "bb", "cc", "dd1", "ee", "dd2", "dd3", "ff", "ggg1", "ggg2", "aaaaa3")
val prefix2items = list.groupBy(_.take(2))
list.filter(item => prefix2items(item.take(2)).length == 1)
//res0: List[String] = List(bb, cc, ee, ff)
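The answers above group on the first two characters only. That is enough for the "2 to 5 chars" requirement, because two elements that share a prefix of 3, 4 or 5 characters necessarily share their first 2 characters as well. A minimal sketch of the order-preserving approach as a reusable function follows; the name dropSharedPrefix and the parameter n are made up for illustration and are not from the original posts.

```scala
// Hypothetical helper: keep only the elements whose n-char prefix
// is not shared with any other element. Original order is preserved.
def dropSharedPrefix(xs: List[String], n: Int = 2): List[String] = {
  // Count how many elements start with each n-char prefix.
  val counts: Map[String, Int] =
    xs.groupBy(_.take(n)).map { case (prefix, items) => prefix -> items.size }
  // Filtering the original list (not the Map) keeps the input order.
  xs.filter(x => counts(x.take(n)) == 1)
}

val a = List("aaaaa1", "aaaaa2", "bb", "cc", "dd1", "ee", "dd2", "dd3", "ff", "ggg1", "ggg2", "aaaaa3")
val b = dropSharedPrefix(a)
// b == List("bb", "cc", "ee", "ff")
```

Raising n (for example dropSharedPrefix(a, 3)) makes the match stricter: only elements that agree on their first 3 characters are then treated as duplicates.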

This works even when the prefix has more than two characters.
The prefix is everything up to the first digit; characters after that digit are not considered part of the prefix.
val a = List("abb1","abbbbbb1","aaaaa2","aaaaa3","aaaaaa44")
"abb1" will be removed here
a.filter(_.takeWhile(data => !data.isDigit).groupBy(_.toChar).values.exists{case x if (2 to 5).contains(x.length) => true case _ => false})

Related

How to xor char within string and add to List?

In this code I'm attempting to XOR the corresponding characters of two strings:
val s1 = "1c0111001f010100061a024b53535009181c";
val s2 = "686974207468652062756c6c277320657965";
val base64p1 = Base64.getEncoder().encodeToString(new BigInteger(s1, 16).toByteArray())
val base64p2 = Base64.getEncoder().encodeToString(new BigInteger(s2, 16).toByteArray())
val zs : IndexedSeq[(Char, Char)] = base64p1.zip(base64p2);
val xor = zs.foldLeft(List[Char]())((a: List[Char] , b: (Char, Char)) => ((Char)((b._1 ^ b._2))) :: a)
produces error :
Char.type does not take parameters
[error] val xor = zs.foldLeft(List[Char]())((a: List[Char] , b: (Char, Char)) => ((Char)((b._1 ^ b._2))) :: a)
How to xor the corresponding string char values and add them to List ?
What you're doing can be simplified.
val xor = base64p1.zip(base64p2).map{case (a,b) => (a^b).toChar}.reverse
The result of the XOR op (^) is an Int. Just add .toChar to change it to a Char value.
But it looks like what you really want to do is XOR two large hex values that are represented as strings, and then return the result as a string. To do that all you need is...
val (v1, v2) = (BigInt(s1, 16), BigInt(s2, 16))
f"${v1 ^ v2}%x" // res0: String = 746865206b696420646f6e277420706c6179
You are using Java casting syntax. In Scala you cast with value.asInstanceOf[Type].
It should be (b._1 ^ b._2).asInstanceOf[Char].
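To make the two suggestions above concrete, here is a small self-contained sketch. It uses the same hex strings as the question; the short strings "ab" and "cd" are illustrative inputs chosen only to keep the character-wise result easy to check by hand.

```scala
val s1 = "1c0111001f010100061a024b53535009181c"
val s2 = "686974207468652062756c6c277320657965"

// XOR the numeric values, then render the result as lowercase hex.
// Note: toString(16) does not pad, so leading zero digits would be dropped.
val xorHex: String = (BigInt(s1, 16) ^ BigInt(s2, 16)).toString(16)
// xorHex == "746865206b696420646f6e277420706c6179"

// Character-wise XOR of two strings: ^ on two Chars yields an Int,
// so .toChar converts each result back to a Char.
val xorChars = "ab".zip("cd").map { case (x, y) => (x ^ y).toChar }
// 'a' ^ 'c' == 2 and 'b' ^ 'd' == 6
```

The character-wise version works on whatever characters the strings contain, so applying it to Base64 text (as in the question) XORs the encoded characters, not the underlying bytes.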

Map versus FlatMap on String

Listening to the Collections lecture from Functional Programming Principles in Scala, I saw this example:
scala> val s = "Hello World"
scala> s.flatMap(c => ("." + c)) // prepend each element with a period
res5: String = .H.e.l.l.o. .W.o.r.l.d
Then, I was curious why Mr. Odersky didn't use a map here. But, when I tried map, I got a different result than I expected.
scala> s.map(c => ("." + c))
res8: scala.collection.immutable.IndexedSeq[String] = Vector(.H, .e, .l, .l, .o,
". ", .W, .o, .r, .l,
I expected that above call to return a String, since I'm map-ing, i.e. applying a function to each item in the "sequence," and then returning a new "sequence."
However, I could perform a map rather than flatmap for a List[String]:
scala> val sList = s.toList
sList: List[Char] = List(H, e, l, l, o, , W, o, r, l, d)
scala> sList.map(c => "." + c)
res9: List[String] = List(.H, .e, .l, .l, .o, ". ", .W, .o, .r, .l, .d)
Why was an IndexedSeq[String] the return type of calling map on the String?
The reason for this behavior is that, in order to apply map to a String, Scala treats the string as a sequence of chars (IndexedSeq[Char]). The operation is applied to each element of that sequence, and because the function here returns a String for every Char, the result is a sequence of strings (IndexedSeq[String]). Since Scala treated the string as a sequence in order to apply map, a sequence is what map returns.
flatMap then simply invokes flatten on that sequence afterwards, which "converts" it back to a String.
You also have an interesting "collection of Scala flatMap examples", the first of which illustrates that difference between flatMap and map:
scala> val fruits = Seq("apple", "banana", "orange")
fruits: Seq[java.lang.String] = List(apple, banana, orange)
scala> fruits.map(_.toUpperCase)
res0: Seq[java.lang.String] = List(APPLE, BANANA, ORANGE)
scala> fruits.flatMap(_.toUpperCase)
res1: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)
Quite a difference, right?
Because flatMap treats a String as a sequence of Char, it flattens the resulting list of strings into a sequence of characters (Seq[Char]).
flatMap is a combination of map and flatten, so it first runs map on the sequence, then runs flatten, giving the result shown.
You can see this by running map and then flatten yourself:
scala> val mapResult = fruits.map(_.toUpperCase)
mapResult: Seq[String] = List(APPLE, BANANA, ORANGE)
scala> val flattenResult = mapResult.flatten
flattenResult: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)
Your map function c => ("." + c) takes a char and returns a String. It's like taking a List and returning a List of Lists. flatMap flattens that back.
If you would return a char instead of a String you wouldn't need the result flattened, e.g. "abc".map(c => (c + 1).toChar) returns "bcd".
With map you are taking a list of characters and turning it into a list of strings. That's the result you see. A map never changes the length of a list – the list of strings has as many elements as the original string has characters.
With flatMap you are taking a list of characters and turning it into a list of strings and then you mush those strings together into a single string again. flatMap is useful when you want to turn one element in a list into multiple elements, without creating a list of lists. (This of course also means that the resulting list can have any length, including 0 – this is not possible with map unless you start out with the empty list.)
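The behaviour described in these answers can be checked side by side. This is a small sketch assuming Scala 2.13 or 3; the value names are made up for illustration.

```scala
val s = "Hello"

// Char => String: the result can no longer be a String, so map
// returns an indexed sequence with one String per character.
val viaMap = s.map(c => "." + c)

// flatMap concatenates the per-character strings back into one String.
val viaFlatMap = s.flatMap(c => "." + c)

// Doing the flattening step by hand gives the same text.
val viaMapThenJoin = s.map(c => "." + c).mkString

// Char => Char: here map can (and does) return a String.
val shifted = "abc".map(c => (c + 1).toChar)
```

viaMap has exactly as many elements as s has characters, while viaFlatMap is a single String of twice that length.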
Use flatMap in situations where you run map followed by flatten. The specific situation is this:
• You’re using map (or a for/yield expression) to create a new collection from an existing collection.
• The resulting collection is a List of Lists.
• You call flatten immediately after map (or a for/yield expression).
When you’re in this situation, you can use flatMap instead.
Example: Add all the Integers from the bag
val bag = List("1", "2", "three", "4", "one hundred seventy five")
def toInt(in: String): Option[Int] = {
  try {
    Some(Integer.parseInt(in.trim))
  } catch {
    case e: Exception => None
  }
}
Using a flatMap method
bag.flatMap(toInt).sum // Int = 7
Using map method (3 steps needed)
bag.map(toInt) // List[Option[Int]] = List(Some(1), Some(2), None, Some(4), None)
bag.map(toInt).flatten //List[Int] = List(1, 2, 4)
bag.map(toInt).flatten.sum //Int = 7
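As a variation on the example above, and assuming Scala 2.13 or later, the hand-written toInt can be replaced by the standard toIntOption method on strings. The sketch below is an alternative, not the code from the answer.

```scala
val bag = List("1", "2", "three", "4", "one hundred seventy five")

// toIntOption (Scala 2.13+) returns None for non-numeric input instead of throwing.
// flatMap drops the None values and unwraps the Some values in one pass.
val parsed: List[Int] = bag.flatMap(_.trim.toIntOption)
val total: Int = parsed.sum
// parsed == List(1, 2, 4), total == 7
```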

Map a variable of type of Pair -- impossible

This doesn't seem logical to me:
scala> val a = Map((1, "111"), (2, "222"))
a: scala.collection.immutable.Map[Int,String] = Map(1 -> 111, 2 -> 222)
scala> val b = a.map((key, value) => value)
<console>:8: error: wrong number of parameters; expected = 1
val b = a.map((key, value) => value)
^
scala> val c = a.map(x => x._2)
c: scala.collection.immutable.Iterable[String] = List(111, 222)
I know that I can say val d = a.map({ case(key, value) => value })
But why isn't it possible to say a.map((key, value) => value) ? There is only one argument there of type Tuple2[Int, String] or Pair of Int, String. What's the difference between a.map((key, value) => value) and a.map(x => x._2) ?
UPDATE:
val myTuple2 = (1, 2) -- this is one variable, correct?
for ( (k, v) <- a ) yield v -- (k, v) is also only one variable, correct?
map((key, value) => value) -- 2 variables. weird.
So how do I specify a variable of type Tuple2 (or any other type) in map without using case?
UPDATE2:
What's wrong with that?
Map((1, "111"), (2, "222")).map( ((x,y):Tuple2[Int, String]) => y) -- wrong
Map((1, "111"), (2, "222")).map( ((x):Tuple2[Int, String]) => x._2) -- ok
Okay, you're still not convinced. In cases like this it is reasonable to fall back on the source of truth (well, kind of): The Holy Specification (aka the Scala Language Specification).
So, in an anonymous function the parameters are treated individually, not as one tuple (and that is pretty smart: otherwise, how would you write an anonymous function with 2, ..., n parameters?).
At the same time
val x = (1, 2)
is a single item of type Tuple2[Int,Int] (if you're interested you may find the corresponding section of the spec as well).
for ( (k, v) <- a ) yield v
In this case you have one variable unpacked to two variables. It is similar to
val x = (1, 2) // one variable -- tuple
val (y,z) = x // two integer variables unpacked from one
Some call this destructuring assignment and this is a particular case of pattern matching. And you've already provided another example of pattern matching in action:
a.map({ case(key, value) => value })
Which we can read as map accepts a function produced by a partial function literal, which enables use of pattern matching.
You're basically asking the same question as:
Scala - can a lambda parameter match a tuple?
You've already listed most of the options they listed there, including the accepted answer of using a PartialFunction.
However, since you're using your lambda in a map function, you could use a for comprehension instead:
for ( (k, v) <- a ) yield v
Alternatively, you can use the Function2.tupled method to fix your lambda's type:
scala> val a = Map((1, "111"), (2, "222"))
a: scala.collection.immutable.Map[Int,String] = Map(1 -> 111, 2 -> 222)
scala> a.map( ((k:Int,v:String) => v).tupled )
res1: scala.collection.immutable.Iterable[String] = List(111, 222)
To answer your question in your thread with om-nom-nom above, look at this output:
scala> ( (x:Int,y:String) => y ).getClass.getSuperclass
res0: Class[?0] forSome { type ?0 >: ?0; type ?0 <: (Int, String) => String } = class scala.runtime.AbstractFunction2
Notice that the superclass of the anonymous function (x:Int,y:String) => y is Function2[Int, String, String], not Function1[(Int, String), String].
You can use pattern matching (or a partial function; in this instance they are the same thing). Notice the curly braces:
val b = a.map{ case (key, value) => value }
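For comparison, the working alternatives mentioned in this thread can be put next to each other. This is only a summary sketch; the value names are made up for illustration.

```scala
val a = Map(1 -> "111", 2 -> "222")

// 1. Pattern-matching function literal: destructures the single Tuple2 argument.
val viaCase = a.map { case (key, value) => value }

// 2. One parameter of type (Int, String), accessed by field.
val viaField = a.map(kv => kv._2)

// 3. A two-parameter function converted to a one-parameter function on tuples.
val viaTupled = a.map(((k: Int, v: String) => v).tupled)

// 4. for comprehension: the generator pattern destructures the tuple.
val viaFor = for ((k, v) <- a) yield v
```

All four produce the same collection of values; they differ only in how the single tuple argument is taken apart.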

Standard function to enumerate all strings of given length over given alphabet

Suppose, I have an alphabet of N symbols and want to enumerate all different strings of length M over this alphabet. Does Scala provide any standard library function for that?
Taking inspiration from another answer:
val letters = Seq("a", "b", "c")
val n = 3
Iterable.fill(n)(letters) reduceLeft { (a, b) =>
for(a<-a;b<-b) yield a+b
}
Seq[java.lang.String] = List(aaa, aab, aac, aba, abb, abc, aca, acb, acc, baa, bab, bac, bba, bbb, bbc, bca, bcb, bcc, caa, cab, cac, cba, cbb, cbc, cca, ccb, ccc)
To work with something other than strings:
val letters = Seq(1, 2, 3)
Iterable.fill(n)(letters).foldLeft(List(List[Int]())) { (a, b) =>
for (a<-a;b<-b) yield(b::a)
}
The need for the extra type annotation is a little annoying but it will not work without it (unless someone knows another way).
Another solution:
val alph = List("a", "b", "c")
val n = 3
alph.flatMap(List.fill(n)(_)) // n copies of each symbol, so a symbol can repeat up to n times
.combinations(n)
.flatMap(_.permutations).toList
Update: If you want to get a list of strings in the output, then alph should be a string.
val alph = "abcd"
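The fold-based idea from the first answer can also be written so that it returns strings directly. The sketch below is one possible formulation, not a standard library function; the name allStrings and its parameters are made up for illustration.

```scala
// Hypothetical helper: all strings of length m over the given alphabet.
// Starts from the single empty string and appends one symbol per round,
// so the result has alphabet.size ^ m elements.
def allStrings(alphabet: Seq[Char], m: Int): Seq[String] =
  (1 to m).foldLeft(Seq("")) { (acc, _) =>
    for (prefix <- acc; c <- alphabet) yield prefix + c
  }

val twoLetter = allStrings(Seq('a', 'b'), 2)
// twoLetter == Seq("aa", "ab", "ba", "bb")
```

Because the whole result is built eagerly, this is only practical for small N and M; for N symbols and length M it holds N^M strings in memory.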

Scala: How do I use fold* with Map?

I have a Map[String, String] and want to concatenate the values to a single string.
I can see how to do this using a List...
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.reduceLeft[String](_+_)
res8: String = testing123
fold* or reduce* seem to be the right approach I just can't get the syntax right for a Map.
Folds on a map work the same way they would on a list of pairs. You can't use reduce because then the result type would have to be the same as the element type (i.e. a pair), but you want a string. So you use foldLeft with the empty string as the neutral element. You also can't just use _+_ because then you'd try to add a pair to a string. You have to instead use a function that adds the accumulated string, the first value of the pair and the second value of the pair. So you get this:
scala> val m = Map("la" -> "la", "foo" -> "bar")
m: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(la -> la, foo -> bar)
scala> m.foldLeft("")( (acc, kv) => acc + kv._1 + kv._2)
res14: java.lang.String = lalafoobar
Explanation of the first argument to fold:
As you know the function (acc, kv) => acc + kv._1 + kv._2 gets two arguments: the second is the key-value pair currently being processed. The first is the result accumulated so far. However what is the value of acc when the first pair is processed (and no result has been accumulated yet)? When you use reduce the first value of acc will be the first pair in the list (and the first value of kv will be the second pair in the list). However this does not work if you want the type of the result to be different than the element types. So instead of reduce we use fold where we pass the first value of acc as the first argument to foldLeft.
In short: the first argument to foldLeft says what the starting value of acc should be.
As Tom pointed out, you should keep in mind that maps don't necessarily maintain insertion order (Map2 and co. do, but hashmaps do not), so the string may list the elements in a different order than the one in which you inserted them.
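Two small variations on the fold above may help. The first destructures the key-value pair in the function literal instead of using kv._1 and kv._2; the second sorts by key first, which is one way to get a result that does not depend on the map's iteration order. This is a sketch, and sorting by key is just an assumed choice of ordering.

```scala
val m = Map("la" -> "la", "foo" -> "bar")

// Same fold as above, with the pair destructured by a pattern match.
val viaFold = m.foldLeft("") { case (acc, (k, v)) => acc + k + v }

// Sort the pairs by key before folding so the order is deterministic.
val sortedConcat =
  m.toSeq.sortBy(_._1).foldLeft("") { case (acc, (k, v)) => acc + k + v }
// sortedConcat == "foobarlala"
```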
The question has been answered already, but I'd like to point out that there are easier ways to produce those strings, if that's all you want. Like this:
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.mkString
res0: String = testing123
scala> val m = Map(1 -> "abc", 2 -> "def", 3 -> "ghi")
m: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,abc), (2,def), (3,ghi))
scala> m.values.mkString
res1: String = abcdefghi