Hello, I currently have the following string:
"STRING$INTEGER$STRING$STRING"
How can I pattern match on that in Scala?
I know that I can use .split, but that produces an Array[String]. My regex is flawed: I could match every field against (.*), but that would treat the second field as a String when it is actually an Int. Is there a way to have
data match {}
You can indeed use a regex, but everything you match will be a String.
val format = """(\w+)\$(\d+)\$(\w+)\$(\w+)""".r
"hello$5$foo$bar" match {
case format(s1, i, s2, s3) => // i is a String
val n = i.toInt
}
You could also create an extractor which could use the regex above or split.
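import scala.util.Try

object Format {
def unapply(string: String): Option[(String, Int, String, String)] = string.split("""\$""") match {
case Array(s1, i, s2, s3) =>
// None when the second field is not a valid Int
Try(i.toInt).toOption.map(i => (s1, i, s2, s3))
case _ => None // wrong number of fields, so no match instead of a MatchError
}
}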
"hello$5$foo$bar" match {
case Format(s1, i, s2, s3) => i + 5 // i is an Int
}
// Int = 10
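The two ideas above can also be combined: the regex does the splitting and the extractor does the Int conversion. This is only a sketch; TypedFormat and describe are names made up for illustration, and the pattern is the same one as in the first snippet.

```scala
import scala.util.Try

object TypedFormat {
  // Same pattern as above: word, digits, word, word, separated by '$'
  private val Pattern = """(\w+)\$(\d+)\$(\w+)\$(\w+)""".r

  // Yields the second field as an Int, or None if the input does not fit
  def unapply(s: String): Option[(String, Int, String, String)] = s match {
    case Pattern(s1, i, s2, s3) =>
      // Try guards against digit runs that overflow Int
      Try(i.toInt).toOption.map(n => (s1, n, s2, s3))
    case _ => None
  }

  // Hypothetical helper showing the extractor in a match
  def describe(s: String): String = s match {
    case TypedFormat(s1, n, s2, s3) => s"$s1 ${n + 5} $s2 $s3"
    case _                          => "no match"
  }
}
```

With this, a non-numeric or too-large second field simply falls through to the default case.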
Your question is rather unclear, but I guess you need something like this:
val string = "STRING$INTEGER$STRING$STRING"
val regex = """(\w+)\$(\w+)\$(\w+)\$(\w+)""".r
string match {
case regex(s1, i, s2, s3) =>
s"$s1, $i, $s2, $s3"
case _ =>
"error"
}
|> res0: String = STRING, INTEGER, STRING, STRING
To separate a string val s = "abc$123$foo$bar" by $ consider
val xs = s.split("\\$")
xs: Array[String] = Array(abc, 123, foo, bar)
To collect all the integer values in the resulting array consider for instance
xs.flatMap( s => scala.util.Try(s.toInt).toOption )
res: Array[Int] = Array(123)
where we try to convert each string into an integer value; a failed conversion yields None, which flatMap drops.
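If you want to keep the non-integer fields as well, one possible variation is to tag each field instead of dropping it. A minimal sketch, where Fields and classify are illustrative names and not part of any library:

```scala
import scala.util.Try

object Fields {
  // Right(n) for fields that parse as Int, Left(original text) otherwise
  def classify(s: String): List[Either[String, Int]] =
    s.split("\\$").toList.map { field =>
      Try(field.toInt).toOption.toRight(field)
    }
}
```

The positions are preserved, so you can still tell which field was the integer.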
This is working fine,
val Array(a,b) = "Hello,Bye".split(',')
But this fails because the extra information is not ignored:
val Array(a,b) = "Hello,Bye,etc".split(',')
// scala.MatchError: ...
How can I ignore the extra information?
The same error occurs when there are fewer items:
val Array(a,b) = "Hello".split(',')
IMPORTANT: is there no elegant way, like JavaScript's destructuring assignment?
Add a placeholder using underscore:
val Array(a,b, _) = "Hello,Bye,etc".split(',')
EDIT: Using match-case syntax is generally preferred and more flexible (and you can catch all possible outcomes):
val s = "Hello,Bye,etc"
s.split(',') match {
case Array(a) => //...
case Array(a, b) => //...
case Array(a, b, rest @ _*) => //...
case _ => //Catch all case to avoid MatchError
}
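For reference, here is a filled-in version of that match that actually runs. It is just a sketch: SplitMatch and describe are made-up names, and the returned strings are only there to show which case fired.

```scala
object SplitMatch {
  def describe(s: String): String = s.split(',') match {
    case Array(a)               => s"one: $a"
    case Array(a, b)            => s"two: $a, $b"
    // rest is bound to everything after the first two elements
    case Array(a, b, rest @ _*) => s"two plus ${rest.length} extra: $a, $b"
    // any other shape, so the match can never throw a MatchError
    case _                      => "nothing usable"
  }
}
```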
@ _* will cover both instances.
val Array(a,b,x @ _*) = "Hello,Bye,etc".split(',')
//a: String = Hello
//b: String = Bye
//x: Seq[String] = ArraySeq(etc)
val Array(c,d,z @ _*) = "Hello,Bye".split(',')
//c: String = Hello
//d: String = Bye
//z: Seq[String] = ArraySeq()
From your comments it looks like you want to default to "", an empty String. I found a way to do it with Stream, which has been deprecated in Scala 2.13, but so far it is the cleanest solution I've found.
val Stream(a,b,c,d,_*) = "one,two,etc".split(",").toStream ++ Stream.continually("")
//a: String = one
//b: String = two
//c: String = etc
//d: String = ""
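If you would rather avoid the deprecated Stream, a different technique that gives the same defaults is to pad the array with padTo before destructuring. This is a sketch of that alternative, not the Stream approach itself; PadSplit and firstFour are illustrative names.

```scala
object PadSplit {
  // Pads the split result with "" up to four elements, then takes the first four
  def firstFour(s: String): (String, String, String, String) =
    s.split(",").padTo(4, "") match {
      // _* ignores anything beyond the fourth element
      case Array(a, b, c, d, _*) => (a, b, c, d)
    }
}
```

After padTo(4, "") the array always has at least four elements, so the single case is enough.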
I would consider making the result values of type Option[String] by lift-ing the split Array[String] (viewed as a partial function) into an Int => Option[String] function:
val opts = "Hello".split(",").lift
// opts: Int => Option[String] = <function1>
opts(0)
// res1: Option[String] = Some(Hello)
opts(1)
// res2: Option[String] = None
Or, if String values are preferred with None translated to "":
val strs = "Hello,world".split(",").lift.andThen(_.getOrElse(""))
// strs: Int => String = scala.Function1$$Lambda$...
strs(0)
// res3: String = Hello
strs(1)
// res4: String = "world"
strs(2)
// res5: String = ""
Note that with this approach, you can take as many opts(i) or strs(i), i = 0, 1, 2, ..., as wanted.
You can do this by converting to List first:
val a :: b :: _ = "Hello,Bye,etc".split(',').toList
I am attempting to make a lexer in Scala.
I am attempting to do something like this
def lex(s: String): Expr = s match {
case num(a) => Number(a.toDouble)
case mul(a, b) => Mul(Number(a.toDouble), Number(b.toDouble))
case div(a, b) => Div(Number(a.toDouble), Number(b.toDouble))
case add(a, b) => Add(Number(a.toDouble), Number(b.toDouble))
case sub(a, b) => Sub(Number(a.toDouble), Number(b.toDouble))
case _ => Number(0)
}
where num, mul, div, add, sub are defined as so:
val num: Regex = "[0-9]+".r
val add: Regex = "[0-9]+\\s*\\+\\s*[0-9]+".r
val sub: Regex = "[0-9]+\\s*\\-\\s*[0-9]+".r
val div: Regex = "[0-9]+\\s*\\/\\s*[0-9]+".r
val mul: Regex = "[0-9]+\\s*\\*\\s*[0-9]+".r
When attempting to lex any expression (lex("1 + 2")), the result is always Number(0.0) instead of the expected Add(Number(1), Number(2)).
I'm not sure where it's going wrong...
You need to specify which groups you want to extract.
val num = "([0-9]+)".r
val add = "([0-9]+)\\s*\\+\\s*([0-9]+)".r
val sub = "([0-9]+)\\s*\\-\\s*([0-9]+)".r
val div = "([0-9]+)\\s*\\/\\s*([0-9]+)".r
val mul = "([0-9]+)\\s*\\*\\s*([0-9]+)".r
You need one pair of parentheses per variable you extract.
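To see it end to end, here is a cut-down, self-contained version. The Expr classes are my guess at what the question's types look like (they are not shown there), and only num, add and mul are included to keep it short.

```scala
import scala.util.matching.Regex

object Lexer {
  // Assumed shape of the question's expression types
  sealed trait Expr
  final case class Number(value: Double) extends Expr
  final case class Add(l: Expr, r: Expr) extends Expr
  final case class Mul(l: Expr, r: Expr) extends Expr

  // One capture group per variable bound in the pattern
  val num: Regex = "([0-9]+)".r
  val add: Regex = "([0-9]+)\\s*\\+\\s*([0-9]+)".r
  val mul: Regex = "([0-9]+)\\s*\\*\\s*([0-9]+)".r

  def lex(s: String): Expr = s match {
    case num(a)    => Number(a.toDouble)
    case add(a, b) => Add(Number(a.toDouble), Number(b.toDouble))
    case mul(a, b) => Mul(Number(a.toDouble), Number(b.toDouble))
    case _         => Number(0)
  }
}
```

Without the groups, a pattern like add(a, b) asks for two captured values from a regex that captures none, so it never matches and every input falls through to the default case.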
I have a list and two strings :
val features = List("one","two","three")
val strOne = "one_five"
val strTwo = "seven_five"
I'd like to match each string against the items of the list.
If the beginning of a string matches one of the list items, print the matched list item and the string itself.
If not, print nothing.
I have a method that I think does what I need, but I cannot compile it:
def getElement(any: String): String = any match {
case s :: rest if features.contains(s) => s + "= " + any
case _ => // Nothing
}
I wanted the following :
scala> getElement(strOne)
"one_five= one"
scala> getElement(strTwo)
You can't just return nothing. You promised that your method would return a String, so you must return one. You can either return an Option[String] (preferred) or return Unit and do the printing yourself. Further, the built in method TraversableLike#find will do part of the job.
def findFeature(str: String): Option[String] = features.find(f => str.startsWith(f)).map(value => s"$str=$value")
In order to get the printing behavior:
findFeature(str) foreach println
// or redefine findFeature similarly
Further, you seem to misunderstand pattern matching: You don't want to match on the string; you want to match the list's elements against the string. Here's a version that uses pattern matching:
import scala.annotation.tailrec

def getElement(str: String): Option[String] = {
@tailrec def getElem0(str: String, strs: List[String]): Option[String] = strs match {
case s :: _ if str startsWith s => Some(s"$str=$s") // Matching case: str begins with this feature
case _ :: rest => getElem0(str, rest) // Not matched, but more to search
case Nil => None // Empty list; failure
}
getElem0(str, features)
}
Your solution doesn't compile because :: is a List pattern, and any is a String. Moreover, getElement is declared to return a String, so it must return a String for every input. You can't just return "nothing" in the second case.
Here's an alternative implementation:
def printElement(any: String): Unit = features
.find(s => any.startsWith(s)) // find matching (returns Option[String])
.foreach(s => println(s + "= "+ any)) // print if found
printElement(strOne) // one= one_five
printElement(strTwo)
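If you want the result as a value instead of printing it, collectFirst does the find and the formatting in one step. A sketch along those lines; the Features object and findFeature are only names for this example, and the output format copies the one printed above.

```scala
object Features {
  val features = List("one", "two", "three")

  // First feature that the string starts with, formatted as "feature= string"
  def findFeature(str: String): Option[String] =
    features.collectFirst { case f if str.startsWith(f) => s"$f= $str" }
}
```

You can then print with Features.findFeature(strOne).foreach(println), which prints nothing for a None.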
A simple one-line solution:
Find the list item that equals the first part of the string:
features.find(_ == str.split("_")(0)).map { elem => s"$str= $elem"}.getOrElse("")
Put the above line inside the function.
def getElement(str: String): String = features.find(_ == str.split("_")(0)).map { elem => s"$str= $elem"}.getOrElse("")
Scala REPL
scala> val strOne = "one_five"
strOne: String = one_five
scala> val str = "one_five"
str: String = one_five
scala> features.find(_ == str.split("_")(0)).getOrElse("")
res2: String = one
scala> features.find(_ == str.split("_")(0)).map(elem => s"$str= $elem").getOrElse("")
res3: String = one_five= one
What would be the best way to combine "__L_" and "_E__" to "_EL_" in Scala?
I don't want to use if and for commands.
How about this:
def combine(xs: String, ys: String): String =
(xs zip ys).map {
case ('_', y) => y
case (x, _) => x
}.mkString("")
The only thing that is not really nice about this is how to get back from a collection (IndexedSeq[Char]) to a string. An alternative is to use the String constructor that takes an Array[Char]. That would probably be more efficient.
Note that zip will work for strings of different length, but the result will be the size of the shorter string. This may or may not be what you want.
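For completeness, this is roughly what the String-constructor variant mentioned above could look like. It is a sketch only; I have not measured whether it is actually faster than mkString.

```scala
object Combine {
  // Same merge rule as combine above, but the result is built with
  // the String(Array[Char]) constructor instead of mkString
  def combine(xs: String, ys: String): String = {
    val merged: Array[Char] = xs.zip(ys).map {
      case ('_', y) => y // placeholder in xs: take the char from ys
      case (x, _)   => x // otherwise keep the char from xs
    }.toArray
    new String(merged)
  }
}
```

The truncation caveat still applies: Combine.combine("__L_", "_E") gives "_E", because zip stops at the shorter string.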
def zipStrings(first: String,
second: String,
comb: (Char, Char) => String = (f, _) => f.toString,
placeholder: Char = '_') =
first.zipAll(second, placeholder, placeholder).map { // pad the shorter string with the chosen placeholder
case (c, `placeholder`) => c
case (`placeholder`, c) => c
case (f, s) => comb(f, s)
}.mkString
By default this prioritises characters from first over second:
zipStrings("__A", "X_CD") // yields "X_AD"
zipStrings("A__YY", "BXXXX", (f, s) => s"($f|$s)") // yields "(A|B)XX(Y|X)(Y|X)"
For your original strings:
zipStrings("__L_", "_E__") // yields "_EL_"
zipStrings("--L-", "-E--", placeholder = '-') // yields "-EL-"
The following Iterable can be of size one, two, or (up to) three.
org.apache.spark.rdd.RDD[Iterable[(String, String, String, String, Long)]] = MappedRDD[17] at map at <console>:75
The second element of each tuple can have any of the following values: A, B, C. Each of these values can appear (at most) once.
What I would like to do is sort them based on the following order (B, A, C), and then create a string by concatenating the elements in the 3rd position, separated by commas. If the corresponding tag is missing, concatenate an empty string in its place. For example:
this:
CompactBuffer((blah,A,val1,blah,blah), (blah,B,val2,blah,blah), (blah,C,val3,blah,blah))
should result in:
val2,val1,val3
this:
CompactBuffer((blah,A,val1,blah,blah), (blah,C,val3,blah,blah))
should result in:
,val1,val3
this:
CompactBuffer((blah,A,val1,blah,blah), (blah,B,val2,blah,blah))
should result in:
val2,val1,
this:
CompactBuffer((blah,B,val2,blah,blah))
should result in:
val2,,
and so on.
In your case when A, B and C appear at most once, you could add the corresponding values to a temporary map and retrieve the values from the map in the correct order.
If we use getOrElse to get the values from the map, we can specify the empty string as default value. This way we still get the correct result if our Iterable doesn't contain all the tuples with A, B and C.
type YourTuple = (String, String, String, String, Long)
def orderTuples(order: List[String])(iter: Iterable[YourTuple]) = {
val orderMap = iter.map { case (_, key, value, _, _) => key -> value }.toMap
order.map(s => orderMap.getOrElse(s, "")).mkString(",")
}
We can use this function as follows :
val a = ("blah","A","val1","blah",1L)
val b = ("blah","B","val2","blah",2L)
val c = ("blah","C","val3","blah",3L)
val order = List("B", "A", "C")
val bacOrder = orderTuples(order) _
bacOrder(Iterable(a, b, c)) // String = val2,val1,val3
bacOrder(Iterable(a, c)) // String = ,val1,val3
bacOrder(Iterable(a, b)) // String = val2,val1,
bacOrder(Iterable(b)) // String = val2,,
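Applied to the whole dataset this is just a map over the groups. I don't have Spark at hand here, so in the sketch below a plain List stands in for the RDD (that substitution is my assumption); OrderDemo, Row, groups and lines are illustrative names.

```scala
object OrderDemo {
  type Row = (String, String, String, String, Long)

  // Same logic as orderTuples above, repeated so the example is self-contained
  def orderTuples(order: List[String])(iter: Iterable[Row]): String = {
    val byTag = iter.map { case (_, tag, value, _, _) => tag -> value }.toMap
    order.map(tag => byTag.getOrElse(tag, "")).mkString(",")
  }

  // Stand-in for the RDD[Iterable[Row]] from the question
  val groups: List[Iterable[Row]] = List(
    List(("blah", "A", "val1", "blah", 1L), ("blah", "C", "val3", "blah", 3L)),
    List(("blah", "B", "val2", "blah", 2L))
  )

  // One output line per group, in B, A, C order
  val lines: List[String] = groups.map(g => orderTuples(List("B", "A", "C"))(g))
}
```

On a real RDD the last line would presumably be the same map call on the RDD itself.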
def orderTuples(xs: Iterable[(String, String, String, String, String)],
order: (String, String, String) = ("B", "A", "C")) = {
type T = Iterable[(String, String, String, String, String)]
type KV = Iterable[(String, String)]
val ord = List(order._1, order._2, order._3)
def loop(xs: T, acc: KV, vs: Iterable[String] = ord): KV = xs match {
case Nil if vs.isEmpty => acc
case Nil => vs.map((_, "")) ++: acc // missing tags contribute an empty value; mkString adds the commas
case x :: xs => loop(xs, List((x._2, x._3)) ++: acc, vs.filterNot(_ == x._2))
}
def comp(x: String) = ord.zipWithIndex.toMap.apply(x)
loop(xs, Nil).toList.sortBy(x => comp(x._1)).map(_._2).mkString(",")
}