Scala Regex with $ and String Interpolation - scala

I am writing a regex in Scala:
val regex = "^foo.*$".r
This works, but if I want to do
var x = "foo"
val regex = s"""^$x.*$""".r
we have a problem, because $ is now ambiguous. Is it possible to use string interpolation and still write a regex?
I can do something like
val x = "foo"
val regex = ("^" + x + ".*$").r
but I would rather not concatenate with +.

You can use $$ to have a literal $ in an interpolated string.
You should use the raw interpolator when the string is enclosed in triple quotes, because the s interpolator still processes escape sequences (such as \n) that you would expect to be taken literally inside triple quotes. It makes no difference in your specific case, but it is good to keep in mind.
So:
val regex = raw"""^$x.*$$""".r
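A minimal sketch of the points above, using the x from the question. The last part uses scala.util.matching.Regex.quote, which the answer does not mention; it rests on the assumption that x should be matched literally even when it contains regex metacharacters.

```scala
import scala.util.matching.Regex

val x = "foo"

// $$ is the escape for a literal $, so the pattern text is ^foo.*$
val regex = raw"""^$x.*$$""".r
println(regex.regex)                    // ^foo.*$
println(regex.findFirstIn("foobar"))    // Some(foobar)

// raw keeps the backslash; s turns \t into a tab even inside triple quotes
val rawLen = raw"""a\tb""".length       // 4 characters: a \ t b
val sLen = s"""a\tb""".length           // 3 characters: a <tab> b

// Assumption: the interpolated value should match literally, so quote it
val y = "a.b"
val quoted = raw"""^${Regex.quote(y)}.*$$""".r
println(quoted.findFirstIn("a.bzzz"))   // Some(a.bzzz)
println(quoted.findFirstIn("aXbzzz"))   // None
```

Without the quoting step, the dot in y would match any character.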

Using %s should work.
var x = "foo"
val regex = """^%s.*$""".format(x).r
In the off chance that you need a literal %s in the pattern itself, just do
val regex = """^%s.*%s$""".format(x, "%s").r
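As a possible alternative to passing "%s" as an argument, format strings treat %% as a literal percent sign. The sketch below compares the two; the names viaArg and viaEscape are only for illustration.

```scala
val x = "foo"

// Approach from the answer: feed "%s" back in as an argument
val viaArg = """^%s.*%s$""".format(x, "%s")

// Alternative: %% is the format escape for a literal %
val viaEscape = """^%s.*%%s$""".format(x)

println(viaArg)     // ^foo.*%s$
println(viaEscape)  // ^foo.*%s$

val regex = viaEscape.r
println(regex.findFirstIn("foo 100%s"))  // Some(foo 100%s)
```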

Related

How to use split function in scala

I have a string
val s1 = "dog#$&cat#$&cow#$&snak"
val s2 = s1.split()
How do I split the string into words?
For a precise split, you could use #\\$& to match all three characters. The dollar sign has to be escaped in the regex, and in a regular string literal the backslash itself also has to be escaped.
val s1= "dog#$&cat#$&cow#$&snak"
val s2= s1.split("#\\$&")
Output
s2: Array[String] = Array(dog, cat, cow, snak)
A broader pattern would be \\W+, which matches one or more characters that are not word characters.
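A short sketch comparing the two patterns. It also shows java.util.regex.Pattern.quote as a way to avoid escaping by hand; that call is not part of the answer above, and the mixed string is a made-up input to show where the patterns differ.

```scala
import java.util.regex.Pattern

val s1 = "dog#$&cat#$&cow#$&snak"

// Precise: split only on the exact three-character delimiter
val exact = s1.split("#\\$&")

// Same delimiter, letting Pattern.quote do the escaping
val quoted = s1.split(Pattern.quote("#$&"))

// Broad: split on any run of non-word characters
val broad = s1.split("\\W+")

println(exact.mkString(","))   // dog,cat,cow,snak
println(broad.mkString(","))   // dog,cat,cow,snak

// Hypothetical input with inconsistent delimiters
val mixed = "dog#cat$&cow"
println(mixed.split("#\\$&").mkString(","))  // dog#cat$&cow
println(mixed.split("\\W+").mkString(","))   // dog,cat,cow
```

The broad pattern is more forgiving, but it would also split on spaces or punctuation inside a word.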

Scala: Convert a string to string array with and without split given that all special characters except "(" an ")" are allowed

I have a string
val a = "((x1,x2),(y1,y2),(z1,z2))"
I want to parse this into a Scala array
val arr = Array(("x1","x2"),("y1","y2"),("z1","z2"))
Is there a way of doing this directly with an expr() equivalent?
If not, how would one do this using split?
Note: x1, x2, x3, etc. are strings and can contain special characters, so the key is to use the () delimiters to parse the data.
Code I adapted from Dici and Bogdan Vakulenko:
val x2 = a.getString(1).trim.split("[\\(\\)]").grouped(2).map(x => x(0).trim).toArray
val x3 = x2.drop(1) // the first element is an empty string because the input starts with "(("
val jmap = new java.util.HashMap[String, String]()
for (i <- x3) {
  val index = i.lastIndexOf(",")
  val fv = i.slice(0, index)
  val lv = i.substring(index + 1).trim
  jmap.put(fv, lv)
}
This is still susceptible to a "," in the second string.
Actually, I think a regex is the most convenient way to solve this.
val a = "((x1,x2),(y1,y2),(z1,z2))"
val regex = "(\\((\\w+),(\\w+)\\))".r
println(
  regex.findAllMatchIn(a)
    .map(matcher => (matcher.group(2), matcher.group(3)))
    .toList
)
Note that I made some assumptions about the format:
no whitespace in the string (the regex could easily be updated to handle this if needed)
always tuples of two elements, never more
an empty string is not valid as a tuple element
only alphanumeric characters are allowed (this would also be easy to fix)
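As a rough sketch of how the first and last assumptions could be relaxed: the pattern below allows whitespace around each element and any characters other than parentheses and commas inside it. That character set is my own assumption about the input, not something stated in the question, and the sample string is made up.

```scala
// Hypothetical input with spaces and non-alphanumeric characters
val a = "((x 1, x-2), (y#1,y2),(z1,z2))"

// \( ... \)    literal parentheses around one pair
// \s*          optional whitespace, kept out of the captured groups
// [^(),]+?     one or more characters that are not ( ) or ,
val regex = """\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)""".r

val parsed: Array[(String, String)] =
  regex.findAllMatchIn(a)
    .map(m => (m.group(1), m.group(2)))
    .toArray

println(parsed.mkString(" "))  // (x 1,x-2) (y#1,y2) (z1,z2)
```

Elements that may themselves contain commas would need a different delimiter rule, since a comma is the only thing separating the two halves.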
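Alternatively, strip the parentheses and spaces, split on commas, and pair up consecutive elements with grouped(2):
val a = "((x1,x2),(y1,y2),(z1,z2))"
a.replaceAll("[\\(\\) ]", "")
  .split(",")
  .grouped(2)
  .map(x => (x(0), x(1)))
  .toArray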

Concatenate characters in Scala

Suppose we have a string "code". How would we concatenate any two of its characters? For example, say we need to concatenate the last two characters:
str.init.last + str.last gives 201 as the result. How would we get de instead?
You can use string interpolation to make any combination of characters:
scala> val code = "code"
code: String = code
scala> s"${code(1)}${code(3)}"
res0: String = oe
"code".init.last.toString + "code".last.toString
val res7: String = de
(use toString first to convert char to String and then concatenate with +)
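A small sketch of why the original expression gives 201, plus two other ways to get de. The helper pick is a hypothetical name used only for illustration.

```scala
val str = "code"

// Char + Char is numeric addition: 'd' is 100 and 'e' is 101
val sum = str.init.last + str.last
println(sum)        // 201

// For the last two characters specifically, takeRight is enough
val lastTwo = str.takeRight(2)
println(lastTwo)    // de

// Interpolation converts each Char to a String before joining
val viaInterp = s"${str.init.last}${str.last}"
println(viaInterp)  // de

// Hypothetical helper: join the characters at any two indices
def pick(s: String, i: Int, j: Int): String = s"${s(i)}${s(j)}"
println(pick(str, 1, 3))  // oe
```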

Removing values after particular character from rdd in scala

I have following input:
(A,123#3A,B,C,D,134#wer,E,242#wer)
Is there a way to get the following output using filter/replace/trim or any other function?
(A,123,B,C,D,134,E,242)
Your question is not completely clear.
If you mean that your input is a list of strings then you can do:
val input = Seq("A","123#3A","B","C","D","134#wer","E","242#wer")
input.map(_.split("#").head)
But if you mean that your input is one string, then you can do:
val input2 = "(A,123#3A,B,C,D,134#wer,E,242#wer)"
val Pattern = "\\(([a-zA-Z\\d,#]*)\\)".r
input2 match {
case Pattern(str) => "(" + str.split(",").map(_.split("#").head).mkString(",") + ")"
}
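For the single-string case, a regex replacement might be a shorter option. This is a sketch that assumes every value to drop starts at a # and runs up to the next comma or closing parenthesis; takeWhile is shown as an alternative for the list case.

```scala
// List case: keep each element up to (not including) the first #
val input = Seq("A", "123#3A", "B", "C", "D", "134#wer", "E", "242#wer")
val cleanedSeq = input.map(_.takeWhile(_ != '#'))
println(cleanedSeq.mkString("(", ",", ")"))  // (A,123,B,C,D,134,E,242)

// String case: delete each # and everything after it
// up to the next , or )
val input2 = "(A,123#3A,B,C,D,134#wer,E,242#wer)"
val cleanedStr = input2.replaceAll("#[^,)]*", "")
println(cleanedStr)                          // (A,123,B,C,D,134,E,242)
```

Unlike the pattern match above, replaceAll does not validate the overall shape of the input, so malformed strings pass through silently.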

How do I escape tilde character in scala?

Given that I have a file that looks like this
CS~84~Jimmys Bistro~Jimmys
...
using tilde (~) as a delimiter, how can I split it?
val company = dataset.map(k=>k.split(""\~"")).map(
k => Company(k(0).trim, k(1).toInt, k(2).trim, k(3).trim)
The above doesn't work.
Hmmm, I don't see where it needs to be escaped.
scala> val str = """CS~84~Jimmys Bistro~Jimmys"""
str: String = CS~84~Jimmys Bistro~Jimmys
scala> str.split('~')
res15: Array[String] = Array(CS, 84, Jimmys Bistro, Jimmys)
And the array elements don't need to be trimmed unless you know that errant spaces can be part of the input.
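Putting that together with the Company mapping from the question, a sketch might look like this. The field names of Company are assumptions, since the question only shows the constructor call, and toIntOption needs Scala 2.13 or later.

```scala
// Assumed shape of Company; the field names are hypothetical
case class Company(code: String, id: Int, name: String, shortName: String)

// Returns None when the line does not have exactly four fields
// or the second field is not an integer
def parseCompany(line: String): Option[Company] =
  line.split('~') match {
    case Array(code, id, name, shortName) =>
      id.trim.toIntOption.map(n => Company(code.trim, n, name.trim, shortName.trim))
    case _ => None
  }

val company = parseCompany("CS~84~Jimmys Bistro~Jimmys")
println(company)  // Some(Company(CS,84,Jimmys Bistro,Jimmys))

// The tilde is not a regex metacharacter, so the String overload works too
println("CS~84".split("~").mkString(","))  // CS,84
```

With a collection of lines, dataset.flatMap(parseCompany) would skip any line that fails to parse instead of throwing.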