I am new to Scala, so feel free to point me to the documentation, but I was not able to find an answer to this question in my research.
I am using Scala 2.11.8 with Spark 2.2 and trying to build a dynamic string of the form dateString1_dateString2 (joined by an underscore) using interpolation, but I am having some issues.
val startDt = "20180405"
val endDt = "20180505"
This seems to work:
s"$startDt$endDt"
res62: String = 2018040520180505
But this fails:
s"$startDt_$endDt"
<console>:27: error: not found: value startDt_
s"$startDt_$endDt"
^
I expected this simple workaround with an escape to work, but it does not produce the desired result:
s"$startDt\\_$endDt"
res2: String = 20180405\_20180505
Note that this question differs from Why can't _ be used inside of string interpolation? in that this question is looking to find a workable string interpolation solution while the previous question is much more internals-of-scala focused.
You can be explicit using curly braces:
s"${startDt}_${endDt}"
res11: String = "20180405_20180505"
Your code:
s"$startDt_$endDt"
fails because startDt_ is a valid identifier, so Scala tries to interpolate that nonexistent variable.
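A minimal sketch of the workable forms, using the values from the question. The object name DateRange and the mkString variant are illustrative additions, not part of the original answer.

```scala
object DateRange {
  val startDt = "20180405"
  val endDt = "20180505"

  // Braces end the identifier at startDt, so the underscore is literal text.
  val braced: String = s"${startDt}_${endDt}"

  // Only the first name needs braces; nothing follows $endDt that could extend it.
  val minimal: String = s"${startDt}_$endDt"

  // No interpolation at all: join the parts with a separator.
  val joined: String = Seq(startDt, endDt).mkString("_")
}
```

All three values hold the same string; the braced form is the one that stays safe when the text after the variable starts with a letter, digit, or underscore.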
I have a Spark Scala DataFrame with a column "Name".
I have extracted the values of that column into a Scala Array[String]:
org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
I want to replace whitespace with _ and encode the value as UTF-8 (any encoding is fine as long as it replaces special characters with something else),
so that any special characters are removed. Later I want to use those values in a file path.
var org_name = orgsFlatDF.rdd.collect
.map( _.getString(2))
This is how I am extracting those values. I haven't found any method that does this; replace and replaceAll don't work on an array.
I tried this:
org_name.replace("\\s", "")
That didn't work.
Expected output: SARATOGA_SENIOR_HIGH_SCHOOL
If the name is new $ high school, it should be converted to new_$_high_school and then encoded to new_%24_high_school.
There are a couple of issues with what you are asking.
Java/Scala Arrays don't have a replace method. Even if they did have a replace method, would they replace the values they hold or the characters in a String they hold?
Let's assume the line org_name.replace("\\s", "") didn't compile and org_name is indeed an Array[String] holding one element.
scala> val org_name=Array("SARATOGA SENIOR HIGH SCHOOL")
val org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
scala> org_name(0).replace(" ","_")
val res15: String = SARATOGA_SENIOR_HIGH_SCHOOL
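replace("\\s","_") wouldn't work because String.replace treats its first argument as a literal, and "\\s" is the two-character string \s. "\\" represents a single \; escaping the backslash is the only way to write one in a string literal that can also contain escape codes like \n or \t. To match whitespace as a regular expression, use replaceAll("\\s", "_") instead.
PS: to transform all the strings in the array, use org_name.map(_.replace(" ","_")); this gives you back another array.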
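The question also asks for the result to be encoded. One possible sketch, assuming java.net.URLEncoder is an acceptable encoder for the file-path use case (the answer above does not cover encoding); the object name OrgNames and the helper toPathSegment are hypothetical.

```scala
import java.net.URLEncoder

object OrgNames {
  // Hypothetical helper: collapse each run of whitespace into "_",
  // then percent-encode whatever special characters remain.
  def toPathSegment(name: String): String =
    URLEncoder.encode(name.replaceAll("\\s+", "_"), "UTF-8")

  val orgName: Array[String] =
    Array("SARATOGA SENIOR HIGH SCHOOL", "new $ high school")

  // map applies the helper to every element and returns a new array.
  val cleaned: Array[String] = orgName.map(toPathSegment)
}
```

With these inputs, cleaned holds SARATOGA_SENIOR_HIGH_SCHOOL and new_%24_high_school. The whitespace is replaced first because URLEncoder would otherwise turn each space into +.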
I have the following lines in the REPL:
scala> val accountID = "123"
accountID: String = 123
scala> s"{\"AccountID\":\"$accountID\", \"ProcessMessage\":\"true\", \"Reason\":\"Integration Test Message\"}"
<console>:1: error: ';' expected but string literal found.
s"{\"AccountID\":\"$accountID\", \"ProcessMessage\":\"true\", \"Reason\":\"Integration Test Message\"}"
^
I assume it's some small silly quotations thing, but I still want to understand what I am doing wrong here. If I put the account ID directly it evaluates fine.
Use triple quotes and remove the backslashes:
scala> s"""{"AccountID":"${accountID}", "ProcessMessage":"true", "Reason":"Integration Test Message"}"""
res6: String = {"AccountID":"123", "ProcessMessage":"true", "Reason":"Integration Test Message"}
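The same idea wrapped in a small function, as a sketch. The names JsonMessage and build are hypothetical, and the value is inserted as-is with no JSON escaping, so this only suits inputs that are known to be safe.

```scala
object JsonMessage {
  // Inside triple quotes a double quote is ordinary text, so no backslashes
  // are needed; $accountID is still interpolated by the s prefix.
  def build(accountID: String): String =
    s"""{"AccountID":"$accountID", "ProcessMessage":"true", "Reason":"Integration Test Message"}"""
}
```

JsonMessage.build("123") yields the same string as res6 above. For anything beyond test fixtures, a JSON library would be the safer choice.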
Suppose I want to take characters from a string in Scala but have the toInt conversion behave as it would on a string rather than on a character.
To illustrate, the following code behaves like so:
"0".toInt // results in 0
"000".charAt(0).toInt // results in 48
I'd like a version of the second line that would also result in 0. I have a solution like the following:
"000".charAt(0).toString.toInt // results in 0
But I wonder if there is a more direct or better way?
You can use asDigit:
val i: Int = "000".charAt(0).asDigit
You can do:
"000".substring(0, 1).toInt
But I'm not sure it's more "direct" than "000".charAt(0).toString.toInt
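A short side-by-side sketch of the options mentioned above; the object name DigitDemo is illustrative. Note that asDigit also accepts letters (it works in radix 36) and returns -1 for characters that are not digits at all, which may or may not be what you want.

```scala
object DigitDemo {
  val c: Char = "000".charAt(0)

  val code: Int = c.toInt            // character code of '0'
  val digit: Int = c.asDigit         // numeric value of the digit
  val viaString: Int = c.toString.toInt
  val viaSubtraction: Int = c - '0'  // only valid for '0' to '9'

  val hexLetter: Int = 'f'.asDigit   // letters count as digits in higher radixes
  val notADigit: Int = '?'.asDigit   // -1 signals "not a digit"
}
```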
I want to implement a Scala-style string interpolation in Scala. Here is an example,
val str = "hello ${var1} world ${var2}"
At runtime I want to replace "${var1}" and "${var2}" with some runtime strings. However, when trying to use Regex.replaceAllIn(target: CharSequence, replacer: (Match) ⇒ String), I ran into the following problem:
import scala.util.matching.Regex
val placeholder = new Regex("""(\$\{\w+\})""")
placeholder.replaceAllIn(str, m => s"A${m.matched}B")
java.lang.IllegalArgumentException: No group with name {var1}
at java.util.regex.Matcher.appendReplacement(Matcher.java:800)
at scala.util.matching.Regex$Replacement$class.replace(Regex.scala:722)
at scala.util.matching.Regex$MatchIterator$$anon$1.replace(Regex.scala:700)
at scala.util.matching.Regex$$anonfun$replaceAllIn$1.apply(Regex.scala:410)
at scala.util.matching.Regex$$anonfun$replaceAllIn$1.apply(Regex.scala:410)
at scala.collection.Iterator$class.foreach(Iterator.scala:743)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1174)
at scala.util.matching.Regex.replaceAllIn(Regex.scala:410)
... 32 elided
However, when I removed '$' from the regular expression, it worked:
val placeholder = new Regex("""(\{\w+\})""")
placeholder.replaceAllIn(str, m => s"A${m.matched}B")
res2: String = hello $A{var1}B world $A{var2}B
So my question is whether this is a bug in Scala's Regex. And if so, are there other elegant ways to achieve the same goal (other than a brute-force replaceAllLiterally on every placeholder)?
$ is treated specially in the replacement string. This is described in the documentation of replaceAllIn:
In the replacement String, a dollar sign ($) followed by a number will be interpreted as a reference to a group in the matched pattern, with numbers 1 through 9 corresponding to the first nine groups, and 0 standing for the whole match. Any other character is an error. The backslash (\) character will be interpreted as an escape character and can be used to escape the dollar sign. Use Regex.quoteReplacement to escape these characters.
(Actually, that doesn't mention named group references, so I guess it's only sort of documented.)
Anyway, the takeaway here is that you need to escape the $ characters in the replacement string if you don't want them to be treated as references.
new scala.util.matching.Regex("""(\$\{\w+\})""")
.replaceAllIn("hello ${var1} world ${var2}", m => s"A\\${m.matched}B")
// "hello A${var1}B world A${var2}B"
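For the original goal of substituting runtime values, one possible sketch uses Regex.quoteReplacement, which the documentation quoted above recommends, so that neither the looked-up value nor an unmatched placeholder is read as a group reference. The object name Template, the method render, and the choice to leave unknown placeholders untouched are assumptions made for illustration.

```scala
import scala.util.matching.Regex

object Template {
  // Group 1 captures the name between "${" and "}".
  private val placeholder = new Regex("""\$\{(\w+)\}""")

  // Replace each ${name} with values(name); unknown names are left as-is.
  // quoteReplacement escapes any $ or \ in the replacement text.
  def render(template: String, values: Map[String, String]): String =
    placeholder.replaceAllIn(template, m =>
      Regex.quoteReplacement(values.getOrElse(m.group(1), m.matched)))
}
```

For example, Template.render("hello ${var1} world ${var2}", Map("var1" -> "big", "var2" -> "$5")) gives hello big world $5, even though the second value contains a dollar sign.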
It's hard to tell what behavior you're expecting. The issue is that s"${m.matched}" turns into "${var1}" (and "${var2}"). In a replacement string, '$' is a special character that says "place the group with name {var1} here instead".
For example:
scala> placeholder.replaceAllIn(str, m => "$1")
res0: String = hello ${var1} world ${var2}
It replaces the match with the first capturing group (which is m itself).
It's hard to tell exactly what you're doing, but you could escape any $ like so:
scala> placeholder.replaceAllIn(str, m => s"${m.matched.replace("$","\\$")}")
res1: String = hello ${var1} world ${var2}
If what you really want to do is evaluate var1/var2 against variables in the local scope of the method, that's not possible. In fact, the s"Hello, $name" pattern is actually converted into new StringContext("Hello, ", "").s(name) at compile time.
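To make that rewrite concrete, here is a small sketch comparing the two forms; the object name Desugar is illustrative. The variable name has to be known when the code is compiled, which is why a placeholder that only appears in a runtime string cannot be resolved this way.

```scala
object Desugar {
  val name = "world"

  // What you write.
  val interpolated: String = s"Hello, $name"

  // Roughly what the interpolator call amounts to: the literal parts
  // go to StringContext and the expressions are passed to s.
  val explicit: String = StringContext("Hello, ", "").s(name)
}
```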
I have the following code:
object testLines extends App {
val items = Array("""a-b-c d-e-f""","""a-b-c th-i-t""")
val lines = items.map(_.replaceAll("-", "")split("\t"))
print(lines.map(_.mkString(",")).mkString("\n"))
}
By mistake I did not put a dot between replaceAll and split, but it worked.
By contrast, when I put a dot between replaceAll and split, I got an error:
identifier expected but ';' found.
Implicit conversions found: items =>
What is going on?
Why does it work without a dot but not with a dot?
Update:
It works also with dot. The error message is a bug in the scala ide. The first part of the question is still valid
Thanks,
David
You have just discovered that Operators are methods. x.split(y) can also be written x split y in cases where the method is operator-like and it looks nicer. However there is nothing stopping you putting either side in parentheses like x split (y), (x) split y, or even (x) split (y) which may be necessary (and is a good idea for readability even if not strictly necessary) if you are passing in a more complex expression than a simple variable or constant and need parentheses to override the precedence.
With the example code you've written, it's not a bad idea to do the whole thing in operator style for clarity, using parentheses only where the syntax requires and/or they make groupings more obvious. I'd probably have written it more like this:
object testLines extends App {
val items = Array("a-b-c d-e-f", "a-b-c th-i-t")
val lines = items map (_ replaceAll ("-", "") split "\t")
print(lines map (_ mkString ",") mkString "\n")
}
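A sketch showing that the dotted, missing-dot, and fully infix spellings produce the same result. The sample strings use an explicit \t between the two fields, which is an assumption about the original data, and the object name InfixDemo is illustrative.

```scala
object InfixDemo {
  val items = Array("a-b-c\td-e-f", "a-b-c\tth-i-t")

  // Ordinary method-call syntax.
  val dotted = items.map(_.replaceAll("-", "").split("\t"))

  // The accidental form from the question: with no dot, split is read as an
  // infix operator applied to the result of replaceAll.
  val missingDot = items.map(_.replaceAll("-", "")split("\t"))

  // Infix notation throughout.
  val infix = items map (_ replaceAll ("-", "") split "\t")

  // Arrays compare by reference, so convert to lists for comparison.
  def asLists(xs: Array[Array[String]]): List[List[String]] =
    xs.map(_.toList).toList
}
```

Newer Scala versions may warn about infix use of alphanumeric methods that are not declared for it, so the dotted form is the conservative choice in shared code.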