Scala - write Windows file paths that contain spaces as string literals - scala

I need to make some Windows file paths that contain spaces into string literals in Scala. I have tried wrapping the entire path in double quotes AND wrapping the entire path in double quotes with each directory name that has a space with single quotes. Now it is wanting an escape character for "\Jun" in both places and I don't know why.
Here are the strings:
val input = "R:\'Unclaimed Property'\'CT State'\2015\Jun\ct_finderlist_2015.pdf"
val output = "R:\'Unclaimed Property'\'CT State'\2015\Jun"
Here is the latest error:

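The problem is the \ character: inside a regular string literal a backslash starts an escape sequence, so every literal backslash has to be written as \\. The single quotes around the directory names are not needed, because spaces are fine inside a string literal.
This should work:
val input = "R:\\Unclaimed Property\\CT State\\2015\\Jun\\ct_finderlist_2015.pdf"
val output = "R:\\Unclaimed Property\\CT State\\2015\\Jun"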
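To check that an escaped literal really holds single backslashes, you can print it and compare it with other spellings of the same path. This is only a minimal sketch: the object name and the `state` variable are made up for illustration, and the path is shortened to the output directory from the question.

```scala
object WindowsPathLiterals {
  // Regular literal: each backslash in the path is doubled.
  val escaped: String = "R:\\Unclaimed Property\\CT State\\2015\\Jun"

  // Triple-quoted literal: backslashes are taken as-is, no escaping needed.
  val tripleQuoted: String = """R:\Unclaimed Property\CT State\2015\Jun"""

  // raw interpolator: no escape processing, but $name is still substituted.
  // `state` is a hypothetical variable used only for this illustration.
  val state: String = "CT State"
  val interpolated: String = raw"""R:\Unclaimed Property\$state\2015\Jun"""

  def main(args: Array[String]): Unit = {
    println(escaped)                 // the printed path shows single backslashes
    println(escaped == tripleQuoted) // true
    println(escaped == interpolated) // true
  }
}
```

All three values hold the same characters; the difference is only in how the source code spells them.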

A cleaner way to write such string literals is to use triple quotes.
Inside triple quotes, backslashes are taken literally, so nothing needs to be escaped. A triple-quoted string can also span multiple lines.
It's much easier to write and read.
For example:
val input = """R:\Unclaimed Property\CT State\2015\Jun\ct_finderlist_2015.pdf"""
To insert a variable into the string, use "$variableName" with the raw interpolator. The s interpolator also processes escape sequences, so the backslashes in the path would cause problems with it, while raw leaves them alone:
val input = raw"""R:\Unclaimed Property\$variablePath\CT State\2015\Jun\ct_finderlist_2015.pdf"""

Related

How do i replace whitespace with underscore and encode values in scala array / list

I have a spark scala dataframe which has column "Name"
I have extracted the values of that column in to scala array[string]
org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
I want to replace whitespaces with _ and encode that value in to utf-8 (any encoding is fine as long as it replaces special chars with something else)
so if there are any special chars those will be removed. later i want to use those in file path .
var org_name = orgsFlatDF.rdd.collect
.map( _.getString(2))
This is how i am extracting those vals ^^. I haven't found any method which I can use to do that. Replace or replaceall doesn't work on array
I tried this :
org_name.replace("\\s", "")
That didn't work .
Expected output : SARATOGA_SENIOR_HIGH_SCHOOL
If the name is: new $ high school, it should get converted to new_$_high_school and then encoded to new_%24_high_school
There are a couple of issues with what you are asking.
Java/Scala Arrays don't have a replace method. Even if they did have a replace method, would they replace the values they hold or the characters in a String they hold?
Let's assume the line org_name.replace("\\s", "") didn't compile and org_name is indeed an Array[String] holding one element.
scala> val org_name=Array("SARATOGA SENIOR HIGH SCHOOL")
val org_name: Array[String] = Array(SARATOGA SENIOR HIGH SCHOOL)
scala> org_name(0).replace(" ","_")
val res15: String = SARATOGA_SENIOR_HIGH_SCHOOL
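replace("\\s","_") wouldn't work because replace treats its first argument as literal text, not as a regex, so "\\s" stands for the two characters \s. "\\" represents a single \; doubling it is the only way to tell a literal backslash apart from escape codes like \n or \t. If you do want regex matching, use replaceAll("\\s", "_") instead.
PS: to transform all the strings in the array, use org_name.map(_.replace(" ","_")); this gives you back another array.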
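The encoding step from the question can be sketched on top of that. This assumes java.net.URLEncoder is an acceptable encoder for your purpose (it is meant for URL form data, so whether its output suits your file paths is something to check); the object and method names are hypothetical.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

object SafeNames {
  // Replace each run of whitespace with one underscore, then percent-encode
  // whatever special characters remain (for example $ becomes %24).
  def toSafeName(name: String): String = {
    val underscored = name.trim.replaceAll("\\s+", "_")
    URLEncoder.encode(underscored, StandardCharsets.UTF_8.name())
  }

  def main(args: Array[String]): Unit = {
    // Sample values taken from the question.
    val orgNames = Array("SARATOGA SENIOR HIGH SCHOOL", "new $ high school")
    orgNames.map(toSafeName).foreach(println)
    // SARATOGA_SENIOR_HIGH_SCHOOL
    // new_%24_high_school
  }
}
```

Here replaceAll is used on purpose, since "\\s+" is a regex; map applies the function to every element and returns a new array.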

How do you expand one literal of regex into multiple lines?

For example, I have a regex string:
val myRegex:Regex = "blahblah".r
but if the 'blahblah' is like more than thousand characters long, I want to split them into multiple lines so I can read easier. like so:
val myRegex:Regex = "blah".r
+ "blah".r
this does not work because value unary_+ is not a member of scala.util.matching.Regex.
is there a proper way?
One possible solution:
val myRegex:Regex =
"""a
|very
|long
|pattern
|"""
.stripMargin
.replaceAll("\n", "")
.r
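Two other approaches may be worth trying. The sketch below uses a made-up date pattern purely for illustration: one variant concatenates the string pieces before calling .r once, the other relies on the regex comments flag (?x), which makes the engine ignore whitespace and # comments inside the pattern.

```scala
import scala.util.matching.Regex

object LongRegex {
  // Variant 1: join the String pieces first, then call .r on the result.
  val concatenated: Regex =
    ("""(\d{4})-""" +
      """(\d{2})-""" +
      """(\d{2})""").r

  // Variant 2: (?x) turns on comments mode, so line breaks, spaces and
  // trailing # comments in the pattern are ignored by the regex engine.
  val commented: Regex =
    """(?x)
      |(\d{4}) -   # year
      |(\d{2}) -   # month
      |(\d{2})     # day
      |""".stripMargin.r

  def main(args: Array[String]): Unit = {
    "2015-06-30" match {
      case commented(year, month, day) => println(s"$year / $month / $day")
      case _                           => println("no match")
    }
  }
}
```

Note that in comments mode a literal space or # in the pattern has to be escaped, so this variant suits patterns that don't depend on literal whitespace.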

Scala string formating exercises error: not compiling

I am working on the exercises from https://www.scala-exercises.org/std_lib/formatting
For the following question, my answer seems incorrect but I do not know why.
val c = 'a' //unicode for a
val d = '\141' //octal for a
val e = '\"'
val f = '\\'
"%c".format(c) should be("a") //my answers
"%c".format(d) should be("a")
"%c".format(e) should be(")
"%c".format(f) should be(\)
Your answer should be enclosed in quotes:
"%c".format(e) should be("\"")
"%c".format(f) should be("\\")
because it won't be recognized as a string unless it's enclosed in quotes.
Your last two lines are invalid Scala code and cannot be compiled:
// These are wrong
"%c".format(e) should be(")
"%c".format(f) should be(\)
The be() function needs to be passed a String, and neither of those calls are being passed a String. A String needs to start and end with a double-quote (there are some exceptions).
// In this case you started a String with a double-quote, but you are never
// closing the string with a second double-quote
"%c".format(e) should be(")
// In this case you are missing both double-quotes
"%c".format(f) should be(\)
In this case the code should be:
"%c".format(e) should be("\"")
"%c".format(f) should be("\\")
If you want a character to be treated literally in a String, you need to "escape" it with a backslash. So if you want to literally show a double-quote, you need to prefix it with a backslash:
\"
And as a String:
"\""
Similarly for a backslash:
\\
As a String:
"\\"
Using an IDE makes this easier to see. Using IntelliJ the String is green but the special non-literal characters are highlighted in orange.
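If you want to check this outside the exercise page, a small standalone sketch like the following can be run, with plain comparisons instead of the ScalaTest matchers. One assumption to flag: the octal escape '\141' from the exercise is replaced here with the Unicode escape for the same character, since as far as I know recent Scala versions no longer accept octal escapes.

```scala
object CharFormatting {
  val c = 'a'      // plain character literal
  val d = '\u0061' // Unicode escape for the letter a (stands in for the octal form)
  val e = '\"'     // escaped double quote
  val f = '\\'     // escaped backslash

  def main(args: Array[String]): Unit = {
    println("%c".format(c) == "a")  // true
    println("%c".format(d) == "a")  // true
    println("%c".format(e) == "\"") // true: a one-character String holding "
    println("%c".format(f) == "\\") // true: a one-character String holding \
  }
}
```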
Check quote signs.
https://www.tutorialspoint.com/scala/scala_strings.htm
https://docs.scala-lang.org/overviews/core/string-interpolation.html
https://learnxinyminutes.com/docs/scala/
You can run Scala code online and check yourself here:
https://scastie.scala-lang.org
https://ideone.com/

remove pipe delimiter from data using spark

I am new to Spark. I am using Scala to split a pipe-delimited file and save it in HDFS without the pipe delimiters; for that I have written this code.
object WordCount {
def main(args: Array[String])
{
val textfile = sc.textFile("/user/cloudera/xxxx/xxxx")
val word = textfile.map( l => l.split("|"))
word.saveAsTextFile("/user/cloudera/xxxxx/Sparktest")
}
}
When I execute it I don't get any errors, but in HDFS I get the data below.
[Ljava.lang.String;@10ed847f
[Ljava.lang.String;@4316ebe
[Ljava.lang.String;@495d7e18
[Ljava.lang.String;@19017f49
[Ljava.lang.String;@314b9e72
[Ljava.lang.String;@5b8f67a6
[Ljava.lang.String;@23ddf240
[Ljava.lang.String;@404b5a25
[Ljava.lang.String;@130b541d
[Ljava.lang.String;@4cbf45af
[Ljava.lang.String;@21780b86
[Ljava.lang.String;@503c9b94
[Ljava.lang.String;@3b0a3ab3
i don't know what i am doing wrong.
Please help
That's because you are splitting each string into an Array of Strings, and saveAsTextFile writes each array using its default toString. To save as a text file, you'll need to use mkString(",") if you wish to concatenate with a comma. But I don't see any purpose in that.
If you want to replace the pipe separator by a comma, you can use _.replaceAll("\\|", ",") instead and save it:
val word = textfile.map(_.replaceAll("\\|",",").replaceFirst(",","").trim)
word.saveAsTextFile("/user/cloudera/xxxxx/Sparktest")
PS : You can replace the comma with anything you want e.g a whitespace, a word, etc.
So why does the pipe need to be escaped?
A string split expects a regular expression argument. An unescaped | is parsed as a regex meaning "empty string or empty string," which isn't what you mean.
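The difference is easy to try on a plain String, without Spark. The sketch below uses a made-up sample line and shows three ways to split on a literal pipe, plus mkString to turn the array back into one line of text.

```scala
import java.util.regex.Pattern

object PipeSplit {
  val line: String = "a|b|c" // hypothetical sample record

  // Escaped regex: the backslash makes | a literal character.
  val byEscapedRegex: Array[String] = line.split("\\|")

  // Char overload provided by Scala: no regex involved at all.
  val byChar: Array[String] = line.split('|')

  // Pattern.quote builds a regex that matches the given text literally.
  val byQuoted: Array[String] = line.split(Pattern.quote("|"))

  // mkString joins the fields into a String again, which is what you
  // want to write out instead of the array itself.
  val joined: String = byChar.mkString(",")

  def main(args: Array[String]): Unit = {
    println(byEscapedRegex.toList) // List(a, b, c)
    println(joined)                // a,b,c
  }
}
```

In the Spark job the same idea would be applied inside map, for example mapping each line to its split-and-rejoined form before calling saveAsTextFile.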

Use variable with quotes with system in MATLAB

I have
myVar.value = 123521#machine OK
now I'm using this variable with system command as it's an argument passed to a binary .exe
so I have to add quotes to myVar.value as it contains spaces
I tried :
'''myVar.value''' but this will give 'myVar.value', whereas I just want to have the result equal to "123521#machine OK"
how could I use the quotes in this case ?
Try this:
x = ['"' myVar.value '"']
I think you can use double-quote characters directly within strings delimited by single quotes. Single-quote characters can be included in such a string by doubling them up:
x = ['''' myVar.value '''']