Scala: String Chomp

Does Scala have an API to do a "chomp" on a String?
Preferably, I would like to convert the string "abcd \n" to "abcd".
Thanks
Ajay

There's java.lang.String.trim(), but that also removes leading whitespace. There's also RichString.stripLineEnd, but that only strips a trailing line terminator (\n or \r\n), not trailing spaces.

If you don't want to use Apache Commons Lang, you can roll your own, along these lines.
scala> def chomp(text: String) = text.reverse.dropWhile(" \n\r".contains(_)).reverse
chomp: (text: String)String
scala> "[" + chomp(" a b cd\r \n") + "]"
res28: java.lang.String = [ a b cd]
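A variant of the same idea that avoids reversing the string twice could look like the sketch below. The object name Chomp and the Trailing set are illustrative assumptions; the set simply mirrors the " \n\r" characters used above.

```scala
object Chomp {
  // Characters treated as trailing noise (assumption: same set as " \n\r" above).
  private val Trailing = Set(' ', '\n', '\r')

  // Keeps everything up to and including the last character that is not in Trailing.
  def chomp(text: String): String = {
    val lastKept = text.lastIndexWhere(c => !Trailing(c)) // -1 if every char is trailing noise
    text.substring(0, lastKept + 1)
  }
}
```

Leading whitespace is left alone, which is the difference from trim().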

There is in fact out-of-the-box support for chomp [1]:
scala> val input = "abcd\n"
input: java.lang.String =
abcd
scala> "[%s]".format(input)
res2: String =
[abcd
]
scala> val chomped = input.stripLineEnd
chomped: String = abcd
scala> "[%s]".format(chomped)
res3: String = [abcd]
[1] For some definition of chomp; this is really the same answer as sepp2k's, but it shows how to use stripLineEnd on a String.

Why not use Apache Commons Lang and its StringUtils.chomp() function? One of the great things about Scala is that you can leverage existing Java libraries.

Related

Error in code Regex

I am trying to find only the words that contain a given letter exactly 3 times (the letter e in the example below), and I need to do it with a regex.
val inputString = """edepak,suman,employdee,eeeee,eme,ev"""
I have written the code below.
val numberPatteren = "([a-z]*e){3,}".r
But I am getting the output below, which is not what I expected.
employdee,eeeee
The output should be only employdee.
Can you please help me with this?
You can achieve that simply by doing the following
scala> inputString.split(",").filter(word => word.count(_ == 'e') == 3).mkString(",")
//res16: String = employdee
If you want to use regex, you can do as below
scala> val numberPatteren = "[a-df-zA-DF-Z0-9]".r
//numberPatteren: scala.util.matching.Regex = [a-df-zA-DF-Z0-9]
scala> inputString.split(",").filter(numberPatteren.replaceAllIn(_, "").length == 3).mkString(",")
//res0: String = employdee
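The pattern in the question matches eeeee because {3,} means "three or more" and [a-z]* can itself consume e characters. A regex-only alternative is to exclude e from the filler characters and require a full-string match. The sketch below is one way to do that; the object and method names are hypothetical.

```scala
object ExactlyThreeE {
  // Zero or more non-'e' characters, then exactly three groups of
  // "one 'e' followed by zero or more non-'e' characters".
  private val pattern = "[^e]*(?:e[^e]*){3}"

  // String.matches succeeds only when the whole word matches the pattern,
  // so words with fewer or more than three 'e' characters are rejected.
  def select(input: String): String =
    input.split(",").filter(_.matches(pattern)).mkString(",")
}
```

Calling ExactlyThreeE.select(inputString) on the input from the question keeps only employdee.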

Convert \\ to \ in Scala

Let's say that I have a string: "\\u2026". I want to change that to "\u2026" so that it prints the Unicode character in Scala. Is there a way to do that? Thank you for your time.
Edit:
Let me clarify. Due to some circumstances, I have a string like: "Following char is in unicode: \\u2026", which prints:
Following char is in unicode: \u2026
But, I want to edit it so that it prints:
Following char is in unicode: …
Thank you for the answers. This is what I ended up doing.
def FixString(string: String): String = {
  var newString = string
  // Find the first escaped sequence
  var start = string.indexOf("\\u")
  while (start != -1) {
    // Extract the escaped sequence, e.g. \u2026 (assumes four hex digits follow)
    val end = start + 6
    val wrongString = string.substring(start, end)
    // Convert the hex digits to the character they encode
    val hexCode = wrongString.substring(2)
    val intCode = Integer.parseInt(hexCode, 16)
    val finalString = new String(Character.toChars(intCode))
    // Replace in the accumulated result, not in the original string,
    // so that earlier replacements are not lost
    newString = newString.replace(wrongString, finalString)
    // Find the next escaped sequence
    start = string.indexOf("\\u", end)
  }
  newString
}
If you know the string is exactly \uXXXX (unescaped), then
val stringWithBackslash = "\\u2026" // just for example
val hexCode = stringWithBackslash.substring(2) // "2026"
val intCode = Integer.parseInt(hexCode, 16) // 8230
val finalString = new String(Character.toChars(intCode)) // "…"
(code adapted from Creating Unicode character from its number). If not, pick the part you want with a regular expression such as """\\u([0-9a-fA-F]{4})""" (the digits are hexadecimal, so \d alone would miss escapes like \u00e9).
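Combining the two ideas above (a regular expression to find each escape, then Integer.parseInt to decode it) might look like the following sketch. The object name UnescapeUnicode is hypothetical, and the code assumes every escape is a backslash, a u, and exactly four hex digits.

```scala
import scala.util.matching.Regex

object UnescapeUnicode {
  // A literal backslash, 'u', then four hex digits captured as group 1.
  private val Escape: Regex = """\\u([0-9a-fA-F]{4})""".r

  def unescape(s: String): String =
    Escape.replaceAllIn(s, m => {
      val decoded = Integer.parseInt(m.group(1), 16).toChar.toString
      // quoteReplacement keeps '$' and '\' in the decoded text from being
      // interpreted as group references by replaceAllIn.
      Regex.quoteReplacement(decoded)
    })
}
```

With the string from the question, UnescapeUnicode.unescape("Following char is in unicode: \\u2026") yields the text ending in the … character.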
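Short answer to the question as asked: use the String.replace method:
"\\u2026".replace("\\\\", "\\")
Notice that I had to double each backslash because the backslash character also begins Java String escape sequences. The call only changes anything if the runtime string really contains two consecutive backslashes; the literal "\\u2026" holds a single backslash, so it comes back unchanged.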
If you want the JVM to perform UTF-8 IO (not required for this question), set the Java system property file.encoding=UTF-8, like this:
$ sbt console
Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_151).
Type in expressions for evaluation. Or try :help.
scala> System.setProperty("file.encoding","UTF-8")
res0: String = UTF-8
scala> val strWithError: String = "\\u2026"
strWithError: String = \u2026
scala> val prefixedString: String = strWithError.replace("\\\\", "\\") // corrected string as per OP
prefixedString: String = \u2026
Here is bonus information, adapted from https://stackoverflow.com/a/16034658/553865 (referenced by Alexey Romanov's answer):
scala> val utfString: String = strWithError.replace("\\u", "") // utf code point
utfString: String = 2026
scala> val intCode = Integer.parseInt(utfString, 16)
intCode: Int = 8230
scala> val symbol = new String(Character.toChars(intCode))
symbol: String = …

How to eval a string val in Scala?

I have a Scala expression stored in a String variable:
val myExpr = "(xml \\ \"node\")"
How do I execute this?
s"${myExpr}"
Right now it only gives me the string contents.
What I'm trying to achieve is parsing user string input in the form:
"/some/node/in/xml"
and get that corresponding node in Scala:
(xml \ "node" \ "in" \ "xml")
For the REPL, my init includes:
implicit class interpoleter(val sc: StringContext) {def i(args: Any*) = $intp interpret sc.s(args: _*) }
with which
scala> val myExpr = "(xml \\ \"node\")"
myExpr: String = (xml \ "node")
scala> val xml = <x><node/></x>
xml: scala.xml.Elem = <x><node/></x>
scala> i"${myExpr}"
res3: scala.xml.NodeSeq = NodeSeq(<node/>)
res2: scala.tools.nsc.interpreter.IR.Result = Success
because isn't code really just a string, like everything else?
There is probably a more idiomatic way in recent Scala versions, but you can use Twitter's Eval for that:
val i: Int = new Eval()("1 + 1") // => 2
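For the concrete goal stated in the question (turning a path string into a chain of \ selections), evaluating code may not be needed at all. The sketch below folds the path segments over the document instead. It assumes the scala-xml module is on the classpath, and the names XmlPath and select are hypothetical.

```scala
import scala.xml.NodeSeq

object XmlPath {
  // Applies `\` once per non-empty path segment, starting from `root`.
  // "/node/in/xml" therefore behaves like (root \ "node" \ "in" \ "xml").
  def select(root: NodeSeq, path: String): NodeSeq =
    path.split('/').filter(_.nonEmpty).foldLeft(root)(_ \ _)
}
```

If the first segment of the user's path names the root element itself (as "some" appears to in "/some/node/in/xml"), drop that segment before folding, since \ selects children of the root.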

How to split a string by delimiter from the right?

How to split a string by a delimiter from the right?
e.g.
scala> "hello there how are you?".rightSplit(" ", 1)
res0: Array[java.lang.String] = Array(hello there how are, you?)
Python has a .rsplit() method which is what I'm after in Scala:
In [1]: "hello there how are you?".rsplit(" ", 1)
Out[1]: ['hello there how are', 'you?']
I think the simplest solution is to search for the index position and then split based on that. For example:
scala> val msg = "hello there how are you?"
msg: String = hello there how are you?
scala> msg splitAt (msg lastIndexOf ' ')
res1: (String, String) = (hello there how are," you?")
And since someone remarked on lastIndexOf returning -1, that's perfectly fine with the solution:
scala> val msg = "AstringWithoutSpaces"
msg: String = AstringWithoutSpaces
scala> msg splitAt (msg lastIndexOf ' ')
res0: (String, String) = ("",AstringWithoutSpaces)
You could use plain old regular expressions:
scala> val LastSpace = " (?=[^ ]+$)"
LastSpace: String = " (?=[^ ]+$)"
scala> "hello there how are you?".split(LastSpace)
res0: Array[String] = Array(hello there how are, you?)
(?=[^ ]+$) says that we look ahead (?=) for a run of one or more non-space characters ([^ ]+) that extends to the end of the string ($). So the pattern matches only the last space, the one followed by nothing but non-space characters.
This solution won't break if there is only one token:
scala> "hello".split(LastSpace)
res1: Array[String] = Array(hello)
scala> val sl = "hello there how are you?".split(" ").reverse.toList
sl: List[String] = List(you?, are, how, there, hello)
scala> val sr = (sl.head :: (sl.tail.reverse.mkString(" ") :: Nil)).reverse
sr: List[String] = List(hello there how are, you?)
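None of the answers above takes a split count the way Python's rsplit does. A minimal sketch of such a helper, built on lastIndexOf, follows. The names RightSplit and rsplit are hypothetical, and the separator is treated as a literal string, not a regex.

```scala
object RightSplit {
  // Splits `s` on the literal separator `sep` at most `maxSplit` times,
  // scanning from the right; the parts are returned in left-to-right order.
  def rsplit(s: String, sep: String, maxSplit: Int): List[String] = {
    require(sep.nonEmpty, "separator must not be empty")

    @annotation.tailrec
    def loop(rest: String, remaining: Int, acc: List[String]): List[String] = {
      // Stop looking for separators once the split budget is used up.
      val idx = if (remaining > 0) rest.lastIndexOf(sep) else -1
      if (idx < 0) rest :: acc
      else loop(rest.substring(0, idx), remaining - 1, rest.substring(idx + sep.length) :: acc)
    }

    loop(s, maxSplit, Nil)
  }
}
```

Unlike the splitAt version, the separator itself is dropped from the result, so the last element is "you?" with no leading space.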

Is this a bug in Scala 2.10 String Interpolation inside a multiline String with backslash?

Using Scala 2.10.0-RC1, I tried to use String Interpolation inside a Windows file path, e.g. like this:
val path = s"""c:\foo\bar\$fileName.csv"""
And got an Exception
java.lang.StringIndexOutOfBoundsException: String index out of range: 11
Without the multiline string literal (""") and with the backslashes doubled, it works just fine:
val path = s"c:\\foo\\bar\\${fileName}.csv" //> path : String = c:\foo\bar\myFile.csv
Further testing to reproduce the issue:
object wcScala10 {
util.Properties.versionString //> res0: String = version 2.10.0-RC1
val name = "James" //> name : String = James
val test1 = s"Hello $name" //> test1 : String = Hello James
val test2 = s"""Hello $name""" //> test2 : String = Hello James
val test3 = """Hello \$name""" //> test3 : String = Hello \$name
val test4 = s"""Hello \$name""" //> java.lang.StringIndexOutOfBoundsException:
//> String index out of range: 7
}
Is this exception due to a bug, or am I simply not allowed to use a backslash before the $ sign when doing String interpolation?
Here is more of the stacktrace:
java.lang.StringIndexOutOfBoundsException: String index out of range: 7
at java.lang.String.charAt(String.java:686)
at scala.collection.immutable.StringOps$.apply$extension(StringOps.scala:39)
at scala.StringContext$.treatEscapes(StringContext.scala:202)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext.standardInterpolator(StringContext.scala:120)
at scala.StringContext.s(StringContext.scala:90)
at wcScala10$$anonfun$main$1.apply$mcV$sp(wcScala10.scala:9)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$$anonfun$$execute$1.apply$mcV$sp(WorksheetSupport.scala:76)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$.redirected(WorksheetSupport.scala:65)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$.$execute(WorksheetSupport.scala:75)
at wcScala10$.main(wcScal
Output exceeds cutoff limit.
Update:
Now marked as fixed for Scala 2.10.1-RC1
https://issues.scala-lang.org/browse/SI-6631
By the way, even after the fix, the right way to do interpolation and avoid escaping is using raw:
val path = raw"c:\foo\bar\$fileName.csv"
e.g.
val fileName = "myFileName" //> fileName : String = myFileName
val path = raw"c:\foo\bar\$fileName.csv" //> path : String = c:\foo\bar\myFileName.csv
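The two ways of writing the path can be compared side by side, as in the sketch below. The object name WindowsPath and the file name are illustrative assumptions; the point is only that raw keeps backslashes as written, while s processes escapes and so needs each backslash doubled.

```scala
object WindowsPath {
  val fileName = "myFile" // hypothetical file name, for illustration only

  // raw: no escape processing, but $fileName is still substituted.
  val viaRaw: String = raw"c:\foo\bar\$fileName.csv"

  // s: escapes are processed, so every literal backslash is written twice.
  val viaS: String = s"c:\\foo\\bar\\$fileName.csv"
}
```

Both values hold the same text, c:\foo\bar\myFile.csv.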
String interpolation notation takes over control of whether a string is a raw string or not. All that triple-quoting gets you is the ability to include lone " characters without escaping them. If you want no escape processing, use raw"Hi $name" instead. (Except raw is also buggy in 2.10.0; a fix is in for 2.10.1 AFAIK.)
That said, this is not a very friendly way to deal with a malformed string. I'd classify it as a bug, if only because it throws an out-of-bounds exception rather than an error saying that an escape code can't be completed.
Note: these break also:
s"Hi \$name"
s"""Hi \"""