Let's say that I have a string: "\\u2026". I want to change it to "\u2026" so that it prints the Unicode character in Scala. Is there a way to do that? Thank you for your time.
Edit:
Let me clarify. Due to some circumstances, I have a string like: "Following char is in unicode: \\u2026", which prints:
Following char is in unicode: \u2026
But, I want to edit it so that it prints:
Following char is in unicode: …
Thank you for the answers. This is what I ended up doing.
def FixString(string: String) : String = {
var newString = string;
// Find the 1st problematic string
var start = string.indexOf("\\u");
while(start != -1) {
// Extract the problematic string
val end = start + 6;
val wrongString = string.substring(start,end);
// Convert to unicode
val hexCode = wrongString.substring(2);
val intCode = Integer.parseInt(hexCode, 16);
val finalString = new String(Character.toChars(intCode));
// Replace
// Replace in the accumulated result (not in the original), so earlier replacements are kept
newString = newString.replace(wrongString,finalString);
// Find next problematic string
start = string.indexOf("\\u", end);
}
return newString;
}
If you know the string is exactly \uXXXX (unescaped), then
val stringWithBackslash = "\\u2026" // just for example
val hexCode = stringWithBackslash.substring(2) // "2026"
val intCode = Integer.parseInt(hexCode, 16) // 8230
val finalString = new String(Character.toChars(intCode)) // "…"
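(code adapted from Creating Unicode character from its number). If not, pick out the part you want with the regular expression """\\u([0-9a-fA-F]{4})""" (the four digits are hexadecimal, so \d alone would miss escapes that contain the letters a to f).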
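For a string that contains several escapes mixed with ordinary text, the regular expression and the conversion above can be combined. This is only a minimal sketch: the names UnescapeUnicode and unescape are made up for illustration, and it assumes every escape is a backslash, the letter u, and exactly four hexadecimal digits.

```scala
import java.util.regex.Matcher
import scala.util.matching.Regex

object UnescapeUnicode {
  // Matches a literal backslash, the letter u, and exactly four hexadecimal digits.
  private val EscapedUnicode: Regex = """\\u([0-9a-fA-F]{4})""".r

  // Replaces every escape found in `s` with the character it denotes.
  def unescape(s: String): String =
    EscapedUnicode.replaceAllIn(s, (m: Regex.Match) => {
      val codeUnit = Integer.parseInt(m.group(1), 16)
      // quoteReplacement keeps a resulting '$' or backslash from being
      // interpreted as a group reference in the replacement text.
      Matcher.quoteReplacement(codeUnit.toChar.toString)
    })
}
```

Calling UnescapeUnicode.unescape on the string from the question should give the text with the ellipsis character in place of the escape; text without escapes passes through unchanged.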
The short answer to the question as asked is to use the String.replace method:
"\\u2026".replace("\\\\", "\\")
Notice that I had to double each backslash because the backslash character also begins Java String escape sequences.
If you want the JVM to perform UTF-8 IO (not required for this question), set the Java system property file.encoding=UTF-8, like this:
$ sbt console
Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_151).
Type in expressions for evaluation. Or try :help.
scala> System.setProperty("file.encoding","UTF-8")
res0: String = UTF-8
scala> val strWithError: String = "\\u2026"
strWithError: String = \u2026
scala> val prefixedString: String = strWithError.replace("\\\\", "\\") // corrected string as per OP
prefixedString: String = \u2026
Here is bonus information, adapted from https://stackoverflow.com/a/16034658/553865 (referenced by Alexey Romanov's answer):
scala> val utfString: String = strWithError.replace("\\u", "") // utf code point
utfString: String = 2026
scala> val intCode = Integer.parseInt(utfString, 16)
intCode: Int = 8230
scala> val symbol = new String(Character.toChars(intCode))
symbol: String = …
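If the symbol still shows up as a question mark on your console, one option that does not depend on file.encoding is to write through a PrintStream with an explicit charset. The sketch below uses a hypothetical helper, utf8Bytes, and a byte buffer instead of the console so that the encoding can be inspected.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object Utf8Print {
  // Encodes `text` through a PrintStream that is explicitly configured for UTF-8.
  def utf8Bytes(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new PrintStream(buffer, true, "UTF-8")
    out.print(text)
    out.flush()
    buffer.toByteArray
  }
}
```

For real output you would wrap System.out the same way, e.g. new PrintStream(System.out, true, "UTF-8").println(symbol); whether the glyph then renders correctly still depends on the terminal.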
I want to extract the domain name from a URI.
For example, the input to the regular expression may be any of the following:
test.net
https://www.test.net
https://test.net
http://www.test.net
http://test.net
In all of these cases the result should be test.net.
Below is the code I implemented for this purpose:
val re = "([http[s]?://[w{3}\\.]?]+)(.*)".r
But I didn't get the expected result.
Below is my output:
val re(prefix, domain) = "https://www.test.net"
prefix: String = https://www.t
domain: String = est.net
What is the problem with my regular expression, and how can I fix it?
You are using a character class
[http.?://(www.)?]
This means:
either an h
or a t
or a t
or a p
or a .
or a ?
or a :
or a /
or a /
or a (
or a w
or a w
or a w
or a .
or a )
or a ?
It does not include an s, so it will not match https://.
It is not clear to me why you are using a character class here, nor why you are using duplicate characters in the class.
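To make the difference concrete, here is a small sketch that compares a character class with a plain sequence. The object and method names are made up for illustration, and the class used here is a simplified one rather than the exact class from the question.

```scala
object CharClassDemo {
  // A character class matches ONE character from the set; with + it keeps
  // consuming characters from the set, in any order.
  private val asClass = "[https:/]+".r
  // A plain sequence with an optional s matches only the literal scheme prefix.
  private val asSequence = "https?://".r

  def classPrefix(s: String): Option[String] = asClass.findPrefixOf(s)
  def sequencePrefix(s: String): Option[String] = asSequence.findPrefixOf(s)
}
```

For "https://test.net" the class version also swallows the first t of the host, because t is a member of the set, while the sequence version stops right after the two slashes. That is the same kind of over-matching you see in your output.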
Ideally, you shouldn't try to parse URIs yourself; someone else has already done the hard work. You could, for example, use the java.net.URI class:
import java.net.URI
val u1 = new URI("test.net")
u1.getHost
// res: String = null
val u2 = new URI("https://www.test.net")
u2.getHost
// res: String = www.test.net
val u3 = new URI("https://test.net")
u3.getHost
// res: String = test.net
val u4 = new URI("http://www.test.net")
u4.getHost
// res: String = www.test.net
val u5 = new URI("http://test.net")
u5.getHost
// res: String = test.net
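Unfortunately, as you can see, what you want to achieve does not actually comply with the official URI syntax: a bare test.net has no scheme, so it is parsed as a relative path and getHost returns null, and getHost keeps the www. prefix when it is present.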
If you can fix that, then you can use java.net.URI. Otherwise, you will need to go back to your old solution and parse the URI yourself:
// The dot after www is escaped so that it matches only a literal '.'
val re = "(?>https?://)?(?>www\\.)?([^/?#]*)".r
val re(domain1) = "test.net"
//=> domain1: String = test.net
val re(domain2) = "https://www.test.net"
//=> domain2: String = test.net
val re(domain3) = "https://test.net"
//=> domain3: String = test.net
val re(domain4) = "http://www.test.net"
//=> domain4: String = test.net
val re(domain5) = "http://test.net"
//=> domain5: String = test.net
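A middle ground between the two approaches is to normalise the input first and then let java.net.URI do the parsing. The following is a rough sketch under stated assumptions: domainOf is a hypothetical helper, a missing scheme is assumed to mean http, and a leading www. is assumed never to be part of the domain you want.

```scala
import java.net.URI
import scala.util.Try

object DomainName {
  // Returns the host without a leading "www.", or None if the input cannot be parsed.
  def domainOf(input: String): Option[String] = {
    // Assumption: input without a scheme is treated as an http URL.
    val withScheme = if (input.contains("://")) input else "http://" + input
    Try(new URI(withScheme)).toOption
      .flatMap(uri => Option(uri.getHost)) // getHost may return null
      .map(_.stripPrefix("www."))
  }
}
```

With this, all five example inputs from the question should yield Some("test.net"), and malformed input gives None instead of an exception.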
I am trying to find only the words that contain a given letter exactly 3 times (e in the example below), and I need to do this using a regex.
val inputString = """edepak,suman,employdee,eeeee,eme,ev"""
I have written the code below:
val numberPatteren = "([a-z]*e){3,}".r
But I am getting the output below, which is not what I expected:
employdee,eeeee
The output should be only employdee.
Can you please help me with this?
You can achieve that simply by doing the following
scala> inputString.split(",").filter(word => word.count(_ == 'e') == 3).mkString(",")
//res16: String = employdee
If you want to use regex, you can do as below
scala> val numberPatteren = "[a-df-zA-DF-Z0-9]".r
//numberPatteren: scala.util.matching.Regex = [a-df-zA-DF-Z0-9]
scala> inputString.split(",").filter(numberPatteren.replaceAllIn(_, "").length == 3).mkString(",")
//res0: String = employdee
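If you would rather have the regex itself express "exactly three", one possible pattern is three occurrences of e, each followed by any run of non-e characters, matched against the whole word. This is just a sketch; ExactlyThree and wordsWithThreeEs are names I made up.

```scala
object ExactlyThree {
  // Any non-e run, then exactly three groups of "e followed by a non-e run".
  private val ThreeEs = "[^e]*(?:e[^e]*){3}".r

  // Keeps the comma-separated words in which the whole word matches the pattern.
  def wordsWithThreeEs(csv: String): Seq[String] =
    csv.split(",").toSeq.filter(word => ThreeEs.pattern.matcher(word).matches())
}
```

Your original pattern uses {3,}, which means "three or more", and that is why eeeee slipped through. Here the match has to cover the entire word, so a fourth e leaves something unmatched and the word is rejected.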
Given s : String how can I cast the result of
s.first()
into a String ?
You can use the method take as follows:
scala> val s = "abcdef"
s: String = abcdef
scala> val first = s.take(1)
first: String = a
String doesn't have a .first() function. Do you mean .head?
Using head and returning a String is as simple as:
s.head.toString
Another option:
val s : String = "hello"
val first : String = s(0)+""
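One thing to keep in mind with the options above: s.head and s(0) throw on an empty string, while s.take(1) quietly returns an empty string. If the string may be empty, headOption makes that case explicit. A small sketch (firstAsString is a hypothetical name):

```scala
object FirstChar {
  // Some(first character as a String), or None for an empty string.
  def firstAsString(s: String): Option[String] =
    s.headOption.map(_.toString)
}
```

Which variant fits best depends on whether an empty input is an error in your program or just another valid value.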
Using Scala 2.10.0-RC1, I tried to use String Interpolation inside a Windows file path, e.g. like this:
val path = s"""c:\foo\bar\$fileName.csv"""
and got an exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 11
Without the multiline string literal (""") it works just fine
val path = s"""c:\foo\bar\$fileName.csv"""
val path = s"c:\foo\bar\${fileName}.csv" //> path : String = c:\foo\bar\myFile.csv
Further testing to reproduce the issue:
object wcScala10 {
util.Properties.versionString //> res0: String = version 2.10.0-RC1
val name = "James" //> name : String = James
val test1 = s"Hello $name" //> test1 : String = Hello James
val test2 = s"""Hello $name""" //> test2 : String = Hello James
val test3 = """Hello \$name""" //> test3 : String = Hello \$name
val test4 = s"""Hello \$name""" //> java.lang.StringIndexOutOfBoundsException:
//> String index out of range: 7
}
Is this exception due to a bug, or am I simply not allowed to use a backslash before the $ sign when doing string interpolation?
Here is more of the stacktrace:
java.lang.StringIndexOutOfBoundsException: String index out of range: 7
at java.lang.String.charAt(String.java:686)
at scala.collection.immutable.StringOps$.apply$extension(StringOps.scala:39)
at scala.StringContext$.treatEscapes(StringContext.scala:202)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:90)
at scala.StringContext.standardInterpolator(StringContext.scala:120)
at scala.StringContext.s(StringContext.scala:90)
at wcScala10$$anonfun$main$1.apply$mcV$sp(wcScala10.scala:9)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$$anonfun$$execute$1.apply$mcV$sp(WorksheetSupport.scala:76)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$.redirected(WorksheetSupport.scala:65)
at org.scalaide.worksheet.runtime.library.WorksheetSupport$.$execute(WorksheetSupport.scala:75)
at wcScala10$.main(wcScal
Output exceeds cutoff limit.
Update:
Now marked as fixed for Scala 2.10.1-RC1
https://issues.scala-lang.org/browse/SI-6631
By the way, even after the fix, the right way to do interpolation and avoid escaping is using raw:
val path = raw"c:\foo\bar\$fileName.csv"
e.g.
val fileName = "myFileName" //> fileName : String = myFileName
val path = raw"c:\foo\bar\$fileName.csv" //> path : String = c:\foo\bar\myFileName.csv
String interpolation notation takes over control of whether a string is a raw string or not. All that triple-quoting gets you is the ability to include a lone " character without escaping it. If you want no escape processing, use raw"Hi $name" instead; it still substitutes $name but leaves backslashes alone. (Except raw is also buggy in 2.10.0; a fix is in for 2.10.1 AFAIK.)
That said, this is not a very friendly way to deal with a malformed string. I'd classify it as a bug, only because it throws an out-of-bounds exception rather than something that says an escape code can't be completed.
Note: these break also:
s"Hi \$name"
s"""Hi \"""
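For comparison, here is a small sketch of the two spellings that should produce the same Windows path, assuming a Scala version in which SI-6631 is fixed. The object and method names are invented for the example.

```scala
object InterpolatedPaths {
  // raw still substitutes $fileName but leaves every backslash untouched.
  def viaRaw(fileName: String): String = raw"c:\foo\bar\$fileName.csv"

  // s processes escapes, so each literal backslash has to be doubled.
  def viaS(fileName: String): String = s"c:\\foo\\bar\\$fileName.csv"
}
```

Both methods are expected to return the same string; the raw form is simply easier to read when the text is full of backslashes.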
Does Scala have an API to do a "chomp" on a String?
Preferably, I would like to convert a string "abcd \n" to "abcd"
Thanks
Ajay
There's java.lang.String.trim(), but that also removes leading whitespace. There's also RichString.stripLineEnd, but that only removes \n and \r.
If you don't want to use Apache Commons Lang, you can roll your own, along these lines.
scala> def chomp(text: String) = text.reverse.dropWhile(" \n\r".contains(_)).reverse
chomp: (text: String)String
scala> "[" + chomp(" a b cd\r \n") + "]"
res28: java.lang.String = [ a b cd]
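If the double reverse bothers you, a regular expression anchored at the end of the input is another way to roll your own. This is a sketch only: stripTrailing is a made-up name, and unlike the version above it removes every kind of trailing whitespace (tabs included), not just spaces, \n and \r.

```scala
object Chomp {
  // Removes all whitespace at the very end of the input; leading whitespace stays.
  def stripTrailing(text: String): String =
    text.replaceAll("\\s+\\z", "")
}
```

Note that stripLineEnd would leave the space in "abcd \n" behind, because it only drops the line terminator itself.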
There is in fact out-of-the-box support for chomp [1]:
scala> val input = "abcd\n"
input: java.lang.String =
abcd
scala> "[%s]".format(input)
res2: String =
[abcd
]
scala> val chomped = input.stripLineEnd
chomped: String = abcd
scala> "[%s]".format(chomped)
res3: String = [abcd]
[1] For some definition of chomp; really the same answer as sepp2k's, but showing how to use it on a String.
Why not use Apache Commons Lang and the StringUtils.chomp() function? One of the great things about Scala is that you can leverage existing Java libraries.