scala> val st1 = "|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2|||"
scala> st1.split('|').length
resXX: Int = 39
scala> val st2 = "|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2| | |"
scala> st2.split('|').length
resXX: Int = 41
That is, the trailing empty fields are ignored by split.
Is there any solution other than replacing every "||" with "| |"?
The expected output is Int = 41.
Indeed, in the real file I may have lines such as:
"|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2|||150"
That is, a 42nd column containing a number. (In this case the result is Int = 42.)
Every line has the same number of |, but depending on the content of the column, the split('|').length returns a different result! (31, 40, ...,42).
I can understand the lack of the column after the last separator, but not the lack of the previous ones.
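This behaviour comes from Java, since that is where String#split is defined.
As the documentation for String#split explains, with the default limit of 0 the trailing empty strings are discarded.
To keep them, pass a negative limit. The limit parameter belongs to Java's split(String regex, int limit), so give the separator as an escaped regex string: str.split("\\|", -1). Note that this also keeps the empty field after the final separator, so the three sample lines above all yield 42 fields.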
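A minimal sketch of the difference, using a shortened, made-up line rather than the one from the question (the object and method names are only illustrative):

```scala
object SplitDemo {
  // Hypothetical short line: two filled columns followed by three empty ones.
  val line: String = "a|b|||"

  // Default limit (0): trailing empty strings are dropped.
  def withoutLimit(s: String): Array[String] = s.split("\\|")

  // Negative limit: every field is kept, including trailing empty ones.
  def withLimit(s: String): Array[String] = s.split("\\|", -1)

  def main(args: Array[String]): Unit = {
    println(withoutLimit(line).length) // 2
    println(withLimit(line).length)    // 5
  }
}
```

With a negative limit, a line containing n separators always yields n + 1 fields, whatever the columns contain.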
I am using Scala and reading input from the console. I am able to regurgitate the strings that make up each line, but if my input has the following format, how can I access each integer within each line?
2 2
1 2 2
2 1 1
Currently I just regurgitate the input back to the console using
object Main {
  def main(args: Array[String]): Unit = {
    for (ln <- io.Source.stdin.getLines) println(ln)
    // how can I access each individual number within each line?
  }
}
And I need to compile this project like so:
$ scalac main.scala
$ scala Main <input01.txt
2 2
1 2 2
2 1 1
A reasonable algorithm would be:
for each line, split it into words
parse each word into an Int
An implementation of that algorithm:
io.Source.stdin.getLines   // for each line...
  .flatMap(
    _.split("""\s+""")     // split it into words
      .map(_.toInt)        // parse each word into an Int
  )
The result of this expression will be an Iterator[Int]; if you want a Seq, you can call toSeq on that Iterator (if there's a reasonable chance there will be more than 7 or so integers, it's probably worth calling toVector instead). It will blow up with a NumberFormatException if there's a word which isn't an integer. You can handle this a few different ways... if you want to ignore words that aren't integers, you can:
import scala.util.Try
io.Source.stdin.getLines
  .flatMap(
    _.split("""\s+""")
      .flatMap(word => Try(word.toInt).toOption) // drop words that are not integers
  )
The following will give you a flat list of numbers.
val integers = (
  for {
    line <- io.Source.stdin.getLines
    number <- line.split("""\s+""").map(_.toInt)
  } yield number
)
As you can read here, some care must be taken when parsing the numbers.
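If the numbers need to stay grouped by line rather than flattened, one possible sketch is to parse each line into its own Vector[Int]. The names ParseInts and parseLine are made up for this illustration, and silently skipping invalid tokens is an assumption; failing fast may suit some inputs better.

```scala
import scala.util.Try

object ParseInts {
  // Parse one line into its integers; tokens that are not valid Ints are skipped.
  def parseLine(line: String): Vector[Int] =
    line.trim
      .split("\\s+")
      .toVector
      .flatMap(word => Try(word.toInt).toOption)

  def main(args: Array[String]): Unit = {
    // One Vector[Int] per input line, so a number can be reached as rows(i)(j).
    val rows = scala.io.Source.stdin.getLines().map(parseLine).toVector
    rows.foreach(row => println(row.mkString(" ")))
  }
}
```

It can be compiled and run the same way as in the question, with the input redirected from a file.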
I'd like to print a lot of numbers between -1 and 1 and need them to be aligned by the decimal point.
What I get with %2.2f is:
val (a, b) = (0.38, -0.38); println (f"${a}%2.2f\n${b}%2.2f ")
0,38
-0,38
What I'd like to get is:
 0,38
-0,38
Is there an elegant solution?
You can add the -+ flags to the format specifier, like this:
scala> val (a, b) = (0.38, -0.38); println (f"${a}%-+2.2f\n${b}%-+2.2f")
+0.38
-0.38
a: Double = 0.38
b: Double = -0.38
You will get the + before the number though.
EDIT:
If you know the maximum width of the numbers (the first number in %n.mf is the minimum total field width, not the number of digits), you can pad them like this:
scala> printf("%5.2f", a);
 0.38
scala> printf("%5.2f", b);
-0.38
Although there is already an accepted answer, I'll add one more for future reference. Scala's f"" string interpolator uses the Java formatting infrastructure, and in the Java documentation you will find the following flag:
' ' '\u0020' Requires the output to include a single extra space ('\u0020') for non-negative values.
So you might actually want to use it. Here is an example that shows the difference:
val arr = Array(0.38, -0.38, 10.38, -10.38, 123.38, -123.38)
println("Without space:")
arr.foreach(a => println(f"${a}%6.2f"))
println("----------------")
println("With space:")
arr.foreach(a => println(f"${a}% 6.2f"))
which produces the following output:
Without space:
  0,38
 -0,38
 10,38
-10,38
123,38
-123,38
----------------
With space:
  0,38
 -0,38
 10,38
-10,38
 123,38
-123,38
Note the difference for 123.38/-123.38, i.e. the case where the number overflows the field width.
The solution is trivial: the first number does not indicate the digits before the dot but the total width, and a width that is too short does not produce an error message. So for two digits after the dot, plus the dot, plus one digit in front and an optional minus sign, I need 5 characters in total, and then it works:
val (a, b) = (0.38, -0.38); println (f"${a}%5.2f\n${b}%5.2f ")
 0,38
-0,38
And no, a plus sign is not an option.
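The width arithmetic can be sketched for numbers with up to two integer digits. AlignDemo and fmt are names invented for this example, and Locale.US is pinned only so that the output does not depend on the machine's locale (the comma in the outputs above comes from the poster's locale).

```scala
import java.util.Locale

object AlignDemo {
  // Width 6 = optional sign + up to two integer digits + separator + two decimals.
  def fmt(x: Double): String = "%6.2f".formatLocal(Locale.US, x)

  def main(args: Array[String]): Unit =
    Seq(0.38, -0.38, 10.38, -10.38).foreach(x => println(fmt(x)))
}
```

Every value comes out exactly six characters wide, so the decimal points line up; a value that needs more than six characters is printed in full and breaks the alignment.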
I am new to Scala and I want to count the strings in a list of Strings that start with a particular letter.
For example-
val test1 : List[String] = List("zero","zebra","zenith","tiger","mosquito")
I have defined the above List of Strings and I want to count all the strings that start with "z".
I tried the code below:
scala> test1.count(s=> s.charAt(0) == "z")
res7: Int = 0
It is giving me result as 0. I am not sure what I am doing wrong. Please suggest.
Character values are delimited by single quotes. Double quotes are reserved for strings:
val test : List[String] = List("zero","zebra","zenith","tiger","mosquito")
test.count(_.charAt(0) == 'z') // 3: Int
You can simply use filter and take the length of the resulting list:
println(test1.filter(_.startsWith("z")).length)
If you want to ignore case (uppercase or lowercase), you can add .toLowerCase:
println(test1.filter(_.toLowerCase.startsWith("z")).length)
I hope the answer is helpful
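One caveat with charAt(0) is that it throws on an empty string. A small sketch that avoids this, assuming empty strings may occur in the list (CountDemo and its method names are invented for the example):

```scala
object CountDemo {
  // Same words as in the question, plus an empty string to show the edge case.
  val words: List[String] = List("zero", "zebra", "zenith", "tiger", "mosquito", "")

  // headOption is None for "", so no exception is thrown.
  def countStartingWith(ws: List[String], c: Char): Int =
    ws.count(_.headOption.contains(c))

  // Case-insensitive variant.
  def countStartingWithIgnoreCase(ws: List[String], c: Char): Int =
    ws.count(_.headOption.exists(_.toLower == c.toLower))

  def main(args: Array[String]): Unit = {
    println(countStartingWith(words, 'z'))           // 3
    println(countStartingWithIgnoreCase(words, 'Z')) // 3
  }
}
```

Using count directly also avoids building the intermediate list that filter creates.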
I came up with a pattern like
val pattern = "(\\w+)\\|(.*)\\|\\[(.*)\\]\\|\"(.*)\"\\|\"(.*)\"\\|\\[(.*)\\]\\|\\[(.*)\\]\\|(.*)\\|\\[(.*)\\]\\|\\[(.*)\\]".r
and I have an original string
var str = """AuthLogout|vmlxapp21a|[13/Jan/2016:16:33:15 +0100]|"66.77.444.44 uid=XXXXX,ou=People,o=Bank,o=External,dc=xxxx,dc=com"|"abcd_123_portalweb_w "|[]|[41]||[]|[]"""
then apply pattern to the string, but it is always empty.
val items = pattern.findAllIn(str).toList
If I understand what you're trying to do, perhaps using a giant regex isn't the easiest way: You can split by | and get rid of the unwanted separators ([, ], ") using replaceAll:
val str = """AuthLogout|vmlxapp21a|[13/Jan/2016:16:33:15 +0100]|"66.77.444.44 uid=XXXXX,ou=People,o=Bank,o=External,dc=xxxx,dc=com"|"abcd_123_portalweb_w "|[]|[41]||[]|[]"""
val withoutBoundaries = str.replaceAll("[\"\\]\\[]","")
val result = withoutBoundaries.split("\\|")
result.foreach(println)
Which prints:
AuthLogout
vmlxapp21a
13/Jan/2016:16:33:15 +0100
66.77.444.44 uid=XXXXX,ou=People,o=Bank,o=External,dc=xxxx,dc=com
abcd_123_portalweb_w
41
If you do want to use a regex here, I'd create sub-regex vars representing the different text parts that you're after, to make this somewhat manageable:
val plain = "(.*)" // no boundary characters
val boxed = s"\\[$plain\\]" // same, encapsulated by square brackets
val quoted = '"' + plain + '"' // same, encapsulated by double quotes
// the whole thing, separated by pipes:
val r = s"$plain\\|$plain\\|$boxed\\|$quoted\\|$quoted\\|$boxed\\|$boxed\\|$plain\\|$boxed\\|$boxed".r
val result = r.findAllIn(str).toList // this list has one item, as expected.
Now, if you want to see what this regex looks like, here it is, but I don't recommend having this in your code:
val r = """(.*)\|(.*)\|\[(.*)\]\|"(.*)"\|"(.*)"\|\[(.*)\]\|\[(.*)\]\|(.*)\|\[(.*)\]\|\[(.*)\]""".r
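Note that findAllIn returns whole matches, not the individual fields. If the captured groups are what you are after, a possible sketch is to use findFirstMatchIn and read subgroups from the resulting Match (LogLineDemo and fields are names made up for this example; the string and the regex are the ones from above):

```scala
object LogLineDemo {
  val line: String =
    """AuthLogout|vmlxapp21a|[13/Jan/2016:16:33:15 +0100]|"66.77.444.44 uid=XXXXX,ou=People,o=Bank,o=External,dc=xxxx,dc=com"|"abcd_123_portalweb_w "|[]|[41]||[]|[]"""

  val r = """(.*)\|(.*)\|\[(.*)\]\|"(.*)"\|"(.*)"\|\[(.*)\]\|\[(.*)\]\|(.*)\|\[(.*)\]\|\[(.*)\]""".r

  // subgroups lists what each (...) captured, in order; None if the line does not match.
  def fields(s: String): Option[List[String]] =
    r.findFirstMatchIn(s).map(_.subgroups)

  def main(args: Array[String]): Unit =
    fields(line) match {
      case Some(groups) => groups.zipWithIndex.foreach { case (g, i) => println(s"$i: '$g'") }
      case None         => println("no match")
    }
}
```

Unlike the split-based version, this keeps the empty fields (such as [] and the empty column) as empty strings at their positions.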
This is driving me nuts... there must be a way to strip out all non-digit characters (or perform other simple filtering) in a String.
Example: I want to turn a phone number ("+72 (93) 2342-7772" or "+1 310-777-2341") into a simple numeric String (not an Int), such as "729323427772" or "13107772341".
I tried "[\\d]+".r.findAllIn(phoneNumber) which returns an Iteratee and then I would have to recombine them into a String somehow... seems horribly wasteful.
I also came up with: phoneNumber.filter("0123456789".contains(_)) but that becomes tedious for other situations. For instance, removing all punctuation... I'm really after something that works with a regular expression so it has wider application than just filtering out digits.
Anyone have a fancy Scala one-liner for this that is more direct?
You can use filter, treating the string as a character sequence and testing the character with isDigit:
"+72 (93) 2342-7772".filter(_.isDigit) // res0: String = 729323427772
You can use replaceAll and Regex.
"+72 (93) 2342-7772".replaceAll("[^0-9]", "") // res1: String = 729323427772
Another approach, define the collection of valid characters, in this case
val d = '0' to '9'
and so for val a = "+72 (93) 2342-7772", filter on collection inclusion for instance with either of these,
for (c <- a if d.contains(c)) yield c
a.filter(d.contains)
a.collect{ case c if d.contains(c) => c }
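Since the question asks for something regex-based that generalises beyond digits, here is a small sketch of two helpers; FilterDemo, keep and drop are hypothetical names, not library functions. The first recombines the matches from findAllIn with mkString, the second deletes whatever the pattern matches.

```scala
object FilterDemo {
  // Keep only the parts of s that match the pattern.
  def keep(s: String, pattern: String): String =
    pattern.r.findAllIn(s).mkString

  // Remove every part of s that matches the pattern.
  def drop(s: String, pattern: String): String =
    s.replaceAll(pattern, "")

  def main(args: Array[String]): Unit = {
    println(keep("+72 (93) 2342-7772", "\\d"))    // 729323427772
    println(drop("+1 310-777-2341", "\\D"))       // 13107772341
    println(drop("Hello, world!", "\\p{Punct}"))  // Hello world
  }
}
```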