I'm new to Scala.
Is it possible to force using a semicolon as the end of a line?
e.g.
val s = "my line"
+ " ends here";
Thanks
You don't want to "force using a semicolon", quite the contrary: you want to prevent a semicolon from being inferred at the end of the first line.
Several possibilities here:
Move plus to previous line (that's the preferred way to do it):
val s = "my line" +
"ends here";
Explicit method calls starting with . prevent the semicolon from being inferred (this works reasonably well for "builder-pattern" like chains of methods, but looks ugly for +):
val s = "my line"
.+("ends here");
Add parentheses. Semicolons are never inferred inside parentheses:
val s = ("my line"
+ "ends here");
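As a quick check, the three variants can be put side by side in a small sketch. The object and value names are made up for illustration, and the leading space in " ends here" follows the question rather than the snippets above.

```scala
object SemicolonInferenceDemo {
  // 1. Operator at the end of the line: a statement cannot end on `+`,
  //    so no semicolon is inferred after it.
  val plusAtEnd: String = "my line" +
    " ends here"

  // 2. Explicit method call: a line starting with `.` cannot begin a statement.
  val dotCall: String = "my line"
    .+(" ends here")

  // 3. Parentheses: newlines inside ( ... ) are never treated as semicolons.
  val inParens: String = ("my line"
    + " ends here")
}
```

All three values should hold the same concatenated string.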
Sorry, I ran into another question about using PetitParser. I've figured out my recursive issues, but now I have a problem with parentheses. If I need to be able to parse the following two expressions:
'(use = "official").empty()'
'(( 5 + 5 ) * 5) + 5'
I've tried doing something like the following:
final expression = (char('(') & any().starGreedy(char(')')).flatten() & char(')')).map((value) => ParenthesesParser(value));
But that doesn't work on the first expression.
If I try this:
final expression = (char('(') & any().starLazy(char(')')).flatten() & char(')')).map((value) => ParenthesesParser(value));
It doesn't work on the second expression. Any suggestions on how to parse both?
I think neither of the parsers does what you want: The first parser, the greedy one with starGreedy, will consume up to the last closing parenthesis. The second parser, the lazy one with starLazy, will consume up to the first closing parenthesis.
To parse a balanced parenthesis you need recursion, so that each opening parenthesis is followed by a matching closing one:
final inner = undefined();
final parser = char('(') & inner.star().flatten() & char(')');
inner.set(parser | pattern('^)'));
In the snippet above, the inner parser is recursively trying to either parse another parenthesis pair, or otherwise it simply consumes any character that is not a closing parenthesis.
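To illustrate the same recursive idea outside of PetitParser, here is a hand-written sketch in plain Scala. It is not the PetitParser API; the names BalancedParens and closeOf are invented for this example. It returns the index just past the parenthesis that closes the one at a given position.

```scala
object BalancedParens {
  // Returns the index just past the ")" that matches the "(" at `start`,
  // or None if `start` is not an opening parenthesis or it is never closed.
  def closeOf(input: String, start: Int): Option[Int] = {
    // Consumes the content after an opening parenthesis, starting at index i.
    def inner(i: Int): Option[Int] =
      if (i >= input.length) None
      else input(i) match {
        case ')' => Some(i + 1)                        // matching close found
        case '(' => closeOf(input, i).flatMap(inner)   // nested pair, then continue
        case _   => inner(i + 1)                       // any other character
      }

    if (start < input.length && input(start) == '(') inner(start + 1) else None
  }
}
```

For the two inputs from the question, BalancedParens.closeOf(input, 0) stops after the first balanced group, so input.substring(0, end) is (use = "official") in the first case and (( 5 + 5 ) * 5) in the second. The recursion depth grows with the input length, which is fine for short expressions like these.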
I am new to Spark. I have a huge file which has data like this:
18765967790#18765967790#T#20130629#00#31#2981546 " "18765967790#18765967790#T#20130629#19#18#3240165 " "18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701 " "13478756094#13478756094#T#20130629#19#18#1230206 " "13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246 " "40072066693#40072066693#T#20130629#79#18#3276502 " "40072066693#40072066693#T#20130629#19#07#3321860
I am trying to replace " " with new line character so that my output looks like this-
18765967790#18765967790#T#20130629#00#31#2981546
18765967790#18765967790#T#20130629#19#18#3240165
18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701
13478756094#13478756094#T#20130629#19#18#1230206
13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246
40072066693#40072066693#T#20130629#79#18#3276502
40072066693#40072066693#T#20130629#19#07#3321860
I have tried with-
val fact1 = sc.textFile("s3://abc.txt").map(x=>x.replaceAll("\"","\n"))
But this doesn't seem to be working. Can someone tell what I am missing?
Edit1- My final output will be a dataframe with a schema imposed after splitting on the delimiter "#".
I am getting the output below:
scala> fact1.take(5).foreach(println)
18765967790#18765967790#T#20130629#00#31#2981546
18765967790#18765967790#T#20130629#19#18#3240165
18765967790#18765967790#T#20130629#18#18#1362836
13478756094#13478756094#T#20130629#31#26#2880701
13478756094#13478756094#T#20130629#19#18#1230206
13478756094#13478756094#T#20130629#00#00#1631440
40072066693#40072066693#T#20130629#79#18#1270246
40072066693#40072066693#T#20130629#79#18#3276502
40072066693#40072066693#T#20130629#19#07#3321860
I am getting extra blank lines, which makes it harder to create the dataframe. This might seem simple here, but the file is huge and the rows containing " " are long. In the question I have shown only two " " separators per row, but there can be more than 40-50 of them.
There is more than one quote between the text segments, which creates multiple line breaks. You either need to remove the additional quotes before the replace, or remove the empty lines after it:
.map(x=>x.replaceAll("\"","\n").replaceAll("(?m)^[ \t]*\r?\n", ""))
Reference: Remove all empty lines
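Since the final goal is one record per row, another option is to split each input line on the whole " " separator instead of replacing single quotes. The sketch below is plain Scala with no Spark dependency; the names SplitQuotedRecords and splitRecords are made up, and the separator pattern assumes records are separated by two double quotes with optional whitespace around and between them.

```scala
object SplitQuotedRecords {
  // Assumed separator: optional whitespace, a double quote, optional whitespace,
  // another double quote, optional whitespace.
  private val separator = "\\s*\"\\s*\"\\s*"

  // Splits one input line into its records, dropping any empty fragments.
  def splitRecords(line: String): Seq[String] =
    line.split(separator).toSeq.map(_.trim).filter(_.nonEmpty)
}
```

On an RDD this function could be passed to flatMap instead of map, so that every record becomes its own element rather than one string with embedded newlines.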
You might be missing an implicit Encoder. Try the code below, which passes one explicitly:
spark.read.text("src/main/resources/doubleQuoteFile.txt").map(row => {
row.getString(0).replace("\" \"", "\n") // replace each " " separator with a newline
})(org.apache.spark.sql.Encoders.STRING)
When using triple quotes in an indented position I for sure get indentation in the output js string too:
Comparing these two in a nested let
let input1 = "T1\nX55.555Y-44.444\nX52.324Y-40.386"
let input2 = """T1
X66.324Y-40.386
X52.324Y-40.386"""
giving
// single quotes with \n
"T1\x0aX55.555Y-44.444\x0aX52.324Y-40.386"
// triple quoted
"T1\x0a X66.324Y-40.386\x0a X52.324Y-40.386"
Is there any agreed upon thing like stripMargin in Scala so I can use those without having to unindent to top level?
Update, just to clarify what I mean, I'm currently doing:
describe "header" do
  it "should parse example header" do
    let input = """M48
;DRILL file {KiCad 4.0.7} date Wednesday, 31 January 2018 'AMt' 11:08:53
;FORMAT={-:-/ absolute / metric / decimal}
FMAT,2
METRIC,TZ
T1C0.300
T2C0.400
T3C0.600
T4C0.800
T5C1.000
T6C1.016
T7C3.400
%
"""
    doesParse input header
describe "hole" do
  it "should parse a simple hole" do
    doesParse "X52.324Y-40.386" hole
Update:
I was asked to clarify stripMargin from Scala. It's used like so:
val speech = """T1
               |X66.324Y-40.386
               |X52.324Y-40.386""".stripMargin
which then removes the leading whitespace. stripMargin can take any separator, but defaults to |.
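A small sketch of both forms, with the default margin character and with a custom one. The object and value names are only for illustration.

```scala
object StripMarginDemo {
  // Default margin character '|': leading whitespace up to and including
  // the bar is removed from each line.
  val withBar: String =
    """T1
      |X66.324Y-40.386
      |X52.324Y-40.386""".stripMargin

  // Custom margin character, passed explicitly.
  val withHash: String =
    """T1
      #X66.324Y-40.386
      #X52.324Y-40.386""".stripMargin('#')
}
```

Both values should equal the unindented three-line string, regardless of how deeply the literal is nested in the source.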
More examples:
Rust has https://docs.rs/trim-margin/0.1.0/trim_margin/
Kotlin has in stdlib: https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/trim-margin.html
I guess it might sound like asking for left-pad ( :) ) but if there's something there already I'd rather not brew it myself…
I'm sorry you didn't get a prompt response to this one, but I have implemented this function here. In case the pull request isn't merged, here's an implementation that just depends on purescript-strings:
import Prelude

import Data.String (joinWith, split) as String
import Data.String.CodeUnits (drop, dropWhile) as String
import Data.String.Pattern (Pattern(..))

stripMargin :: String -> String
stripMargin =
  let
    -- split into lines, transform each line, and join them again
    lines = String.split (Pattern "\n")
    unlines = String.joinWith "\n"
    mapLines f = unlines <<< map f <<< lines
  in
    -- per line: drop everything before the first '|', then the '|' itself
    mapLines (String.drop 1 <<< String.dropWhile (_ /= '|'))
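Reading the function above, each line loses everything up to and including its first |, so a line that contains no | at all (for example a first line such as T1) appears to come out empty, whereas Scala's stripMargin leaves such a line untouched. The following Scala mirror of the same logic is only meant to illustrate that behavior; the names are made up and it is not part of any library.

```scala
object StripMarginPort {
  // Mirrors the logic above: per line, drop everything before the first '|'
  // and then the '|' itself. Lines without a '|' end up empty.
  def stripMarginPort(s: String): String =
    s.split("\n", -1).map(_.dropWhile(_ != '|').drop(1)).mkString("\n")
}
```

With this reading, starting every line of the literal with |, including the first one, gives the expected result.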
I have the following code:
object testLines extends App {
  val items = Array("""a-b-c d-e-f""","""a-b-c th-i-t""")
  val lines = items.map(_.replaceAll("-", "")split("\t"))
  print(lines.map(_.mkString(",")).mkString("\n"))
}
By mistake I did not put a dot between replaceAll and split, but it worked.
On the contrary, when I put a dot between replaceAll and split, I got an error:
identifier expected but ';' found.
Implicit conversions found: items =>
What is going on?
Why does it work without a dot but not with a dot?
Update:
It also works with a dot. The error message is a bug in the Scala IDE. The first part of the question is still valid.
Thanks,
David
You have just discovered that Operators are methods. x.split(y) can also be written x split y in cases where the method is operator-like and it looks nicer. However there is nothing stopping you putting either side in parentheses like x split (y), (x) split y, or even (x) split (y) which may be necessary (and is a good idea for readability even if not strictly necessary) if you are passing in a more complex expression than a simple variable or constant and need parentheses to override the precedence.
With the example code you've written, it's not a bad idea to do the whole thing in operator style for clarity, using parentheses only where the syntax requires and/or they make groupings more obvious. I'd probably have written it more like this:
object testLines extends App {
  val items = Array("a-b-c d-e-f", "a-b-c th-i-t")
  val lines = items map (_ replaceAll ("-", "") split "\t")
  print(lines map (_ mkString ",") mkString "\n")
}
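To see that the notations are interchangeable, the three spellings can be compared directly. This is a sketch with invented names, and it assumes the two fields in each item are separated by a tab, since the code splits on "\t".

```scala
object InfixNotationDemo {
  // Assumption: the two fields in each item are separated by a tab.
  val items = Array("a-b-c\td-e-f", "a-b-c\tth-i-t")

  // Dot notation throughout.
  val dotStyle = items.map(_.replaceAll("-", "").split("\t"))

  // The form from the question: no dot before `split`, parsed as infix `x split ("\t")`.
  val missingDot = items.map(_.replaceAll("-", "")split("\t"))

  // Infix notation throughout.
  val infixStyle = items map (_ replaceAll ("-", "") split "\t")

  // Arrays compare by reference, so convert to lists before comparing.
  def asLists(xs: Array[Array[String]]): List[List[String]] = xs.map(_.toList).toList
}
```

All three values should contain the same fields once converted with asLists.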
I want to split the following Scala code line like this:
ConditionParser.parseSingleCondition("field=*value1*").description
must equalTo("field should contain value1")
But which is the line continuation character?
Wrap it in parentheses:
(ConditionParser.parseSingleCondition("field=*value1*").description
must equalTo("field should contain value1"))
Scala does not have a "line continuation character". It infers a semicolon at the end of a line when all of the following hold:
The line ends with a token that can end a statement
The next non-blank line begins with a token that can start a statement
The line break is not inside an unclosed ( or [
Thus, to "delay" semicolon inference, one can end the line with a method name (operator) or a dot, or begin the following line with a dot:
ConditionParser.
parseSingleCondition("field=*value1*").
description must equalTo("field should contain value1")
a +
b +
c
List(1,2,3)
.map(_+1)
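A compact sketch of two of these forms. The Check class is a hypothetical stand-in for a matcher library (it is not the specs2 API), used only so that an alphanumeric infix method like must can be shown on a continuation line.

```scala
object LineContinuationDemo {
  // Hypothetical stand-in for a matcher: `must` simply compares two strings.
  final case class Check(actual: String) {
    def must(expected: String): Boolean = actual == expected
  }

  // Inside parentheses the line break is not treated as a semicolon,
  // so `must` on the next line continues the expression.
  val wrapped: Boolean =
    (Check("field should contain value1")
      must "field should contain value1")

  // A trailing dot cannot end a statement, so the chain continues on the next line.
  val trailingDot: Int = List(1, 2, 3).
    map(_ + 1).
    sum
}
```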