String interpolation of variable value - scala

I want the variable value to be processed by string interpolation.
val temp = "1 to 10 by 2"
println(s"$temp")
output expected:
inexact Range 1 to 10 by 2
but getting
1 to 10 by 2
is there any way to get this way done?
EDIT
The normal case for using StringContext is:
$> s"${1 to 10 by 2}"
inexact Range 1 to 10 by 2
This return the Range from 1 to 10 with the step value of 2.
And String context won't work on variable, so can there be a way I can do like
$> val temp = "1 to 10 by 2"
$> s"${$temp}" //hypothetical
such that the interpreter will evaluate this as
s"${$temp}" => s"${1 to 10 by 2}" => Range from 1 to 10 by step of 2 = {1,3,5,7,9}

By setting a string value to temp you are doing just that - creating a flat String. If you want this to be actual code, then you need to drop the quotes:
val temp = 1 to 10 by 2
Then you can print the results:
println(s"$temp")
This will print the following output string:
inexact Range 1 to 10 by 2
This is the toString(...) output of a variable representing a Range. If you want to print the actual results of the 1 to 10 by 2 computation, you need to do something like this:
val resultsAsString = temp.mkString(",")
println(resultsAsString)
> 1,3,5,7,9
or even this (watch out: here the curly brackets { } are used not for string interpolation but simply as normal string characters):
println(s"{$resultsAsString}")
> {1,3,5,7,9}
Edit
If what you want is to actually interpret/compile Scala code on the fly (not recommended though - for security reasons, among others), then you may be interested in this:
https://ammonite.io/ - Ammonite, Scala scripting
In any case, to interpret your code from a String, you may try using this:
https://docs.scala-lang.org/overviews/repl/embedding.html
See these lines:
val scripter = new ScriptEngineManager().getEngineByName("scala")
scripter.eval("""println("hello, world")""")

Related

Regex expression in q to match specific integer range following string

Using q’s like function, how can we achieve the following match using a single regex string regstr?
q) ("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13") like regstr
>>> 0111110b
That is, like regstr matches the foo-strings which end in the numbers 8,9,10,11,12.
Using regstr:"foo[8-12]" confuses the square brackets (how does it interpret this?) since 12 is not a single digit, while regstr:"foo[1[0-2]|[1-9]]" returns a type error, even without the foo-string complication.
As the other comments and answers mentioned, this can't be done using a single regex. Another alternative method is to construct the list of strings that you want to compare against:
q)str:("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")
q)match:{x in y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
0111110b
If your eventual goal is to filter on the matching entries, you can replace in with inter:
q)match:{x inter y,/:string z[0]+til 1+neg(-/)z}
q)match[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
A variation on Cillian’s method: test the prefix and numbers separately.
q)range:{x+til 1+y-x}.
q)s:"foo",/:string 82,range 7 13 / include "foo82" in tests
q)match:{min(x~/:;in[;string range y]')#'flip count[x]cut'z}
q)match["foo";8 12;] s
00111110b
Note how unary derived functions x~/: and in[;string range y]' are paired by #' to the split strings, then min used to AND the result:
q)flip 3 cut's
"foo" "foo" "foo" "foo" "foo" "foo" "foo" "foo"
"82" ,"7" ,"8" ,"9" "10" "11" "12" "13"
q)("foo"~/:;in[;string range 8 12]')#'flip 3 cut's
11111111b
00111110b
Compositions rock.
As the comments state, regex in kdb+ is extremely limited. If the number of trailing digits is known like in the example above then the following can be used to check multiple patterns
q)str:("foo7"; "foo8"; "foo9"; "foo10"; "foo11"; "foo12"; "foo13"; "foo3x"; "foo123")
q)any str like/:("foo[0-9]";"foo[0-9][0-9]")
111111100b
Checking for a range like 8-12 is not currently possible within kdb+ regex. One possible workaround is to write a function to implement this logic. The function range checks a list of strings start with a passed string and end with a number within the range specified.
range:{
/ checking for strings starting with string y
s:((c:count y)#'x)like y;
/ convert remainder of string to long, check if within range
d:("J"$c _'x)within z;
/ find strings satisfying both conditions
s&d
}
Example use:
q)range[str;"foo";8 12]
011111000b
q)str where range[str;"foo";8 12]
"foo8"
"foo9"
"foo10"
"foo11"
"foo12"
This could be made more efficient by checking the trailing digits only on the subset of strings starting with "foo".
For your example you can pad, fill with a char, and then simple regex works fine:
("."^5$("foo7";"foo8";"foo9";"foo10";"foo11";"foo12";"foo13")) like "foo[1|8-9][.|0-2]"

Reading multiple integers from line in text file

I am using Scala and reading input from the console. I am able to regurgitate the strings that make up each line, but if my input has the following format, how can I access each integer within each line?
2 2
1 2 2
2 1 1
Currently I just regurgitate the input back to the console using
object Main {
def main(args: Array[String]): Unit = {
for (ln <- io.Source.stdin.getLines) println(ln)
//how can I access each individual number within each line?
}
}
And I need to compile this project like so:
$ scalac main.scala
$ scala Main <input01.txt
2 2
1 2 2
2 1 1
A reasonable algorithm would be:
for each line, split it into words
parse each word into an Int
An implementation of that algorithm:
io.Source.stdin.getLines // for each line...
.flatMap(
_.split("""\s+""") // split it into words
.map(_.toInt) // parse each word into an Int
)
The result of this expression will be an Iterator[Int]; if you want a Seq, you can call toSeq on that Iterator (if there's a reasonable chance there will be more than 7 or so integers, it's probably worth calling toVector instead). It will blow up with a NumberFormatException if there's a word which isn't an integer. You can handle this a few different ways... if you want to ignore words that aren't integers, you can:
import scala.util.Try
io.Source.stdin.getLines
.flatMap(
_.split("""\s+""")
.flatMap(Try(_.toInt).toOption)
)
The following will give you a flat list of numbers.
val integers = (
for {
line <- io.Source.stdin.getLines
number <- line.split("""\s+""").map(_.toInt)
} yield number
)
As you can read here, some care must be taken when parsing the numbers.

How to check for valid file name format in kdb/q?

I'd like to check that the file names in my directory are all formatted properly. First I create a variable dir and then use the keyword key to see what files are listed...
q)dir:`:/myDirectory/data/files
q)dirkey:key dir
q)dirkey
`FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json
`FILEB_ABC_20190430_b556nyc1_OrderSale_000456.meta
I select and parse the .json file name...
q)dirjsn:dirkey where dirkey like "*.json"
q)sepname:raze{"_" vs string x}'[dirjsn]
"FILEA"
"XYZ"
"20190501"
"b233nyc9"
"OrderPurchase"
"000123.json"
Next I'd like to confirm that each character in sepname[0] and sepname[1] are letters, that characters in sepname[2] are numerical/temporal, and that sepname[3] contains alphanumeric values.
What is the best way to optimize the following sequential if statements for performance and how can I check for alphanumeric values, like in the case of sepname[3], not just one or the other?
q)if[not sepname[0] like "*[A-Z]";:show "Incorrect Submitter"];
if[not sepname[1] like "*[A-Z]";:show "Incorrect Reporter"];
if[not sepname[2] like "*[0-9]";:show "Incorrect Date"];
if[not sepname[3] like " ??? ";:show "Incorrect Kind"];
show "Correct File Format"
If your valid filenames alway have that same structure (specifically 5 chars, 3 chars, 8 chars, 8 chars) then you can use a single regex like statement like so:
dirjsn:("FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json";"F2ILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json";"FILEA_XYZ2_20190501_b233nyc9_OrderPurchase_000123.json";"FILEA_XYZ_2A190501_b233nyc9_OrderPurchase_000123.json";"FILEA_XYZ_20190501_b233%yc9_OrderPurchase_000123.json";"FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json");
q)dirjsn
FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json
F2ILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json
FILEA_XYZ2_20190501_b233nyc9_OrderPurchase_000123.json
FILEA_XYZ_2A190501_b233nyc9_OrderPurchase_000123.json
FILEA_XYZ_20190501_b233%yc9_OrderPurchase_000123.json
FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json
q)AZ:"[A-Z]";n:"[0-9]";Azn:"[A-Za-z0-9]";
q)dirjsn where dirjsn like raze(AZ;"_";AZ;"_";n;"_";Azn;"*")where 5 1 3 1 8 1 8 1
"FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json"
"FILEA_XYZ_20190501_b233nyc9_OrderPurchase_000123.json"
like will not work in this case as we need to check each character. One way to do that is to use in and inter:
q) a: ("FILEA"; "XYZ"; "20190501"; "b233nyc9")
Create a character set
q) c: .Q.a, .Q.A
For first 3 cases, check if each charcter belongs to specific set:
q) r1: all#'(3#a) in' (c;c;.Q.n) / output 111b
For alphanumeric case, check if it contains both number and character and no other symbol.
q)r2: (sum[b]=count a[3]) & all b:sum#'a[3] in/: (c;.Q.n) / output 1b
Print output/errors:
q) errors: ("Incorrect Submitter";"Incorrect Reporter";"Incorrect Date";"Incorrect Kind")
q) show $[0=count r:where not r1,r2;"All good";errors r]
q) "All good"

Spark / Scala Split

I have this code:
rdd.map(_.split("-")).filter(row => { ... })
when I do row.length on:
This-is-a-test----on-split--
This-is-a-test-------
the output is 9 and 4 respectively. It doesn't count the trailing delimited characters if it is empty. What is the workaround here if I want both outputs to be 10?
You can accomplish what you want by passing -1 as limit parameter to split like this:
rdd.map(_.split("-", -1)).filter(row => { ... })
Btw, the expected result is 11, and not 10 (since if you want to keep empty tokens and your string ends with the delimiter, then it's interpreted as if there's an empty token after that delimiter). You can see this for more information.

Concatenate String and Int to form file name prefix

I am working with PowerShell to create a renaming script for a number of files in a directory.
Two questions here:
I have a string variable $strPrefix = "ACV-100-" and an integer counter $intInc = 000001 and I wish to increment the counter $intInc 1 -> 2 and then concatenate the two and store it in a variable $strCPrefix in the following format: ACV-100-000002.
I believe the $intInc will need to be cast in order to convert it once incrementing is complete but I am unsure how to do this.
Secondly, I have found that the script will display 000001 as 1, 000101 as 101 and so on... I need the full 6 digits to be displayed as this will form a file name. How do I keep or pad the numbers before I process the concatenation?
$intInc = 1
$strCprefix = $strprefix + "{0:000000}" -f $intInc # give ACV-100-000001
hope can help
This should do it:
$intInc = 1
...
$filename = $strPrefix + ('0' * (6 - $intInc.ToSTring().Length)) + ($intInc++).ToString()