Using M4, how to extract code from a string and increase indentation? - macros

I need to write an M4 macro that extracts and transforms code between curly braces.
I want to transform
{
import math
a_list = [1, 4, 9, 16]
if val:
    print([math.sqrt(i) for i in a_list])
else:
    print("val is False")
print("bye bye")
}
to
import math
a_list = [1, 4, 9, 16]
if val:
    print([math.sqrt(i) for i in a_list])
else:
    print("val is False")
print("bye bye")
The macro has to trim the whitespace before the first { and after the last }.
Because this is Python code, the relative indentation inside the curly braces must be preserved.
Because the output of the macro will be inserted at a place that needs a certain level of indentation, the macro should also be able to add extra indentation (some number of spaces), e.g. given as an argument.
The project is already using m4sugar, so the quotes are [ and ].
Thanks.
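For reference, the requested transformation can be sketched outside M4. The Python below is not an M4 solution, only an illustration of the expected input/output behaviour; the function name extract_and_indent and its extra_spaces parameter are made up for this sketch.

```python
import textwrap


def extract_and_indent(text, extra_spaces=0):
    """Return the code between the first '{' and the last '}', re-indented."""
    start = text.index("{")   # raises ValueError if there is no opening brace
    end = text.rindex("}")    # raises ValueError if there is no closing brace
    lines = text[start + 1:end].splitlines()
    # Drop the blank lines right after '{' and right before '}'.
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()
    # dedent() removes only the indentation common to all lines,
    # so nested blocks keep their relative indentation.
    body = textwrap.dedent("\n".join(lines))
    # indent() then adds the requested number of spaces to every non-blank line.
    return textwrap.indent(body, " " * extra_spaces)


source = """   {
        if val:
            print("yes")
        else:
            print("no")
   }   """
print(extract_and_indent(source, extra_spaces=2))
```

An M4 version would need the same three steps: cut at the outer braces, remove the common leading whitespace, and prefix every line with the requested number of spaces.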

Related

Swift: how to convert readLine() input " [-5,20,8...] " to an Int array

I already searched today and found a similar issue here, but it does not fully fix my problem. In my case, I want to convert the readLine() input string "[3,-1,6,20,-5,15]" to the Int array [3,-1,6,20,-5,15].
I'm doing an online coding quest on a website that requires reading the test case from readLine().
For example, if I input [1,3,-5,7,22,6,85,2] in the console, I need to convert it to an Int array. After that I can deal with the algorithm part to solve the quest. I don't think it is wise to restrict the input to readLine(), but I can do nothing about that :(
My code is below. It can only handle arrays of non-negative numbers smaller than 10. For the array [1, -3, 22, -6, 5,6,7,8,9] it gives nums as [1, 3, 2, 2, 6, 5, 6, 7, 8, 9], so how can I correctly convert the readLine() input?
print("please give the test array with S length")
if let numsInput = readLine() {
    let nums = numsInput.compactMap { Int(String($0)) }
    print("nums: \(nums)")
}
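Here is a one-liner to convert the input into an array of integers (it needs Foundation for trimmingCharacters). Of course, you might want to split this up into separate steps if some validation is needed.
let numbers = input
    .trimmingCharacters(in: .whitespacesAndNewlines)
    .dropFirst()   // drop the leading "["
    .dropLast()    // drop the trailing "]"
    .split(separator: ",")
    .compactMap { Int($0.trimmingCharacters(in: .whitespaces)) }  // tolerate spaces after commas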
dropFirst/dropLast can be replaced with a replace using a regular expression
.replacingOccurrences(of: "[\\[\\]]", with: "", options: .regularExpression)
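Use the split method to get a sequence of substrings from the input string. The brackets and spaces have to be removed first, because Int("[3") and Int(" -1") both return nil:
let nums = numsInput.filter { !"[] ".contains($0) }.split(separator: ",").compactMap { Int($0) }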
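The difference between converting character by character (the question's code) and converting comma-separated tokens (the answers) can be sketched in a few lines. This is Python rather than Swift, purely as an illustration; the function names are hypothetical.

```python
def per_character_digits(line):
    """Mimics the question's approach: every digit character becomes its own number."""
    return [int(ch) for ch in line if ch.isdigit()]


def parse_int_array(line):
    """Strip the brackets, split on commas, then convert each whole token."""
    inner = line.strip().strip("[]")
    # int() tolerates surrounding spaces and a leading minus sign.
    return [int(token) for token in inner.split(",") if token.strip()]


raw = "[1, -3, 22, -6, 5,6,7,8,9]"
print(per_character_digits(raw))  # [1, 3, 2, 2, 6, 5, 6, 7, 8, 9]
print(parse_int_array(raw))       # [1, -3, 22, -6, 5, 6, 7, 8, 9]
```

Splitting first keeps multi-digit numbers and minus signs together, which is what the per-character version loses.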

Strip margin of indented triple-quote string in Purescript?

When using triple quotes in an indented position, I naturally get the indentation in the output JS string too.
Comparing these two in a nested let:
let input1 = "T1\nX55.555Y-44.444\nX52.324Y-40.386"
let input2 = """T1
        X66.324Y-40.386
        X52.324Y-40.386"""
giving
// single quotes with \n
"T1\x0aX55.555Y-44.444\x0aX52.324Y-40.386"
// triple quoted
"T1\x0a        X66.324Y-40.386\x0a        X52.324Y-40.386"
Is there any agreed upon thing like stripMargin in Scala so I can use those without having to unindent to top level?
Update, just to clarify what I mean, I'm currently doing:
describe "header" do
  it "should parse example header" do
    let input = """M48
;DRILL file {KiCad 4.0.7} date Wednesday, 31 January 2018 'AMt' 11:08:53
;FORMAT={-:-/ absolute / metric / decimal}
FMAT,2
METRIC,TZ
T1C0.300
T2C0.400
T3C0.600
T4C0.800
T5C1.000
T6C1.016
T7C3.400
%
"""
    doesParse input header
describe "hole" do
  it "should parse a simple hole" do
    doesParse "X52.324Y-40.386" hole
Update:
I was asked to clarify stripMargin from Scala. It's used like so:
val speech = """T1
               |X66.324Y-40.386
               |X52.324Y-40.386""".stripMargin
which then removes the leading whitespace up to and including the margin character. stripMargin can take any separator, but defaults to |.
More examples:
Rust has https://docs.rs/trim-margin/0.1.0/trim_margin/
Kotlin has in stdlib: https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/trim-margin.html
I guess it might sound like asking for left-pad ( :) ) but if there's something there already I'd rather not brew it myself…
I'm sorry you didn't get a prompt response to this one, but I have implemented this function here. In case the pull request isn't merged, here's an implementation that just depends on purescript-strings:
import Prelude
import Data.String (joinWith, split) as String
import Data.String.CodeUnits (drop, dropWhile) as String
import Data.String.Pattern (Pattern(..))
-- Drops everything up to and including the first '|' on each line.
-- Note: a line without '|' becomes empty, so every line needs the margin character.
stripMargin :: String -> String
stripMargin =
  let
    lines = String.split (Pattern "\n")
    unlines = String.joinWith "\n"
    mapLines f = unlines <<< map f <<< lines
  in
    mapLines (String.drop 1 <<< String.dropWhile (_ /= '|'))
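To make the intended behaviour concrete, here is a rough Python sketch of a Scala-like stripMargin. It is an approximation written for illustration, not a port of any library: the name strip_margin is made up, and unlike the PureScript version above it leaves lines without the margin character untouched, which is how I understand Scala's version to behave.

```python
def strip_margin(text, margin="|"):
    """Remove leading whitespace up to and including the margin character on each line."""
    result = []
    for line in text.split("\n"):
        stripped = line.lstrip()
        if stripped.startswith(margin):
            # Keep only what follows the margin character.
            result.append(stripped[len(margin):])
        else:
            # No margin character after the leading whitespace: leave the line alone.
            result.append(line)
    return "\n".join(result)


speech = """T1
            |X66.324Y-40.386
            |X52.324Y-40.386"""
print(strip_margin(speech))
```

With this variant the first line (T1) survives even though it carries no margin character.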

Inconsistent word and character boundaries in ICU

I'm using ICU's break iterators for both characters and words as described here. I expect the character break iterator to stop more frequently, so that its break points are a superset of those of the word break iterator. For instance, if I pass abc, I get a, b, and c from the character break iterator, while I get abc from the word break iterator.
Now, I have the Thai string ด้าน้ำ. The problem is that the behavior of these two break iterators is inconsistent. Given that the length of the above string is 6 Unicode code points, I get these results from ICU 61.1 on macOS:
Word boundaries:
[0, 5)
[5, 6)
Character boundaries:
[0, 2)
[2, 3)
[3, 6)
As you can see, the character break iterator produces the segment [3, 6) (which seems correct), while the word break iterator puts a boundary at 5, inside that segment. Here's a small Python 3 script that uses PyICU to reproduce the issue:
import PyICU
def wordBreakIterator():
    return PyICU.BreakIterator.createWordInstance(PyICU.Locale("th"))
def charBreakIterator():
    return PyICU.BreakIterator.createCharacterInstance(PyICU.Locale("th"))
def printBoundaries(txt, bi):
    bi.setText(txt)
    start = bi.first()
    try:
        while True:
            end = next(bi)
            print("[{}, {})".format(start, end))
            start = end
    except StopIteration:
        pass
if __name__ == "__main__":
    text = u'ด้าน้ำ'
    print("Word boundaries:")
    printBoundaries(text, wordBreakIterator())
    print("Character boundaries:")
    printBoundaries(text, charBreakIterator())
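The superset expectation can be checked mechanically from the spans reported above. The sketch below does not call ICU at all; it only works on the numbers quoted in this question, and the helper name break_points is made up.

```python
def break_points(spans):
    """Collect every boundary offset that appears in a list of [start, end) spans."""
    points = set()
    for start, end in spans:
        points.update((start, end))
    return points


# Spans copied from the output reported above.
word_spans = [(0, 5), (5, 6)]
char_spans = [(0, 2), (2, 3), (3, 6)]

# Word boundaries that are not also character boundaries.
missing = sorted(break_points(word_spans) - break_points(char_spans))
print(missing)  # [5]
```

Offset 5 is a word boundary but not a character boundary, which is exactly the inconsistency described.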

syntax for multi-line object properties doesn't allow key/value on first line?

The CoffeeScript parser is telling me that this is not okay:
{ one: 1,
  two: 2
}
But this is:
{
  one: 1,
  two: 2
}
Is this a straightforward syntax rule, or a side-effect of something else going on in this example?
CoffeeScript is whitespace sensitive, and has optional tokens to enhance readability.
Without looking into the internals of the parser, my understanding is that CoffeeScript strips out the optional parts during the parsing phase, something like...
# original...
x = { one: 1,
  two: 2
}
# ignore curlies, leaving...
x = one: 1,
  two: 2
# remove comma leaving incorrectly indented code...
x = one: 1
  two: 2
By writing in a whitespace-sensitive style from the outset...
# new style
x = {
  one: 1,
  two: 2
}
# remove the curlies
x =
  one: 1,
  two: 2
# remove the comma, and nice clear whitespace sensitive definition remains...
x =
  one: 1
  two: 2

xquery substring and then concat

I am trying to use substring and concat functions in an XQuery statement together.
Here is what I have:
fn:concat(fn:substring({data($x/DueDate)},1,4),fn:substring({data($x/DueDate)},6,2),fn:substring({data($x/(DueDate))},9,2))
The value for DueDate in my xml is: 2013-06-27.
I expected the above function to return: 20130704.
Instead, this is what I get:
fn:concat(fn:substring(2013-06-27,1,4),fn:substring(2013-06-27,6,2),fn:substring(2013-06-27,9,2))
I am confused as to why!
If you use it within an element, you need to surround the whole expression with curly braces, like:
{
  fn:concat(
    fn:substring(data($x/DueDate), 1, 4),
    fn:substring(data($x/DueDate), 6, 2),
    fn:substring(data($x/DueDate), 9, 2)
  )
}
Are you sure that your DueDate value is actually 2013-06-27? Here is what each individual call to substring equals:
fn:substring('2013-06-27',1,4)
=> 2013
fn:substring('2013-06-27',6,2)
=> 06
fn:substring('2013-06-27',9,2)
=> 27
substring() starts at the 1-indexed character number passed in the second param and copies the number of characters passed in the third param.
Clean up your formatting and bracketing (remove the unnecessary curly brackets and some parentheses) and you will get valid code:
fn:concat(
  fn:substring(data($x/DueDate), 1, 4),
  fn:substring(data($x/DueDate), 6, 2),
  fn:substring(data($x/DueDate), 9, 2)
)
There is a much shorter and more readable version, too:
fn:string-join(fn:tokenize($x/DueDate, '-'), '')
Or even replace using regular expressions:
fn:replace($x/DueDate, '-', '')
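To double-check that the three variants above agree, here is a small Python model of them. It is only a sketch: xq_substring is a hypothetical helper that mimics the 1-based indexing of fn:substring for positive integer arguments and ignores its other edge cases.

```python
import re


def xq_substring(text, start, length):
    """Simplified model of fn:substring: 1-based start, fixed length."""
    return text[start - 1:start - 1 + length]


due_date = "2013-06-27"

# Variant 1: three substrings concatenated (year, month, day).
by_substring = (
    xq_substring(due_date, 1, 4)
    + xq_substring(due_date, 6, 2)
    + xq_substring(due_date, 9, 2)
)
# Variant 2: tokenize on '-' and join with an empty separator.
by_tokenize = "".join(due_date.split("-"))
# Variant 3: replace every '-' with nothing.
by_replace = re.sub("-", "", due_date)

print(by_substring, by_tokenize, by_replace)
```

For the value 2013-06-27 all three produce 20130627.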