I'm trying to create a parser for a program. For example, if I enter
"(2+3)-4", I want it to become something like "(minus, (plus, num 2, num 3), num 4)".
What I've done so far:
I split "(2+3)-4" into the list Z = ["(","2","+","3",")","-","4"], then check whether "-" is a member of Z; if it is, I append the element "-" to a new list, ["-"].
I'm not sure whether my approach is correct. I'm new to Erlang and struggling quite a lot. If anyone can offer some insight, I'd appreciate it.
Consider the following, which returns a tuple-based representation of its input:
parse(Expr) ->
    Elems = re:split(Expr, "([-+)(])", [{return,list}]),
    parse(lists:filter(fun(E) -> E /= [] end, Elems), []).

parse([], [Result]) ->
    Result;
parse([], [V2,{op,Op},V1|Tacc]) ->
    parse([], [{Op,V1,V2}|Tacc]);
parse(["("|Tail], Acc) ->
    parse(Tail, [open|Acc]);
parse([")"|Tail], [Op,open|TAcc]) ->
    parse(Tail, [Op|TAcc]);
parse(["+"|Tail], Acc) ->
    parse(Tail, [{op,plus}|Acc]);
parse(["-"|Tail], Acc) ->
    parse(Tail, [{op,minus}|Acc]);
parse([V2|Tail], [{op,Op},V1|Tacc]) ->
    parse(Tail, [{Op,V1,{num,list_to_integer(V2)}}|Tacc]);
parse([Val|Tail], Acc) ->
    parse(Tail, [{num,list_to_integer(Val)}|Acc]).
The first function, parse/1, splits the expression along the + and - operators and parentheses, preserving these in the resulting list. It then filters that list to remove empty elements, and passes it with an empty accumulator to parse/2.
The parse/2 function has eight clauses, described below:
The first two handle the case when the parsed input list has been exhausted. The second of these handles the case where multiple elements in the accumulator need to be collapsed into a single tuple consisting of operator and operands.
The next two clauses handle parentheses. When we see an open parenthesis, we push the atom open onto the accumulator. Upon seeing the matching close parenthesis, we expect to see an operation tuple and the atom open in the accumulator, and we replace them with just the tuple.
Clauses 5 and 6 handle + and - respectively. Each just pushes a {op,Operator} tuple into the accumulator, where Operator is either the atom plus or the atom minus.
The final two clauses handle values. The first one handles the case where the top of the accumulator is an op tuple followed by an earlier operand; both are replaced with a full operation tuple consisting of the atom plus or minus, the earlier operand, and a num tuple holding the new integer. The last clause handles plain values by pushing a num tuple.
Putting this in a module p, compiling it, and running it in an Erlang shell yields the following:
1> p:parse("2+3").
{plus,{num,2},{num,3}}
2> p:parse("(2+3)-4").
{minus,{plus,{num,2},{num,3}},{num,4}}
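For readers who want to trace the accumulator step by step in another language, here is a rough Python sketch of the same approach. It is not part of the answer above: the names tokenize and parse, and the use of Python tuples and a list as the stack, are choices made for this illustration, and like the Erlang version it assumes well-formed input containing only integers, +, - and parentheses.

```python
import re


def tokenize(expr):
    # Split on + - ( ) while keeping the delimiters, then drop empty strings.
    return [tok for tok in re.split(r"([-+()])", expr) if tok]


def parse(expr):
    acc = []  # used as a stack; the top is the last element
    for tok in tokenize(expr):
        if tok == "(":
            acc.append("open")
        elif tok == ")":
            # Expect [..., "open", result]; keep only the result.
            result = acc.pop()
            if acc.pop() != "open":
                raise ValueError("unbalanced parentheses")
            acc.append(result)
        elif tok in ("+", "-"):
            acc.append(("op", "plus" if tok == "+" else "minus"))
        else:
            value = ("num", int(tok))
            # If an operator is waiting, combine it with the earlier operand.
            if acc and isinstance(acc[-1], tuple) and acc[-1][0] == "op":
                _, op = acc.pop()
                left = acc.pop()
                value = (op, left, value)
            acc.append(value)
    # Collapse anything left over, e.g. after "2-(3+4)".
    while len(acc) > 1:
        right = acc.pop()
        _, op = acc.pop()
        left = acc.pop()
        acc.append((op, left, right))
    return acc[0]


print(parse("(2+3)-4"))
```

The final while loop plays the role of the second parse/2 clause: it is only needed when a parenthesised group appears on the right-hand side of an operator.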
I am trying to convert a PCollection, that has many elements, into a PCollection that has one element. Basically, I want to go from:
[1,2,3,4,5,6]
to:
[[1,2,3,4,5,6]]
so that I can work with the entire PCollection in a DoFn.
I've tried CombineGlobally(lambda x: x), but only a portion of the elements gets combined into an array at a time, giving me the following result:
[1,2,3,4,5,6] -> [[1,2],[3,4],[5,6]]
Or something to that effect.
This is the relevant portion of the script that I'm trying to run:
import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline

raw_input = range(1024)

def run_test():
    with TestPipeline() as test_pl:
        input = test_pl | "Create" >> beam.Create(raw_input)

        def combine(x):
            print(x)
            return x

        (
            input
            | "Global aggregation" >> beam.CombineGlobally(combine)
        )
        # No explicit run() call: the pipeline runs when the with block exits.

run_test()
I figured out a pretty painless way to do this, which I missed in the docs:
The more general way to combine elements, and the most flexible, is
with a class that inherits from CombineFn.
CombineFn.create_accumulator(): This creates an empty accumulator. For
example, an empty accumulator for a sum would be 0, while an empty
accumulator for a product (multiplication) would be 1.
CombineFn.add_input(): Called once per element. Takes an accumulator
and an input element, combines them and returns the updated
accumulator.
CombineFn.merge_accumulators(): Multiple accumulators could be
processed in parallel, so this function helps merging them into a
single accumulator.
CombineFn.extract_output(): It allows to do additional calculations
before extracting a result.
I suppose supplying a lambda function that simply passes its argument to the "vanilla" CombineGlobally wouldn't do what I expected initially. That functionality has to be specified by me (although I still think it's weird this isn't built into the API).
You can find more about subclassing CombineFn here, which I found very helpful:
A CombineFn specifies how multiple values in all or part of a
PCollection can be merged into a single value—essentially providing
the same kind of information as the arguments to the Python “reduce”
builtin (except for the input argument, which is an instance of
CombineFnProcessContext). The combining process proceeds as follows:
1. Input values are partitioned into one or more batches.
2. For each batch, the create_accumulator method is invoked to create a fresh initial “accumulator” value representing the combination of zero values.
3. For each input value in the batch, the add_input method is invoked to combine more values with the accumulator for that batch.
4. The merge_accumulators method is invoked to combine accumulators from separate batches into a single combined output accumulator value, once all of the accumulators have had all the input value in their batches added to them. This operation is invoked repeatedly, until there is only one accumulator value left.
5. The extract_output operation is invoked on the final accumulator to get the output value.
Note: If this CombineFn is used with a transform that has defaults, apply will be called with an empty list at expansion time to get the default value.
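To make the batch, accumulate, merge, extract sequence concrete, here is a small plain-Python simulation. It does not use Beam at all: MeanFn is only a stand-in with the same four method names, and simulate_combine and its batch_size parameter are hypothetical helpers invented for this sketch. A real runner decides the batching itself.

```python
class MeanFn:
    """Plain-Python stand-in exposing the four CombineFn method names."""

    def create_accumulator(self):
        return (0, 0)  # (running sum, count)

    def add_input(self, accumulator, element):
        total, count = accumulator
        return (total + element, count + 1)

    def merge_accumulators(self, accumulators):
        totals, counts = zip(*accumulators)
        return (sum(totals), sum(counts))

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("nan")


def simulate_combine(fn, values, batch_size):
    """Hypothetical driver: split into batches, accumulate, merge, extract."""
    values = list(values)
    batches = [values[i:i + batch_size]
               for i in range(0, len(values), batch_size)] or [[]]
    accumulators = []
    for batch in batches:
        acc = fn.create_accumulator()
        for value in batch:
            acc = fn.add_input(acc, value)
        accumulators.append(acc)
    merged = fn.merge_accumulators(accumulators)
    return fn.extract_output(merged)


print(simulate_combine(MeanFn(), [1, 2, 3, 4, 5, 6], batch_size=2))  # 3.5
```

The result is the same whatever batch size is chosen, which is the property a combine function needs in order to be safe under arbitrary partitioning.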
So, by subclassing CombineFn, I wrote this simple implementation, Aggregated, that does exactly what I want:
import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline

raw_input = range(1024)

class Aggregated(beam.CombineFn):
    def create_accumulator(self):
        # Start each batch with an empty list.
        return []

    def add_input(self, accumulator, element):
        accumulator.append(element)
        return accumulator

    def merge_accumulators(self, accumulators):
        # Flatten the per-batch lists into one list.
        merged = []
        for a in accumulators:
            for item in a:
                merged.append(item)
        return merged

    def extract_output(self, accumulator):
        return accumulator

def run_test():
    with TestPipeline() as test_pl:
        input = test_pl | "Create" >> beam.Create(raw_input)
        (
            input
            | "Global aggregation" >> beam.CombineGlobally(Aggregated())
            | "print" >> beam.Map(print)
        )
        # No explicit run() call: the pipeline runs when the with block exits.

run_test()
You can also accomplish what you want with side inputs, e.g.
with beam.Pipeline() as p:
    pcoll = ...
    (p
     # Create a PCollection with a single element.
     | beam.Create([None])
     # This will process the singleton exactly once,
     # with the entirety of pcoll passed in as a second argument as a list.
     | beam.Map(
         lambda _, pcoll_as_side: ...consume pcoll_as_side here...,
         pcoll_as_side=beam.pvalue.AsList(pcoll)))
I think I'm close to what I want, though I suspect I'm not understanding how thaw / TH Region works.
Here is what I'm trying to implement (at least roughly)
modifyPerIndex :: forall t a. Foldable t => t (Tuple Int (a -> a)) -> Array a -> Array a
modifyPerIndex foldableActions array = run do
  mutableArray <- thaw array
  let actions = fromFoldable foldableActions
  foreach actions (\(Tuple index action) -> modify index action mutableArray)
  freeze mutableArray
This is sort of how I imagine updateAtIndices works. I suppose I could write modifyPerIndex to use updateAtIndices by reading in the values, applying the (a -> a) and mapping the result into a list of Tuples to be sent to updateAtIndices.
I'm curious how to do it this way though.
In the code above modify returns ST h Boolean, which I'd like to change into ST h Unit. That's where I'm lost. I get that h here is a constraint put on mutable data to stop it from leaving run, what I don't understand is how to use that.
There are a few options. But it has nothing to do with h. You don't have to "use" it for anything, and you don't have to worry about it at all.
First, the most dumb and straightforward approach - just bind the result to an ignored variable and then separately return unit:
foreach actions \(Tuple index action) -> do
  _ <- modify index action mutableArray
  pure unit
Alternatively, you can use void, which does more or less the same thing under the hood:
foreach actions \(Tuple index action) -> void $ modify index action mutableArray
But I would go straight for for_, which is the same as foreach, but works for any monad (not just ST) and ignores individual iterations' return values:
for_ actions \(Tuple index action) -> modify index action mutableArray
From https://code.kx.com/q/wp/parse-trees/#the-solution
I came across the function below, which translates enlisted symbols or symbol lists into the string "enlist".
ereptest:{ //returns a boolean
  (1=count x) and ((0=type x) and 11=type first x) or 11=type x}
ereplace:{"enlist",.Q.s1 first x}
funcEn:{$[ereptest x;ereplace x;0=type x;.z.s each x;x]} <<<<<
In the last line, it seems $ is applied to 5 arguments, but this page shows $ is of rank 2 or 3. What am I missing here?
From the kx wiki
Odd number of expressions
For brevity, nested triads can be flattened.
$[q;a;r;b;c] <=> $[q;a;$[r;b;c]]
These two expressions are equivalent:
$[0;a;r;b;c]
$[r;b;c]
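The flattening rule can be mimicked in a few lines of Python. This is only an analogy: cond is a hypothetical helper written for this sketch, and unlike q's $ it evaluates all of its arguments eagerly before choosing one.

```python
def cond(*args):
    """Flattened conditional: cond(q, a, r, b, c) means a if q else (b if r else c)."""
    if len(args) == 1:
        # Only the final "else" value is left.
        return args[0]
    test, then, *rest = args
    return then if test else cond(*rest)


# Five arguments behave like a nested three-argument conditional.
flat = cond(0, "a", 1, "b", "c")
nested = cond(0, "a", cond(1, "b", "c"))
print(flat, nested)  # b b
```

Applied to funcEn, the five-argument form reads as: if ereptest x, then ereplace x; else if 0=type x, then .z.s each x; else x.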
What is the difference between (0:2):4 and 0:(2:4)?
Both seem to neglect the second element inside the brackets, printing the same values as writing (0:4) and (0:2) respectively.
From this I could generalize that only the first element of the bracketed vector is being used, but I would like to know the actual reason why this happens.
The colon operator has lower precedence than (), so MATLAB first evaluates the vector inside the parentheses. Then, if one of the operands of the colon is a vector, colon uses only its first element. Here are the evaluation steps:
(0:2):4 -> (0:2)=[0 1 2] -> 0:4 -> [0,1,2,3,4]
0:(2:4) -> (2:4)=[2 3 4] -> 0:2 -> [0,1,2]
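The same two evaluations can be reproduced with a short Python sketch. The colon function here is a hypothetical helper that imitates only the behaviour described above (integer operands, unit step, first element taken from a vector operand); it is not how MATLAB is implemented.

```python
def colon(start, stop):
    """Hypothetical helper imitating MATLAB's start:stop for integer operands."""
    # A non-scalar operand contributes only its first element.
    if isinstance(start, list):
        start = start[0]
    if isinstance(stop, list):
        stop = stop[0]
    return list(range(start, stop + 1))


print(colon(colon(0, 2), 4))  # (0:2):4 -> [0, 1, 2, 3, 4]
print(colon(0, colon(2, 4)))  # 0:(2:4) -> [0, 1, 2]
```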
I have two lists, for example:
x:("AA","BB","CC")
y:("1","2","3")
I would like to concatenate the two lists element-wise, as below:
z = ("AA1","BB2","CC3")
I have tried the following which only works if the lists have one string:
(x,y)
Use the each-both adverb ('), which takes one element from each list at a time and applies the operation to each pair.
Also change the commas to semicolons in your x and y lists to get lists with 3 items; with commas, the strings are joined into one flat string.
q) x:("AA";"BB";"CC")
q) y:("1";"2";"3")
q) x,'y
Output:
("AA1";"BB2";"CC3")
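For comparison, the same pairing can be sketched in Python, where zip plays roughly the role of each-both. This is only an analogy for readers more familiar with Python; the variable names mirror the q example.

```python
x = ["AA", "BB", "CC"]
y = ["1", "2", "3"]

# Pair up corresponding items and join each pair, like x,'y in q.
z = [a + b for a, b in zip(x, y)]
print(z)  # ['AA1', 'BB2', 'CC3']
```

One difference worth noting: zip silently stops at the shorter list, whereas q's each-both signals a length error when the two lists have different counts.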