Suppose the parent vector p is defined as a list in which each element is the index of that item's parent in the same vector.
The children of each parent can then be found as:
q) c:group p:0N 0 1 0 2
| ,0
0| 1 3
1| ,2
2| ,4
Given c, what is an efficient way to flatten the children dictionary c back into the parent vector p?
ungroup does not work on dictionaries directly:
q) ungroup c
'type
But we can ungroup tables:
q) {@[;`k] `v xasc ungroup ([]k:key x;v:value x)} c
0N 0 1 0 2
Is there a more efficient way to get p given c?
There is no native q command for the kind of ungroup you're looking for.
One option that may be useful is the following function:
invgroup:{key[x]@[raze x;value x;:;til count x]}
What this does is take the values of the group dictionary as a single list (raze x), index into that list at each set of associated indices (value x), and assign to each set the position of its key in the dictionary (til count x).
We then use these positions to index into the distinct values of the original list (key x) to recover the original list:
p ~ invgroup group p:0N 0 1 0 2
1b
A simpler solution is:
q) @[raze c;value c;:;key c]
0N 0 1 0 2
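For readers less familiar with q, the inversion can be sketched in Python. This only illustrates the idea behind the amend-based answers and is not q code: group and invgroup are hypothetical helpers written for this sketch, and None stands in for the null 0N.

```python
def group(p):
    """Map each distinct value of p to the list of positions where it occurs."""
    c = {}
    for i, parent in enumerate(p):
        c.setdefault(parent, []).append(i)
    return c


def invgroup(c):
    """Rebuild the parent vector by writing each key at every index it owns."""
    n = sum(len(children) for children in c.values())
    p = [None] * n
    for parent, children in c.items():
        for i in children:
            p[i] = parent
    return p


p = [None, 0, 1, 0, 2]
c = group(p)
print(invgroup(c) == p)
```

In this sketch every index is written exactly once, so the rebuild needs no sort, unlike the ungroup and xasc approach from the question.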
I am interested in learning how to implement Kadane's algorithm (maximum subarray sum) in Scala with the foldLeft function. I ran through this example on Stack Overflow, but I am not sure I understand exactly what the algorithm does. This is what it looks like:
someArray.foldLeft(0 -> 0) {
  case ((maxUpToHere, maxSoFar), n) =>
    val maxEndingHere = 0 max maxUpToHere + n
    maxEndingHere -> (maxEndingHere max maxSoFar)
}._2
Is the content inside the {} the lambda function that is applied to every element? Also, what exactly does the line maxEndingHere -> (maxEndingHere max maxSoFar) do? Why are the terms inside the parentheses separated by spaces? I appreciate any help. Sorry if my question comes across as too ignorant, but I am new to Scala.
First, you need to understand what foldLeft is. It folds a collection into a single value, given an initial element and a combining operation:
// Think of applying op(...op(op(z, x1), x2)..., xn) from left to right:
def foldLeft[B](z: B)(op: (B, A) ⇒ B): B
Now, let's see what's happening in your foldLeft. First, 0 -> 0 is passed. This means that the type B of the initial element is a tuple (Int, Int) with the value (0, 0).
Second, the opening curly brace starts a function literal. In Scala you can pass a function argument in curly braces. The function takes arguments of type (B, A); in our case B is the tuple (Int, Int) and A is the element type of the array, which is Int.
So you can translate your code like this:
someArray.foldLeft(0 -> 0) {
(tuple: (Int, Int), element: Int) => //the body
}
Now, in Scala a block of case clauses inside braces defines a pattern-matching anonymous function. The pattern in our case matches the supplied arguments, binding the variables maxUpToHere and maxSoFar to the tuple elements and n to the current element of the array.
So the function takes each element of the array in turn, combines it with the current tuple, and passes the result on to the next application until the whole array has been processed. Now, let's see what's happening in the function body:
val maxEndingHere = 0 max maxUpToHere + n
maxEndingHere -> (maxEndingHere max maxSoFar)
Remember that our function must return the next B, which is used for the next invocation with the following element of the array. B is a tuple in our case, so the idea is to store both the local max of the current sequence and the overall max in a tuple.
maxEndingHere takes the max of 0 and the sum of the previous running value and the current element n. If adding the current element makes the running sum negative, the comparison yields 0, which resets the accumulated value.
Then we create a new tuple from the newly calculated running sum maxEndingHere and the maximum of that value and the best one seen so far (hence the name maxSoFar).
Lastly, we take the second value of the final tuple by calling ._2.
Yes, the content inside {} is the lambda function, and it is applied to every element of the array.
maxEndingHere -> (maxEndingHere max maxSoFar)
For the next iteration, this sets maxUpToHere to maxEndingHere and maxSoFar to the maximum of maxEndingHere and maxSoFar.
To dry run the code: for the array below, the code will run as follows for each element:
someArray: Array[Int] = Array(5, 2, -10, 6, 8)
For element n = 5
maxUpToHere = 0
maxSoFar = 0
n = 5
maxEndingHere = 5
For element n = 2
maxUpToHere = 5
maxSoFar = 5
n = 2
maxEndingHere = 7
For element n = -10
maxUpToHere = 7
maxSoFar = 7
n = -10
maxEndingHere = 0
For element n = 6
maxUpToHere = 0
maxSoFar = 7
n = 6
maxEndingHere = 6
For element n = 8
maxUpToHere = 6
maxSoFar = 7
n = 8
maxEndingHere = 14
res15: Int = 14
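The same fold can be sketched in Python with functools.reduce, which plays the role of foldLeft here. This is an illustrative translation rather than the original Scala; the function name max_subarray_sum and the helper step are made up for this sketch.

```python
from functools import reduce


def max_subarray_sum(xs):
    """Kadane's algorithm as a left fold over (max_up_to_here, max_so_far)."""

    def step(state, n):
        max_up_to_here, max_so_far = state
        # Extend the current run, or reset to 0 if the running sum goes negative.
        max_ending_here = max(0, max_up_to_here + n)
        return max_ending_here, max(max_ending_here, max_so_far)

    # (0, 0) is the initial accumulator; [1] picks max_so_far, like ._2 in Scala.
    return reduce(step, xs, (0, 0))[1]


print(max_subarray_sum([5, 2, -10, 6, 8]))  # 14
```

Because the running value is clamped at 0, this variant returns 0 for an array whose elements are all negative, which corresponds to choosing the empty subarray.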
I have a pair RDD like this:
id value
id1 set(1232, 3,1,93,35)
id2 set(321,42,5,13)
id3 set(1233,3,5)
id4 set(1232, 56,3,35,5)
Now, I want to get the total count of ids per value contained in the set. So the output for the above table should be something like this:
set value count
1232 2
1 1
93 1
35 2
3 3
5 3
321 1
42 1
13 1
1233 1
56 1
Is there a way to achieve this?
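I would recommend using the DataFrame API since it is easier and more understandable. Using this API, the problem can be solved with explode and groupBy as follows:
import org.apache.spark.sql.functions.explode
import spark.implicits._ // for the $"..." syntax; assumes a SparkSession named spark
df.withColumn("value", explode($"value"))
.groupBy("value")
.count()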
Using an RDD instead, one possible solution is using flatMap and aggregateByKey:
rdd.flatMap(x => x._2.map(s => (s, x._1)))
.aggregateByKey(0)((n, str) => n + 1, (p1, p2) => p1 + p2)
The result is the same in both cases.
yourrdd.toDF().withColumn("_2", explode(col("_2"))).groupBy("_2").count.show
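The flatMap-then-count idea can be checked on the sample data without Spark. The following is a plain-Python sketch of the same logic, not Spark code; the function name count_ids_per_value is hypothetical.

```python
from collections import Counter

# The sample pair data from the question: (id, set of values).
rows = [
    ("id1", {1232, 3, 1, 93, 35}),
    ("id2", {321, 42, 5, 13}),
    ("id3", {1233, 3, 5}),
    ("id4", {1232, 56, 3, 35, 5}),
]


def count_ids_per_value(rows):
    """Flatten every (id, set) pair into its values, then count occurrences."""
    return Counter(value for _id, values in rows for value in values)


counts = count_ids_per_value(rows)
print(counts[3], counts[5], counts[1232])  # 3 3 2
```

Since each id contributes a set, a value is counted at most once per id, so the count per value equals the number of ids containing it.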
In Mathematica, almost all commands automatically thread (or map) over a list.
In Maple, how does one determine which command automatically acts over entries of a list or a set?
For example:
y+p*x=2*sqrt(x*y);
r:=[solve(%,y)];
This gives a list of two entries (the solutions):
#r := [-p*x+(2*(1+sqrt(1-p)))*x, -p*x+(2*(1-sqrt(1-p)))*x]
Now I found that collect automatically maps on each list entry
collect(r,x);
# [(-p+2+2*sqrt(1-p))*x, (-p+2-2*sqrt(1-p))*x]
But another command does not (I just picked this one)
MmaTranslator[Mma][LeafCount](r);
#37
For the above one needs to explicitly iterate over the entries of a list or a set.
map(MmaTranslator[Mma][LeafCount],r)
#[17, 19]
Is there a way in Maple to find which command automatically threads over entries of a list or a set other than trial and error?
Maple 2018.1
I don't know of any place in the documentation that says exactly which commands will automatically map over a list.
But the collection of such commands is not large. The vast majority of commands will not automatically map over a list. Most of the ones which auto-map over a list relate to simplification or related manipulation of expressions. The collection of commands which auto-map over a list contains at least these:
collect, combine, expand,
evala, evalc, evalf,
factor, normal, radnormal, rationalize, simplify
The auto-mapping over lists for those commands is mostly a convenience to provide a shorter syntax than wrapping explicitly with the map command.
There are also commands which preserve structure (unless explicitly told, via options, that the outer list structure is the thing to alter) and thus usually accomplish the same thing for a list as mapping over the list:
convert, eval, evalindets, subs, subsindets
Modern Maple has another shorter syntax which can map a command over a list (or a set, or a Vector, etc). It is called the "elementwise" operation, and its syntax consists of appending ~ (tilde) to the command.
Eg,
discont~( [ csc(x), sec(x) ], x );
[{Pi _Z1~}, {Pi _Z2~ + 1/2 Pi}]
As far as your other example goes, note that LeafCount computes a value (metric) for the first argument considered as a single expression. But a list of items is still a single expression. So it certainly should not be surprising that (without the ~) it acts on the list as a whole, rather than automatically mapping over it. It counts the enclosing list as an additional "leaf".
L0 := [ sin(x), 1/2*x*cos(x) ]:
MmaTranslator:-Mma:-LeafCount( L0 );
8
MmaTranslator:-Mma:-LeafCount~( L0 );
[2, 5]
map( MmaTranslator:-Mma:-LeafCount, L0 );
[2, 5]
For an example similar to your original there is no difference between applying collect (which auto-maps), applying it elementwise with collect~, and applying it with map. Here, all three results are the same because the additional argument, x, happens to be a scalar. E.g.,
r := [p*x+(2*(x^2+p^2))*x, p*x+(2*(x^2-p^2))*x]:
collect(r, x);
[2*x^3 + (2*p^2 + p)*x, 2*x^3 + (-2*p^2 + p)*x]
collect~(r, x);
[2*x^3 + (2*p^2 + p)*x, 2*x^3 + (-2*p^2 + p)*x]
map(collect, r, x);
[2*x^3 + (2*p^2 + p)*x, 2*x^3 + (-2*p^2 + p)*x]
I should mention that the above examples will behave differently if the second argument is a list such as [x,p] rather than a scalar such as x.
s := [a*b+(2*(a^2*b+b^2))*a, a*b+(2*(a^2*b-b^2))*a]:
collect(s, [a,b]);
[2*b*a^3 + (2*b^2 + b)*a, 2*b*a^3 + (-2*b^2 + b)*a]
map(collect, s, [a,b]);
[2*b*a^3 + (2*b^2 + b)*a, 2*b*a^3 + (-2*b^2 + b)*a]
collect~(s, [a,b]);
[2*b*a^3 + (2*b^2 + b)*a, -2*a*b^2 + (2*a^3 + a)*b]
zip(collect, s, [a,b]);
[2*b*a^3 + (2*b^2 + b)*a, -2*a*b^2 + (2*a^3 + a)*b]
In the above, the elementwise collect~ example acts like zip when the second argument is also a list. That is, the first item in the first argument is collected wrt the first item in the second argument, and the second item in the first argument is collected wrt the second item in the second argument.
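The difference between the two behaviours can be sketched outside Maple. The Python below is only an analogy, not Maple semantics: map_with_extra, zip_with and the stand-in collect are hypothetical names, and collect simply records its arguments instead of doing any algebra.

```python
def map_with_extra(f, items, extra):
    """Like map(f, items, extra): every item gets the same extra argument."""
    return [f(item, extra) for item in items]


def zip_with(f, items, extras):
    """Like zip(f, items, extras): items and extras are paired position by position."""
    return [f(item, extra) for item, extra in zip(items, extras)]


def collect(expr, variables):
    # Stand-in that just records what it was called with.
    return (expr, variables)


s = ["s1", "s2"]
print(map_with_extra(collect, s, ["a", "b"]))
print(zip_with(collect, s, ["a", "b"]))
```

The first call passes the whole list of variables to each item, while the second pairs s1 with a and s2 with b, which mirrors the zip-like behaviour described for collect~ when both arguments are lists.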
Another feature of the elementwise operator syntax is that it will not map the command over the operands of a scalar expression (ie. not a list, set, Vector, etc). This is in stark contrast to map, which can be used to map an operation over the operands of an expression.
Here are two examples where map applies the command to the operands of a scalar expression, while using elementwise ~ gets the command applied only to the scalar expression itself. In the first example the operands are the summands of a sum of terms. In the second example the operands are the arguments of an unevaluated function call.
T := x^2 * sin(x) + y^2 * cos(x):
F( T );
F(x^2*sin(x) + y^2*cos(x))
F~( T );
F(x^2*sin(x) + y^2*cos(x))
map( F, T );
F(x^2*sin(x)) + F(y^2*cos(x))
G( arctan(a, b) );
G(arctan(a, b))
G~( arctan(a, b) );
G(arctan(a, b))
map( G, arctan(a, b) );
arctan(G(a), G(b))
So, if you don't want to map a command inadvertently over the operands of a scalar expression (addends, multiplicands, etc.) then you can use the elementwise ~ syntax without having to first test whether the first expression is a scalar or a list (etc.).
Again, if there is an additional argument then it makes a difference whether it is a scalar or a list.
F( T, a );
F(sin(x) + cos(x), a)
F~( T, a );
F(sin(x) + cos(x), a)
map( F, T, a );
F(sin(x), a) + F(cos(x), a)
F( T, [a,b] );
F(sin(x) + cos(x), [a, b])
map( F, T, [a,b] );
F(sin(x), [a, b]) + F(cos(x), [a, b])
F~( T, [a,b] );
[F(sin(x) + cos(x), a), F(sin(x) + cos(x), b)]
zip( F, T, [a,b] );
[F(sin(x) + cos(x), a), F(sin(x) + cos(x), b)]
a = [ 1, 2, 3]
a
[1,2,3]
b = [ 3, 4, 5]
b
[3,4,5]
c = [a ,b]
c
[[1,2,3],[3,4,5]]
a !! 2
(Just 3)
a !! 1
(Just 2)
c !! 2
Nothing
c !! 1
(Just [3,4,5])
c !! 1 !! 0
Error found:
in module $PSCI
at line 1, column 1 - line 1, column 11
Could not match type
Maybe
with type
Array
while trying to match type Maybe (Array Int)
with type Array t0
while checking that expression (index c) 1
has type Array t0
in value declaration it
where t0 is an unknown type
Indexing into an array does not return the plain element but a value wrapped in Maybe, because the array might not have an element at the given index. In your case, the result of c !! 1 has type Maybe (Array Int), so you have to handle that Maybe somehow.
I guess you expect the end result to be of type Maybe Int. There are different ways to get there. Perhaps the most explicit one is:
case c !! 1 of
  Nothing -> Nothing
  Just x -> x !! 0
(this will return Just 3)
Because "chaining" functions like this is very common, there are abstractions that lead to the same result, e.g.:
(c !! 1) >>= (_ !! 0)
Anyways, the trick is to reach into the first result (if it was successful) and then try the second indexing. If both succeed, return the end result. If one fails, return Nothing.
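The same chaining idea can be sketched in Python, using None in place of Nothing. This is an analogy under stated assumptions, not PureScript: index and bind are hypothetical helpers, and the sketch assumes the arrays never contain None as a legitimate element.

```python
def index(xs, i):
    """Safe indexing: the element at i, or None if i is out of range."""
    return xs[i] if 0 <= i < len(xs) else None


def bind(maybe_value, f):
    """Apply f only if the previous step succeeded, otherwise stay None."""
    return None if maybe_value is None else f(maybe_value)


a = [1, 2, 3]
b = [3, 4, 5]
c = [a, b]

print(bind(index(c, 1), lambda row: index(row, 0)))  # 3
print(bind(index(c, 2), lambda row: index(row, 0)))  # None
```

The second call fails at the outer index, so the inner lookup is never attempted, which is the short-circuiting described above.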
As a minimal example, suppose I have:
q) x:flip `a`b!(enlist 1;enlist 2);
q) y:flip `c`d!(enlist 3;enlist 4);
q) (raze x), (raze y)
`a`b`c`d!1j 2j 3j 4j # works as expected
But with peach involved,
q) {(raze x), (raze y)} peach x
enlist 1j 2j # I was expecting `a`b`c`d!1j 2j 3j 4j
There is no 3j 4j in the output - why has my raze y been ignored?
Indeed, each also gives a different output
q) {(raze x), (raze y)} each x
({:(raze x), (raze y);}';flip `a`b!(enlist 1j;enlist 2j))
I thought peach was just a parallel version of each, so both should yield the same...
What's going on?
This is not inconsistent behaviour between peach and each.
First, functions in kdb+ have the implicit parameters x, y and z if none are specified explicitly.
So f:{x+y} is equivalent to f:{[x;y] x+y}
But f:{[a;b] a+b} does not have x, y, z as implicit parameters.
For more details, see the section Implicit Parameters in http://code.kx.com/q4m3/6_Functions/#617-implicit-parameters
Peach Case:
When you do {(raze x), (raze y)} peach x:
i) Another way of writing this function is:
f:{[x;y] (raze x),(raze y)}
And the call is like: f[;] peach x
So you are passing the global x to the function's local x but nothing to y, which is why you get only 1 and 2 and not 3 and 4 in the output.
ii) Why only 1 and 2 and not `a`b!1 2 in the output?
When you pass each row of a table (x in your case) to a function, it arrives as a dictionary, and raze on a dictionary gives only the values.
For this to work correctly, modify your function like this (using Each Both):
flip {x,y}' [x;y]
' is each both, which is used when you have more than one argument and want to apply each to all of them simultaneously.
This takes one row at a time from both the global x and y, binds them to the local x and y as dictionaries, and then joins them.
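As a rough analogy for what each both does with the two tables, here is a Python sketch that treats a table as a list of row dictionaries. It is not q, and each_both is a hypothetical helper written for this illustration.

```python
def each_both(f, xs, ys):
    """Apply a two-argument function to corresponding items of xs and ys."""
    return [f(x, y) for x, y in zip(xs, ys)]


# One-row "tables" matching the question, modelled as lists of row dictionaries.
x_rows = [{"a": 1, "b": 2}]
y_rows = [{"c": 3, "d": 4}]

# Joining two row dictionaries plays the role of {x,y} in the q version.
joined = each_both(lambda row_x, row_y: {**row_x, **row_y}, x_rows, y_rows)
print(joined)  # [{'a': 1, 'b': 2, 'c': 3, 'd': 4}]
```

The point of the sketch is that both arguments are supplied explicitly and walked in step, so the second argument is never left unbound.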
Each Case:
each is just indicating that your function requires two arguments and so could not be executed.
Why did peach work when each didn't?
peach and each are not the same in all scenarios.
When you have a dyadic function, peach works like each prior (':) and not like each.
They are the same only for monadic functions. In your case you have a dyadic function.
x and y can be implicit arguments to a function. Using x and y as the names of your table variables is confusing and not recommended.
To make it clear what is happening, consider what happens if I rename the variables to a and b:
q)a:flip `a`b!(enlist 1;enlist 2);
q)b:flip `c`d!(enlist 3;enlist 4);
q)(raze a), (raze b)
a| 1
b| 2
c| 3
d| 4
q){(raze a), (raze b)} peach x
a b c d
-------
1 2 3 4
You can see how peach/each handle the implicit function arguments with this example:
q)x:x
q)y:y
q){(x;y)} each 1
{(x;y)}'[1]
q){(x;y)} peach 1
1
I can't tell for sure what behaviour you want so all I can do is point out why there's an issue.