When, in Elixir, should one use Macro.escape/1 instead of quote/1? I've looked at the beginner's guide and it's not helping.
quote/2 returns the abstract syntax tree (AST) of the passed-in code block.
Macro.escape/1 returns the AST of the passed-in value.
Here is an example:
iex(1)> a = %{"apple": 12, "banana": 90}
%{apple: 12, banana: 90}
iex(2)> b = quote do: a
{:a, [], Elixir}
iex(3)> c = Macro.escape(a)
{:%{}, [], [apple: 12, banana: 90]}
quote/2 keeps a reference to the original variable a, while Macro.escape/1 injects a's value into the returned AST.
iex(4)> Macro.to_string(b) |> Code.eval_string
warning: variable "a" does not exist and is being
expanded to "a()", please use parentheses to remove
the ambiguity or change the variable name
nofile:1
iex(5)> Macro.to_string(c) |> Code.eval_string
{%{apple: 12, banana: 90}, []}
iex(6)> Macro.to_string(b) |> Code.eval_string([a: "testvalue"])
{"testvalue", [a: "testvalue"]}
For completeness' sake:
iex(1)> a = %{"apple": 12, "banana": 90}
%{apple: 12, banana: 90}
iex(2)> Macro.escape(a)
{:%{}, [], [apple: 12, banana: 90]}
iex(3)> quote do: %{"apple": 12, "banana": 90}
{:%{}, [], [apple: 12, banana: 90]}
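In practice the distinction matters most inside a macro that needs to inject an already-computed value into the AST it returns, since a macro must return a quoted expression and a map is not one. A minimal sketch (the module name and value are made up for illustration):

defmodule ConfigMacro do
  defmacro prices do
    value = %{apple: 12, banana: 90}
    # a bare map is not a valid quoted expression, so escape it into AST
    Macro.escape(value)
  end
end

Every call site of ConfigMacro.prices then expands to the map literal at compile time.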
I have the following code that does a groupby on an array of numbers and returns an array of tuples with the numbers and respective counts:
using Query
a = [1, 2, 1, 2, 3, 4, 6, 1, 5, 5, 5, 5]
key_counts = a |> @groupby(_) |> @map g -> (key(g), length(values(g)))
collect(key_counts)
Is there a way to fold the collect operation into the pipeline as a one-liner, converting key_counts of type QueryOperators.EnumerableMap{Tuple{Int64, Int64}, QueryOperators.EnumerableIterable{Grouping{Int64, Int64}, QueryOperators.EnumerableGroupBy{Grouping{Int64, Int64}, Int64, Int64, QueryOperators.EnumerableIterable{Int64, Vector{Int64}}, var"#12#17", var"#13#18"}}, var"#15#20"} directly to Vector{Tuple{Int, Int}}?
The question has been clarified. My answer is no longer intended as a solution but provides additional information.
Using key_counts |> collect on a separate line works, but appending |> collect at the end of the pipeline does not, which feels like unwanted behavior.
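My suspicion is that this is a parsing issue rather than a Query bug: in the space-separated macro form, -> binds more loosely than |>, so the trailing |> collect gets swallowed into the anonymous function body instead of ending the pipeline. Parenthesizing the @map argument should keep collect in the pipeline; a sketch under that assumption:

using Query
a = [1, 2, 1, 2, 3, 4, 6, 1, 5, 5, 5, 5]
# with the lambda parenthesized, |> collect stays part of the outer pipeline
key_counts = a |> @groupby(_) |> @map(g -> (key(g), length(values(g)))) |> collect
typeof(key_counts)  # Vector{Tuple{Int64, Int64}}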
The response below is no longer relevant.
When I run your code I actually do receive a Vector{Tuple{Int, Int}} as output.
I'm using Julia v1.6.0 with Query v1.0.0.
using Query
a = [1, 2, 1, 2, 3, 4, 6, 1, 5, 5, 5, 5]
key_counts = a |> @groupby(_) |> @map g -> (key(g), length(values(g)))
output = collect(key_counts)
typeof(output) # Vector{Tuple{Int64, Int64}} (alias for Array{Tuple{Int64, Int64}, 1})
I tried this:
rdd1 = sc.parallelize(["Let's have some fun.",
                       "To have fun you don't need any plans."])
output = rdd1.map(lambda t: t.split(" ")).map(lambda lists: (lists, len(lists)))
output.foreach(print)
output:
(["Let's", 'have', 'some', 'fun.'], 4)
(['To', 'have', 'fun', 'you', "don't", 'need', 'any', 'plans.'], 8)
This gives me the total number of words per line, but I wanted the count of each word per line.
You can try this:
from collections import Counter
output = rdd1.map(lambda t: t.split(" ")).map(lambda lists: dict(Counter(lists)))
I'll give a small python example:
from collections import Counter
example_1 = "Let's have some fun."
Counter(example_1.split(" "))
# [{"Let's": 1, 'have': 1, 'some': 1, 'fun.': 1}
example_2 = "To have fun you don't need any plans."
Counter(example_2.split(" "))
# Counter({'To': 1, 'have': 1, 'fun': 1, 'you': 1, "don't": 1, 'need': 1, 'any': 1, 'plans.': 1})
Based on your input and what I understand, please find the code below. It needs only minor changes to yours:
output = rdd1.flatMap(lambda t: t.split(" ")).map(lambda word: (word, 1)).reduceByKey(lambda x, y: x + y)
You used map to split the data. Use flatMap instead: it breaks each string into individual words. The output is below:
output.collect()
[('have', 2), ("Let's", 1), ('To', 1), ('you', 1), ('need', 1), ('fun', 1), ("don't", 1), ('any', 1), ('some', 1), ('fun.', 1), ('plans.', 1)]
I'm attempting to add to each element while transforming a two-dimensional array into a one-dimensional array, using the following code in a Playground:
let twoDimensionalArray = [[1, 3, 5], [2, 4, 6], [12, 15, 16]]
let oneDimensionalArray = twoDimensionalArray.flatMap { $0.map { $0 += 2 } }
print(oneDimensionalArray)
However I receive the error:
left side of mutating operator isn't mutable: '$0' is immutable
Also, I see that the flatMap method is deprecated in the Apple documentation, so what should I be doing differently?
You're almost right. All you need is to remove the =:
let twoDimensionalArray = [[1, 3, 5], [2, 4, 6], [12, 15, 16]]
let oneDimensionalArray = twoDimensionalArray.flatMap { $0.map { $0 + 2 } }
print(oneDimensionalArray) // [3, 5, 7, 4, 6, 8, 14, 17, 18]
Inside the closure you can derive a new value from $0 (e.g. $0 + 2), but you cannot mutate it in place (i.e. $0 += 2), because closure parameters are immutable. As for the deprecation: only the flatMap overload that flattens optionals is deprecated (it was renamed compactMap); flatMap on nested sequences, as used here, is not.
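If you prefer the mutating spelling, you can copy the parameter into a local var first; a minimal sketch:

let twoDimensionalArray = [[1, 3, 5], [2, 4, 6], [12, 15, 16]]
let oneDimensionalArray = twoDimensionalArray.flatMap {
    $0.map { element -> Int in
        var copy = element // closure parameters are immutable, so mutate a local copy
        copy += 2
        return copy
    }
}
print(oneDimensionalArray) // [3, 5, 7, 4, 6, 8, 14, 17, 18]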
I have a dataframe like this:
+----------+--------------------+---------------------------+
|      Name|                rem1|                      quota|
+----------+--------------------+---------------------------+
|Customer_3|[258, 259, 260, 2...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_4|[18, 19, 20, 27, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_5|[16, 17, 51, 52, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_6|[6, 7, 8, 9, 10, ...|[1, 2, 3, 4, 5, 6, 7,..500]|
|Customer_7|[0, 30, 31, 32, 3...|[1, 2, 3, 4, 5, 6, 7,..500]|
+----------+--------------------+---------------------------+
I would like to remove the values listed in rem1 from quota and create the result as a new column. I have tried:
val dfleft = dfpci_remove2.withColumn("left",$"quota".filter($"rem1"))
<console>:123: error: value filter is not a member of org.apache.spark.sql.ColumnName
Please advise.
You can't call filter on a column like that; instead, you can write a UDF, as below:
// note the argument order: quota diff rem1 removes rem1's values from quota
val filterList = udf((quota: Seq[Int], rem1: Seq[Int]) => quota diff rem1)
df.withColumn("left", filterList($"quota", $"rem1"))
This should give you the expected result.
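As a quick sanity check of the diff direction on plain Scala collections (toy values, not your data):

val quota = Seq(1, 2, 3, 4, 5)
val rem1 = Seq(2, 4)
quota diff rem1 // Seq(1, 3, 5): quota with rem1's values removed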
Hope this helps!
In my function, I am returning finalDF, a sequence of dataframes. In the loop shown below, map returns Seq[DataFrame], which is stored in finalDF so it can be returned to the caller. But in some cases there is further processing, and I would like to store the filtered dataframe from each iteration and pass it on to the next one.
How do I do that? If I try to assign it to some temp val, it throws an error that an expression of type Seq[Unit] does not conform to the expected type Seq[DataFrame].
var finalDF: Seq[DataFrame] = null
for (i <- 0 until stop) {
  finalDF = strataCount(i).map(x => {
    df.filter(df(cols(i)) === x)
    // how to get the above dataframe to pass on to the next computation?
  })
}
Maybe this is helpful:
val finalDF: Seq[DataFrame] = (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x))).toSeq
flatMap flattens the Seq(Seq).
(0 until stop) loops over the indices, just like your for loop, and flatMap flattens the nested lists, like:
scala> (0 to 20).flatMap(i => List(i))
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
scala> (0 to 20).map(i => List(i)).flatten
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
For two counters, maybe you can do it like this:
(0 until stop).flatMap(j => {
  // use j here for whatever the second counter controls
  (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x)))
}).toSeq
Or try for/yield; see: Scala for/yield syntax.
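A for/yield version of the same computation might look like this (a sketch, assuming strataCount(i) returns a sequence of filter values):

val finalDF: Seq[DataFrame] =
  (for {
    i <- 0 until stop
    x <- strataCount(i)
  } yield df.filter(df(cols(i)) === x)).toSeq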