Not displaying output in Maple for a for-do loop

I am trying to write something that displays a rotating square.
My for-do code is
for k to 30 do
RotSquare := Trans(1.5, 1.5).HomRot((1/15)*k*Pi).Trans(-1.5, -1.5).HomSquare;
graph[k] := plot(Transpose(`<,>`(RotSquare[1], RotSquare[2])), x = -3 .. 3, y = -3 .. 3)
od:
display(seq(graph[k], k = 1 .. 30), insequence = true)
However, my output is
display(PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), PLOT(`...`), insequence = true)
I was wondering if anyone knows what's going on.

The display call is returned unevaluated because the plots package has not been loaded. Make sure you load it: either reference display by its full name, plots[display] (or plots:-display), or run with(plots) when initializing your worksheet.

Related

How can I get output as below in Spark Scala

I have data like below:
+---+---+---+---+
|A  |B  |C  |D  |
+---+---+---+---+
|1  |A  |Day|D1 |
|1  |A  |Tim|1am|
|1  |A  |Tim|3am|
+---+---+---+---+
Need to create like this:
+---+---+---+----+----+
|A  |B  |Day|Tim1|Tim2|
+---+---+---+----+----+
|1  |A  |D1 |1am |3am |
+---+---+---+----+----+
Can you help with how to get this in Spark Scala?
You can add row numbers for the duplicate C values first and then do the pivot.
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// w1 numbers the duplicate C values within each (A, B, C) group; w2 carries the group size
val w1 = Window.partitionBy("A", "B", "C").orderBy("D")
val w2 = Window.partitionBy("A", "B", "C")
val df1 = df0.withColumn("row_num", row_number().over(w1)).withColumn("max_num", max("row_num").over(w2))
df1.show(false)
//+---+---+---+---+-------+-------+
//|A |B |C |D |row_num|max_num|
//+---+---+---+---+-------+-------+
//|1 |A |Tim|1am|1 |2 |
//|1 |A |Tim|3am|2 |2 |
//|1 |A |Day|D1 |1 |1 |
//+---+---+---+---+-------+-------+
// append the row number to C only for groups that actually have duplicates
val df2 = df1.withColumn("C", expr("if(max_num != 1, concat(C, row_num), C)"))
df2.show(false)
//+---+---+----+---+-------+-------+
//|A |B |C |D |row_num|max_num|
//+---+---+----+---+-------+-------+
//|1 |A |Tim1|1am|1 |2 |
//|1 |A |Tim2|3am|2 |2 |
//|1 |A |Day |D1 |1 |1 |
//+---+---+----+---+-------+-------+
// pivot the (now unique) C values into columns, taking the matching D value
val df3 = df2.groupBy("A", "B").pivot("C").agg(first("D"))
df3.show(false)
//+---+---+---+----+----+
//|A |B |Day|Tim1|Tim2|
//+---+---+---+----+----+
//|1 |A |D1 |1am |3am |
//+---+---+---+----+----+
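The answer above assumes the question's sample data is already loaded in a DataFrame named df0, which the question never shows. A minimal sketch of how it could be built in the spark-shell (the name df0 and the all-string column types are assumptions):
import spark.implicits._

val df0 = Seq(
  ("1", "A", "Day", "D1"),
  ("1", "A", "Tim", "1am"),
  ("1", "A", "Tim", "3am")
).toDF("A", "B", "C", "D")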

How to convert a map to individual columns in Spark Scala?

I have a Spark dataframe with values like below, and I am struggling to find a way to convert the map column of the input dataframe into separate columns like id, Fld1, Fld2. I'd appreciate any help or a pointer to documentation that covers this.
val df2 = Seq(
  ("1", Map("Fld1" -> "USA", "Fld2" -> "UK")),
  ("2", Map("Fld1" -> "Germany", "Fld2" -> "Portugal"))
).toDF("id", "map")
df2.show(false)
Input:
+---+-----------------------------------+
|id |map |
+---+-----------------------------------+
|1 |[Fld1 -> USA, Fld2 -> UK] |
|2 |[Fld1 -> Germany, Fld2 -> Portugal]|
+---+-----------------------------------+
Expected Output:
+---+-------+--------+
| id| Fld1 | Fld2 |
+---+-------+--------+
| 1 | USA | UK |
| 2 |Germany|Portugal|
+---+-------+--------+
Here's the performant solution:
df2
.withColumn("Fld1", $"map".getItem("Fld1"))
.withColumn("Fld2", $"map".getItem("Fld2"))
.drop("map")
.show()
+---+-------+--------+
| id| Fld1| Fld2|
+---+-------+--------+
| 1| USA| UK|
| 2|Germany|Portugal|
+---+-------+--------+
The other answer suggests using pivot, which can be really slow.
You could explode the map using selectExpr and then apply pivot, as shown below:
df2
  .selectExpr("id", "explode(map)")
  .groupBy(col("id")).pivot(col("key")).agg(first(col("value")))
  .show(false)
// result
+---+-------+--------+
|id |Fld1 |Fld2 |
+---+-------+--------+
|1 |USA |UK |
|2 |Germany|Portugal|
+---+-------+--------+

How to include/map calculated percentiles to the result dataframe?

I'm using spark-sql 2.4.1, and I'm trying to find quantiles, i.e. percentile 0, percentile 25, etc., on each column of my given data.
As I am calculating multiple percentiles, how do I retrieve each calculated percentile from the results?
My dataframe df:
+----+---------+-------------+----------+-----------+
| id| date| revenue|con_dist_1| con_dist_2|
+----+---------+-------------+----------+-----------+
| 10|1/15/2018| 0.010680705| 6|0.019875458|
| 10|1/15/2018| 0.006628853| 4|0.816039063|
| 10|1/15/2018| 0.01378215| 4|0.082049528|
| 10|1/15/2018| 0.010680705| 6|0.019875458|
| 10|1/15/2018| 0.006628853| 4|0.816039063|
+----+---------+-------------+----------+-----------+
I need to get expected output/result as below:
+----+---------+-------------+-------------+------------+-------------+
| id| date| revenue| perctile_col| quantile_0 |quantile_10 |
+----+---------+-------------+-------------+------------+-------------+
| 10|1/15/2018| 0.010680705| con_dist_1 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.010680705| con_dist_2 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.006628853| con_dist_1 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.006628853| con_dist_2 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.01378215| con_dist_1 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.01378215| con_dist_2 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.010680705| con_dist_1 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.010680705| con_dist_2 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.006628853| con_dist_1 |<quant0_val>|<quant10_val>|
| 10|1/15/2018| 0.006628853| con_dist_2 |<quant0_val>|<quant10_val>|
+----+---------+-------------+-------------+------------+-------------+
I have already calculated the quantiles like this but need to add them to the output dataframe:
val col_list = Array("con_dist_1","con_dist_2")
val quantiles = df.stat.approxQuantile(col_list, Array(0.0,0.1,0.5),0.0)
val percentile_0 = 0;
val percentile_10 = 1;
val Q0 = quantiles(col_list.indexOf("con_dist_1"))(percentile_0)
val Q10 = quantiles(col_list.indexOf("con_dist_1"))(percentile_10)
How do I get the expected output shown above?
An easy solution would be to create multiple dataframes, one for each "con_dist" column, and then use union to merge them together. This can easily be done using a map over col_list as follows:
import org.apache.spark.sql.functions._

val col_list = Array("con_dist_1", "con_dist_2")
val quantiles = df.stat.approxQuantile(col_list, Array(0.0, 0.1, 0.5), 0.0)
val percentile_0 = 0  // index of the 0.0 quantile, as in the question
val percentile_10 = 1 // index of the 0.1 quantile
val df2 = df.drop(col_list: _*) // we don't need these columns anymore

val result = col_list
  .zipWithIndex
  .map { case (col, colIndex) =>
    val Q0 = quantiles(colIndex)(percentile_0)
    val Q10 = quantiles(colIndex)(percentile_10)
    df2.withColumn("perctile_col", lit(col))
      .withColumn("quantile_0", lit(Q0))
      .withColumn("quantile_10", lit(Q10))
  }
  .reduce(_.union(_))
The final dataframe will then be:
+---+---------+-----------+------------+-----------+-----------+
| id| date| revenue|perctile_col| quantile_0|quantile_10|
+---+---------+-----------+------------+-----------+-----------+
| 10|1/15/2018|0.010680705| con_dist_1| 4.0| 4.0|
| 10|1/15/2018|0.006628853| con_dist_1| 4.0| 4.0|
| 10|1/15/2018| 0.01378215| con_dist_1| 4.0| 4.0|
| 10|1/15/2018|0.010680705| con_dist_1| 4.0| 4.0|
| 10|1/15/2018|0.006628853| con_dist_1| 4.0| 4.0|
| 10|1/15/2018|0.010680705| con_dist_2|0.019875458|0.019875458|
| 10|1/15/2018|0.006628853| con_dist_2|0.019875458|0.019875458|
| 10|1/15/2018| 0.01378215| con_dist_2|0.019875458|0.019875458|
| 10|1/15/2018|0.010680705| con_dist_2|0.019875458|0.019875458|
| 10|1/15/2018|0.006628853| con_dist_2|0.019875458|0.019875458|
+---+---------+-----------+------------+-----------+-----------+
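If more percentiles are needed later, the same union pattern can be driven by the probability array instead of hard-coded index vals. A minimal sketch under that assumption, reusing col_list, df2, and the imports from the answer above (the quantile_<percent> column naming is mine):
val probs = Array(0.0, 0.1, 0.5)
val quantiles = df.stat.approxQuantile(col_list, probs, 0.0)

val result = col_list.zipWithIndex.map { case (c, i) =>
  // add one literal quantile column per requested probability
  probs.indices.foldLeft(df2.withColumn("perctile_col", lit(c))) { (acc, j) =>
    acc.withColumn(s"quantile_${(probs(j) * 100).toInt}", lit(quantiles(i)(j)))
  }
}.reduce(_.union(_))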

How to check all values in columns efficiently using Spark?

I'm wondering how to build a dynamic filter given unknown columns in Spark.
For example, the dataframe is like below:
+-------+-------+-------+-------+-------+-------+
| colA | colB | colC | colD | colE | colF |
+-------+-------+-------+-------+-------+-------+
| Red | Red | Red | Red | Red | Red |
| Red | Red | Red | Red | Red | Red |
| Red | Blue | Red | Red | Red | Red |
| Red | Red | Red | Red | Red | Red |
| Red | Red | Red | Red | Blue | Red |
| Red | Red | White | Red | Red | Red |
+-------+-------+-------+-------+-------+-------+
The columns are only known at runtime, meaning there could also be colG, colH, and so on.
I need to check whether a whole column's values are all Red and then count such columns; in the above case the count is 3, as colA, colD, and colF are all Red.
What I am doing is something like below, and it is SLOW:
val allColumns = df.columns
var count = 0
allColumns.foldLeft(df) { (df, column) =>
  // one filter + isEmpty check per column, i.e. a separate scan of the data each time
  val tmpDf = df.filter(df(column) =!= "Red") // keep the non-Red rows; empty means the column is all Red
  if (tmpDf.rdd.isEmpty) {
    count += 1
  }
  df
}
I am wondering if there is a better way. Many thanks!
You've got N scans of the data, where N is the number of columns. You can scan all of them at once and reduce in parallel, for example this way:
import org.apache.spark.sql.Row

df.reduce((a, r) => Row.fromSeq(a.toSeq.zip(r.toSeq)
  .map { case (x, y) =>
    if (x == "Red" && y == "Red") "Red" else "Not"
  }
))
res11: org.apache.spark.sql.Row = [Red,Not,Not]
This code does a single scan of the data and then iterates over the Row columns inside reduce. Row.toSeq gets a Seq from a Row, and Row.fromSeq rebuilds a Row so that reduce keeps returning the same type.
Edit: for the count, just add .toSeq.filter(_ == "Red").size to the reduced Row.
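Putting the edit together with the reduce above, a minimal sketch of the complete count (the val names reduced and redColumnCount are mine):
import org.apache.spark.sql.Row

// one pass: collapse all rows into a single Row of "Red"/"Not" flags per column,
// then count how many columns stayed "Red"
val reduced = df.reduce((a, r) => Row.fromSeq(a.toSeq.zip(r.toSeq)
  .map { case (x, y) => if (x == "Red" && y == "Red") "Red" else "Not" }))
val redColumnCount = reduced.toSeq.filter(_ == "Red").size
// for the sample data this gives 3 (colA, colD, colF)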
Why not simply do df.filter + df.count using only the DataFrame API?
import org.apache.spark.sql.functions.{col, lit}

val filter_expr = df.columns.map(c => col(c) === lit("Red")).reduce(_ and _)
val count = df.filter(filter_expr).count
//count: Long = 3
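Note that this filter counts the rows in which every column is "Red"; for the sample data that also happens to be 3. If the requirement is strictly the number of columns whose values are all "Red", a single aggregation can get that in one pass as well. A sketch, not from either answer (the val names are mine):
import org.apache.spark.sql.functions.{col, sum, when}

// per column: count the non-"Red" rows; a column is all-"Red" exactly when that count is 0
val nonRedCounts = df.select(df.columns.map(c => sum(when(col(c) =!= "Red", 1).otherwise(0)).as(c)): _*).first()
val allRedColumns = nonRedCounts.toSeq.count(_ == 0L)
// allRedColumns: Int = 3 for the sample data (colA, colD, colF)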

Reduce rows in a group based on a column value using Spark / Scala

I want to implement netting, in the sense of reducing the rows in each group based on the conditions below:
- If the units column has both negative and positive values across the rows of a group, do an arithmetic sum of the units, and the final row takes the amt from the row that has the larger amount.
- If units has only positive or only negative values in a group, then we pass all the rows through as-is.
Given the dataset below, I want to do this netting but am unable to figure out how, as this is not a plain aggregation:
+-----+------+----+-----+
|store|prod |amt |units|
+-----+------+----+-----+
|West |Apple |2.0 |-10 |
|West |Apple |3.0 |10 |
|West |Orange|5.0 |-15 |
|West |Orange|17.0|-15 |
|South|Orange|3.0 |9 |
|South|Orange|6.0 |-18 |
|East |Milk |5.0 |-5 |
|West |Milk |5.0 |8 |
+-----+------+----+-----+
Summing should happen if there are at least 2 rows with opposite-sign units, as in the group below:
+-----+------+----+-----+
|West |Apple |2.0 |-10 |
|West |Apple |3.0 |10 |
+-----+------+----+-----+
There are 2 rows with -10 and 10, so this group reduces to zero rows, as summing -10 and 10 gives zero.
But in the group below, the net of the units is -9:
+-----+------+----+-----+
|South|Orange|3.0 |9 |
|South|Orange|6.0 |-18 |
+-----+------+----+-----+
... the result of this group would be a single row taking
- the amt from the row that has more units, and
- the net difference in units:
+-----+------+----+-----+
|South|Orange|6.0 |-9 |
+-----+------+----+-----+
Any group whose rows don't contain both negative and positive units passes through unchanged.
So the final dataset should look like below:
+-----+------+----+-----+
|store|prod |amt |units|
+-----+------+----+-----+
|West |Orange|5.0 |-15 |
|West |Orange|17.0|-15 |
|South|Orange|6.0 |-9 |
|East |Milk |5.0 |-5 |
|West |Milk |5.0 |8 |
+-----+------+----+-----+
The rows below are ...
a) removed
+-----+------+----+-----+
|West |Apple |2.0 |-10 |
|West |Apple |3.0 |10 |
+-----+------+----+-----+
or
b) reduced
+-----+------+----+-----+
|South|Orange|3.0 |9 |
|South|Orange|6.0 |-18 |
+-----+------+----+-----+
to
+-----+------+----+-----+
|South|Orange|6.0 |-9 |
+-----+------+----+-----+
Just wrapping my comment up in an answer.
I've checked the following code in the spark-shell and it worked.
// create the dataframe
import spark.implicits._ // for the $"..." column syntax (already available in the spark-shell)
val data = Seq(
  ("West" , "Apple" , 2.0 , -10),
  ("West" , "Apple" , 3.0 ,  10),
  ("West" , "Orange", 5.0 , -15),
  ("West" , "Orange", 17.0, -15),
  ("South", "Orange", 3.0 ,   9),
  ("South", "Orange", 6.0 , -18),
  ("East" , "Milk"  , 5.0 ,  -5),
  ("West" , "Milk"  , 5.0 ,   8))
val df_raw = spark.createDataFrame(data)
val col_names = Seq("store", "prod", "amt", "units")
val df = df_raw.toDF(col_names: _*)

// define window over each (store, prod) group
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
val w = Window.partitionBy($"prod", $"store")

// add should_reduce flag via window function:
// a group needs netting only if it contains both positive and negative units
val should_reduce_df = df.withColumn("should_reduce", (max($"units").over(w) > 0) && (min($"units").over(w) < 0))

// select those that are only passed on / not reduced
val pass_df = should_reduce_df.filter(!$"should_reduce").select(col_names.head, col_names.tail: _*)

// reduce those that we need to reduce:
// take the maximum amount and the sum of units,
// then filter out groups that reduce to 0 units
val reduced_df = should_reduce_df.filter($"should_reduce").groupBy($"store", $"prod").agg(max($"amt").alias("amt"), sum($"units").alias("units")).filter($"units" =!= 0)

// do a union of the passed-on and reduced dataframes
val final_df = pass_df.union(reduced_df)
final_df.show()
// up to row ordering, this matches the expected dataset in the question