Related
EDIT: Still no fix, but I now know which value returned from the Cypher query is causing the error:
[43.4171031, 37.5049815, 43.4171031, 37.5049815]
It is coming from a bbox property returned by a spatial query:
val query = "call blah blah yield node return node.fromDate as fromDate, node.bbox as bbox ORDER BY node.toDateFormatLong DESC";
It does NOT like that return node.bbox as bbox; I have to take that out for the query to work. If I do, I get the DataFrame back. If I don't, I get this error:
defined class DateLayerData
defined class ChangeScoreObj
changeScoreMap: java.util.Map[Integer,ChangeScoreObj] = {}
doCalculation: (lat1: BigDecimal, lon1: BigDecimal, lat2: BigDecimal, lon2: BigDecimal, radius: Double)Unit
main: (args: Array[String])Unit
MinLat: 34.6 minlon 40.9 maxlat: 34.7 maxlon: 41
java.lang.UnsupportedOperationException: empty collection
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1370)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.first(RDD.scala:1367)
at org.neo4j.spark.Neo4j.loadDataFrame(Neo4j.scala:280)
at doCalculation(<console>:114)
at main(<console>:89)
... 81 elided
So I'm guessing I am using Option wrong?
val initialDf2 = neo.cypher(query).loadDataFrame //this seems to fail on empty collection
initialDf2.take(1).headOption.map(_.getString(1)).foreach(println)
The line above runs inside a loop, as part of a doCalculation function that is called over and over with different values.
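One thing I am considering (just a sketch, reusing neo and query from above) is to wrap only the loadDataFrame call and treat a failed or empty result as None, instead of catching everything:

import scala.util.Try

// Sketch: wrap the loadDataFrame call, since that is what throws on an empty result,
// then use headOption so an empty DataFrame is simply skipped.
val maybeDf = Try(neo.cypher(query).loadDataFrame).toOption

maybeDf.foreach { df =>
  df.take(1).headOption.map(_.getString(1)).foreach(println)
}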
I am trying to load a DataFrame from data that, inside Neo4j, looks like this:
-1 "Detected" 1 20161104 3318 37.5049815 43.4171031 20161023 "filename.val" 9.2 "23OCT16" [43.4171031, 37.5049815, 43.4171031, 37.5049815]
So I make the query call to get one of many rows that look like the one above:
try {
  val initialDf2 = neo.cypher(query).loadDataFrame
  val someVal = initialDf2.collectAsList()
  val detectt = someVal.get(0).getString(1) // try to get the second field
  println(detectt)
} catch {
  case e: Exception => e.printStackTrace
}
I do have a try/catch because sometimes the Cypher query returns nothing (I have no idea how else to handle that).
17/09/18 08:44:48 ERROR TaskSetManager: Task 0 in stage 298.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 298.0 failed 1 times, most recent failure: Lost task 0.0 in stage 298.0 (TID 298, localhost, executor driver): java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: java.util.Collections$UnmodifiableRandomAccessList is not a valid external type for schema of string
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, altitude), DoubleType) AS altitude#1678
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, detect_type), StringType), true) AS detect_type#1679
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, gtype), LongType) AS gtype#1680L
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 3, toDateFormatLong), LongType) AS toDateFormatLong#1681L
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 4, change_area), LongType) AS change_area#1682L
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 5, latitude), DoubleType) AS latitude#1683
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 6, longitude), DoubleType) AS longitude#1684
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 7, fromDateFormatLong), LongType) AS fromDateFormatLong#1685L
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 8, iids), StringType), true) AS iids#1686
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 9, detect_strength), DoubleType) AS detect_strength#1687
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 10, fromDate), StringType), true) AS fromDate#1688
if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 11, bbox), StringType), true) AS bbox#1689
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
at org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:573)
The exact query:
call spatial.bbox('geom', {lat:37.5,lon:43.4}, {lat:37.6,lon:43.5}) yield node return node.altitude as altitude, node.detect_type as detect_type, node.gtype as gtype, node.toDateFormatLong as toDateFormatLong, node.change_area as change_area, node.latitude as latitude, node.longitude as longitude, node.fromDateFormatLong as fromDateFormatLong, node.iids as iids, node.detect_strength as detect_strength, node.fromDate as fromDate, node.bbox as bbox ORDER BY node.toDateFormatLong DESC
Example data set:
╒══════════╤═════════════╤═══════╤══════════════════╤═════════════╤══════════╤═══════════╤════════════════════╤═════════════════════════════════════════════════════════════════════╤═════════════════╤══════════╤═════════════════════════════════════════════╕
│"altitude"│"detect_type"│"gtype"│"toDateFormatLong"│"change_area"│"latitude"│"longitude"│"fromDateFormatLong"│"iids" │"detect_strength"│"fromDate"│"bbox" │
╞══════════╪═════════════╪═══════╪══════════════════╪═════════════╪══════════╪═══════════╪════════════════════╪═════════════════════════════════════════════════════════════════════╪═════════════════╪══════════╪═════════════════════════════════════════════╡
│-1 │"Arrival" │1 │20161104 │16981 │37.5608649│43.4297988 │20161023 │"23OCT16S1A89377_09_IW1_09_pp_1231_04NOV16S1A90776_09_123_31_TT_QQQQ”│7.2 │"23OCT16" │[43.4297988,37.5608649,43.4297988,37.5608649]│
├──────────┼─────────────┼───────┼──────────────────┼─────────────┼──────────┼───────────┼────────────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────┼──────────┼─────────────────────────────────────────────┤
│-1 │"Arrival" │1 │20161104 │3123 │37.56749 │43.4807208 │20161023 │"23OCT16S1A89377_09_IW1_09_pp_1231_04NOV16S1A90776_09_124_32_TT_QQQQ"│7.5 │"23OCT16" │[43.4807208,37.56749,43.4807208,37.56749] │
├──────────┼─────────────┼───────┼──────────────────┼─────────────┼──────────┼───────────┼────────────────────┼─────────────────────────────────────────────────────────────────────┼─────────────────┼──────────┼─────────────────────────────────────────────┤
which I call with:
try {
  val initialDf2 = neo.cypher(query).loadDataFrame
  val someVal = initialDf2.collectAsList()
} catch {
  case e: Exception => e.printStackTrace
}
I get this error:
17/09/18 08:44:48 ERROR TaskSetManager: Task 0 in stage 298.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 298.0 failed 1 times, most recent failure: Lost task 0.0 in stage 298.0 (TID 298, localhost, executor driver): java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: java.util.Collections$UnmodifiableRandomAccessList is not a valid external type for schema of string
(followed by the same field-by-field encoder output and ExpressionEncoder stack trace shown above)
Data comes back if I don't include the bbox.
In the neo4j browser, I can run the problem query and the results come back:
-1 "Detected" 1 20161104 3318 37.5049815 43.4171031 20161023 "filename.val" 9.2 "23OCT16" [43.4171031, 37.5049815, 43.4171031, 37.5049815]
It is that nested list. I might have to return something like node.bbox.somevalue1 as bbox1, but I have no idea what the exact syntax would be.
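If Cypher's list index syntax does what I think it does, the elements could also be returned one by one as plain numbers, something like this (bbox0..bbox3 are just placeholder column names I made up):

val query = "call spatial.bbox('geom', {lat:37.5,lon:43.4}, {lat:37.6,lon:43.5}) yield node " +
  "return node.bbox[0] as bbox0, node.bbox[1] as bbox1, " +
  "node.bbox[2] as bbox2, node.bbox[3] as bbox3 " +
  "ORDER BY node.toDateFormatLong DESC"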
I think this is a similar issue to what I was having...
Neo4j spark connector loadDataFrame gives error
and solved by:
https://github.com/neo4j-contrib/neo4j-spark-connector/issues/40
It just seems like the connector wants something more specific than what I am returning.
Array properties are not yet supported by the connector according to the maintainers (see https://neo4j-users.slack.com/archives/C0N7LHVS9/p1534429756000100)
There are two workarounds:
Use UNWIND in the query and then collect it back on the Spark side
Convert the array to a string using a REDUCE operation
I prefer the 2nd approach but I am not sure how well it scales with big data.
So your query would look something like:
RETURN REDUCE(s = HEAD(bbox), n IN TAIL(bbox) | s + ', ' + n) AS bbox,
And the code that processes it:
neo4j.cypher(QUERY).as[String].map(bbox => bbox.split(", "))
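Fleshing that out a little, the round trip might look roughly like this (a sketch only: I dropped the other returned fields for brevity, and the split back into parts on the Spark side is my own addition):

import org.apache.spark.sql.functions.{col, split}

// REDUCE builds a comma-separated string on the Neo4j side, so the connector only ever sees a string column.
val query = "call spatial.bbox('geom', {lat:37.5,lon:43.4}, {lat:37.6,lon:43.5}) yield node " +
  "return node.fromDate as fromDate, " +
  "REDUCE(s = HEAD(node.bbox), n IN TAIL(node.bbox) | s + ', ' + n) as bbox"

val df = neo.cypher(query).loadDataFrame
// Back in Spark: split the string into an array of strings; cast the elements to double later if needed.
val withBbox = df.withColumn("bboxParts", split(col("bbox"), ", "))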
With loadDataFrame you need to declare the DataFrame schema as (fieldName, fieldType) pairs, like this:
val rawGraphnode = neo.cypher("MATCH (n:person) where (n.duration <> 0) RETURN n.user as user, n.other as other, n.direction as direction, n.duration as duration, n.timestamp as timestamp")
  .loadDataFrame(("user", "object"), ("other", "object"), ("direction", "string"), ("duration", "String"), ("timestamp", "String"))
rawGraphnode.printSchema()
rawGraphnode.show(10)
I have several binary files that I need to merge into one Parquet file in Spark. I have tried this, but it doesn't work:
val rdd = spark.sparkContext.binaryFiles("/mypath/*").map { case (filePath, content) =>
  Row("file_name" -> filePath, "content" -> content.toArray())
}
val schema = new StructType().add(StructField("file_name", StringType, true)).add(StructField("content", ArrayType(ByteType), true))
spark.createDataFrame(rdd, schema).show(1)
I got this stack trace:
Caused by: java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: scala.Tuple2 is not a valid external type for schema of string
if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, file_name), StringType), true) AS file_name#55
+- if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, file_name), StringType), true)
:- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt
: :- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object)
: : +- input[0, org.apache.spark.sql.Row, true]
: +- 0
:- null
+- staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, file_name), StringType), true)
+- validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, file_name), StringType)
+- getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, file_name)
+- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object)
+- input[0, org.apache.spark.sql.Row, true]
...
Caused by: java.lang.RuntimeException: scala.Tuple2 is not a valid external type for schema of string
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:276)
... 17 more
Like the error says, you declared the column to be a StringType, but the actual data is a tuple:
Row("file_name" -> filePath, "content" -> content.toArray())
Here you create a Row of two tuples, but your schema is a string column and an array column. Your assumption that you need to create a tuple with the column name is incorrect; you just use the data. Column names come in when you apply the schema.
Use Row(filePath, content.toArray()) instead to match your schema, or alter your schema to accept the tuples.
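A sketch of the corrected version (the output path is hypothetical; I kept the ArrayType(ByteType) schema from the question, though BinaryType is arguably the more natural type for raw bytes):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Plain values in the Row, in the same order as the schema fields.
val rdd = spark.sparkContext.binaryFiles("/mypath/*").map { case (filePath, content) =>
  Row(filePath, content.toArray())
}

val schema = new StructType()
  .add(StructField("file_name", StringType, true))
  .add(StructField("content", ArrayType(ByteType), true))

val df = spark.createDataFrame(rdd, schema)
df.write.parquet("/mypath/merged.parquet") // hypothetical output path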
I'm pretty new to Spark and am having a problem converting an RDD to a DataFrame. What I'm trying to do is take a log file, convert it to JSON using an existing jar (returns a string), and then make that resulting json into a dataframe. Here is what I have so far:
val serverLog = sc.textFile("/Users/Downloads/file1.log")
val jsonRows = serverLog.mapPartitions(partition => {
  val txfm = new JsonParser // *jar to parse logs to json*
  partition.map(line => {
    Row(txfm.parseLine(line))
  })
})
When I run a take(2) on this I get something like:
[{"pwh":"600","sVe":"10.0","psh":"667","udt":"mobile"}]
[{"pwh":"800","sVe":"10.0","psh":"1000","udt":"desktop"}]
My problem comes here. I create a schema and try to create the DataFrame:
val schema = StructType(Array(
  StructField("pwh", StringType, true),
  StructField("sVe", StringType, true), ...))
val jsonDf = sqlSession.createDataFrame(jsonRows, schema)
And the returned error is
java.lang.RuntimeException: Error while encoding: java.lang.ArrayIndexOutOfBoundsException: 1
if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, pwh), StringType), true) AS _pwh#0
+- if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, pwh), StringType), true)
:- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt
: :- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object)
: : +- input[0, org.apache.spark.sql.Row, true]
: +- 0
:- null
Can someone tell me what I'm doing wrong here? Most of the SO answers I've found say I can use either createDataFrame or toDF(), but I've had no luck with either. I also tried converting the RDD to a JavaRDD, but that also did not work. Appreciate any insight you can give.
Your defined schema is for an RDD like:
{"pwh":"600","sVe":"10.0","psh":"667","udt":"mobile"}
{"pwh":"800","sVe":"10.0","psh":"1000","udt":"desktop"}
If you can change your RDD to make the data look like:
{"logs": [{"pwh":"600","sVe":"10.0","psh":"667","udt":"mobile"}]}
and use this schema:
val schema = StructType(Seq(
  StructField("logs", ArrayType(StructType(Seq(
    StructField("pwh", StringType, true),
    StructField("sVe", StringType, true), ...))
  ))
))
sqlContext.read.schema(schema).json(jsonRows)
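Alternatively, if parseLine really returns one JSON string per record (the surrounding [ ] in your take(2) output may just be Row.toString), a sketch like this keeps the parser output as an RDD[String] and lets Spark do the JSON parsing itself; read.json on an RDD[String] is deprecated in Spark 2.x but still works:

import org.apache.spark.sql.types._

// Map to JSON strings instead of Rows, then apply the schema via read.json.
val jsonStrings = serverLog.mapPartitions { partition =>
  val txfm = new JsonParser
  partition.map(line => txfm.parseLine(line))
}

val flatSchema = StructType(Seq(
  StructField("pwh", StringType, true),
  StructField("sVe", StringType, true),
  StructField("psh", StringType, true),
  StructField("udt", StringType, true)
))

val jsonDf = sqlSession.read.schema(flatSchema).json(jsonStrings)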
For the following code, in which a DataFrame is converted to RDD[Row] and data for a new column is appended via mapPartitions:
// df is a DataFrame
val dfRdd = df.rdd.mapPartitions {
  val bfMap = df.rdd.sparkContext.broadcast(factorsMap)
  iter =>
    val locMap = bfMap.value
    iter.map { r =>
      val newseq = r.toSeq :+ locMap(r.getAs[String](inColName))
      Row(newseq)
    }
}
The output is correct for the RDD[Row] with another column:
println("**dfrdd\n" + dfRdd.take(5).mkString("\n"))
**dfrdd
[ArrayBuffer(0021BEC286CC, 4, Series, series, bc514da3e0d534da8207e3aab231d1cb, livetv, 148818)]
[ArrayBuffer(0021BEE7C556, 4, Series, series, bc514da3e0d534da8207e3aab231d1cb, livetv, 26908)]
[ArrayBuffer(8C7F3BFD4B82, 4, Series, series, bc514da3e0d534da8207e3aab231d1cb, livetv, 99942)]
[ArrayBuffer(0021BEC8F8B8, 1, Series, series, 0d2debc63efa3790a444c7959249712b, livetv, 53994)]
[ArrayBuffer(10EA59F10C8B, 1, Series, series, 0d2debc63efa3790a444c7959249712b, livetv, 1427)]
Let us try to convert the RDD[Row] back to a DataFrame:
val newSchema = df.schema.add(StructField("userf",IntegerType))
Now let us create the updated DataFrame:
val df2 = df.sqlContext.createDataFrame(dfRdd,newSchema)
Is the new schema looking correct?
newSchema.printTreeString()
root
|-- user: string (nullable = true)
|-- score: long (nullable = true)
|-- programType: string (nullable = true)
|-- source: string (nullable = true)
|-- item: string (nullable = true)
|-- playType: string (nullable = true)
|-- userf: integer (nullable = true)
Notice that we do see the new userf column.
However, it does not work:
println("df2: " + df2.take(1))
Job aborted due to stage failure: Task 0 in stage 9.0 failed 1 times,
most recent failure: Lost task 0.0 in stage 9.0 (TID 9, localhost, executor driver): java.lang.RuntimeException: Error while encoding:
java.lang.RuntimeException: scala.collection.mutable.ArrayBuffer is not a
valid external type for schema of string
if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, user), StringType), true) AS user#28
+- if (assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object), 0, user), StringType), true)
:- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object).isNullAt
: :- assertnotnull(input[0, org.apache.spark.sql.Row, true], top level row object)
: : +- input[0, org.apache.spark.sql.Row, true]
: +- 0
:- null
So: what detail is missing here?
Note: I am not interested in different approaches, e.g. withColumn or Datasets. Let us please consider only this approach:
convert to RDD
add new data element to each row
update the schema for the new column
convert the new RDD+schema back to DataFrame
There seems to be a small mistake in calling Row's constructor:
val newseq = r.toSeq :+ locMap(r.getAs[String](inColName))
Row(newseq)
The signature of this "constructor" (apply method, actually) is:
def apply(values: Any*): Row
When you pass a Seq[Any], it is treated as a single value of type Seq[Any]. You want to pass the elements of this sequence, so you should use:
val newseq = r.toSeq :+ locMap(r.getAs[String](inColName))
Row(newseq: _*)
Once this is fixed, the Rows will match the schema you built, and you'll get the expected result.
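For completeness, a sketch of the whole round trip with that one change applied (I moved the broadcast outside mapPartitions as a matter of taste, and I am assuming, as the IntegerType column implies, that factorsMap holds Int values):

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StructField}

val bfMap = df.rdd.sparkContext.broadcast(factorsMap)
val dfRdd = df.rdd.mapPartitions { iter =>
  val locMap = bfMap.value
  iter.map { r =>
    val newseq = r.toSeq :+ locMap(r.getAs[String](inColName))
    Row(newseq: _*) // expand the Seq into individual Row values
  }
}
val df2 = df.sqlContext.createDataFrame(dfRdd, df.schema.add(StructField("userf", IntegerType)))
df2.show(5)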