How to map an RDD of type org.apache.spark.rdd.RDD[Array[String]]?

I am new to Spark and Scala.
I have an RDD that is of type org.apache.spark.rdd.RDD[Array[String]].
Here is a listing from myRdd.take(3).
Array(Array(1, 2524474, CBSGPRS, 1, 2015-09-09 10:42:03, 0, 47880, 302001131103734, NAT, "", 502161081073570, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 475078439, 41131;0;0, "", 102651;0;0, 3|3), Array(2, 2524516, CBSGPRS, 1, 2015-09-09 23:42:14, 0, 1260, 302001131104272, NAT, "", 502161081074085, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 2044745984, 3652;0;0, "", 8636;0;0, 3|3), Array(3, 2524545, CBSGPRS, 1, 2015-09-09 14:56:55, 0, 32886, 302001131101629, NAT, "", 502161081071599, "", BLANK, UNK, "", "", "", MV_PVC, BLANK, 1, "", 0, 1956194307, 14164657;0;0, "", 18231194;0;0, 3|3))
I am trying to map it as follows:
var gprsMap = frows.collect().map{ tuple =>
// bind variables to the tuple
var (recKey, origRecKey, recTypeId, durSpanId, timestamp, prevConvDur, convDur,
msisdn, callType, aPtyCellId, aPtyImsi, aPtyMsrn, bPtyNbr, bPtyNbrTypeId,
bPtyCellId, bPtyImsi, bPtyMsrn, inTrgId, outTrgId, callStatusId, suppSvcId, provChgAmt,
genFld1, genFld2, genFld3, genFld4, genFld5) = tuple
var dtm = timestamp.split(" ");
var idx = timestamp indexOf ' '
var dt = timestamp slice(0, idx)
var tm = timestamp slice(idx + 1, timestamp.length)
// return the results tuple
((dtm(0), msisdn, callType, recTypeId, provChgAmt), (convDur))
}
I keep getting this error:
error: object Tuple27 is not a member of package scala.
I am not sure what the error is. Can someone help?

The problem is that Scala only supports tuples with up to 22 fields (this limit applies to Scala 2), which is why the compiler cannot find Tuple27. Additionally, the elements of your frows: RDD[Array[String]] are of type Array[String], so the tuple variable in your map function is an Array[String] as well. An array cannot be destructured with a tuple pattern.
What you can do instead is access the elements of the array directly by index.
val recKey = tuple(0)
val timestamp = tuple(4)
...

How to access a nested structure using PyADS

I have two structures defined in TwinCAT3 like so:
TYPE ADSUWDisplay :
STRUCT
sYarn : STRING;
sNextYarn : STRING;
lLength : REAL;
iPosition : INT;
sYarnSpec : STRING;
iPackageCount : INT;
iPackageTotalCount : INT;
iCount : INT;
iTotalCount : INT;
sUpcomingYarn : STRING;
sUpcomingYarnSpec : STRING;
sUWMessage : STRING;
sThreadUp : STRING;
END_STRUCT
END_TYPE
TYPE ADSRemoteDisplay :
STRUCT
iUW : INT;
iCount : INT;
sState : STRING;
iStateNo : INT;
sRobotMessage : STRING;
adsUWDisplay : ARRAY[0..5] OF ADSUWDisplay;
END_STRUCT
END_TYPE
As you can see, ADSUWDisplay is nested inside ADSRemoteDisplay.
How can I access ADSUWDisplay using PyADS?
More specifically, how do I declare my Structure defs for use in pyads.connection.read_structure_by_name()?
Please help!
Thanks
You can read out a struct as follows, from the pyads docs:
Use the parameter structure_def to define the structure and array_size to define the size of the array.
>>> structure_def = (
("i", pyads.PLCTYPE_INT, 1),
("s", pyads.PLCTYPE_STRING, 1)
)
>>> symbol = plc.get_symbol("MyStructure", structure_def=structure_def, array_size=2)
>>> symbol.write([{"i": 1, "s": "foo"}, {"i": 2, "s": "bar"}])
>>> symbol.read()
[{"i": 1, "s": "foo"}, {"i": 2, "s": "bar"}]
I am not sure how to define a nested struct. As an untested guess, you could try adding a nested tuple to structure_def, e.g.:
structure_def = (
("i", pyads.PLCTYPE_INT, 1),
(
("s1", pyads.PLCTYPE_STRING, 1),
("s2", pyads.PLCTYPE_STRING, 1)
)
)
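Whether pyads accepts a nested tuple like the one above is not confirmed here. One possible workaround, sketched below in plain Python, is to flatten the nested definition into the flat (name, type, size) form that structure_def is documented to use. The flatten helper and the string type tags are hypothetical and only illustrate the idea; with pyads you would substitute constants such as pyads.PLCTYPE_INT and pyads.PLCTYPE_STRING. The sketch also assumes that the PLC stores the nested fields contiguously with no padding, which you would need to check for your project.

```python
# Placeholder type tags (assumption): swap in the pyads PLCTYPE_* constants.
INT, REAL, STRING = "INT", "REAL", "STRING"


def flatten(structure_def, prefix=""):
    """Expand nested (name, sub_def, count) entries into flat (name, type, size) entries."""
    flat = []
    for name, plc_type, size in structure_def:
        if isinstance(plc_type, tuple):
            # Nested struct: repeat its fields once per array element.
            for index in range(size):
                flat.extend(flatten(plc_type, f"{prefix}{name}[{index}]."))
        else:
            flat.append((prefix + name, plc_type, size))
    return tuple(flat)


# Mirrors ADSUWDisplay from the question.
uw_display = (
    ("sYarn", STRING, 1),
    ("sNextYarn", STRING, 1),
    ("lLength", REAL, 1),
    ("iPosition", INT, 1),
    ("sYarnSpec", STRING, 1),
    ("iPackageCount", INT, 1),
    ("iPackageTotalCount", INT, 1),
    ("iCount", INT, 1),
    ("iTotalCount", INT, 1),
    ("sUpcomingYarn", STRING, 1),
    ("sUpcomingYarnSpec", STRING, 1),
    ("sUWMessage", STRING, 1),
    ("sThreadUp", STRING, 1),
)

# Mirrors ADSRemoteDisplay; ARRAY[0..5] holds 6 nested structs.
remote_display = (
    ("iUW", INT, 1),
    ("iCount", INT, 1),
    ("sState", STRING, 1),
    ("iStateNo", INT, 1),
    ("sRobotMessage", STRING, 1),
    ("adsUWDisplay", uw_display, 6),
)

flat_def = flatten(remote_display)
print(len(flat_def))  # 5 top-level fields + 6 * 13 nested fields = 83
```

The flattened names (for example adsUWDisplay[0].sYarn) are only dictionary keys for the values read back; they do not have to match the PLC symbol names.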

Is there any catchall in Swift?

I have varying input that will be a String like:
"James"
"James(1)"
"James(2)"
I'm looking to insert the values for these keys into a dictionary, but only after I remove the (x) suffix.
What can I use to represent x so that I can remove the entire parenthesized part?
My current attempt looks like this:
replacingOccurrences(of: "(\((Any).self)", with: "")
But that's not working.
Desired output would be a dictionary only containing the one unique value as the key.
So with the input of
["James" : 3, "James(1)" : 3, "James(2)" : 4, "Sarah" : 10, "Sarah(1)" : 10, "Sarah(2)" : 10 ]
I would get this output
["James" : 10, "Sarah" : 30]
I like George's solution, but I think this is a nice use case for uniquingKeysWith:.
// Convert to [(name, value)], removing everything after "("
let keyValues = data.map { k, v in (String(k.prefix(while: { $0 != "(" })), v) }
// Sum matching values
let result = Dictionary(keyValues, uniquingKeysWith: +)
The easiest way would be to use reduce(into:_:).
It works as follows:
1. Start with an empty dictionary as the accumulator of the reduce.
2. Get the index of the first opening parenthesis (nil if there is none).
3. Take the name up to, but not including, the (. If there is no (, take the whole name.
4. Add the value of the current pair to the entry for that name. If the name is not in the dictionary yet, a default value of 0 is used.
Code:
let input = ["James": 3, "James(1)": 3, "James(2)": 4, "Sarah": 10, "Sarah(1)": 10, "Sarah(2)": 10]
let output: [String: Int] = input.reduce(into: [:]) { partialResult, pair in
let firstBracket = pair.key.firstIndex(of: "(")
let basicName = String(pair.key.prefix(upTo: firstBracket ?? pair.key.endIndex))
partialResult[basicName, default: 0] += pair.value
}
print(output)
// Prints: ["James": 10, "Sarah": 30]

How to return the full Object inserted in mongo using post API using ReactiveMongo

I am learning Scala in order to create a REST API service for my project, using the Play framework and ReactiveMongo.
I am trying to insert an object into the database, and as the response to the insert I want to get back the created object, including the Option fields that I populate in the repository (for example, the id is generated directly by Mongo).
The model I use is shown below. The marker_id, _creationDate and _updateDate fields are set in the repository.
case class Point_LocationModel(
id: Option[BSONObjectID],
marker_id: Option[String],
step: String,
backgroundTargetName: String,
firstPoint: Option[PointModel],
secondPoint: Option[PointModel],
x: String,
y: String,
z: String,
rotation: String,
note: String,
tag: String,
work_session_id: String,
_creationDate: Option[DateTime],
_updateDate: Option[DateTime] )
This is the repository method I use to insert the point and return the response data, but marker_id, id, _creationDate and _updateDate are not set in the response data.
def create(point: Point_LocationModel)(implicit
writer: BSONDocumentWriter[Point_LocationModel]
) = {
logger.info(s"output pointInserted point: ${point}")
val pointInserted = collection.flatMap(
_.insert(ordered = false)
.one(
point.copy(
marker_id = Some(BSONObjectID.generate().stringify),
_creationDate = Some(new DateTime()),
_updateDate = Some(new DateTime())
)
)
.map((_, point))
)
val test1 = Await.result(pointInserted, Duration.Inf)
logger.info(s"output pointInserted: ${test1}")
pointInserted
}
Below is the controller. As a response I get the model only with the attributes which I pass to the body of the post call and not the attributes generated from the repository.
def insertPointData(): Action[JsValue] =
Action.async(controllerComponents.parsers.json) { implicit request =>
{
request.body
.validate[Point_LocationModel]
.fold(
_ => Future.successful(BadRequest("Cannot parse request body")),
point => {
point_LocationRepository.create(point).map { data =>
Created(Json.toJson(data._2))
}
}
)
}
}
This is the final response I get from the controller's insert method after the full object has been successfully inserted into the database. The marker_id, id, _creationDate and _updateDate fields are missing:
{
"step": "string11",
"backgroundTargetName": "string11",
"firstPoint": {
"hitPnt": [],
"normalX": "string1",
"normalY": "string1",
"normalZ": "string1"
},
"secondPoint": {
"hitPnt": [],
"normalX": "string1",
"normalY": "string1",
"normalZ": "string1"
},
"x": "1",
"y": "2",
"z": "3",
"rotation": "4",
"note": "test3",
"tag": "test3",
"work_session_id": "TEST"
}
How can I get the full object that was inserted into the database?

How to create a Hash in Neo4j from given string?

I run the following:
CREATE (p:person { FirstName:"M", LastName: "H" , Language: "en", hash: apoc.util.sha512('1234567','salt')});
Error:
Function call does not provide the required number of arguments: expected 1 got 2.
Function apoc.util.sha512 has signature: apoc.util.sha512(values :: LIST? OF ANY?) :: STRING?
meaning that it expects 1 argument of type LIST? OF ANY?
Description: apoc.util.sha512([values]) | computes the sha512 of the concatenation of all string values of the list (line 2, column 73 (offset: 73))
"CREATE (p:person { FirstName:"M", LastName: "H" , Language: "en", hash: apoc.util.sha512('1234567','salt')});"
How do I pass an argument of type LIST?
Thanks, Rob
Got it: I used [] to pass the values as a list.
CREATE (p:person { FirstName:"M", LastName: "H" , Language: "de", hash: apoc.util.sha512(["1234567","salt"])});
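The description in the error message says the function hashes the concatenation of all string values in the list. As a rough illustration of that idea outside Neo4j, here is a minimal Python sketch. The sha512_of_values helper is hypothetical, and the UTF-8 encoding and hex output are assumptions, so the result is not guaranteed to match what APOC returns.

```python
import hashlib


def sha512_of_values(values):
    """Hash the concatenation of the string forms of all values, hex-encoded."""
    joined = "".join(str(value) for value in values)
    return hashlib.sha512(joined.encode("utf-8")).hexdigest()


digest = sha512_of_values(["1234567", "salt"])
print(len(digest))  # a 512-bit digest is 128 hex characters
```

Because the values are simply concatenated, ["1234567", "salt"] and ["1234567salt"] produce the same digest in this sketch.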

How to create a struct from two arrays?

I have a SearchViewController, and I created a "Data" class to hold the entries it loads.
//File.swift
class Data {
var add1: String
var add2: String
init(add1: String, add2: String) {
self.add1 = add1
self.add2 = add2
}
}
//ViewController.swift
//This is the part that I want to load from two arrays of strings.
var entries = [Data]()
entries = [
Data(add1: "Canada", add2: "Ottawa"),
Data(add1: "USA", add2: "Washington"),
Data(add1: "Japan", add2: "Tokyo"),
Data(add1: "Denmark", add2: "Copenhagen")
]
The data list for my search controller goes on like this; I'll have approximately 1000 values. Therefore, I want to build these entries from two arrays like the ones shown below.
let arr1: [String] = ["Canada", "USA", "Japan", "Denmark"]
let arr2: [String] = ["Ottawa", "Washington", "Tokyo", "Copenhagen"]
You can use the zip function here to combine the two arrays into a sequence of (String, String) tuples, and then map that sequence into an array of Data:
let arr1 = ["Canada", "USA", "Japan", "Denmark"]
let arr2 = ["Ottawa", "Washington", "Tokyo", "Copenhagen"]
let data = zip(arr1, arr2).map { Data(add1: $0.0, add2: $0.1) }
for d in data { print("\(d.add1) - \(d.add2)") }
Prints:
Canada - Ottawa
USA - Washington
Japan - Tokyo
Denmark - Copenhagen
I also suggest that you change the type Data to be a struct instead of a class, unless you have a good reason for it to be a class.
Alternatively, iterate over the indices in a loop and append each new value to the entries array (this assumes both arrays have the same length):
for i in 0..<arr1.count {
// create struct from array data
let data = Data(add1: arr1[i], add2: arr2[i])
// insert to your array
entries.append(data)
}