Setting geofencing from coordinate 41.7523,12.8629 - scala

I can't find Scala code to generate the list of coordinates that fall within a 1-mile distance of a particular coordinate, 41.7523,12.8629.
How can we do geofencing for the above coordinate (Spark/Scala)?

Using this function we can get one nearby point, chosen uniformly at random, within a one-mile radius:
def getPoints(lat: Double, lon: Double, radiusInMiles: Int): (Double, Double) = {
  val ran = new scala.util.Random()
  val metersPerMile = 1609.344
  // there are about 111,300 meters in one degree of latitude
  val radiusInDeg = radiusInMiles * metersPerMile / 111300.0
  val u = ran.nextDouble()
  val v = ran.nextDouble()
  // sqrt(u) makes the points uniform over the disk's area instead of clustering at the center
  val w = radiusInDeg * math.sqrt(u)
  val t = 2 * math.Pi * v
  val latOffset = w * math.cos(t)
  // a degree of longitude shrinks with latitude, so stretch the east-west offset accordingly
  val lonOffset = w * math.sin(t) / math.cos(math.toRadians(lat))
  (lat + latOffset, lon + lonOffset)
}
By calling the above function repeatedly, we can get the desired number of random points within 1 mile. For example, to get 30 nearby points at random in the Scala REPL:
scala> for(i<-1 to 30) yield getPoints(41.7523,12.8629,1)
res25: scala.collection.immutable.IndexedSeq[(Double, Double)] = Vector((41.74982541032955,12.86481315224305), (41.754168266959056,12.870364411375881), (41.75544877222746,12.85451037482713), (41.7612335738966,12.856539452781801), (41.76358834447362,12.861061408964183), (41.763040037484664,12.867369860689339), (41.75110057115767,12.873299266989251), (41.74658541773817,12.865223104625423), (41.74925109768552,12.868277572490877), (41.76504777008776,12.86109583406441), (41.75732730141462,12.87225307703036), (41.75762735062798,12.860633801016085), (41.75003276741254,12.856383089176347), (41.760707286583,12.853412685125267), (41.748073299368386,12.858209316913472), (41.76018412949083,12.866118423321987), (41.74213603200559,12.87308644848186), (41.761324688000265,12.86506896052553), (41.749976...
scala> res25.foreach(println)
(41.74982541032955,12.86481315224305)
(41.754168266959056,12.870364411375881)
(41.75544877222746,12.85451037482713)
(41.7612335738966,12.856539452781801)
(41.76358834447362,12.861061408964183)
(41.763040037484664,12.867369860689339)
(41.75110057115767,12.873299266989251)
(41.74658541773817,12.865223104625423)
(41.74925109768552,12.868277572490877)
(41.76504777008776,12.86109583406441)
(41.75732730141462,12.87225307703036)
(41.75762735062798,12.860633801016085)
(41.75003276741254,12.856383089176347)
(41.760707286583,12.853412685125267)
(41.748073299368386,12.858209316913472)
(41.76018412949083,12.866118423321987)
(41.74213603200559,12.87308644848186)
(41.761324688000265,12.86506896052553)
(41.74997668526327,12.86038167090363)
(41.75228048449065,12.872686927175733)
(41.75972428137232,12.859596070561539)
(41.7562836928502,12.86187286720154)
(41.75715996439461,12.861374766455278)
(41.760604332388,12.867977103427238)
(41.74018421174905,12.865172431590485)
(41.74059829855585,12.86438943748021)
(41.7593627526156,12.873744103200057)
(41.747241804657264,12.8542871167178)
(41.76014663643563,12.858456116302811)
(41.740826160697715,12.867433800624394)
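Note that the above only generates points; the complementary geofencing check, keeping only the coordinates that actually lie within 1 mile of the center, can be done by filtering on the great-circle (haversine) distance. A minimal sketch (haversineMiles is my own helper, not a library function):
// great-circle distance in miles between two (lat, lon) points, via the haversine formula
def haversineMiles(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
  val earthRadiusMiles = 3958.8
  val dLat = math.toRadians(lat2 - lat1)
  val dLon = math.toRadians(lon2 - lon1)
  val a = math.pow(math.sin(dLat / 2), 2) +
    math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) * math.pow(math.sin(dLon / 2), 2)
  2 * earthRadiusMiles * math.asin(math.sqrt(a))
}
// generate candidates in a 2-mile disk, then keep only those inside the 1-mile fence
val center = (41.7523, 12.8629)
val candidates = for (_ <- 1 to 100) yield getPoints(center._1, center._2, 2)
val insideFence = candidates.filter { case (lat, lon) =>
  haversineMiles(center._1, center._2, lat, lon) <= 1.0
}
In Spark, the same predicate works unchanged inside rdd.filter or a Dataset filter.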

Related

How to set type of dataset when applying transformations and how to implement transformations without using spark.sql.functions._?

I am a beginner in Scala and have been working on the following problem:
An example dataset named given_dataset, with player number and points scored:
| player_no | points |
| 1         | 25.0   |
| 1         | 20.0   |
| 1         | 21.0   |
| 2         | 15.0   |
| 2         | 18.0   |
| 3         | 24.0   |
| 3         | 25.0   |
| 3         | 29.0   |
Problem 1:
I have a dataset and need to calculate the total points scored, the average points per game, and the number of games played. I am unable to explicitly set the data type to "double", "int", or "float" when I apply the transformations (perhaps because they are untyped transformations?). Would anyone be able to help with this and correct my error?
No data type specified (but code is able to run)
val total_points_dataset = given_dataset.groupBy($"player_no").sum("points").orderBy("player_no")
val games_played_dataset = given_dataset.groupBy($"player_no").count().orderBy("player_no")
val avg_points_dataset = given_dataset.groupBy($"player_no").avg("points").orderBy("player_no")
Please note that I would like to retain the player number as I plan to merge total_points_dataset, games_played_dataset, and avg_points_dataset together.
Data type specified, but code crashes!
val total_points_dataset = given_dataset.groupBy($"player_no").sum("points").as[Double].orderBy("player_no")
val games_played_dataset = given_dataset.groupBy($"player_no").count().as[Int].orderBy("player_no")
val avg_points_dataset = given_dataset.groupBy($"player_no").avg("points").as[Double].orderBy("player_no")
Problem 2:
I would like to implement the above without using the library spark.sql.functions, e.g. through functions such as map, groupByKey etc. If possible, could anyone provide an example for this and point me in the right direction?
If you don't want to use import org.apache.spark.sql.types.{FloatType, IntegerType, StructType}, then you have to cast either at read time or with as[(Int, Double)] on the dataset. Below is an example that parses your dataset while reading it from a CSV file:
/** A function that splits a line of input into (player_no, points) tuples. */
def parseLine(line: String): (Int, Float) = {
  // Split by commas
  val fields = line.split(",")
  // Extract the player_no and points fields, and convert to integer & float
  val player_no = fields(0).toInt
  val points = fields(1).toFloat
  // Create a tuple that is our result
  (player_no, points)
}
And then read as below:
val sc = new SparkContext("local[*]", "StackOverflow75354293")
val lines = sc.textFile("data/stackoverflowdata-noheader.csv")
val dataset = lines.map(parseLine)
// total points per player
val total_points_dataset2 = dataset.reduceByKey((x, y) => x + y)
val total_points_dataset2_sorted = total_points_dataset2.sortByKey(ascending = true)
total_points_dataset2_sorted.foreach(println)
// games played per player
val games_played_dataset2 = dataset.countByKey().toList.sorted
games_played_dataset2.foreach(println)
// average points per player: carry (sum, count) pairs and divide at the end
val avg_points_dataset2 =
  dataset
    .mapValues(x => (x, 1))
    .reduceByKey((x, y) => (x._1 + y._1, x._2 + y._2))
    .mapValues(x => x._1 / x._2)
    .sortByKey(ascending = true)
avg_points_dataset2.collect().foreach(println)
I tried running both approaches locally and both work fine; we can check the output below (totals first, then game counts, then averages):
(3,78.0)
(1,66.0)
(2,33.0)
(1,3)
(2,2)
(3,3)
(1,22.0)
(2,16.5)
(3,26.0)
Regarding "Problem 1": the aggregated result has two columns (player_no plus the aggregate), so a single-column encoder such as as[Double] cannot apply; use a tuple encoder that matches both columns:
val total_points_dataset = given_dataset.groupBy($"player_no").sum("points").as[(Int, Double)].orderBy("player_no")
val games_played_dataset = given_dataset.groupBy($"player_no").count().as[(Int, Long)].orderBy("player_no")
val avg_points_dataset = given_dataset.groupBy($"player_no").avg("points").as[(Int, Double)].orderBy("player_no")
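For Problem 2 while staying on the Dataset API (no spark.sql.functions and no drop to RDDs), groupByKey with mapGroups covers the same aggregations. This is only a sketch; it assumes given_dataset can be viewed as a Dataset[(Int, Double)] of (player_no, points) and that spark.implicits._ is in scope:
val typed = given_dataset.as[(Int, Double)] // (player_no, points)
val stats = typed
  .groupByKey { case (player, _) => player }
  .mapGroups { (player, rows) =>
    val points = rows.map(_._2).toList
    val total = points.sum
    val games = points.size
    (player, total, games, total / games) // (player_no, total, games, average)
  }
  .orderBy("_1") // tuple columns are named _1, _2, _3, _4
This keeps player_no in the result, so no join of three separate datasets is needed.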

How to program a circle fit in scala

I want to fit a circle to given 2D points in Scala.
Apache Commons Math has an example for this in Java, which I am trying to translate to Scala (without success, because my knowledge of Java is almost non-existent).
I took the example code from "http://commons.apache.org/proper/commons-math/userguide/leastsquares.html" (see end of page), which I tried to translate into Scala:
import org.apache.commons.math3.linear._
import org.apache.commons.math3.fitting._
import org.apache.commons.math3.fitting.leastsquares._
import org.apache.commons.math3.fitting.leastsquares.LeastSquaresOptimizer._
import org.apache.commons.math3._
import org.apache.commons.math3.geometry.euclidean.twod.Vector2D
import org.apache.commons.math3.util.Pair
import org.apache.commons.math3.fitting.leastsquares.LeastSquaresOptimizer.Optimum
def circleFitting: Unit = {
  val radius: Double = 70.0
  val observedPoints = Array(new Vector2D(30.0D, 68.0D), new Vector2D(50.0D, -6.0D), new Vector2D(110.0D, -20.0D), new Vector2D(35.0D, 15.0D), new Vector2D(45.0D, 97.0D))
  // the model function components are the distances to the current estimated center,
  // they should be as close as possible to the specified radius
  val distancesToCurrentCenter = new MultivariateJacobianFunction() {
    //def value(point: RealVector): (RealVector, RealMatrix) = {
    def value(point: RealVector): Pair[RealVector, RealMatrix] = {
      val center = new Vector2D(point.getEntry(0), point.getEntry(1))
      val value: RealVector = new ArrayRealVector(observedPoints.length)
      val jacobian: RealMatrix = new Array2DRowRealMatrix(observedPoints.length, 2)
      for (i <- 0 to observedPoints.length) {
        var o = observedPoints(i)
        var modelI: Double = Vector2D.distance(o, center)
        value.setEntry(i, modelI)
        // derivative with respect to p0 = x center
        jacobian.setEntry(i, 0, (center.getX() - o.getX()) / modelI)
        // derivative with respect to p1 = y center
        jacobian.setEntry(i, 1, (center.getX() - o.getX()) / modelI)
      }
      new Pair(value, jacobian)
    }
  }
  // the target is to have all points at the specified radius from the center
  val prescribedDistances = Array.fill[Double](observedPoints.length)(radius)
  // least squares problem to solve: modeled radius should be close to target radius
  val problem: LeastSquaresProblem = new LeastSquaresBuilder().start(Array(100.0D, 50.0D)).model(distancesToCurrentCenter).target(prescribedDistances).maxEvaluations(1000).maxIterations(1000).build()
  val optimum: Optimum = new LevenbergMarquardtOptimizer().optimize(problem) //LeastSquaresOptimizer.Optimum
  val fittedCenter: Vector2D = new Vector2D(optimum.getPoint().getEntry(0), optimum.getPoint().getEntry(1))
  println("circle fitting was called!")
  println("CIRCLEFITTING: fitted center: " + fittedCenter.getX() + " " + fittedCenter.getY())
  println("CIRCLEFITTING: RMS: " + optimum.getRMS())
  println("CIRCLEFITTING: evaluations: " + optimum.getEvaluations())
  println("CIRCLEFITTING: iterations: " + optimum.getIterations())
}
This gives no compile errors, but crashes with:
Exception in thread "main" java.lang.NullPointerException
at org.apache.commons.math3.linear.EigenDecomposition.<init>(EigenDecomposition.java:119)
at org.apache.commons.math3.fitting.leastsquares.LeastSquaresFactory.squareRoot(LeastSquaresFactory.java:245)
at org.apache.commons.math3.fitting.leastsquares.LeastSquaresFactory.weightMatrix(LeastSquaresFactory.java:155)
at org.apache.commons.math3.fitting.leastsquares.LeastSquaresFactory.create(LeastSquaresFactory.java:95)
at org.apache.commons.math3.fitting.leastsquares.LeastSquaresBuilder.build(LeastSquaresBuilder.java:59)
at twoDhotScan.FittingFunctions$.circleFitting(FittingFunctions.scala:49)
at twoDhotScan.Main$.delayedEndpoint$twoDhotScan$Main$1(hotScan.scala:14)
at twoDhotScan.Main$delayedInit$body.apply(hotScan.scala:11)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at twoDhotScan.Main$.main(hotScan.scala:11)
at twoDhotScan.Main.main(hotScan.scala)
I guess the problem is somewhere in the definition of the function distancesToCurrentCenter. I don't even know if this MultivariateJacobianFunction is supposed to be a real function or an object or whatever.
After some long fiddling with the code, I got it running.
The NullPointerException was gone after I updated apache-commons-math3 from version 3.3 to version 3.6.1 in my build.sbt file. I don't know if I forgot a parameter or if it was a bug. There were also two bugs in the example on the apache-commons-math website: it used a .getX operator twice where a .getY should have been.
So here is a running example for a circle fit with known radius:
import org.apache.commons.math3.analysis.{ MultivariateVectorFunction, MultivariateMatrixFunction }
import org.apache.commons.math3.fitting.leastsquares.LeastSquaresOptimizer.Optimum
import org.apache.commons.math3.fitting.leastsquares.{ MultivariateJacobianFunction, LeastSquaresProblem, LeastSquaresBuilder, LevenbergMarquardtOptimizer }
import org.apache.commons.math3.geometry.euclidean.twod.Vector2D
import org.apache.commons.math3.linear.{ Array2DRowRealMatrix, RealMatrix, RealVector, ArrayRealVector }
object Main extends App {
  val radius: Double = 20.0
  /*******************************************************************************
   ***** Random values on a circle with centerX=15, centerY=-9 and radius 20 *****
   *******************************************************************************/
  val pointsList: List[(Double, Double)] = List(
    (18.36921795, 10.71416674),
    (0.21196357, -22.46528791),
    (-4.153845171, -14.75588526),
    (3.784114125, -25.55910336),
    (31.32998899, 2.546924253),
    (34.61542186, -12.90323269),
    (19.30193011, -28.53185596),
    (16.05620863, 10.97209111),
    (31.67011956, -20.05020878),
    (19.91175561, -28.38748712))
  val observedPoints: Array[Vector2D] = (pointsList map { case (x, y) => new Vector2D(x, y) }).toArray
  // the model: distances of the observed points to the current center estimate
  val vectorFunktion: MultivariateVectorFunction = new MultivariateVectorFunction {
    def value(variables: Array[Double]): Array[Double] = {
      val center = new Vector2D(variables(0), variables(1))
      observedPoints map { p: Vector2D => Vector2D.distance(p, center) }
    }
  }
  // the Jacobian: partial derivatives of each distance with respect to centerX and centerY
  val matrixFunction = new MultivariateMatrixFunction {
    def value(variables: Array[Double]): Array[Array[Double]] = {
      val center = new Vector2D(variables(0), variables(1))
      observedPoints map { p: Vector2D => Array((center.getX - p.getX) / Vector2D.distance(p, center), (center.getY - p.getY) / Vector2D.distance(p, center)) }
    }
  }
  // the target is to have all points at the specified radius from the center
  val prescribedDistances = Array.fill[Double](observedPoints.length)(radius)
  // least squares problem to solve: modeled radius should be close to target radius
  val problem = new LeastSquaresBuilder().start(Array(100.0D, 50.0D)).model(vectorFunktion, matrixFunction).target(prescribedDistances).maxEvaluations(25).maxIterations(25).build
  val optimum: Optimum = new LevenbergMarquardtOptimizer().optimize(problem)
  val fittedCenter: Vector2D = new Vector2D(optimum.getPoint.getEntry(0), optimum.getPoint.getEntry(1))
  println("Results of the LeastSquaresBuilder:")
  println("CIRCLEFITTING: fitted center: " + fittedCenter.getX + " " + fittedCenter.getY)
  println("CIRCLEFITTING: RMS: " + optimum.getRMS)
  println("CIRCLEFITTING: evaluations: " + optimum.getEvaluations)
  println("CIRCLEFITTING: iterations: " + optimum.getIterations + "\n")
}
Tested on Scala version 2.12.6, compiled with sbt version 1.2.8.
Does anybody know how to do this without a fixed radius?
After some research on circle fitting I found a wonderful algorithm in the paper "Error analysis for circle fitting algorithms" by H. Al-Sharadqah and N. Chernov (available here: http://people.cas.uab.edu/~mosya/cl/ ).
I implemented it in scala:
import org.apache.commons.math3.linear.{ Array2DRowRealMatrix, RealMatrix, RealVector, LUDecomposition, EigenDecomposition }
object circleFitFunction {
  def circleFit(dataXY: List[(Double, Double)]) = {
    def square(x: Double): Double = x * x
    def multiply(pair: (Double, Double)): Double = pair._1 * pair._2
    val n: Int = dataXY.length
    val (xi, yi) = dataXY.unzip
    val zi: List[Double] = dataXY map { case (x, y) => x * x + y * y }
    // moments of the data, named as in the paper
    val x: Double = xi.sum / n
    val y: Double = yi.sum / n
    val z: Double = ((xi map square) ++ (yi map square)).sum / n
    val zz: Double = (zi map square).sum / n
    val xx: Double = (xi map square).sum / n
    val yy: Double = (yi map square).sum / n
    val xy: Double = ((xi zip yi) map multiply).sum / n
    val zx: Double = ((zi zip xi) map multiply).sum / n
    val zy: Double = ((zi zip yi) map multiply).sum / n
    val N: RealMatrix = new Array2DRowRealMatrix(Array(
      Array(8 * z, 4 * x, 4 * y, 2),
      Array(4 * x, 1, 0, 0),
      Array(4 * y, 0, 1, 0),
      Array(2.0D, 0, 0, 0)))
    val M: RealMatrix = new Array2DRowRealMatrix(Array(
      Array(zz, zx, zy, z),
      Array(zx, xx, xy, x),
      Array(zy, xy, yy, y),
      Array(z, x, y, 1.0D)))
    val Ninverse = new LUDecomposition(N).getSolver().getInverse()
    val eigenValueProblem = new EigenDecomposition(Ninverse.multiply(M))
    // Get all eigenvalues.
    // As we need only the smallest positive eigenvalue, all negative eigenvalues are replaced by Double.MaxValue
    val eigenvalues: Array[Double] = eigenValueProblem.getRealEigenvalues() map (lambda => if (lambda < 0) Double.MaxValue else lambda)
    // Now get the index of the smallest positive eigenvalue, to get the associated eigenvector
    val i: Int = eigenvalues.zipWithIndex.min._2
    val eigenvector: RealVector = eigenValueProblem.getEigenvector(i)
    // the circle in algebraic form: A(x^2 + y^2) + Bx + Cy + D = 0
    val A = eigenvector.getEntry(0)
    val B = eigenvector.getEntry(1)
    val C = eigenvector.getEntry(2)
    val D = eigenvector.getEntry(3)
    val centerX: Double = -B / (2 * A)
    val centerY: Double = -C / (2 * A)
    val Radius: Double = math.sqrt((B * B + C * C - 4 * A * D) / (4 * A * A))
    // root mean square of the radial residuals
    val RMS: Double = math.sqrt((dataXY map { case (x, y) => Radius - math.sqrt(square(x - centerX) + square(y - centerY)) } map square).sum / n)
    (centerX, centerY, Radius, RMS)
  }
}
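A quick usage sketch, reusing the sample points from the fixed-radius example above, so the fit should land near center (15, -9) and radius 20:
val sample = List(
  (18.36921795, 10.71416674), (0.21196357, -22.46528791),
  (-4.153845171, -14.75588526), (3.784114125, -25.55910336),
  (31.32998899, 2.546924253), (34.61542186, -12.90323269),
  (19.30193011, -28.53185596), (16.05620863, 10.97209111),
  (31.67011956, -20.05020878), (19.91175561, -28.38748712))
val (cx, cy, r, rms) = circleFitFunction.circleFit(sample)
println(s"center = ($cx, $cy), radius = $r, RMS = $rms")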
I kept all the names from the paper (see Chapters 4 and 8, and look for the Hyperfit algorithm) and I tried to limit the matrix operations.
It's still not quite what I need, because this sort of algorithm (an algebraic fit) has known issues with fitting partial circles (arcs) and maybe big circles.
With my data, I once had the situation that it spat out completely wrong results, and I found out that I had an eigenvalue of -0.1. The eigenvector of this value produced the right result, but it was sorted out because of the negative eigenvalue. So this one is not always stable (like so many other circle fitting algorithms).
But what a nice algorithm!
It looks a bit like dark magic to me.
If you don't need too much precision, want a lot of speed, and have data from a full (not too big) circle, this would be my choice.
The next thing I will try is to implement a Levenberg-Marquardt algorithm from the same page I mentioned above.

Join per line two different RDDs in just one - Scala

I'm programming a K-means algorithm in Spark (Scala).
My model predicts which cluster each point belongs to.
Data
-6.59 -44.68
-35.73 39.93
47.54 -52.04
23.78 46.82
....
Load the data
val data = sc.textFile("/home/borja/flink/kmeans/points")
val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()
Cluster the data into ten classes using KMeans
val numClusters = 10
val numIterations = 100
val clusters = KMeans.train(parsedData, numClusters, numIterations)
Predict
val prediction = clusters.predict(parsedData)
However, I need to put the result and the points in the same file, in the following format
[no title; numberOfCluster (1,2,3,...,10), pointX, pointY]:
6 -6.59 -44.68
8 -35.73 39.93
10 47.54 -52.04
7 23.78 46.82
This file is the input of a Python executable that pretty-prints the result.
But my best effort has got me just this (you can see the leading index numbers come out in the wrong order: 68, 384, ...):
var i = 0
val c = sc.parallelize(data.collect().map(x => {
  val tuple = (i, x)
  i += 1
  tuple
}))
i = 0
val c2 = sc.parallelize(prediction.collect().map(x => {
  val tuple = (i, x)
  i += 1
  tuple
}))
val result = c.join(c2)
result.take(5)
Result:
res94: Array[(Int, (String, Int))] = Array((68,(17.79 13.69,0)), (384,(-33.47 -4.87,8)), (440,(-4.75 -42.21,1)), (4,(-33.31 -13.11,6)), (324,(-39.04 -16.68,6)))
Thanks for your help! :)
I don't have a spark cluster handy to test, but something like this should work:
val result = parsedData.map { v =>
  // predict the cluster for each point and format it as "cluster x y"
  val cluster = clusters.predict(v)
  s"$cluster ${v(0)} ${v(1)}"
}
result.saveAsTextFile("/some/output/path")
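If you do want the index-based join from the question, RDD.zipWithIndex produces stable indices without the mutable counter and without collecting everything to the driver. A sketch under the same variable names:
// pair each element with its index, then swap to (index, element) for the join
val indexedPoints = data.zipWithIndex().map(_.swap)
val indexedPredictions = prediction.zipWithIndex().map(_.swap)
val joined = indexedPoints.join(indexedPredictions)
  .sortByKey()
  .map { case (_, (point, cluster)) => s"$cluster $point" }
joined.saveAsTextFile("/some/other/output/path")
Since prediction is derived from parsedData by per-element operations, data.zip(prediction) should also line the two RDDs up directly.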

Functional Programming: Perimeter of a polygon

I am trying to find the perimeter of a polygon in a functional way. I tried my best but I couldn't make it purely functional. This is my code:
object Solution {
  def main(args: Array[String]) {
    var x: Double = 0
    val N = scala.io.StdIn.readInt
    val points = scala.io.Source.stdin.getLines().take(N).toList
    for (i <- 0 to N - 1) {
      if (i == N - 1) x += dist(List(points(i), points(0)))
      else x += dist(List(points(i), points(i + 1)))
    }
    println(x)
  }
  def dist(A: List[String]): Double = {
    scala.math.sqrt(scala.math.pow(A(0).split(" ")(0).toDouble - A(1).split(" ")(0).toDouble, 2) + scala.math.pow(A(0).split(" ")(1).toDouble - A(1).split(" ")(1).toDouble, 2))
  }
}
I enter the number of points of the polygon first and then enter the Cartesian coordinates of each point on a new line.
Can anyone help me make it purely functional?
Start with separating concerns:
// dist should just take 2 points
def dist(a: (Double, Double), b: (Double, Double)): Double = ...
// calculate the perimeter
def perimeter(points: List[(Double, Double)]): Double = {
  // create a list of lines by connecting adjacent points (wrapping back to the head)
  val lines = points zip (points.tail ++ List(points.head))
  // aggregate the length of each line using foldLeft (/:)
  (0d /: lines)((acc, line) => acc + dist(line._1, line._2))
}
def main(args: Array[String]) {
  // main just needs to parse the lines
  val points = ... // parse the points
  println(perimeter(points))
}
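For completeness, here is one way the two elided pieces might look, assuming each of the N input lines holds two space-separated coordinates; this is just a sketch:
def dist(a: (Double, Double), b: (Double, Double)): Double =
  math.hypot(a._1 - b._1, a._2 - b._2) // Euclidean distance
def main(args: Array[String]) {
  val n = scala.io.StdIn.readInt()
  // parse each "x y" line into a (Double, Double) pair
  val points = scala.io.Source.stdin.getLines().take(n).map { line =>
    val Array(px, py) = line.trim.split("\\s+").map(_.toDouble)
    (px, py)
  }.toList
  println(perimeter(points))
}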
Alternatively, consider n = 5 points,
val n = 5
val points = (1 to n).map(_ => Math.random * 10).toArray
and a distance function, for example
def dist(a: Double, b: Double) = math.abs(a - b)
Then iterate continually (in a circle) over sliding pairs of points, take n of those pairs, and sum dist over them:
Iterator.continually(points)
  .flatten
  .sliding(2)
  .take(n)
  .map { case Seq(a, b) => dist(a, b) }
  .sum

Linear operations with slices in breeze

Is it somehow possible to do sliced updates on matrices in Breeze? I get "could not find implicit value for parameter op".
This is Breeze 0.11.2.
val idxs = Seq(0, 1)
val x = DenseMatrix.rand(3, 3)
val y = DenseMatrix.rand(3, 3)
x(idxs, idxs) += y(idxs, idxs) // can't find the implicit value for += here
Analogous code with DenseVectors works properly:
val xv = DenseVector.rand(3)
val yv = DenseVector.rand(3)
xv(idxs) += yv(idxs)
There is an ugly workaround that updates the matrix slices column by column:
val idxs = IndexedSeq(0, 1)
val x: DenseMatrix[Double] = DenseMatrix.zeros(3, 3)
val y = DenseMatrix.rand(3, 3)
for (c <- idxs) {
  val slx = x(::, c)
  val sly = y(::, c)
  slx(idxs) += sly(idxs)
}
It's an oversight. Please open an issue on GitHub.
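Until it is fixed, plain index loops sidestep the missing implicit entirely, since single-element reads and writes need no slicing operators. A minimal sketch:
import breeze.linalg.DenseMatrix

val idxs = Seq(0, 1)
val x = DenseMatrix.rand(3, 3)
val y = DenseMatrix.rand(3, 3)
// update each selected (row, column) entry directly
for (r <- idxs; c <- idxs) {
  x(r, c) += y(r, c)
}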