I would like to use a named pipe to connect between Scala and Python. However, I am not able to find examples of named pipes in Scala. My Python program generates data and sends it to either a C++ program or a Scala program; the Python file is the producer, and either C++ or Scala is the consumer. I have the named-pipe connection working between Python and C++, but not between Python and Scala.
My OS is Ubuntu 14.04.
It would be great if anyone could post examples of named pipes in Scala.
Thanks in advance.
Here's a Scala-centric approach just to get things started.
// first mkfifo /tmp/NP
scala> import scala.io.Source
import scala.io.Source
// this will block until the pipe is written to
scala> val itr = Source.fromFile("/tmp/NP")
itr: scala.io.BufferedSource = non-empty iterator
// after I write "12345 to the end" to the pipe I can work with the iterator
scala> itr.descr
res9: String = file:/tmp/NP
scala> itr.hasNext
res10: Boolean = true
scala> itr.next
res11: Char = 1
scala> itr.next
res12: Char = 2
scala> itr.mkString
res13: String =
"345 to the end
"
etc.
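Outside the REPL, the same idea packages into a small standalone consumer. This is a minimal sketch of my own, assuming the FIFO already exists (mkfifo /tmp/NP) and the Python producer writes newline-terminated text:
import scala.io.Source

object PipeConsumer {
  def main(args: Array[String]): Unit = {
    // opening the FIFO blocks until the producer opens its end for writing
    val source = Source.fromFile("/tmp/NP")
    try source.getLines().foreach(println) // ends when the writer closes the pipe
    finally source.close()
  }
}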
As part of an effort to convert Java code to Scala, I need to convert the Java stream Files.walk(Paths.get(ROOT)) to Scala. I can't find a solution by googling; asScala won't do it. Any hints?
Here is the related code:
import static org.springframework.hateoas.mvc.ControllerLinkBuilder.linkTo;
import static org.springframework.hateoas.mvc.ControllerLinkBuilder.methodOn;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;
// ...snip...
Files.walk(Paths.get(ROOT))
.filter(path -> !path.equals(Paths.get(ROOT)))
.map(path -> Paths.get(ROOT).relativize(path))
.map(path -> linkTo(methodOn(FileUploadController.class).getFile(path.toString())).withRel(path.toString()))
.collect(Collectors.toList());
The Files.walk(Paths.get(ROOT)) return type is Stream<Path> in Java.
There is a slightly nicer way that doesn't need the compat layer or the experimental 2.11 features mentioned in @marcospereira's answer.
Basically just use an iterator:
import java.nio.file.{Files, Paths}
import scala.collection.JavaConverters._
Files.list(Paths.get(".")).iterator().asScala
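One caveat worth adding here (my note, not part of the original answer): the stream returned by Files.list holds an open directory handle, so close it once the iterator has been drained. A minimal sketch:
import java.nio.file.{Files, Path, Paths}
import scala.collection.JavaConverters._

val stream = Files.list(Paths.get("."))
val entries: List[Path] =
  try stream.iterator().asScala.toList // drain the iterator eagerly
  finally stream.close()               // release the directory handle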
Starting with Scala 2.13, the standard library includes scala.jdk.StreamConverters, which provides Java-to-Scala stream conversions:
import scala.jdk.StreamConverters._
val javaStream = Files.walk(Paths.get("."))
// javaStream: java.util.stream.Stream[java.nio.file.Path] = java.util.stream.ReferencePipeline$3@51b1d486
javaStream.toScala(LazyList)
// scala.collection.immutable.LazyList[java.nio.file.Path] = LazyList(?)
// a Java stream can only be consumed once, so build a fresh one
// before converting again:
Files.walk(Paths.get(".")).toScala(Iterator)
// Iterator[java.nio.file.Path] = <iterator>
Note the usage of LazyList (as opposed to Stream), since Streams are deprecated in Scala 2.13; LazyList is the supported replacement type.
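For example (a sketch of my own, not from the original answer), the LazyList conversion preserves laziness, so only the prefix you actually demand is materialized:
import java.nio.file.{Files, Paths}
import scala.jdk.StreamConverters._

// only the first three paths of the walk are forced; the rest of the
// directory tree is never traversed
val firstThree = Files.walk(Paths.get(".")).toScala(LazyList).take(3).toList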
Java 8 Streams and Scala Streams are conceptually different things; the Java 8 Stream is not a collection, so the usual collection converters won't work. You can use the scala-java8-compat library (on GitHub) to add a toScala method to Java Streams:
import scala.compat.java8.StreamConverters._
import java.nio.file.{ Files, Path, Paths }
val scalaStream: Stream[Path] = Files.walk(Paths.get(".")).toScala[Stream]
You can't really use this conversion (Java->Scala) from Java, so if you have to do this from Java, it's easier (but still awkward) to just run the stream and build the Scala Stream yourself (which is what the aforementioned library is doing under the hood):
import scala.collection.immutable.Stream$;
import scala.collection.mutable.Builder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
final Stream<Path> stream = Files.walk(Paths.get("."));
final Builder<Path, scala.collection.immutable.Stream<Path>> builder = Stream$.MODULE$.newBuilder();
stream.forEachOrdered(builder::$plus$eq);
final scala.collection.immutable.Stream<Path> result = builder.result();
However, both ways will fully consume the Java Stream, so you don't get the benefit of lazy evaluation by converting it to a Scala Stream, and you might as well just convert it directly to a Vector.
If you just want to use Scala's function-literal syntax, there are different ways to achieve this. You could use the same library's function converters, which work much like the collection converters:
import scala.compat.java8.FunctionConverters._
import java.nio.file.{ Files, Path, Paths }
val p: Path => Boolean = p => Files.isExecutable(p)
val stream: java.util.stream.Stream[Path] = Files.walk(Paths.get(".")).filter(p.asJava)
Alternatively, since 2.11, Scala has experimental support for SAM types under the -Xexperimental flag. This becomes non-experimental, with no flag needed, in 2.12.
$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_92).
Type in expressions for evaluation. Or try :help.
scala> import java.nio.file.{ Files, Path, Paths }
import java.nio.file.{Files, Path, Paths}
scala> Files.walk(Paths.get(".")).filter(p => Files.isExecutable(p))
<console>:13: error: missing parameter type
Files.walk(Paths.get(".")).filter(p => Files.isExecutable(p))
^
scala> :set -Xexperimental
scala> Files.walk(Paths.get(".")).filter(p => Files.isExecutable(p))
res1: java.util.stream.Stream[java.nio.file.Path] = java.util.stream.ReferencePipeline$2@589838eb
scala> Files.walk(Paths.get(".")).filter(Files.isExecutable)
res2: java.util.stream.Stream[java.nio.file.Path] = java.util.stream.ReferencePipeline$2@185d8b6
I am still very new to Spark and Scala, but very familiar with Java. I have a Java jar with a function that returns a List (java.util.List) of Integers, and I want to convert these to a Spark Dataset so I can append it to another column and then perform a join. Is there an easy way to do this? I've tried code similar to this:
val testDSArray : java.util.List[Integer] = new java.util.ArrayList[Integer]()
testDSArray.add(4)
testDSArray.add(7)
testDSArray.add(10)
val testDS : Dataset[Integer] = spark.createDataset(testDSArray, Encoders.INT())
but it gives me a compiler error (cannot resolve overloaded method).
If you look at the type signature, you will see that in Scala the encoder is passed in a second (and implicit) parameter list.
You may:
Pass it in another parameter list.
val testDS = spark.createDataset(testDSArray)(Encoders.INT)
Don't pass it, and let Scala's implicit resolution find it.
import spark.implicits._
val testDS = spark.createDataset(testDSArray)
Convert the Java List to a Scala one first.
import collection.JavaConverters._
import spark.implicits._
val testDS = testDSArray.asScala.toDS()
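From there, if the goal is the append-and-join you describe, a hedged sketch (otherDF and the column name "id" are hypothetical stand-ins for your actual data):
import collection.JavaConverters._
import spark.implicits._

val testDF = testDSArray.asScala.toDS().toDF("id") // name the single column
val joined = otherDF.join(testDF, Seq("id"))       // join on the shared column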
I am working in the Scala Spark Shell and have the following RDD:
scala> docsWithFeatures
res10: org.apache.spark.rdd.RDD[(Long, org.apache.spark.mllib.linalg.Vector)] = MapPartitionsRDD[162] at repartition at <console>:9
I previously saved this to text using:
docsWithFeatures.saveAsTextFile("path/to/file")
Here's an example line from the text file (which I've shortened here for readability):
(22246418,(112312,[4,11,14,15,19,...],[109.0,37.0,1.0,3.0,600.0,...]))
Now, I know I could have saved this as an object file to simplify things, but the raw text format is better for my purposes.
My question is: what is the proper way to get this text file back into an RDD of the same format as above (i.e., an RDD of (integer, sparse vector) tuples)? I'm assuming I just need to load it with sc.textFile and then apply a couple of mapping functions, but I'm very new to Scala and not sure how to go about it.
A simple regular expression and built-in vector utilities should do the trick:
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
def parse(rdd: RDD[String]): RDD[(Long, Vector)] = {
  val pattern: scala.util.matching.Regex = "\\(([0-9]+),(.*)\\)".r
  rdd.map {
    case pattern(k, v) => (k.toLong, Vectors.parse(v))
  }
}
Example usage:
val docsWithFeatures = sc.parallelize(Seq(
  "(22246418,(4,[1],[2.0]))", "(312332123,(3,[0,2],[-1.0,1.0]))"))
parse(docsWithFeatures).collect
// Array[(Long, org.apache.spark.mllib.linalg.Vector)] =
// Array((22246418,(4,[1],[2.0])), (312332123,(3,[0,2],[-1.0,1.0])))
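Applying it to the file you saved earlier is then just (using the path from your example):
val restored = parse(sc.textFile("path/to/file"))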
I would like to be able to get an instance of a Scala Object from a String specified in config. Say for example I have a config property db.driver = "scala.slick.driver.H2Driver"
I would like to be able to get an instance of this object, H2Driver. Obviously I could create a map from config strings to actual objects, but that seems like a hassle. I could also do it by appending a $ to the config value and loading the module, i.e.
val cl = Class.forName("scala.slick.driver.H2Driver$") //note the $
val driverObj = cl.getField("MODULE$").get(null).asInstanceOf[JdbcProfile]
But I was hoping there was a neater way to do this in Scala 2.10 using the newer reflections API.
This works the same way in 2.10 (shown here in a 2.11 REPL):
Welcome to Scala version 2.11.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import reflect.runtime._
import reflect.runtime._
scala> currentMirror staticModule "scala.Array"
res0: reflect.runtime.universe.ModuleSymbol = object Array
scala> currentMirror reflectModule res0
res1: reflect.runtime.universe.ModuleMirror = module mirror for scala.Array (bound to null)
scala> .instance
res2: Any = scala.Array$@67b467e9
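Wrapped up outside the REPL, a minimal sketch for the config use case (the generic helper below is my own; the driver and profile names come straight from your question):
import scala.reflect.runtime.{currentMirror => cm}

def objectInstance[T](fqn: String): T = {
  val module = cm.staticModule(fqn)                 // resolve the object's symbol
  cm.reflectModule(module).instance.asInstanceOf[T] // get the singleton instance
}

val driver = objectInstance[scala.slick.driver.JdbcProfile]("scala.slick.driver.H2Driver")
Note there's no trailing $ here: staticModule resolves the object symbol directly from the plain name in your config.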