Does scala offer async non-blocking IO when working with files? - scala

I am using Scala 2.10 and I wonder if there is a package that offers async IO for working with files.
I searched on this topic but mostly found examples like the following:
import java.io.{BufferedWriter, File, FileWriter}
val file = new File(canonicalFilename)
val bw = new BufferedWriter(new FileWriter(file))
bw.write(text)
bw.close()
which is essentially the java.io package with blocking IO operations (write, read, etc.). I also found the scala-io project with this intention, but it seems that the project is dead; its last activity was in 2012.
What is the best practice in this scenario? Is there a Scala package for this, or is the common way to wrap java.io code in Futures and Observables?
My use case: from an Akka actor I need to manipulate files on a local or remote file system, and I need to avoid blocking. Or is there a better alternative?
Thanks for clarifying this.

Scala does not offer explicit API for asynchronous file IO, however the plain Java API is exactly the right thing to use in those cases (this is actually a good thing, we can use all these nice APIs without any wrapping!). You should look into using java.nio.channels.AsynchronousFileChannel, which is available since JDK7 and makes use of the underlying system async calls for file IO.
Akka IO, while not providing file IO in its core, has a module developed by Dario Rexin which allows you to use AsynchronousFileChannel with Akka IO in a very simple manner. Have a look at this library to make use of it: https://github.com/drexin/akka-io-file
In the near future Akka will provide file IO in its akka-streams module. It may be an external library for a while though; we're not exactly sure yet where to put this, as it will require users to have at least JDK 7, while most of Akka currently supports JDK 6. That said, streams-based asynchronous back-pressured file IO is coming soon :-)

If you're using scalaz-stream for your async support, it has file functionality that's built on the java.nio async APIs; that's probably the approach I'd recommend. If you're using standard Scala futures, you can possibly use akka-io, which I think uses Netty as a backend. Or you can call NIO directly: it only takes a couple of lines to adapt a callback-based API to scalaz or Scala futures.
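As a rough illustration of adapting the callback-based java.nio.channels.AsynchronousFileChannel API to a standard Scala Future, here is a minimal sketch. The object name AsyncFiles, the method readString and the maxBytes parameter are made up for this example; it reads only the first maxBytes bytes of the file and assumes UTF-8 content, so treat it as a starting point rather than a complete solution.

```scala
import java.nio.ByteBuffer
import java.nio.channels.{AsynchronousFileChannel, CompletionHandler}
import java.nio.charset.StandardCharsets
import java.nio.file.{Path, StandardOpenOption}
import scala.concurrent.{Future, Promise}
import scala.util.control.NonFatal

// Hypothetical helper, not part of any library.
object AsyncFiles {
  // Reads up to maxBytes from the start of the file; the calling thread is not blocked.
  def readString(path: Path, maxBytes: Int = 64 * 1024): Future[String] = {
    val promise = Promise[String]()
    try {
      val channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ)
      val buffer = ByteBuffer.allocate(maxBytes)
      // The buffer is passed as the attachment so the handler gets it back.
      channel.read(buffer, 0L, buffer, new CompletionHandler[Integer, ByteBuffer] {
        def completed(bytesRead: Integer, buf: ByteBuffer): Unit = {
          channel.close()
          buf.flip() // switch the buffer from writing to reading
          promise.success(StandardCharsets.UTF_8.decode(buf).toString)
        }
        def failed(cause: Throwable, buf: ByteBuffer): Unit = {
          channel.close()
          promise.failure(cause)
        }
      })
    } catch {
      // open() and read() can also throw synchronously, e.g. for a missing file.
      case NonFatal(e) => promise.tryFailure(e)
    }
    promise.future
  }
}
```

Inside an actor, the resulting Future can be handled with the usual combinators (map, onComplete) or piped back to the actor as a message, so the actor thread never waits on the disk.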

Related

How to use scala actor based event sourcing app from another language (python)

I'm somewhat familiar with Scala and less familiar with Akka, although I know what the actor model is (the idea seems quite simple).
So let's say that right now this is my code (in reality what I need is an event sourcing application). I need to be able to use it from any language, not just from the JVM.
So of course I googled about that and I've found this. The problem is that, if my understanding is correct, I would need to create some custom protocol, deserialization and dispatching for zmq messages, and that is totally uncool. Maybe a solution for that exists already? If not, then how do I do that in the most efficient way? Maybe I need to create some message case classes and something like a facade actor that would do the deserialization?
import akka.actor.{Actor, ActorSystem, Props}

class HelloActor extends Actor {
  def receive = {
    case "hello" => println("well, helllo!")
    case _ => println("huh?")
  }
}
object Main extends App {
  val system = ActorSystem("HelloSystem")
  val helloActor = system.actorOf(Props[HelloActor], name = "helloactor")
  helloActor ! "hello"
  helloActor ! "buenos dias"
}
There are many ways to do this, depending on the protocol you are using, etc. One language-specific option is Pyro. Just like in Java, you can serialize generic objects in Python and then transfer them over the network, which is what you can use Pyro for. You can take advantage of the fact that Python is implemented both on the JVM (Jython) and natively. I'm not sure it's a great idea to write this just in Scala and Python; I would create the API in Java and add it to the Scala classpath, so that any other JVM language can also use your API. In addition, it's more common to use Jython with Java, so there are other benefits that come with being in the majority.
But anyway, the common language that the JVM and Python will both understand is these serialized Python objects. So what you will need to know is:
How to use jython with java
How to use pyro
And yes, using Scala with Jython is only a matter of adding the jars to the classpath, as you probably already know.
EDIT: Ok I think I might not have made this method clear enough. So basically:
JVM uses jython to create a jython instance, which is sent to a remote python object. The communication is done with the module Pyro. This program can send serialized python objects back as well.
This is what happens normally with remote actors in java, except the messages are implementing Serializable. Python and Java are not in the same process, or using native methods, or anything like that. They can be on the same machine or different machines. This method is not platform specific.
Hopefully this method is useful to someone.
In my case the Akka actor solution was a little bit overkill, so I ended up implementing my own event sourcing solution in this open source project.
The persistence layer is a decision for the developer, but I provide practical examples of execution using Couchbase.
Take a look in case you find it useful.
https://github.com/politrons/Scalaydrated

Equivalent of Akka ByteString in Scala standard API

Is anyone aware of a standard API equivalent to Akka's ByteString: http://doc.akka.io/api/akka/2.3.5/index.html#akka.util.ByteString
This very convenient class has no dependency on any other Akka code, and it saddens me to have to import the whole Akka jar just to use it.
I found this fairly old discussion mentioning adding it to the standard API, but I don't know what happened to this project: https://groups.google.com/forum/#!msg/scalaz/ZFcjGpZswRc/0tCIdXvpGBAJ
Does anyone know of an equivalent piece of code in the standard API? Or in a very lightweight library?
You might want to check out scodec-bits. It provides two types, BitVector and ByteVector (API docs), supporting fast appends, take, drop, random access, etc. The library has zero dependencies. We split it out of scodec precisely because we thought it might be of general use outside of scodec, where it's used heavily.

What well developed iteratee/pipes libraries are available for Scala?

Does Scala have any well developed libraries in the spirit of Haskell's pipes, or at least iteratee?
I found Play's iteratee library first, but I couldn't make it work, and it seems tightly coupled with Play's concurrency primitive Promise, which could be inappropriate in many cases.
Scalaz has some iteratee support (like IterV), but it seems there are only core classes with no additional support functions, predefined iteratees/enumerators etc. Also I couldn't find any documentation, even scaladoc is very sparse, so it's quite difficult to use properly.
And I couldn't find anything similar to pipes.
Building up on comments from Travis, currently there are:
Scalaz 7 iteratee package (iterv, you mentioned, is a compatibility layer with scalaz 6)
A port of Conduit library
Runar's scala-machines library (presentation, haskell version)

Simple and concise HTTP client library for Scala

I need a mature HTTP client library that is idiomatic to scala, concise in usage, simple semantics. I looked at the Apache HTTP and the Scala Dispatch and numerous new libraries that promise an idiomatic Scala wrapping. Apache HTTP client sure demands verbosity, while Dispatch was easily confusing.
What is a suitable HTTP client for Scala usage?
I did a comparison of most major HTTP client libraries available.
Dispatch, and a few other libraries, are not maintained anymore.
The only serious ones currently are spray-client and Play! WS.
spray-client is a bit arcane in its syntax. play-ws is quite easy to use:
(build.sbt)
libraryDependencies += "com.typesafe.play" %% "play-ws" % "2.4.3"
(basic usage)
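import play.api.libs.ws.ning.NingWSClient
import scala.concurrent.ExecutionContext.Implicits.global // needed by map on the returned Future

val wsClient = NingWSClient()
wsClient
  .url("http://wwww.something.com")
  .get()
  .map { wsResponse =>
    // read the response
  }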
I've recently started using Dispatch, which is a bit arcane (great general intro, serious lack of detailed scenario/use-case based docs). Dispatch 0.9.1 is a Scala wrapper around Ning's Async Http Client; fully understanding what is going on requires familiarizing yourself with that library. In practice, the only thing I really had to look at was the RequestBuilder; everything else fell nicely into my understanding of HTTP.
I give the 0.9 release a solid thumbs up (so far!) on getting the job done very simply, once you get past that initial learning curve.
Dispatch's Http "builder" is immutable and seems to work well in a threaded environment, though I can't find anything in the docs stating that it is thread-safe; a general reading of the source suggests that it is.
Do be aware that the RequestBuilders are mutable, and therefore are NOT thread-safe.
Here are some additional links I've found helpful:
I can't find a ScalaDoc link for the 0.9.* release, so I browse the source code for the 0.9.* release;
ScalaDoc for the 0.8 release; a substantially different beast (today) than 0.9.
The "Periodic" Table of operators, also 0.8 related.
The older 0.8 "dispatch-classic" docs helped me understand how they used the url builders, and gave some hints on how things are tied together that did carry forward to 0.9.
A little late to the party here, but I've been impressed with spray-client.
It's got a nice DSL for building requests, supports both sync and async execution, as well as a variety of (un)marshalling types (JSON, XML, forms). It plays very nicely with Akka, too.
sttp is the Scala HTTP library we've all been waiting for!
It has a fluent DSL for forming and executing requests (code samples from their README):
val request = sttp
  .cookie("session", "*!##!#!$")
  .body(file) // of type java.io.File
  .put(uri"http://httpbin.org/put")
  .auth.basic("me", "1234")
  .header("Custom-Header", "Custom-Value")
  .response(asByteArray)
It supports synchronous, asynchronous, and streaming calls via pluggable backends, including Akka-HTTP (formerly Spray) and the venerable AsyncHttpClient (Netty):
implicit val sttpHandler = AsyncHttpClientFutureHandler()
val futureFirstResponse: Future[Response[String]] = request.send()
It supports scala.concurrent.Future, scalaz.concurrent.Task, monix.eval.Task, and cats.effect.IO - all the major Scala IO monad libraries.
Plus it has a few additional tricks up its sleeve:
It has case class representations for both requests and responses (although it doesn't go as far as having e.g. strongly typed headers):
https://github.com/softwaremill/sttp/blob/master/core/src/main/scala/com/softwaremill/sttp/RequestT.scala
https://github.com/softwaremill/sttp/blob/master/core/src/main/scala/com/softwaremill/sttp/Response.scala
It provides a URI string interpolator:
val test = "chrabąszcz majowy"
val testUri: Uri = uri"http://httpbin.org/get?bug=$test"
It supports encoders/decoders for request bodies/responses e.g. JSON via Circe:
import com.softwaremill.sttp.circe._
val response: Either[io.circe.Error, Response] =
  sttp
    .post(uri"...")
    .body(requestPayload)
    .response(asJson[Response])
    .send()
Finally, it's maintained by the reliable folks at softwaremill and it's got great documentation.
Six years after originally responding to this post, I would have a different answer.
I've been using akka-http, a collaboration between the spray and akka teams. It's backed by Lightbend, tightly aligned with the akka async environment... it's the right tool for this job.
Having had some unhappy experiences with the Apache client, I set about writing my own. The built-in HttpURLConnection is widely asserted to be buggy. But that's not my experience of it. In fact, the reverse has been so, the Apache client having a somewhat problematic threading model. Since Java6 (or 5?), HttpURLConnection has provided efficient HTTP1.1 connections with essentials like keep-alive being built in, and it handles concurrent usage without fuss.
So, to compensate for the inconvenient API offered by HttpURLConnection, I set about writing a new API in Scala, as an open-source project. It's just a wrapper for HttpURLConnection, but unlike HttpURLConnection, it aims to be easy to use. Unlike Apache client, it should fit easily into an existing project. Unlike Dispatch, it should be easy to learn.
It's called Bee Client
Website: http://www.bigbeeconsultants.co.uk/bee-client
API docs: http://www.bigbeeconsultants.co.uk/docs/bee-client/latest.html
My apologies for the shameless plug. :)
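To give a feel for what a thin wrapper over the built-in HttpURLConnection might look like, here is a minimal sketch using only the JDK and the Scala standard library. It is not Bee Client's API; the object name SimpleHttp, the method get and the timeoutMillis parameter are invented for this illustration, and the call is blocking.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

// Hypothetical helper, shown only to illustrate wrapping HttpURLConnection.
object SimpleHttp {
  // Performs a blocking GET and returns (status code, body decoded as UTF-8).
  def get(url: String, timeoutMillis: Int = 5000): (Int, String) = {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("GET")
    conn.setConnectTimeout(timeoutMillis)
    conn.setReadTimeout(timeoutMillis)
    try {
      val status = conn.getResponseCode
      // For 4xx/5xx responses the body is on the error stream, which may be null.
      val stream = if (status >= 400) conn.getErrorStream else conn.getInputStream
      val body =
        if (stream == null) ""
        else {
          val source = Source.fromInputStream(stream, "UTF-8")
          try source.mkString finally source.close()
        }
      (status, body)
    } finally conn.disconnect()
  }
}
```

A call such as SimpleHttp.get("http://example.com/") returns the status and body as a tuple; wrapping the call in a Future on a dedicated thread pool is one way to keep it away from latency-sensitive threads.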
Besides Dispatch there is not much out there. scalaz made an attempt at building a functional HTTP client, but it has been outdated for a while and no version of it exists in the scalaz7 branch. Additionally, there is a useful wrapper of the Ning async-http-client within the Play framework. There you can make calls like:
WS.url("http://example.com/feed").get()
WS.url("http://example.com/item").post("content")
You can use this API as inspiration if you don't use play! in your project and dislike the Dispatch API.
Spray
You really should consider using Spray. In my opinion it has a bit of tricky syntax, but it is still pretty usable if you aim to build a high-performance http client. The main advantage of using Spray is that it is based on the akka actor library, which is extremely scalable and powerful. You can scale out your http client to several machines by only changing conf files.
Moreover, a few months ago Spray joined Typesafe, and as I understand it, it will become a part of the basic Akka distribution. (proof)
Play2
Another option is the Play2 WS lib (doc). As far as I know it is still not separated from the Play distribution, but because it is so simple, it is worth spending some time attaching the whole Play framework to get that part. There are some issues with providing configuration to it, so it is not great for drop-in use. However, we have used it in a few non-Play projects and everything was fine.
ScalaJ-Http is a very simple synchronous http client
https://github.com/scalaj/scalaj-http
I'd recommend it if you need a no-ceremony barebones Scala client.
Surprised that no one mentioned finagle here. It is super simple to use:
import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http
import com.twitter.util.{Await, Future}
object Client extends App {
  val client: Service[http.Request, http.Response] = Http.newService("www.scala-lang.org:80")
  val request = http.Request(http.Method.Get, "/")
  request.host = "www.scala-lang.org"
  val response: Future[http.Response] = client(request)
  Await.result(response.onSuccess { rep: http.Response =>
    println("GET success: " + rep)
  })
}
See the quick start guide for more detail: https://twitter.github.io/finagle/guide/Quickstart.html
I've used Dispatch, Spray Client and the Play WS Client Library. None of them were simple to use or configure, so I created a simpler HTTP client library which lets you perform all the classic HTTP requests in simple one-liners.
See an example:
import cirrus.clients.BasicHTTP.GET
import scala.concurrent.Await
import scala.concurrent.duration._
object MinimalExample extends App {
  val html = Await.result(Cirrus(GET("https://www.google.co.uk")), 3 seconds)
  println(html)
}
... produces ...
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-GB">...</html>
The library is called Cirrus and is available via Maven Central
libraryDependencies += "com.github.godis" % "cirrus_2.11" % "1.4.1"
The documentation is available on GitHub
https://github.com/Godis/Cirrus

How to make a code thread safe in scala?

I have some code in Scala in which, for various reasons, a few lines must not be executed by more than one thread at the same time.
How can I easily make it thread-safe? I know I could use the actor model, but I find it overkill for a few lines of code.
I would use some kind of lock, but I cannot find any concrete examples on either Google or Stack Overflow.
I think that the most simple solution would be to use synchronized for critical sections (just like in Java). Here is Scala syntax for it:
someObj.synchronized {
  // thread-safe part
}
It's easy to use, but it blocks and can easily cause deadlocks, so I encourage you to look at java.util.concurrent or Akka for, probably, more complicated, but better/non-blocking solutions.
You can use any Java concurrency construct, such as Semaphores, but I'd recommend against it, as semaphores are error prone and clunky to use. Actors are really the best way to do it here.
Creating actors is not necessarily hard. There is a short but useful tutorial on actors over at scala-lang.org: http://www.scala-lang.org/node/242
If it is really very simple you can use synchronized: http://www.ibm.com/developerworks/java/library/j-scala02049/index.html
Or you could use some of the classes from the concurrent package in the jdk: http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html
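As one possible use of the java.util.concurrent classes, here is a small sketch of an explicit lock guarding a critical section. The class name LockedCounter and its methods are made up for this example; the important part is the lock()/try/finally unlock() pattern, which releases the lock even if the guarded code throws.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical example class: a counter whose updates are serialized by a lock.
final class LockedCounter {
  private val lock = new ReentrantLock()
  private var count = 0

  // Only one thread at a time can run the body between lock() and unlock().
  def increment(): Int = {
    lock.lock()
    try {
      count += 1
      count
    } finally lock.unlock()
  }

  // Reads also take the lock so they see the latest written value.
  def current: Int = {
    lock.lock()
    try count finally lock.unlock()
  }
}
```

Compared with synchronized, ReentrantLock also offers tryLock and timed variants, which can help when waiting indefinitely for the lock is not acceptable.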
If you want to use actors, you should use akka actors (they will replace scala actors in the future), see here: http://doc.akka.io/docs/akka/2.0.1/. They also support things like FSM (Finite State Machine) and STM (Software Transactional Memory).
In general try to use pure 'functions' or methods with immutable data structures that should help with thread safety.
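One way to combine immutable data structures with shared state, sketched here under the assumption that a single shared reference is enough, is to keep an immutable collection inside an AtomicReference and swap it with compare-and-set. The class name EventLog and its methods are invented for this illustration.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

// Hypothetical example class: a lock-free log backed by an immutable Vector.
final class EventLog {
  private val state = new AtomicReference(Vector.empty[String])

  // Builds a new Vector and retries if another thread changed the reference first.
  @tailrec
  def append(event: String): Unit = {
    val current = state.get()
    if (!state.compareAndSet(current, current :+ event)) append(event)
  }

  // The returned Vector is immutable, so readers can use it without any locking.
  def snapshot: Vector[String] = state.get()
}
```

Because each Vector is never modified after creation, a reader holding a snapshot is unaffected by later appends.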