What's the use case for gRPC where it would clearly outperform REST?

I put together a simple benchmark for the simplest case: sending the string Hello world over gRPC and REST in Ruby:
# REST example
require 'sinatra'
set :bind, ''
set :logging, false
get '/' do
  'Hello, world!'
end
The gRPC example is based on the official examples:
// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}
// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}
// The response message containing the greetings
message HelloReply {
  string message = 1;
}
class GreeterServer < Helloworld::Greeter::Service
  def say_hello(hello_req, _unused_call)
    Helloworld::HelloReply.new(message: "Hello #{hello_req.name}")
  end
end
I deployed this code to a remote server and ran a 1000-request benchmark (ab for REST, and a client looping over requests for gRPC) and got comparable results: 51 sec vs 53 sec (REST vs gRPC).
So I concluded that in this case (a small amount of data in the response) gRPC offers no benefit. When would the benefits appear? When the data size is on the order of kilobytes or even megabytes? Or are there essentially different use cases for gRPC, such as streaming and duplex data exchange between server and client?

This blog post indicates that gRPC performs better while being slightly harder to use.
I think the improved performance comes from using protocol buffers for data transmission. I believe that means data is transmitted in a binary format, which should improve performance when you have more data, particularly non-string data.
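To make the binary-versus-text point concrete, here is a minimal sketch using only Python's standard library. It does not use Protocol Buffers: `struct` stands in for a compact binary encoding, and the record fields (`sensor_id`, `timestamp`, `value`) are hypothetical. It only compares encoded sizes and says nothing about end-to-end latency.

```python
import json
import struct

# Hypothetical record with numeric fields, the kind of non-string data
# where a binary encoding tends to be more compact than text.
record = {"sensor_id": 1234567, "timestamp": 1700000000, "value": 3.141592653589793}

def encode_json(rec):
    """Text encoding: field names and digits travel as UTF-8 characters."""
    return json.dumps(rec).encode("utf-8")

def encode_binary(rec):
    """Fixed-layout binary encoding (a stand-in for protobuf, not protobuf itself):
    unsigned 32-bit int, unsigned 64-bit int, 64-bit float, little-endian."""
    return struct.pack("<IQd", rec["sensor_id"], rec["timestamp"], rec["value"])

def decode_binary(payload):
    """Inverse of encode_binary."""
    sensor_id, timestamp, value = struct.unpack("<IQd", payload)
    return {"sensor_id": sensor_id, "timestamp": timestamp, "value": value}

json_bytes = encode_json(record)
binary_bytes = encode_binary(record)
print(len(json_bytes), len(binary_bytes))
```

For a payload that is a single short string such as "Hello world", both encodings carry essentially the same bytes, which would be consistent with the near-identical timings in the question.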


Sending two binary messages in one HTTP REST request

I have a REST service running on a server using Python Flask, and a REST client built in Java. I need to send two binary messages in a single HTTP REST request. There are two because they are different protobuf message types, but they are interrelated and should go in a single HTTP request. How can we accomplish that on the sending and receiving sides?
The simplest option here may be to simply declare a wrapper message type:
message FooRequest {
  // remove "required" if using proto3 syntax
  required Request1MessageType part1 = 1;
  required Request2MessageType part2 = 2;
}
and send a single FooRequest composed of the two inner messages. This is not always possible, however, in which case you'll have to implement your own framing mechanism inside the binary payload. A simple but pragmatic option might be to measure the size of the first message (in bytes) - i.e. len, and send:
[len, 4 bytes little endian][message 1, len bytes][message 2]
and decode it again at the other end - i.e. take the first 4 bytes and use that to calculate the ranges of the two inner messages. In anticipation of requiring more messages in the future, it might make sense to include a length prefix against every message (i.e. also include a length prefix for message 2) - but strictly speaking it would be redundant in the current case.
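The framing described above could be sketched as follows for the Python (receiving) side. The function names are hypothetical, and the two messages are treated as opaque byte strings, for example the serialized bytes of each protobuf message.

```python
import struct

def frame_two(msg1: bytes, msg2: bytes) -> bytes:
    """Build [len(msg1), 4 bytes little-endian][msg1][msg2]."""
    return struct.pack("<I", len(msg1)) + msg1 + msg2

def unframe_two(payload: bytes):
    """Split a payload produced by frame_two back into (msg1, msg2)."""
    if len(payload) < 4:
        raise ValueError("payload shorter than the 4-byte length prefix")
    (length,) = struct.unpack_from("<I", payload, 0)
    if 4 + length > len(payload):
        raise ValueError("length prefix exceeds payload size")
    return payload[4:4 + length], payload[4 + length:]

def frame_all(messages) -> bytes:
    """Variant with a length prefix in front of every message."""
    return b"".join(struct.pack("<I", len(m)) + m for m in messages)

def unframe_all(payload: bytes):
    """Inverse of frame_all: read length-prefixed messages until the end."""
    messages, offset = [], 0
    while offset < len(payload):
        if offset + 4 > len(payload):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from("<I", payload, offset)
        offset += 4
        if offset + length > len(payload):
            raise ValueError("truncated message")
        messages.append(payload[offset:offset + length])
        offset += length
    return messages
```

In a Flask handler the raw body would typically come from `request.get_data()` and be passed to `unframe_two`. On the Java side, the same layout can be written with a `ByteBuffer` set to `ByteOrder.LITTLE_ENDIAN`.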

What is the best way to communicate between python 2 applications using callbacks?

I have 2 independent Python 2 applications running on the same Linux (Ubuntu) machine.
I want to send messages from one to the other (bidirectionally) and receive these messages inside a callback function.
Is it possible? Do you have any example as reference?
There are different options available for communicating between python apps.
A simple one would be to use an API based on HTTP. Each application exposes a specific port, and communication takes place by exchanging HTTP requests.
There are several frameworks that allow you to build this in a few steps. For example, using Bottle:
In app1:
from bottle import route, run, request
@route('/action_1', method='POST')
def action_1_handler():
    data = request.json
    # Do something with data
    return {'success': True, 'data': {'some_data': 1}}
run(host='localhost', port=8080)
In app2:
import requests
r = requests.post("http://localhost:8080/action_1", json={'v1': 123, 'v2': 'foo'})
print r.status_code
# 200
data = r.json()
# {u'data': {u'some_data': 1}, u'success': True}
Note that if the action executed at app1 after receiving the HTTP request takes a lot of time, this could result in a timeout error. In such a case, consider running the action in another thread or using an alternative communication protocol (e.g. sockets, ZeroMQ Messaging Library).
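One way to run the action in another thread is sketched below, using only the standard library. The names `start_worker` and `slow_action` are hypothetical. The sketch is written for Python 3; the question targets Python 2, where the `queue` module is named `Queue`.

```python
import queue
import threading

def slow_action(message):
    """Placeholder for the long-running work done on each message."""
    return {"success": True, "echo": message}

def start_worker(on_result):
    """Start a background thread that processes queued messages and hands
    each result to the on_result callback. Returns (inbox, thread)."""
    inbox = queue.Queue()

    def loop():
        while True:
            message = inbox.get()
            if message is None:  # sentinel: stop the worker
                break
            on_result(slow_action(message))

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return inbox, thread
```

The HTTP handler would then only call `inbox.put(request.json)` and return immediately, while the callback deals with the result once the work is done.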

Using HTTP response headers to communicate server-side errors from the backend to the front-end

I am working on a REST backend consumed by a javascript/ajax front-end.
I am trying to find a way to deal with invalid requests sent over by the front-end to the backend.
One of the issues I have is that HTTP status codes such as 400 or 409 are not fine-grained enough to cover business logic errors such as passwords not matching (in the case of a user changing their password) or an email being unknown to the system (in the case of a user trying to sign in to the application).
I am thinking of using HTTP response headers to communicate server-side errors from the backend to the front-end.
I could for instance have an Error enum (or a class with constants) as follows:
public enum Error {
    UNKNOWN_EMAIL
    // ...
}
I would then use that enum in order to set the headers on the response as follows:
response.setHeader(Error.UNKNOWN_EMAIL.name(), "true");
... and deal with the error appropriately on the front-end.
Can the above architecture be improved? If so how?
Is my usage of HTTP response headers correct?
Should I use constants or enums?
Is my usage of HTTP response headers correct?
I do not think it is incorrect, however I prefer to send an error message/code directly back in the response body. This is usually more convenient for the client to access and is more explicit. As part of consuming each response, the client can check the contents of the errors (you may have multiple) and act accordingly. The following is a little contrived just to provide an example:
{
  // ...
  "errors": {
    "username": "not found",
    "password": "no match"
  },
  "warnings": {
    "account": "expired"
  }
  // ...
}
The above is quite a simple approach. Your JSON message can be as sophisticated as you wish, but keep in mind that you should only expose the information the client needs to achieve its goal. This will also depend on whether you are publishing an API for third parties or public consumption, or whether it's just for your own clients, i.e. your own website. If other parties consume it, put some thought into the format: once you publish it you need to keep it stable, otherwise you break those consumers.
Check out JSON API for some standardized guidance on handling errors.
Should I use constants or enums?
Since these are a related set of properties an enum is preferable over constants (I assume you are using Java).
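As an illustration of the body-based approach, here is a small sketch in Python rather than Java. `UNKNOWN_EMAIL` comes from the question; `PASSWORD_MISMATCH`, `ErrorCode` and `error_body` are hypothetical names. The error codes live in one enum, and the response body is built from it.

```python
import json
from enum import Enum

class ErrorCode(Enum):
    # UNKNOWN_EMAIL appears in the question; PASSWORD_MISMATCH is made up here.
    UNKNOWN_EMAIL = "unknown_email"
    PASSWORD_MISMATCH = "password_mismatch"

def error_body(errors, warnings=None):
    """Build a JSON response body.

    errors / warnings: dict mapping a field name to an ErrorCode.
    """
    body = {"errors": {field: code.value for field, code in errors.items()}}
    if warnings:
        body["warnings"] = {field: code.value for field, code in warnings.items()}
    return json.dumps(body)

print(error_body({"email": ErrorCode.UNKNOWN_EMAIL}))
```

The body would normally be sent together with a 4xx status code, so the front-end can branch on the status first and then read the specific codes.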

dlang vibe.d RESTful Service Performance

Why does my REST service seem to perform so poorly using rest interfaces in dlang vibe.d when compared to creating request handlers manually?
More Information:
I have been prototyping a RESTful service using the vibe.d library in dlang. I'm running a test where a client sends GET and POST requests to the server with a payload of some given size, say 2048 bytes (i.e. the GET response would have 2k, the POST request would have 2k).
I'm using the "registerRestInterface" and "RestInterfaceClient" API in the vibe.d library to create my server and client sort of like this...
auto routes = new URLRouter;
registerRestInterface(routes, new ArtifactArchive());
auto settings = new HTTPServerSettings();
settings.port = port;
settings.bindAddresses = [host];
settings.options |= HTTPServerOption.distribute;
listenHTTP(settings, routes);
IArtifactArchive archive = new RestInterfaceClient!IArtifactArchive(endpoint);
IArtifactArchive.Payload result;
result = archive.getContents(info.FileDescriptor, offset, info.BlockSize);
I'm not doing anything fancy in my interface. Just filling a byte array and passing it along. I know performance depends on many different things; however I seem to see about 160kB transfer rate when using REST interfaces in vibe.d and roughly 5MB transfer rate when using manual http request handlers like this:
void ManualHandleRequest(HTTPServerRequest req, HTTPServerResponse res) ...
listenHTTP(settings, &ManualHandleRequest);
I really like the REST interface API, but I can't suffer that kind of performance loss in order to use it. Any thoughts on why it seems so much slower than the other method? Perhaps I'm configuring something wrong or missing something. I am somewhat new to the D programming language and the vibe.d library.
Thank you for your time!
I suspect that with the custom request handler you actually write the response as a byte array. The REST interface generator serializes all return data into JSON by default, which creates a large overhead compared to a raw array.
This is just a guess, though. I would need to see the actual REST method implementation to say for sure or to propose a solution.
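To get a feel for the size overhead that JSON can add to a byte array, here is a short Python sketch. It makes no claim about which representation vibe.d actually generates; it shows two common ways of putting bytes into JSON for a 2048-byte block like the one in the question.

```python
import base64
import json

# 2048-byte block, matching the payload size mentioned in the question.
payload = bytes(i % 256 for i in range(2048))

raw_size = len(payload)

# Option 1: JSON array of integers, one number per byte.
as_int_array = json.dumps(list(payload))

# Option 2: JSON string holding the base64 encoding of the bytes.
as_base64 = json.dumps(base64.b64encode(payload).decode("ascii"))

print(raw_size, len(as_int_array), len(as_base64))
```

Size is only part of the picture: building and parsing the JSON also costs CPU time on both ends, which a raw byte write avoids.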

Streaming data in and out simultaneously on a single HTTP connection in play

Streaming data out of Play is quite easy.
Here's a quick example of how I intend to do it (please let me know if I'm doing it wrong):
def getRandomStream = Action { implicit req =>
  import scala.util.{Random, Success, Failure}
  import scala.concurrent.{blocking, ExecutionContext, Future}
  import play.api.libs.iteratee.Concurrent
  import ExecutionContext.Implicits.global
  def getSomeRandomFutures: List[Future[String]] = {
    for {
      i <- (1 to 10).toList
      r = Random.nextInt(30000)
    } yield Future {
      blocking {
        Thread.sleep(r) // assumption: simulate r ms of work before answering
        s"after $r ms. index: $i.\n"
      }
    }
  }
  val enumerator = Concurrent.unicast[Array[Byte]] {
    (channel: Concurrent.Channel[Array[Byte]]) => {
      getSomeRandomFutures.foreach {
        _.onComplete {
          case Success(x: String) => channel.push(x.getBytes("utf-8"))
          // the channel carries bytes, so the error message is encoded too
          case Failure(t) => channel.push(t.getMessage.getBytes("utf-8"))
        }
      }
      //following future will close the connection
      Future {
        blocking {
          Thread.sleep(30000) // assumption: outlast the longest random delay above
        }
      }.onComplete {
        case Success(_) => channel.eofAndEnd()
        case Failure(t) => channel.end(t)
      }
    }
  }
  new Status(200).chunked(enumerator).as("text/plain;charset=UTF-8")
}
Now, if you get served by this action, you'll get something like:
after 1757 ms. index: 10.
after 3772 ms. index: 3.
after 4282 ms. index: 6.
after 4788 ms. index: 8.
after 10842 ms. index: 7.
after 12225 ms. index: 4.
after 14085 ms. index: 9.
after 17110 ms. index: 1.
after 21213 ms. index: 2.
after 21516 ms. index: 5.
where every line is received after the random time has passed.
Now, imagine I want to preserve this simple example when streaming data from the server to the client, but I also want to support full streaming of data from the client to the server.
So, let's say I'm implementing a new BodyParser that parses the input into a List[Future[String]]. This means that my Action could now look something like this:
def getParsedStream = Action(myBodyParser) { implicit req =>
  val xs: List[Future[String]] = req.body
  val enumerator = Concurrent.unicast[Array[Byte]] {
    (channel: Concurrent.Channel[Array[Byte]]) => {
      xs.foreach {
        _.onComplete {
          case Success(x: String) => channel.push(x.getBytes("utf-8"))
          case Failure(t) => channel.push(t.getMessage.getBytes("utf-8"))
        }
      }
      //again, following future will close the connection
      Future.sequence(xs).onComplete {
        case Success(_) => channel.eofAndEnd()
        case Failure(t) => channel.end(t)
      }
    }
  }
  new Status(200).chunked(enumerator).as("text/plain;charset=UTF-8")
}
But this is still not what I wanted to achieve. In this case, I'll get the body from the request only after the request has finished and all the data has been uploaded to the server. But I want to start serving the request as I go. A simple demonstration would be to echo any received line back to the user while keeping the connection alive.
So here are my current thoughts:
What if my BodyParser returned an Enumerator[String] instead of a List[Future[String]]?
In this case, I could simply do the following:
def getParsedStream = Action(myBodyParser) { implicit req =>
  new Status(200).chunked(req.body).as("text/plain;charset=UTF-8")
}
So now I'm facing the problem of how to implement such a BodyParser.
To be more precise about what exactly I need:
I need to receive chunks of data and parse them as strings, where every string ends in a newline \n (it may contain multiple lines, though). Every "chunk of lines" would be processed by some computation (irrelevant to this question), which would yield a String, or better, a Future[String], since this computation may take some time. The resulting strings should be sent to the user as they become ready, much like the random example above, and this should happen while more data is still being sent.
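The buffering part of this requirement, turning arbitrary byte chunks into newline-terminated "chunks of lines", is independent of Play. Here is a minimal Python sketch of that logic only; it is not a BodyParser, and `lines_from_chunks` is a hypothetical name.

```python
def lines_from_chunks(chunks, encoding="utf-8"):
    """Yield strings that end in a newline from an iterable of byte chunks.

    Each yielded string may contain several lines. Splitting happens on the
    raw newline byte before decoding, so a multi-byte UTF-8 character that
    straddles two chunks is never cut in half.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything up to the last newline is complete; the rest must wait.
        head, sep, tail = buffer.rpartition(b"\n")
        if sep:
            yield (head + sep).decode(encoding)
            buffer = tail
    if buffer:
        # Trailing data that never received its final newline.
        yield buffer.decode(encoding)

for piece in lines_from_chunks([b"ab", b"c\nde", b"f\ngh\ni", b"j"]):
    print(repr(piece))
```

Each yielded string could then be handed to the slow computation, with results written to the response as they complete.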
I have looked into several resources trying to achieve this, but have been unsuccessful so far.
e.g. scalaQuery play iteratees -> it seems like this guy is doing something similar to what I want to do, but I couldn't translate it into a usable example (and the API differences between Play 2.0 and Play 2.2 don't help...).
So, to sum it up: is this the right approach (considering I don't want to use WebSockets)? And if so, how do I implement such a BodyParser?
I have just stumbled upon a note in the Play documentation regarding this issue, saying:
Note: It is also possible to achieve the same kind of live
communication the other way around by using an infinite HTTP request
handled by a custom BodyParser that receives chunks of input data, but
that is far more complicated.
So I'm not giving up, now that I know for sure this is achievable.
What you want to do isn't quite possible in Play.
The problem is that Play can't start sending a response until it has completely received the request. So you can either receive the request in its entirety and then send a response, as you have been doing, or you can process requests as you receive them (in a custom BodyParser), but you still can't reply until you've received the request in its entirety (which is what the note in the documentation was alluding to - although you can send a response in a different connection).
To see why, note that an Action is fundamentally a (RequestHeader) => Iteratee[Array[Byte], SimpleResult]. At any time, an Iteratee is in one of three states - Done, Cont, or Error. It can only accept more data if it's in the Cont state, but it can only return a value when it's in the Done state. Since that return value is a SimpleResult (i.e, our response), this means there's a hard cut off from receiving data to sending data.
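The three-state argument can be illustrated with a toy model in Python. This is not Play's Iteratee API; the class names only mirror the states named above, and `step` and `EOF` are hypothetical. The point is that a result exists only once the input has ended.

```python
class Cont:
    """Still consuming input; holds the chunks received so far."""
    def __init__(self, received):
        self.received = received

class Done:
    """Input has ended; only this state carries a result."""
    def __init__(self, result):
        self.result = result

class Error:
    """Something went wrong; no further input is accepted."""
    def __init__(self, message):
        self.message = message

EOF = object()  # marker for the end of the request body

def step(state, chunk):
    """Feed one chunk (or EOF) to the current state and return the next state."""
    if not isinstance(state, Cont):
        return state  # Done and Error accept no more input
    if chunk is EOF:
        return Done(b"".join(state.received))
    if not isinstance(chunk, bytes):
        return Error("expected bytes")
    return Cont(state.received + [chunk])

state = Cont([])
for chunk in [b"hello ", b"world", EOF]:
    state = step(state, chunk)
print(type(state).__name__)
```

While the state is `Cont` there is simply no `result` to send, which is the hard cut-off between receiving and responding that the answer describes.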
According to this answer, the HTTP standard does allow a response before the request is complete, but most browsers don't honor the spec, and in any case Play doesn't support it, as explained above.
The simplest way to implement full-duplex communication in Play is with WebSockets, but we've ruled that out. If server resource usage is the main reason for the change, you could try parsing your data with play.api.mvc.BodyParsers.parse.temporaryFile, which will save the data to a temporary file, or play.api.mvc.BodyParsers.parse.rawBuffer, which will overflow to a temporary file if the request is too large.
Otherwise, I can't see a sane way to do this using Play, so you may want to look at using another web server.
"Streaming data in and out simultaneously on a single HTTP connection in play"
I haven't finished reading all of your question, nor the code, but what you're asking to do isn't available in HTTP. That has nothing to do with Play.
When you make a web request, you open a socket to a web server and send "GET /file.html HTTP/1.1\r\n[optional headers]\r\n[more headers]\r\n\r\n"
You get a response after (and only after) you have completed your request (optionally including a request body as part of the request). When and only when the request and response are finished, in HTTP 1.1 (but not 1.0) you can make a new request on the same socket (in http 1.0 you open a new socket).
It's possible for the response to "hang" ... this is how web chats work. The server just sits there, hanging onto the open socket, not sending a response until someone sends you a message. The persistent connection to the web server eventually provides a response when/if you receive a chat message.
Similarly, the request can "hang." You can start to send your request data to the server, wait a bit, and then complete the request when you receive additional user input. This mechanism provides better performance than continually creating new http requests on each user input. A server can interpret this stream of data as a stream of distinct inputs, even though that wasn't necessarily the initial intention of the HTTP spec.
HTTP does not support a mechanism to receive part of a request, then send part of a response, then receive more of a request. It's just not in the spec. Once you've begun to receive a response, the only way to send additional information to the server is to use another HTTP request. You can use one that's already open in parallel, or you can open a new one, or you can complete the first request/response and issue an additional request on the same socket (in 1.1).
If you must have asynchronous I/O on a single socket connection, you might want to consider a protocol other than HTTP.