Letting socket.io client version lag behind server version

Situation
We're using socket.io for mobile-server communications. Since we can't force-upgrade users' devices, if we want to upgrade to version 1 (non-back-compatible), we have to handle both versions on the server for a while.
Question
What are the options?
My current favourite is to wrap both the old version and the new version in a multiplexer. It detects the version of the incoming request based on headers and query parameters and thereby knows which functions to invoke.
Another (shittier) option is to wrap the new version in a module that translates the old version of the protocol into the new one (and back again) when necessary. This suffers from a serious drawback: it would be time-consuming and uncertain work to make sure I've found and handled all the tiny protocol differences, and some of them might take serious massaging.
(In case you're curious or it's helpful to know, we're doing this in Go.)

It appears that you could run two separate versions of socket.io on the server. Since the two versions don't have unique module filenames, you would probably need to load one version from a different path, and then, when loading and initializing the modules, assign them to differently named variables. For example:
var io_old = require('old/socket.io');
var io = require('socket.io');
Once you have the two versions loaded on the server, I think there are two different approaches for how they could be run.
1) Use a different port for each version. The older version would use the default port 80 (no configuration change required), which is shared with the node.js web server. The newer version would run on a different port (say, port 3000). You would then initialize each version of socket.io on its own port, and your newer-version clients would connect to the port the newer version is running on.
For the old socket.io server running on port 80, you would use whatever initialization you already have which probably hooks into your existing http server.
For the new socket.io server running on some other port, you would initialize it separately like this:
var io_old = require('old/socket.io')(server);
var io = require('socket.io')(3000);
Then, in the new version client, you would specify port 3000 when connecting.
var socket = io("http://yourdomain.com:3000");
2) Use a different HTTP request path for each version. By default, each socket.io connection starts with an HTTP request that looks like this: http://yourdomain.com/socket.io?EIO=xx&transport=xxx&t=xxx. But the /socket.io portion of that request is configurable, so two separate versions of socket.io could each use a different path name. On the server, the .listen() method that starts socket.io listening takes an optional options object, which can be given a custom path as in path: "/socket.io-v2". Similarly, the .connect() method in the client accepts the same options object. The documentation for this option is hard to find because it's actually an engine.io option (engine.io is what socket.io uses under the hood), but socket.io passes the options through to engine.io.
I have not tried either of these myself, but I've studied how socket.io connections are initiated from client and server and it looks like the underlying engine supports this capability and I can see no reason why it should not work.
Here's how you'd change the path on the server:
var io = require('socket.io')(server, {path: "/socket.io.v1"});
Then, in the client code for the new version, you'd connect like this:
var socket = io({path: "/socket.io.v1"});
This would then result in the initial connection request being made to an HTTP URL like this:
http://yourdomain.com/socket.io.v1?EIO=xx&transport=xxx&t=xxx
This would be handled by a different request handler on your HTTP server, thus separating the two versions.
FYI, it is also possible that the EIO=3 query parameter in the socket.io connection URL is actually an engine.io version number, and that it could also be used to discern the client version and "do the right thing" based on that value. I have not found any documentation on how that works, and could not even find where that query parameter is looked at in the engine.io or socket.io source code, so that would take more investigation as another possibility.

I don't really have an immediate solution for this, but I do have some advice that might save you a lot of time.
First of all, I'm working at a startup which uses socket.io for almost everything.
We knew that this problem would happen, so our initial design was to make everything pluggable, which means that we can swap out socket.io for sockjs and it will still work.
The way it's done is by defining the common set of APIs which rarely change in a system. We call them managers. The managers just expose the API which the rest of the devs need to use, without messing anything up. It speeds things up a lot.
The manager implementation changes in the background, but the APIs stay the same, so the engineers working on the core can confidently make changes.
It seems like you have a tight dependency in your code. Or maybe not; I'm not so sure. Try following this principle if you haven't.

We're going to go the route of keeping both the 0.9.x version and the current version as separate libraries on the server. Eventually, when the pool of clients has more-or-less all updated, we'll just pull the plug on the 0.9.x version.
The way we'll manage the two versions is by wrapping the socket.io services in a package that will determine which wrapped socket.io version to pass the request off to. This determination will depend on features of the request, such as custom headers (that can be added to the newer clients) as well as query parameters and other headers utilized exclusively by one version or the other.
Since we're using Go, there is so far no universally agreed-upon way to manage dependencies, let alone one that can respect version differences. Assuming the back-compat branch of the repo weren't broken (which it is), we'd have two options. The first would be to fork the repo and make the back-compat version the master; we'd then import it as if it had nothing to do with the other one. The second would be to use gopkg.in to treat the separate branches as separate repos.
In either event, we could import the two branches/repos like so:
import (
    socketioV0 "github.com/path/to/older/version"
    socketioV1 "github.com/path/to/current/version"
)
And then refer to them in the code using their import names socketioV0 and socketioV1.
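A minimal sketch of that dispatching wrapper, assuming each wrapped version exposes a standard http.Handler (the EIO-based detection rule and the function name here are hypothetical, not the API of any particular Go socket.io library):

import "net/http"

// newDispatcher routes each incoming socket.io request to the wrapped
// version that speaks its protocol. Newer (1.x) clients include an EIO
// protocol-version query parameter in the handshake; 0.9.x clients do not.
// A custom header could be checked the same way via r.Header.Get.
func newDispatcher(oldHandler, newHandler http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if r.URL.Query().Get("EIO") != "" {
            newHandler.ServeHTTP(w, r) // 1.x handshake
            return
        }
        oldHandler.ServeHTTP(w, r) // 0.9.x handshake
    })
}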

Related

Continuous deployment: how to deploy new features that are affecting client and server at the same time?

The server holds logic, iOS/Android App holds UI. Common case.
How am I supposed to deploy new features in this case with a continuous deployment methodology?
I assume that a server-side deploy looks like this:
I trigger the new feature deployment, and the load balancer starts redirecting 1% of all users to the server instance with the new feature. If everything goes smoothly, the load balancer moves up to 10%, 30%, etc., up to 100%.
The same can be done for client apps using, say, CodePush.
So, if I deploy the server without the app, the new features won't be used, and therefore there will be no problems with the new deployment.
So I probably have to deploy the app first and add some kind of server version check: if the server has the API for the new feature, the UI for that feature is shown, and if the app is connected to the wrong server, the new UI is hidden.
That seems primitive. I'd need to pin the socket connection to the same server to avoid hitting the wrong one, right? And what if an instance/zone/region goes down and the user is suddenly redirected to another zone/region whose server doesn't have the new feature's API? Probably my assumption is wrong.
So, how am I supposed to deploy new features in this case with a continuous deployment methodology?
I would say that your question is more about version compatibility of the server/client API than about CD. We have a similar requirement, where a server and its clients communicate and both are constantly enhanced with features. I don't know your production software architecture, which might change the needs accordingly, but I'll try to come up with some ideas.
I'm going to describe two cases which might apply for you.
First case:
Things are easier when you do not face the situation where new client versions need to communicate with old server versions. The new server version is deployed first, and old clients simply do not use the new feature, as you've already pointed out. In this situation my recommendation is to deploy the server app first and then start rolling out the new client apps. If that's possible, I would do it. This only applies when the new feature doesn't force you to break the API.
Second case:
In the case where new client app versions need to talk to an old server app, which I would try to avoid at all costs, the new client needs some switch inside to deactivate a feature (say, feature B) when it's talking to an old server that doesn't support it. An API version counter could be the solution, but it requires the client to be able to distinguish between server versions. In REST you often see the .../v1/... inside the URL, but it could be solved differently as well. Hopefully the API provides some mechanism to get the version the server speaks.
We faced both cases at the same time; the protocol changed over time, including breaking changes, so we needed to implement an API version-negotiation mechanism.
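As a rough Go sketch of that negotiation (the endpoint path and payload shape here are made up for illustration), the server can simply advertise the API version it speaks, and clients can gate version-dependent UI on the response:

import (
    "encoding/json"
    "net/http"
)

// versionHandler lets a client discover the server's API version before
// enabling features that depend on it.
func versionHandler(w http.ResponseWriter, r *http.Request) {
    json.NewEncoder(w).Encode(map[string]int{"apiVersion": 2})
}

func main() {
    http.HandleFunc("/api/version", versionHandler)
    http.ListenAndServe(":8080", nil)
}

A client queries this once per session and hides any feature whose required API version is higher than the one reported.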

What are the limitations of the Flask built-in web server

I'm a newbie in web server administration. I've read multiple times that the Flask built-in web server is not designed for "production", and must be used only for tests and debugging...
But what if my app touches only a thousand users who occasionally send data to the server?
If it works, when will I have to bother with the configuration of a more sophisticated web server? (I am looking for approximate metrics.)
In a nutshell, I would love to know what the built-in web server can do (with approximate thresholds) and what it cannot.
Thanks a lot!
There isn't one right answer to this question, but here are some things to keep in mind:
With the right amount of horizontal scaling, it is quite possible you could keep scaling out use of the debug server forever. When exactly you would need to start scaling (or switch to using a "real" web server) would also depend on the environment you are hosting in, the expectations of the users, etc.
The main issue you would probably run into is that the server is single-threaded. This means that it will handle each request one at a time, serially. So if you are trying to serve more than one request (including favicons, static items like images, CSS and JavaScript files, etc.), the requests will take longer. If any given request happens to take a long time (say, 20 seconds), then your entire application is unresponsive for that time (20 seconds). This is only the default, of course: you could bump the thread count (for example, by passing threaded=True to app.run()) or have requests handled in other processes, which might alleviate some issues. But once again, it can still be slow under a "high" load. What counts as a "high" load will depend on your application and the maximum acceptable response time.
Another issue is security: if you are concerned at ALL about security (and not just the security of the data in the application itself, but the security of the box that will be running it as well) then you should not use the development server. It is not ready to withstand any sort of attack.
Finally, the development server could just fail outright. It is not designed to be used as a long-running process (days, weeks, months), and so it has not been well tested to work in this capacity.
So, yes, it has limitations. Yes, you could still conceivably use it in production. And yes, I would still recommend using a "real" web server. If you don't like the idea of installing something like Apache or Nginx, you can still go with a solution that is as easy as "run a python script" by using one of the standalone WSGI servers, which can run a production-grade server with something as simple as python run_app.py on the command line. You typically just need to create a 4-5 line python script to import and create the server object, point it at your Flask app, and run it.
gunicorn could be run with only the following on the command line, no extra script needed:
gunicorn myproject:app
...where "myproject" is the Python package that contains the app Flask object. Keep in mind that one of developers of gunicorn would probably recommend against this approach. See https://serverfault.com/questions/331256/why-do-i-need-nginx-and-something-like-gunicorn.
The OP has long since moved on, but for those who encounter this question in the future, I would just add that setting up an Apache server, even on a laptop, is free and pretty easy. It can be readily configured for as few or as many features as you want just by uncommenting or commenting out lines in the config file. There might be an even easier GUI method for doing that nowadays, but just editing the configs is simple.

JAX-WS service and client versioning

My group is getting into web services for the first time. What is the usual way to control versioning issues with web services and clients? We're generating our services and clients using IBM's RAD and the wizards it presents. The services and clients work well enough, but we're wondering how managing them will become a task going forward.
- If a service changes and its interface doesn't change, I presume older clients would never know the difference and would work fine.
- If a service changes to add a new property to a parameter, would older clients work with it if they didn't need to set that value?
- How do people handle version control of web services as they grow in number and the number of apps using their clients grows? What is best practice on this sort of thing?
There are two types of versioning that we use: one that is visible to clients (public) and one that isn't (private). This comes down to (what you already touched upon) whether clients are affected or not.
If clients are affected by, for example, a change in XSD schema definitions, or the functionality of the WS changes in such a way that clients must also modify their end, we change the public version. We increment the version number in the WS context root, which means that it will have a different URL than the previous version. Also make sure that the code archives (in our case, war files) carry the public version as well, so as not to overwrite the previous deployment.
For example, our WS called foo is at public version 2. Its URL is http://ourserver:8000/foo_2 and the war file is called foo_2. We modify our XSD schema, so the clients must react to the change. We bump the version, and now the URL is http://ourserver:8000/foo_3 and the war file is called foo_3. The previous version still exists, so clients can slowly transition to the new version.
If the change does not force any action from clients, then we call this private versioning. This usually shows up, in combination with the public version, as part of a project name. Using the previous example, we have a WS foo with private version 5 and public version 2. Our project for this service is called WS_foo_2_5. We now change the order in which we store incoming data. This does not affect the clients, so we change the private version. In effect we have project WS_foo_2_6. We create a code archive from it called foo_2 and deploy it with the URL set to http://ourserver:8000/foo_2. This way we modified the version without changing anything from the clients' POV.

encoding mobile-device versioning for REST Api server

We have a RESTful API over HTTP. Amongst other clients, we also have mobile-device clients (e.g. iPhone). The issue is that there are several iPhone apps in different versions out there (1.0, 2.0). Because they are distributed, we have no control over which app version is calling us.
To identify the app version on the server side, I see the following options:
- the device must append a URL parameter (e.g. /foo?iphone-app-version=1.0): a bit yucky, but the good thing is that I can always see it in the server logs (the URL is always logged)
- we authenticate API clients with HTTP digest. We could encode the app version inside the username (e.g. iphone_1_0): the good thing is it's logged in the server logs, but it only works for resources which are exposed via HTTP digest.
- the device must use a custom HTTP header, e.g. X-IPHONE-APP-VERSION: in my view the cleanest approach, but we don't log HTTP headers in the server logs (it is switched off to reduce log noise), so later analysis is not possible.
Do you have a preferred approach or any other alternatives?
EDIT: By the above versioning I don't mean API versioning/content negotiation. It is the version of the mobile device's app.
You can use the Accept header to allow a client to declare its capabilities by identifying which versions of media types it supports, e.g.
mobile app does:
GET /server/foo
Accept: application/vnd.acme.fooappV1+xml
When you introduce new features that are not backward compatible, you can tell the newly updated clients to send,
GET /server/foo
Accept: application/vnd.acme.fooappV2+xml
Then your server knows the capabilities of the client it is talking to.
You could also get the new clients to do this:
GET /server/foo
Accept: application/vnd.acme.fooappV1+xml, application/vnd.acme.fooappV2+xml
That way you can migrate your server resources over to the new format slowly. If an endpoint delivers application/vnd.acme.fooappV1+xml, the client will revert to the old way. If an endpoint returns application/vnd.acme.fooappV2+xml, the new code can take over.
Using this approach, no URIs need to be changed, so bookmarks and statistics remain valid. Migration to a new format can be done incrementally over time and support for old clients can be gradually phased out.
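On the server side, the negotiation could look roughly like this in Go (a hedged sketch: the media types are the ones from the example above, and the q-value handling that full content negotiation needs is omitted for brevity):

import (
    "net/http"
    "strings"
)

const (
    mediaV1 = "application/vnd.acme.fooappV1+xml"
    mediaV2 = "application/vnd.acme.fooappV2+xml"
)

func fooHandler(w http.ResponseWriter, r *http.Request) {
    // Naive substring check: prefer the new representation when the
    // client advertises support for it; otherwise fall back to the old one.
    if strings.Contains(r.Header.Get("Accept"), mediaV2) {
        w.Header().Set("Content-Type", mediaV2)
        // ... render the V2 representation ...
        return
    }
    w.Header().Set("Content-Type", mediaV1)
    // ... render the V1 representation ...
}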
I decided on the custom X-xxx-USER-AGENT header. The main reason for deciding against the more standard User-Agent header is that it is already "polluted" with HTTP-client-library or mobile-device information. A custom X-xxx-USER-AGENT header is easier for the server to parse, and it does not interfere with the HTTP library, which often sets User-Agent and could override a custom entry.
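For example, if the newer clients send something like X-ACME-USER-AGENT: iphone/2.0 (a hypothetical header name and value format), the server-side parsing stays trivial; a Go sketch:

import (
    "net/http"
    "strings"
)

// clientVersion splits a "platform/version" header value,
// e.g. "iphone/2.0" -> ("iphone", "2.0").
func clientVersion(r *http.Request) (platform, version string) {
    parts := strings.SplitN(r.Header.Get("X-ACME-USER-AGENT"), "/", 2)
    if len(parts) != 2 {
        return "", "" // old client, or header missing
    }
    return parts[0], parts[1]
}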

What's the best way to update code remotely?

For example, I have a website with various types of information. If it goes down, I have a copy of the same website on a local web server on the client, like Apache or IIS. The users use this local version until the Internet version returns. In other words, they can have no downtime.
The problem is that over time the Internet version will change while the client versions will remain the same unless I touch each client's machine to make the updates. I don't want to do that.
Is there a good way to keep my client up to date so that when I make a change on the server the client gets a copy so they can run it locally if needs be?
Thank you.
EDIT: do you think maybe using SVN, with the clients running updates on a schedule, would work?
EDIT: they'll never, ever submit anything. It's just so I don't have to update each client by hand by manually going to the machine. They're webpages that run in case the main server is down.
I would go for Git over SVN because of its distributed nature: it gives you multiple copies of the code. Use it along with the solution in this answer to auto-commit:
Making git auto-commit
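On each client machine, a scheduled pull then keeps the local copy current; for example, a cron entry like this (the path, remote, and branch are placeholders):
*/10 * * * * cd /var/www/site && git pull --ff-only origin master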
Why not use something like HTTrack to make local copies of your actual Internet site on each machine, rather than trying to do a separate deployment? That way you'll automatically stay in sync.
This has the advantage that if, at some point, part of your website is updated dynamically from a database, the user will still be able to have a static copy of the resulting site that is up-to-date.
There are tools like rsync which you can use periodically to sync the changes.
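For example, a cron entry on each client machine could pull changes every 15 minutes (the host and paths are placeholders):
*/15 * * * * rsync -az --delete webmaster@example.com:/var/www/site/ /var/www/site/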