Faulty-connection Proof File Transfer Protocol? - file-transfer

I frequently do website development live over an FTP connection. That is to say, I use a code editor with a built in FTP window and push/pull files to work on them, upload the changes, etc. This is mostly because it's unreasonable to try to create a local development server, and I use too many computers for that to be practical anyway without a lot of work.
My trouble is, the internet connection at our home is not exactly... stable. It's fast and mostly reliable, but it has a tendancy to glitch far more frequently than any other connection I've worked on (it's wireless DSL) and as a result, dropped connections are far too frequent. (It's about as reliable as AT&T is with phone calls in that regard.) When working with FTP, I find that if it drops the connection mid-file transfer, it can be difficult to recover. First of all, when the connection is dropped, it saves a blank file to the server (how is this helpful?) breaking the page I was working on completely, and the icing on the cake is that depending on the timing, vsftpd will get itself stuck in a timeout and I have to SSH in and restart it before I can access that file again.
This process alone has only been beneficial because it's taught me to build up some data protection techniques clientside, to prevent the server from eating my recent changes if the dropped connection happens to hang or crash my client. Overall though, it's a pretty failed situation, and I'm surprised I get any work done at all.
Long, long context, I know, but my question is this: Is there a file transfer protocol that is designed to handle "flakey" connections like mine? I'd imagine that, for example, trying to transfer files over a 3G tethered connection would yield the same results, especially while traveling. It seems like FTP and SFTP both rely on a persistant connection, and can deal with dropped packets but not the loss of the entire socket through a reconnect. It seems to me like a file transfer daemon should be able to store the state of the user interacting with it, and thus detect failed transfers and be ready to "resume" if the user reconnects in a reasonable amount of time.
Thanks if anyone knows anything. I'm seriously considering trying to write such a protocol myself (I've had a lot of success coding the ajax on my page to handle faulty connections, for example) but I don't want to dive in if there's already a solution available.

You want rsync. If the connection drops, you just repeat the command and it picks up right where it left off. Built in error checking and everything. Works over SSH, Windows client exists. Somebody's probably written a GUI front end.

BitTorrent works well with flakey connections. I hear that it is fast, too!

Related

low connectivity protocols or technologies

I'm trying to enhance a server-app-website architecture in reliability, another programmer has developed.
At the moment, android smartphones start a tcp connection to a server component to exchange data. The server takes the data, writes them into a DB and another user can have a look on the data through a website. The problem is that the smartphones very regularly are in locations where connectivity is really bad. The consequence is that the smartphones lose the tcp connection and it's hard to reconnect. Now my question is, if there are any protocols that are so lightweight or accomodating concerning bad connectivity that the data exchange could work better or more reliable.
For example, I was thinking about replacing the raw TCP interface with a RESTful API, but I don't really know how well REST works in this scenario, as I don't have any experience in this area.
Maybe useful to know for answering this question: The server component is programmed in c#. The connecting components are android smartphones.
Please understand that I dont add some code to this question, because in my opinion its just a theoretically question.
Thank you in advance !
REST runs over HTTP which runs over TCP so it would have the same issues with connectivity.
Moving up the stack to the application you could perhaps think in terms of 'interference'. I quite often have to use technical stuff in remote areas with limited reception and it reminds of trying to communicate in a storm. If you think about it, if you're trying to get someone to do something in a storm where they can hardly hear you and the words get blown away (dropped signal), you don't read them the manual on how to fix something, you shout key words such as 'handle', 'pull', 'pull', 'PULL', 'ok'. So the information reaches them in small bursts you can repeat (pull, what? pull, eh? PULL! oh righto!)
Can you redesign the communications between the android app and the server so the server can recognise key 'words' with corresponding data and build up the request over a period of time? If you consider idempotency, each burst of data would not alter the request if it has already been received (pull, PULL!) and over time the android app could send/receive smaller chunks of the request. If the signal stays up, just keep sending. If it goes down, note which parts of the request haven't been sent and retry them when the signal comes back.
So you're sending the request jigsaw-style but the server knows how to reassemble the pieces in the right order. A STOP word at the end tells the server ok this request is complete, go work on it. Until that word arrives the server can store the incomplete request or discard it if no more data comes in.
If the server respond to the first request chunk with an id, the app can use the id to get the response and keep trying until the full response comes back, at which point the server can remove the response from its jigsaw cache. A fair amount of work though.

Where would I learn more about interpreting network packets?

I'm working on a personal project. It's to recreate server software for the game "Chu Chu Rocket" for the Sega Dreamcast. Its' servers went down in 2004 I believe. My approach is to use dnsmasq to change the originl hostname that the game originally connected to, to my own system. With a DC-PC server set up, I have done just that, now instead of it looking up a non-existent dns record, it connects to my computer which will eventually run the server software. I've used tshark (cli wireshark) to capture what's going on between the client (dreamcast) and the server (my computer). The problem is, I'm getting data, but I'm not sure how to interpret it, I don't know what it's saying, but I'm sure it can be done because private PSO servers were created, those are far more complex.
Very simply, where would I go about learning how to interpret data packets, and possibly creating packets that will respond to such queries from the client?
Thanks,
Dragos240
If you can get the source code for the server software on your PC, then that is the best place to look.
Otherwise, all you can do is look at the protocol, compare runs, and make notes of similarities and differences. With any luck, the protocol won't be encrypted.

Documentation of TCP possible errors / unpredictable behaviours

I've started some time ago to work with custom-made servers, and even though I have experience to deal with the actual message exchange / serialization, etc, of client/server communications, I've had never coded an actual server from scratch.
In this sense, I have found raw TCP socket connections to be much trickier and unpredictable than I'd like.
For example, I coded a simple client/server application that would establish a long lived TCP connection, and the clients would receive push notifications from the server. Very simple, it worked very well in my test environment, even with many computers.
When I actually published this, though, I've had got lots of errors that later I would found that it was the lack of keepalive signals, which would make the connection to be cut, without giving me (either client or server) any feedback / error at all. The messages simply wouldn't be delivered, and fail silently.
I knew that TCP could break the connection, but I thought I could at least receive an error or such so I could reconnect in case of loss of connection.
This made me very insecure about rolling my own servers, as the possible errors and scenarios seem too many and unexpected, and I really don't want to learn about the unexpected behaviours when the actual application is deployed. With my current experience with server-side programming, the best way to deal with errors would be to enumerate all possible errors, and make sure I cover all exceptional cases when writing a program.
So, is there anywhere I could find a good documentation on the possible pitfalls / exceptions I could find with sockets, with how to detect them? It's been some time since I last worked with that, so I don't have any more fresh examples, but I remember that e.g. when you receive an empty message it would mean that the connection broke.
I'd also love to hear suggestions, or maybe simple libs (preferrably in C) that cover them so I can base my work in it? My main platform is linux, but a cross-platform solution is much appreciated!
Thank you!

iPhone: Strategies for uploading large files from phone to server

We're running into issues uploading hires images from the iPhone to our backend (cloud) service. The call is a simple HTTP file upload, and the issue appears to be the connection breaking before the upload is complete - on the server side we're getting IOError: Client read error (Timeout?).
This happens sporadically: most of the time it works, sometimes it fails. When a good connection is present (ie. wifi) it always works.
We've tuned various timeout parameters on the client library to make sure we're not hitting any of them. The issue actually seems to be unreliable mobile connectivity.
I'm thinking about strategies for making the upload reliable even when faced with poor connectivity.
The first thing that came to mind was to break the file into smaller chunks and transfer it in pieces, increasing the likelihood of each piece getting there. But that introduces a fair bit of complexity on both the client and server side.
Do you have a cleverer approach? How would you tackle this?
I would use the ASIHTTPRequest library. It's have some great features like bandwidth throttling. It can upload files directly from the system instead of loading the file into memory first. Also I would break the photo into like 10 parts. So for a 5 meg photo, it would be like 500k each. You would just create each upload using a queue. Then when the app goes into background, it can complete the part it's currently uploading. If you cannot finish uploading all the parts in the allocated time, just post a local notification reminding the user it's not completed. Then after all the parts have been sent to your server, you would call a final request that would combine all the parts back into your photo on the server-side.
Yeah, timeouts are tricky in general, and get more complex when dealing with mobile connections.
Here are a couple ideas:
Attempt to upload to your cloud service as you are doing. After a few failures (timeouts), mark the file, and ask the user to connect their phone to a wifi network, or wait till they connect to the computer and have them manually upload via the web. This isn't ideal however, as it pushes more work to your users. The upside is that implementationwise, it's pretty straight forward.
Instead of doing an HTTP upload, do a raw socket send instead. Using raw socket, you can send binary data in chunks pretty easily, and if any chunk-send times out, resend it until the entire image file is sent. This is "more complex" as you have to manage binary socket transfer but I think it's easier than trying to chunk files through an HTTP upload.
Anyway that's how I would approach it.

gather file(s) from users

I'm looking for ways to gather files from clients. These clients have our software and we are currently using FTP for gathering files from them. The files are collected from the client's database, encrypted and uploaded via FTP to our FTP server. The process is fraught with frustration and obstacles. The software is frequently blocked by common firewalls and often runs into difficulties with VPNs and NAT (switching to Passive instead of Active helps usually).
My question is, what other ideas do people have for getting files programmatically from clients in a reliable manner. Most of the files they are submitting are < 1 MB in size. However, one of them ranges up to 25 MB in size.
I'd considered HTTP POST, however, I'm concerned that a 25 mb file would often fail over a post (the web server timing out before the file could completely be uploaded).
Thoughts?
AndrewG
EDIT: We can use any common web technology. We're using a shared host, which may make central configuration changes difficult to make. I'm familiar with PHP from a common usage perspective... but not from a setup perspective (written lots of code, but not gotten into anything too heavy duty). Ruby on Rails is also possible... but I would be starting from scratch. Ideally... I'm looking for a "web" way of doing it as I'd like to eventually be ready to transition from installed code.
Research scp and rsync.
One option is to have something running in the browser which will break the upload into chunks which would hopefully make it more reliable. A control which does this would also give some feedback to the user as the upload progressed which you wouldn't get with a simple HTTP POST.
A quick Google found this free Java Applet which does just that. There will be lots of other free and pay for options that do the same thing
You probably mean a HTTP PUT. That should work like a charm. If you have a decent web server. But as far as I know it is not restartable.
FTP is the right choice (passive mode to get through the firewalls). Use an FTP server that supports Restartable transfers if you often face VPN connection breakdowns (Hotel networks are soooo crappy :-) ) trouble.
The FTP command that must be supported is REST.
From http://www.nsftools.com/tips/RawFTP.htm:
Syntax: REST position
Sets the point at which a file transfer should start; useful for resuming interrupted transfers. For nonstructured files, this is simply a decimal number. This command must immediately precede a data transfer command (RETR or STOR only); i.e. it must come after any PORT or PASV command.