How does real time communication over the internet work? - raspberry-pi

I'm researching and trying to build an RC car that can be controlled over the internet. I've started looking into how communication over the web works, but I seem to be going nowhere. My goal for the project is straightforward:
The RC car has an on-board camera and a 4G Wi-Fi router that enables communication (driving commands, video streaming) over the internet. A Raspberry Pi will serve as the on-board computer.
I will be able to control the car with my PC even from across the globe, as long as I'm connected.
I'd prefer to do as much as possible myself, without relying too heavily on other people's code.
So here are my questions:
How does an application communicate over the internet? What is the interface between the application's logic (e.g. pressing "w" to go forward) and transmitting/receiving that command over the internet?
How is a video data stream handled?
I've looked into WebRTC and WebSockets for communication, but they are aimed at providing real-time communication for web browsers and mobile, not something like a Raspberry Pi, and I'm still in the dark as to exactly which technology I should use, and about the overall architecture of real-time communication in general.
All I've achieved so far is an app that sends text messages between devices through a server on my network, with very primitive reading/writing using Java's Socket.
In short, what do Messenger/Skype/Zoom do in the background when you send a message or make a video call?
Any guidance would be greatly appreciated.

First things first. You cannot do real-time control over the Internet, period. There is absolutely no way to guarantee the delivery latency. Your control commands can arrive with a delay of anywhere from milliseconds to seconds, or never. There is no way around it.
Now, you can still take a number of reasonable steps to absorb that unpredictable latency as much as possible and safeguard your remote robot from the consequences of unreliable communication.
For example, instead of sending the drive commands directly - as in acceleration, deceleration, turn angle, etc. - you can send a projected trajectory that is calculated from your drive commands locally on a model. Your RC car must be sufficiently smart to do some form of localisation - at the very least, wheel odometry - and with good enough time sync between the sender and the RC car you'll be able to control the behaviour remotely without the nasty consequences of drive commands being executed after an unpredictable delay.
You can add a heartbeat to your protocol to monitor the quality of the communication line, and if the heartbeat is delayed or missing, initiate an emergency stop.
Also, don't bother with TCP; use UDP only and maintain your own sequence counter to monitor missing packets. The same applies to the telemetry stream, not just the command channel.
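To make that concrete, here is a minimal sketch of the car-side command receiver in Java (since you already have Java sockets working). The port number, packet layout, and 300 ms watchdog timeout are my own assumptions, not requirements; treat it as an illustration of the heartbeat plus sequence-counter idea, not a finished implementation.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;

// Car-side command receiver: UDP only, own sequence counter, heartbeat watchdog.
public class CommandReceiver {
    public static void main(String[] args) throws Exception {
        DatagramSocket socket = new DatagramSocket(5005);   // assumed command port
        socket.setSoTimeout(300);                            // assumed heartbeat timeout (ms)
        byte[] buf = new byte[16];
        long lastSeq = -1;

        while (true) {
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(packet);
            } catch (SocketTimeoutException e) {
                emergencyStop();                             // heartbeat missing: stop the car
                continue;
            }
            ByteBuffer msg = ByteBuffer.wrap(packet.getData(), 0, packet.getLength());
            long seq = msg.getLong();                        // sender-maintained sequence counter
            byte command = msg.get();                        // e.g. 'w' for forward
            if (seq <= lastSeq) {
                continue;                                    // stale or duplicate packet: ignore it
            }
            lastSeq = seq;
            applyDriveCommand(command);
        }
    }

    static void applyDriveCommand(byte command) { /* drive the motors */ }
    static void emergencyStop() { /* cut throttle, brake */ }
}
```

The sender simply transmits a datagram with an incrementing sequence number at a fixed rate; if it has nothing new to say, it repeats the last command, which doubles as the heartbeat.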

Related

HoloLens Wi-Fi card seems to buffer received UDP messages

I'm currently building a simple collaboration room for the HoloLens 2; I want people to be able to see each other.
So I built a system using UDP sockets to share the user's hand across the network (client-server).
To prove that my system is not at fault, I tried a simple experiment:
I have one server and two clients; one client is on the HoloLens 2 and the other is a Windows standalone build. Both are connected to the server via Wi-Fi.
When both are connected, they can see each other's avatars as well as their own (the PC simulates a hand with the MicrosoftRealityToolkit).
When the user moves their hand inside the HoloLens, we can see the PC client receiving it practically instantly, with a really small delay;
however, the HoloLens receives its own movement much later (approximately a 0.5 s delay).
In the same experiment with the PC moving its avatar, the PC receives its own movement instantly, but the HoloLens once again receives it with a delay.
I also noticed that if I increase the message frequency from the server, the HoloLens starts to lose all the packets.
It's really strange, because all the packets are received and all the movements are restored correctly; they just arrive with a big delay on the HoloLens. I suspect the network card is at fault and is buffering the received messages, but I'm not knowledgeable enough about networking to really understand what's happening here.
Alright, so here are a couple of things you can do:
Try TCP communication so you can keep track of the packets.
Don't send the pose values constantly; only send when the pose changes, and when you receive the data on the HoloLens, interpolate to get smoother results. This will make it feel like there is less lag.
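A rough sketch of that second point, written as plain Java rather than the Unity/C# you're presumably using; the change threshold and smoothing factor are made-up values you'd tune yourself, and sendPose stands in for whatever network call you already have.

```java
// Pose smoothing sketch: send only on change, interpolate on receive.
public class PoseSync {
    // --- sender side ---
    private float lastSentX, lastSentY, lastSentZ;

    void maybeSend(float x, float y, float z) {
        float dx = x - lastSentX, dy = y - lastSentY, dz = z - lastSentZ;
        if (dx * dx + dy * dy + dz * dz > 0.0001f) {    // assumed ~1 cm movement threshold
            sendPose(x, y, z);                           // hypothetical network send
            lastSentX = x; lastSentY = y; lastSentZ = z;
        }
    }

    // --- receiver side ---
    private float renderX, targetX;                      // Y/Z handled the same way, omitted here

    void onPoseReceived(float x) { targetX = x; }

    void updateEachFrame(float deltaTime) {
        float t = Math.min(1f, deltaTime * 10f);         // assumed smoothing factor
        renderX += (targetX - renderX) * t;              // ease toward the latest received pose
    }

    void sendPose(float x, float y, float z) { /* UDP send to the server */ }
}
```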
Late reply to this, but hopefully it helps someone:
The HoloLens seems to absolutely hate receiving UDP broadcasts. I was having an identical latency issue, even with all other devices disconnected from the network, simple packets, and a send rate far lower than the HoloLens frame rate. Try addressing the HoloLens IP only, not the broadcast address, even if it means you need a second server running for all the other devices.
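If it helps, the change is only on the sending side; here is a hedged Java illustration (the IP address and port are placeholders for your own setup):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Address the HoloLens directly instead of broadcasting to the whole subnet.
public class UnicastSender {
    public static void main(String[] args) throws Exception {
        byte[] payload = "pose-update".getBytes();
        DatagramSocket socket = new DatagramSocket();
        // InetAddress target = InetAddress.getByName("192.168.1.255"); // broadcast: slow on the HoloLens
        InetAddress target = InetAddress.getByName("192.168.1.42");     // placeholder HoloLens IP
        socket.send(new DatagramPacket(payload, payload.length, target, 5005));
        socket.close();
    }
}
```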

Google Home Action REST API call

I am struggling to find out how to make Google Home perform a REST call on the local network.
I have some ESP8266 boards lying around with mDNS and a REST API on them.
Now I want the Google Home to send a REST call to the device.
I don't want any webhooks/services like IFTTT; I don't want the communication going through these third-party services.
It should work like this: Google Home gets the input (using the Google service to understand it is okay). It retrieves the action (local network, URL REST call with body). Google Home sends the REST API call to the local device.
There should be no need for port forwarding or firewall changes.
The Google Home does very, very little on-device processing. Sending out local network calls is not one of the things it does. Almost all processing, including IoT controls through the Smart Home API, is done through cloud-based services.
Update
I can't answer "why" it doesn't do this, since I'm not one of the engineers who built it, but I can make a lot of guesses about why.
For starters - it would increase the complexity of the software and hardware on the device dramatically. Right now, the device is really little more than a microphone and a speaker, with a little logic to detect the hotword, stream everything else to the server, and then get a result back and play it. Most of the rest of the code likely handles setup and configuration.
If the device also has to be a general-purpose IoT hub, then it needs software and hardware for Bluetooth and possibly other signaling systems. It needs to be able to keep track of the state of other devices on the network and manage that between power cycles of the device (or even handle interruptions in power for the device itself). Some of the implications of that may require opening up the networking on the device to receive messages, not just send them. It has to have more extensive network configuration - to understand what local networking is, not just what the local router is, and how to deal with that configuration (and with that configuration when it changes). These are all possible, to be sure, but they increase the complexity and, in some cases, lower the security of the device.
And that might be reasonable... if there were significant value in doing so. But you've already stipulated in the question that the voice processing can be done in the cloud, so once commands are sent to the cloud and parsed there - why not also do all of the above (device and state tracking, changing, etc.) in the cloud? Particularly since most IoT devices maintain cloud servers anyway, because people also want to be able to control or monitor their home devices when they aren't on their home LAN. Having a dual set of commands (some for when you're local, and some for when you're not) does make sense in some cases - but it also dramatically increases the complexity of both the controller and the devices, so most just rely on the cloud, again.
So while I understand why some people would like to have a nice little system that can just send your local REST server a command now and then, the reality is that doing this for a consumer system isn't that reasonable.
If you really wanted a system that can do this - you can continue in the hobbyist spirit and build something with the Assistant SDK and your favorite IoT platform.
The “local” API for Google Home is a bit limited. Here’s a doc from someone who reverse-engineered the API.
Looks like they expose Bluetooth and Alarms/Timers, and some limited configuration stuff.
https://rithvikvibhu.github.io/GHLocalApi/

Initiating comms to an embedded 3G device

I have an Arduino-based device interfaced to a 3G modem, which I use to record data from several sensors in a remote environment. I would like to be able to send commands to the device and stream some data from it every now and then back to my standard network-connected PC. If the remote device were connected to Wi-Fi or another local area network this would be relatively straightforward, but as the device connects over 3G, it sits behind the 3G carrier's NAT, and so establishing a connection to the device becomes difficult.
The device can, of course, open a TCP connection to my host PC any time it wishes; the problem is telling the device when I want it to do so. I need some way of getting a message to the device to notify it that I would like it to initiate a connection to my PC.
I've been reading up on NAT traversal techniques that app developers use to initiate P2P comms between two devices that are both behind NATs, such as UDP and TCP 'hole punching', but these methods seem rather too complex for my Arduino system. Another general idea is to have the device poll a web server periodically, looking for a signal to initiate a connection, but I'm not sure how much traffic (and data usage cost) this would generate, as the device would have to poll every 10 seconds or so to make sure it initiates its connection within a reasonable time frame of the request being set on the web server that it polls.
Is there any commonly used method of achieving something like this? Any general ideas or insight would be much appreciated.
Thanks,
James
I think the solution will depend largely on your particular applications and requirements.
There are several ways to achieve this type of functionality and it looks like you have covered some of them already. The most common are:
Have the device poll the server. This may be OK depending on the response times you need. If you need to poll as regularly as you suggest above, then I imagine power may matter more to you than data rates, especially if you are running on battery. With a typical 3G data plan the polling itself will have a negligible data overhead, I would think.
Have the server send an SMS which then triggers the device to contact the server. You need to make sure the variable SMS delivery time is acceptable for you, and you also have to be aware that SMS delivery is not guaranteed, so you would have to build in some sort of delivery check at a higher layer (or into your application).
Use some low-cost Android-based device for your 3G connectivity and leverage the Google push notification mechanism.
It is worth noting that server polling typically gets very bad press, as it seems intuitively wasteful to have the client and the server constantly checking for messages, especially when the actual messages are fairly infrequent. However, underneath most push solutions there is still a pull mechanism in the background, albeit generally a very efficient one that may, for example, piggyback on other messages between the network and the mobile device and hence have minimal power and data overhead. Personally, I would say that if you do not have major concerns with battery/power or with the load polling might generate for your servers, then it is worth exploring whether the simplicity benefits of a polling solution outweigh its other disadvantages.
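For a feel of how simple the polling option is, here is a sketch of the poll-for-command pattern in Java (your device side would be Arduino C; the endpoint URL and 10-second interval are placeholders): the device repeatedly asks a tiny endpoint whether it should open its TCP connection back home.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Poll-for-command pattern: the remote device periodically asks the server
// whether it should open a TCP connection back to the home PC.
public class CommandPoller {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com/device/42/should-connect"); // placeholder endpoint
        while (true) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                if ("yes".equals(in.readLine())) {
                    openConnectionToHomePc();   // device dials out, so the carrier NAT is no obstacle
                }
            } finally {
                conn.disconnect();
            }
            Thread.sleep(10_000);               // assumed 10 s poll interval from the question
        }
    }

    static void openConnectionToHomePc() { /* open the outbound TCP session */ }
}
```

The request and response here are a handful of bytes, which is why the data overhead of polling on a typical 3G plan is usually negligible; the bigger cost is keeping the radio awake.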

Is using airplane mode an acceptable way to test a lack of connection?

We are in the process of developing a method of caching so that our app can continue to operate in an area with very little or no signal.
Obviously users will try to continue to use functions that require data and we need to handle the inevitable failure of these requests appropriately.
Essentially, we sit in the office switching airplane mode on and off to simulate entering/exiting signal, then adjust our app to fix any issues that arise.
What I'd like to know is: is using airplane mode going to give us a reasonable simulation of entering/exiting an area with no data, or are there other implications?
I've seen questions raising the issue that the 3G/EDGE connection may not always wake up after airplane mode is switched off again - while I appreciate this method is nowhere near as good as actually testing out in the field, if we can get a reasonable simulation and account for the majority of the problems that arise, then I think this is an acceptable tradeoff.
I apologise if this has been asked before; I did search on here and on Google but couldn't find any appropriate results.
You should try the Network Link Conditioner.
There is a WWDC 2012 session called Networking Best Practices that mentions it (but he does not explain how to use it there).
To get it, you have to go to Xcode / Open Developer Tool / More Developer Tools... and download the latest Hardware IO Tools for Xcode.
Once you install it from the IO Tools package, "Network Link Conditioner" will appear in System Preferences.
You can then do something like 100% packet loss to simulate one of those routers that pretends you are connected but actually doesn't work.
On iOS, the Network Link Conditioner is under Settings / Developer (you must have enabled Developer mode in Xcode first to see it).
The main problem is that in Airplane Mode the networking operations fail fast, while a spotty mobile signal will lead to timeouts and a-few-bytes-an-hour speeds. This is usually a significant difference from the UI viewpoint. (It might be worth trying a bandwidth throttle to starve the testing machine and see how it behaves when the network starts to break.)
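One way to exercise that slow-failure path without leaving your desk is to connect to an address that doesn't answer, so the request hangs until the timeout instead of failing immediately. A hedged Java sketch (the address and 15 s timeout are only illustrative; on many networks a private address like this simply never responds):

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Exercising the slow-failure path: connecting to an unresponsive address
// hangs until the timeout fires, unlike airplane mode, which fails immediately.
public class TimeoutPathTest {
    public static void main(String[] args) {
        try (Socket socket = new Socket()) {
            // 10.255.255.1 is typically non-routable, so the connect attempt just stalls.
            socket.connect(new InetSocketAddress("10.255.255.1", 80), 15_000);
        } catch (SocketTimeoutException e) {
            System.out.println("Timed out after 15 s - the path airplane mode never hits.");
        } catch (Exception e) {
            System.out.println("Failed fast: " + e);
        }
    }
}
```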
A few years back, when testing remote devices which used the cell network to communicate with the 'home base', we did things like move them into a (makeshift) shielded room, place large shields on three of four sides to force them to connect to a certain tower (and therefore a certain network), etc. Brute-force physical methods. Since this actually cuts off the signal, it may be a more realistic approach.
You may also want to try this through your WLAN router. First, disable data roaming on your iPhone. Then, let the iPhone connect to the internet through your WLAN network. Then, disconnect the gateway on your WLAN router while your iPhone is still connected to the WLAN network.
This depends on what failure modes you are trying to test.
I use Airplane mode as a first pass check to make sure an app submission isn't quickly rejected.
Other network failure handling checks might include:
3G only (no Wi-Fi).
Wi-Fi only (in Airplane Mode).
Pulling the power cord on the Wi-Fi access point.
Pulling the network cable from the back of the Wi-Fi access point after connecting to it (Reachability may falsely say yes).
Walking in and out of a basement or elevator (or other Faraday cage) in the middle of a transfer.
Driving between 2 cell towers during a data transfer.
Walking between 2 enabled WIFI access points between connection and data transfer.
Starting the app after more than 30 minutes of device inactivity (radios may be idle).
Running the app while another app (Safari, Mail) is downloading in the background.
etc.

Multi-player server for iPhone application, using device as socket server

I'm working on a multiplayer iPhone application that allows up to 6 users to connect and play in "real time." I've been looking at hosted and non-hosted socket servers (SmartFox, ElectroServer, Photon/Neutron, ProjectDarkstar), and I'm wondering if anyone has any recommendations for services or implementations. Does anyone have any idea what a game like Zynga's Live Poker uses for this type of functionality, or what kind of hardware you might need?
Some sub-questions:
The game is turn-based. Would it make more sense to use AMF and poll a server, or should I go the socket-based route? My current concerns are concurrent connection limits and hosting costs.
Is it possible to "broadcast" a device as a socket server? I.e., once I get all my players connected, could I allocate one of the 6 devices to be a socket server and push all communication through that device? Would that be crazy? That would get around concurrency issues, and I'd only need to rely on the socket server service as a lobby for the initial connection. The allocated user would stay connected to facilitate game-to-server communication.
1.
It's much easier to use polling, and since the game is turn-based you could poll at a relatively slow rate (perhaps every couple of seconds), which means less battery drain. That said, using sockets or persistent HTTP connections would be a slicker way of doing it (and much more work). These two questions might be of interest:
How do I create a chat server that is not driven by polling?
COMET (server push to client) on iPhone
I don't know why you would use AMF. Why not JSON? Or maybe HessianKit?
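As an illustration of how slow polling could look for a turn-based game, here is a hedged Java sketch; the endpoint, response format, and 2-second interval are all invented for the example, and you'd swap in whatever JSON library and server API you actually use.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Turn-based polling: every couple of seconds, ask the server whether the
// game state has changed since the version we last applied locally.
public class TurnPoller {
    public static void main(String[] args) throws Exception {
        int lastKnownVersion = 0;
        while (true) {
            URL url = new URL("https://example.com/game/123/state?since=" + lastKnownVersion); // placeholder
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setReadTimeout(5000);
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String json = in.readLine();              // e.g. {"version":7,"turn":"player3"}
                if (json != null && !json.isEmpty()) {
                    lastKnownVersion = applyState(json);  // parse the JSON and update the game UI
                }
            } finally {
                conn.disconnect();
            }
            Thread.sleep(2000);                           // a slow poll rate is fine for turn-based play
        }
    }

    static int applyState(String json) { return 0; /* parse with your JSON library of choice */ }
}
```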
2.
It makes a lot of sense to designate one of the devices as a server. Having a completely decentralized network of game clients that need to synchronize is a very hard task. Again, since your game is turn-based and doesn't require perfect real-time synchronization, you don't have to worry that having centralized state will introduce more latency.
If you intend for users to play over a local network, you should consider using GameKit.