confusion regarding XMPP xep-0065 and xep-0096

confusion regarding XMPP xep-0065 and xep-0096 - xmpp

I am currently working on xmppframework, Requirements are to transfer the file between two iPhones. I searched for XEPs and found 0065 and 0096
XEP-0065 says:
XMPP is designed for sending relatively small chunks of XML between
network entities and is not designed for sending binary data. However,
sometimes it is desirable to send binary data to another entity that
one has discovered on the XMPP network (e.g., to send a file).
Therefore it is valuable to have a generic protocol for streaming
binary data between any two entities on an XMPP network. The main
application for such a bytestreaming technology is file transfer as
specified in SI File Transfer [1] and Jingle File Transfer [2].
However, other applications are possible, which is why it is important
to develop a generic protocol rather than one that is specialized for
a particular application such as file transfer.
Please see the line in bold, its confusing me if file transfer XEPs are SI File Transfer(0096) and Jingle File Transfer(0234), then what is the purpose of this 0065 XEP? why people on net referring sep-0065 for file transfer?

In XMPP there are different protocols (XEPS) for file transfer. Jingle, Bytestreams, OOB, IBB...
The purpose of XEP-0096 is stream initiation. So its build on top of the other file transfer protocols to enable seamless file transfers.
So its used to agree on one of the above file transfer protocols between 2 clients for a transfer, and also for finding a fallback method if this fails for any reason.
Alex

XEP-0065 is for proxied file transfers: you will need such a proxy in your infrastructure, unless you use a public one.
XEP-0096 is much more complex, I wouldn't recommend that for a start, although I would recommend it if you later extensively use large binary transfers/exchanges, as Jingle is used for VoIP at least.

Related

Is gRPC(HTTP/2) faster than REST with HTTP/2?

The goal is to introduce a transport and application layer protocol that is better in its latency and network throughput. Currently, the application uses REST with HTTP/1.1 and we experience a high latency. I need to resolve this latency problem and I am open to use either gRPC(HTTP/2) or REST/HTTP2.
HTTP/2:
Multiplexed
Single TCP Connection
Binary instead of textual
Header compression
Server Push
I am aware of all the above advantages. Question No. 1: If I use REST with HTTP/2, I am sure, I will get a significant performance improvement when compared to REST with HTTP/1.1, but how does this compare with gRPC(HTTP/2)?
I am also aware that gRPC uses proto buffer, which is the best binary serialization technique for transmission of structured data on the wire. Proto buffer also helps in developing an language agnostic approach. I agree with that and I can implement the same feature in REST using graphQL. But my concern is over serialization: Question No. 2: When HTTP/2 implements this binary feature, does using proto buffer give an added advantage on top of HTTP/2?
Question No. 3: In terms of streaming, bi-directional use-cases, how does gRPC(HTTP/2) compare with (REST and HTTP/2)?
There are so many blogs/videos out in the internet that compares gRPC(HTTP/2) with (REST and HTTP/1.1) like this. As stated earlier, I would like to know the differences, benefits on comparing GRPC(HTTP/2) and (REST with HTTP/2).

gRPC is not faster than REST over HTTP/2 by default, but it gives you the tools to make it faster. There are some things that would be difficult or impossible to do with REST.
Selective message compression. In gRPC a streaming RPC can decide to compress or not compress messages. For example, if you are streaming mixed text and images over a single stream (or really any mixed compressible content), you can turn off compression for the images. This saves you from compressing already compressed data which won't get any smaller, but will burn up your CPU.
First class load balancing. While not an improvement in point to point connections, gRPC can intelligently pick which backend to send traffic to. (this is a library feature, not a wire protocol feature). This means you can send your requests to the least loaded backend server without resorting to using a proxy. This is a latency win.
Heavily optimized. gRPC (the library) is under continuous benchmarks to ensure that there are no speed regressions. Those benchmarks are improving constantly. Again, this doesn't have anything to do with gRPC the protocol, but your program will be faster for having used gRPC.
As nfirvine said, you will see most of your performance improvement just from using Protobuf. While you could use proto with REST, it is very nicely integrated with gRPC. Technically, you could use JSON with gRPC, but most people don't want to pay the performance cost after getting used to protos.

I am not an expert on this by any means and I have no data to back any of this up.
The "binary feature" you're talking about is the binary representation of HTTP/2 frames. The content itself (a JSON payload) will still be UTF-8. You can compress that JSON and set Content-Encoding: gzip, just like HTTP/1.
But gRPC does gzip compression as well. So really, we're talking about the difference between gzip-compressed JSON vs gzip-compressed protobufs.
As you can imagine, compressed protobufs should beat compressed JSON in every way, or else protobufs have failed at their goal.
Besides the ubiquity of JSON vs protobufs, the only downside I can see to using protobufs is that you need the .proto to decode them, say in a tcpdump situation.

What I found out is,
If you want to send files such as txt, img or video files then REST/HTTP is much faster than gRPC.
If you want to send objects over the wire, then gRPC is effecient.
Sources:
For sending files,, sending large files,, another blog

Kind of received data via socket

Is there any way/trick/algorithm that allows me to know what kind of data is coming via socket? I can send both text and files via socket, but I wonder what I'm getting to treat differently.
Any idea?

Is there any way/trick/algorithm that allows me to know what kind of data is coming via socket?
No there is not, and it should be obvious that this is impossible. It's perfectly possible for a binary file and a text file to be identical at the binary level. In which case, how could you distinguish between them.
Sockets are simply a communication layer. It is up to you to determine the protocol for that communication.

There is a trick, exactly that. Unix command 'file' has a lot of heuristics built into it, which allows it to make a very educated guessess regarding random file contents. You can employ it by, for example, saving your data into a temporary file on disk and running file on it. Of course, it is non-binding, but file is good at what it does.

Protocol for sending streaming data over multiple sockets

I am working on designing an API for consuming messages from an application that will generate a very large amount of data; 10+ of GB/s is likely, even for smaller clients. I am looking for a protocol that allows me to deliver this data in a way that is easy for clients to consume.
The obvious answer for me is: split up the messages so they are consumable over multiple connections. Each connection would consume a fraction of the overall load.
But if I do this, there are a few things I need to account for:
How does the user know they are falling behind and need to launch more connections?
Twitter says consumers should check timestamps, which could work for us
When they launch a new connection to consume more of the data, how do they specify that this is part of the same consumption session?
We could give the session a name, correlate that with a "direct" amqp queue, and let our queue do the hard work
Is there something very important I am missing.
Probably.
For this reason, I'd much rather a protocol that already exists.
The protocol would be considered extra awesome if it:
is websocket or streaming HTTP friendly
supports data compression

The problems you are describing are pretty much the same issues that video streaming has to deal with, which you probably already know. The key HTTP friendly streaming protocols are HLS (Apple), SmoothStreaming (Microsoft), HDS (Adobe) and MPEG-DASH (open protocol, but new).
When considering video streaming, it is also worth understanding whether your streams are more like 'live' streams or 'static' content - the former is generated on the fly and any given part of the live stream may only be available for a set tine, while the latter stored on the server in full and generally any part is available at any time (until the content is removed). How you stream and playback these is subtly different.
It may be that you can simply reuse one of the above video streaming protocols by wrapping your data as if it were video (or maybe it even is video), and implementing your own custom client on the receiving side.
Alternatively, these protocols could provide a good reference point if you wanted to create your own simpler protocol - there are several open source streaming servers you could look to for ideas or even adapt to your needs if that looks like a sensible route:
http://gstreamer.freedesktop.org
http://icecast.org
Video streaming is quite complex as you may already be aware, but if your use cases are simpler you may be able to ignore or remove much of the complexity - for example you may not need seek, multiple format and bit rate streams, accompanying streams (for subtitles etc). Being able to simplify like this might justify the effort to modify one of the above for your needs, if you are not able to use them out of the box.
One final point - video and audio streaming protocols usually have a built in way of dealing with delayed or lost packets. Depending on your application these may not be applicable to you so you should look carefully at this aspect if reusing a video or audio streaming protocol or server. For example, audio clients are typically tolerant of a small amount of packet loss, and will generally discard delayed packets rather than pause the audio (packets received outside the 'jitter buffer' window). If your application cannot tolerate any packet loss, then you will need to look carefully at the underlying solution and protocol to make sure it really meets your needs over all network conditions.

Application Level pcap dump

Is it possible to dump contents of tcp recv/send to a pcap file and then have wireshark analyse this. I want to do this because I want to log an applications conversations in order to debug issues but I don't want to have to write my own parsers (wireshark already supports it) and don't necessarily trust the application to parse frames correctly.

What follows is based on work that I've done exclusively on Windows.
I can't speak to how well any of this might apply to other systems.
FWIW, I would suggest editing your question (or at least tagging it)
to be more clear about what OS you're targeting.
If you are talking about writing what's received directly from a TCP client to a pcap file, the answer is no. The pcap file format requires at least some networking headers, e.g., the IP and TCP headers, in order for the file to be valid.
There are four approaches I see.
You could fake these headers for each read from the socket. This wouldn't necessarily require a lot of work, but it would require some knowledge of the relationships between the network header, the transport header, and the payload (i.e., your data) in order for everything to work right. If you're only interested in analysis of the payload in Wireshark, this might be a valid approach. However, if the analysis you're wanting to do in Wireshark is network analysis (i.e., analysis that includes the network headers), then you'll want to avoid this option and figure out how to capture the network headers directly.
You could use a raw socket to capture network traffic at the network layer. This will give you the IP packet - IP header, TCP header, and payload. You could dump this to a pcap file using LINKTYPE_RAW as the link-layer header type, and Wireshark will read the file fine. I've used this approach before.
The third approach would be to use WinPcap to collect your data. This is the same software that Wireshark uses behind the scenes to collect the data, and it's the approach I'm using now. If you have Wireshark installed, the WinPcap is already installed, too. WinPcap has a simple API with numerous examples for opening an interface, collecting the data, and dumping it to a pcap or pcap-ng file.
The fourth option is the simplest...just use Wireshark to collect the data simultaneously with your application.
HTH.

How to push data to variety of different client types in near real time?

We need is to push sports data to a number of different client types such as ajax/javascript, flash, .NET and Mac/iPhone. Data updates need to only be near-real time with delays of several seconds being acceptable.
How to best accomplish this?

The best solution (if we're talking .NET) seem to be to use WCF and streaming http. The client makes the first http connection to the server at port 80, the connection is then kept open with a streaming response that never ends. (And if it does it reconnects).
Here's a sample that demonstrates this: Streaming XML.
The solution to pushing through firewalls: Keeping connections open in IIS

I would go with XML. XML is widely supported on all platforms and has lots of libraries and tools available for it. And since it's text, there are no issues when you pass it between platforms.
I know JSON is another alternative, but I'm not familiar enough with it to know whether or not to recommend it in this case.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse