What is the actual difference between Socket and RPC (Remote Procedure Call)?
As per my understanding, both work on the client–server model. Also, which one should be used under which conditions?
PS: The confusion arose while reading Operating System Concepts by Galvin
Short answer:
RPC is the protocol. The socket provides access to the transport to implement that protocol.
RPC is a service and protocol offered by the operating system that allows code to be invoked by a remote application. It defines a protocol by which procedures or objects can be accessed by another device over a network. An implementation of RPC can be done over basically any network transport (e.g. TCP, UDP, cups with strings).
The socket is just a programming abstraction such that the application can send and receive data with another device through a particular network transport. You implement protocols (such as RPC) on top of a transport (such as TCP) with a socket.
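To make the layering concrete, here is a minimal sketch (just an illustration, not from any particular RPC framework) of the socket side alone: a plain TCP client that connects, writes some bytes, and reads some bytes back. The address, port, and the "PING" request are placeholders; whatever protocol you layer on top (RPC, HTTP, something home-grown) is what gives those bytes meaning.

```c
/* Minimal sketch: a plain TCP client socket.  The bytes we write could
 * carry any protocol -- RPC, HTTP, or something home-grown; the socket
 * itself only moves bytes.  "127.0.0.1" and port 9000 are placeholders. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* TCP transport */
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9000);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    /* Whatever protocol lives on top decides what these bytes mean. */
    const char request[] = "PING\n";
    write(fd, request, sizeof request - 1);

    char reply[128];
    ssize_t n = read(fd, reply, sizeof reply - 1);
    if (n > 0) {
        reply[n] = '\0';
        printf("server said: %s", reply);
    }
    close(fd);
    return 0;
}
```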
It is operating-system specific, so first read a good OS book like Operating Systems: Three Easy Pieces (freely downloadable).
Network sockets are a way to do some inter-process communication (notably between different machines). Read also about Berkeley sockets API, e.g. socket(7) on Linux.
Remote procedure calls are a programming technique (often implemented with the socket(2) system call on Linux). Every RPC request expects exactly one reply and is initiated by software.
Sockets are also often used for asynchronous messages (for example, the X11 protocol stack, WebSockets, SMTP). Message passing is a programming paradigm (more general than RPC); messages are often sent without expecting any reply. For example, the X11 server sends a keyboard event message for every key press, etc.
(so in some ways, you are comparing apples and oranges)
If on Linux, I recommend reading Advanced Linux Programming (freely downloadable) and learning more about syscalls(2) (notably poll(2) for multiplexing).
PS: The confusion arose while reading Operating System Concepts by Galvin
That's your problem right there.
A remote procedure call (RPC) is a high-level model for network communication. There are numerous RPC protocols in existence. In the RPC model, the underlying implementation creates a stub for each remote procedure. When your application calls the "remote procedure", the stub packs up the parameters, sends them over the network, and invokes the remote version of the procedure; the remote side takes the return values and sends them back over the network, the stub unpacks them, and your application then receives them.
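As a rough illustration of that stub idea (not any particular RPC framework), here is a toy client-side stub for a hypothetical remote add(a, b). The wire format of two big-endian 32-bit integers for the request and one for the reply is invented for the example; real RPC systems use far more elaborate marshalling.

```c
/* Toy client-side stub for a hypothetical remote add(a, b).
 * It packs the parameters, ships them over an already-connected
 * socket, blocks for the reply, and unpacks the return value. */
#include <arpa/inet.h>
#include <stdint.h>
#include <unistd.h>

int remote_add(int sockfd, int32_t a, int32_t b, int32_t *result)
{
    /* Marshal: convert parameters to a network-byte-order buffer. */
    uint32_t packet[2] = { htonl((uint32_t)a), htonl((uint32_t)b) };
    if (write(sockfd, packet, sizeof packet) != (ssize_t)sizeof packet)
        return -1;

    /* Block until the server sends the return value back. */
    uint32_t reply;
    if (read(sockfd, &reply, sizeof reply) != (ssize_t)sizeof reply)
        return -1;

    /* Unmarshal: convert back to host byte order for the caller. */
    *result = (int32_t)ntohl(reply);
    return 0;
}
```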
The RPC model became hip in the late 1980s. The idea was that it would be transparent where your functions actually executed (in your process, in another process, on another computer). This concept expanded into distributed objects in the early 1990s (e.g., DCOM, CORBA).
Unfortunately, in the real world, applications really did need to know whether a procedure was executing remotely, because of latency and error handling.
Somewhere in the RPC implementation a network interface gets called.
Sockets are such a network interface. They are not the only programming interface but they are the most common on Unix systems.
Thus, an RPC MIGHT be implemented using a socket.
Related
Given the server-client model, would the OS initiate messages to applications, or is message passing always initiated by programs that want to use resources and thus must communicate with the OS?
OS is an overloaded term, and application is a vague term.
A pure message-passing OS might implement traditional (unix) system calls in applications. For example, you might have an application called FileSystem, which accepts messages like Read, Write, Open, Close, and so on. In this scheme, such an application would be considered a server, and the client would be an application that wanted to use the file services.
Pure message-passing systems typically have difficulty with asynchronous events. When you look at implementing a normal read system call in a message-passing system, it is natural for it to be an RPC: the client sends a read request, then suspends until the server has satisfied the read and sent a reply.
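A sketch of that shape, with a made-up message layout and a hypothetical FileSystem server on the other end of the socket:

```c
/* Sketch of the "read becomes an RPC" pattern.  The message layout and
 * the FileSystem server are hypothetical; the point is only the shape:
 * send a request, then block on the reply before returning. */
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

enum { MSG_READ = 1, MSG_READ_REPLY = 2 };

struct msg {
    int  type;          /* MSG_READ or MSG_READ_REPLY            */
    int  fd;            /* which file the client wants to read   */
    int  count;         /* bytes requested / bytes returned      */
    char data[512];     /* payload of the reply                  */
};

/* Looks like read(2) to the caller, but is really message passing. */
ssize_t msg_read(int fs_sock, int file, void *buf, size_t count)
{
    struct msg req = { .type = MSG_READ, .fd = file,
                       .count = count > 512 ? 512 : (int)count };

    if (send(fs_sock, &req, sizeof req, 0) != (ssize_t)sizeof req)
        return -1;

    struct msg rep;                 /* the client suspends here ...  */
    if (recv(fs_sock, &rep, sizeof rep, MSG_WAITALL) != (ssize_t)sizeof rep)
        return -1;                  /* ... until the server replies  */

    memcpy(buf, rep.data, rep.count);
    return rep.count;
}
```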
When the client wants asynchronous notification, such as "send me a message when new mouse events are available", the RPC model somewhat falls down. While purely asynchronous systems exist, they are cumbersome to use with plain old programming languages like C, C++, and so on. There is hope that message-based languages like Golang can break the impasse, but that remains to be seen.
Higher-level OS-like services may deploy a number of interaction methods quite distinct from client–server. Publish–subscribe, a more recent reimplementation of 1980s multicast, has been popular in the last decade. Clients subscribe to a set of channels that they are interested in, and every event delivered to a channel is copied to every client subscribed to that channel before it is retired. Normal clients can generate events as well, so the mechanism serves as a dynamic interconnect between modules.
D-Bus and ZeroMQ are publish–subscribe systems of differing scales. Note that both can be implemented outside of a message-passing OS.
I'm playing around with the Language Server Protocol. After experimenting for some time, I can see two ways to communicate with the language server: blocking sockets and non-blocking sockets.
By a blocking socket I mean sending a request and blocking until the response arrives. This is easy, but it will block the UI once I use it in a GUI application. The other way is using async/non-blocking sockets. This is a bit more complex and might require some callback/event mechanism.
Now my question is which way does VSCode use to communicate with LSP?
The Node.js language server implementation used by many extensions uses non-blocking communication. You can find the implementation here. It uses Node.js streams and the net module.
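Conceptually (shown in C rather than Node.js, just to illustrate the mechanism), the difference between the two styles boils down to whether the descriptor is in blocking mode or has O_NONBLOCK set; in the latter case a read that would otherwise stall returns EAGAIN and the GUI/event loop keeps running:

```c
/* Sketch of the two styles: a blocking read stalls the calling thread
 * until data arrives; after O_NONBLOCK is set, read() returns
 * immediately with EAGAIN when nothing is ready, so a GUI loop can
 * keep running and try again later. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

void make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Returns >0 on data, 0 when nothing is available yet, -1 on error. */
ssize_t try_read(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;       /* no data yet; go back to the event loop */
    return n;
}
```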
Is there a conventional way to write a program such that commands can be issued to it from the command line without a REPL? For example, the way you can send commands to a running nginx server using sudo /etc/init.d/nginx restart (or any other valid command besides restart).
One idea I had was having the long-running program create and monitor a unix socket that other programs can write to in order to send it commands. Another was to create a local server with a REST interface that can be sent commands that way, though that seems a bit gross.
What's the right way to do this?
Both ways are ok, and you could even consider using some RPC machinery, such as making your application serve JSONRPC on some unix(7) socket. Or use a fifo(7). Or use D-Bus.
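A minimal sketch of the unix(7)-socket option, assuming a made-up socket path /tmp/mydaemon.sock and simple one-line commands such as restart or status:

```c
/* Minimal sketch of a control channel on a unix(7) socket.  The daemon
 * binds /tmp/mydaemon.sock (placeholder path) and treats each accepted
 * connection as one line-oriented command. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/tmp/mydaemon.sock";
    int lfd = socket(AF_UNIX, SOCK_STREAM, 0);

    struct sockaddr_un addr = {0};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);                        /* remove a stale socket file */
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 5);

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0) continue;

        char cmd[128];
        ssize_t n = read(cfd, cmd, sizeof cmd - 1);
        if (n > 0) {
            cmd[n] = '\0';
            if (strncmp(cmd, "restart", 7) == 0)
                printf("would restart now\n");     /* real work here */
            else if (strncmp(cmd, "status", 6) == 0)
                write(cfd, "running\n", 8);
        }
        close(cfd);
    }
}
```

You could then drive it with any client able to write to a unix-domain socket, for instance a small companion control program or a netcat build that supports unix sockets.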
A common habit on Unix is to have applications reload their configuration files on e.g. the SIGHUP signal, and save some persistent state (before terminating) on SIGTERM. Read signal(7) (notice that only async-signal-safe routines can be called from signal handlers; a good approach is to only set some volatile sig_atomic_t variable inside the handler and test it outside). See also the POSIX signal.h documentation.
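The flag pattern mentioned above looks roughly like this:

```c
/* The SIGHUP handler only sets a flag; the main loop notices it and
 * reloads configuration outside of signal context, where it is safe
 * to call arbitrary functions. */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int sig)
{
    (void)sig;
    reload_requested = 1;      /* async-signal-safe: just set a flag */
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sighup;
    sigaction(SIGHUP, &sa, NULL);

    for (;;) {
        if (reload_requested) {
            reload_requested = 0;
            printf("reloading configuration...\n");  /* safe here */
        }
        pause();               /* a real server would poll/sleep here */
    }
}
```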
You might make your application a specialized HTTP server (e.g. using some HTTP server library like libonion) and give it a Web interface (or REST, or SOAP ...); the user (or sysadmin) would then use their browser to interact with your application.
You could make your server systemd compatible. (I don't know exactly what that requires; it is perhaps D-Bus related.)
You could embed some command interpreter (like Guile or Lua) in your app and have some limited kind of REPL running on some IPC channel like a socket or a fifo. Beware of nasty code injection.
I had a similar issue where I have a plethora of services running on any number of machines and each is in need of communicating with several others.
My main problem was not so much the communication between the services. That can be done with a simple message sent over a connection (as Basile mentioned, it can be TCP, UDP, Unix sockets, FIFOs...). However, when you have over 20 services, many of which need to communicate with several other services, you start getting a headache figuring out how to get all the connections right (I have such a system, and even though it has a relatively limited number of services, around 10, it is already very complicated).
So I created a process (yet another service) called Communicator. All services connect to the Communicator service and when they need to send a message, they include the name of the service they want to reach. The Communicator service is in charge of sending the message to the right place—i.e. it could be to another Communicator service running on a different computer. Communicator has a graph of all the services available on your network and knows how to send messages to them without your service having to know anything about all of that. Computing a graph can be really complex.
For this purpose, I created the eventdispatcher project. It is in C++, which may not be what you're interested in, although you could use it from other languages that interface with C/C++. The structure of the messages is "proprietary" (specific to the Communicator), but you can create any message you want. A message includes a name and parameters (param-name=value). The first version has a simple one-line text communication system. The newer version accepts JSON as well (still one line of text per message).
The system supports TCP, UDP, Unix sockets, and FIFOs, and between threads you can have thread-safe FIFOs. It also understands signals (like SIGHUP, SIGTERM, etc.). It has a specific connection to listen for the death of a thread. It supports encryption over TCP via OpenSSL. Messages can be dispatched automatically (hence the current name of the library). Connections can be assigned a timer. And there are CUI and GUI (Qt) extensions as well.
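Just to illustrate the name-plus-parameters idea (this is not the actual eventdispatcher wire format), parsing a one-line "command param=value;param=value" message could look like this:

```c
/* Rough sketch of parsing a one-line "command param=value;param=value"
 * message.  This is NOT the real eventdispatcher wire format, just an
 * illustration of the name-plus-parameters idea. */
#include <stdio.h>
#include <string.h>

void parse_message(char *line)
{
    /* First token is the message name, the rest are key=value pairs. */
    char *save = NULL;
    char *name = strtok_r(line, " ", &save);
    if (!name)
        return;
    printf("message: %s\n", name);

    for (char *pair = strtok_r(NULL, ";", &save);
         pair != NULL;
         pair = strtok_r(NULL, ";", &save)) {
        char *eq = strchr(pair, '=');
        if (eq) {
            *eq = '\0';
            printf("  param %s = %s\n", pair, eq + 1);
        }
    }
}

int main(void)
{
    char msg[] = "register service=my_service;version=1";
    parse_message(msg);
    return 0;
}
```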
The one main point here is that all your connections can be polled (see poll()), and thus you can implement a system that reacts to events instead of a system that sleeps and checks for events, sleeps and checks again, and so on; or worse, has a single blocking connection so that everything has to happen on that one connection or your service gets stuck. This is one reason Unix has been using signals: early versions of Unix did not have select() or poll().
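A trimmed-down sketch of that reactive style, using poll(2) with a listening TCP socket and its accepted clients in one pollfd array (port 9000 is a placeholder, and the loop just echoes data back):

```c
/* Event-driven loop built on poll(2): one listening TCP socket plus
 * every accepted client share a single pollfd array, and the process
 * sleeps until something happens on any of them. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 64

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(9000),  /* placeholder */
                                .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 16);

    struct pollfd fds[MAX_FDS] = { { .fd = lfd, .events = POLLIN } };
    int nfds = 1;

    for (;;) {
        poll(fds, nfds, -1);                  /* block until any event */

        if ((fds[0].revents & POLLIN) && nfds < MAX_FDS) {
            int cfd = accept(lfd, NULL, NULL);          /* new client  */
            fds[nfds++] = (struct pollfd){ .fd = cfd, .events = POLLIN };
        }

        for (int i = 1; i < nfds; i++) {
            if (!(fds[i].revents & POLLIN))
                continue;
            char buf[256];
            ssize_t n = read(fds[i].fd, buf, sizeof buf);
            if (n <= 0) {                     /* peer closed: drop it  */
                close(fds[i].fd);
                fds[i--] = fds[--nfds];
            } else {
                write(fds[i].fd, buf, n);     /* echo back as a demo   */
            }
        }
    }
}
```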
What are the advantages and disadvantages of using only socket-based communication vs. a hybrid of REST and sockets (using a socket only when bidirectional communication is necessary, like receiving messages in a chat)?
When I say only sockets, I mean that instead of sending a GET request asking for /entities, I'd send update_needed and the server would send a push via the socket.
My question is not really about performance; it's more about the concept, like delegate vs. block/lambda (using a socket is like the delegate concept, and REST is more like a block).
It all boils down to what type of application and level of scalability you have in mind.
Some related questions worth reading:
WebSocket/REST: Client connections?
How to handle CQRS from a client-side perspective
Hard downsides of long polling?
The main reason why I wouldn't use WebSockets in any major project is simply that many users still don't use a modern browser that supports them. Namely, IE 8 and 9 don't support them, and together they still have a market share of over 20% (Oct 2015).
I am trying to understand implementations/options for server-side WebSocket endpoints, particularly in Perl using PSGI/Plack, and I have a question: why are all server-side WebSocket implementations based around event-driven PSGI servers (Twiggy, Tatsumaki, etc.)?
I get that WebSocket communication is asynchronous, but a non-event-driven PSGI server (say Starman) could spawn an asynchronous listener to handle the WebSocket side of things. I have seen (but not understood) PHP implementations of WebSocket servers, so why can't the same be done with PSGI without having to change the server to an event-driven one?
The underlying network logic used to deal with sockets depends on the platform, the OS, and the particular software implementation.
The three most common methods are:
polling - constantly "asking", in a blocking manner, whether the socket has some data. This method is quite bad, as it blocks execution of the main thread for as long as it waits for data.
thread per socket - each new connection gets its own thread, and the blocking reads on each socket happen within that thread, so they won't block the main thread's logic. This method is also bad, because creating a thread per connection is expensive in memory; a thread can cost around 1 MB of RAM, depending on the OS and other criteria.
async - uses system facilities to "notify" your process when there is something to do. So you can react once your app is ready (in the case of a single-threaded app) or even react in a separate thread straight away. This method is very efficient, as it saves RAM and allows your app to work without waiting or repeatedly asking for data. It utilises functionality that most OSes and platforms provide.
Taking this into account, you can indeed create a single-process, functional way of dealing with socket traffic. But that is not efficient at all, as explained above. That is why fully async models are dominant today, and most languages and platforms support this paradigm.