Sending file on separate connection - unix-socket

I have a server program that spawns a thread for every incoming connection. This thread then handles the request by receiving it and sending a response. For some kinds of connections I have to respond first with a file and then with a text response.
The problem is that, if I send the textual response after sending the file, the response gets written inside the file, because the client has no way of knowing where the file ends and where the response begins. So I need to close the connection after sending the file and then send a response on another connection or, alternatively, send the file on a separate connection and then send the response on the current connection. How can I accomplish this?

Use the technique that FTP uses to keep the data connection separate from the control connection. The server starts listening on an ephemeral port -- the OS will assign an unused port to it. It sends this port number to the client on the main connection. The client then connects to the ephemeral port, and the server sends the file on this new connection.
If you need to deal with multiple sockets concurrently, you can use select() or epoll() to wait for data on either of them.
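A minimal Python sketch of this FTP-style handoff, assuming both ends run on the same machine. The `PORT <n>` announcement line, the `OK file sent` status text, and all function names are made up for illustration; they are not part of any standard.

```python
import socket
import threading


def send_file_on_data_connection(control, payload, host="127.0.0.1"):
    """Server side: send payload on a fresh connection, then reply on control."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((host, 0))            # port 0: the OS assigns an unused port
        listener.listen(1)
        port = listener.getsockname()[1]
        control.sendall(f"PORT {port}\n".encode("ascii"))  # hypothetical message
        data_conn, _ = listener.accept()
        with data_conn:
            data_conn.sendall(payload)      # closing data_conn marks the end of the file
    control.sendall(b"OK file sent\n")      # text response stays on the main connection


def fetch_file(control, host="127.0.0.1"):
    """Client side: read the port, download until EOF, then read the text response."""
    with control.makefile("r", encoding="ascii", newline="\n") as reader:
        port = int(reader.readline().split()[1])
        chunks = []
        with socket.create_connection((host, port)) as data_conn:
            while chunk := data_conn.recv(65536):
                chunks.append(chunk)
        return b"".join(chunks), reader.readline().strip()


def demo():
    """Run both ends in one process; socketpair() stands in for the main connection."""
    server_end, client_end = socket.socketpair()
    server = threading.Thread(
        target=send_file_on_data_connection, args=(server_end, b"\x00file\nbytes")
    )
    server.start()
    result = fetch_file(client_end)
    server.join()
    server_end.close()
    client_end.close()
    return result
```

Because the file travels alone on the second connection, end-of-file on that connection is an unambiguous end marker, and the text response can never be mistaken for file contents.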

Related

client/server socket reconnection

I developed a client/server application based on sockets.
The client side is in Delphi. The server side is on an IBM i (AS/400).
Sometimes, the client and the server get disconnected. I'm not really sure why, but I think it's because of a machine between them (a proxy, a router, a firewall) sending a RST packet.
Anyway, I'm trying to reconnect the client with the same process on the server. (not another one, the same, that's important).
To do that, I create a new connection from the client. So, I have two processes on the server. I'll call them the "LostProcess" and the "HelperProcess".
The LostProcess is waiting for data in a data queue.
The client tells the HelperProcess that it was connected to the LostProcess.
The HelperProcess sends data to the LostProcess (via the data queue).
The HelperProcess makes a giveDescriptor, and the LostProcess makes a takeDescriptor.
Then the HelperProcess stops and the LostProcess sends data to the client (to say “I'm back”).
So far this works, but when the client sends data, the LostProcess (we can call it the RebornProcess now) never receives it. (I tried not stopping the HelperProcess, and in that case it is the HelperProcess that receives the data.)
With Wireshark, I could see that the client sends data with a different local port, so I guess that's why the RebornProcess does not receive them.
I tried to force the local port of the new client socket to be the same as the first one, but then the new client socket cannot connect for a while, and if I wait long enough, I have the same problem as before.
Does somebody have an idea how to make the reconnection work?
What you are doing is generally not possible. Once a TCP connection has been lost, it is gone forever. Both apps must close their respective sockets for the lost connection, and the client app must create a new socket connection to continue exchanging data with the server.
If the client app wants to reuse the same local port via bind() (which is generally not advisable in most cases), but does not want to wait for the OS to release the port first, then the client can enable the SO_REUSEADDR option via setsockopt() on the new socket before calling bind() and connect().
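As a rough Python sketch of that call order (the function name `connect_from_port` is hypothetical): the option is set first, then `bind()`, then `connect()`. The exact semantics of SO_REUSEADDR differ between operating systems, and `connect()` can still fail if the old connection to the same server address and port has not been fully released.

```python
import socket


def connect_from_port(local_port, server_addr):
    """Open a new TCP connection whose local port is chosen by the caller."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(); asks the OS to allow binding a local
        # port that is still held by an old connection in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", local_port))     # "" means any local interface
        sock.connect(server_addr)
    except OSError:
        sock.close()
        raise
    return sock
```

Even when this succeeds, the result is a brand-new connection that happens to use the old port number; the server still sees it as a new socket.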
Pretty sure the answer is you can't.
There'd be all kinds of security issues if TCP/IP allowed a new connection to reconnect to an existing process's connection.
You should have the lost process terminate and just use the new process instead.

is it possible for http server to respond http client on same connection fd each time?

I want to write an iterative HTTP server that serves a given HTTP client on the same conn_fd (file descriptor) every time, but creates a new_fd for a different client, based on checking the client address. Is this possible?
I'm not sure I understand your question, but this is basically how sockets work: you create a master socket and set it to a listening state. Then, every time you accept a new client, a new socket is created for that client, while the master socket remains the same.
For a nice intro about Unix sockets, see http://beej.us/guide/bgnet/
Each new connection will result in a new socket. So if the same client connects multiple times it will be a new socket (and file descriptor), but if it connects one time and sends multiple requests over the same connection (HTTP keep alive) it will be the same fd.
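A small Python sketch of this behaviour on the loopback interface (the `demo` function and the request strings are made up for illustration). Two connections yield two distinct accepted sockets next to the listening socket, while two requests sent over one connection arrive on the same accepted socket.

```python
import socket


def demo():
    """Show which server-side socket each connection and request ends up on."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        addr = listener.getsockname()
        with socket.create_connection(addr) as client_a, \
                socket.create_connection(addr) as client_b:
            # Each accept() returns a brand-new socket for one connection.
            accepted = [listener.accept()[0] for _ in range(2)]
            try:
                # The peer address tells the server which client a socket belongs to.
                by_peer = {conn.getpeername(): conn for conn in accepted}
                conn_a = by_peer[client_a.getsockname()]
                fds = {listener.fileno()} | {conn.fileno() for conn in accepted}
                # Two requests on one connection arrive on the same server socket.
                client_a.sendall(b"request 1")
                first = conn_a.recv(1024)
                client_a.sendall(b"request 2")
                second = conn_a.recv(1024)
                return len(fds), first, second
            finally:
                for conn in accepted:
                    conn.close()
```

The count of distinct descriptors is three: the listening socket plus one per connection, regardless of how many requests each connection carries.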

Explain http keep-alive mechanism

Keep-alives were added to HTTP to basically reduce the significant
overhead of rapidly creating and closing socket connections for each
new request. The following is a summary of how it works within HTTP
1.0 and 1.1:
HTTP 1.0: The HTTP 1.0 specification does not really delve into how
Keep-Alive should work. Basically, browsers that support Keep-Alive
appended an additional header to the request as [edited for clarity] explained below:
When the server processes the request and
generates a response, it also adds a header to the response:
Connection: Keep-Alive
When this is done, the socket connection is
not closed as before, but kept open after sending the response. When
the client sends another request, it reuses the same connection. The
connection will continue to be reused until either the client or
the server decides that the conversation is over, and one of them drops the connection.
The above explanation comes from here. But I don't understand one thing:
When this is done, the socket connection is not closed as before, but
kept open after sending the response.
As I understand it, we just send TCP packets to make requests and responses, so how does this socket connection help, and how does it work? We still have to send packets, so how can it somehow establish a persistent connection? It seems so unreal.
There is overhead in establishing a new TCP connection (DNS lookups, TCP handshake, SSL/TLS handshake, etc). Without a keep-alive, every HTTP request has to establish a new TCP connection, and then close the connection once the response has been sent/received. A keep-alive allows an existing TCP connection to be re-used for multiple requests/responses, thus avoiding all of that overhead. That is what makes the connection "persistent".
In HTTP 0.9 and 1.0, by default the server closes its end of a TCP connection after sending a response to a client. The client must close its end of the TCP connection after receiving the response. In HTTP 1.0 (but not in 0.9), a client can explicitly ask the server not to close its end of the connection by including a Connection: keep-alive header in the request. If the server agrees, it includes a Connection: keep-alive header in the response, and does not close its end of the connection. The client may then re-use the same TCP connection to send its next request.
In HTTP 1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response.
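The rules above can be condensed into a small decision function. This is only a sketch: the function name is hypothetical, and it assumes the header names have already been lower-cased by whatever parsed the message.

```python
def should_keep_alive(http_version, headers):
    """Decide whether the connection stays open after this request/response.

    http_version is a string such as "HTTP/1.0"; headers is a dict whose
    keys are lower-cased header names (an assumption of this sketch).
    """
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False                    # either side may ask to close
    if http_version == "HTTP/1.1":
        return True                     # persistent by default
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens   # must be requested explicitly
    return False                        # HTTP 0.9 and anything unknown: close
```

A server would typically call this once with the request headers and once with its own response headers, and keep the socket open only if both agree.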
Let's make an analogy. HTTP consists of sending a request and getting the response. This is similar to asking someone a question, and receiving a response.
The problem is that the question and the answer need to go through the network. To communicate through the network, TCP (sockets) is used. That's similar to using the phone to ask someone a question and having this person answer.
With HTTP 1.0, loading a page that contains two images, for example, consists of these steps:
make a phone call
ask for the page
get the page
end the phone call
make a phone call
ask for the first image
get the first image
end the phone call
make a phone call
ask for the second image
get the second image
end the phone call
Making a phone call and ending it takes time and resources. Control data (like the phone number) must transit over the network. It would be more efficient to make a single phone call to get the page and the two images. That's what keep-alive allows doing. With keep-alive, the above becomes
make a phone call
ask for the page
get the page
ask for the first image
get the first image
ask for the second image
get the second image
end the phone call
This is indeed a networking question, but it may be appropriate here after all.
The confusion arises from the distinction between packet-oriented and stream-oriented connections.
The Internet is often called a "TCP/IP" network. At the low level (IP, Internet Protocol) the Internet is packet-oriented. Hosts send packets to other hosts.
However, on top of IP we have TCP (Transmission Control Protocol). The entire purpose of this layer of the Internet is to hide the packet-oriented nature of the underlying medium and to present the connection between two hosts (hosts and ports, to be more correct) as a stream of data, similar to a file or a pipe. We can then open a socket in the OS API to represent that connection, and we can treat that socket as a file descriptor (literally an FD in Unix, very similar to a file HANDLE in Windows).
Most of the other Internet client-server protocols (HTTP, Telnet, SSH, SMTP) are layered on top of TCP. Thus a client opens a connection (a socket), writes its request to the socket (where it is transmitted as one or more packets in the underlying IP), reads the response from the socket (and the response can contain data from multiple IP packets as well) and then... Then the choice is to keep the connection open for the next request or to close it. Pre-KeepAlive HTTP always closed the connection. New clients and servers can keep it open.
The advantage of KeepAlive is that establishing a connection is expensive. For short requests and responses it may take more packets than the actual data exchange.
The slight disadvantage may be that the server now has to tell the client where the response ends. The server cannot simply send the response and close the connection. It has to tell the client: "read 20KB and that will be the end of my response". Thus the size of the response has to be known in advance by the server and communicated to the client as part of the higher-level protocol (e.g. Content-Length: in HTTP). Alternatively, the server may send a delimiter to mark the end of the response; it all depends on the protocol above TCP.
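To illustrate why the length matters on a kept-open connection, here is a simplified Python sketch that splits a byte stream holding several responses back into individual ones. The function name is made up, and it assumes every response carries a Content-Length header (no chunked encoding) and that the whole stream is already in memory.

```python
def split_responses(stream):
    """Split back-to-back HTTP responses using their Content-Length headers.

    Returns a list of (status_line, body) tuples. Simplified on purpose:
    every response must have a Content-Length header.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        head_end = stream.index(b"\r\n\r\n", pos)      # blank line ends the headers
        lines = stream[pos:head_end].decode("ascii").split("\r\n")
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers["content-length"])
        body_start = head_end + 4
        body = stream[body_start:body_start + length]  # exactly `length` bytes
        responses.append((lines[0], body))
        pos = body_start + length                      # next response starts here
    return responses
```

Only the declared length tells the reader where one body stops, so a body may safely contain bytes that look like the start of another response.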
You can understand it this way:
HTTP uses TCP as its transport. Exchanging data over TCP involves these steps:
The client sends a connection request.
The server responds.
The data transfer takes place.
The connection is closed.
However, if we are using the keep-alive feature, the connection is not closed after the data has been received. The connection stays active.
This improves performance because, for the next calls, connection establishment does not take place: the connection to the server is already there. That means less time taken. The time spent connecting is small, but it makes a big difference in systems where every millisecond counts.

Does winsock api multithread automatically?

I am writing a small HTTP server which uses the Microsoft Windows WinSock API.
Do I need to apply multithreaded logic when handling multiple users?
Currently Windows sends a message when there is a network event, and each message
carries (in wParam) the socket to be used in either send() or recv().
When client A connects and requests a couple of files, Winsock usually creates a
number of sockets. My server then gets a message such as "send this file to
socket 123" and later "send that file to socket 456".
When another client connects, it too gets a few sockets, say 789 and 654.
My server then responds to requests to send data using the supplied socket number. It
does not have to know who wants the file, since the correct file only has to be sent to
the right socket.
I do not know whether Windows itself uses multiple threads when
accepting connections and sending the messages down to my program.
So my question is:
Do I need to apply multithreaded logic when handling multiple users? And if so at
what point should I create a thread?
You typically use a thread per socket. And if you are accepting connections, one thread sits in a loop, blocking while it waits for an incoming connection. You then create a new thread and pass the accepted socket handle to it. When that connection is closed and done with, simply let that thread terminate (or join it). This is the basis of a threaded server.
In pseudocode:
loop {
    socket = accept();
    new ThreadHandler( socket )
}
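The same accept loop as a runnable Python sketch rather than Winsock code. The echo behaviour in `handle_client` and the `max_connections` parameter are assumptions added so the example can stop on its own; a real server would loop forever and do real work per connection.

```python
import socket
import threading


def handle_client(conn):
    """Per-connection thread: echo everything back until the client closes."""
    with conn:
        while data := conn.recv(4096):
            conn.sendall(data)


def serve(listener, max_connections=None):
    """Accept loop: block in accept(), then hand each socket to its own thread."""
    served = 0
    while max_connections is None or served < max_connections:
        conn, _ = listener.accept()     # blocks until a client connects
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()
        served += 1
```

With `listener = socket.create_server(("127.0.0.1", 0))`, calling `serve(listener)` on one thread lets any number of clients connect, and a slow client only blocks its own handler thread.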
Using a single thread to handle multiple sockets is tricky, mainly because the thread can block (stop and wait) while it's writing to or, more often, reading from a socket. It's not for the faint-hearted.
For most applications, there is no point in using multiple threads to handle network connections. I've made a small writeup in an answer to this question.
Multiple threads become useful when handling the received data requires an unpredictable amount of CPU time, for example in database servers, or when the program structure does not allow for requests to be handled asynchronously.
There is also a third option, the "worker pool". A single thread handles all incoming connections and deserializes incoming requests, and then passes off work items to a pool of threads that handle one item at a time.
This way, simply opening a connection does not yet consume the resources needed for an entire thread, and system load is implicitly limited by the number of threads in the pool.
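One way to sketch the worker-pool layout in Python, using the standard `selectors` module for the single I/O thread and `ThreadPoolExecutor` for the pool. The names `process`, `reply_and_close`, `serve`, and `demo` are hypothetical, and the sketch assumes one small request per connection that arrives in a single `recv()`.

```python
import selectors
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def process(request):
    """Stand-in for the expensive part of handling a request."""
    return request.upper()


def reply_and_close(conn, request):
    """Runs on a pool thread: handle one work item, then finish the connection."""
    with conn:
        conn.sendall(process(request))


def serve(listener, pool, max_requests):
    """One thread accepts connections and reads requests; the pool does the work."""
    with selectors.DefaultSelector() as sel:
        sel.register(listener, selectors.EVENT_READ)
        handled = 0
        while handled < max_requests:
            for key, _ in sel.select(timeout=1.0):
                sock = key.fileobj
                if sock is listener:
                    conn, _ = listener.accept()
                    conn.setblocking(True)
                    sel.register(conn, selectors.EVENT_READ)
                else:
                    sel.unregister(sock)
                    request = sock.recv(4096)   # assumes one recv() holds the request
                    if request:
                        pool.submit(reply_and_close, sock, request)
                        handled += 1
                    else:
                        sock.close()            # client left without sending anything


def demo():
    """Three clients share a pool of two worker threads."""
    with socket.create_server(("127.0.0.1", 0)) as listener, \
            ThreadPoolExecutor(max_workers=2) as pool:
        acceptor = threading.Thread(target=serve, args=(listener, pool, 3))
        acceptor.start()
        replies = []
        for word in (b"alpha", b"beta", b"gamma"):
            with socket.create_connection(listener.getsockname()) as client:
                client.sendall(word)
                replies.append(client.recv(1024))
        acceptor.join()
        return replies
```

An idle connection here costs only a selector registration, and at most `max_workers` requests are processed at once, which is the implicit load limit described above.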

client server sockets and file transfer

I have a client/server application.
My server accepts connections from more than one client.
After a client has connected to the server, it sends commands to the server and the server sends replies.
The replies are either strings or files.
On the server side, after accepting a connection,
there is a socket (separate from the listening socket) which is responsible for communication with the client.
On the client side, after the client sends a command to the server, I start reading the response on the same socket.
Now my problem is with files.
The client sends a command to the server asking for a file, and the server starts responding by sending the binary data of the file. If all goes well, the file transfers fine.
But if the server hits a read problem in the middle of the file transfer, it has no way to inform the client of that problem, because this is one-to-one socket communication: the client will treat any incoming data as file data until it has received the file size that was sent at the start.
I am sure this is a recurring pattern. How can I resolve it?
FTP does this by having two connections: a command connection and a data connection.
As long as these are TCP/IP sockets, all you need is an agreement between the server and the client that the first eight bytes (for example) passed to send() and returned by recv(), respectively, represent the size of the binary data to follow. TCP/IP will make sure that all the pieces arrive and are in order for you. If you have a variety of files that could be transferred, then you agree that the next four bytes after that represent characters for the file type. So you basically keep calling recv() until you have 12 bytes, which will probably take only one recv(). Then keep using recv() until you have all the bytes you expected to receive.
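A Python sketch of that 12-byte header: an 8-byte size followed by a 4-byte type tag. The big-endian layout, the tag values, and the function names are assumptions for illustration; the only real requirement is that both sides agree on them.

```python
import socket
import struct

# 8-byte unsigned size followed by a 4-byte type tag, in network byte order.
HEADER = struct.Struct("!Q4s")


def recv_exact(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf += chunk
    return bytes(buf)


def send_message(sock, kind, payload):
    """Send one framed message; `kind` is a 4-byte tag such as b"FILE"."""
    sock.sendall(HEADER.pack(len(payload), kind) + payload)


def recv_message(sock):
    """Read the 12-byte header, then exactly as many payload bytes as it announces."""
    size, kind = HEADER.unpack(recv_exact(sock, HEADER.size))
    return kind, recv_exact(sock, size)
```

Because every message carries its own type and length, a string reply can follow a file on the same socket without ambiguity. One possible extension, not part of the answer above, is to send a large file as several such messages so that the server could substitute a message with an error tag if reading the file fails partway.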