I am new to async socket connections. Can you please explain how this technology works?
There's an existing application (the server) which requires socket connections to transmit data back and forth. I already created my application (.NET), but the server application doesn't seem to understand the XML data that I am sending. My documentation gives me two ports: one to send and another one to receive.
I need to be sure that I understand how this works.
I have the IP addresses and also the two ports to be used.
A socket is the most "raw" way to send byte-level TCP and UDP packets across a network.
For example, your browser uses a TCP socket connection to connect to the StackOverflow web server on port 80. Your browser and the server exchange commands and data according to an agreed-on structure/protocol (in this case, HTTP). An asynchronous socket is no different from a synchronous socket except that it does not block the thread that's using it.
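To make the blocking vs. non-blocking distinction concrete, here is a minimal sketch in Python (used only for illustration; the same idea carries over to the async socket APIs in .NET). The host and port are placeholder values:

    import asyncio
    import socket

    HOST, PORT = "example.com", 80  # assumed values for illustration

    def fetch_sync():
        # Synchronous: connect() and recv() block the calling thread until done.
        with socket.create_connection((HOST, PORT)) as s:
            s.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
            return s.recv(4096)

    async def fetch_async():
        # Asynchronous: the same exchange, but the thread is free to do other
        # work while the connection and the reads are pending.
        reader, writer = await asyncio.open_connection(HOST, PORT)
        writer.write(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
        await writer.drain()
        data = await reader.read(4096)
        writer.close()
        await writer.wait_closed()
        return data

    print(fetch_sync()[:60])
    print(asyncio.run(fetch_async())[:60])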
This is really not the ideal way to work (check whether your server/vendor application supports SOAP/Web Services, etc.), but if this is really the only way, there could be a number of reasons why it's failing. To name a few:
Not actually getting connected or sending data. Run a test using WinsockTool (http://www.isatools.org/tools/winsocktool.msi) and simulate your client first to make sure the server is working as expected.
Encoding incorrect - You're sending raw bytes across the network, so make sure you're using the correct encoding to convert your XML into bytes (ASCII, UTF-8, etc.); see the sketch after this list.
Buffer length - Your send buffer (the amount of data you can transmit in one shot) may be too small, or the server may expect content of a certain length, and your XML could be getting truncated.
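As a minimal sketch of the encoding point (Python used for brevity; the server address, port, and XML payload are made-up placeholders):

    import socket

    xml = '<?xml version="1.0" encoding="UTF-8"?><request><id>42</id></request>'

    # Encode the XML explicitly; a mismatch here (e.g. the server expecting ASCII,
    # or a stray byte-order mark) is a common reason the peer cannot parse the data.
    payload = xml.encode("utf-8")

    with socket.create_connection(("192.0.2.10", 5000)) as s:  # placeholder address/port
        s.sendall(payload)   # sendall() loops until every byte has been written
        reply = s.recv(4096)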
Let's break a misconception first: sockets are FULL-DUPLEX. You connect to a server using one port, and then you can send AND receive data through that same socket; there is no need for two port numbers. (Actually, there is a port assigned for receiving data, but it is 1. assigned automatically when the socket is created (unless told otherwise) and 2. of no use in the function calls that receive data.)
So you tell us that your documentation gives you two port numbers. I assume that the "server" is an already-existing in-house application and you are trying to talk to it. If the docs list two ports, then you will need two sockets: one for sending and another one for receiving. I would suggest you first use a synchronous socket before trying the async way: a synchronous socket is less error-prone for a first test.
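For a first synchronous test against the two documented ports, something along these lines could work (Python for brevity; the IP and port numbers are placeholders, and I'm assuming your client connects out to both ports; your documentation may instead require you to listen on the receive port):

    import socket

    SERVER_IP = "192.0.2.10"           # placeholder; use the address from your docs
    SEND_PORT, RECV_PORT = 5000, 5001  # placeholders for the two documented ports

    # One socket used only to send requests to the server.
    send_sock = socket.create_connection((SERVER_IP, SEND_PORT))
    send_sock.sendall(b"<request/>")

    # A second socket used only to receive the server's replies.
    recv_sock = socket.create_connection((SERVER_IP, RECV_PORT))
    reply = recv_sock.recv(4096)
    print(reply)

    send_sock.close()
    recv_sock.close()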
(By the way, let's break another misconception: if well coded, once a server listens on a port, it can receive any number of connections through that same port number; there is no need to open two listening ports to accept two connections. Sorry for the rant, but I've seen those two errors committed enough times that it gives me an urge to kill.)
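To illustrate that second point, a single listening socket keeps accepting any number of connections on one port. A rough Python sketch (the port number is arbitrary):

    import socket
    import threading

    def handle(conn, addr):
        # Echo whatever the client sends, then close that client's socket.
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", 5000))   # one listening port...
    srv.listen()

    while True:
        conn, addr = srv.accept()  # ...accepts any number of client connections
        threading.Thread(target=handle, args=(conn, addr), daemon=True).start()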
Related
I have two apps sending TCP packets, both written in Python 2. When the client sends TCP packets to the server too fast, the packets get concatenated. Is there a way to make Python recover only the last sent packet from the socket? I will be sending files with it, so I cannot just use some character as a packet terminator, because I don't know the content of the file.
TCP uses packets for transmission, but they are not exposed to the application. Instead, the TCP layer may decide how to break the data into packets, even fragments, and how to deliver them. Often, this happens because of the underlying network topology.
From an application point of view, you should consider a TCP connection as a stream of octets, i.e. your data unit is the byte, not a packet.
If you want to transmit "packets", use a datagram-oriented protocol such as UDP (but beware, there are size limits for such packets, and with UDP you need to take care of retransmissions yourself), or wrap them manually. For example, you could always send the packet length first, then the payload, over TCP. On the other side, read the size first, then you know how many bytes need to follow (beware, you may need to read more than once to get everything, because of fragmentation). Here, TCP will take care of in-order delivery and retransmission, so this is easier.
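A minimal sketch of the sending side of such a length-prefix scheme, assuming a 4-byte unsigned length in network byte order (Python shown for brevity):

    import struct

    def send_msg(sock, payload: bytes) -> None:
        # Prefix each message with its length as a 4-byte unsigned int in
        # network byte order, then send the payload itself.
        sock.sendall(struct.pack("!I", len(payload)) + payload)

The receiver does the mirror image: read 4 bytes, unpack the length, then loop on recv() until that many payload bytes have arrived.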
TCP is a streaming protocol, which doesn't expose individual packets. While reading from the stream and getting whole packets might work in some configurations, it will break with even minor changes to the operating system or networking hardware involved.
To resolve the issue, use a higher-level protocol to mark file boundaries. For example, you can prefix the file with its length in octets (bytes). Or, you can switch to a protocol that already handles this kind of stuff, like HTTP.
First you need to know whether the packets are combined before they are sent or after. Use Wireshark to check if the sender is sending one packet or two. If it is sending one, then your fix is to call flush() after each write. I do not know the answer if the receiver is combining packets after receiving them.
You could change what you are sending: send the number of bytes, followed by the bytes themselves. Then the other side would know how many bytes to read.
Normally, TCP_NODELAY prevents that. But there are very few situations where you need to switch that on. One of the few valid ones are telnet style applications.
What you need is a protocol on top of the TCP connection. Think of the TCP connection as a pipe: you put things in one end of the pipe and get them out of the other. You cannot just send a file through this without both ends being coordinated. You have recognised that you don't know how big it is or where it ends. This is your problem. Protocols take care of this. You don't have a protocol, so what you're writing is never going to be robust.
You say you don't know the length. Get the length of the file and transmit it in a header, followed by that many bytes.
For example, if the header is a 64-bit integer holding the length, then when you receive the header at the server end, you read that 64-bit number as the length and keep reading until you have received that many bytes, which should be the end of the file.
Of course, this is extremely simplistic but that's the basics of it.
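A rough sketch of that receive loop, assuming an 8-byte (64-bit) unsigned length header in network byte order (Python used for illustration):

    import struct

    def recv_exact(sock, n: int) -> bytes:
        # recv() may return fewer bytes than requested, so loop until we have n.
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed before full message arrived")
            buf += chunk
        return buf

    def recv_file(sock) -> bytes:
        # Read the 8-byte length header first, then exactly that many bytes.
        (length,) = struct.unpack("!Q", recv_exact(sock, 8))
        return recv_exact(sock, length)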
In fact, you don't have to design your own protocol. You could use an existing protocol, such as HTTP.
I have a client application that uses IOCP for socket communication. I'm using ConnectEx to make the TCP connection to the remote endpoint (binding the socket to ADDR_ANY and port 0 before calling ConnectEx).
It is valid in my application to have two connections to the same remote endpoint (same IP address and port number). When I test that condition with my current code, I have two overlapped I/O read operations outstanding (one on each connected socket) from calls to WSARecv(). Each WSARecv() is called with the correct socket and overlapped structure, for example: WSARecv(socket1, ... overlapped1) and WSARecv(socket2, ... overlapped2). The problem I've run into is that when I get a response back from either remote, it triggers the completion event for both of the outstanding overlapped operations. My code only produces this result when the two remotes have the same IP address and port number, not when there are two unique remote addresses. Is this the expected behavior (hopefully not)? If so, is there another way to accomplish this?
I'm posting an answer, even though it is really just an explanation of why the problem happened.
My test involved connecting to and communicating with a remote device that provides data. It turns out that it is on the other side of a digi terminal server. So the connection path was:
my test computer (via TCP) -> Digi terminal server (via Serial) -> remote device.
The Digi terminal server basically converts TCP/IP to serial communications, and back. Since the serial side doesn't have a concept of 'connectedness', the Digi doesn't know which TCP/IP connection should receive the serial data in response to a TCP/IP request, so it forwards the serial data to all active connections on the TCP/IP side. That's what was producing the IOCP trigger on both of my pending overlapped operations. Every time a request was sent to the Digi, it sent the request out of its serial port. When the end device responded, the Digi forwarded the response data to each of my TCP/IP connections.
Thanks to everyone who commented on my question, but sorry for taking up your time.
I started reading UNIX Network Programming by W. Richard Stevens and I am very confused about the difference between a port and a socket. When I read on the internet, it said that a socket is an endpoint for a connection, and for port numbers it was written that an IP address and port number form a unique pair.
So now my questions are:
(1) What is the difference between these two ?
(2) How are sockets and ports internally manipulated ? Are sockets a file ?
(3) How is data sent when we send it using an application ?
(4) If sockets are there then why do we use port numbers ?
Sorry for my English. Thanks in advance for the reply.
(1) What is the difference between these two ?
A computer running IP networking always has a fixed number of ports -- 65535 TCP ports and 65535 UDP ports. A network packet's header contains a 16-bit unsigned-short field in it specifying which of those ports the packet should be delivered to.
Sockets, on the other hand, are demand-allocated by each program. A socket serves as a handle/interface between the program and the OS's networking stack, and is used to build and specify a context for a particular networking task. A socket may or may not be bound to a port, and it's also possible (and common) to have more than one socket bound to a particular port at the same time.
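For example (a Python sketch; the port number is arbitrary), a listening socket and the connection sockets it accepts all share the same local port, and are told apart by the remote endpoint:

    import socket

    # The listening socket is bound to local port 8080.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 8080))
    listener.listen()

    # Each accepted connection is a *new* socket, yet its local port is still
    # 8080: several sockets share one port, distinguished by the remote address.
    conn, peer = listener.accept()
    print(listener.getsockname())  # ('127.0.0.1', 8080)
    print(conn.getsockname())      # ('127.0.0.1', 8080) as well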
(2) How are sockets and ports internally manipulated ? Are sockets a file ?
That's totally up to the OS, and different OSes do it in different ways. It's unclear what you mean by "a file" in this question, but in general sockets do not have anything to do with the filesystem. On the other hand, one feature of Unix-style OSes is that socket descriptors are also usable in much the same way that filesystem file descriptors are -- i.e. you can pass them to read()/write()/select(), etc. and get useful results. Other OSes, such as Windows, do not support that feature, and for them you must use a completely separate set of function calls for sockets vs. files.
(3) How is data sent when we send it using an application ?
The application calls the send() function (or a similar function such as sendto()), passes in the relevant socket descriptor along with a pointer to the data it wants to send, and then it is up to the network stack to copy that data into a packet and deliver it to the appropriate networking device for transmission.
(4) If sockets are there then why do we use port numbers ?
Because you need a way to communicate with particular programs on other computers, and computer A has no way of knowing what sockets are present (if any) on computer B. But port numbers are fixed, so it is possible for programmers to use them as a rendezvous point for communication -- for example, your web browser knows that a web server is almost certain to be listening for incoming HTTP requests on port 80 whenever the server is running, so it can send its requests to port 80 with a reasonable expectation of getting a useful response back. If it had to specify a socket as a target instead, what would it specify? The server's socket numbers are arbitrary and likely to be different every time the server runs.
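You can see that asymmetry directly: when a client connects to a well-known port, its own local port is picked automatically and differs from run to run. A small Python sketch (example.com is just a placeholder host):

    import socket

    # Connect to the well-known HTTP port; the local port is chosen automatically.
    with socket.create_connection(("example.com", 80)) as s:
        print("local  endpoint:", s.getsockname())  # e.g. ('10.0.0.5', 54321) - ephemeral
        print("remote endpoint:", s.getpeername())  # (<server IP>, 80)        - well-known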
(1) What is the difference between these two ?
(2) How are sockets and ports internally manipulated ? Are sockets a file ?
A socket is (IP+Port):
A socket is like a telephone (i.e. end to end device for communication)
IP is like your telephone number (i.e. address of your socket)
Port is like the person you want to talk to (i.e. the service you want to order from that address)
A socket is part of a process. A process in Linux is a file.
(3) How is data sent when we send it using an application ?
Data is sent by converting it to bytes. There is a little-endian/big-endian problem regarding byte ordering, so you have to take this into consideration when coding.
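For instance, Python's struct module lets you pick the byte order explicitly when putting numbers on the wire (a small illustrative sketch):

    import struct

    value = 0x12345678

    # Network byte order (big-endian) and the host's native order may differ,
    # so choose an explicit order when serializing numbers for the wire.
    print(struct.pack("!I", value))  # b'\x124Vx'  - big-endian / network order
    print(struct.pack("<I", value))  # b'xV4\x12'  - little-endian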
(4) If sockets are there then why do we use port numbers ?
A socket is (address + port), which means the person you want to talk to (the port) can be reachable from many telephone numbers (IPs) and thus from many sockets (that does not mean the person at one telephone number will reply to you the same as the one at the other telephone number, because his job here/there may be different).
If I want to use (UDP) sockets as an inter-process communication mechanism on a single PC, are there restrictions on what I can set up due to the two endpoints having the same IP address?
I imagine that in order to have two processes A and B both listening on the same IP/port address, SO_REUSEADDR would be necessary - correct? And even though that might conceptually allow for full-duplex comms over a single socket, there are other questions I have if I try to go full duplex:
would I end up receiving my own transmissions, and have to filter them out?
would I be exposing myself to other processes injecting spurious or malicious data into my sockets due to the use of SO_REUSEADDR... or do I face this possibility simply by using (connectionless) UDP?
how would things be different (in an addressing/security/restrictions sense) if I chose to use TCP instead?
I'm confident that there is a viable solution using two sockets at each end (one for A -> B data, one for B ->A data)... but is there a viable solution using a single socket at each end? Would there be any clear advantages to using one full-duplex socket per process if it is possible?
The question arises from a misunderstanding. The misunderstanding arises from reading variable names like receivePort and sendPort with different values, and reading them as if they have an implicit link to the socket at the local end. This might make one (mistakenly) believe that two sockets are being used, or must be used - one for send, one for receive. This is wrong - a single socket (at each end) is all that is required.
If using variables to refer to ports on a single host, it is preferable to name them such that it is clear that one is local, or pertaining to "this" process, and the other is remote or peer, pertaining to the address of a different process, despite being on the same local host. Then it should be clearer that, like any socket, it is entirely possible to support both send and receive from the single socket with its single port number.
In this scenario (inter-process communication on the same host necessarily using different port numbers for the single socket at each end) all the other questions (SO_REUSEADDR, TCP vs UDP and receiving one's own transmissions) are distractions arising from the misunderstanding.
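A rough Python sketch of that single-socket arrangement (the port numbers 9001 and 9002 are assumed; process B would be the mirror image with the ports swapped):

    import socket

    # Process A: one UDP socket, bound to its own local port, both sends and receives.
    LOCAL_PORT, PEER_PORT = 9001, 9002  # assumed ports for the two processes

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", LOCAL_PORT))              # this process's address
    sock.sendto(b"hello", ("127.0.0.1", PEER_PORT))   # send to the peer process
    data, addr = sock.recvfrom(4096)                  # receive the reply on the same socket

Because each process binds its single socket to a distinct local port, no SO_REUSEADDR is needed and a process never receives its own transmissions.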
As far as I know, only one process can be bound to a port of the same protocol, and in order to read information arriving at a port, a socket must be bound to that relevant port.
is there a way of sharing a socket with another process or something like that?
is there a way of sharing a socket with another process or something like that?
Sharing a socket and thus the port between two processes is possible (for example after fork), but this is probably not what you want for data analysis, since if one process reads the data the other does not get it anymore.
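A minimal sketch of that fork-style sharing on a Unix-like system (the port number is arbitrary); note that each incoming connection is handed to exactly one of the two processes, not both:

    import os
    import socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", 8000))
    srv.listen()

    if os.fork() == 0:
        # Child: inherits the same listening socket (same port) as the parent.
        conn, _ = srv.accept()
    else:
        # Parent: also accepts on the same socket; the kernel gives each
        # incoming connection to whichever process' accept() takes it.
        conn, _ = srv.accept()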
how can firewall/iptables check incoming tcp traffic of already bound ports?
Packet filters like iptables work inside the kernel and get the data before it gets sent to the socket. It does not even matter whether there is a socket bound to this specific port at all. Unless the packet filter denies the data, it gets forwarded unchanged to the socket (if there is one).
Passive IDS like Snort, or tools like tcpdump, get the raw packets, and here it also does not matter whether there is a socket at all. They can only read the packets, i.e. they cannot modify or block them.
Application-level firewalls or (reverse) proxies have their own socket and receive the data there (directly, or redirected by the packet filter). They can then analyse the data and will explicitly forward it (maybe after modification) to the original application.