UDP packets are dropped in OS (asio is used) - sockets

I am developing a client-server application that transfers data via UDP.
I am facing the problem of dropped packets. I added a socket buffer check to detect potential overflow, and my app also checks the sequence numbers of received packets. Packets have a fixed size. If the free space in the socket buffer falls below a threshold (the size of 3 packets, for example), a "Critical level of buffer" message is logged. If a packet number is skipped in the sequence, a corresponding message is logged. Here is the code:
UdpServer::UdpServer(asio::io_service& io, uint16_t port, uint32_t packetSize) : CommunicationBase(io, port),
m_socket(io, asio::ip::udp::endpoint(asio::ip::address_v6::any(), m_port))
{
m_buffer = new uint8_t[packetSize];
m_packetSize = packetSize;
m_socketBufferSize = m_packetSize * 32;
m_criticalLevel = 5 * m_packetSize;
asio::ip::udp::socket::receive_buffer_size recieveBuffSize(m_socketBufferSize);
m_socket.set_option(recieveBuffSize);
}
UdpServer::~UdpServer()
{
delete[] m_buffer; // m_buffer was allocated with new[], so it must be released with delete[]
}
void UdpServer::StartReceive(std::function<void(uint8_t* buffer, uint32_t bytesCount)> receiveHandler)
{
m_onReceive = receiveHandler;
Receive();
}
inline void UdpServer::Receive()
{
m_socket.async_receive(asio::null_buffers(), [=](const boost::system::error_code& error, size_t bytesCount)
{
OnReceive(bytesCount, error);
});
}
void UdpServer::OnReceive(size_t bytesCount, const boost::system::error_code& error)
{
static uint16_t lastSendNum = 65535;
uint16_t currentNum = 0;
uint16_t diff = 0;
if (error)
{
if (error == asio::error::operation_aborted)
{
logtrace << "UDP socket reports operation aborted, terminating";
return;
}
logerror << "UDP socket error (ignoring): " << error.message();
}
else
{
asio::ip::udp::endpoint from;
boost::system::error_code receiveError;
size_t bytesRead = 0;
size_t bytesAvailable = m_socket.available();
while (bytesAvailable > 0)
{
if (m_socketBufferSize - bytesAvailable < m_criticalLevel)
{
logwarning << "Critical buffer level!";
}
bytesRead = m_socket.receive(asio::buffer(m_buffer, m_packetSize), 0, receiveError);
if (receiveError)
{
logerror << "UDP socket error: " << receiveError.message();
break;
}
currentNum = *reinterpret_cast<uint16_t*>(m_buffer);
diff = currentNum - lastSendNum;
if (diff != 1)
{
logdebug << "Chunk skipped: " << diff << ". Last " << lastSendNum << " next " << currentNum;
}
lastSendNum = currentNum;
if (m_onReceive)
{
m_onReceive(m_buffer, bytesRead);
}
bytesAvailable = m_socket.available();
}
}
Receive();
}
Even if the buffer status check and the packet processing in m_onReceive are disabled and bytesAvailable > 0 is replaced with true, UDP packets are still dropped. The data rate is ~71 Mb/s over 1 Gb Ethernet.
Windows 10 is used. I also checked the netstat -s output: no reassembly failures. The socket buffer never overflows.
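The sequence check in OnReceive can be isolated into two small functions, which makes the 16-bit wraparound behaviour easy to verify on its own. This is only a sketch: the names read_sequence and missed_between are hypothetical, and it assumes, as the code in the question does, that the sequence number occupies the first two bytes of each packet in host byte order. It copies the bytes with memcpy rather than casting the buffer pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the 16-bit sequence number stored at the start of a packet.
// memcpy avoids any alignment assumption about the buffer.
std::uint16_t read_sequence(const std::uint8_t* packet)
{
    assert(packet != nullptr);
    std::uint16_t n;
    std::memcpy(&n, packet, sizeof n);
    return n;
}

// Number of packets missing between two consecutively received
// sequence numbers, computed modulo 65536 so that 65535 -> 0 counts
// as "nothing missing".
std::uint16_t missed_between(std::uint16_t last, std::uint16_t current)
{
    return static_cast<std::uint16_t>(current - last - 1);
}
```

A receive loop could call missed_between(lastSendNum, currentNum) and log only when the result is non-zero. A gap found this way shows that a packet did not reach the application, but not where on the path it was lost.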

Related

socket connection lost using select function

I'm a newbie in socket programming.
I built my server on a good sample program that uses the select function.
It works well for more than 20,000 connections.
But in some cases, two connections are accepted consecutively without
any data being received from the first socket.
Data is received only from the second connection.
After that, the first socket's resources cannot be released.
I think FD_SET and FD_ISSET do not work with the first socket when two accepts happen consecutively.
There are 6 working clients.
Before this situation, the sequence is:
accept, receive data, close socket, accept, receive
data, close, ...
In the failing case, the sequence is:
accept, accept, receive data from the second socket, close the second socket.
The first socket connection is lost.
After that, the accept function assigns the second socket descriptor.
What is the problem?
How can I release the first socket?
BR
Paul
My code is as follows:
while(1)
{
//clear the socket set
FD_ZERO (&readfds);
//add master socket to set
FD_SET (sever_socket, &readfds);
max_sd = sever_socket;
//add child sockets to set
for ( i = 0 ; i < MAX_CLIENT ; i ++)
{
//socket descriptor
sd = client_socket [i];
//if valid socket descriptor then add to read list
if (sd > 0)
{
FD_SET( sd , &readfds);
}
//highest file descriptor number, need it for the select function
if(sd > max_sd)
{
max_sd = sd;
}
}
//wait for an activity on one of the sockets , timeout is NULL , so wait indefinitely
activity = select ( max_sd + 1 , &readfds , NULL , NULL , NULL);
if ((activity < 0) && (errno!=EINTR))
{
LOG_F (WARNING, "select error");
}
//If something happened on the master socket, then its an incoming connection
if (FD_ISSET(sever_socket, &readfds))
{
if ((new_socket = accept (sever_socket, (struct sockaddr *) &address, (socklen_t*) &addrlen)) < 0)
{
perror("accept");
exit(EXIT_FAILURE);
}
//inform user of socket number - used in send and receive commands
LOG_F (INFO, "New connection, socket fd is %d, ip is : %s, port : %d",
new_socket, inet_ntoa(address.sin_addr), ntohs(address.sin_port));
//add new socket to array of sockets
for (i = 0; i < MAX_CLIENT; i++)
{
//if position is empty
if( client_socket[i] == 0 )
{
client_socket[i] = new_socket;
LOG_F (INFO, "Adding to list of sockets as %d" , i);
break;
}
}
}
for (i = 0; i < MAX_CLIENT; i++)
{
sd = client_socket[i];
if (FD_ISSET (sd , &readfds))
{
memset (&rcvBuf, 0x00, sizeof(rcvBuf));
if ((fp = fdopen (sd, "r")) == NULL)
{
LOG_F (WARNING, "TCP_SOCKET FD_OPEN Error");
close (sd);
client_socket[i] = 0;
}
else
{
ret = ioctl (sd, FIONREAD, &nread);
if (nread == 0)
{
fclose (fp);
close (sd);
client_socket[i] = 0;
LOG_F (WARNING, "Client disconnected(as %d, fd %d)", i, sd);
}
else
{
len = recv (sd, rcvBuf, nread, 0);
if (len > 0)
{
LOG_F (INFO, "RECV size %d" , len);
...
do_msg_handler ()
}
}
...
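One detail in the loop above may be related to the lost descriptor: fclose(fp) also closes the underlying descriptor, so the close(sd) that follows it is a second close on the same number, and on the data path the FILE created by fdopen is never released. The sketch below avoids both by using recv alone and closing each descriptor exactly once. The helper name read_or_release is hypothetical, and the slot convention (0 means empty) is taken from the client_socket array in the question.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: reads once from a client socket that select()
// reported as readable. Returns the number of bytes read (> 0), or 0
// after closing the descriptor once and marking the slot as empty.
ssize_t read_or_release(int& slot, char* buf, std::size_t cap)
{
    assert(buf != nullptr && cap > 0);
    ssize_t len;
    do {
        len = ::recv(slot, buf, cap, 0);
    } while (len < 0 && errno == EINTR);   // retry if interrupted by a signal

    if (len <= 0) {
        // 0 means the peer closed the connection; < 0 is an error.
        ::close(slot);
        slot = 0;
        return 0;
    }
    return len;
}
```

In the loop from the question this would replace the fdopen / ioctl(FIONREAD) / fclose / close sequence with a single call such as read_or_release(client_socket[i], rcvBuf, sizeof(rcvBuf)).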

Linux TCP socket timestamping option

Quoting from this online kernel doc:
SO_TIMESTAMPING
Generates timestamps on reception, transmission or both. Supports
multiple timestamp sources, including hardware. Supports generating
timestamps for stream sockets.
Linux supports TCP timestamping, and I tried to write some demo code to get any timestamp for TCP packet.
The server code as below:
//Bind
if( bind(socket_desc,(struct sockaddr *)&server , sizeof(server)) < 0)
{
perror("bind failed. Error");
return 1;
}
puts("bind done");
//Listen
listen(socket_desc , 3);
//Accept an incoming connection
puts("Waiting for incoming connections...");
int c = sizeof(struct sockaddr_in);
client_sock = accept(socket_desc, (struct sockaddr *)&client, (socklen_t*)&c);
if (client_sock < 0)
{
perror("accept failed");
return 1;
}
// Note: I am trying to get software timestamp only here..
int oval = SOF_TIMESTAMPING_RX_SOFTWARE | SOF_TIMESTAMPING_SOFTWARE;
int olen = sizeof( oval );
if ( setsockopt( client_sock, SOL_SOCKET, SO_TIMESTAMPING, &oval, olen ) < 0 )
{ perror( "setsockopt TIMESTAMP"); exit(1); }
puts("Connection accepted");
char buf[] = "----------------------------------------";
int len = strlen( buf );
struct iovec myiov[1] = { {buf, len } };
unsigned char cbuf[ 40 ] = { 0 };
int clen = sizeof( cbuf );
struct msghdr mymsghdr = { 0 };
mymsghdr.msg_name = NULL;
mymsghdr.msg_namelen = 0;
mymsghdr.msg_iov = myiov;
mymsghdr.msg_iovlen = 1;
mymsghdr.msg_control = cbuf;
mymsghdr.msg_controllen = clen;
mymsghdr.msg_flags = 0;
int read_size = recvmsg( client_sock, &mymsghdr, 0);
if(read_size == 0)
{
puts("Client disconnected");
fflush(stdout);
}
else if(read_size == -1)
{
perror("recv failed");
}
else
{
struct msghdr *msgp = &mymsghdr;
printf("msg received: %s \n",(char*)msgp->msg_iov[0].iov_base);// This line is successfully hit.
// Additional info: print msgp->msg_controllen inside gdb is 0.
struct cmsghdr *cmsg;
for ( cmsg = CMSG_FIRSTHDR( msgp );
cmsg != NULL;
cmsg = CMSG_NXTHDR( msgp, cmsg ) )
{
printf("Time GOT!\n"); // <-- This line is not hit.
if (( cmsg->cmsg_level == SOL_SOCKET )
&&( cmsg->cmsg_type == SO_TIMESTAMPING ))
printf("TIME GOT2\n");// <-- of course , this line is not hit
}
}
Any ideas why no timestamp is available here? Thanks
Solution
I am able to get the software timestamp along with hardware timestamp using onload with solarflare NIC.
Still no idea how to get software timestamp alone.
The link you gave, in the comments at the end, says:
I've discovered why it doesn't work. SIOCGSTAMP only works for UDP
packets or RAW sockets, but does not work for TCP. – Gio Mar 17 '16 at 9:33
it doesn't make sense to ask for timestamps for TCP, because there's
no direct correlation between arriving packets and data becoming
available. If you really want timestamps for TCP you'll have to use
RAW sockets and implement your own TCP stack (or use a userspace TCP
library). – ecatmur Jul 4 '16 at 10:39
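Separately from whether the kernel attaches a timestamp to TCP data at all, the control-message handling in the question can be written so that it is independent of buffer size guesses. The sketch below is an assumption-laden illustration: ScmTimestamps is a local mirror of the three-timespec layout the kernel documentation gives for struct scm_timestamping, and extract_sw_timestamp is a hypothetical helper name. On a typical 64-bit Linux build, CMSG_SPACE for that payload is larger than the 40-byte cbuf used in the question, so sizing the buffer with CMSG_SPACE is the safer choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ctime>
#include <sys/socket.h>

// Local mirror of the layout documented for struct scm_timestamping:
// three timespec values, with ts[0] holding the software timestamp.
struct ScmTimestamps {
    timespec ts[3];
};

// Control buffer size that is guaranteed to hold one timestamp record.
constexpr std::size_t kControlLen = CMSG_SPACE(sizeof(ScmTimestamps));

// Hypothetical helper: walks the ancillary data of a received message
// and copies out the software timestamp if an SCM_TIMESTAMPING record
// is present. Returns false when no such record was delivered.
bool extract_sw_timestamp(msghdr& msg, timespec& out)
{
    assert(msg.msg_control != nullptr || msg.msg_controllen == 0);
    for (cmsghdr* c = CMSG_FIRSTHDR(&msg); c != nullptr; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET &&
            c->cmsg_type == SCM_TIMESTAMPING &&
            c->cmsg_len >= CMSG_LEN(sizeof(ScmTimestamps))) {
            ScmTimestamps t;
            std::memcpy(&t, CMSG_DATA(c), sizeof t);
            out = t.ts[0];
            return true;
        }
    }
    return false;
}
```

With this, the receive side would declare the control buffer as char cbuf[kControlLen] and call extract_sw_timestamp(mymsghdr, ts) after recvmsg. A false return together with msg_controllen == 0 means no record was delivered, which matches what the question observes.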

modbus_read_register - Error connection timed out

We are using libmodbus library to read register values from energy meter EM6400 which supports Modbus over RTU. We are facing the following two issues.
1) We are facing an issue with modbus_read_registers API, this API returns -1 and the error message is:
ERROR Connection timed out: select.
After debugging the library, we found this issue is due to the echo of request bytes in the response message.
read() API call in _modbus_rtu_recv returns request bytes first followed by response bytes. As a result, length_to_read is calculated in compute_data_length_after_meta() based on the request bytes instead of response bytes (which contains the number of bytes read) and connection timed out issue occurs.
We tried to use both 3.0.6 and 3.1.2 libmodbus versions but same issue occurs in both the versions.
2) modbus_rtu_set_serial_mode (ctx, MODBUS_RTU_RS485) returns "BAD file descriptor".
Please confirm if there is any API call missing or any parameter is not set correctly.
Our sample code to read register value is as follows.
int main()
{
modbus_t *ctx;
uint16_t tab_reg[2] = {0,0};
float avgVLL = -1;
int res = 0;
int rc;
int i;
struct timeval response_timeout;
uint32_t tv_sec = 0;
uint32_t tv_usec = 0;
response_timeout.tv_sec = 5;
response_timeout.tv_usec = 0;
ctx = modbus_new_rtu("/dev/ttyUSB0", 19200, 'E', 8, 1);
if (NULL == ctx)
{
printf("Unable to create libmodbus context\n");
res = 1;
}
else
{
printf("created libmodbus context\n");
modbus_set_debug(ctx, TRUE);
//modbus_set_error_recovery(ctx, MODBUS_ERROR_RECOVERY_LINK |MODBUS_ERROR_RECOVERY_PROTOCOL);
rc = modbus_set_slave(ctx, 1);
printf("modbus_set_slave return: %d\n",rc);
if (rc != 0)
{
printf("modbus_set_slave: %s \n",modbus_strerror(errno));
}
/* Commented - Giving 'Bad File Descriptor' issue
rc = modbus_rtu_set_serial_mode(ctx, MODBUS_RTU_RS485);
printf("modbus_rtu_set_serial_mode: %d \n",rc);
if (rc != 0)
{
printf("modbus_rtu_set_serial_mode: %s \n",modbus_strerror(errno));
}
*/
// This code is for version 3.0.6
modbus_get_response_timeout(ctx, &response_timeout);
printf("Default response timeout:%ld sec %ld usec \n", response_timeout.tv_sec, response_timeout.tv_usec );
response_timeout.tv_sec = 60;
response_timeout.tv_usec = 0;
modbus_set_response_timeout(ctx, &response_timeout);
modbus_get_response_timeout(ctx, &response_timeout);
printf("Set response timeout:%ld sec %ld usec \n", response_timeout.tv_sec, response_timeout.tv_usec );
/* This code is for version 3.1.2
modbus_get_response_timeout(ctx, &tv_sec, &tv_usec);
printf("Default response timeout:%d sec %d usec \n",tv_sec,tv_usec );
tv_sec = 60;
tv_usec = 0;
modbus_set_response_timeout(ctx, tv_sec,tv_usec);
modbus_get_response_timeout(ctx, &tv_sec, &tv_usec);
printf("Set response timeout:%d sec %d usec \n",tv_sec,tv_usec );
*/
rc = modbus_connect(ctx);
printf("modbus_connect: %d \n",rc);
if (rc == -1) {
printf("Connection failed: %s\n", modbus_strerror(errno));
res = 1;
}
rc = modbus_read_registers(ctx, 3908, 2, tab_reg);
printf("modbus_read_registers: %d \n",rc);
if (rc == -1) {
printf("Read registers failed: %s\n", modbus_strerror(errno));
res = 1;
}
for (i=0; i < 2; i++) {
printf("reg[%d]=%d (0x%X)\n", i, tab_reg[i], tab_reg[i]);
}
avgVLL = modbus_get_float(tab_reg);
printf("Average Line to Line Voltage = %f\n", avgVLL);
modbus_close(ctx);
modbus_free(ctx);
}
}
Output of this sample is as follows:
created libmodbus context
modbus_set_slave return: 0
modbus_rtu_set_serial_mode: -1
modbus_rtu_set_serial_mode: Bad file descriptor
Default response timeout:0 sec 500000 usec
Set response timeout:60 sec 0 usec
Opening /dev/ttyUSB0 at 19200 bauds (E, 8, 1)
modbus_connect: 0
[01][03][0F][44][00][02][87][0A]
Waiting for a confirmation...
ERROR Connection timed out: select
<01><03><0F><44><00><02><87><0A><01><03><04><C4><5F><43><D4><C6><7E>modbus_read_registers: -1
Read registers failed: Connection timed out
reg[0]=0 (0x0)
reg[1]=0 (0x0)
Average Line to Line Voltage = 0.000000
Issue 1) is probably a hardware issue, with "local echo" enabled in your RS-485 adapter. Local echo is sometimes used to confirm sending of data bytes on the bus. You need to disable it, or find another RS-485 adapter.
I have written about this in the documentation of my MinimalModbus Python library: Local Echo
It lists a few common ways to disable local echo in RS-485 adapters.
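To make the effect of local echo concrete, the sketch below takes the bytes from the debug output above and separates the echoed request from the real response. It does not use libmodbus and is not a substitute for disabling echo in the adapter; strip_local_echo is a hypothetical helper name that only illustrates why the library sees the request bytes first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical helper: if `received` begins with an exact copy of
// `request` (the local echo), return only the bytes after it.
// Otherwise return `received` unchanged.
Bytes strip_local_echo(const Bytes& request, const Bytes& received)
{
    assert(!request.empty());
    if (received.size() >= request.size() &&
        std::equal(request.begin(), request.end(), received.begin())) {
        const auto offset = static_cast<std::ptrdiff_t>(request.size());
        return Bytes(received.begin() + offset, received.end());
    }
    return received;
}
```

Applied to the log above, the request is 01 03 0F 44 00 02 87 0A and the 17 received bytes start with those same 8 bytes. The remaining 9 bytes (01 03 04 C4 5F 43 D4 C6 7E) are the actual response, with a byte count of 04 for the two requested registers.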

socket bad file descriptor

I want to write a multi-client socket program,
but I get "Bad file descriptor" at the accept stage.
How can I correct my code? Thanks!
Here is my code:
http://codepad.org/q0N1jTgz
Here is the relevant part of my code:
while(1)
{
struct sockaddr_in client_addr;
int addrlen = sizeof(client_addr);
/*Accept*/
if(clientfd = accept(sockfd, (struct sockaddr *)&client_addr, (socklen_t*)&addrlen) < 0)
{
perror("Accept Error");
close(sockfd);
exit(-1);
}
/*Fork process*/
if(child = fork() < 0)
{
perror("Fork Error");
close(sockfd);
exit(-1);
}
else if(child == 0)
{
int my_client = clientfd;
close(sockfd);
send(my_client, welcome, sizeof(welcome), 0);
while ((res = recv(my_client, buffer1, sizeof(buffer1), 0)) > 0)
{
string command(buffer1);
cout << "receive from client:" << command << ", " << res << " bytes\n";
memset(buffer1, '\0', sizeof(buffer1));
}
}
close(clientfd);
}
There are a few bugs in your code.
First, you need to use parentheses around the assignments for child and clientfd.
Line 68 should be changed to
if((clientfd = accept(sockfd, (struct sockaddr *)&client_addr, (socklen_t*)&addrlen)) < 0)
and line 76 should be
if((child = fork()) < 0)
Additionally, you must return or exit() from the forked process, since you have already closed the listening socket.
So add exit(0); or return 0; after line 94.
I highly recommend you compile your code with warnings enabled to catch the assignment problems early, e.g. use the -Wall and -Wextra flags if you are using gcc or g++.
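The precedence problem can be reproduced without any sockets. In the sketch below, fake_accept is a hypothetical stand-in for accept() or fork() that always returns 7. Because < binds tighter than =, the unparenthesized form stores the result of the comparison (0 or 1) instead of the returned value.

```cpp
#include <cassert>

// Hypothetical stand-in for accept() or fork(): always "succeeds"
// and returns a positive value.
int fake_accept() { return 7; }

// Without parentheses the expression is parsed as
// fd = (fake_accept() < 0), so fd receives 0, not 7.
int assign_without_parens()
{
    int fd;
    fd = fake_accept() < 0;
    return fd;
}

// With parentheses the assignment happens first and the comparison
// is applied to the stored value, so fd keeps the real result.
int assign_with_parens()
{
    int fd;
    const bool failed = (fd = fake_accept()) < 0;
    assert(!failed);   // fake_accept never returns a negative value
    return fd;
}
```

In the question's loop this means clientfd and child both end up as 0 whenever the calls succeed. Every process then takes the child == 0 branch and closes sockfd, which would explain accept reporting "Bad file descriptor" on the next iteration.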

Why would connect() give intermittent EINVAL on port to FreeBSD?

I have in my C++ application a failure that arose upon porting to 32 bit FreeBSD 8.1 from 32 bit Linux. I have a TCP socket connection which fails to connect. In the call to connect(), I got an error result with errno == EINVAL which the man page for connect() does not cover.
What does this error mean, which argument is invalid? The message just says: "Invalid argument".
Here are some details of the connection:
family: AF_INET
len: 16
port: 2357
addr: 10.34.49.13
It doesn't always fail though. The FreeBSD version only fails after letting the machine sit idle for several hours. But after failing once, it works reliably until you let it sit idle again for a prolonged period.
Here is some of the code:
void setSocketOptions(const int skt);
void buildAddr(sockaddr_in &addr, const std::string &ip,
const ushort port);
void deepBind(const int skt, const sockaddr_in &addr);
void
test(const std::string &localHost, const std::string &remoteHost,
const ushort localPort, const ushort remotePort,
sockaddr_in &localTCPAddr, sockaddr_in &remoteTCPAddr)
{
const int skt = socket(AF_INET, SOCK_STREAM, 0);
if (0 > skt) {
clog << "Failed to create socket: (errno " << errno
<< ") " << strerror(errno) << endl;
throw;
}
setSocketOptions(skt);
// Build the localIp address and bind it to the feedback socket. Although
// it's not traditional for a client to bind the sending socket to the
// local address, we do it to prevent connect() from using an ephemeral port
// (which our site's firewall may block). Also build the remoteIp address.
buildAddr(localTCPAddr, localHost, localPort);
deepBind(skt, localTCPAddr);
buildAddr(remoteTCPAddr, remoteHost, remotePort);
clog << "Info: Command connect family: "
<< (remoteTCPAddr.sin_family == AF_INET ? "AF_INET" : "<unknown>")
<< " len: " << int(remoteTCPAddr.sin_len)
<< " port: " << ntohs(remoteTCPAddr.sin_port)
<< " addr: " << inet_ntoa(remoteTCPAddr.sin_addr) << endl;
if (0 > ::connect(skt, (sockaddr*)& remoteTCPAddr, sizeof(sockaddr_in))) {
switch (errno) {
case EINVAL: {
int value = -1;
socklen_t len = sizeof(value);
getsockopt(skt, SOL_SOCKET, SO_ERROR, &value, &len);
cerr << "Error: Command connect failed on local port "
<< getLocFbPort()
<< " and remote port " << remotePort
<< " to remote host '" << remoteHost
<< "' family: "
<< (remoteTCPAddr.sin_family == AF_INET ? "AF_INET" : "<unknown>")
<< " len: " << int(remoteTCPAddr.sin_len)
<< " port: " << ntohs(remoteTCPAddr.sin_port)
<< " addr: " << inet_ntoa(remoteTCPAddr.sin_addr)
<< ": Invalid argument." << endl;
cerr << "\tgetsockopt => "
<< ((value != 0) ? strerror(value): "success") << endl;
throw;
}
default: {
cerr << "Error: Command connect failed on local port "
<< localPort << " and remote port " << remotePort
<< ": (errno " << errno << ") " << strerror(errno) << endl;
throw;
}
}
}
}
void
setSocketOptions(int skt)
{
// See page 192 of UNIX Network Programming: The Sockets Networking API
// Volume 1, Third Edition by W. Richard Stevens et. al. for info on using
// ::setsockopt().
// According to "Linux Socket Programming by Example" p. 319, we must call
// setsockopt w/ SO_REUSEADDR option BEFORE calling bind.
int so_reuseaddr = 1; // Enabled.
int reuseAddrResult
= ::setsockopt(skt, SOL_SOCKET, SO_REUSEADDR, &so_reuseaddr,
sizeof(so_reuseaddr));
if (reuseAddrResult != 0) {
cerr << "Failed to set reuse addr on socket.";
throw;
}
// For every two hours of inactivity, a keepalive occurs.
int so_keepalive = 1; // Enabled. See page 200 for info on SO_KEEPALIVE.
int keepAliveResult =
::setsockopt(skt, SOL_SOCKET, SO_KEEPALIVE, &so_keepalive,
sizeof(so_keepalive));
if (keepAliveResult != 0) {
cerr << "Failed to set keep alive on socket.";
throw;
}
struct linger so_linger;
so_linger.l_onoff = 1; // Turn linger option on.
so_linger.l_linger = 5; // Linger time in seconds. (See page 202)
int lingerResult
= ::setsockopt(skt, SOL_SOCKET, SO_LINGER, &so_linger,
sizeof(so_linger));
if (lingerResult != 0) {
cerr << "Failed to set linger on socket.";
throw;
}
// Disable the Nagle algorithm on the command channel. SOL_TCP is not
// defined on FreeBSD
#ifndef SOL_TCP
#define SOL_TCP (::getprotobyname("TCP")->p_proto)
#endif
unsigned int tcpNoDelay = 1;
int noDelayResult
= ::setsockopt(skt, SOL_TCP, TCP_NODELAY, &tcpNoDelay,
sizeof(tcpNoDelay));
if (noDelayResult != 0) {
cerr << "Failed to set tcp no delay on socket.";
throw;
}
}
void
buildAddr(sockaddr_in &addr, const std::string &ip, const ushort port)
{
memset(&addr, 0, sizeof(sockaddr_in)); // Clear all fields.
addr.sin_len = sizeof(sockaddr_in);
addr.sin_family = AF_INET; // Set the address family
addr.sin_port = htons(port); // Set the port.
if (0 == inet_aton(ip.c_str(), &addr.sin_addr)) {
cerr << "BuildAddr IP.";
throw;
}
};
void
deepBind(const int skt, const sockaddr_in &addr)
{
// Bind the requested port.
if (0 <= ::bind(skt, (sockaddr *)&addr, sizeof(addr))) {
return;
}
// If the port is already in use, wait up to 100 seconds.
int count = 0;
ushort port = ntohs(addr.sin_port);
while ((errno == EADDRINUSE) && (count < 10)) {
clog << "Waiting for port " << port << " to become available..."
<< endl;
::sleep(10);
++count;
if (0 <= ::bind(skt, (sockaddr*)&addr, sizeof(addr))) {
return;
}
}
cerr << "Error: failed to bind port.";
throw;
}
Here is example output when EINVAL (it doesn't always fail here, sometimes it succeeds and fails on the first packet sent over the socket getting scrambled):
Info: Command connect family: AF_INET len: 16 port: 2357 addr: 10.34.49.13
Error: Command connect failed on local port 2355 and remote port 2357 to remote host '10.34.49.13' family: AF_INET len: 16 port: 2357 addr: 10.34.49.13: Invalid argument.
getsockopt => success
I figured out what the issue was. I was first getting an ECONNREFUSED. On Linux I can just retry the connect() on the same socket after a short pause and all is well, but on FreeBSD that retry of connect() fails with EINVAL.
The solution is, on ECONNREFUSED, to back up further and retry from the beginning of the test() definition above, so that each attempt starts with a new socket. With this change, the code now works properly.
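A minimal sketch of that retry structure follows. It is not the original code: connect_fresh_each_try is a hypothetical name, and the socket options and the local bind from test() are left out for brevity. The point it illustrates is that the socket is created inside the loop and discarded after a failed connect(), so no attempt reuses a descriptor that has already failed.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: tries to connect up to `attempts` times,
// creating a new socket for every attempt. Returns a connected
// descriptor, or -1 with errno set to the last connect() failure.
int connect_fresh_each_try(const sockaddr_in& remote, int attempts,
                           unsigned pause_seconds)
{
    assert(attempts > 0);
    int last_error = 0;
    for (int i = 0; i < attempts; ++i) {
        const int skt = ::socket(AF_INET, SOCK_STREAM, 0);
        if (skt < 0) {
            return -1;
        }
        if (::connect(skt, reinterpret_cast<const sockaddr*>(&remote),
                      sizeof(remote)) == 0) {
            return skt;
        }
        last_error = errno;
        ::close(skt);                    // never reuse the failed socket
        if (last_error != ECONNREFUSED) {
            break;                       // only a refused connection is retried
        }
        if (i + 1 < attempts) {
            ::sleep(pause_seconds);
        }
    }
    errno = last_error;
    return -1;
}
```

In the real code, setSocketOptions() and deepBind() would be called on each new socket before connect(), exactly as test() does on its first attempt.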
It's interesting that the FreeBSD connect() manpage doesn't list EINVAL. A different BSD manpage states:
[EINVAL] An invalid argument was detected (e.g., address_len is
not valid for the address family, the specified
address family is invalid).
Based on the disparate documentation from the different BSD flavours floating around, I would venture that there may be undocumented return code possibilities in FreeBSD, see here for example.
My advice is to print out your address length and the sizeof and contents of your socket address structure before calling connect - this will hopefully assist you to find out what's wrong.
Beyond that, it's probably best if you show us the code you use to set up the connection. This includes the type used for the socket address (struct sockaddr, struct sockaddr_in, etc), the code which initialises it, and the actual call to connect. That'll make it a lot easier to assist.
What’s the local address? You’re silently ignoring errors from bind(2), which seems like not only bad form, but could be causing this issue to begin with!