I have non blocking UDP socket in perl created this way
my $my_sock = IO::Socket::INET->new(LocalPort => $MY_PORT,
Proto => 'udp',
Blocking => '0') or die "socket: $#";
The recv call is
my $retValue = $sock->recv($my_message, 64);
I need to know
a) when there is no data left to read
b) if there is data, how much data was read
c) any error conditions
Surprisingly, I didn't see any return value for recv in perldoc. When I tried it myself, recv returns undef in (a), for b it is an unprintable character
This seems to be an elementary issue. However, I still cannot find the info on googling or on stack overflow.Thanks for any inputs
According to the perldoc, recv "returns the address of the sender if SOCKET's protocol supports this; returns an empty string otherwise. If there's an error, returns the undefined value."
If you are getting an undef, this means recv is encountering an error.
The error in your code is in the following line:
$retValue = $sock->recv($my_message, 64);
The function prototype for recv is:
recv SOCKET,SCALAR,LENGTH,FLAGS
According to perldoc, recv "Attempts to receive LENGTH characters of data into variable SCALAR from the specified SOCKET filehandle. SCALAR will be grown or shrunk to the length actually read."
Try:
$retvalue = recv($sock, $my_message, 64)
This is where I got all the information:
http://perldoc.perl.org/functions/recv.html
The value returned by recv is the address and port that that data was received from
my $hispaddr = $sock->recv($my_message, 64);
if ($retValue) {
my ($port, $iaddr);
($port, $iaddr) = sockaddr_in($hispaddr);
printf("address %s\n", inet_ntoa($iaddr));
printf("port %s\n", $port);
printf("name %s\n", gethostbyaddr($iaddr, AF_INET));
}
The length of the data returned can be determined with
length($my_message);
Related
I need to calculate a CRC16 of N-bytes (5 in the example, for the sake of simplicity) extracted from a binary file of size M (a pair of Kb, not so relevant for my scopes).
printf "offset\tvalue\tcrc16\n";
#Read N bytes from file and copy in the container
for my $counter (0 .. 5- 1)
{
my $oneByte;
read(FH, $oneByte, 1) or die "Error reading $inFile!";
my $ctx2 = Digest::CRC->new( type => 'crc16' );
my $digest2 = ($ctx2->add($oneByte))->hexdigest;
# PRINT for debugging
printf "0x%04X\t0x%02X\t0x", $counter, ord $oneByte;
print $digest2, "\n";
}
Considering this binary input
I obtain the result:
The script is performing byte-by-byte CRC16 (correct by the way), but I need the CRC16 of the full binary stream of 5 bytes (the expected value should be 0x6CD6).
Where am I wrong in the script?
Calling hexdigest or digest or b64digest clears the buffer and begins the next digest from scratch. (If you were computing digests of several files/streams, you wouldn't want the data from one stream to affect the digest of a separate stream).
So wait until the stream is completely read to call digest
... {
...
$ctx2->add($oneByte);
}
print "digest = ", $ctx2->hexdigest, "\n";
Or to help in debugging, save the stream and redigest the stream after each new byte
my $manyBytes = "";
... {
...
$manyBytes .= $oneByte;
$digest2 = $ctx2->add($manyBytes)->hexdigest;
...
}
You can use ->add. You can either pass the whole string at once, chunk by chunk, or character by character.
$ perl -M5.010 -MDigest::CRC -e'
my $d = Digest::CRC->new( type => "crc16" );
$d->add("\x49\x34\x49\x31\x31");
say $d->hexdigest;
'
6cd6
$ perl -M5.010 -MDigest::CRC -e'
my $d = Digest::CRC->new( type => "crc16" );
$d->add($_) for "\x49", "\x34", "\x49", "\x31", "\x31";
say $d->hexdigest;
'
6cd6
As shown, use a single object, and add every byte before calling ->digest (etc) as this resets the process.
I am running a simple perl socket client that queries a server, and then tries to close the socket once a keyword or phrase is encountered.
...
local $\ = "\x{0d}";
while($line = <$sock>){
print $line."\n";
last if ($line =~ /C6/);
}
...
close($sock);
I'd be happy to terminate on either the (0x0d) or the "C6" string - they both terminate the message. I'm monitoring it with Wireshark, and both triggers occur at the end of the message, yet I can't break out of the while loop, either with a break or last, nor does $line ever print.
Ideas? TIA
You don't exit when you receive C6 without receiving a Carriage Return (or EOF) because your code always waits for a Carriage Return (or EOF). Fix:
# sysread returns as soon as data is available, so this is a just a maximum.
use constant BLOCK_SIZE => 4*1024*1024;
my $buf = '';
while (1) {
my $rv = sysread($sock, $buf, length($buf), 4*1024*1024);
die($!) if !defined($rv);
last if !$rv;
process_message($1)
while $buf =~ s/^( (?: [^C\x0D] | C (?=[^6]) )*+ (?: C6 | \x0D ) )//xs;
}
die("Premature EOF") if length($buf);
I think the root of your problem here is that you've got $\ set, which is the output record separator, rather than $/ which is the input record separator.
So your while is waiting for a \n to occur before handing $line onto the rest of the loop.
But failing that, there's also a question of buffering and autoflushing on your socket.
And ... when you say you're monitoring with wireshark, how sure are you that those values are part of the packet content payload rather than part of the packet? Do you actually get \n sent from the server as part of the packet at any point?
I have a small pcap listener i made in perl. And whenever i get anything over 1500 Bytes, it just prints as 1500.
66.0.X.X 1500
Now i get that reply when i sent a 2000 byte packet, i also tried sending 1600, and 10k.
No matter what i get 1500 for pcap.
How can i fix this i looked at SNAPLEN but when i set it under or over, i get the same results.
I have also tried tcpdump and i get weird results, as you can see the first "length" is 1500, but the second a line under is 5000 which is what i sent.
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
21:59:06.142530 IP (tos 0x0, ttl 58, id 45206, offset 0, flags [+], proto UDP (17), length 1500)
37.X.X.X.48254 > XXX.54: UDP, length 5000
my code:
use Net::Pcap;
use NetPacket::Ethernet;
use NetPacket::Ethernet qw(:strip);
use NetPacket::IP;
use NetPacket::IP;
use NetPacket::TCP;
use strict;
my $err;
# Use network device passed in program arguments or if no
# argument is passed, determine an appropriate network
# device for packet sniffing using the
# Net::Pcap::lookupdev method
my $dev = $ARGV[0];
unless (defined $dev) {
$dev = Net::Pcap::lookupdev(\$err);
if (defined $err) {
die 'Unable to determine network device for monitoring - ', $err;
}
}
# Look up network address information about network
# device using Net::Pcap::lookupnet - This also acts as a
# check on bogus network device arguments that may be
# passed to the program as an argument
my ($address, $netmask);
if (Net::Pcap::lookupnet($dev, \$address, \$netmask, \$err)) {
die 'Unable to look up device information for ', $dev, ' - ', $err;
}
# Create packet capture object on device
my $object;
$object = Net::Pcap::open_live($dev, 65535, 1, 0, \$err);
unless (defined $object) {
die 'Unable to create packet capture on device ', $dev, ' - ', $err;
}
# Compile and set packet filter for packet capture
# object - For the capture of TCP packets with the SYN
# header flag set directed at the external interface of
# the local host, the packet filter of '(dst IP) && (tcp
# [13] & 2 != 0)' is used where IP is the IP address of
# the external interface of the machine. For
# illustrative purposes, the IP address of 127.0.0.1 is
# used in this example.
my $filter;
Net::Pcap::compile(
$object,
\$filter,
'(port 111)',
0,
$netmask
) && die 'Unable to compile packet capture filter';
Net::Pcap::setfilter($object, $filter) &&
die 'Unable to set packet capture filter';
# Set callback function and initiate packet capture loop
Net::Pcap::loop($object, -1, \&process_packet, '') ||
die 'Unable to perform packet capture';
Net::Pcap::close($object);
sub process_packet {
my ($user_data, $hdr, $pkt) = #_;
my $ip_obj = NetPacket::IP->decode(eth_strip($pkt));
#print("$ip_obj->{src_ip} -> $ip_obj->{dest_ip} $ip_obj->{caplen}\n");
warn "packet!\n";
my %header = %$hdr;
#process_packet(\%header, $pkt);
my $len = length $pkt;
my $fag = length $user_data;
my $fag2 = length $hdr;
warn "$header{len} $header{caplen} $len $fag $fag2\n";
}
From "listening on eth1" I infer you're capturing on an Ethernet. The largest packet size on Ethernet is 1518 bytes (except for non-standard "jumbo frames"), which is:
14 bytes of Ethernet header;
1500 bytes of Ethernet payload;
4 bytes of frame check sequence at the end.
This means that (unless the network is using jumbo frames) the largest IP packet you can send on an Ethernet is 1500 bytes.
So, if you try to send 5000 bytes of data over UDP-over-IPv4 on an Ethernet, that will become a 5008-byte UDP packet when the 8-byte UDP header is added, and that will become a 5028-byte or larger IPv4 packet when the IPv4 header is added (the minimum size of an IPv4 header is 20 bytes, and it can be bigger if there are options in the packet). That's too big for Ethernet, so the IP layer in the protocol stack on your machine will "fragment" that packet into multiple smaller IP packets, and, if all of those fragments arrive at the destination machine, its IP layer will reassemble them into a larger IP packet and then hand the IP payload of the reassembled packet to the UDP layer, so the program receiving the UDP packet will see all 5000 bytes.
Your tcpdump output is the output for the first of the fragments; the IP-layer total length of the packet, extracted from the IP header, is 1500 (IP's length field includes the length of the IP headers and the payload), but the UDP-layer length, extracted from the UDP header (which will normally fit in the first fragment), is 5000.
This has nothing to do with the snapshot length; it has to do with the way IP works, and the maximum packet size on Ethernet.
Upon advice from the comments, I decided to try Net::Pcap::Easy instead of Net::Pcap. As stated in the pod for the former module: "Net::Pcap is awesome, but it's difficult to bootstrap".
The following code solves my problem:
use strict;
use warnings;
use Net::Pcap::Easy;
# all arguments to new are optoinal
my $npe = Net::Pcap::Easy->new(
dev => "eth1",
filter => "port 111",
packets_per_loop => 10,
bytes_to_capture => 1024,
timeout_in_ms => 0, # 0ms means forever
promiscuous => 0, # true or false
udp_callback => sub {
my ($npe, $ether, $ip, $udp, $header ) = #_;
print "UDP: $ip->{src_ip}:$udp->{src_port}"
. " -> $ip->{dest_ip}:$udp->{dest_port} $udp->{len}\n";
},
# tcp_callback => sub {
# my ($npe, $ether, $ip, $tcp, $header ) = #_;
# my $xmit = localtime( $header->{tv_sec} );
#
# print "$xmit TCP: $ip->{src_ip}:$tcp->{src_port}"
# . " -> $ip->{dest_ip}:$tcp->{dest_port}\n";
#
# },
#
# icmp_callback => sub {
# my ($npe, $ether, $ip, $icmp, $header ) = #_;
# my $xmit = localtime( $header->{tv_sec} );
#
# print "$xmit ICMP: $ether->{src_mac}:$ip->{src_ip}"
# . " -> $ether->{dest_mac}:$ip->{dest_ip}\n";
# },
);
1 while $npe->loop;
I am using IO::Select to keep track of a variable number of file handles for reading. Documentation I've come across strongly suggests not to combine the select statement with <> (readline) for reading from the file handles.
My situation:
I will only ever use each file handle once, i.e. when the select offers me the file handle, it will be completely used and then removed from the select. I will be receiving a hash and a variable number of files. I do not mind if this blocks for a time.
For more context, I am a client sending information to be processed by my servers. Each file handle is a different server I'm talking to. Once the server is finished, a hash result will be sent back to me from each one. Inside that hash is a number indicating the number of files to follow.
I wish to use readline in order to integrate with existing project code for transferring Perl objects and files.
Sample code:
my $read_set = IO::Select()->new;
my $count = #agents_to_run; #array comes as an argument
for $agent ( #agents_to_run ) {
( $sock, my $peerhost, my $peerport )
= server($config_settings{ $agent }->
{ 'Host' },$config_settings{ $agent }->{ 'Port' };
$read_set->add( $sock );
}
while ( $count > 0) {
my #rh_set = IO::Select->can_read();
for my $rh ( #{ $rh_set } ) {
my %results = <$rh>;
my $num_files = $results{'numFiles'};
my #files = ();
for (my i; i < $num_files; i++) {
$files[i]=<$rh>;
}
#process results, close fh, decrement count, etc
}
}
Using readline (aka <>) is quite wrong for two reasons: It's buffered, and it's blocking.
Buffering is bad
More precisely, buffering using buffers that cannot be inspected is bad.
The system can do all the buffering it wants, since you can peek into its buffers using select.
Perl's IO system cannot be allowed to do any buffering because you cannot peek into its buffers.
Let's look at an example of what can happen using readline in a select loop.
"abc\ndef\n" arrives on the handle.
select notifies you that there is data to read.
readline will try to read a chunk from the handle.
"abc\ndef\n" will be placed in Perl's buffer for the handle.
readline will return "abc\n".
At this point, you call select again, and you want it to let you know that there is more to read ("def\n"). However, select will report there is nothing to read since select is a system call, and the data has already been read from the system. That means you will have to wait for more to come in before being able to read "def\n".
The following program illustrates this:
use IO::Select qw( );
use IO::Handle qw( );
sub producer {
my ($fh) = #_;
for (;;) {
print($fh time(), "\n") or die;
print($fh time(), "\n") or die;
sleep(3);
}
}
sub consumer {
my ($fh) = #_;
my $sel = IO::Select->new($fh);
while ($sel->can_read()) {
my $got = <$fh>;
last if !defined($got);
chomp $got;
print("It took ", (time()-$got), " seconds to get the msg\n");
}
}
pipe(my $rfh, my $wfh) or die;
$wfh->autoflush(1);
fork() ? producer($wfh) : consumer($rfh);
Output:
It took 0 seconds to get the msg
It took 3 seconds to get the msg
It took 0 seconds to get the msg
It took 3 seconds to get the msg
It took 0 seconds to get the msg
...
This can be fixed using non-buffered IO:
sub consumer {
my ($fh) = #_;
my $sel = IO::Select->new($fh);
my $buf = '';
while ($sel->can_read()) {
sysread($fh, $buf, 64*1024, length($buf)) or last;
while ( my ($got) = $buf =~ s/^(.*)\n// ) {
print("It took ", (time()-$got), " seconds to get the msg\n");
}
}
}
Output:
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
It took 0 seconds to get the msg
...
Blocking is bad
Let's look at an example of what can happen using readline in a select loop.
"abcdef" arrives on the handle.
select notifies you that there is data to read.
readline will try to read a chunk from the socket.
"abcdef" will be placed in Perl's buffer for the handle.
readline hasn't received a newline, so it tries to read another chunk from the socket.
There is no more data currently available, so it blocks.
This defies the purpose of using select.
[ Demo code forthcoming ]
Solution
You have to implement a version of readline that doesn't block, and only uses buffers you can inspect. The second part is easy because you can inspect the buffers you create.
Create a buffer for each handle.
When data arrives from a handle, read it but no more. When data is waiting (as we know from select), sysread will return what's available without waiting for more to arrive. That makes sysread perfect for this task.
Append the data read to the appropriate buffer.
For each complete message in the buffer, extract it and process it.
Adding a handle:
$select->add($fh);
$clients{fileno($fh)} = {
buf => '',
...
};
select loop:
use experimental qw( refaliasing declared_refs );
while (my #ready = $select->can_read) {
for my $fh (#ready) {
my $client = $clients{fileno($fh)};
my \$buf = \($client->{buf}); # Make $buf an alias for $client->{buf}
my $rv = sysread($fh, $buf, 64*1024, length($buf));
if (!$rv) {
delete $clients{fileno($fh)};
$sel->remove($fh);
if (!defined($rv)) {
... # Handle error
}
elsif (length($buf)) {
... # Handle eof with partial message
}
else {
... # Handle eof
}
next;
}
while ( my ($msg) = $buf =~ s/^(.*)\n// )
... # Process message.
}
}
}
By the way, this is much easier to do using threads, and this doesn't even handle writers!
Note that IPC::Run can do all the hard work for you if you're communicating with a child process, and that asynchronous IO can be used as an alternative to select.
After much discussion with #ikegami, we determined that in my extremely specific case the readline is actually not an issue. I'm still leaving ikegami's as the accepted right answer because it is far and away the best way to handle the general situation, and a wonderful writeup.
Readline (aka <>) is acceptable in my situation due to the following facts:
The handle is only returned once from the select statement, and then it is closed/removed
I only send one message through the file handle
I do not care if read handles block
I am accounting for timeouts and closed handle returns from select (error checking not included in the sample code above)
So I have a bit of problem figuring what Perl does in the following case:
while(1){
$inputLine=<STDIN>
#parse $inputLine below
#BUT FIRST, I need to check if $inputLine = EOF
}
before I get the obvious answer of using while(<>){}, let me say that there is a very strong reason that I have to do the above (basically setting up an alarm to interrupt blocking and I didnt want that code to clutter the example).
Is there someway to compare $inputLine == undef (as I think that is what STDIN returns at the end).
Thanks.
Inside your loop, use
last unless defined $inputLine;
From the perlfunc documentation on defined:
defined EXPR
defined
Returns a Boolean value telling whether EXPR has a value other than the undefined value undef. If EXPR is not present, $_ will be checked.
Many operations return undef to indicate failure, end of file, system error, uninitialized variable, and other exceptional conditions. This function allows you to distinguish undef from other values. (A simple Boolean test will not distinguish among undef, zero, the empty string, and "0", which are all equally false.) Note that since undef is a valid scalar, its presence doesn't necessarily indicate an exceptional condition: pop returns undef when its argument is an empty array, or when the element to return happens to be undef.
defined($inputLine)
Also, see the 4 argument version of the select function for an alternative way to read from a filehandle without blocking.
You can use eof on the filehandle. eof will return 1 if the next read on FILEHANDLE is an EOF.
The following will have problems with input files that have lines which only have a line feed or as in the case that was giving me problems a FF at the beginning of some lines (Form Feed - the file was the output from a program developed at the end of the 70s and still has formatting for a line printer and is still in FORTRAN - I do miss the wide continous paper for drawing flow diagrams on the back).
open (SIMFIL, "<", 'InputFileName') or die "Can´t open InputFileName\n" ;
open (EXTRDATS, ">>", 'OutputFileName' ) or die "Can´t open OutputFileName\n";
$Simfilline = "";
while (<SIMFIL>) {
$Simfilline = <SIMFIL>;
print EXTRDATS $Simfilline;
$Simfilline = <SIMFIL>;
print EXTRDATS $Simfilline;
}
close SIMFIL;
close EXTRDATS;
`
The following is when eof comes in handy - the expression: "while ()" can return false under conditions other than the end of the file.
open (SIMFIL, "<", 'InputFileName') or die "Can´t open InputFileName\n" ;
open (EXTRDATS, ">>", 'OutputFileName' ) or die "Can´t open OutputFileName\n";
$Simfilline = "";
while (!eof SIMFIL) {
$Simfilline = <SIMFIL>;
print EXTRDATS $Simfilline;
$Simfilline = <SIMFIL>;
print EXTRDATS $Simfilline;
}
close SIMFIL;
close EXTRDATS;
This last code fragment appears to duplicate the input file exactly.