Perl TCP socket communication - perl

I am trying to send a TCP packet to a host and get a response, and I am obviously not doing something right.
I've compared my code against many posts and even my Perl Cookbook, and I can't figure out where I'm going wrong. I'm simply trying the following:
use strict;
use warnings;
use IO::Socket::INET;
$| = 1;
my $socket = IO::Socket::INET->new(
    PeerHost => '<the_host_name>',
    PeerPort => '33792',
    Proto    => 'tcp',
);
unless ($socket) {
    # stop here: every later call needs a connected socket
    die "couldn't connect to the server: $@\n";
}
# data to send to a server
my $req = <<SBCEND;
START
AUTOLOGUE CENTRAL
TYPE|QUOTE|
H|1|1.0|Regular|CANCEL|6072|TESTDT|JOHNAIX| | |26000|DEMO| | | | | | | | |0
P|1|HAS|LF115|HAS|LF115|10| | | | | | | | | | | | | | |
P|2|WIX|51515|WIX|51515|24| | | | | | | | | | | | | | |
P|3|FRA|PH8A|FRA|PH8A|2| | | | | | | | | | | | | | |
END
SBCEND
my $message = "\002".$req."\003";
my $size = $socket->send($message);
shutdown($socket, 1);
my $response = "";
$socket->recv($response, 1024);
$socket->close();
print "sent: $size\n$req\n";
print "received response: $response\n";
I get no response at all, and I'm not sure whether I'm supposed to get one or whether something is wrong with my request.
When I ask their support if I'm supposed to get any response regardless, they send me the response I should receive from the docs if the request is valid.
The above request data is an example they sent to me. They did tell me I needed binary 2 at the beginning of my request and binary 3 after, hence my message variable above.
Can someone tell me if I'm doing something wrong here?

It turns out I needed a line break (\r\n) after the binary 2, which is a small change to this line:
my $message = "\002\r\n".$req."\003";
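A minimal sketch of the framing described above, assuming the server expects binary 2 (STX), a CRLF, the payload, and then binary 3 (ETX). The helper names frame_message and read_until_etx are hypothetical, and the idea that the reply also ends with ETX is an assumption, not something the vendor docs quoted here confirm. The read loop is included because a single recv of 1024 bytes may return only part of a reply.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a payload as STX, CRLF, payload, ETX.
sub frame_message {
    my ($payload) = @_;
    return "\x02\r\n" . $payload . "\x03";
}

# Hypothetical helper: keep reading until ETX shows up or the peer closes.
# Assumes (unconfirmed) that the server terminates its reply with ETX.
sub read_until_etx {
    my ($sock) = @_;
    my $buf = '';
    while (1) {
        my $n = sysread($sock, my $chunk, 1024);
        last unless $n;                      # undef on error, 0 on EOF
        $buf .= $chunk;
        last if index($buf, "\x03") >= 0;    # ETX seen, reply is complete
    }
    return $buf;
}

# Usage with the $socket and $req from the question:
#   print {$socket} frame_message($req);
#   my $response = read_until_etx($socket);
```

Reading in a loop also means the shutdown call is only needed if the server waits for end-of-input before replying.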

Fill empty rows of set with non empty value

Note: I've already gone over related questions like the following, which don't address my query:
SQL: how to pick one row for each set of rows with duplicate value in one column?
Fill missing values with first non-null following value in Redshift
I have a sparse, unclean dataset like this
| id | operation | title | channel_type | mode |
|-----|-----------|----------|--------------|------|
| abc | Start | | | |
| abc | Start | recovery | | Link |
| abc | Start | recovery | SMS | |
| abc | Set | | Email | |
| abc | Verify | | Email | |
| pqr | Start | | | OTP |
| pqr | Verfiy | sign_in | Push | |
| pqr | Verify | | | |
| xyz | Start | sign_up | | Link |
and I need to fill up empty rows of each id with non-empty data available from other rows
| id | operation | title | channel_type | mode |
|-----|-----------|----------|--------------|------|
| abc | Start | recovery | SMS | Link |
| abc | Start | recovery | SMS | Link |
| abc | Start | recovery | SMS | Link |
| abc | Set | recovery | Email | Link |
| abc | Verify | recovery | Email | Link |
| pqr | Start | sign_in | Push | OTP |
| pqr | Verfiy | sign_in | Push | OTP |
| pqr | Verify | sign_in | Push | OTP |
| xyz | Start | sign_up | | Link |
Notes:
Some ids can have a certain field empty in all rows.
While most ids will have the same non-empty value for each field, edge cases could have different values. For such groups, filling any non-empty value into all rows is acceptable. [This is too rare in my dataset and can be ignored.]
Another bit of pattern is that certain fields are mostly present only against rows of certain operations, e.g. mode is only present against operation='Start' rows.
I've tried grouping rows by id while performing listagg over title, channel_type and mode columns, followed by coalesce, something along the lines of this:
WITH my_data AS (
SELECT
id,
operation,
title,
channel_type,
mode
FROM
my_db.my_table
),
list_aggregated_data AS (
SELECT
id,
listagg(title) AS titles,
listagg(channel_type) AS channel_types,
listagg(mode) AS modes
FROM
my_data
GROUP BY
id
),
coalesced_data AS (
SELECT DISTINCT
id,
coalesce(titles) AS title,
coalesce(channel_types) AS channel_type,
coalesce(modes) AS mode
FROM
list_aggregated_data
),
joined_data AS (
SELECT
md.id,
md.operation,
cd.title,
cd.channel_type,
cd.mode
FROM
my_data AS md
LEFT JOIN
coalesced_data AS cd ON cd.id = md.id
)
SELECT
*
FROM
joined_data
ORDER BY
id,
operation
But for some reason this is resulting in concatenation of values (presumably from coalesce operation), where I get
| id | operation | title | channel_type | mode |
|-----|-----------|------------------|--------------|------|
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Start | recoveryrecovery | SMS | Link |
| abc | Set | recoveryrecovery | Email | Link |
| abc | Verify | recoveryrecovery | Email | Link |
| pqr | Start | sign_in | Push | OTP |
| pqr | Verfiy | sign_in | Push | OTP |
| pqr | Verify | sign_in | Push | OTP |
| xyz | Start | sign_up | | Link |
What's the correct way to approach this problem?
I'd start with the first_value() window function with the ignore nulls option. You will partition by the first 2 columns and will need to work out the edge cases with some data massaging, likely in the order by clause of the window function.

Why can memory leaking by cross-reference be solved by explicit reassignment in Perl?

Cross-references cause a memory leak in Perl, like this:
{
    my @a = qw(a b c);
    my @b = qw(a b c);
    # both reference counts are 1
    push @a, \@b;
    # @b's reference count is 2 (from @b and via @a)
    push @b, \@a;
}
# @b's reference count is 2 (via @a)
I understand the memory leak caused by the cross-reference in this situation.
But the leak can be resolved by an explicit reassignment, like this:
{
    my @a = qw(a b c);
    my @b = qw(a b c);
    # both reference counts are 1
    push @a, \@b;
    # @b's reference count is 2 (from @b and via @a)
    push @b, \@a;
    @a = ();
}
# why is @b's reference count 0?
@a is lexically scoped, so I would expect the reference held through @a to become invalid at scope exit even without the reassignment. Yet the former leaks memory and the latter does not. Why?
You start with
@a                    @b
|  ARRAY              |  ARRAY
|  REFCNT=2           |  REFCNT=2
+-->+-----------+     +-->+-----------+
|   | +-------+ |     |   | +-------+ |
|   | | a     | |     |   | | a     | |
|   | +-------+ |     |   | +-------+ |
|   | | b     | |     |   | | b     | |
|   | +-------+ |     |   | +-------+ |
|   | | c     | |     |   | | c     | |
|   | +-------+ |     |   | +-------+ |
|   | |       --------+   | |   --------+
|   | +-------+ |         | +-------+ | |
|   +-----------+         +-----------+ |
|                                       |
+---------------------------------------+
If you were to exit the scope here, the reference counts would drop to one, and they would leak.
After @a = ();:
@a                    @b
|  ARRAY              |  ARRAY
|  REFCNT=2           |  REFCNT=1
+-->+-----------+     +-->+-----------+
|   |           |         | +-------+ |
|   |           |         | | a     | |
|   |           |         | +-------+ |
|   |           |         | | b     | |
|   |           |         | +-------+ |
|   |           |         | | c     | |
|   |           |         | +-------+ |
|   |           |         | |   --------+
|   |           |         | +-------+ | |
|   +-----------+         +-----------+ |
|                                       |
+---------------------------------------+
Note that @b's reference count went from two to one.
On scope exit, @a's reference count will drop to one, and @b's reference count will drop to zero.[1] This will free @b, which will cause @a's reference count to drop to zero. And that will free @a.
No cycle, so no memory leak.
At least in theory. In practice, what actually happens is a bit different as an optimization. But those are internal details that aren't relevant here.
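The effect can be observed with a small sketch. It swaps the lexical arrays for array references blessed into a hypothetical Tracked class, purely so that a DESTROY method can count how many arrays are actually freed; the names Tracked, make_cycle and count_freed are illustrative assumptions, not part of the question.

```perl
use strict;
use warnings;

my $destroyed = 0;

# Hypothetical class whose only job is to count destructions.
{
    package Tracked;
    sub DESTROY { $destroyed++ }
}

# Build two arrays that reference each other, optionally emptying one.
sub make_cycle {
    my ($break_cycle) = @_;
    my $x = bless [qw(a b c)], 'Tracked';
    my $y = bless [qw(a b c)], 'Tracked';
    push @$x, $y;
    push @$y, $x;
    @$x = () if $break_cycle;    # drops the reference from $x's array to $y's
    return;
}

# Count how many arrays were freed by the time make_cycle returned.
sub count_freed {
    my ($break_cycle) = @_;
    $destroyed = 0;
    make_cycle($break_cycle);
    return $destroyed;
}

print "freed without reassignment: ", count_freed(0), "\n";    # 0
print "freed with reassignment:    ", count_freed(1), "\n";    # 2
```

Where the cycle has to stay intact, Scalar::Util's weaken on one of the two references is the usual alternative to emptying an array.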

Cygnus doesn't write on CartoDB

I'm trying to integrate Cygnus with CartoDB, but when Cygnus receives an Orion notification it doesn't store the information in CartoDB.
The log trace follows:
time=2016-12-19T14:37:13.657Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=intercept | msg=com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor[127] : [gi] Event put in the channel, id=1724500127
time=2016-12-19T14:37:13.658Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=debug | msg=org.mortbay.log.Slf4jLog[40] : RESPONSE /notify 200
time=2016-12-19T14:37:13.659Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=debug | msg=org.mortbay.log.Slf4jLog[40] : EOF
time=2016-12-19T14:37:15.165Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=processNewBatches | msg=com.telefonica.iot.cygnus.sinks.NGSISink[509] : Batch completed, persisting it
time=2016-12-19T14:37:15.166Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=persistBatch | msg=com.telefonica.iot.cygnus.sinks.NGSICartoDBSink[333] : [cartodb-sink] Processing sub-batch regarding the default_/_waste1_wastectr destination
time=2016-12-19T14:37:15.166Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=aggregate | msg=com.telefonica.iot.cygnus.sinks.NGSICartoDBSink$CartoDBAggregator[508] : [cartodb-sink] Processing context element (id=waste1, type=wastectr)
time=2016-12-19T14:37:15.166Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=aggregate | msg=com.telefonica.iot.cygnus.sinks.NGSICartoDBSink$CartoDBAggregator[530] : [cartodb-sink] Processing context attribute (name=category, type=StructuredValue)
time=2016-12-19T14:37:15.167Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=aggregate | msg=com.telefonica.iot.cygnus.sinks.NGSICartoDBSink$CartoDBAggregator[530] : [cartodb-sink] Processing context attribute (name=status, type=Text)
time=2016-12-19T14:37:15.171Z | lvl=DEBUG | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=processNewBatches | msg=com.telefonica.iot.cygnus.sinks.NGSISink[523] : [java.util.ArrayList.rangeCheck(Unknown Source), java.util.ArrayList.get(Unknown Source), com.telefonica.iot.cygnus.sinks.NGSICartoDBSink$CartoDBAggregator.getRows(NGSICartoDBSink.java:410), com.telefonica.iot.cygnus.sinks.NGSICartoDBSink.persistRawAggregation(NGSICartoDBSink.java:552), com.telefonica.iot.cygnus.sinks.NGSICartoDBSink.persistBatch(NGSICartoDBSink.java:362), com.telefonica.iot.cygnus.sinks.NGSISink.processNewBatches(NGSISink.java:510), com.telefonica.iot.cygnus.sinks.NGSISink.process(NGSISink.java:327), org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68), org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147), java.lang.Thread.run(Unknown Source)]
time=2016-12-19T14:37:15.171Z | lvl=WARN | corr=68c76dfc-c5f8-11e6-9346-fa163e00324f | trans=e2827b21-972d-4692-b25a-e1b252961491 | srv=default | subsrv=/ | comp=cygnus-ngsi | op=processNewBatches | msg=com.telefonica.iot.cygnus.sinks.NGSISink[541] : Index: 0, Size: 0
time=2016-12-19T14:37:16.090Z | lvl=DEBUG | corr= | trans= | srv= | subsrv= | comp=cygnus-ngsi | op=run | msg=org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable[126] : Checking file:/usr/cygnus/conf/agent_ngsi_1.conf for changes
The configuration of agent_ngsi_1.conf is
# The next three fields set the sources, sinks and channels used by Cygnus
cygnus-ngsi.sinks = cartodb-sink
cygnus-ngsi.channels = cartodb-channel
# Source configuration
# channel name where to write the notification events
#cygnus-ngsi.sources.http-source.channels = hdfs-channel mysql-channel ckan-channel mongo-channel sth-channel kafka-channel dynamodb-channel postgresql-channel
cygnus-ngsi.sources.http-source.channels = cartodb-channel
# source class, must not be changed
cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnus-ngsi.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
# URL target
cygnus-ngsi.sources.http-source.handler.notification_target = /notify
# default service (service semantic depends on the persistence sink)
cygnus-ngsi.sources.http-source.handler.default_service = default
# default service path (service path semantic depends on the persistence sink)
cygnus-ngsi.sources.http-source.handler.default_service_path = /
# source interceptors, do not change
cygnus-ngsi.sources.http-source.interceptors = ts gi
# TimestampInterceptor, do not change
cygnus-ngsi.sources.http-source.interceptors.ts.type = timestamp
# GroupingInterceptor, do not change
cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
# Grouping rules for the GroupingInterceptor, put the right absolute path to the file if necessary
# see the doc/design/interceptors document for more details
cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
# NGSICartoDBSink configuration
# sink class, must not be changed
cygnus-ngsi.sinks.cartodb-sink.type = com.telefonica.iot.cygnus.sinks.NGSICartoDBSink
# channel name from where to read notification events
cygnus-ngsi.sinks.cartodb-sink.channel = cartodb-channel
# true if the grouping feature is enabled for this sink, false otherwise
cygnus-ngsi.sinks.cartodb-sink.enable_grouping = false
# true if name mappings are enabled for this sink, false otherwise
cygnus-ngsi.sinks.cartodb-sink.enable_name_mappings = false
# true if lower case is wanted to forced in all the element names, false otherwise
cygnus-ngsi.sinks.cartodb-sink.enable_lowercase = false
# select the data_model: dm-by-service-path or dm-by-entity
cygnus-ngsi.sinks.cartodb-sink.data_model = dm-by-entity
# absolute path to the CartoDB file containing the mapping between FIWARE service/CartoDB usernames and CartoDB API Keys
cygnus-ngsi.sinks.cartodb-sink.keys_conf_file = /usr/cygnus/conf/cartodb_keys.conf
# if true the latitude and longitude values are exchanged, false otherwise
#cygnus-ngsi.sinks.cartodb-sink.swap_coordinates = true
# if true, a raw based storage is done, false otherwise
cygnus-ngsi.sinks.cartodb-sink.enable_raw = true
# if true, a distance based storage is done, false otherwise
cygnus-ngsi.sinks.cartodb-sink.enable_distance = false
# number of notifications to be included within a processing batch
#cygnus-ngsi.sinks.cartodb-sink.batch_size = 100
# timeout for batch accumulation
#cygnus-ngsi.sinks.cartodb-sink.batch_timeout = 30
# number of retries upon persistence error
#cygnus-ngsi.sinks.cartodb-sink.batch_ttl = 10
# maximum number of connections allowed for a Http-based HDFS backend
#cygnus-ngsi.sinks.cartodb-sink.backend.max_conns = 500
# maximum number of connections per route allowed for a Http-based HDFS backend
#cygnus-ngsi.sinks.cartodb-sink.backend.max_conns_per_route = 100

Should Exception::Class objects evaluate to false in boolean context

I'm trying Exception::Class for the first time, and something that surprised me is that Exception::Class objects evaluate to true when returned from a function. Shouldn't the default be the opposite?
I know I can change this with overload, but I am wondering if it's a good idea.
sub gethtml {
    return MyException->new( error => 'some error' );
}
my $response = gethtml();
if ($response) {
    # do something with the html
}
else {
    # something went wrong; check if it's an exception object
}
You're confusing exceptions with returning a false value to indicate an error.
Part of the point of exceptions is that they provide their own channel to indicate error. This leaves return free to only return valid values. There's no need to check for false vs defined, or special objects, or do any per-function call error checking at all. It's all caught and dealt with at the end of the block.
If you return an exception object it defeats the point; they're not exceptions, they're just error codes.
To take advantage of exceptions, the code in your example should be written like this:
sub get_html {
    my $html;
    # ...try to get the html...
    return $html if defined $html;
    MyException->throw( error => 'some error' );
}
eval {
    my $html = get_html();
    # do something with $html
};    # the semicolon is required: eval BLOCK is an expression
if ( my $e = Exception::Class->caught() ) {
    # ...deal with the error...
}
This can be made a bit prettier with Try::Tiny.
This makes more sense when you have to do a lot of things which might error, such as a bunch of file, network or database operations. Look into modules such as autodie and Path::Tiny for how that works.
You should not create one with new and return it. Exception classes have a throw method that acts as a constructor and dies automatically.
use strict;
use warnings;
use Exception::Class qw( InputException HTTPException );
use LWP::UserAgent;
use Try::Tiny;

# assuming $ua is an LWP::UserAgent, as the get/is_success calls suggest
my $ua = LWP::UserAgent->new;

sub get_html {
    my ($url) = @_;
    # input validation
    InputException->throw( error => 'no URL' ) unless $url;
    my $res = $ua->get($url);
    if ( $res->is_success ) {
        # do more stuff with $res;
    }
    else {
        HTTPException->throw( error => 'request failed' );
    }
}

# ... later
my $url;
try {
    get_html($url);
}
catch {
    # handle the error which is in $_
    if ( $_->isa('InputException') ) {
        print "You need to supply a URL";
    }
    elsif ( $_->isa('HTTPException') ) {
        print "Could not fetch the HTML because the HTTP request failed.\n";
        print "But I am not telling you why.";
    }
};    # Try::Tiny needs this trailing semicolon
You can then go and catch them (use Try::Tiny for that) or simply wrap it in an eval. But basically those exceptions are simple objects. They are intended as the return value of die and get thrown around, so there is no need to return them anywhere.
Once the program dies, all the scopes on the call stack are exited forcefully until you end up in an eval block (which is what catch does). There, you can handle the error. And since that error is an object, you can do fancy stuff with it.
+--------------------------------------------------------------------+
| sub {                                                              |
| +----------------------------------------------------------------+ |
| | if () {                                                        | |
| | +------------------------------------------------------------+ | |
| | | foo:: sub {                                                | | |
| | | +--------------------------------------------------------+ | | |
| | | | catch {                                                | | | |
| | | | +----------------------------------------------------+ | | | |
| | | | | doo_stuff:: sub {                                  | | | | |
| | | | | +------------------------------------------------+ | | | | |
| | | | | |                                                | | | | | |
| | | | | | MyException->throw ==> die $obj +---------------------------+
| | | | | | do_more_stuff(); # never executed              | | | | | |  |
| | | | | |                                                | | | | | |  |
| | | | | +------------------------------------------------+ | | | | |  |
| | | | +----------------------------------------------------+ | | | |  |
| | | |                                                        | | | |  |
| | | | handle_exception_in_catch($_) <---------------------------------+
| | | | # ( in Try::Tiny the exception ends up in $_ )         | | | |
| | | |                                                        | | | |
| | | +--------------------------------------------------------+ | | |
| | +------------------------------------------------------------+ | |
| +----------------------------------------------------------------+ |
+--------------------------------------------------------------------+
Also see the Exception::Class docs.
If you mix exceptions and regular die or Carp croak calls, you will have to do a lot of checking if stuff is blessed before using ->isa. Safe::Isa comes in handy here.
use strict;
use warnings;
use Exception::Class qw( InputException HTTPException );
use LWP::UserAgent;
use Try::Tiny;
use Safe::Isa;

# assuming $ua is an LWP::UserAgent, as the get/is_success calls suggest
my $ua = LWP::UserAgent->new;

sub get_html {
    my ($url) = @_;
    # input validation
    InputException->throw( error => 'no URL' ) unless $url;
    my $res = $ua->get($url);
    if ( $res->is_success ) {
        # do more stuff with $res;
        die "There is no answer in this HTML" if $res->decoded_content !~ m/42/;
    }
    else {
        HTTPException->throw( error => 'request failed' );
    }
}
With this code, the $_->isa('...') would blow up, because in case of the die call, $_ is not an object and you cannot call the method isa on an unblessed reference (or non-reference). Safe::Isa provides a $_isa, which checks for that first and otherwise just returns false.
my $url;
try {
    get_html($url);
}
catch {
    # handle the error which is in $_
    if ( $_->$_isa('InputException') ) {
        print "You need to supply a URL";
    }
    elsif ( $_->$_isa('HTTPException') ) {
        print "Could not fetch the HTML because the HTTP request failed.\n";
        print "But I am not telling you why.";
    }
};    # Try::Tiny needs this trailing semicolon
For details on how that works, see mst's talk You did what?
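If Safe::Isa is not installed, roughly the same guard can be sketched with the core Scalar::Util module. This is only an illustration of the check described above: the InputException package here is a bare stand-in rather than a real Exception::Class class, and classify_error is a hypothetical helper name.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Stand-in exception class so the sketch runs with core modules only.
{
    package InputException;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
}

# Only call ->isa once we know the error is an object.
sub classify_error {
    my ($err) = @_;
    return 'input' if blessed($err) && $err->isa('InputException');
    return 'plain';
}

my @results;
for my $thrower (
    sub { die InputException->new( error => 'no URL' ) },
    sub { die "There is no answer in this HTML\n" },
) {
    # eval returns 1 on success, so the right-hand side runs only on die
    eval { $thrower->(); 1 } or push @results, classify_error($@);
}
print "@results\n";    # input plain
```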

file output when using perl MIME::Parser

How do I get the path for the option below?
Basically, messages will be parsed into a directory such as "/tmp/msg-1370789006-11903-0", whose name is made up of the time and the process ID. How do I get that into my variable for later use?
### Tell it where to put things:
$parser->output_under("/tmp");
The distro's overall documentation (also found in the distro's README file) contains the following useful information:
Overview of the classes
Here are the classes you'll generally be dealing with directly:
        (START HERE)      results() .-----------------.
              \           .-------->| MIME::          |
         .-----------.   /          | Parser::Results |
         | MIME::    |--'           `-----------------'
         | Parser    |--.           .-----------------.
         `-----------'   \ filer()  | MIME::          |
              | parse()   `-------->| Parser::Filer   |
              | gives you           `-----------------'
              | a...                                | output_path()
              |                                     | determines
              |                                     | path() of...
              |    head()     .--------.            |
              |    returns... | MIME:: | get()      |
              V     .-------->| Head   | etc...     |
         .--------./          `--------'            |
   .---> | MIME:: |                                 |
   `-----| Entity |           .--------.            |
 parts() `--------'\          | MIME:: |           /
 returns            `-------->| Body   |<---------'
 sub-entities    bodyhandle() `--------'
 (if any)        returns...       | open()
                                  | returns...
                                  |
                                  V
                              .--------.  read()
                              | IO::   |  getline()
                              | Handle |  print()
                              `--------'  etc...
Which leads us to MIME::Body's documentation, which includes the following:
### Where's the data?
if (defined($body->path)) { ### data is on disk:
print "data is stored externally, in ", $body->path;
}
else { ### data is in core:
print "data is already in core, and is...\n", $body->as_string;
}
### Get rid of anything on disk:
$body->purge;
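Putting the two pieces together, one way to recover the per-message directory is to take the directory part of the first body path found on disk. The sketch below is an assumption-laden illustration: output_dir_of is a hypothetical helper, it relies only on the parts, bodyhandle and path methods shown in the diagram and snippet above, and it assumes every on-disk body of a message lives in the same directory.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Hypothetical helper: walk the entity tree and return the directory of
# the first body that was written to disk, or undef if none was.
sub output_dir_of {
    my ($entity) = @_;
    my $body = $entity->bodyhandle;    # undef for multipart containers
    if ( $body && defined $body->path ) {
        return dirname( $body->path );
    }
    for my $part ( $entity->parts ) {
        my $dir = output_dir_of($part);
        return $dir if defined $dir;
    }
    return;    # everything was kept in core
}

# Usage, given the $parser from the question:
#   my $entity = $parser->parse(\*STDIN);
#   my $dir    = output_dir_of($entity);
```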