I need to check a list of domain names and get the list of domains with no NS (probably unregistered). I've already found a nice solution with ADNS and adnshost. The command "adnshost -a -tns domain.com" does what I need (The 4th column in the output will contain the result) but I want to achieve the same with Perl and Net::DNS::Async.
My code:
#!/usr/bin/perl -w
use strict;
use utf8;
use Net::DNS::Async;
my $c = new Net::DNS::Async(QueueSize => 1000, Retries => 3);
my $filename = 'domain_list.txt';
open(FH, '<', $filename);
while(<FH>){
chomp($url);
$c->add(\&callback, "$url");
}
$c->await();
sub callback {
my $response = shift;
print $response->string;
}
So, how do I get the needed info with Perl and Net::DNS::Async ?
You can add NS to the add arguments.
while (<FH>) {
chomp;
$c->add(\&callback, $_, 'NS');
}
$c->await();
sub callback {
my $response = shift;
unless ($response->answer) {
my $host = join '.', #{$response->{question}[0]{qname}{label}};
print "$host\n";
}
}
$response->answer will be "empty" if there are no replies.
im using this perl code to transform JSON into other form with some regular expressions:
use strict;
use warnings;
use feature 'say';
use JSON;
use utf8;
my %IDs = ( 'User awx01 logged in.' => 1001 );
my %levels = ( INFO => 4 );
my $data = <DATA>;
my $json = data2json($data);
my $record = decode_json($json);
say rec2msg($record);
sub data2json {
my $json = shift;
$json =~ s/[""]/"/g;
$json =~ s/\\//g;
$json =~ s/"(\{.*?\})"/$1/;
return $json;
}
sub rec2msg {
my $r = shift;
$r->{Message}{message} =~ /(\w+) (\w+) (.+)/;
my($user,$msg) = ($2,"$1 $3");
my $ID = $IDs{$r->{Message}{message}};
my $level = $levels{$r->{Message}{level}};
my $out = "$r->{Message}{'#timestamp'} host CEF:0|OpenSource|AWX|7.0.0|$ID|$msg|$level|src=127.0.0.1 dst=$r->{MessageSourceAddress} duser=$user";
return $out;
}
__DATA__
{"MessageSourceAddress":"192.168.81.20","EventReceivedTime":"2020-02-06 11:55:14","SourceModuleName":"udp","SourceModuleType":"im_udp","SyslogFacilityValue":1,"SyslogFacility":"USER","SyslogSeverityValue":5,"SyslogSeverity":"NOTICE","SeverityValue":2,"Severity":"INFO","EventTime":"2020-02-06 11:55:14","Hostname":"192.168.81.20","Message":"{\"#timestamp\": \"2020-02-06T08:55:52.907Z\", \"message\": \"User awx01 logged in.\", \"host\": \"awxweb\", \"level\": \"INFO\", \"logger_name\": \"awx.api.generics\", \"stack_info\": null, \"type\": \"other\", \"cluster_host_id\": \"awx-contr-01\", \"tower_uuid\": \"333b4131-495f-4460-8e4b-890241a9d73d\"}"}
But im getting this error:
2020-03-31 20:48:50 ERROR perl subroutine rec2msg failed with an error: 'Can't use string ("140511667030448") as a HASH ref while "strict refs" in use at /usr/libexec/nxlog/modules/extension/perl/event1.pl line 21.;'
What im doing wrong? How could i solve it?
You have JSON embedded in JSON, so you need to decode it twice. This often happens when you have one service passing through the response for another service.
Your data2json wasn't decoding that second level, so the value for the Message name was still a string. Since that value wasn't a hash reference, you get the error you reported.
You don't want to use a bunch of substitutions on the entire thing because you can inadvertently change things you shouldn't be messing with. Decode the top level just as you did, but then do the same thing for the Message value:
# read in all the data, even though it looks like a single line. Maybe it won't be later.
my $data = do { local $/; <DATA> };
# decode the first layer
my $decoded = decode_json( $data );
# decode the Message value:
$decoded->{Message} = decode_json( $decoded->{Message} );
Now, when you call rec2msg it should work out.
Note that this has the opposite problem to reverse it. You can't merely encode the entire thing to JSON again. The value for Message still needs to be a string, so you have to encode that first if you want to send it somewhere else. If you are doing that, you probably want to work on a copy. I use dclone to make a deep copy so whatever I do to $encoded does not show up in $decoded:
# make a deep copy so nested references aren't shared
use Storable qw(dclone);
my $encoded = dclone( $decoded );
$encoded->{Message} = encode_json( $encoded->{Message} );
my $new_data = encode_json( $encoded );
Then $new_data will have the same escaping as the original input.
Here it is altogether:
use strict;
use warnings;
use feature 'say';
use JSON;
use utf8;
my %IDs = ( 'User awx01 logged in.' => 1001 );
my %levels = ( INFO => 4 );
# read in all the data, even though it looks
my $data = do { local $/; <DATA> };
my $decoded = decode_json( $data );
$decoded->{Message} = decode_json( $decoded->{Message} );
say rec2msg($decoded);
sub rec2msg {
my $r = shift;
$r->{Message}{message} =~ /(\w+) (\w+) (.+)/;
my($user,$msg) = ($2,"$1 $3");
my $ID = $IDs{$r->{Message}{message}};
my $level = $levels{$r->{Message}{level}};
my $out = "$r->{Message}{'#timestamp'} host CEF:0|OpenSource|AWX|7.0.0|$ID|$msg|$level|src=127.0.0.1 dst=$r->{MessageSourceAddress} duser=$user";
return $out;
}
I'm trying to mirror the website which having the files and folder to hash.
This one having example So I tried, the following
my $url = "http://localhost/mainfolder/";
my ($parent) = $url=~m/\/(\w+)\/?$/;
my %tree=(mainfolder=>[]);
folder_create($url);
sub folder_create
{
my $url = shift;
my $cont = get($url);
my ($child) = $url=~m/($parent.*)/;
$child=~s/\/?(\w+)\/?/{$1}/g;
while($cont=~m/(<tr.+?<\/tr>)/g)
{
my $line = $1;
if($line=~m/\[DIR\].*?href="([^"]*)"[^>]*>(.+?)<\/a>/)
{
my $sub =$1;
$sub=~s/\///;
print "$child\n\n";
push ( eval'#{$tree $child}',$sub);
}
}
}
use Data::Dumper;
print Dumper \%tree,"\n\n\n";
Update
Instead of messing with eval you should use the Data::Diver module
Because of the single quotes, you're trying to execute #{$hash$var} which isn't valid Perl.
If you wrote it as
push eval "\#{\$hash$var}", "somedata"
Then the eval would work, but it would evaluate to the contents of the array in hash element main, which is an empty list of values. That means your call would become
push( ( ), "somedata")
or just
push "somedata"
which is meaningless
This is a particularly unpleasant thing to want to do. Why do you think you need it?
I have the following perl code:
#!/usr/bin/perl
use strict;
use warnings;
use Net::SSH::Perl;
sub SSH($$$$){
my ($host, $user, $pass, $cmd) = #_;
my $ssh = Net::SSH::Perl->new($host);
$ssh->login("$user", "$pass");
return $ssh->cmd($cmd);
};
my %h = (
"ILSW01" => {
"ip" => "1.1.1.",
"user" => "bla",
"pass" => "xxx"
}
);
SSH($h{host}{ip}, $h{host}{user}, $h{host}{pass}, "ls -ltr")
Works fine.
My question is, what if i have more than one host, how can i use the keys of the "host's hash" as parameters in the SSH function to avoid duplicate code writing ?
To to be clear: I want avoid using the SSH function per host
I want avoid using the SSH function per host
That sounds like a very poor design as you'd be combining unrelated two tasks in one function. I'd personally go with
my %hosts = (
"ILSW01" => {
host => "1.1.1.",
user => "bla",
pass => "xxx",
},
...
);
sub SSH {
my ($connect_info, $cmd) = #_;
my $ssh = Net::SSH::Perl->new($connect_info->{host});
$ssh->login($connect_info->{user}, $connect_info->{pass});
return $ssh->cmd($cmd);
}
my #results = map { SSH($_, $cmd) } values(%hosts);
You can make a sub that consists of that last line if you really want a single sub, but I'm not sure there's much benefit to that here.
sub SSH_many {
my ($hosts) = #_;
return map { SSH($_, $cmd) } values(%$hosts);
}
my #results = SSH_many(\%hosts);
First, don't use prototyping in Perl functions. It is broken. Well... not broken, it doesn't work the way you think it should work.
You want to pass multiple hashes into your subroutine. Each containing three values:
The IP address
The User Name
The Password
In Perl, the basic data structures all pass scalar data. Variables like $foo contain a single piece of scalar data. Variables like #foo contain a list of scalar data, and variables like %foo contain a key/value set of scalar data. However, not all data is scalar. Some of it has a complex structure. The question is how to emulate this structure in Perl.
The way we do this in Perl is through references. A reference is a location in memory where my data is stored.
For example, my FTP location is a hash with three pieces of data:
my %ftp_user;
$ftp_user{IP} = 1.1.1.1;
$ftp_user{USRER} = "foo";
$ftp_user{PASSWORD} = "swordfish";
I can get a reference to this hash by putting a backslash in front of it:
my $ftp_user_ref = \%ftp_user;
Note the $ in front of the $ftp_user_ref variable. This means that the value in $ftp_user_ref is a scalar bit of data. I can put that into an array:
push #ftp_users_list, $ftp_user_ref;
Now my #ftp_users_list contains an array of entire hashes, and not an array of individual scalar pieces of data.
I can even take a reference of my #ftp_users_list:
my $ftp_user_list_ref = \#ftp_users_list;
And, I can pass that into my subroutine as a single piece of data.
Take a look at the Perl Reference Tutorial that comes with Perl and learn how references work and how you can use them.
Meanwhile, here's an example that uses references to call each host and run a series of commands. Note that my ssh subroutine takes just two parameters: An array of hashes that are my login credentials for each system, and a list of my commands I want to run on each system:
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say);
my #ftp_list;
while ( my $line = <DATA> ) {
chomp $line;
my ($ip, $user, $password) = split /:/, $line;
# Push an anonymous hash into my #ftp_list
push #ftp_list, { IP => $ip, USER => $user, PASSWORD => $password };
}
my #commands;
push #commands, "cd data";
push #commands, "get foo.txt";
push #commands, "del foo.txt";
ssh( \#ftp_list, \#commands );
sub ssh {
my $ftp_list_ref = shift;
my $commands_ref = shift;
my #ftp_list = #{ $ftp_list_ref };
my #commmands = #{ $commands_ref };
for my $ip_ref ( #ftp_list ) {
my $ip = $ip_ref->{IP};
my $user = $ip_ref->{USER};
my $pass = $ip_ref->{PASSWORD};
say "my \$ssh = Net::SSH::Perl->new($ip);";
say "\$ssh->login($user, $pass);";
for my $command ( #commands ) {
my $results = say "\$ssh->cmd($command);";
return if not defined $results;
}
}
return 1;
}
__DATA__
1.1.1.1:bob:swordfish
1.2.3.2:carol:secret
1.3.4.5:lewis:ca$h
You could just pass your Hash of hosts and metadata to your SSH function:
sub SSH {
my (%h, $cmd) = #_;
my #commands;
for my $host ( keys %h ) {
my $ssh = New::SSH::Perl->new($host);
$ssh->login($h{$host}->{user}, $h{$host}->{pass});
push #commands,
$ssh->cmd($cmd);
}
return #commands;
}
Now you can call this function with as many hosts as you want, execute the remote commands and return a listing of the results;
I am interested in writing a perl script that goes to the following link and extracts the number 1975: https://familysearch.org/search/collection/results#count=20&query=%2Bevent_place_level_1%3ACalifornia%20%2Bevent_place_level_2%3A%22San%20Diego%22%20%2Bbirth_year%3A1923-1923~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219
That website is the amount of white men born in the year 1923 who live in San Diego County, California in 1940. I am trying to do this in a loop structure to generalize over multiple counties and birth years.
In the file, locations.txt, I put the list of counties, such as San Diego County.
The current code runs, but instead of the # 1975, it displays unknown. The number 1975 should be in $val\n.
I would very much appreciate any help!
#!/usr/bin/perl
use strict;
use LWP::Simple;
open(L, "locations26.txt");
my $url = 'https://familysearch.org/search/collection/results#count=20&query=%2Bevent_place_level_1%3A%22California%22%20%2Bevent_place_level_2%3A%22%LOCATION%%22%20%2Bbirth_year%3A%YEAR%-%YEAR%~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219';
open(O, ">out26.txt");
my $oldh = select(O);
$| = 1;
select($oldh);
while (my $location = <L>) {
chomp($location);
$location =~ s/ /+/g;
foreach my $year (1923..1923) {
my $u = $url;
$u =~ s/%LOCATION%/$location/;
$u =~ s/%YEAR%/$year/;
#print "$u\n";
my $content = get($u);
my $val = 'unknown';
if ($content =~ / of .strong.([0-9,]+)..strong. /) {
$val = $1;
}
$val =~ s/,//g;
$location =~ s/\+/ /g;
print "'$location',$year,$val\n";
print O "'$location',$year,$val\n";
}
}
Update: API is not a viable solution. I have been in contact with the site developer. The API does not apply to that part of the webpage. Hence, any solution pertaining to JSON will not be applicbale.
It would appear that your data is generated by Javascript and thus LWP cannot help you. That said, it seems that the site you are interested in has a developer API: https://familysearch.org/developers/
I recommend using Mojo::URL to construct your query and either Mojo::DOM or Mojo::JSON to parse XML or JSON results respectively. Of course other modules will work too, but these tools are very nicely integrated and let you get started quickly.
You could use WWW::Mechanize::Firefox to process any site that could be loaded by Firefox.
http://metacpan.org/pod/WWW::Mechanize::Firefox::Examples
You have to install the Mozrepl plugin and you will be able to process the web page contant via this module. Basically you will "remotly control" the browser.
Here is an example (maybe working)
use strict;
use warnings;
use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new(
activate => 1, # bring the tab to the foreground
);
$mech->get('https://familysearch.org/search/collection/results#count=20&query=%2Bevent_place_level_1%3ACalifornia%20%2Bevent_place_level_2%3A%22San%20Diego%22%20%2Bbirth_year%3A1923-1923~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219',':content_file' => 'main.html');
my $retries = 10;
while ($retries-- and ! $mech->is_visible( xpath => '//*[#class="form-submit"]' )) {
print "Sleep until we find the thing\n";
sleep 2;
};
die "Timeout" if 0 > $retries;
#fill out the search form
my #forms = $mech->forms();
#<input id="census_bp" name="birth_place" type="text" tabindex="0"/>
#A selector prefixed with '#' must match the id attribute of the input. A selector prefixed with '.' matches the class attribute. A selector prefixed with '^' or with no prefix matches the name attribute.
$mech->field( birth_place => 'value_for_birth_place' );
# Click on the submit
$mech->click({xpath => '//*[#class="form-submit"]'});
If you use your browser's development tools, you can clearly see the JSON request that the page you link to uses to get the data you're looking for.
This program should do what you want. I've added a bunch of comments for readability and explanation, as well as made a few other changes.
use warnings;
use strict;
use LWP::UserAgent;
use JSON;
use CGI qw/escape/;
# Create an LWP User-Agent object for sending HTTP requests.
my $ua = LWP::UserAgent->new;
# Open data files
open(L, 'locations26.txt') or die "Can't open locations: $!";
open(O, '>', 'out26.txt') or die "Can't open output file: $!";
# Enable autoflush on the output file handle
my $oldh = select(O);
$| = 1;
select($oldh);
while (my $location = <L>) {
# This regular expression is like chomp, but removes both Windows and
# *nix line-endings, regardless of the system the script is running on.
$location =~ s/[\r\n]//g;
foreach my $year (1923..1923) {
# If you need to add quotes around the location, use "\"$location\"".
my %args = (LOCATION => $location, YEAR => $year);
my $url = 'https://familysearch.org/proxy?uri=https%3A%2F%2Ffamilysearch.org%2Fsearch%2Frecords%3Fcount%3D20%26query%3D%252Bevent_place_level_1%253ACalifornia%2520%252Bevent_place_level_2%253A^LOCATION^%2520%252Bbirth_year%253A^YEAR^-^YEAR^~%2520%252Bgender%253AM%2520%252Brace%253AWhite%26collection_id%3D2000219';
# Note that values need to be doubly-escaped because of the
# weird way their website is set up (the "/proxy" URL we're
# requesting is subsequently loading some *other* URL which
# is provided to "/proxy" as a URL-encoded URL).
#
# This regular expression replaces any ^WHATEVER^ in the URL
# with the double-URL-encoded value of WHATEVER in %args.
# The /e flag causes the replacement to be evaluated as Perl
# code. This way I can look data up in a hash and do URL-encoding
# as part of the regular expression without an extra step.
$url =~ s/\^([A-Z]+)\^/escape(escape($args{$1}))/ge;
#print "$url\n";
# Create an HTTP request object for this URL.
my $request = HTTP::Request->new(GET => $url);
# This HTTP header is required. The server outputs garbage if
# it's not present.
$request->push_header('Content-Type' => 'application/json');
# Send the request and check for an error from the server.
my $response = $ua->request($request);
die "Error ".$response->code if !$response->is_success;
# The response should be JSON.
my $obj = from_json($response->content);
my $str = "$args{LOCATION},$args{YEAR},$obj->{totalHits}\n";
print O $str;
print $str;
}
}
What about this simple script without firefox ? I had investigated the site a bit to understand how it works, and I saw some JSON requests with firebug firefox addon, so I know which URL to query to get the relevant stuff. Here is the code :
use strict; use warnings;
use JSON::XS;
use LWP::UserAgent;
use HTTP::Request;
my $ua = LWP::UserAgent->new();
open my $fh, '<', 'locations2.txt' or die $!;
open my $fh2, '>>', 'out2.txt' or die $!;
# iterate over locations from locations2.txt file
while (my $place = <$fh>) {
# remove line ending
chomp $place;
# iterate over years
foreach my $year (1923..1925) {
# building URL with the variables
my $url = "https://familysearch.org/proxy?uri=https%3A%2F%2Ffamilysearch.org%2Fsearch%2Frecords%3Fcount%3D20%26query%3D%252Bevent_place_level_1%253ACalifornia%2520%252Bevent_place_level_2%253A%2522$place%2522%2520%252Bbirth_year%253A$year-$year~%2520%252Bgender%253AM%2520%252Brace%253AWhite%26collection_id%3D2000219";
my $request = HTTP::Request->new(GET => $url);
# faking referer (where we comes from)
$request->header('Referer', 'https://familysearch.org/search/collection/results');
# setting expected format header for response as JSON
$request->header('content_type', 'application/json');
my $response = $ua->request($request);
if ($response->code == 200) {
# this line convert a JSON to Perl HASH
my $hash = decode_json $response->content;
my $val = $hash->{totalHits};
print $fh2 "year $year, place $place : $val\n";
}
else {
die $response->status_line;
}
}
}
END{ close $fh; close $fh2; }
This seems to do what you need. Instead of waiting for the disappearance of the hourglass it waits - more obviously I think - for the appearance of the text node you're interested in.
use 5.010;
use warnings;
use WWW::Mechanize::Firefox;
STDOUT->autoflush;
my $url = 'https://familysearch.org/search/collection/results#count=20&query=%2Bevent_place_level_1%3ACalifornia%20%2Bevent_place_level_2%3A%22San%20Diego%22%20%2Bbirth_year%3A1923-1923~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219';
my $mech = WWW::Mechanize::Firefox->new(tab => qr/FamilySearch\.org/, create => 1, activate => 1);
$mech->autoclose_tab(0);
$mech->get('about:blank');
$mech->get($url);
my $text;
while () {
sleep 1;
$text = $mech->xpath('//p[#class="num-search-results"]/text()', maybe => 1);
last if defined $text;
}
my $results = $text->{nodeValue};
say $results;
if ($results =~ /([\d,]+)\s+results/) {
(my $n = $1) =~ tr/,//d;
say $n;
}
output
1-20 of 1,975 results
1975
Update
This update is with special thanks to #nandhp, who inspired me to look at the underlying data server that produces the data in JSON format.
Rather than making a request via the superfluous https://familysearch.org/proxy this code accesses the server directly at https://familysearch.org/search/records, reencodes the JSON and dumps the required data out of the resulting structure. This has the advantage of both speed (the requests are served about once a second - more than ten times faster than with the equivalent request from the basic web site) and stability (as you note, the site is very flaky - in contrast I have never seen an error using this method).
use strict;
use warnings;
use LWP::UserAgent;
use URI;
use JSON;
use autodie;
STDOUT->autoflush;
open my $fh, '<', 'locations26.txt';
my #locations = <$fh>;
chomp #locations;
open my $outfh, '>', 'out26.txt';
my $ua = LWP::UserAgent->new;
for my $county (#locations[36, 0..2]) {
for my $year (1923 .. 1926) {
my $total = familysearch_info($county, $year);
print STDOUT "$county,$year,$total\n";
print $outfh "$county,$year,$total\n";
}
print "\n";
}
sub familysearch_info {
my ($county, $year) = #_;
my $query = join ' ', (
'+event_place_level_1:California',
sprintf('+event_place_level_2:"%s"', $county),
sprintf('+birth_year:%1$d-%1$d~', $year),
'+gender:M',
'+race:White',
);
my $url = URI->new('https://familysearch.org/search/records');
$url->query_form(
collection_id => 2000219,
count => 20,
query => $query);
my $resp = $ua->get($url, 'Content-Type'=> 'application/json');
my $data = decode_json($resp->decoded_content);
return $data->{totalHits};
}
output
San Diego,1923,1975
San Diego,1924,2004
San Diego,1925,1871
San Diego,1926,1908
Alameda,1923,3577
Alameda,1924,3617
Alameda,1925,3567
Alameda,1926,3464
Alpine,1923,1
Alpine,1924,2
Alpine,1925,0
Alpine,1926,1
Amador,1923,222
Amador,1924,248
Amador,1925,134
Amador,1926,67
I do not know how to post revised code from the solution above.
This code does not (yet) compile correctly. However, I have made some essential update to definitely head in that direction.
I would very much appreciate help on this updated code. I do not know how to post this code and this follow up such that it appease the lords who run this sight.
It get stuck at the sleep line. Any advice on how to proceed past it would be much appreciated!
use strict;
use warnings;
use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new(
activate => 1, # bring the tab to the foreground
);
$mech->get('https://familysearch.org/search/collection/results#count=20&query=%2Bevent_place_level_1%3ACalifornia%20%2Bevent_place_level_2%3A%22San%20Diego%22%20%2Bbirth_year%3A1923-1923~%20%2Bgender%3AM%20%2Brace%3AWhite&collection_id=2000219',':content_file' => 'main.html', synchronize => 0);
my $retries = 10;
while ($retries-- and $mech->is_visible( xpath => '//*[#id="hourglass"]' )) {
print "Sleep until we find the thing\n";
sleep 2;
};
die "Timeout while waiting for application" if 0 > $retries;
# Now the hourglass is not visible anymore
#fill out the search form
my #forms = $mech->forms();
#<input id="census_bp" name="birth_place" type="text" tabindex="0"/>
#A selector prefixed with '#' must match the id attribute of the input. A selector prefixed with '.' matches the class attribute. A selector prefixed with '^' or with no prefix matches the name attribute.
$mech->field( birth_place => 'value_for_birth_place' );
# Click on the submit
$mech->click({xpath => '//*[#class="form-submit"]'});
You should set the current form before accessing a field:
"Given the name of a field, set its value to the value specified. This applies to the current form (as set by the "form_name()" or "form_number()" method or defaulting to the first form on the page)."
$mech->form_name( 'census-search' );
$mech->field( birth_place => 'value_for_birth_place' );
Sorry, I am not able too try this code out and thanks for open a question for a new question.