I have some data in an input file:
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
I want to get only the ids from the above file.
Code:
use strict;
use warnings;
my $input = "location\input.txt";
open("FH","<$input") or die;
while(my $str = <FH>)
{
my @arr = split(/ /,$str);
$arr[2] =~ s/id=//g;
$arr[2] =~ s/"//g;
print "$arr[2]\n";
}
close("FH");
Output :
small
sample
big
Note: Here I'm not able to print the complete value, such as "sample test" or "big city".
Expectation: I need to get the complete values "sample test" and "big city". Can anyone please help me with this?
If you know the format will always have quotes after id, you can do:
use feature qw(say);
use strict;
use warnings;
open my $fh, "<", "location/input.txt" or die $!;
while (my $line = <$fh>) {
my ($id) = $line =~ /id="(.*?)"/;
say $id;
}
Breaking down that complicated line, we have:
$line =~ /id="(.*?)"/: match id="..." and capture the shortest possible run of characters between the quotes. If you use .* instead, you will capture everything up to the last " on the line, which might belong to another field. This is not the case for id, because it is the last field, but try it with date and you'll see.
my ($id) = ...: evaluate the regex match in list context, which returns the capture groups, and assign them element by element to the list ($id). Concretely, this stores the captured value in $id.
say $id: prints $id followed by an automatic newline.
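To see the difference the non-greedy quantifier makes, here is a minimal sketch that runs both patterns against the date field of one sample line. The variable names $lazy and $greedy are only illustrative.

```perl
use strict;
use warnings;

# One sample line from the question.
my $line = 'user date="" name="" id="sample test"';

# Non-greedy: stops at the first closing quote after date=".
my ($lazy) = $line =~ /date="(.*?)"/;

# Greedy: runs on to the last quote of the line.
my ($greedy) = $line =~ /date="(.*)"/;

print "lazy:   [$lazy]\n";      # lazy:   []
print "greedy: [$greedy]\n";    # greedy: [" name="" id="sample test]
```

The greedy capture swallows the name and id fields, which is exactly the failure described above.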
A nice module for handling quoted strings is Text::ParseWords. It is a core module too, making it even handier. You can use it here to easily split the string on whitespace, then parse the result into hash keys.
use strict;
use warnings;
use Data::Dumper;
use Text::ParseWords;
while (<DATA>) {
chomp;
my %data = map { my ($key, $data) = split /=/, $_, 2; ($key => $data); } quotewords('\s+', 0, $_);
print Dumper \%data;
}
__DATA__
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
Output:
$VAR1 = {
'user' => undef,
'name' => '',
'date' => '',
'id' => 'small'
};
$VAR1 = {
'name' => '',
'date' => '',
'id' => 'sample test',
'user' => undef
};
$VAR1 = {
'id' => 'big city',
'date' => '',
'name' => '',
'user' => undef
};
A simplified version that extracts only the data of interest:
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
while(<DATA>) {
my %d = /(\w+)="(.*?)"/g;
say 'id: ' . $d{id};
say Dumper(\%d);
}
__DATA__
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
Output
id: small
$VAR1 = {
'date' => '',
'id' => 'small',
'name' => ''
};
id: sample test
$VAR1 = {
'id' => 'sample test',
'date' => '',
'name' => ''
};
id: big city
$VAR1 = {
'name' => '',
'id' => 'big city',
'date' => ''
};
I am trying to process the data in a single file. I have to read the file and create a hash structure, get the value of fruitName, prefix fruitCount and fruitValue with it, delete the fruitName line, and write out the entire content after the change is done. Given below is the content of the file.
# this is a new file
{
date 14/07/2016
time 11:15
end 11:20
total 30
No "FRUITS"
Fruit_class
{
Name "fruit 1"
fruitName "apple.fru"
fruitId "0"
fruitCount 5
fruitValue 6
}
{
Name "fruit 2"
fruitName "orange.fru"
fruitId "1"
fruitCount 10
fruitValue 20
}
}
I tried the following code:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash_table;
my $name;
my $file = '/tmp/fruitdir/fruit1.txt';
open my $fh, "<", $file or die "Can't open $file: $!";
while (<$fh>) {
chomp;
if (/^\s*fruitName/) {
($name) = /(\".+\")/;
next;
}
s/(fruitCount|fruitValue)/$name\.$1/;
my ($key, $value) = split /\s+/, $_, 2;
$hash_table{$key} = $value;
}
print Dumper(\%hash_table);
This is not working. I need to prepend the value of fruitName and print the entire file content as output. Any help will be appreciated. Given below is the output that I got.
$VAR1 = {
'' => undef,
'time' => '11:15 ',
'date' => '14/07/2016',
'{' => undef,
'#' => 'this is a new file',
'total' => '30 ',
'end' => '11:20 ',
'No' => '"FRUITS"',
'Fruit_class' => undef,
'}' => undef
};
Expected hash as output:
$VAR1 = {
'Name' => '"fruit 1"',
'fruitId' => '"0" ',
'"apple_fru".fruitValue' => '6 ',
'"apple_fru".fruitCount' => '5'
'Name' => '"fruit 2"',
'fruitId' => '"0" ',
'"orange_fru".fruitValue' => '10 ',
'"orange_fru".fruitCount' => '20'
};
One word of advice before I continue:
Document your code
There are several logic errors in your code which I think you would have recognized if you wrote down what you thought each line was supposed to do. First, write down the algorithm that you would like to implement, then document how each step in the code implements a step in the algorithm. At the end you'll be able to see what you missed, or what part is not working.
Here are the errors that I see
You aren't ignoring lines that you shouldn't be parsing. For example, you're grabbing the '}' and '{' lines.
You aren't actually storing the name of the fruit. You grab it, but immediately start the next loop without storing it.
You're not keeping track of each structure. You need to start a new structure for each fruit.
Do you really want to keep the double quotes in the values?
Other things to worry about:
Are you guaranteed that the list of attributes is in that order? For example, can Name come last?
Here's some code which does what I think you want.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash_table;
my $name;
my @fruit;
my $file = '/tmp/fruitdir/fruit1.txt';
open my $fh, "<", $file or die "Can't open $file: $!";
while (<$fh>) {
chomp;
# save hash table if there's a close bracket, but
# only if it has been filled
if ( /^\s*}\s*$/ ) {
next unless keys %hash_table;
# save COPY of hash table
push @fruit, { %hash_table };
# clear it out for the next iteration
%hash_table = ();
}
# only parse lines that start with Name or fruit
next unless
my ( $key, $value ) =
/^
# skip any leading spaces
\s*
# parse a line beginning with Name or fruitXXXXX
(
Name
|
fruit[^\s]+
)
# need space between key and value
\s+
# everything that follows is a value. clean up
# double quotes in post processing
(.*)
/x;
# remove double quotes
$value =~ s/"//g;
if ( $key eq 'Name' ) {
$name = $value;
}
else {
$key = "${name}.${key}";
}
$hash_table{$key} = $value;
}
print Dumper \@fruit;
and here's the output:
$VAR1 = [
{
'fruit 1.fruitValue' => '6',
'fruit 1.fruitName' => 'apple.fru',
'Name' => 'fruit 1',
'fruit 1.fruitCount' => '5',
'fruit 1.fruitId' => '0'
},
{
'fruit 2.fruitName' => 'orange.fru',
'fruit 2.fruitId' => '1',
'fruit 2.fruitCount' => '10',
'fruit 2.fruitValue' => '20',
'Name' => 'fruit 2'
}
];
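On the question of attribute order: one way to stop depending on it is to collect each block into a temporary hash and only build the prefixed keys once the closing brace is seen. The following is a rough sketch under a few assumptions: the input is a shortened copy of the question's file inlined as a string (with the second block's attributes deliberately shuffled), every fruit block has a fruitName line, and the prefix is the fruitName value as the question asked, rather than Name as in the code above. The names %block and %record are just illustrative.

```perl
use strict;
use warnings;
use Data::Dumper;

# Shortened sample input; the second block is reordered on purpose.
my $text = <<'END';
{
Fruit_class
{
Name "fruit 1"
fruitName "apple.fru"
fruitId "0"
fruitCount 5
fruitValue 6
}
{
fruitCount 10
fruitValue 20
Name "fruit 2"
fruitId "1"
fruitName "orange.fru"
}
}
END

my (@fruit, %block);
for my $line (split /\n/, $text) {
    if ($line =~ /^\s*\}\s*$/) {
        # A closing brace ends a block; skip it if nothing was collected.
        next unless %block;
        # The whole block is known now, so attribute order no longer matters.
        my $prefix = delete $block{fruitName};
        my %record = map {
            my $key = /^fruit(?:Count|Value)$/ ? "$prefix.$_" : $_;
            ($key => $block{$_});
        } keys %block;
        push @fruit, \%record;
        %block = ();
        next;
    }
    # Only lines starting with Name or fruitXXXX carry data.
    next unless $line =~ /^\s*(Name|fruit\w+)\s+(.*?)\s*$/;
    my ($key, $value) = ($1, $2);
    $value =~ s/"//g;    # drop the double quotes
    $block{$key} = $value;
}

$Data::Dumper::Sortkeys = 1;
print Dumper \@fruit;
```

Each element of @fruit ends up with Name, fruitId, and the two prefixed keys such as 'apple.fru.fruitCount'.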
I have a query string like this:
id=60087888;jid=16471827;from=advance;action=apply
or it can be like this :
id=60087888&jid=16471827&from=advance&action=apply
Now from this I want to create a hash whose keys are the parameter names (such as id) and whose values are the corresponding values.
I have done this:
my %in;
$buffer = 'resid=60087888;jobid=16471827;from=advance;action=apply';
@pairs = split(/=/, $buffer);
foreach $pair (@pairs){
($name, $value) = split(/=/, $pair);
$in{$name} = $value;
}
print %in;
But the issue is that the separator in the query string can be either a semicolon or an &. How can I handle both? Please help me.
Don't try to solve it with new code; this is what CPAN modules are for. Specifically in this case, URI::Query
use URI::Query;
use Data::Dumper;
my $q = URI::Query->new( "resid=60087888;jobid=16471827;from=advance;action=apply" );
my %hash = $q->hash;
print Dumper( \%hash );
Gives
{ action => 'apply',
from => 'advance',
jobid => '16471827',
resid => '60087888' }
You've already an answer that works - but personally I might tackle it like this:
my %in = $buffer =~ m/(\w+)=(\w+)/g;
What this does is use regular expressions to pattern match either side of the equals sign.
It does so in pairs, effectively, and the result is treated as a sequence of key-value pairs in the hash assignment.
Note - it does assume you've not got special characters in your keys/values, and that you have no null values. (Or if you do, they'll be ignored - you can use (\w*) instead if that's the case).
But you get:
$VAR1 = {
'from' => 'advance',
'jid' => '16471827',
'action' => 'apply',
'id' => '60087888'
};
Alternatively:
my %in = map { split /=/ } split ( /[^=\w]/, $buffer );
We split using 'anything that isn't word or equals' to get a sequence, and then split on equals to make the same key-value pairs. Again - certain assumptions are made about valid delimiter/non-delimiter characters.
Check this answer:
my %in;
$buffer = 'resid=60087888;jobid=16471827;from=advance;action=apply';
@pairs = split(/[&,;]/, $buffer);
foreach $pair (@pairs){
($name, $value) = split(/=/, $pair);
$in{$name} = $value;
}
delete $in{resid};
print keys %in;
I know I'm late to the game, but....
#!/usr/bin/perl
use strict;
use CGI;
use Data::Dumper;
my $query = 'id=60087888&jid=16471827&from=advance&action=apply&blank=&not_blank=1';
my $cgi = CGI->new($query);
my %hash = $cgi->Vars();
print Dumper \%hash;
will produce:
$VAR1 = {
'not_blank' => '1',
'jid' => '16471827',
'from' => 'advance',
'blank' => '',
'action' => 'apply',
'id' => '60087888'
};
Which has the added benefit of dealing with keys that might not have values in the source string.
Some of the other examples will produce:
$VAR1 = {
'id' => '60087888',
'1' => undef,
'jid' => '16471827',
'from' => 'advance',
'blank' => 'not_blank',
'action' => 'apply'
};
which may not be desirable.
I would have used URI::Query, as in @LeoNerd's answer, but I didn't have the ability to install a module in my case and CGI.pm was handy.
Also, you could do:
my $buffer = 'id=60087888&jid=16471827&from=advance&action=apply';
my %hash = split(/&|=/, $buffer);
which gives:
$hash = {
'jid' => '16471827',
'from' => 'advance',
'action' => 'apply',
'id' => '60087888'
};
This is VERY fragile, so I wouldn't advocate using it.
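If no module can be installed, a slightly less fragile hand-rolled variant is to split on the pair separators first and then split each pair on the first = only. This is just a sketch: parse_query is a hypothetical helper name, it does no URL-decoding, and a repeated key simply overwrites the earlier value.

```perl
use strict;
use warnings;

# Hypothetical helper: accepts ';' or '&' as the pair separator.
sub parse_query {
    my ($buffer) = @_;
    my %in;
    for my $pair (split /[;&]/, $buffer) {
        next unless length $pair;    # skip empty pieces such as a trailing ';'
        # Limit 2 keeps any '=' inside the value and preserves an empty value.
        my ($name, $value) = split /=/, $pair, 2;
        $in{$name} = defined $value ? $value : '';
    }
    return %in;
}

my %semi = parse_query('id=60087888;jid=16471827;from=advance;action=apply');
my %amp  = parse_query('id=60087888&jid=16471827&from=advance&action=apply&blank=&not_blank=1');

print "$_=$semi{$_}\n" for sort keys %semi;
print "$_=$amp{$_}\n"  for sort keys %amp;
```

Unlike the one-line split, a key with an empty value (blank= above) stays a key with an empty string instead of shifting every later pair.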
I am trying to gather data from a website. Some anti-patterns make finding the right form objects difficult, but I have this solved. I am using a POST to get around some JavaScript that acts as a wrapper to submit the form. My problem seems to be in getting the results from the mechanize->post method.
Here's a shortened version of my code.
use strict;
use warnings;
use HTML::Tree;
use LWP::Simple;
use WWW::Mechanize;
use HTTP::Request::Common;
use Data::Dumper;
$| = 1;
my $site_url = "http://someURL";
my $mech = WWW::Mechanize->new( autocheck => 1 );
foreach my $number (@numbers)
{
my $content = get($site_url);
$mech->get ($site_url);
my $tree = HTML::Tree->new();
$tree->parse($content);
my ($title) = $tree->look_down( '_tag' , 'a' );
my $atag = "";
my $atag1 = "";
foreach $atag ( $tree->look_down( _tag => q{a}, 'class' => 'button', 'title' => 'SEARCH' ) )
{
print "Tag is ", $atag->attr('id'), "\n";
$atag1 = Dumper $atag->attr('id');
}
# Enter permit number in "Number" search field
my @forms = $mech->forms;
my @fields = ();
foreach my $form (@forms)
{
@fields = $form->param;
}
my ($name, $fnumber) = $fields[2];
print "field name and number is $name\n";
$mech->field( $name, $number, $fnumber );
print "field $name populated with search data $number\n" if $mech->success();
$mech->post($site_url ,
[
'$atag1' => $number,
'internal.wdk.wdkCommand' => $atag1,
]) ;
print $mech->content; # I think this is where the problem is.
}
The data I get from my final print statement is the data from the original URL, not the page the POST should take me to. What have I done wrong?
Many Thanks
Update
I don't have Firefox installed so I'm avoiding WWW::Mechanize::Firefox intentionally.
Turns out I was excluding some required hidden fields from my POST command.
I'm trying to parse some XML into Perl, but testing isn't yielding what I'd expect.
$buffer = qq[<DeliveryReport><message id="msgID" sentdate="xxxxx" donedate="xxxxx" status="xxxxxx" gsmerror="0" /></DeliveryReport>];
$xml = XML::Simple->new( ForceArray => 1 );
$file = $xml->XMLin($buffer) or die "Failed for $reply: $!\n";
use Data::Dumper;
print Dumper($file);
$msgid = $file->{message}->{id};
$message_status = $file->{message}->{status};
print "OUTPUT: $msgid $message_status";
But the output is blank, and the Dumper output looks wrong with regard to the id attribute; I'm not sure why.
$VAR1 = {
'message' => {
'msgID' => {
'status' => 'xxxxxx',
'gsmerror' => '0',
'sentdate' => 'xxxxx',
'donedate' => 'xxxxx'
}
}
};
OUTPUT:
Here is the final code working correctly.
use XML::Simple;
use Data::Dumper;
$xml = XML::Simple->new (KeyAttr=>'',ForceArray => 1);
$file = $xml->XMLin('
<DeliveryReport>
<message id="msgID1" sentdate="xxxxx" donedate="xxxxx" status="xxxxxx" gsmerror="0" />
<message id="msgID2" sentdate="yyy" donedate="yyy" status="yyy" gsmerror="0" />
</DeliveryReport>
') or die "Failed for $reply: $!\n";
print Dumper($file);
$numOfMsgs = @{$file->{message}};
print "<br /><br />I've received $numOfMsgs records<br />";
for($i = 0; $i < $numOfMsgs; $i++) {
$msgid = $file->{message}->[$i]->{id};
$message_status = $file->{message}->[$i]->{status};
print "message id: [$msgid]<br />";
print "status id: [$message_status]<br />";
print "<br />";
}
By default, XML::Simple chooses to fold around the following keys: name, key, id (see note 1).
Your XML contains the id attribute, which is why the hash is being folded there. You can clear the KeyAttr value when you create your object (e.g. $xml = XML::Simple->new( KeyAttr => "" );) to override the default behavior.
Your output, with multiple message entries, would look like:
$VAR1 = {
'message' => [
{
'gsmerror' => '0',
'status' => 'xxxxxx',
'id' => 'msgID',
'donedate' => 'xxxxx',
'sentdate' => 'xxxxx'
},
{
'gsmerror' => '1',
'status' => 'yyyyyy',
'id' => 'msgID2',
'donedate' => 'yyyyy',
'sentdate' => 'yyyyy'
}
]
};
So you need to adjust your code slightly to account for $file->{message} containing an array of message hashes. The format would be the same for a single message if you keep the ForceArray option, so your code change would work for both cases.
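As a rough illustration of walking that array, the sketch below iterates over a hand-written copy of the structure shown in the dump. It does not call XML::Simple at all, and @summary is just an illustrative name.

```perl
use strict;
use warnings;

# Hand-written copy of the dumped structure (an assumption, not parsed XML).
my $file = {
    message => [
        { id => 'msgID',  status => 'xxxxxx', gsmerror => '0' },
        { id => 'msgID2', status => 'yyyyyy', gsmerror => '1' },
    ],
};

# $file->{message} is an array reference, so dereference it to loop.
my @summary;
for my $message (@{ $file->{message} }) {
    push @summary, "$message->{id}: $message->{status}";
}

print "$_\n" for @summary;
```

Looping over the dereferenced array avoids the index counter and works the same whether one message or many came back.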
my %book = (
'name' => 'abc',
'author' => 'monk',
'isbn' => '123-890',
'issn' => '#issn',
);
my %chapter = (
'title' => 'xyz',
'page' => '90',
);
How do I incorporate %book inside %chapter through reference so that when I write "$chapter{name}", it should print 'abc'?
You can copy the keys/values of the %book into the %chapter:
@chapter{keys %book} = values %book;
Or something like
%chapter = (%chapter, %book);
Now you can say $chapter{name}, but changes in %book are not reflected in %chapter.
You can include the %book via reference:
$chapter{book} = \%book;
Now you could say $chapter{book}{name}, and changes do get reflected.
To have an interface that allows you to say $chapter{name} and that does reflect changes, some advanced techniques would have to be used (this is fairly trivial with tie magic), but don't go there unless you really have to.
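For the curious, here is a minimal sketch of what the tie approach could look like. FallbackHash is a hypothetical class name, not an existing module; it keeps the chapter's own keys in one hash and falls back to a referenced parent hash on lookup. Iteration (keys, each) only covers the chapter's own keys in this sketch.

```perl
use strict;
use warnings;

package FallbackHash;    # hypothetical class, written for this example only

sub TIEHASH {
    my ($class, $parent) = @_;
    return bless { own => {}, parent => $parent }, $class;
}
sub STORE  { my ($self, $key, $value) = @_; $self->{own}{$key} = $value }
sub FETCH  {
    my ($self, $key) = @_;
    # Own keys win; anything else is looked up in the parent hash.
    return exists $self->{own}{$key} ? $self->{own}{$key} : $self->{parent}{$key};
}
sub EXISTS {
    my ($self, $key) = @_;
    return (exists $self->{own}{$key}) || (exists $self->{parent}{$key});
}
sub DELETE   { my ($self, $key) = @_; delete $self->{own}{$key} }
sub CLEAR    { my ($self) = @_; %{ $self->{own} } = () }
sub FIRSTKEY { my ($self) = @_; my $reset = keys %{ $self->{own} }; each %{ $self->{own} } }
sub NEXTKEY  { my ($self) = @_; each %{ $self->{own} } }

package main;

my %book = ( name => 'abc', author => 'monk' );

tie my %chapter, 'FallbackHash', \%book;
%chapter = ( title => 'xyz', page => '90' );

print "$chapter{title}\n";    # xyz
print "$chapter{name}\n";     # abc
$book{name} = 'def';
print "$chapter{name}\n";     # def
```

Because the parent is held by reference, later changes to %book show through $chapter{name}, at the cost of every access going through a method call.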
You could write a subroutine to check a list of hashes for a key. This program demonstrates:
use strict;
use warnings;
my %book = (
name => 'abc',
author => 'monk',
isbn => '123-890',
issn => '#issn',
);
my %chapter = (
title => 'xyz',
page => '90',
);
for my $key (qw/ name title bogus / ) {
print '>> ', access_hash($key, \%book, \%chapter), "\n";
}
sub access_hash {
my $key = shift;
for my $hash (@_) {
return $hash->{$key} if exists $hash->{$key};
}
undef;
}
output
Use of uninitialized value in print at E:\Perl\source\ht.pl line 17.
>> abc
>> xyz
>>