Perl parsing file content - perl

I am trying to parse a text file whose content consists of 3 categories and access the data in the main code. I have read that a hash may be a good fit, but no column in the input file is unique (a name can repeat or differ), so I wonder whether there is another way to do it. I would appreciate any reply.
#!/usr/bin/perl
use strict;
use warnings;
my $file = "/path/to/text.txt";
my %info = parseCfg($file);
#get first line data in text file (Eric cat xxx)
#get second line data in text file (Michelle dog yyy)
#so on
}
sub parseCfg {
my $file = shift;
my %data;
return if !(-e $file);
open(my $fh, "<", $file) or die "Can't open < $file: $!";
my $msg = "-I-: Reading from config file: $file\n";
while (<$fh>) {
if (($_=~/^#/)||($_=~/^\s+$/)) {next;}
my @fields = split(" ", $_);
my ($name, $son, $address) = @fields;
#return something
}
close $fh;
}
Input file format:(basically 3 columns)
#Name pet address
Eric cat xxx
Michelle dog yyy
Ben horse zzz
Eric cat aaa

The question isn't clear how the data will be used in the code.
The following code sample reads each line into an anonymous hash referenced by $href, then pushes $href onto an anonymous array referenced by $aref, which the parse_cnf() subroutine returns.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $fname = 'pet_data.txt';
my $data = parse_cnf($fname);
say Dumper($data);
printf "Name: %-12s Pet: %-10s Address: %s\n", $_->@{qw/name pet address/} for $data->@*;
exit 0;
sub parse_cnf {
my $fname = shift;
my $aref;
open my $fh, '<', $fname
or die "Couldn't open $fname";
while( <$fh> ) {
next if /(^\s*$|^#)/;
my $href->@{qw/name pet address/} = split;
push $aref->@*, $href;
}
close $fh;
return $aref;
}
Output
$VAR1 = [
{
'address' => 'xxx',
'pet' => 'cat',
'name' => 'Eric'
},
{
'pet' => 'dog',
'name' => 'Michelle',
'address' => 'yyy'
},
{
'name' => 'Ben',
'pet' => 'horse',
'address' => 'zzz'
},
{
'address' => 'aaa',
'pet' => 'cat',
'name' => 'Eric'
}
];
Name: Eric         Pet: cat        Address: xxx
Name: Michelle     Pet: dog        Address: yyy
Name: Ben          Pet: horse      Address: zzz
Name: Eric         Pet: cat        Address: aaa
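Since the question worries that names are not unique, a hash can still be used if each name maps to a list of records rather than to a single one. Below is a minimal sketch of that idea, not part of the answer above; the helper name group_by_name and the inline sample lines are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): groups whitespace-separated
# "name pet address" lines by name, keeping every record for a repeated name.
sub group_by_name {
    my @lines = @_;
    my %by_name;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comment and blank lines
        my ($name, $pet, $address) = split ' ', $line;
        # each name holds an array of records, so duplicates are preserved
        push @{ $by_name{$name} }, { pet => $pet, address => $address };
    }
    return \%by_name;
}

# Sample lines taken from the question's input file
my $records = group_by_name(
    '#Name pet address',
    'Eric cat xxx',
    'Michelle dog yyy',
    'Ben horse zzz',
    'Eric cat aaa',
);

for my $name (sort keys %$records) {
    for my $rec (@{ $records->{$name} }) {
        print "$name: $rec->{pet} at $rec->{address}\n";
    }
}
```

With this layout, $records->{Eric} holds both of Eric's rows in file order, while the array-of-hashes version above remains the simpler choice when only sequential access is needed.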

Related

how to print space contain attribute in Perl Script

I have some data in input file
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
I want to get only the ids from the above file.
Code:
use strict;
use warnings;
my $input = "location\input.txt";
open("FH","<$input") or die;
while(my $str = <FH>)
{
my @arr = split(/ /,$str);
$arr[2] =~ s/id=//g;
$arr[2] =~ s/"//g;
print "$arr[2]\n";
}
close("FH");
Output :
small
sample
big
Note: I am not able to print the complete values such as "sample test" and "big city".
Expectation: I need to get the complete values "sample test" and "big city". Can anyone please help me with this?
If you know the format will always have quotes after id, you can do:
use feature qw(say);
use strict;
use warnings;
open my $fh, "<", "location/input.txt" or die $!;
while (my $line = <$fh>) {
my ($id) = $line =~ /id="(.*?)"/;
say $id;
}
Breaking down that complicated line we have:
$line =~ /id="(.*?)"/: match id="..." and grab the smallest possible "...". If you use .* instead, you will grab up until the last " of the line, which might belong to another field. This is not the case for id, but try it with date and you'll see.
my ($id) = ...: process the regex match in list context, which returns the capture groups, and assign it pairwise to the list ($id). Concretely, this stuffs the matched value in $id
say $id: prints $id with an automatic newline after it.
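The difference between .*? and .* described above can be checked with a small experiment. This is only a sketch: the helper name capture_attr is an assumption, and the quantifier is passed in as a string purely so both variants can be compared side by side.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what is captured for attribute $attr when the
# capture group uses the given quantifier ('.*?' non-greedy, '.*' greedy).
sub capture_attr {
    my ($line, $attr, $quant) = @_;
    # $quant is interpolated as a pattern; $attr is quoted literally
    my ($value) = $line =~ /\Q$attr\E="($quant)"/;
    return $value;
}

my $line = 'user date="" name="" id="sample test"';

for my $attr (qw(date id)) {
    printf "%-4s non-greedy: [%s]  greedy: [%s]\n",
        $attr,
        capture_attr($line, $attr, '.*?'),
        capture_attr($line, $attr, '.*');
}
```

For id both variants capture the same text, because id is the last field on the line. For date the greedy version runs on to the final quote of the line and swallows the name and id fields as well.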
A nice module for handling quoted strings is Text::ParseWords. It is a core module too, making it even handier. You can use it here to easily split the string on whitespace, then parse the result into hash keys.
use strict;
use warnings;
use Data::Dumper;
use Text::ParseWords;
while (<DATA>) {
chomp;
my %data = map { my ($key, $data) = split /=/, $_, 2; ($key => $data); } quotewords('\s+', 0, $_);
print Dumper \%data;
}
__DATA__
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
Output:
$VAR1 = {
'user' => undef,
'name' => '',
'date' => '',
'id' => 'small'
};
$VAR1 = {
'name' => '',
'date' => '',
'id' => 'sample test',
'user' => undef
};
$VAR1 = {
'id' => 'big city',
'date' => '',
'name' => '',
'user' => undef
};
A simplified version to extract data of interest
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
while(<DATA>) {
my %d = /(\w+)="(.*?)"/g;
say 'id: ' . $d{id};
say Dumper(\%d);
}
__DATA__
user date="" name="" id="small"
user date="" name="" id="sample test"
user date="" name="" id="big city"
Output
id: small
$VAR1 = {
'date' => '',
'id' => 'small',
'name' => ''
};
id: sample test
$VAR1 = {
'id' => 'sample test',
'date' => '',
'name' => ''
};
id: big city
$VAR1 = {
'name' => '',
'id' => 'big city',
'date' => ''
};

Issue in Printing data from a hash table in perl

I am trying to process the data in a single file. I have to read the file and create a hash structure, get the value of fruitName, attach it to fruitCount and fruitValue, delete the fruitName line, and write out the entire content once the change is done. The content of the file is given below.
# this is a new file
{
date 14/07/2016
time 11:15
end 11:20
total 30
No "FRUITS"
Fruit_class
{
Name "fruit 1"
fruitName "apple.fru"
fruitId "0"
fruitCount 5
fruitValue 6
}
{
Name "fruit 2"
fruitName "orange.fru"
fruitId "1"
fruitCount 10
fruitValue 20
}
}
I tried with following code :
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash_table;
my $name;
my $file = '/tmp/fruitdir/fruit1.txt';
open my $fh, "<", $file or die "Can't open $file: $!";
while (<$fh>) {
chomp;
if (/^\s*fruitName/) {
($name) = /(\".+\")/;
next;
}
s/(fruitCount|fruitValue)/$name\.$1/;
my ($key, $value) = split /\s+/, $_, 2;
$hash_table{$key} = $value;
}
print Dumper(\%hash_table);
This is not working. I need to append the value of fruitName and print the entire file content as output. Any help will be appreciated. The output that I got is given below.
$VAR1 = {
'' => undef,
'time' => '11:15 ',
'date' => '14/07/2016',
'{' => undef,
'#' => 'this is a new file',
'total' => '30 ',
'end' => '11:20 ',
'No' => '"FRUITS"',
'Fruit_class' => undef,
'}' => undef
};
Expected hash as output:
$VAR1 = {
'Name' => '"fruit 1"',
'fruitId' => '"0" ',
'"apple_fru".fruitValue' => '6 ',
'"apple_fru".fruitCount' => '5'
'Name' => '"fruit 2"',
'fruitId' => '"0" ',
'"orange_fru".fruitValue' => '10 ',
'"orange_fru".fruitCount' => '20'
};
One word of advice before I continue:
Document your code
There are several logic errors in your code which I think you would have recognized if you wrote down what you thought each line was supposed to do. First, write down the algorithm that you would like to implement, then document how each step in the code implements a step in the algorithm. At the end you'll be able to see what you missed, or what part is not working.
Here are the errors that I see
You aren't ignoring lines that you shouldn't be parsing. For example, you're grabbing the '}' and '{' lines.
You aren't actually storing the name of the fruit. You grab it, but immediately start the next loop without storing it.
You're not keeping track of each structure. You need to start a new structure for each fruit.
Do you really want to keep the double quotes in the values?
Other things to worry about:
Are you guaranteed that the list of attributes is in that order? For example, can Name come last?
Here's some code which does what I think you want.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash_table;
my $name;
my @fruit;
my $file = '/tmp/fruitdir/fruit1.txt';
open my $fh, "<", $file or die "Can't open $file: $!";
while (<$fh>) {
chomp;
# save hash table if there's a close bracket, but
# only if it has been filled
if ( /^\s*}\s*$/ ) {
next unless keys %hash_table;
# save COPY of hash table
push @fruit, { %hash_table };
# clear it out for the next iteration
%hash_table = ();
}
# only parse lines that start with Name or fruit
next unless
my ( $key, $value ) =
/^
# skip any leading spaces
\s*
# parse a line beginning with Name or fruitXXXXX
(
Name
|
fruit[^\s]+
)
# need space between key and value
\s+
# everything that follows is a value. clean up
# double quotes in post processing
(.*)
/x;
# remove double quotes
$value =~ s/"//g;
if ( $key eq 'Name' ) {
$name = $value;
}
else {
$key = "${name}.${key}";
}
$hash_table{$key} = $value;
}
print Dumper \@fruit;
and here's the output:
$VAR1 = [
{
'fruit 1.fruitValue' => '6',
'fruit 1.fruitName' => 'apple.fru',
'Name' => 'fruit 1',
'fruit 1.fruitCount' => '5',
'fruit 1.fruitId' => '0'
},
{
'fruit 2.fruitName' => 'orange.fru',
'fruit 2.fruitId' => '1',
'fruit 2.fruitCount' => '10',
'fruit 2.fruitValue' => '20',
'Name' => 'fruit 2'
}
];
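If the attribute order is not guaranteed (the concern raised above), one option is to buffer each block and apply the prefix only when the closing brace is reached. The following is a rough sketch under that assumption, not a replacement for the code above: the helper name parse_fruit_blocks is made up, and it uses the fruitName value as the prefix and drops the fruitName entry, which is how I read the question.

```perl
use strict;
use warnings;

# Hypothetical helper: parses lines of the fruit file and returns one hash
# reference per inner block. fruitCount and fruitValue are prefixed with the
# block's fruitName, and the fruitName entry itself is dropped.
sub parse_fruit_blocks {
    my @lines = @_;
    my (@blocks, %raw);
    for my $line (@lines) {
        if ($line =~ /^\s*\}\s*$/) {
            next unless %raw;    # outer closing brace, nothing buffered
            my $prefix = delete $raw{fruitName};
            my %block;
            for my $key (keys %raw) {
                my $new = ($key =~ /^fruit(?:Count|Value)$/ && defined $prefix)
                    ? "$prefix.$key"
                    : $key;
                $block{$new} = $raw{$key};
            }
            push @blocks, \%block;
            %raw = ();           # start fresh for the next block
            next;
        }
        # only buffer lines that start with Name or fruitXXXXX
        next unless $line =~ /^\s*(Name|fruit\w+)\s+(.*?)\s*$/;
        my ($key, $value) = ($1, $2);
        $value =~ s/"//g;        # remove double quotes
        $raw{$key} = $value;
    }
    return @blocks;
}

# fruitName deliberately comes last here to show that order does not matter
my @blocks = parse_fruit_blocks(
    '{',
    'fruitCount 5',
    'fruitValue 6',
    'Name "fruit 1"',
    'fruitName "apple.fru"',
    '}',
);
print join(', ', map { "$_ => $blocks[0]{$_}" } sort keys %{ $blocks[0] }), "\n";
```

Because the prefix is resolved at the closing brace, the result is the same wherever fruitName appears inside the block.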

Hash Key value print Issue

I am not getting the counts in $mapA{$Brand_Name}->{Success} and $mapA{$Brand_Name}->{Failure}. Please help with this. I raised this problem in another question, which was closed, so I am raising it here.
Alternatively, is there any other way to increase the count for a particular key?
#!/usr/bin/perl
use Text::CSV;
use POSIX qw(strftime);
use Data::Dumper;
use LWP::Simple;
my $APK_GCM="/root/Basavaraj/GCM/cdr_02-01-2018_StreamzGcm.csv";
my $WEB_GCM="/root/Basavaraj/GCM/cdr_02-01-2018_StreamzWebPushNotification.csv";
my $Yesterday= strftime ("%d-%m-%Y", localtime(time-86400));
my $Current_Date= strftime ("%d-%m-%Y",localtime(time));
print "$Yesterday \n";
print "$Current_Date \n";
open(STDOUT, '>', "/root/Basavaraj/STREAMZ_GCM_APK.txt");
#Creating Class to split the document line by line by comma ,
my $csv = Text::CSV->new({ sep_char => ',' });
open (my $data, '<:encoding(utf8)', $APK_GCM) or die "Could Not open File '$APK_GCM' $!\n";
open (my $data1,'<:encoding(utf8)', $WEB_GCM) or die "Could Not open File '$WEB_GCM' $!\n";
my %mapA;
my $dummyA =<$data>;
while (my $line = <$data>) {
if ($csv->parse($line)) {
my @fields = $csv->fields();
my $Brand_Name=$fields[2];
my $Streamz_Sent=$fields[5];
my $GoogleResA=$fields[5];
$mapA{$Brand_Name} = {Success =>0,Failure=> 0} unless exists ($mapA{$Brand_Name});
my $failureA='{error:MismatchSenderId}';
if ($GoogleResA eq $failureA){
$mapA{$Brand_Name}->{Failure}++;
print "$Brand_Name:$mapA{$Brand_Name}->{Failure} \n";
}else{
$mapA{$Brand_Name}->{Success}++;
print "$Brand_Name:$mapA{$Brand_Name}->{Success} \n";
}
} else {
warn "Line could not be parsed: $line\n";
}
}
#$, = ",";
print " $mapA{$Brand_Name}->{Failure} \n";
my $KeyA;
while (($keyA)=each (%mapA)){
my $success= $mapA{$Brand_Name}->{Success};
my $failure= $mapA{$Brand_Name}->{Failure};
print "$keyA $mapA{$Brand_Name}->{Failure}++ $mapA{$Brand_Name}->{Success}++ \n";
}
foreach my $name ( keys %mapA) {
print " $mapA{$Brand_Name}->{Failure} \n";
print " $mapA{$Brand_Name}->{Success} \n";
print "$name $mapA{$Brand_Name}->{Success} $mapA{$Brand_Name}->{Failure} \n";
}
The code looks messy and unclear, but here is a way to increment the count in this case; I hope it helps. Copy and paste the program below onto your machine and try it out (it demonstrates how to increment a count, in a scenario very similar to the one above).
#!/usr/bin/perl
use strict;
use warnings;
my %results;
%results = ('Brand A' => {'Success' => 0, 'Failure' => 0},
'Brand B' => {'Success' => 1, 'Failure' => 1});
# add new entry into existing hash
#
$results{'Brand C'}{'Success'} = 5;
$results{'Brand C'}{'Failure'} = 6;
# increment Success count for specific brand straightaway
$results{'Brand B'}{'Success'}++;
print "Success count for brand B = $results{'Brand B'}{'Success'}\n";
#print out hash
#
# first key for example Brand A
for my $brand (keys %results)
{
print "Printing brand here: $brand-->";
# next key for example 'Success' or 'Failure'
#
for my $result (keys %{$results{$brand}})
{
#increment success or failure count
$results{$brand}{$result}++;
print "$result-->$results{$brand}{$result},";
}
print "\n";
}
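Applied to the original problem, the counting can be reduced to a single autovivifying increment per row, and the report loop then has to index the hash with its own loop variable (the question's loops keep using $Brand_Name, which only exists inside the while loop). A minimal sketch follows; the helper name tally and the sample rows, including the 'ok' response, are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tallies Success/Failure per brand from
# [brand, response] pairs. $failure_marker is the response text that
# counts as a failure; anything else counts as a success.
sub tally {
    my ($failure_marker, @rows) = @_;
    my %count;
    for my $row (@rows) {
        my ($brand, $response) = @$row;
        my $outcome = $response eq $failure_marker ? 'Failure' : 'Success';
        $count{$brand}{$outcome}++;    # autovivifies the nested hash
    }
    return \%count;
}

# Made-up sample rows; 'ok' is an assumed success response
my $count = tally(
    '{error:MismatchSenderId}',
    [ 'Brand A', 'ok' ],
    [ 'Brand A', '{error:MismatchSenderId}' ],
    [ 'Brand B', 'ok' ],
    [ 'Brand A', 'ok' ],
);

# Use the loop variable $brand as the key, not a variable from another scope
for my $brand (sort keys %$count) {
    printf "%s Success=%d Failure=%d\n",
        $brand,
        $count->{$brand}{Success} // 0,
        $count->{$brand}{Failure} // 0;
}
```

Enabling use strict in the original script would also have flagged $Brand_Name and $keyA as undeclared outside their scopes.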

Read ini files without section names

I want to make a configuration file which holds some objects, like this (where of course none of the parameters can be considered a primary key)
param1=abc
param2=ghj
param1=bcd
param2=hjk
; always the sames parameters
This file could be read, lets say with Config::IniFiles, because it has a direct transcription into ini file, like this
[0]
param1=abc
param2=ghj
[1]
param1=bcd
param2=hjk
with, for example, something like
perl -pe 'if (m/^\s*$/ || !$section ) print "[", ($section++ || 0) , "]"'
And finish with
open my $fh, '<', "/path/to/config_file.ini" or die $!;
$cfg = Config::IniFiles->new( -file => $fh );
(...parse here the sections starting with 0.)
But at this point I wonder whether this is becoming too complex.
(A) Is there a way to transform the $fh so that the Perl one-liner does not have to be run BEFORE reading the file sequentially? That is, to transform the file while Perl is actually reading it.
or
(B) Is there a module to read my wonderful flat database, or something close to it? I seem to recall that GNU coreutils can read this kind of flat file, but I cannot remember how.
You can create a simple subclass of Config::INI::Reader:
package MyReader;
use strict;
use warnings;
use base 'Config::INI::Reader';
sub new {
my $class = shift;
my $self = $class->SUPER::new( @_ );
$self->{section} = 0;
return $self;
}
sub starting_section { 0 };
sub can_ignore { 0 };
sub parse_section_header {
my ( $self, $line ) = @_;
return $line =~ /^\s*$/ ? ++$self->{section} : undef ;
}
1;
With your input this gives:
% perl -MMyReader -MData::Dumper -e 'print Dumper( MyReader->read_file("cfg") )'
$VAR1 = {
'1' => {
'param2' => 'hjk',
'param1' => 'bcd'
},
'0' => {
'param2' => 'ghj',
'param1' => 'abc'
}
};
You can use a variable reference instead of a file name to create a filehandle that reads from it:
use strict;
use warnings;
use autodie;
my $config = "/path/to/config_file.ini";
my $content = do {
local $/;
open my $fh, "<", $config;
"\n". <$fh>;
};
# one liner replacement
my $section = 0;
$content =~ s/^\s*$/ "\n[". $section++ ."]" /mge;
open my $fh, '<', \$content;
my $cfg = Config::IniFiles->new( -file => $fh );
# ...
You can store the modified data in a real file or a string variable, but I suggest that you use paragraph mode by setting the input record separator $/ to the empty string. Like this
use strict;
use warnings;
{
local $/ = ''; # Read file in "paragraphs"
my $section = 0;
while (<DATA>) {
printf "[%d]\n", $section++;
print;
}
}
__DATA__
param1=abc
param2=ghj
param1=bcd
param2=hjk
output
[0]
param1=abc
param2=ghj
[1]
param1=bcd
param2=hjk
Update
If you read the file into a string, adding section identifiers as above, then you can read the result directly into a Config::IniFiles object using a string reference, for instance
my $config = Config::IniFiles->new(-file => \$modified_contents)
This example shows the tie interface, which results in a Perl hash that contains the configuration information. I have used Data::Dump only to show the structure of the resultant hash.
use strict;
use warnings;
use Config::IniFiles;
my $config;
{
open my $fh, '<', 'config_file.ini' or die "Couldn't open config file: $!";
my $section = 0;
local $/ = '';
while (<$fh>) {
$config .= sprintf "[%d]\n", $section++;
$config .= $_;
}
};
tie my %config, 'Config::IniFiles', -file => \$config;
use Data::Dump;
dd \%config;
output
{
# tied Config::IniFiles
"0" => {
# tied Config::IniFiles::_section
param1 => "abc",
param2 => "ghj",
},
"1" => {
# tied Config::IniFiles::_section
param1 => "bcd",
param2 => "hjk",
},
}
You may want to perform operations on a stream of objects (as in PowerShell) instead of a stream of text, so
use strict;
use warnings;
use English;
sub operation {
# do something with objects
...
}
{
local $INPUT_RECORD_SEPARATOR = '';
# object are separated with empty lines
while (<STDIN>) {
# key value
my %object = ( m/^ ([^=]+) = ([[:print:]]*) $ /xmsg );
# key cannot have = included, which is the delimiter
# value are printable characters (one line only)
operation ( \%object );
}
}
I also like the other answers.

Export ElasticSearch objects to CSV with Perl

I need to export some objects from an ElasticSearch db in the form of CSV "tables".
I just need to retrieve all records from a specified index.
I've tried this code from clintongormley, but I'm facing issues. The Perl code is:
#!/usr/bin/perl
use ElasticSearch;
use Text::CSV_XS;
my $csv_file = 'output.csv';
open my $fh, '>:encoding(utf8)', $csv_file or die $!;
my $csv = Text::CSV_XS->new;
my $e = ElasticSearch->new(servers => '127.0.0.1:9200');
my $s = $e->scrolled_search(
index => 'myindex',
type => 'mytype',
query => { match_all => '' }
);
my @field_names = qw(title name foo bar);
while (my $doc = $s->next) {
my @cols = map {$doc->$_} @field_names;
$csv->print($fh, \@cols);
}
close $fh or die $!;
I get the following:
[_na] query malformed, no field after start_object];
I think the problem is in the es query.
Any suggestions?
Mea culpa. I wrote that code very quickly without testing it :)
Also, the code is very old and refers to the now deprecated ElasticSearch.pm module. The new module is Elasticsearch.pm (note the small s).
Here is the code rewritten to use the new module, and so it actually works:
#!/usr/bin/perl
use Elasticsearch;
use Elasticsearch::Scroll;
use Text::CSV_XS;
my $csv_file = 'output.csv';
open my $fh, '>:encoding(utf8)', $csv_file or die $!;
my $csv = Text::CSV_XS->new;
$csv->eol("\r\n");
my $es = Elasticsearch->new( nodes => '127.0.0.1:9200' );
my $s = Elasticsearch::Scroll->new(
es => $es,
index => 'myindex',
type => 'mytype',
body => { query => { match_all => {} } }
);
my @field_names = qw(title count);
while ( my $doc = $s->next ) {
my @cols = map { $doc->{_source}{$_} } @field_names;
$csv->print( $fh, \@cols );
}
close $fh or die $!;
To test it, you can run these curl commands to setup an index with some data:
# delete the test index in case it already exists
curl -XDELETE localhost:9200/myindex
# create some sample docs
curl -XPOST localhost:9200/myindex/mytype/_bulk -d'
{"index": {}}
{"title": "Doc one", "count": 1}
{"index": {}}
{"title": "Doc two", "count": 2}
{"index": {}}
{"title": "Doc three", "count": 3}
'
If you then run the Perl code, the file output.csv will look like this:
"Doc two",2
"Doc three",3
"Doc one",1
Apologies for the bad original example code