How to find strings that contain a substring from a given list

How to find strings that contain a substring from a given list - perl

I have a string that may or may not contain some specific words.
IF it contain the one of the works I want to print the string in a different color (depending on the word)
So I was thinking to have an array containing the list of the words (e.g. one for red word one for yellow and one for green as the example below:
push(#red_word, [ "error","ERROR","Assertion","assertion","Error","ASSERTION","Errors" ]);
push(#yellow_word, [ "WARNING","Warning","warning","PAUSED","Paused","paused","Warnings" ]);
push(#green_word, [ "ACTIVE","Active","active" ]);
$l is the string i want to check, I tried something like this
foreach my $l (#$lines) {
if ($l =~ #red_word) {
print '<FONT COLOR="FF0000">'.$l.'</FONT><br>';
}
else {
if ($l =~ #yellow_word) {
print '<FONT COLOR="FFFF00">'.$l.'</FONT><br>';
}
else {
if ($l =~ #green_word) {
print '<FONT COLOR="008000">'.$l.'</FONT><br>';
}
else {
print '<FONT COLOR="000000">'.$l.'</FONT><br>';
}
}
}
}
but the result is erratic, some lines are printed in red without any relation to the list red_word.
what am I doing wrong?

This isn't doing what you think it's doing:
push(#red_word, [ "error","ERROR","Assertion","assertion","Error","ASSERTION","Errors" ]);
push(#yellow_word, [ "WARNING","Warning","warning","PAUSED","Paused","paused","Warnings" ]);
push(#green_word, [ "ACTIVE","Active","active" ]);
You're creating a two dimensional data structure a single element array, containing a nested array.
$VAR1 = [
[
'error',
'ERROR',
'Assertion',
'assertion',
'Error',
'ASSERTION',
'Errors'
]
];
That match isn't going to work very well as a result. I'm not actually sure what it'll be doing, but it won't be testing 'if the word is in the list'.
Try instead building a regular expression from your array:
my #red_words = (
"error", "ERROR", "Assertion", "assertion",
"Error", "ASSERTION", "Errors"
);
my $is_red = join( "|", map {quotemeta} #red_words );
$is_red = qr/($is_red)/;
print "Red" if $line =~ m/$is_red/;
Perhaps something like this:
#!/usr/bin/env perl
use strict;
use warnings;
my %colour_map = (
'error' => 'FF0000',
'errors' => 'FF0000',
'assertion' => 'FF0000',
'warning' => 'FFFF00',
'warnings' => 'FFFF00',
'paused' => 'FFFF00',
'active' => '008000',
);
my $search = join( "|", map {quotemeta} keys %colour_map );
$search = qr/\b($search)\b/;
my #lines = (
"line containing assertion",
"a warning",
"green for active",
"A line containing ACTIVE"
);
foreach my $line (#lines) {
if ( my ($word) = $line =~ m/$search/ ) {
print "<FONT COLOR=\"$colour_map{lc($word)}\">$line</FONT><BR/>\n";
}
else {
print "<FONT COLOUR=\"000000\">$line</FONT><BR/>\n";
}
}
(Not entirely sure if there's a way to tranpose a list of matches. I'll have another think).

Related

Search whether AoH value exists in same Hash

I have hash which contains some data.
I want my final %hash to be printed like this:
'UGroup=1' => [ 'C72', 'C73', 'C71' ]
Here is my script:
use Data::Dumper;
my %h = (
'C72' => [ 'S=2-1' ],
'C73' => [ 'S=3-1' ],
'C71' => [ 'S=91-1'],
'UGroup=1' => [ 'S=1-1',
'S=2-1',
'S=3-1',
'S=91-1'],
);
print Dumper(\%h);
foreach my $C (sort keys %h) {
next unless $C =~ /UGroup/;
for my $f (#{$h{$C}}){
print "\tf:$f\n";
#This is not correct, but wanted to do something like this.
push #{$hash{$C}}, $f if(exists $h{$f});
}
}
print Dumper(\%hash);
Here in example input hash I need to check if S=91-1 has any key? If yes then associate that key to value for %hash with its original key.
How can I do that?

You didn't name the things, so
S=91-1 shall be a snake,
C71 shall be a cow, and
UGroup=1 shall be a group.
Start by building this hash:
my %cows_by_snake = (
'S=91-1' => [ 'C71' ],
'S=2-1' => [ 'C72' ],
'S=3-1' => [ 'C73' ],
);
Just ignore the keys that of %h that are groups when you do so.
Once you built a hash, it's simply a question of doing the following:
Create an empty result hash.
For each group,
Create an empty collection of cows.
For each snake associated the the group,
Add the cows associated with the snake to the collection.
Eliminate the duplicates in the collection of cows.
Add the group and the associated cows to the result hash.
my #groups;
my #cows;
for my $cow_or_group (keys(%h)) {
if ($cow_or_group =~ /^UGroup=/) {
push #groups, $cow_or_group;
} else {
push #cows, $cow_or_group;
}
}
my %cows_by_snake;
for my $cow (#cows) {
for my $snake (#{ $h{$cow} }) {
push #{ $cows_by_snake{$snake} }, $cow;
}
}
my %results;
for my $group (#groups) {
my %group_cows;
for my $snake (#{ $h{$group} }) {
for my $cow (#{ $cows_by_snake{$snake} }) {
++$group_cows{$cow};
}
}
$results{$group} = [ sort keys %group_cows ];
}

In Perl, how can I skip an empty key when traversing a hash?

This is my problem, I'm not very knowledgeable in Perl, and I have this function that needs to be fixed.
When this function deviceModelMenu() is called, the CLI displays the following text:
The following models are available
==================================================
1.
2. Cisco1240
3. Catalyst3750
4. Catalyst3650
5. HP2524
The first item is empty, which is wrong, and I need to fix that, the piece of code that displays this menu is:
my $features = shift;
print "=" x 50, "\n";
print "The following models are available\n";
print "=" x 50, "\n";
my $i=1;
foreach (keys %{$features->{features}[0]->{deviceModel}})
{
print "$i. $_ \n";
$i++;
}
If I add the following line:
warn Dumper($features->{features}[0]->{deviceModel});
It dumps this:
$VAR1 = {
'deviceModel' => {
'' => {
'cfg' => []
},
'Cisco1240' => {
'cfg' => [
'cisco1240feature.cfg'
]
},
'Catalyst3750' => {
'cfg' => [
'catalyst3750feature.cfg'
]
},
'Catalyst3650' => {
'cfg' => [
'catalyst3650feature.cfg'
]
},
'HP2524' => {
'cfg' => [
'hp2524feature.cfg'
]
}
}
};
As you may notice, the first item is indeed empty. I added the following line to skip it, and just print the rest of the info:
if ($_ eq '') {
shift;
}
But it doesn't seem to work do what I want. I want to skip the item if it's empty.

Well, shifting #ARGV (implicit argument to shift in main program) nor shifting #_ (implicit argument of shift in a function) are not going to help you, because you are not printing either of them.
You can either:
Not add the '' entry in the first place (depends on how it's generated)
Remove the '' entry before printing:
delete $features->{features}[0]->{deviceModel}->{''};
Don't print the entry:
if($_ eq '') {
next;
}
or
if($_ ne '') {
print "$i. $_ \n";
$i++;
}

foreach (keys %{$features->{features}[0]->{deviceModel}})
{
next unless length($_);
print "$i. $_ \n";
$i++;
}

#!/usr/bin/env perl
use strict; use warnings;
my $devices = {
'deviceModel' => {
'' => { 'cfg' => [] },
'Cisco1240' => { 'cfg' => ['cisco1240feature.cfg' ] },
'Catalyst3750' => { 'cfg' => [ 'catalyst3750feature.cfg' ]},
'Catalyst3650' => { 'cfg' => [ 'catalyst3650feature.cfg' ]},
'HP2524' => { 'cfg' => [ 'hp2524feature.cfg' ]},
}
};
{
my $item = 1;
for my $d (grep length, keys %{ $devices->{deviceModel} }) {
printf "%2d. %s\n", $item++, $d;
}
}
Output:
1. Catalyst3750
2. Cisco1240
3. Catalyst3650
4. HP2524

How to access hash of hash within conditional in Perl

I have a code like this:
use Data::Dumper;
my %hash = (
'chrX' => {
'b' => [
'-51811268 210',
'-51810794 350',
'-51809935 298'
],
'f' => [
'51929018 210',
'51929492 350',
'51930351 298'
]
}
);
foreach my $cnam ( keys %hash ) {
my #lpos_f = ();
my #lpos_b = ();
if ( $hash{$cnam}{"f"} ) {
#lpos_f = #{ $hash{$cnam}{"f"} };
print "+\n";
print Dumper \#lpos_f;
}
elsif ( $hash{$cnam}{"b"} ) {
#lpos_b = #{ $hash{$cnam}{"b"} };
print "-\n";
print Dumper \#lpos_b;
}
}
Why it didn't give print output in each ELSIF condition such
that it gives both these.
+
[
'51929018 210',
'51929492 350',
'51930351 298'
];
-
['-51811268 210',
'-51810794 350',
'-51809935 298'
];
Currently It only gives "+" output

Because %temp is not %hash. use strict would have told you.
Moreover, you cannot get both of if / else. Either the condition is true and you get the first part, or it is not and you get the else part. (With elsif, the second condition might be not true as well and you get nothing).

Simple hash search by value

I have a simple hash, and would like to return the $key based on $value criteria. That is, for line 14, what code would I need to return the $key where the $value is "yellow"?
1 #!/usr/bin/perl
2
3 # This program creates a hash then
4 # prints out what is in the hash
5
6 %fruit = (
7 'apple' => ['red','green'],
8 'kiwi' => 'green',
9 'banana' => 'yellow',
10 );
11
12 print "The apple is #{$fruit{apple}}.\n";
13 print "The kiwi is $fruit{kiwi}.\n";
14 print "What is yellow? ";

grep is the right tool for this job:
my #all_matches = grep { $fruit{$_} eq 'yellow' } keys %fruit;
print("$_ ") foreach #matching_keys;
my ($any_match) = grep { $fruit{$_} eq 'yellow' } keys %fruit;

I'm not so sure that's easy to do efficiently with a one-way hash. The whole point of a hash is to convert the key into a value (or position of the value if you're looking under the covers). You can do an exhaustive search over all the values, collecting the keys as you go but that's not as efficient as a hash lookup.
In order to go the other way efficiently, you might want to consider a two-way hash, something like:
%fruit = (
'apple' => ['red','green'],
'kiwi' => 'green',
'banana' => 'yellow',
);
%antifruit = (
'red' => 'apple',
'green' => ['apple','kiwi'],
'yellow' => 'banana',
);
print "The apple is #{$fruit{'apple'}}.\n";
print "The kiwi is $fruit{'kiwi'}.\n";
print "A yellow thing is $antifruit{'yellow'}.\n";

sub find_key {
my ( $h, $value ) = #_;
while ( my ( $k, $v ) = each %$h ) {
return $k if $v eq $value;
}
return;
}
So you could call it like so:
find_key( \%fruit, 'yellow' );

Since some of your values are arrays, you need to check for that.
Calling:
my #fruit = getfruit(\%fruit, $colour);
The subroutine:
sub getfruit {
my ($fruit, $col) = #_;
my #result;
for my $key (keys %$fruit) {
if (ref $fruit->{$key} eq 'ARRAY') {
for (#{$fruit->{$key}}) {
push #result, $key if /^$col$/i;
}
} else {
push #result, $key if $fruit->{$key} =~ /^$col$/i;
}
}
return #result;
}
Using a regex instead of eq is optional, just be mindful of keeping the same case, since Yellow and yellow are considered different keys.

I note your example has references to anonymous arrays, so I would just do a long winded foreach/if loop:
my %fruit = (
'apple' => ['red','green'],
'kiwi' => 'green',
'banana' => 'yellow',
);
print "The apple is #{$fruit{apple}}.\n";
print "The kiwi is $fruit{kiwi}.\n";
print "What is yellow? ";
my $ele;
my $search = 'yellow';
my #match = ();
foreach $ele (keys(%fruit)) {
if(ref($fruit{$ele}) eq 'ARRAY' and
grep { $_ eq $search } #{ $fruit{$ele} }) {
push(#match, $ele);
} elsif(!ref($fruit{$ele}) and $fruit{$ele} eq $search) {
push(#match, $ele);
}
}
print join(", ", #match) . "\n";

Perl Working On Two Hash References

I would like to compare the values of two hash references.
The data dumper of my first hash is this:
$VAR1 = {
'42-MG-BA' => [
{
'chromosome' => '19',
'position' => '35770059',
'genotype' => 'TC'
},
{
'chromosome' => '2',
'position' => '68019584',
'genotype' => 'G'
},
{
'chromosome' => '16',
'position' => '9561557',
'genotype' => 'G'
},
And the second hash is similar to this but with more hashes in the array. I would like to compare the genotype of my first and second hash if the position and the choromosome matches.
map {print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n"}sort keys %$cave_snp_list;
map {print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n"}sort keys %$geno_seq_list;
I could do that for the first array of the hashes.
Could you help me in how to work for all the arrays?
This is my actual code in full
#!/software/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Benchmark;
use Config::Config qw(Sequenom.ini);
useDatabase::Conn;
use Data::Dumper;
GetOptions("sam=s" => \my $sample);
my $geno_seq_list = getseqgenotypes($sample);
my $cave_snp_list = getcavemansnpfile($sample);
#print Dumper($geno_seq_list);
print scalar %$geno_seq_list, "\n";
foreach my $sam (keys %{$geno_seq_list}) {
my $seq_used = $geno_seq_list->{$sam};
my $cave_used = $cave_snp_list->{$sam};
print scalar(#$geno_seq_list->{$_}) if sort keys %$geno_seq_list, "\n";
print scalar(#$cave_used), "\n";
#foreach my $seq2com (# {$seq_used } ){
# foreach my $cave2com( # {$cave_used} ){
# print $seq2com->{chromosome},":" ,$cave2com->{chromosome},"\n";
# }
#}
map { print "$_= $cave_snp_list->{$_}->[0]->{chromosome}\n" } sort keys %$cave_snp_list;
map { print "$_= $geno_seq_list->{$_}->[0]->{chromosome}\n" } sort keys %$geno_seq_list;
}
sub getseqgenotypes {
my $snpconn;
my $gen_list = {};
$snpconn = Database::Conn->new('live');
$snpconn->addConnection(DBI->connect('dbi:Oracle:pssd.world', 'sn', 'ss', { RaiseError => 1, AutoCommit => 0 }),
'pssd');
#my $conn2 =Database::Conn->new('live');
#$conn2->addConnection(DBI->connect('dbi:Oracle:COSI.world','nst_owner','nst_owner', {RaiseError =>1 , AutoCommit=>0}),'nst');
my $id_ind = $snpconn->execute('snp::Sequenom::getIdIndforExomeSample', $sample);
my $genotype = $snpconn->executeArrRef('snp::Sequenom::getGenotypeCallsPosition', $id_ind);
foreach my $geno (#{$genotype}) {
push #{ $gen_list->{ $geno->[1] } }, {
chromosome => $geno->[2],
position => $geno->[3],
genotype => $geno->[4],
};
}
return ($gen_list);
} #end of sub getseqgenotypes
sub getcavemansnpfile {
my $nstconn;
my $caveman_list = {};
$nstconn = Database::Conn->new('live');
$nstconn->addConnection(
DBI->connect('dbi:Oracle:CANP.world', 'nst_owner', 'NST_OWNER', { RaiseError => 1, AutoCommit => 0 }), 'nst');
my $id_sample = $nstconn->execute('nst::Caveman::getSampleid', $sample);
#print "IDSample: $id_sample\n";
my $file_location = $nstconn->execute('nst::Caveman::getCaveManSNPSFile', $id_sample);
open(SNPFILE, "<$file_location") || die "Error: Cannot open the file $file_location:$!\n";
while (<SNPFILE>) {
chomp;
next if /^>/;
my #data = split;
my ($nor_geno, $tumor_geno) = split /\//, $data[5];
# array of hash
push #{ $caveman_list->{$sample} }, {
chromosome => $data[0],
position => $data[1],
genotype => $nor_geno,
};
} #end of while loop
close(SNPFILE);
return ($caveman_list);
}

The problem that I see is that you're constructing a tree for generic storage of data, when what you want is a graph, specific to the task. While you are constructing the record, you could also be constructing the part that groups data together. Below is just one example.
my %genotype_for;
my $record
= { chromosome => $data[0]
, position => $data[1]
, genotype => $nor_geno
};
push #{ $gen_list->{ $geno->[1] } }, $record;
# $genotype_for{ position }{ chromosome }{ name of array } = genotype code
$genotype_for{ $data[1] }{ $data[0] }{ $sample } = $nor_geno;
...
return ( $caveman_list, \%genotype_for );
In the main line, you receive them like so:
my ( $cave_snp_list, $geno_lookup ) = getcavemansnpfile( $sample );
This approach at least allows you to locate similar position and chromosome values. If you're going to do much with this, I might suggest an OO approach.
Update
Assuming that you wouldn't have to store the label, we could change the lookup to
$genotype_for{ $data[1] }{ $data[0] } = $nor_geno;
And then the comparison could be written:
foreach my $pos ( keys %$small_lookup ) {
next unless _HASH( my $sh = $small_lookup->{ $pos } )
and _HASH( my $lh = $large_lookup->{ $pos } )
;
foreach my $chrom ( keys %$sh ) {
next unless my $sc = $sh->{ $chrom }
and my $lc = $lh->{ $chrom }
;
print "$sc:$sc";
}
}
However, if you had limited use for the larger list, you could construct the specific case
and pass that in as a filter when creating the longer list.
Thus, in whichever loop creates the longer list, you could just go
...
next unless $sample{ $position }{ $chromosome };
my $record
= { chromosome => $chromosome
, position => $position
, genotype => $genotype
};
...

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

How to find strings that contain a substring from a given list - perl

Related

Search whether AoH value exists in same Hash

In Perl, how can I skip an empty key when traversing a hash?

How to access hash of hash within conditional in Perl

Simple hash search by value

Perl Working On Two Hash References

Categories

Resources