Why does my standard input not match lines in curly (flower) brackets in Perl? - perl

I tried to read a file and compare each line typed on standard input against its contents. In my case the input fails to match the lines whose contents are wrapped in flower brackets (curly braces).
My code:
#!/usr/local/bin/perl
use strict;
use warnings;
my $file = 'file.txt';
open my $fh, "<", $file
or die "Could not open '$file': $!";
chomp(my @files = <$fh>);
close $fh
or die "Could not close '$file' $!";
while (my $stdin = <>) {
chomp $stdin;
if ( grep { $stdin eq $_ } @files ) {
print "@files\n";
last
}
else {
print "There is no word in the $file\n";
last;
}
}
File.txt:
{data1}
data2
data3
{data4}
File Execution:
perl t.pl
data1
There is no word in the file.txt

Looks like this question has been edited since I first looked at it a few hours ago. Originally, the crucial line looked like this:
if ( grep { $stdin eq $_ } @files ) {
That was never going to work because you are giving it "data1" as input and none of the lines matches that string using eq. You have a line that contains "data1", but as it is surrounded by "{" and "}", the strings are different: 'data1' eq '{data1}' is obviously false.
You have now changed that line to:
if ( grep { $stdin && $_ } @files ) {
And that's very strange. This check asks the question "do both $stdin and $_ contain true values?". And that will almost certainly always be true. I'm really not sure what that change was supposed to achieve.
Your question doesn't actually say what you're trying to do here. But I'm guessing that you want to match if any of the lines contains the string that is entered (but it's ok if it doesn't make up the entire line). In that case, you want a regex check and your line of code should be:
if ( grep { /\Q$stdin/ } @files ) {
Note: I've added the \Q as suggested in the comments. This is a good idea as it prevents any regex metacharacters in $stdin (such as "{" or ".") from being treated as special.
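To make the difference between the eq test and the regex test concrete, here is a minimal sketch. The sample lines copy file.txt from the question; the helper names matches_exactly and matches_substring are made up for illustration and are not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines standing in for the chomped contents of file.txt.
my @files = ('{data1}', 'data2', 'data3', '{data4}');

# Hypothetical helper: number of lines exactly equal to $input.
sub matches_exactly {
    my ($input, @lines) = @_;
    return scalar grep { $_ eq $input } @lines;
}

# Hypothetical helper: number of lines containing $input as a substring.
# \Q...\E makes any metacharacters in $input match literally.
sub matches_substring {
    my ($input, @lines) = @_;
    return scalar grep { /\Q$input\E/ } @lines;
}

for my $input ('data1', 'data2', '{data4}', 'data9') {
    printf "%-8s exact=%d substring=%d\n",
        $input,
        matches_exactly($input, @files),
        matches_substring($input, @files);
}
```

With this data, "data1" has no exact match but is found as a substring of "{data1}", which is the behaviour I am guessing you want.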

Related

Perl - Compare two large txt files and return the required lines from the first

So I am quite new to perl programming. I have two txt files, combined_gff.txt and pegs.txt.
I would like to check if each line of pegs.txt is a substring for any of the lines in combined_gff.txt and output only those lines from combined_gff.txt in a separate text file called output.txt
However my code returns empty. Any help please ?
P.S. I should have mentioned this. Both the contents of the combined_gff and pegs.txt are present as rows. One row has a string. second row has another string. I just wish to pickup the rows from combined_gff whose substrings are present in pegs.txt
#!/usr/bin/perl -w
use strict;
open (FILE, "<combined_gff.txt") or die "error";
my @gff = <FILE>;
close FILE;
open (DATA, "<pegs.txt") or die "error";
my @ext = <DATA>;
close DATA;
my $str = ''; #final string
foreach my $gffline (@gff) {
foreach my $extline (@ext) {
if ( index($gffline, $extline) != -1) {
$str=$str.$gffline;
$str=$str."\n";
exit;
}
}
}
open (OUT, ">", "output.txt");
print OUT $str;
close (OUT);
The first problem is exit. The output file is never created if a substring is found.
The second problem is the missing chomp: you don't remove the newlines from the lines, so a substring can only be found when a string from pegs.txt is a suffix of a string from combined_gff.txt.
Even after fixing these two problems, the algorithm will be very slow, as you're comparing each line from one file to each line of the second file. It will also print a line multiple times if it contains several different substrings (not sure if that's what you want).
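The newline problem can be seen in isolation with a small sketch. The two sample strings are invented stand-ins for one line from each file, and contains is a hypothetical wrapper around the index() test from the question.

```perl
use strict;
use warnings;

# Hypothetical sample data; real lines would come from the two files.
my $gffline = "chr1\tpeg.12\tgene\n";   # line as read, newline still attached
my $extline = "peg.12\n";               # search string as read

# Hypothetical helper wrapping the index() test from the question.
sub contains {
    my ($haystack, $needle) = @_;
    return index($haystack, $needle) != -1 ? 1 : 0;
}

# The needle still ends in "\n", so it is not found in the middle of the line.
print "raw:     ", contains($gffline, $extline), "\n";

# After chomp, the same search succeeds.
chomp(my $gff_clean = $gffline);
chomp(my $ext_clean = $extline);
print "chomped: ", contains($gff_clean, $ext_clean), "\n";
```

The un-chomped search only succeeds when the string from pegs.txt happens to sit at the very end of the other line.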
Here's a different approach: First, read all the lines from pegs.txt and assemble them into a regex (quotemeta is needed so that special characters in substrings are interpreted literally in the regex). Then, read combined_gff.txt line by line, if the regex matches the line, print it.
#!/usr/bin/perl
use warnings;
use strict;
open my $data, '<', 'pegs.txt' or die $!;
chomp( my @ext = <$data> );
my $regex = join '|', map quotemeta, @ext;
open my $file, '<', 'combined_gff.txt' or die $!;
open my $out, '>', 'output.txt' or die $!;
while (<$file>) {
print {$out} $_ if /$regex/;
}
close $out;
I also switched to 3 argument version of open with lexical filehandles as it's the canonical way (3 argument version is safe even for files named >file or rm *| and lexical filehandles aren't global and are easier to pass as arguments to subroutines). Also, showing the actual error is more helpful than just dying with "error".
As choroba says, you don't need the "exit" inside the loop, since it ends the execution of the whole script, and you must remove the line feeds (LF) by chomping the lines in order to find the matches.
Following the logic of your script I made one with the corrections and it worked fine.
#!/usr/bin/perl -w
use strict;
open (FILE, "<combined_gff.txt") or die "error";
my @gff = <FILE>;
close FILE;
open (DATA, "<pegs.txt") or die "error";
my @ext = <DATA>;
close DATA;
my $str = ''; #final string
foreach my $gffline (@gff) {
chomp($gffline);
foreach my $extline (@ext) {
chomp($extline);
print $extline;
if ( index($gffline, $extline) > -1) {
$str .= $gffline ."\n";
}
}
}
open (OUT, ">", "output.txt");
print OUT $str;
close (OUT);
Hope it works for you.
Welcho

How do I convert spaces to underscores in Perl?

I am writing a code to create files named after one column in an array, and then have all the common values of another column within it. I have successfully done this, but now I would like to eliminate the white space in-between the file names and convert it into underscores. How do I do that?
#!/usr/local/bin/perl
use strict;
my @traitarray;
my $traitarray;
my $input ;
my %traithash ;
my $t_out ;
my $TRAIT;
my $SNPS;
open ($input, "gwas_catalog_v1.0-downloaded_2015-07-08") || die () ;
while(<$input>) {
@traitarray = split (/\t/);
$TRAIT = $traitarray[7];
$SNPS = $traitarray[21];
if (!exists $traithash {$TRAIT}) {
open ($t_out, ">outputFiles/".$TRAIT.".txt");
print ($t_out "$SNPS\n");
$traithash {$TRAIT} = 1 ;
push (@traitarray, $TRAIT) ;
}
else {
print $t_out "$SNPS\n";
}
}
foreach ($traitarray) {
close "$TRAIT.txt";
}
I have tried looking for an answer but many of the questions either include something else as well, or how to go about this within the bash terminal, something I am not comfortable with yet, as I am still new to coding.
The file is 10947980 lines, and has 33 columns.
You're presumably talking about the file you open with this
open ($t_out, ">outputFiles/".$TRAIT.".txt")
You can do that by first applying the transliteration operator, tr/ /_/, to the file name, and your open call would be better written like this:
my $outfile = "outputFiles/$TRAIT.txt";
$outfile =~ tr/ /_/;
open my $t_out, '>', $outfile or die qq{Unable to open "$outfile" for output: $!};
I assume you're referring to the value in the $TRAIT var which is then used as part of the new filename.
$TRAIT = $traitarray[7];
$TRAIT =~ s/\s+/_/g;
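The two suggestions above are not quite equivalent, which a short sketch may make clearer. The sub names underscore_tr and underscore_s and the sample trait value are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, named here only for illustration.
sub underscore_tr {
    my ($name) = @_;
    $name =~ tr/ /_/;       # each space character becomes one underscore
    return $name;
}

sub underscore_s {
    my ($name) = @_;
    $name =~ s/\s+/_/g;     # each run of whitespace becomes one underscore
    return $name;
}

my $trait = "Type 2  diabetes";   # made-up value containing a double space
print underscore_tr($trait), "\n";   # Type_2__diabetes
print underscore_s($trait), "\n";    # Type_2_diabetes
```

tr/ /_/ touches only literal spaces and keeps their count, while s/\s+/_/g also covers tabs and other whitespace and collapses runs. Which one you want depends on how the trait names look in your file.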

Find a line by using filereader in perl

I created a text box. I want to check what the user writes in the text box by reading a text file and comparing each line with the text using Perl. In my code the text box value is read with param('text').
open(DATA, "<baba.txt") or die "Couldn't open file file.txt, $!";
while(<DATA>)
{
if($_=~param('text'))
{
print $_;
}
}
I have no problem reading the file, but I couldn't get the matches to work. It returned nothing.
What is wrong with my code?
The right side of the =~ operator needs to be a regular expression.
See this site for more details.
while (<DATA>)
{
chomp;
if (param('text') =~ /\Q$_/)
{
print $_;
}
}
Perhaps the following will be helpful:
use strict;
use warnings;
my $text = <<END;
This is just a BU.NCH of text
in a here document that will
be used for some matching in
just a little bit.
END
while (<DATA>) {
chomp;
if ( $text =~ /\b\Q$_\E\b/i ) {
print $_, "\n";
}
}
__DATA__
this
some
bu.nch
a
hello
world
Output:
this
some
bu.nch
a
Before attempting to match a word read from a file, you need to chomp it to remove the record separator (if any), which is usually \n. There are also a few other items for you to consider:
1. Whether you want a case-insensitive match
2. Escaping any metacharacters which may be present in your words
3. Forcing word boundaries to prevent an in-string match
Item (1) above is achieved by using the /i modifier. Item (2) is done by enclosing the 'word' in the regex like this: \Q$_\E (\Quote-meta; \End quote-meta). And the last uses \b: \b\Q$_\E\b.
Hope this helps!
try this
my $text = param('text');
open(DATA, "<baba.txt") or die "Couldn't open file file.txt, $!";
while(<DATA>)
{
if($_=~ /$text/ )
{
print $_;
}
}
because this -> $_=~param('text') is not a regexp search

Reading file line by line iteration issue

I have the following simple piece of code (identified as the problem piece of code and extracted from a much larger program).
Is it me, or can you see an obvious error in this code that is stopping it from matching against $variable and printing $found when it definitely should be doing so?
Nothing is printed when I try to print $variable, and there are definitely matching lines in the file I am using.
The code:
if (defined $var) {
open (MESSAGES, "<$messages") or die $!;
my $theText = $mech->content( format => 'text' );
print "$theText\n";
foreach my $variable (<MESSAGES>) {
chomp ($variable);
print "$variable\n";
if ($theText =~ m/$variable/) {
print "FOUND\n";
}
}
}
I have located this as the point at which the error is occurring but cannot understand why?
There may be something I am totally overlooking, as it's very late.
Update: I have since realised that I misread your question and this probably doesn't solve the problem. However, the points are valid, so I am leaving them here.
You probably have regular expression metacharacters in $variable. The line
if ($theText =~ m/$variable/) { ... }
should be
if ($theText =~ m/\Q$variable/) { ... }
to escape any that there are.
But are you sure you don't just want eq?
In addition, you should read from the file using
while (my $variable = <MESSAGES>) { ... }
as a for loop will unnecessarily read the entire file into memory. And please use a better name than $variable.
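Putting those two points together, a rough sketch might look like the following. The sub name find_messages, the sample messages, and the page text are all invented; an in-memory filehandle stands in for your messages file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines read from $fh that occur literally in $text.
sub find_messages {
    my ($fh, $text) = @_;
    my @found;
    while (my $message = <$fh>) {   # reads one line per iteration
        chomp $message;
        next if $message eq '';     # skip blank lines rather than use an empty pattern
        push @found, $message if $text =~ /\Q$message\E/;
    }
    return @found;
}

# An in-memory filehandle stands in for the messages file.
my $messages = "Error (code 1)\nAll good\na+b\n";
open my $fh, '<', \$messages or die "Cannot open in-memory file: $!";

my $page_text = "Server said: Error (code 1), expression a+b rejected";
print "FOUND: $_\n" for find_messages($fh, $page_text);
close $fh;
```

Without the \Q, the parentheses in "Error (code 1)" and the plus in "a+b" would be read as regex syntax, and neither line would match this text.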
This works for me.. Am I missing the question at hand? You're just trying to match "$theText" to anything on each line in the file right?
#!/usr/bin/perl
use warnings;
use strict;
my $fh;
my $filename = $ARGV[0] or die "$0 filename\n";
open $fh, "<", $filename;
my $match_text = "whatever";
my $matched = '';
# I would use a while loop, out of habit here
#while(my $line = <$fh>) {
foreach my $line (<$fh>) {
$matched =
$line =~ m/$match_text/ ? "Matched" : "Not matched";
print $matched . ": " . $line;
}
close $fh
./test.pl testfile
Not matched: this is some textfile
Matched: with a bunch of lines or whatever and
Not matched: whatnot....
Edit: Ah, I see.. Why don't you try printing before and after the "chomp()" and see what you get? That shouldn't be the issue, but it doesn't hurt to test each case..

How can I find the strings from one file in another file in Perl?

The script below takes function names in a text file and scans on a
folder that contains multiple c,h files. It opens those files one-by-one and
reads each line. If the match is found in any part of the files, it prints the
line number and the line that contains the match.
Everything is working fine except that the comparison is not working properly. I would be very grateful to whoever solves my problem.
#program starts:
use FileHandle;
print "ENTER THE PATH OF THE FILE THAT CONTAINS THE FUNCTIONS THAT YOU WANT TO
SEARCH: ";#getting the input file
our $input_path = <STDIN>;
$input_path =~ s/\s+$//;
open(FILE_R1,'<',"$input_path") || die "File open failed!";
print "ENTER THE PATH OF THE FUNCTION MODEL: ";#getting the folder path that
#contains multiple .c,.h files
our $model_path = <STDIN>;
$model_path =~ s/\s+$//;
our $last_dir = uc(substr ( $model_path,rindex( $model_path, "\\" ) +1 ));
our $output = $last_dir."_FUNC_file_names";
while(our $func_name_input = <FILE_R1> )#$func_name_input is the function name
#that is taken as the input
{
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
$func_name_input=reverse($func_name_input);
$func_name_input=substr($func_name_input,index($func_name_input," ")+1);
#above 4 lines are func_name_input is choped and only part of the function
#name is taken.
opendir FUNC_MODEL,$model_path;
while (our $file = readdir(FUNC_MODEL))
{
next if($file !~ m/\.(c|h)/i);
find_func($file);
}
close(FUNC_MODEL);
}
sub find_func()
{
my $fh1 = FileHandle->new("$model_path//$file") or die "ERROR: $!";
while (!$fh1->eof())
{
my $func_name = $fh1->getline(); #getting the line
**if($func_name =~$func_name_input)**#problem here it does not take the
#match
{
next if($func_name=~m/^\s+/);
print "$.,$func_name\n";
}
}
}
$func_name_input=substr($func_name_input,rindex($func_name_input,"\("+1);
You're missing an ending parenthesis. Should be:
$func_name_input=substr($func_name_input,rindex($func_name_input,"\(")+1);
There's probably an easier way than those four statements, too. But it's a little early to wrap my head around it all. Do you want to match "foo" in "function foo() {"? If so, you could use a regex like /\s+([^( ]+)/.
When you say $func_name =~$func_name_input, you're treating all characters in $func_name_input as special regex characters. If this is not what you mean to do, you can use quotemeta (perldoc -f quotemeta): $func_name =~quotemeta($func_name_input) or $func_name =~ qr/\Q$func_name_input\E/.
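As a small illustration of what the quoting does, here is a sketch with a made-up function name fragment; line_mentions is a hypothetical helper, not something from your script.

```perl
use strict;
use warnings;

# Hypothetical helper: does $line contain $name as literal text?
sub line_mentions {
    my ($line, $name) = @_;
    return $line =~ /\Q$name\E/ ? 1 : 0;
}

my $name = 'init(';                       # made-up function name fragment
print quotemeta($name), "\n";             # prints init\(
print line_mentions('void init(void);', $name), "\n";    # prints 1
print line_mentions('void initx(void);', $name), "\n";   # prints 0
```

Used unquoted as a pattern, a string like 'init(' would not even compile as a regex because of the unmatched parenthesis, so the quoting matters whenever the names come from a file.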
Debugging will be easier with strictures (and a syntax-highlighting editor). Also note that, if you're not using those variables in other files, "our" doesn't do anything "my" wouldn't do for file-scoped variables.
find + xargs + grep does 90% of what you want.
find . -name '*.[ch]' | xargs grep -n your_pattern
ack does it even easier.
ack --type=cc your_pattern
Simply take your list of patterns from your file and "or" them together.
ack --type=cc 'foo|bar|baz'
This has the benefit of searching the files only once, rather than once for each pattern as you're doing.
I still think you should just use ack, but your code needed some serious love.
Here is an improved version of your program. It now takes the directory to search and patterns on the command line rather than having to ask for (and the user write) files. It searches all the files under the directory, not just the ones in the directory, using File::Find. It does this in one pass by concatenating all the patterns into regular expressions. It uses regexes instead of index() and substr() and reverse() and oh god. It simply uses built in filehandles rather than the FileHandle module and checking for eof(). Everything is declared lexical (my) instead of global (our). Strict and warnings are on for easier debugging.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
die "Usage: search_directory function ...\n" unless @ARGV >= 2;
my $Search_Dir = shift;
my $Pattern = build_pattern(@ARGV);
find(
{
wanted => sub {
return unless $File::Find::name =~ m/\.(c|h)$/i;
find_func($File::Find::name, $Pattern);
},
no_chdir => 1,
},
$Search_Dir
);
# Join all the function names into one pattern
sub build_pattern {
my @patterns;
for my $name (@_) {
# Turn foo() into foo. This replaces all that reverse() and rindex()
# and substr() stuff.
$name =~ s{\(.*}{};
# Use \Q to protect against regex metacharacters in the input
push @patterns, qr{\Q$name\E};
}
# Join them up into one pattern.
return join "|", @patterns;
}
sub find_func {
my( $file, $pattern ) = @_;
open(my $fh, "<", $file) or die "Can't open $file: $!";
while (my $line = <$fh>) {
# XXX not all functions are unindented, but your choice
next if $line =~ m/^\s+/;
print "$file:$.: $line" if $line =~ $pattern;
}
}