How to make a script for two txt files with different names in Perl

I want to perform the same calculations on two similar files, but I do not want to duplicate the code for each file or create two scripts for this.
my $file = "file1.txt";
my $tempfile = "file1_temp.txt";
if (not defined $file) {
die "Input file not found";
}
open(my $inputFileHandler, '<:encoding(UTF-8)', $file)
or die "Could not open file '$file' $!";
open(my $outs, '>', $tempfile) or die $!;
/*Calculations made*/
close($outs);
copy($tempfile,$file) or die "Copy failed: $!";
unlink($tempfile) or die "Could not delete the file!\n";
close($inputFileHandler);
So I want to do the exact same calculations for file2_temp.txt and copy the result to file2.txt. Is there a way to do it without writing the code again for file2?
Thank you very much.

Write your code as a Unix filter. Read the data from STDIN and write it to STDOUT. Your code will be simpler and your program will be more flexible.
#!/usr/bin/perl
use strict;
use warnings;
while (<STDIN>) {
    # do a calculation using the data that is in $_,
    # putting the result in $output_data
    my $output_data = $_;
    print $output_data;
}
The cleverness is in how you call the program:
$ ./my_clever_filter < file1.txt > file1_out.txt
$ ./my_clever_filter < file2.txt > file2_out.txt
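For example, a filter whose "calculation" is simply to increment every integer on each line (a placeholder; substitute your own logic) could be written as:
#!/usr/bin/perl
use strict;
use warnings;

while (<STDIN>) {
    s/(\d+)/$1 + 1/ge;   # placeholder calculation: increment every integer
    print;
}
The same program then serves file1.txt, file2.txt, and any other file, purely through redirection.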
See The Unix Filter Model: What, Why and How? for far more information.

Assuming your code is well written (not manipulating any globals, etc.), you could use a for-loop:
foreach my $prefix ('file1', 'file2') {
    my $file     = $prefix . ".txt";
    my $tempfile = $prefix . "_temp.txt";
    ...
}
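Concretely, the per-file work from your original code could move into a subroutine that the loop calls once per file (a sketch; the calculation itself is still whatever your original code does):
use strict;
use warnings;
use File::Copy;

sub process_file {
    my ($file, $tempfile) = @_;
    open(my $in,   '<:encoding(UTF-8)', $file)     or die "Could not open file '$file' $!";
    open(my $outs, '>:encoding(UTF-8)', $tempfile) or die $!;
    while (my $line = <$in>) {
        # calculations made on $line here
        print $outs $line;
    }
    close($outs);
    close($in);
    copy($tempfile, $file) or die "Copy failed: $!";
    unlink($tempfile) or die "Could not delete the file!\n";
}

foreach my $prefix ('file1', 'file2') {
    process_file($prefix . ".txt", $prefix . "_temp.txt");
}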

There is a certain Perl feature that is designed especially for cases like this, and that is this:
$ perl -pi -e'# calculations made here' file1.txt file2.txt ... fileN.txt
This is loosely referred to as an "in-place edit", and it basically does what your code does: it writes to a temp file and then overwrites the original. It applies your calculations to every file named as an argument. If you have complex calculations, you can put them in a file and skip the -e'....' part:
$ perl -pi foo.pl file1.txt ...
Say for example that your "calculations" consist of incrementing each pair of numbers by 1:
s/(\d+) (\d+)/($1 + 1) . " " . ($2 + 1)/ge
You would do either
$ perl -pi -e's/(\d+) (\d+)/($1 + 1) . " " . ($2 + 1)/ge' file1.txt file2.txt
$ perl -pi foo.pl file1.txt file2.txt
Where foo.pl contains the code.
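That is, foo.pl would contain just the calculation, for example the substitution above:
# foo.pl -- the program body that -p wraps in a read/print loop
s/(\d+) (\d+)/($1 + 1) . " " . ($2 + 1)/ge;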
Be aware that the -i switch is destructive, so make backups before running the command. You can supply a backup extension to save a backup, e.g. -i.bak, but that backup is overwritten if you run the command again.
-p places a while (<>) loop around your code, followed by a print of each line.
-i.bak does the editing of the original file, and saves a backup with the given extension if one is supplied.
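Put together, perl -pi.bak -e'...' file1.txt file2.txt behaves roughly like this standalone script (a simplified sketch of what the two switches expand to):
#!/usr/bin/perl
use strict;
use warnings;

$^I = '.bak';    # what -i.bak does: edit the named files in place,
                 # keeping a .bak backup of each
while (<>) {     # what -p does: loop over every line of every named file
    s/(\d+) (\d+)/($1 + 1) . " " . ($2 + 1)/ge;   # your calculation
    print;       # -p prints each (possibly modified) line back out
}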

Related

Read multiple files with the same extension in one directory in Perl

I currently have an issue with reading files in one directory.
I need to take all the fastq files in a folder, run the script for each file, and then put the new files in an 'Edited_sequences' folder.
The one script I had is
perl -ne '$i++; if($i<80001){print}' BM2003_TCCCAGAACAAC_L001_R1_001.fastq > ./Edited_sequences/BM2003_TCCCAGAACAAC_L001_R1_001.fastq
It takes the first 80000 lines in one fastq file then outputs the result.
Now if I have, for example, 2000 fastq files, I would need to copy and paste the command 2000 times.
I know there is a glob command suit for this situation but I just do not know how to deal with that.
Please help me out.
You can use perl to do the copying for you: the first argument, *.fastq, expands to all the fastq files, and the last argument, ./Edited_sequences, is the target folder for the new files,
perl -e '$d=pop; `head -n 80000 "$_" > "$d/$_"` for @ARGV' *.fastq ./Edited_sequences
glob gets you a list of filenames matching a particular expression. It's frequently used with <> angle brackets, a lot like reading input (you can think of it as reading filenames from a directory).
This is a simple example that will print the names of every ".fastq" file in the current directory:
print "$_\n" for <*.fastq>;
The important part is <*.fastq>, which gives us an array of filenames matching that expression (in this case, a file extension). If you need to change which directory your Perl script is working in, you can use chdir.
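For example (the directory path here is hypothetical):
chdir '/data/fastq_run1' or die "chdir failed: $!";   # hypothetical path
print "$_\n" for <*.fastq>;                           # now globs in that directory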
From there, we can process your files as needed:
while (my $filename = <*.fastq>) {
    open(my $in, '<', $filename) or die $!;
    open(my $out, '>', "./Edited_sequences/$filename") or die $!;
    for (1 .. 80000) {
        my $line = <$in>;
        last unless defined $line;   # stop early if the file has fewer lines
        print $out $line;
    }
    close($in);
    close($out);
}
You have two choices:
Use Perl to read in the 2000 files and run the command as part of your program
Use the shell to pass each of those 2000 files to your command line
Here's the bash alternative:
for file in *.fastq
do
    perl -ne '$i++; if($i<80001){print}' "$file" > "./Edited_sequences/$file"
done
It's your same Perl script, but with the shell finding each file. This should work without overloading the command line: the for loop in bash, when handed a glob, expands it correctly no matter how many files match.
However, I always recommend that you don't actually execute the command, but echo the resulting commands into a file:
for file in *.fastq
do
echo "perl -ne '\$i++; if(\$i<80001){print}' \
\"$file\" > \"./Edited_sequences/$file\"" >> myoutput.txt
done
Then, you can look at myoutput.txt to make sure it looks good before you actually do any real harm. Once you've determined that myoutput.txt is a good file, you can execute that as a shell script:
$ bash myoutput.txt

Create files from text list using command line?

I know that normally you can just use touch filename to create new files via command line. But, in the text file I have a list of about 500 cities and states, each on a new line. I need to use command line to create a new text file for each of the cities/states. For example, Texas.txt, New York.txt, California.txt
The name of the file that contains the list is newcities.txt. Is this possible to do from the command line or through Perl?
You can do this directly in the shell; there is no need for perl:
cat newcities.txt | while read f; do echo "Creating file $f"; touch "$f"; done
perl -lnwe 'open my $fh, ">", "$_.txt" or die "$_: $!";' newcities.txt
Using the -l option to autochomp the input. The open will create a new empty file, and the file handle will be autoclosed when it goes out of scope.
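Spelled out as a full script, that one-liner is roughly equivalent to the following sketch (run it as, say, ./makefiles.pl newcities.txt; the script name is just for illustration):
#!/usr/bin/perl
use strict;
use warnings;

while (my $city = <>) {    # what -n does: loop over the input lines
    chomp $city;           # what -l's autochomp does
    open my $fh, '>', "$city.txt" or die "$city: $!";
    # $fh goes out of scope here, closing the new empty file
}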
How about a simple:
cat fileName | xargs touch
Here's a one-liner in perl, assuming each city is on a new line (quoting $_ so that names with spaces, like New York, stay intact):
perl -ne 'chomp; `touch "$_"`;' newcities.txt
Here's the script version:
#!/usr/bin/perl
use warnings;
use strict;

open my $fh, "<", "./newcities.txt"
    or die "Cannot open file: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    system("touch", $line);   # list form bypasses the shell, so names
                              # with spaces (e.g. "New York") are safe
}
close $fh;

Perl script to copy rows into a new file

I have a lot of text files in a folder. The folder is 'c:\vehicles'. For every text file, I want to copy any row that includes the words: model, make, year. The file I want to write to is 'vehicles.txt' and located in 'c:\'.
I know I've written the code wrong. What should I do to correct it? Thanks for the help.
C:\vehicles $ ls -A | xargs head -qn 30 | perl -Mstrict -wne 'if( $_ =~ /(make)|(model)|(year)/ ) { print "$_"; }' > vehicles.txt
grep -rE "(make|model|year)" c:
Perhaps the following will help:
use strict;
use warnings;

for my $file (<*.txt>) {
    open my $fh, '<', $file or die $!;
    while (<$fh>) {
        chomp;
        print "$_\n" if /\b(?:model|make|year)\b/i;
    }
    close $fh;
}
Assuming the script will be in c:\vehicles, type perl scriptName.pl >vehicles.txt at the command prompt.
The <*.txt> notation returns a list of all text files in the directory. Each of these files is opened and read, line by line. If any of the words you're looking for is found on a line, that line is printed. The >vehicles.txt notation means to print to the file.

How to read multiple files from a directory, extract specific strings and output to an HTML file?

Greetings,
I have the following code and am stuck on how I would proceed to modify it so it will ask for the directory, read all files in the directory, extract specific strings and output them to an HTML file. Thanks in advance.
#!/usr/local/bin/perl
use warnings;
use strict;
use Cwd;
print "Enter filename: "; # Should be Enter directory
my $perlfile =STDIN;
open INPUT_FILE, $perlfile || die "Could not open file: $!";
open OUTPUT, '>out.html' || die "Could not open file: $!";
# Evaluates the file and imports it into an array.
my #comment_array = ;
close(INPUT_FILE);
chomp #comment_array;
#comment_array = grep /^\s*#/g, #comment_array;
my $comment;
foreach $comment (#comment_array) {
$comment =~ /####/; #Pattern match to grab only #s
# Prints comments to screen
Print results in html format
# Writes comments to output.html
Writes results to html file
}
close (OUTPUT);
Take it one step at a time. You have a lot planned, but so far you haven't even changed your prompt string to ask for a directory.
To read the entered directory name, your:
my $perlfile =STDIN;
gives an error (under use strict;). Start by looking that error up (use diagnostics; automates this) and trying to figure out what you should be doing instead.
Once you can prompt for a directory name and print it out, then add code to open the directory and read the directory. Directories can be opened and read with opendir and readdir. Make sure you can read the directory and print out the filenames before going on to the next step.
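A minimal sketch of that intermediate step might look like this:
#!/usr/bin/perl
use strict;
use warnings;

print "Enter directory: ";
chomp(my $dir = <STDIN>);

opendir(my $dh, $dir) or die "Could not open directory '$dir': $!";
while (my $name = readdir($dh)) {
    next if $name eq '.' or $name eq '..';   # skip the dot entries
    print "$name\n";
}
closedir($dh);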
A good starting point to learn about specific functions (from the command line):
perldoc -f opendir
However, your particular problem is answered as follows; you can also use command-line programs and capture their output into a string to simplify file handling (cat) and pattern matching (grep):
#!/usr/bin/perl -w
use strict;

my $dir = "/tmp";
my @patterns;

opendir(my $dh, $dir) or die "Cannot open $dir: $!";
while (my $file = readdir($dh)) {
    if (-f "$dir/$file") {
        # capture every line in the file that matches the pattern
        my $string = `cat "$dir/$file" | grep pattern123`;
        push @patterns, $string;
    }
}
closedir($dh);

my $html = join("<br>", @patterns);
open(my $out, '>', 'out.html') or die "Cannot open out.html: $!";
print $out $html;
close($out);
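If you would rather not shell out to cat and grep at all, the same loop can be written in pure Perl (a sketch; pattern123 is still the placeholder pattern from above):
#!/usr/bin/perl
use strict;
use warnings;

my $dir = "/tmp";
my @patterns;

opendir(my $dh, $dir) or die "Cannot open $dir: $!";
while (my $file = readdir($dh)) {
    next unless -f "$dir/$file";
    open(my $in, '<', "$dir/$file") or die "Cannot open $dir/$file: $!";
    while (my $line = <$in>) {
        push @patterns, $line if $line =~ /pattern123/;   # pattern matching without grep
    }
    close($in);
}
closedir($dh);

open(my $out, '>', 'out.html') or die "Cannot open out.html: $!";
print $out join("<br>", @patterns);
close($out);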

Execute command line command from Perl?

Hi all,
I need to have this command line command executed from a Perl file:
for file in *.jpg ; do mv $file `echo $file | sed 's/\(.*\.\)jpg/\1jp4/'` ; done
So I added the folders and tried:
system "bash -c 'for file in $mov0to/*.jp4 ; do mv $file `echo $basedir/internal/0/$file | sed 's/\(.*\.\)jp4/\1jpg/'` ; done'";
But all I get is:
sh: Syntax error: "(" unexpected
No file specified
I am on Kubuntu 10.4
Thanks,
Jan
I can think of many better ways of doing this, but ideally you want pure Perl:
use File::Copy qw( move );
opendir DIR, $mov0to or die "Unable to open $mov0to: $!";
foreach my $file ( readdir DIR ) {
    my $out = $file;
    next unless $out =~ s/\.jp4$/.jpg/;
    move "$mov0to/$file", "$basedir/internal/0/$out"
        or die "Unable to move $mov0to/$file -> $basedir/internal/0/$out: $!";
}
closedir DIR;
If you insist on doing the lifting in Bash, you should be doing:
system qq(bash -c 'for file in $mov0to/*.jp4 ; do mv \$file $basedir/internal/0/\${file/%.jp4/.jpg} ; done');
All of the variables inside the double quotes will get interpolated, the $file that you're using in the bash for loop in particular. We don't have enough context to know where the other variables ($mov0to and $basedir) come from or what they contain so they might be having the same problems. The double quotes are also eating your backslashes in the sed part.
Andy White is right, you'd have less of a mess if the bash commands were in a separate script.
It's most likely a problem with nested quotes. I would recommend either moving that bash command into its own (bash) script file, or writing the loop/logic directly in Perl code, making simpler system calls if you need to.
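The call from Perl then becomes trivial (a sketch; rename_jp4.sh is a hypothetical script that would hold the bash loop, taking the source and target directories as arguments):
# rename_jp4.sh is hypothetical -- it would contain the bash for-loop,
# reading the source and target directories from $1 and $2
system("bash", "rename_jp4.sh", $mov0to, "$basedir/internal/0") == 0
    or die "rename script failed: $?";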