How to use Perl to count specific characters in a string [closed] - perl

I have a Perl script that I am using to create a file.
I have a variable that holds a number of file paths separated by commas.
path/to/file1,path/to/file2....path/tofileN
Depending on how many file paths are returned, I need to build another string that contains one random name per file path, for up to N files.
If my first string variable contains 3 file paths, I need to create a string like
RandomName1,RandomName2,RandomName3
and write it out to my output file.
How can I parse the incoming string of file paths to determine how many file paths there are in total?
How can I write a loop to create a file name for each incoming file path, up to N files?

I don't quite see why you would put the list of names into a separate string, but here we go.
use strict;
use warnings;
sub create_random_name {
# return a random filename
}
my $foo = '/home/foo,/root,/dev/null';
my @filenames;
foreach (split ',', $foo) {
    push @filenames, create_random_name();
}
print join ',', @filenames;
__END__
efe277fe7aa54f7231dedef7ac8c1e3a,327f56cff4bd21b03ee3ceaa4280014c,7f1ca3feb3b51f7a9ee84f08b1791785
Let's see.
I've created a sub create_random_name that should return some randomness. Without further specification of what you need, I will leave that out of the answer.
We split your string of paths into an array, but since you do not want them, we only loop through the results. There is no my $bar in the foreach for the same reason.
We only want to call create_random_name once per path, so that there are as many names as there are paths. Those are pushed into @filenames,
which we then join on , to make them look the same as our starting point, the list in $foo.
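The answer leaves create_random_name as a stub. As a minimal sketch only, assuming a 32-character lowercase hex name is acceptable (the format and the helper random_names_for are illustrative assumptions, not part of the original answer), it could be filled in like this. Note that rand is not cryptographically secure, so this is unsuitable where names must be unguessable.

```perl
use strict;
use warnings;

# Assumed format: $len random lowercase hex characters (default 32).
sub create_random_name {
    my ($len) = @_;
    $len //= 32;
    my @hex = (0 .. 9, 'a' .. 'f');
    return join '', map { $hex[ int rand @hex ] } 1 .. $len;
}

# Hypothetical wrapper: one random name per comma-separated path.
sub random_names_for {
    my ($paths) = @_;
    my @paths = split /,/, $paths;
    return join ',', map { create_random_name() } @paths;
}

print random_names_for('/home/foo,/root,/dev/null'), "\n";
```

The output changes on every run; only the number of names (three here) is fixed by the input.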

For 1), you probably want split as RC said. For 2), File::Temp should do the trick.
use File::Temp 'tempfile';
my $orig_path_str = 'path/to/file1,path/to/file2....path/tofileN';
my @orig_paths = split(/,/, $orig_path_str);
my @random_paths;
# loop once per original path
foreach (1..scalar(@orig_paths)) {
    my ($fh, $filename) = tempfile();
    push(@random_paths, $filename);
}
my $random_path_str = join(',', @random_paths);
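If only the count is needed, two common idioms are to split and take the size of the resulting list, or to count the separator characters with tr. The sketch below shows both; the helper names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: number of comma-separated entries in $str.
sub count_paths {
    my ($str) = @_;
    my @parts = split /,/, $str;   # split drops trailing empty fields
    return scalar @parts;
}

# Hypothetical helper: number of comma characters in $str.
sub count_commas {
    my ($str) = @_;
    return ($str =~ tr/,//);       # tr with an empty replacement only counts
}

my $paths = 'a/one,b/two,c/three';
printf "%d paths, %d commas\n", count_paths($paths), count_commas($paths);
# prints "3 paths, 2 commas"
```

The two counts differ by one for well-formed input; they diverge further when the string ends in a comma, because split discards trailing empty fields.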

Related

Need to grep a string from a line in a variable and store it in another variable [closed]

I have a variable $abc that contains the line like below:
abc.txt -> check a/b/c test
I need to get abc.txt in another variable say $bcd and a/b/c in the variable say $xyz. I have the regex to do so but I don't know how I can do it using perl as in my knowledge perl grep and perl map can be used on arrays only not variables.
my ($bcd) = split(/\s*->\s*/, $abc, 2);
my $bcd = $abc =~ s/\s*->.*//sr;
my ($bcd) = $abc =~ /^((?:(?!\s*->).)*)/s;
my ($bcd) = $abc =~ /^(.*?)\s*->/s;
All but the last return the entire input string if there's no ->.
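The snippets above only extract $bcd. As a sketch for getting both values in one match, assuming the path is always the second whitespace-separated word after the arrow (an assumption based on the single sample line; parse_line is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical helper: returns (name before the arrow, second word after it).
sub parse_line {
    my ($line) = @_;
    my ($name, $path) = $line =~ /^(\S+)\s*->\s*\S+\s+(\S+)/;
    return ($name, $path);
}

my ($bcd, $xyz) = parse_line('abc.txt -> check a/b/c test');
print "$bcd $xyz\n";   # prints "abc.txt a/b/c"
```

Both values come back undefined when the line does not match, so the caller can test for that before using them.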

Find all instances of every file name in a directory tree [closed]

This Perl script has a search function that searches for files in subdirectories that have the same filename.
The output generated must be something like this:
file_name1: ~/X/file_name1 ~/?/??/file_name1
file_name2: ~/X/XYZ/file_name2 ~/?/?????/ ??/file_name2
I have tried the following:
find . -type f | perl -e'
while ( <> ) {
chomp;
    push @{ $h{ substr($_, rindex($_, "/") + 1) } }, $_;
}
for $k (keys %h) {
    next if @{ $h{$k} } < 2;
    print "$k: @{ $h{$k} }\n"
}'
Does anyone have another solution to this?
It seems that you are looking for duplicate files, simply put.
A couple of approaches
There is a module File::Find::Duplicates, which may just solve your problem directly.
Use a module to find all files, say File::Find::Rule. Then use one of many ways to find duplicates in list(s), or use a hash (see this post). Use MD5 to confirm that the files are the same.
There are posts on SO on details mentioned above, and in fact on the whole topic, please search.
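As a rough sketch of the hash-based approach using only the core File::Find module (rather than the CPAN modules named above), the following groups paths by base name and keeps the names seen more than once. The helper name same_name_files is made up, and the sketch compares names only; it does not confirm that file contents match.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns { base name => [paths] } for names seen at least twice.
sub same_name_files {
    my ($root) = @_;
    my %paths_by_name;
    find(sub {
        return unless -f $_;                          # $_ is the base name here
        push @{ $paths_by_name{$_} }, $File::Find::name;
    }, $root);
    for my $name (keys %paths_by_name) {
        delete $paths_by_name{$name} if @{ $paths_by_name{$name} } < 2;
    }
    return \%paths_by_name;
}

# Scan the directory given on the command line, or the current one.
my $dups = same_name_files(shift // '.');
for my $name (sort keys %$dups) {
    print "$name: @{ $dups->{$name} }\n";
}
```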

Why cannot I read a file line by line in a Perl script? [closed]

I have a Perl script, which has to read a file line by line.
Line in the file:
0060|9592014|A001-9592014-0060|82769|NOVARTIS PHARMA SERVICES AG BASEL|51671|NOVARTIS AG|A+|SWITZERLAND|Guarantees Issued|12/31/2016|12/31/2016|0|0|0|0|0|0|0|0|0|29014.0967835279993469339764885601502052|0||||0|1|550.3648|32541||.32|SUIG|OLEG|AAA||||||END|
I need to get only 32 fields, the first 32.
open (PRISM, "$infile") or die "Can't open $infile\n";
while (my $file_line = <PRISM>)
{
last if ($file_line=~/^PRISMEXP/);
next if ($file_line=~/^(\s)*$/); # Skip blank lines
print "LINE: $file_line\n"; # This line doesn't print anything
my @field = (split /\|/, $file_line[0-32]);
print "$field[0]\n"; #This line doesn't print anything
}
And as you can see, this part of code doesn't read the file and doesn't print anything. Why? Where is my mistake?
Where you have WHILE you should have while.
Also, your blank line check should have =~, not =.
Your split uses $file_line[0-32], which is the same thing as $file_line[-32], the 32nd element from the end of @file_line, but you haven't set that array anywhere; I'm guessing that should be substr($file_line,0,32).
Or, if you only want the first 32 fields, it should be:
my @field;
@field[0..31] = split /\|/, $file_line;
Always use use strict; use warnings;. It would have caught the last error, and likely the second error too.
Here are some notes on your program that should help you improve your success rate
Always use strict and use warnings at the top of every Perl program, if you haven't done that already
Use lexical file handles, like my $prism_fh instead of global bareword file handles like PRISM
Don't put scalar variables inside double quotes. At best it will make no difference, and at worst you will get a completely different string
Always put the $! variable in your die string when checking the status of open calls. It will tell you why the open failed. Also, perl will add the source file name and line number to the output of die unless you put a newline on the end of your string, so don't do that if you want to know where in your code the error occurred
It is often better to use the default variable $_ when reading from a file. Many operators use it as their default parameter, making for more concise and tidy code
Don't forget unless. You can more cleanly check whether a line contains non-blanks by using next unless $file_line =~ /\S/
If you don't chomp the input lines then there is no need to put a newline on the end when you print the output
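You need to split lines before you can select fields from the input. $file_line[0-32] doesn't select fields: it indexes element -32 of an array @file_line that was never declared, so it fails under use strict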
Here's your Perl code refactored so that it prints the first 32 pipe-separated fields. I hope it is obvious that it needs a preamble that does use strict and use warnings and defines $infile.
open my $prism_fh, '<', $infile or die qq{Can't open "$infile": $!\n};
while (<$prism_fh>) {
next unless /\S/;
last if /^PRISMEXP/;
chomp;
    my @fields = (split /\|/);
    print join('|', @fields[0 .. 31]), "\n";
}
output
0060|9592014|A001-9592014-0060|82769|NOVARTIS PHARMA SERVICES AG BASEL|51671|NOVARTIS AG|A+|SWITZERLAND|Guarantees Issued|12/31/2016|12/31/2016|0|0|0|0|0|0|0|0|0|29014.0967835279993469339764885601502052|0||||0|1|550.3648|32541||.32
Update
Instead of splitting and recombining, you could use a regular expression to grab the first 32 pipe-separated fields, like this
while (<$prism_fh>) {
next unless /\S/;
last if /^PRISMEXP/;
chomp;
print $1, "\n" if /^((?:[^|]*\|){31}[^|]*)/;
}
The output is identical to that of the program above.
Because of the line:
last if ($file_line=~/^PRISMEXP/);
If the first line of $infile begins with PRISMEXP you will never print anything.
You have also to change the line:
my @field = (split /\|/, $file_line[0-32]);
to:
my @field = (split /\|/, $file_line)[0..31];
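Another option, sketched here under the assumption that lines may have far more fields than are needed, is to pass a limit to split so that the rest of the line is never broken up. The helper first_fields is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: first $n pipe-separated fields of $line.
sub first_fields {
    my ($line, $n) = @_;
    chomp $line;
    my @fields = split /\|/, $line, $n + 1;   # at most $n + 1 pieces
    $#fields = $n - 1 if @fields > $n;        # discard the unsplit remainder
    return @fields;
}

print join('|', first_fields("a|b|c|d|e\n", 3)), "\n";   # prints "a|b|c"
```

For the question's data the call would be first_fields($file_line, 32). With an explicit positive limit, split also keeps trailing empty fields instead of dropping them.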

Randomizing 3 lines to display in CGI with Perl [closed]

I'm trying to write a CGI script that will take three lines of text and randomize them. Each time you view the webpage, the three lines will appear one after the other in a different order each time. How do I do this and what is the code?
perldoc -q "random line"
Found in D:\sb\perl\lib\perlfaq5.pod
How do I select a random line from a file?
Short of loading the file into a database or pre-indexing the lines in
the file, there are a couple of things that you can do.
Here's a reservoir-sampling algorithm from the Camel Book:
srand;
rand($.) < 1 && ($line = $_) while <>;
This has a significant advantage in space over reading the whole file
in. You can find a proof of this method in *The Art of Computer
Programming*, Volume 2, Section 3.4.2, by Donald E. Knuth.
You can use the File::Random module which provides a function for that
algorithm:
use File::Random qw/random_line/;
my $line = random_line($filename);
Another way is to use the Tie::File module, which treats the entire file
as an array. Simply access a random array element.
or
perldoc -q shuffle
Found in D:\sb\perl\lib\perlfaq4.pod
How do I shuffle an array randomly?
If you either have Perl 5.8.0 or later installed, or if you have
Scalar-List-Utils 1.03 or later installed, you can say:
use List::Util 'shuffle';
@shuffled = shuffle(@list);
If not, you can use a Fisher-Yates shuffle.
sub fisher_yates_shuffle {
    my $deck = shift; # $deck is a reference to an array
    return unless @$deck; # must not be empty!
    my $i = @$deck;
    while (--$i) {
        my $j = int rand ($i+1);
        @$deck[$i,$j] = @$deck[$j,$i];
    }
}
use List::Util qw( shuffle );
@lines = shuffle(@lines);
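Putting the shuffle into a CGI context could look roughly like the sketch below. It assumes plain-text output is acceptable and writes the HTTP header by hand instead of using a CGI module; the three lines of text and the helper render_page are placeholders.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Placeholder text; the question does not say what the three lines are.
my @lines = ('First line', 'Second line', 'Third line');

# Hypothetical helper: header, blank line, then the lines in random order.
sub render_page {
    my @order = shuffle @_;
    return "Content-Type: text/plain\r\n\r\n" . join("\n", @order) . "\n";
}

print render_page(@lines);
```

Each request runs the script again, so the order is reshuffled on every page view. A shuffle can of course return the same order twice in a row.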

How can I combine files into one CSV file?

If I have one file FOO_1.txt that contains:
FOOA
FOOB
FOOC
FOOD
...
and lots of other files FOO_files.txt, each of which contains:
1110000000...
that is, a single line of 0s and 1s, one digit per value listed in FOO_1.txt (FOOA, FOOB, ...)
Now I want to combine them to one file FOO_RES.csv that will have the following format:
FOOA,1,0,0,0,0,0,0...
FOOB,1,0,0,0,0,0,0...
FOOC,1,0,0,0,1,0,0...
FOOD,0,0,0,0,0,0,0...
...
What is a simple and elegant way to do that
(with hashes and arrays, e.g. $hash{$key} = \@data)?
Thanks a lot for any help !
Yohad
If you can't describe your data and your desired result clearly, there is no way that you will be able to code it. That said, taking on a simple project is a good way to get started using a new language.
Allow me to present a simple method you can use to churn out code in any language, whether you know it or not. This method only works for smallish projects. You'll need to actually plan ahead for larger projects.
How to write a program:
Open up your text editor and write down what data you have. Make each line a comment
Describe your desired results.
Start describing the steps needed to change your data into the desired form.
Numbers 1 & 2 completed:
#!/usr/bin/perl
use strict;
use warnings;
# Read data from multiple files and combine it into one file.
# Source files:
# Field definitions: has a list of field names, one per line.
# Data files:
# * Each data file has a string of digits.
# * There is a one-to-one relationship between the digits in the data file and the fields in the field defs file.
#
# Results File:
# * The results file is a CSV file.
# * Each field will have one row in the CSV file.
# * The first column will contain the name of the field represented by the row.
# * Subsequent values in the row will be derived from the data files.
# * The order of subsequent fields will be based on the order files are read.
# * However, each column (2-X) must represent the data from one data file.
Now that you know what you have, and where you need to go, you can flesh out what the program needs to do to get you there - this is step 3:
You know you need to have the list of fields, so get that first:
# Get a list of fields.
# Read the field definitions file into an array.
Since it is easiest to write CSV in a row oriented fashion, you will need to process all your files before generating each row. So you'll need someplace to store the data.
# Create a variable to store the data structure.
Now we read the data files:
# Get a list of data files to parse
# Iterate over list
# For each data file:
# Read the string of digits.
# Assign each digit to its field.
# Store data for later use.
We've got all the data in memory, now write the output:
# Write the CSV file.
# Open a file handle.
# Iterate over list of fields
# For each field
# Get field name and list of values.
# Create a string - comma separated string with field name and values
# Write string to file handle
# close file handle.
Now you can start converting comments into code. You could have anywhere from 1 to 100 lines of code for each comment. You may find that something you need to do is very complex and you don't want to take it on at the moment. Make a dummy subroutine to handle the complex task, and ignore it until you have everything else done. Now you can solve that complex, thorny sub-problem on its own.
Since you are just learning Perl, you'll need to hit the docs to find out how to do each of the subtasks represented by the comments you've written. The best resource for this kind of work is the list of functions by category in perlfunc. The Perl syntax guide will come in handy too. Since you'll need to work with a complex data structure, you'll also want to read from the Data Structures Cookbook.
You may be wondering how the heck you should know which perldoc pages you should be reading for a given problem. An article on Perlmonks titled How to RTFM provides a nice introduction to the documentation and how to use it.
The great thing is that if you get stuck, you have some code to share when you ask for help.
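To illustrate where the comment outline could lead, here is one possible sketch of the core step, turning the list of fields and one digit string per data file into CSV rows. It is not the answer author's code. The helper build_rows and the sample data are made up, and reading the files and writing FOO_RES.csv are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: one CSV row per field; column k+1 comes from data string k.
sub build_rows {
    my ($fields, @digit_strings) = @_;
    my @rows;
    for my $i (0 .. $#$fields) {
        # take the i-th digit of every data string (empty if the string is too short)
        my @values = map { $i < length($_) ? substr($_, $i, 1) : '' } @digit_strings;
        push @rows, join ',', $fields->[$i], @values;
    }
    return @rows;
}

my @fields = qw(FOOA FOOB FOOC FOOD);
my @data   = ('1110', '0010');   # made-up contents of two data files
print "$_\n" for build_rows(\@fields, @data);
```

With this sample data the rows are FOOA,1,0 then FOOB,1,0 then FOOC,1,1 then FOOD,0,0. Keeping the row building separate from file handling makes it easy to check the logic before wiring in real files.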
If I understand correctly your first file is your key order file, and the remaining files each contain a byte per key in the same order. You want a composite file of those keys with each of their data bytes listed together.
In this case you should open all the files simultaneously. Read one key from the key order file, read one byte from each of the data files. Output everything as you read it to your final file. Repeat for each key.
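A sketch of that streaming idea, assuming every data file holds at least as many digits as there are keys. The helper merge_streams is hypothetical, and in-memory filehandles stand in for real files so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: for each key line, read one character from every data handle.
sub merge_streams {
    my ($key_fh, $out_fh, @data_fhs) = @_;
    while (my $key = <$key_fh>) {
        chomp $key;
        my @bytes;
        for my $fh (@data_fhs) {
            my $got = read($fh, my $byte, 1);   # one character per key
            push @bytes, $got ? $byte : '';
        }
        print {$out_fh} join(',', $key, @bytes), "\n";
    }
    return;
}

# In-memory handles stand in for the key file, two data files and the output file.
my ($keys, $d1, $d2, $out) = ("FOOA\nFOOB\n", "10\n", "11\n", '');
open my $key_fh, '<', \$keys or die $!;
open my $fh1,    '<', \$d1   or die $!;
open my $fh2,    '<', \$d2   or die $!;
open my $out_fh, '>', \$out  or die $!;
merge_streams($key_fh, $out_fh, $fh1, $fh2);
close $out_fh;
print $out;
```

This keeps memory use constant regardless of the number of keys, but it needs one open handle per data file, which can run into the operating system's limit on open files when there are very many of them.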
It looks like you have many foo_files that have 1 line in them, something like:
1110000000
Which stands for
fooa=1
foob=1
fooc=1
food=0
fooe=0
foof=0
foog=0
fooh=0
fooi=0
fooj=0
And it looks like your foo_res is just a summation of those values? In that case, you don't need a hash of arrays, but just a hash.
my @foo_files = (); # NOT SURE HOW YOU POPULATE THIS ONE
my @foo_keys = qw(a b c d e f g h i j);
my %foo_hash = map { ( $_, 0 ) } @foo_keys; # initialize hash
foreach my $foo_file ( @foo_files ) {
    open( my $FOO, "<", $foo_file) || die "Cannot open $foo_file: $!\n";
    my $line = <$FOO>;
    close( $FOO );
    chomp($line);
    my @foo_values = split(//, $line);
    foreach my $indx ( 0 .. $#foo_keys ) {
        # stop if the input file has fewer values than keys (a plain truth test would also stop at the first 0)
        last if ( ! defined $foo_values[ $indx ] );
        $foo_hash{ $foo_keys[$indx] } += $foo_values[ $indx ];
    }
}
It's pretty hard to understand what you are asking for, but maybe this helps?
Your specifications aren't clear. You couldn't have a "lots of other files" named FOO_files.txt, because it's only one name. So I'm going to take this as the files-with-data + filelist pattern. In this case, there are files named FOO*.txt, each containing "[01]+\n".
Thus the idea is to process all the files in the filelist file and to insert them all into a result file FOO_RES.csv, comma-delimited.
use strict;
use warnings;
use English qw<$OS_ERROR>;
use IO::Handle;
open my $foos, '<', 'FOO_1.txt'
or die "I'm dead: $OS_ERROR";
@ARGV = sort map { chomp; "$_.txt" } <$foos>;
$foos->close;
open my $foo_csv, '>', 'FOO_RES.csv'
or die "I'm dead: $OS_ERROR";
while ( my $line = <> ) {
    chomp $line; # otherwise the newline becomes an extra CSV value
    my ( $foo_name ) = ( $ARGV =~ /(.*)\.txt$/ );
    $foo_csv->print( join( ',', $foo_name, split //, $line ), "\n" );
}
$foo_csv->close;
You don't really need to use a hash. My Perl is a little rusty, so syntax may be off a bit, but basically do this:
open KEYFILE, "<", "foo_1.txt" or die "cannot open foo_1 for reading: $!";
open VALFILE, "<", "foo_files.txt" or die "cannot open foo_files for reading: $!";
open OUTFILE, ">", "foo_out.txt" or die "cannot open foo_out for writing: $!";
my %output;
while (<KEYFILE>) {
    chomp(my $key = $_);
    chomp(my $val = <VALFILE>);
    my @arrVal = split(//, $val); # list context; scalar context would only give a count
    $output{$key} = \@arrVal;
    print OUTFILE $key . "," . join(",", @arrVal) . "\n";
}
Edit: Syntax check OK
Comment by Sinan: @Byron, it really bothers me that your first sentence says the OP does not need a hash yet your code has %output which seems to serve no purpose. For reference, the following is a less verbose way of doing the same thing.
#!/usr/bin/perl
use strict;
use warnings;
use autodie qw(:file :io);
open my $KEYFILE, '<', "foo_1.txt";
open my $VALFILE, '<', "foo_files.txt";
open my $OUTFILE, '>', "foo_out.txt";
while (my $key = <$KEYFILE>) {
chomp $key;
print $OUTFILE join(q{,}, $key, split //, <$VALFILE> ), "\n";
}
__END__