Date manipulation in Perl - perl

My script is passed a date parameter in the format YYYYMMDD, for example 20130227. I need to check whether it's a Monday. If it is, I need to retrieve the date values for the previous four days; otherwise I should retrieve the date values for the previous two days. Either way, the values should be stored in an array.
For example, if the parameter is 20130227 and it's a Monday, then I need to store ('20130227' '20130226' '20130225' '20130224') in an array. If it's not a Monday, then I need to store only ('20130227' '20130226') in an array.
Which Perl function can I use for this? I am using Perl on Solaris 10.

Not all standard Perl commands are listed in the standard Perl list of commands. This is a big confusion for beginners and the main reason you end up seeing beginners use a lot of system commands to do things that could be done directly in Perl.
Many Perl commands are available if you include the Perl Module for that command. For example, you want to copy a file from one place to another, but there's no Perl copy command listed as a standard function. Many people end up doing this:
system ("cp $my_file $new_location");
However, there is a standard Perl module called File::Copy that includes the missing copy command:
use File::Copy;
copy ($my_file, $new_location);
The File::Copy module is included with Perl, but you have to know about modules and how they're used. Unfortunately, although they're a major part of Perl, they're simply not included in many Perl beginner books.
I am assuming your confusion comes from the fact you're looking for some command in Perl and not finding it. However, the Time::Piece module is a standard Perl module since Perl 5.10 that is used for date manipulation. Unfortunately, it's an object oriented module which can make its syntax a bit strange to users who aren't familiar with Object Oriented Perl. Fortunately, it's really very simple to use.
In object oriented programming, you create an object that contains your data. The object itself cannot easily be displayed, but contains all of the features of your data, and various methods (basically subroutines) can be used to get information on that object.
First thing you need to do is create a time object of your date:
use Time::Piece;
my $time_obj = Time::Piece->strptime("20130227", "%Y%m%d");
Now, $time_obj represents your date. The 20130227 represents your date string. The %Y%m%d represents the format of your string (YYYYMMDD). Unfortunately the Time::Piece documentation doesn't tell you how the format works, but the format characters are documented in the Unix strftime manpage.
Once you have your time object, you can query it with all sorts of methods (aka subroutines):
if ( $time_obj->day_of_week == 1 ) {
print "This is a Monday\n";
}
Read the documentation and try it out.
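Putting the pieces above together, here is a minimal sketch of the whole task. It assumes, following the example in the question, that the array starts with the given date itself and then counts backwards; dates_for is a hypothetical helper name, and Time::Seconds (which ships with Time::Piece) supplies the ONE_DAY constant.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;    # exports ONE_DAY (86400 seconds)

# Hypothetical helper: returns the given date followed by earlier dates,
# four entries in total on a Monday, two on any other day.
sub dates_for {
    my ($yyyymmdd) = @_;
    my $t = Time::Piece->strptime($yyyymmdd, "%Y%m%d");

    # day_of_week counts from 0 (Sunday), so Monday is 1
    my $count = $t->day_of_week == 1 ? 4 : 2;

    my @dates;
    for my $back (0 .. $count - 1) {
        my $day = $t - $back * ONE_DAY;    # subtracting seconds gives a new Time::Piece
        push @dates, $day->strftime("%Y%m%d");
    }
    return @dates;
}

my @dates = dates_for("20130225");    # 25 February 2013 was a Monday
print "@dates\n";
```

Because strptime returns a UTC-based object, stepping back in whole days this way should not be affected by daylight saving changes.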

The generic toolkit for handling dates in Perl would be the DateTime module. It comes with a huge range of date parsing choices to get your string formats in and out, and can easily query e.g. the day of the week.
A more lightweight, fast and recommended alternative might be Date::ISO8601 - your formats are quite close to that ISO format, but you would need to be willing to do a bit of manipulation on the variables e.g. my ($yyyy, $mm, $dd) = ( substr($d, 0,4), substr( $d, 4, 2 ), substr( $d, 6, 2 ) ); will grab the year month and day strings from your examples to feed to the module's constructor.
Please give these at least a try, and if you get stuck post some code on your question. Once you have some attempted code in the question, it is much quicker for someone to answer by filling in just the bits you don't know - you probably know a lot more about the solution you want than you think!

I'm not keen on helping someone who appears to have made no effort to help themselves. So I'm not going to give you an answer, but I'll suggest that you look at the Time::Piece module (a standard part of Perl since version 5.10) and, in particular, its strftime method.
That should be enough to get you started.

Related

How to override a subroutine such as `length` in Perl?

I would like to simply override the length subroutine to take ANSI escape sequences into account, so I wrote this:
sub length {
my $str = shift;
if ($cfg{color}) {
return length($str =~ s/\x1B\[\d+[^m]*m//gr);
}
return length($str);
}
Unfortunately, Perl detects the ambiguous call and replaces it with CORE::length.
How can I just tell Perl to use the local declaration instead?
Of course, an alternative solution would be to rename each call to length with ansi_length and rename the custom function accordingly.
To those who want more details:
The context where I would like to override the core function length is a short piece of code that generates ASCII tables (a bit like Text::ASCIITable, but with different features like multicolumns and multirows). I don't want to write a dedicated Perl module because I would like to keep my program as monolithic as possible, since the people who will use it are not familiar with CPAN or even module installation.
In this code, I need to know the width of each column in each row in order to align them properly. When a cell contains colored text with an ANSI sequence like ^[[33mgreen^[[0m, I need to ignore the coloring sequences.
As I already use UTF-8 chars in my Program, I had to add this to my Program:
use utf8;
use open ':std', ':encoding(UTF-8)';
I noticed the utf8 module also overloads the core subroutine length. I realized this would also be a good solution in my case.
Eventually I think I added enough details to this question. I would be glad to be told why I got downvotes on this question. I don't think I can make this any clearer. Also, I think all these details are not useful at all for understanding the initial question...
Overriding a core function is not a good idea. If you use a library that itself uses the core function, the library would be confronted with the overridden function and may fail. You could create your own module/namespace, ANSI:: or so, and then use ANSI::length, but I think it is better to use a name like the one you proposed: ansi_length.
If you still insist:
You can override the core function with
BEGIN {
*CORE::GLOBAL::length = sub ...
}
Whenever you need access to the original CORE function, use
CORE::length.
This works for most of Perl's built-in functions, although a few (exists and grep, for example) cannot be overridden.
Here is a reference : http://perldoc.perl.org/CORE.html
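For the renaming approach recommended above, a minimal sketch might look like this. The name ansi_length comes from the question; the pattern is an assumption that only colour sequences of the form ESC [ digits-and-semicolons m need to be ignored.

```perl
use strict;
use warnings;

# Visible length of a string, ignoring ANSI colour sequences such as
# "\e[33m" and "\e[0m". Assumption: only "ESC [ ... m" sequences occur.
sub ansi_length {
    my ($str) = @_;
    (my $plain = $str) =~ s/\x1B\[[0-9;]*m//g;    # strip colour codes from a copy
    return length $plain;                         # plain core length, no override
}

print ansi_length("\e[33mgreen\e[0m"), "\n";    # prints 5
```

Because the helper has its own name, every other call to length, including those inside modules you load, keeps its normal behaviour.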

Running nested, dependent perl scripts

I have two perl scripts that I need to run together.
The first script defines a number of common functions and a main method.
Script 1(one.pl) Example:
#!/usr/bin/perl
sub getInformation
{
my $serverMode = $_[0];
my $port = $_[1];
return "You\nneed\nto\nparse\nme\nright\nnow\n!\n";
}
#main
&parseInformation(&getInformation($ARGV[0], $ARGV[1]));
The second is a script that calls the first script after defining two functions.
Script 2(two.pl) Example:
#!/usr/bin/perl
sub parseInformation
{
my $parsedInformation = $_[0];
#omitted
print "$parsedInformation";
}
my $parameterOne = $ARGV[0];
my $parameterTwo = $ARGV[1];
do "./one.pl $parameterOne $parameterTwo";
Command line usage:
> ./two.pl bayside 20
I have attempted to do this, and the script seems to run; however, whenever I run it with perl -d two.pl, I get no information from the debugger about the other script.
I have done some research and read about system, capture, require and do. If I use the system function to run the script, how will I be able to export the functions defined in script two?
Questions:
1. Is there any way to do this in perl?
2. If so how exactly do I need to achieve that?
I fully understand that Perl is Perl, not another programming language. Unfortunately, when transitioning, one tends to bring along what one already knew. My apologies.
References:
How to run a perl script from within a perl script
Perl documentation for require function
Generally speaking, that is not the way you should write reusable common functions in Perl. Instead, you should put the bulk of your code into Perl modules, and write just short scripts that act as wrappers for the modules. These short scripts should basically just grab and validate command-line arguments, pass those arguments to the modules for the real work, then format and output the results.
I really wish I could recommend perldoc perlmod to learn about writing modules, but it seems to mostly concentrate on the minutiae rather than a high-level overview of how to write and use a Perl module. Gabor Szabo's tutorial is perhaps a better place to start.
Here's a simple example, creating a script that outputs the Unix timestamp. This is the module:
# This file is called "lib/MyLib/DateTime.pm"
use strict;
use warnings;
package MyLib::DateTime;
use parent "Exporter";
our @EXPORT_OK = qw( get_timestamp );
sub get_timestamp {
my $ts = time;
return $ts;
}
1;
And this is the script that uses it:
#!/usr/bin/env perl
use strict;
use warnings;
use lib "/path/to/lib"; # path to the module we want, but
# excluding the "MyLib/DateTime.pm" part
use MyLib::DateTime qw( get_timestamp ); # import the function we want
# Here we might deal with input; e.g. @ARGV
# but as get_timestamp doesn't need any input, we don't
# have anything to do.
# Here we'll call the function we defined in the module.
my $result = get_timestamp();
# And here we'll do the output
print $result, "\n";
Now, running the script should output the current Unix timestamp. Another script that was doing something more complex with timestamps could also use MyLib::DateTime.
More importantly, another module which needed to do something with timestamps could use MyLib::DateTime. Putting logic into modules, and having those modules use each other is really the essence of CPAN. I've been demonstrating a really basic date and time library, but the king of datetime manipulation is the DateTime module on CPAN. This in turn uses DateTime::TimeZone.
The ease of re-using code, and the availability of a large repository of free, well-tested, and (mostly) well-documented modules on CPAN, is one of the key selling points of Perl.
Exactly.
Running two separate scripts at the same time won't give either script access to the other's functions at all. They are two completely separate processes. You need to use modules. The point of modules is that you don't repeat yourself, often called DRY ("Don't Repeat Yourself") programming. A simple rule of thumb is:
If you are going to use a block of code more than once put it into a subroutine in the current script.
If you are going to use the same block in several programs put it in a module.
Also remember that common problems usually have a module on CPAN.
That should be enough to get you going. Then if you're going to do much Perl Programming you should buy the book "Programming Perl" by Larry Wall, if you've programmed in other languages, or "Learning Perl" by Randal Schwartz if you're new to programming. I'm old fashioned so I have both books in print but you can still get them as ebooks. Also check out Perl.org as you're not alone.
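Applied to the two scripts in the question, the module advice could be sketched roughly as follows. To keep the example in a single file, the package is declared inline; in practice it would live in its own file such as lib/MyLib/Info.pm. The package name MyLib::Info and the snake_case function names are hypothetical, and the parsing step is a stand-in for the part the question omitted.

```perl
use strict;
use warnings;

# In a real project this package would live in lib/MyLib/Info.pm
# (hypothetical name) and be loaded with "use MyLib::Info;".
package MyLib::Info;

sub get_information {
    my ($server_mode, $port) = @_;
    return "You\nneed\nto\nparse\nme\nright\nnow\n!\n";
}

sub parse_information {
    my ($text) = @_;
    return split /\n/, $text;    # stand-in parser: one list element per line
}

package main;

# The script itself only passes arguments in and prints the results.
my @words = MyLib::Info::parse_information(
    MyLib::Info::get_information("bayside", 20)
);
print scalar(@words), " lines parsed\n";
```

Since everything runs in one process, perl -d can step into both functions, which addresses the debugger problem from the question.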

Having a perl script make use of one among several secondary scripts

I have a main program mytool.pl to be run from the command line. There are several auxiliary scripts special1.pl, special2.pl, etc., each of which contains a couple of subroutines and a hash, all identically named across scripts. Let's suppose these are named MySpecialFunction(), AnotherSpecialFunction() and %SpecialData.
I'd like for mytool to include/use/import the contents of one of the special*.pl files, only one, according to a command line option. For example, the user will do:
bash> perl mytool.pl --specialcase=5
and mytool will use MySpecialFunction() from special5.pl, and ignore all other special*.pl files.
Is this possible and how to do it?
It's important to note that the selection of which special file to use is made at runtime, so adding a "use" at the top of mytool.pl probably isn't the right thing to do.
Note I am a long-time C programmer, not a perl expert; I may be asking something obvious.
This is for a one-off project that will turn to dust in only a month. Neither mytool.pl nor special?.pl (nor perl itself) will be of interest beyond the end of this short project. Therefore, we don't care for solutions that are elaborate or require learning some deep magic. Quick and dirty preferred. I'm guessing that Perl's module mechanism is overkill for this, but have no idea what the alternatives are.
You can use a hash or array to map values of specialcase to .pl files and require or do them as needed.
#!/usr/bin/env perl
use strict; use warnings;
my @handlers = qw(./one.pl ./two.pl); # "./" needed since Perl 5.26, where "." is no longer in @INC
my ($case) = @ARGV;
$case = 0 unless defined $case;
# check that $case is within range
do $handlers[$case];
print special_function(), "\n";
When you use a module, Perl just requires the module in a BEGIN block (and imports the module's exported items). Since you want to choose which script to load at runtime, call require yourself.
if ($special_case_1) {
require './special1.pl'; # explicit "./", since "." is not in @INC on Perl 5.26+
# and go about your business
}
Here's a good reference on when to use use vs. require.
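As a quick-and-dirty sketch of the runtime require approach: the demo below first writes two stand-in special files into a temporary directory so that it can run on its own. In the real tool those files already exist and that setup part would be dropped. The helper name load_special and the file contents are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Demo setup only: create two stand-in special*.pl files in a temp directory.
my $dir = tempdir(CLEANUP => 1);
for my $n (1, 2) {
    open my $fh, '>', "$dir/special$n.pl" or die "Cannot write file: $!";
    print {$fh} "sub MySpecialFunction { return 'special case $n' }\n1;\n";
    close $fh;
}

# Hypothetical helper: load exactly one special file, chosen at run time.
sub load_special {
    my ($case) = @_;
    $case =~ /\A\d+\z/ or die "Bad --specialcase value: $case\n";
    my $file = "$dir/special$case.pl";
    -e $file or die "No such file: $file\n";
    require $file;    # full path given, so @INC is not searched
    return;
}

load_special(2);
print MySpecialFunction(), "\n";
```

Each special file needs to end with a true value (the "1;" above) for require to succeed, and the value passed to load_special would normally come from the --specialcase option, for instance via Getopt::Long.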

How can I convert CGI input to UTF-8 without Perl's Encode module?

Through this forum, I have learned that it is not a good idea to use the following for converting CGI input (from either an escape()d Ajax call or a normal HTML form post) to UTF-8:
read (STDIN, $_, $ENV{CONTENT_LENGTH});
s{%([a-fA-F0-9]{2})}{ pack ('C', hex ($1)) }eg;
utf8::decode $_;
A safer way (which for example does not allow bogus characters through) is to do the following:
use Encode qw (decode);
read (STDIN, $_, $ENV{CONTENT_LENGTH});
s{%([a-fA-F0-9]{2})}{ pack ('C', hex ($1)) }eg;
$_ = decode ('UTF-8', $_, Encode::FB_CROAK);
I would, however, very much like to avoid using any modules (including XSLoader, Exporter, and whatever else they bring with them). The function is for a high-volume mod_perl driven website and I think both performance and maintainability will be better without modules (especially since the current code does not use any).
I guess one approach would be to examine the Encode module and strip out the functions and constants used for the “decode ('UTF-8', $_, Encode::FB_CROAK)” call. I am not sufficiently familiar with Unicode and Perl modules to do this. Maybe somebody else is capable of doing this or knows a similar, safe “native” way of doing the UTF-8 conversion?
UPDATE:
I prefer keeping things non-modular, because then the only black-box is Perl's own compiler (unless of course you dig down into the module libs).
Sometimes you see large modules being replaced with a few specific lines of code. For example, instead of the CGI.pm module (which people are also in love with), one can use the following for parsing AJAX posts:
my %Input;
if ($ENV{CONTENT_LENGTH}) {
read (STDIN, $_, $ENV{CONTENT_LENGTH});
foreach (split (/&/)) {
tr/+/ /; s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;
if (m{^(\w+)=\s*(.*?)\s*$}s) { $Input{$1} = $2; }
else { die ("bad input ($_)"); }
}
}
In a similar way, it would be great if one could extract or replicate Encode's UTF-8 decode function.
Don't pre-optimize. Do it the conventional way first, then profile and benchmark later to see where you need to optimize. People usually waste all their time somewhere else, so starting off blindfolded and handcuffed doesn't give you any benefit.
Don't be afraid of modules. The point of mod_perl is to load up everything as few times as possible so the startup time and module loading time are insignificant.
Don't use escape() to create your posted data. This isn't compatible with URL-encoding, it's a mutant JavaScript oddity which should normally never be used. One of the defects is that it will encode non-ASCII characters to non-standard %uNNNN sequences based on UTF-16 code units, instead of standard URL-encoded UTF-8. Your current code won't be able to handle that.
You should typically use encodeURIComponent() instead.
If you must URL-decode posted input yourself rather than using a form library (and this does mean you won't be able to handle multipart/form-data), you will need to convert + symbols to spaces before replacing %-sequences. This replacement is standard in form submissions (though not elsewhere in URL-encoded data).
To ensure input is valid UTF-8 if you really don't want to use a library, try this regex. It also excludes some control characters (you may want to tweak it to exclude more).
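The regex that the answer pointed to is not reproduced here. As a stand-in, the sketch below uses a widely circulated byte-level pattern for well-formed UTF-8, which may or may not be the one originally meant. It assumes the input is a raw byte string (as read from STDIN and URL-decoded), and is_clean_utf8 is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $bytes is well-formed UTF-8 and contains no
# control characters other than tab, LF and CR. Expects a raw byte string.
sub is_clean_utf8 {
    my ($bytes) = @_;
    return $bytes =~ m/\A(?:
          [\x09\x0A\x0D\x20-\x7E]              # ASCII, minus most control characters
        | [\xC2-\xDF][\x80-\xBF]               # non-overlong 2-byte sequences
        | \xE0[\xA0-\xBF][\x80-\xBF]           # 3-byte, excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}    # straight 3-byte sequences
        | \xED[\x80-\x9F][\x80-\xBF]           # 3-byte, excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}        # planes 1 to 3
        | [\xF1-\xF3][\x80-\xBF]{3}            # planes 4 to 15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}        # plane 16
    )*\z/x ? 1 : 0;
}

my $ok  = "caf\xC3\xA9";    # "cafe" with an acute accent, as UTF-8 bytes
my $bad = "caf\xE9";        # the same word in Latin-1, not valid UTF-8
print is_clean_utf8($ok)  ? "valid\n" : "invalid\n";
print is_clean_utf8($bad) ? "valid\n" : "invalid\n";
```

Once a string passes this check, the built-in utf8::decode can turn the bytes into characters without loading Encode.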

How can I extract fields from a CSV file in Perl?

I want to extract particular fields from a CSV file (830k records) and store them in a hash. Is there any fast and easy way to do this in Perl without using any external methods?
How can I achieve that?
Use Text::CSV_XS. It's fast, moderately flexible, and extremely well-tested. The answer to many of these questions is something on CPAN. Why spend the time to make something not as good as what a lot of people have already perfected and tested?
If you don't want to use external modules, which is a silly objection, look at the code in Text::CSV_XS and do what it does. I'm constantly surprised that people who think they can't use a module also won't use a known and tested solution as example code for the same task.
Assuming normal CSV (i.e., no embedded commas), to get the 2nd field, for example:
$ perl -F"," -lane 'print $F[1];' file
See also this code fragment taken from The Perl Cookbook which is a great book in itself for Perl solutions to common problems
Using split would do the job, I guess (assuming the columns are separated by commas and no field contains a comma).
while (my $line = <INPUTFILE>) {
chomp $line;
my @columns = split /,/, $line; # field separator is ","
}
Then you can construct whatever hash you like from the elements of the @columns array.
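To make the last step concrete, here is a small sketch that builds a hash from two of the columns. The sample data, the column positions and the hash name %city_for are invented for illustration, and an in-memory filehandle stands in for the real 830k-line file. It assumes simple CSV with no quoted fields or embedded commas; anything more complex is a job for Text::CSV_XS.

```perl
use strict;
use warnings;

# Sample data standing in for the real file (assumption: plain CSV with a
# header line, no quoted fields, no embedded commas).
my $csv = "id,name,city\n1,alice,Paris\n2,bob,Oslo\n";
open my $in, '<', \$csv or die "Cannot open in-memory file: $!";

my %city_for;
my $header = <$in>;    # read and discard the header line
while (my $line = <$in>) {
    chomp $line;
    my @columns = split /,/, $line, -1;    # -1 keeps trailing empty fields
    $city_for{ $columns[0] } = $columns[2];    # key: 1st field, value: 3rd field
}
close $in;

print "$_ => $city_for{$_}\n" for sort keys %city_for;
```

For the real file, replace the in-memory open with open my $in, '<', $filename; reading line by line keeps memory use proportional to the hash rather than to the whole file.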