I am a Perl newbie, and although I manage to knock some code together, I am generally slowed down whenever I need to manipulate arrays; clearly I am not yet familiar enough with them. Currently, I am writing Perl code that queries a MySQL database and then generates a LaTeX table using the LaTeX::Table package. In the process I need to manipulate arrays, and I am seeking advice to improve my understanding of arrays in Perl.
In the first part of the code I query a MySQL database using a subroutine:
my @stashYear = queryRecent($Ticker,'YEAR');
print "Last Year:\t";
### Dump the stash contents!
foreach my $array_ref ( @stashYear ) {
    print "@$array_ref\n";
}
I have analogous code for the monthly and weekly queries. The output of the above looks as follows:
bash-3.2$ ./test.pl DG.PA
Last Year: 2015 45.935 62.6 43.4 59.14 256
Last Month: 2016-03 63.9 66.69 62.36 65.47 21
Last Week: 2016-15 64.96 66.24 64 65.95 5
bash-3.2$
At this stage I wish to add an element to each of the arrays to obtain the following:
Year 2015 45.935 62.6 43.4 59.14 256
Month 2016-03 63.9 66.69 62.36 65.47 21
Week 2016-15 64.96 66.24 64 65.95 5
I have tried a variety of methods, but so far without success. I have played around with unshift, but none of my attempts has produced a satisfactory result.
In the remainder of the code, I combine the arrays:
#Combine Year, Month and Week Array into one
my @stashRecent = @stashYear;
push (@stashRecent, @stashMonth);
push (@stashRecent, @stashWeek);
#Convert Array in Latex::Table compatible format
my $recentData = \@stashRecent;
And subsequently I pass it on to LaTeX::Table to generate the required table:
$table = LaTeX::Table->new(
{
data => $recentData,
}
);
$table->generate();
print $table->generate_string();
I get the expected output:
bash-3.2$ ./test.pl DG.PA
\begin{table}
\centering
\begin{tabular}{lrrrrr}
\toprule
2015 & 45.935 & 62.6 & 43.4 & 59.14 & 256 \\
2016-03 & 63.9 & 66.69 & 62.36 & 65.47 & 21 \\
2016-15 & 64.96 & 66.24 & 64 & 65.95 & 5 \\
\bottomrule
\end{tabular}
\end{table}
bash-3.2$
Although I manage to combine the arrays and convert them into the format required by LaTeX::Table, I have not succeeded in adding the elements (Year, Month and Week) described above.
Any feedback to get me going and improve my knowledge will be highly appreciated. In the meantime, I thank everybody who gives this some thought and helps a newbie become a little less of a newbie.
I'm not sure I understand the question fully, but it seems the following should do the work:
unshift @stashYear, 'Year';
unshift @stashMonth, 'Month';
unshift @stashWeek, 'Week';
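One caveat, assuming (as the dump loop in the question suggests) that each element of these arrays is itself an array reference holding one row: unshift on the outer array would then add a new row rather than a new first column. A minimal sketch of prepending the label to every row instead follows; the sample rows and the helper name label_rows are hypothetical stand-ins, since queryRecent is not shown.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for what queryRecent() returns: one array ref per row.
my @stashYear  = ( [ '2015',    '45.935', '62.6',  '43.4',  '59.14', '256' ] );
my @stashMonth = ( [ '2016-03', '63.9',   '66.69', '62.36', '65.47', '21'  ] );
my @stashWeek  = ( [ '2016-15', '64.96',  '66.24', '64',    '65.95', '5'   ] );

# label_rows() is an illustrative helper, not part of the original code.
# It returns copies of the rows with $label as a new first column.
sub label_rows {
    my ($label, @rows) = @_;
    return map { [ $label, @$_ ] } @rows;
}

my @stashRecent = (
    label_rows('Year',  @stashYear),
    label_rows('Month', @stashMonth),
    label_rows('Week',  @stashWeek),
);

print "@$_\n" for @stashRecent;
```

To modify the rows in place rather than copy them, `unshift @$_, 'Year' for @stashYear;` does the same job for one array.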
I need to create a list of days between a date interval.
Say for example from 2001-01-01 to 2009-12-31:
2001-01-01
2001-01-02
2001-01-03
..
2009-12-29
2009-12-30
2009-12-31
I know how to do it but maybe someone has a script already made?
If not, I will make such a script and upload it so others won't waste time on this when they need it.
I do not know awk from GnuWin32, but if the functions "mktime" and "strftime" are available, you can try the following code:
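BEGIN {
    START_DATE="2001-02-01"
    END_DATE="2001-03-05"
    S2=START_DATE
    gsub("-"," ",S2)
    # 01:00 rather than midnight leaves some margin for daylight-saving shifts
    T=mktime(S2 " 01 00 00")
    if (T<0)
        printf("%s is invalid.\n",START_DATE) >> "/dev/stderr"
    else
    {
        # step one day (86400 seconds) at a time until END_DATE has been printed
        for(S=START_DATE; END_DATE>S ;T+=86400) print S=strftime("%F",T)
    }
}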
The key is to convert the start date to a number meaning the seconds since the Epoch, add 86400 seconds (one day or 24 x 60 x 60) and convert back to the ISO date format.
After some trials I realized that the mktime() function accepts invalid dates as valid (for instance, 2000-14-03).
Best regards
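For comparison, the same epoch-arithmetic idea can be sketched in Perl using only core modules. This is an illustration rather than part of the answer above: date_range is a hypothetical helper name, and the sketch works in UTC (timegm and gmtime) so that daylight saving time cannot shift a day boundary. Unlike the awk mktime() behaviour described above, Time::Local's timegm dies on an out-of-range month or day.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# date_range() is a hypothetical helper: it returns every day from
# $start to $end inclusive, both given as YYYY-MM-DD strings.
sub date_range {
    my ($start, $end) = @_;

    # Convert both endpoints to seconds since the Epoch (UTC midnight).
    my @epoch = map {
        my ($y, $m, $d) = /\A(\d{4})-(\d{2})-(\d{2})\z/
            or die "Bad date: $_\n";
        timegm(0, 0, 0, $d, $m - 1, $y);
    } $start, $end;

    # Add one day (86400 seconds) at a time and convert back to ISO format.
    my @days;
    for (my $t = $epoch[0]; $t <= $epoch[1]; $t += 86400) {
        push @days, strftime('%Y-%m-%d', gmtime $t);
    }
    return @days;
}

print "$_\n" for date_range('2001-02-26', '2001-03-02');
```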
April 9, 2012 can be written in any of these ways:
4912
4/9/12
4-9-12
4 9 12
04-9-12
04-09-12
4 9 2012
4 09 2012
(I think you get the point)
For those of you that don't understand, the rules are:
1. Dates may or may not have ` `, `-` or `/` between them
2. The year can be written as 2 digits (assumed to be dates in the range of [2000, 2099] inclusive) or 4 digits
3. One digit month/days may or may not have leading zeroes.
How would you go about problem solving this to format the dates into 04/09/12?
I know the dates can be ambiguous, i.e., 12112 can be 12/1/12 or 1/21/12, but assume the smallest month possible.
This actually is something that regexes are good at; making an assumption, moving forward with it, then backtracking if necessary to get a successful match.
s{
    \A
    ( 1[0-2] | 0?[1-9] )
    [-/ ]?
    ( 3[01] | [12][0-9] | 0?[1-9] )
    [-/ ]?
    ( (?: [0-9]{2} ){1,2} )
    \z
}
{
    sprintf '%02u/%02u/%04u', $1, $2, ( length $3 == 4 ? $3 : 2000+$3 )
}xe;
The range checks present, while not determined by the value of the month, should be sufficient to pick a good date from the ambiguous cases (where there is a good date).
Note that it is important to try two digit month and days first; otherwise 111111 becomes 1-1-1111, not the presumably intended 11-11-11. But this means 11111 will prefer to be 11-1-11, not 1-11-11.
If a valid day of month check is needed, it should be performed after reformatting.
Notes:
s{}{} is a substitution using curly braces instead of / to delimit the parts of the regex to avoid having to escape the /, and also because using paired delimiters allows opening and closing both the pattern and replacement parts, which looks nice to me.
\A matches the start of the string being matched; \z matches the end. ^ and $ are often used for this, but can have slightly different meanings in some cases; I prefer these since they always only mean one thing.
The x flag on the end says this is an extended regex that can have extra whitespace or comments that are ignored, so that it is more readable. (Whitespace inside a character class isn't ignored.) The e flag says the replacement part isn't a string, it is code to execute.
'%02u/%02u/%04u' is a printf format, used for taking values and formatting them in a particular way; see http://perldoc.perl.org/functions/sprintf.html.
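To try the substitution on a few inputs, it can be wrapped in a small subroutine. This is only a sketch: normalize_date is a hypothetical name, and returning undef for non-matching input is an assumption about what a caller would want. The pattern itself is the one given above.

```perl
use strict;
use warnings;

# normalize_date() is a hypothetical wrapper around the substitution above.
# It returns the reformatted date, or undef if the input does not match.
sub normalize_date {
    my ($date) = @_;
    my $matched = $date =~ s{
        \A
        ( 1[0-2] | 0?[1-9] )
        [-/ ]?
        ( 3[01] | [12][0-9] | 0?[1-9] )
        [-/ ]?
        ( (?: [0-9]{2} ){1,2} )
        \z
    }
    {
        sprintf '%02u/%02u/%04u', $1, $2, ( length($3) == 4 ? $3 : 2000 + $3 )
    }xe;
    return $matched ? $date : undef;
}

for my $input ('4912', '4/9/12', '04-09-12', '4 9 2012', '111111', '11111') {
    my $out = normalize_date($input);
    printf "%-8s -> %s\n", $input, defined $out ? $out : 'no match';
}
```

The four spellings of April 9, 2012 in that list all come out as 04/09/2012. Note that 12112 becomes 12/01/2012 with this pattern, because two-digit months are tried first; that differs from the "smallest month possible" rule in the question.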
Install Date::Calc.
On Ubuntu, the package is libdate-calc-perl.
This should be able to read in all of those dates (except 4912, 4 9 2012 and 4 09 2012) and then output them in a common format.
I am currently using Template Toolkit (TT) and have never learned or used it before.
For example, I have 10 files: 5 dated dd/mm/2011 and 5 dated dd/mm/2012. I need to display each year only once. I tried using a FOREACH loop, but it displays 2011 five times and 2012 five times, whereas I want each year displayed only once.
What I need to achieve is to get the year and use it to create a link that displays the documents for that year.
I hope you understand; could some kind soul please help me out? =x
You'd use a similar approach in TT that you'd use in any other programming language. Make a note of the last year you saw and only print the current one if it's different.
Here's a simple example that you can run with tpage.
$ cat years.tt
[%- dates = [ '01/11/2012', '01/12/2012', '01/01/2013', '01/02/2013'];
lastyear = '';
FOREACH date IN dates;
bits = date.split('/');
IF bits.2 != lastyear;
bits.2 _ "\n";
END;
bits.0 _ '/' _ bits.1 _ "\n";
lastyear = bits.2;
END -%]
$ tpage years.tt
2012
01/11
01/12
2013
01/01
01/02
But you almost certainly want to think about passing a more sensible data structure into TT.
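One possible "more sensible data structure" is a hash that maps each year to the list of its dates, built in Perl before the template is processed. The sketch below is an assumption about what that could look like, not the answerer's own code; group_by_year is a hypothetical helper and the dates are the ones from the example above.

```perl
use strict;
use warnings;

# group_by_year() is a hypothetical helper: it takes dd/mm/yyyy strings
# and returns a hash ref mapping each year to an array ref of "dd/mm" strings.
sub group_by_year {
    my @dates = @_;
    my %by_year;
    for my $date (@dates) {
        my ($day, $month, $year) = split m{/}, $date;
        push @{ $by_year{$year} }, "$day/$month";
    }
    return \%by_year;
}

my $by_year = group_by_year('01/11/2012', '01/12/2012', '01/01/2013', '01/02/2013');

# Each year is printed once, followed by the dates that fall in it.
for my $year (sort keys %$by_year) {
    print "$year\n";
    print "$_\n" for @{ $by_year->{$year} };
}
```

With the data already grouped like this, a template only needs an outer loop over the years and an inner loop over each year's dates, without tracking the previous year.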
I have to parse a file and store it in a table. I was asked to use a hash to implement this. Give me simple means to do that, only in Perl.
-----------------------------------------------------------------------
L1234| Archana20 | 2010-02-12 17:41:01 -0700 (Mon, 19 Apr 2010) | 1 line
PD:21534 / lserve<->Progress good
------------------------------------------------------------------------
L1235 | Archana20 | 2010-04-12 12:54:41 -0700 (Fri, 16 Apr 2010) | 1 line
PD:21534 / Module<->Dir,requires completion
------------------------------------------------------------------------
L1236 | Archana20 | 2010-02-12 17:39:43 -0700 (Wed, 14 Apr 2010) | 1 line
PD:21534 / General Page problem fixed
------------------------------------------------------------------------
L1237 | Archana20 | 2010-03-13 07:29:53 -0700 (Tue, 13 Apr 2010) | 1 line
gTr:SLC-163 / immediate fix required
------------------------------------------------------------------------
L1238 | Archana20 | 2010-02-12 13:00:44 -0700 (Mon, 12 Apr 2010) | 1 line
PD:21534 / Loc Information Page
------------------------------------------------------------------------
I want to read this file and I want to perform a split or whatever to extract the following fields in a table:
the id that starts with L should be the first field in a table
Archana20 must be in the second field
timestamp must be in the third field
PD must be in the fourth field
Type (content preceding / must be in the last field)
My questions are:
How to ignore the --------… (separator line) in this file?
How to extract the above?
How to split since the file has two delimiters (|, /)?
How to implement it using a hash and what is the need for this?
Please provide some simple means so that I can understand since I am a beginner to Perl.
My questions are:
How to ignore the --------… (separator line) in this file?
How to extract the above?
How to split since the file has two delimiters (|, /)?
How to implement it using a hash and what is the need for this?
You will probably be working through the file line by line in a loop. Take a look at perldoc -f next. You can use regular expressions or a simpler match in this case, to make sure that you only skip appropriate lines.
You need to split first and then handle each field as needed after, I would guess.
Split on the primary delimiter (which appears to be ' | ' - more on that in a minute), then split the final field on its secondary delimiter afterwards.
I'm not sure if you are asking whether you need a hash or not. If so, you need to pick which item will provide the best set of (unique) keys. We can't do that for you since we don't know your data, but the first field (at a glance) looks about right. As for how to get something like this into a more complex data structure, you will want to look at perldoc perldsc eventually, though it might only confuse you right now.
One other thing, your data above looks like it has a semi-important typo in the first line. In that line only, there is no space between the first field and its delimiter. Everywhere else it's ' | '. I mention this only because it can matter for split. I nearly edited this, but maybe the data itself is irregular, though I doubt it.
I don't know how much of a beginner you are to Perl, but if you are completely new to it, you should think about a book (online tutorials vary widely and many are terribly out of date). A reasonably good introductory book is freely available online: Beginning Perl. Another good option is Learning Perl and Intermediate Perl (they really go together).
When you say "This is not a homework" and that this will be "a start to assess me in perl", I assume you mean that this is perhaps the first assignment you have at a new job or something. In that case, it seems that if we just give you the answer it will actually harm you later, since they will assume you know more about Perl than you do.
However, I will point you in the right direction.
A. Don't use split, use regular expressions. You can learn about them by googling "perl regex"
B. Google "perl hash" to learn about perl hashes. The first result is very good.
Now to your questions:
regular expressions will help you ignore lines you don't want
regular expressions will extract items. Look up "capture variables"
Don't split, use regex
See point B above.
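The regex-and-hash approach outlined in these points might look roughly like the sketch below. The sub name parse_log, the field names (name, timestamp, pd, type) and the exact patterns are assumptions made for illustration; they are derived only from the sample data in the question.

```perl
use strict;
use warnings;

# parse_log() is a hypothetical helper: it takes the lines of the file and
# returns a hash ref keyed by the L-number, one inner hash per record.
sub parse_log {
    my @lines = @_;
    my (%record, $id);
    for my $line (@lines) {
        next if $line =~ /^-+\s*$/;    # ignore the separator lines

        # Header line: id | name | timestamp ... (spaces around | are optional)
        if ($line =~ /^(L\d+)\s*\|\s*(\S+)\s*\|\s*(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)/) {
            $id = $1;
            $record{$id} = { name => $2, timestamp => $3 };
        }
        # Detail line: the part before the / and the part after it
        elsif (defined $id && $line =~ m{^(\S+)\s*/\s*(.*\S)}) {
            @{ $record{$id} }{qw(pd type)} = ($1, $2);
        }
    }
    return \%record;
}

my $records = parse_log(
    '-----------------------------------------------------------------------',
    'L1234| Archana20 | 2010-02-12 17:41:01 -0700 (Mon, 19 Apr 2010) | 1 line',
    'PD:21534 / lserve<->Progress good',
    '------------------------------------------------------------------------',
    'L1237 | Archana20 | 2010-03-13 07:29:53 -0700 (Tue, 13 Apr 2010) | 1 line',
    'gTr:SLC-163 / immediate fix required',
);

for my $id (sort keys %$records) {
    print join(' | ', $id, @{ $records->{$id} }{qw(name timestamp pd type)}), "\n";
}
```

Keying the hash on the first field gives direct lookup of a record by its id, which is the usual reason to prefer a hash over a plain list here.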
If this file is line based, you can read it line by line in a while loop and skip the lines that aren't formatted the way you want.
After that, you can use a regex as indicated in the other answer. I'd use that to split each line up, get an array, and build a hash of lists for the records. Either after that (or before), clean up each record by trimming whitespace etc. If you use a regex, use the capture expressions to add to your list in that fashion. It's up to you.
The hash key is the first column, the list contains everything else. If you are just doing a direct insert, you can get away with a list of lists and just put everything in that instead.
The key for the hash would allow you to look at particular records for fast lookup. But if you don't need that, then an array would be fine.
You can try this one.
Points to note:
read the file line by line
use a regular expression to skip the '----' lines
after that, use the split function to populate a hash of arrays.
#!/usr/bin/perl
use strict;
use warnings;
my $test_file = 'test.txt';
open(my $in, '<', $test_file) or die "Cannot open $test_file: $!";
my (%seen, $id, $name, $timestamp);
while (my $line = <$in>) {
    chomp $line;
    next if $line =~ m/^-/;    # skip the '---' separator lines
    if ($line =~ /\|/) {
        # header line: id | name | timestamp | line count
        ($id, $name, $timestamp) = split /\s*\|\s*/, $line, 4;
    } else {
        # detail line: PD / type; the record is complete, so store it
        my ($PD, $type) = split /\s*\/\s*/, $line, 2;
        $seen{$id} = [$name, $timestamp, $PD, $type];    # hash of arrays
    }
}
close($in);
for my $key (sort keys %seen) {
    print "$key:@{ $seen{$key} }\n";
}
I am getting a date field from the database in one of my variables, at the moment I am using the following code to check if the date is in "yyyy-mm-dd" format
if ( $dat =~ /\d{3,}-\d\d-\d\d/ )
My question: is there a better way to accomplish this?
Many Thanks
The OWASP Validation Regex Repository's version of dates in US format with support for leap years:
^(?:(?:(?:0?[13578]|1[02])(/|-|.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(/|-|.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(/|-|.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(/|-|.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
The Regular Expression Library contains a simpler version along the lines of the other suggestions, which is translated to your problem:
^\d{4}-\d{1,2}-\d{1,2}$
As noted by others, if this is a date field from a database, it should be coming in a well-defined format, so you can use a simple regex, such as that given by toolkit.
But that has the disadvantage that it will accept invalid dates, such as 2009-02-30. Again, if you're handling dates that successfully made it into a date-typed field in a DB, you should be safe.
A more robust approach would be to use one of the many Date/Time modules from CPAN. Probably Date::Manip would be a good choice, and in particular check out the ParseDate() function.
http://metacpan.org/pod/Date::Manip
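As an illustration of the module-based approach, here is a minimal sketch that swaps in the core Time::Piece module instead of Date::Manip, purely so that it runs without installing anything. The helper name is_valid_ymd is hypothetical. The idea is to parse the string and then check that formatting the result gives the same string back, which rejects dates such as 2009-02-30.

```perl
use strict;
use warnings;
use Time::Piece;

# is_valid_ymd() is a hypothetical helper: returns 1 if $s is a real
# calendar date in YYYY-MM-DD form, 0 otherwise.
sub is_valid_ymd {
    my ($s) = @_;

    # Cheap shape check first, so only digit groups reach the parser.
    return 0 unless defined $s && $s =~ /\A[0-9]{4}-[0-9]{2}-[0-9]{2}\z/;

    # strptime may die on out-of-range fields, hence the eval.
    my $t = eval { Time::Piece->strptime($s, '%Y-%m-%d') };
    return 0 unless defined $t;

    # A date that had to be adjusted will not format back to the input.
    return $t->ymd eq $s ? 1 : 0;
}

for my $date ('2016-02-29', '2009-02-30', '2009-13-01') {
    printf "%s is %s\n", $date, is_valid_ymd($date) ? 'valid' : 'invalid';
}
```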
How about
/\d{2}\d{2}?-(0[1-9]|1[0-2])-(0[1-9]|[1-2][0-9]|3[0-1])/
\d could match number characters from other languages. And is YYY really a valid year? If it must be four digits, dash, two digits, dash, two digits, I'd prefer /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ or /^[12][0-9]{3}-[0-9]{2}-[0-9]{2}$/. Be aware of space characters around the string you're matching.
Of course, this doesn't check the reasonableness of the characters that are there, except for the first character in the second example. If that's required, you'll do well to just pass it to a date parsing module and then check its output for logical results.
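The points about \d and about characters around the string can be checked with a short script. This is a sketch for illustration only: the helper matches is hypothetical, and the /a modifier used in the third case needs Perl 5.14 or later.

```perl
use strict;
use warnings;

# matches() is a hypothetical helper that returns 1 or 0 for a regex match.
sub matches {
    my ($re, $string) = @_;
    return $string =~ $re ? 1 : 0;
}

my $arabic_three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE, a non-ASCII digit

print matches(qr/\A\d\z/,     $arabic_three), "\n";    # \d accepts Unicode digits
print matches(qr/\A[0-9]\z/,  $arabic_three), "\n";    # [0-9] is ASCII only
print matches(qr/\A\d\z/a,    $arabic_three), "\n";    # /a restricts \d to ASCII

# $ also matches just before a trailing newline; \z does not.
my $with_newline = "2016-03-01\n";
print matches(qr/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/,   $with_newline), "\n";
print matches(qr/\A[0-9]{4}-[0-9]{2}-[0-9]{2}\z/, $with_newline), "\n";
```

The script prints 1, 0, 0, 1, 0: only the plain \d pattern accepts the non-ASCII digit, and only the ^...$ pattern accepts the string with a trailing newline.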
The best lightweight solution is to use Date::Calc's check_date subroutine; here's an example:
use strict;
use warnings;
use Date::Calc qw[check_date];
## string in YYYY-MM-DD format, you can have any format
## you like, just parse it
my $str_dob_date = '2009-02-30';    # example value (an assumption), not from the original post
my @dt_dob = unpack("A4xA2xA2",$str_dob_date);
unless(check_date(@dt_dob)) {
    warn "Oops! invalid date!";
}
I hope that was helpful :-)
Well you can start with:
/\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|30|31)/
I would very strongly recommend AGAINST writing your own regular expression to do this. Date/time parsing is simple, but there are some tricky aspects, and this is a problem that has been solved hundreds of times. No need for you to design, write, and debug yet another solution.
If you want a regular expression, the best solution is probably to use my Regexp::Common::time plugin for the Regexp::Common module. You can specify simple or complex, rigid or fuzzy date/time matching, and it has a very extensive test suite.
If you just want to parse specific date formats, you may be better off using one of the many parsing/formatting plugins for Dave Rolsky's excellent DateTime module.
If you want to validate the date/time values after you have matched them, I would suggest my Time::Normalize module.
Hope this helps.
I think using a regex without an additional check is much too complicated! I use a little sub to do it:
sub check_date {
    my $date_string = shift;
    # Check the string format and get year, month and day out of it.
    # Best to use a regex.
    return 0 unless $date_string =~ m/^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;
    # 31st in a month with 30 days
    return 0 if ($3 >= 31 and ($2 == 4 or $2 == 6 or $2 == 9 or $2 == 11));
    # February 30th or 31st
    return 0 if ($3 >= 30 and $2 == 2);
    # February 29th in a year that is not a leap year
    return 0 if ($2 == 2 and $3 == 29
        and not ($1 % 4 == 0 and ($1 % 100 != 0 or $1 % 400 == 0)));
    # Date is valid
    return 1;
}
I got the idea (and most of the code) from regular-expressions.info. There are other examples too.