File creation date with Image::ExifTool or Date::Handler

I'm trying to merge several directories of pictures, taken in different time zones, into a single directory, renaming every file to one uniform format (Hungarian style, yyyymmdd_hhmmss) and eliminating the large number of duplicates in the same pass.
I was looking for modules (beyond stat, which only returns epoch values such as (stat(file))[9]) that can read a file's creation date rather than its last modification or access date.
I found Image::ExifTool, which seems to be the most effective and the easiest to use.
I noticed that after several iterations over the directory the module stops extracting the correct date/time: the output gets stuck on one identical, wrong date/time value for all the following files. Below is the code, and one of the pictures that triggers the problem: the file's properties on Windows 10 show a creation date of 20140626 12:16 pm, whereas the routine returns 20021208_120000.
foreach $img(@img){
next if -l $img;
$img =~ /.+(\..+$)/;
$ext = $1;
# %ENV;
$exif=new Image::ExifTool;
$exif->ExtractInfo($dir.$img);
$for = $exif->GetValue('CreateDate');
$for =~ s/$space/\_/g;
$for =~ s/\://g;
$for = '_'.$for;
$size = (stat($dir.$img))[7];
# $date = Date->new($date[9]);
# @data = $date->array;
#tie my %date, 'Date::Tie', utc_epoch => $date{$date[9]}; #tz => $date{tz};
#my $date = Date::Tie->new( epoch => $date[9] );
%date;
# $for = IMG.$for.$ext;
if (!$all{'IMG'.$for.$ext}){
$all{'IMG'.$for.$ext}= $size ;
rename $dir.$img, $dir.'IMG'.$for.$ext;
print "rename $dir.$img, $dir.'IMG'.$for.$ext\n";
}elsif($all{'IMG'.$for.$ext} == $size){
unlink $dir.$img;
print "Deleting $dir.$img\n";
}
}
Checking the properties of the files involved, the "wrong" ones seem to have the same properties as the "working" ones: both show a creation/acquisition date, a modification date and a last access date.
I can't understand where the module goes wrong.
Do you have any recommendation? Is there a different module I should use?
Thanks
Simon

On Windows you can use Win32API::File::Time to read and modify the file creation time:
use feature qw(say);
use strict;
use warnings;
use Win32API::File::Time qw(GetFileTime SetFileTime);
use File::Temp qw(tempdir);
my $dir = tempdir( CLEANUP => 1 );
my $fn = 'test.txt';
open (my $fh, '>', $fn) or die "Could not open file '$fn': $!";
say $fh "Foobar";
close $fh;
print_file_times($fn);
my ($atime, $mtime, $ctime) = GetFileTime ($fn);
SetFileTime ($fn, undef, undef, $ctime-180);
print_file_times($fn);
sub print_file_times {
my ($fn) = @_;
my ($atime, $mtime, $ctime) = GetFileTime ($fn);
say "File: $fn:";
say " File access time: $atime";
say " File modification time: $mtime";
say " File creation time: $ctime";
return $ctime;
}
Output:
File: test.txt:
File access time: 1614640101
File modification time: 1614640101
File creation time: 1614639958
File: test.txt:
File access time: 1614640101
File modification time: 1614640101
File creation time: 1614639778
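For the renaming problem in the question itself, here is a minimal sketch of a more defensive EXIF lookup, assuming the pictures carry a DateTimeOriginal or CreateDate tag in the usual "YYYY:MM:DD HH:MM:SS" form. The helper names exif_to_stamp and stamp_for_file are hypothetical, and falling back to the modification time when no usable tag exists is my own assumption, not something the question asked for.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Turn an EXIF-style date ("2014:06:26 12:16:00") into "20140626_121600".
# Returns undef for missing, malformed or all-zero values.
sub exif_to_stamp {
    my ($value) = @_;
    return undef unless defined $value
        && $value =~ /^(\d{4}):(\d\d):(\d\d)[ T](\d\d):(\d\d):(\d\d)/;
    return undef if $1 eq '0000';
    return sprintf '%s%s%s_%s%s%s', $1, $2, $3, $4, $5, $6;
}

# Best available timestamp for one file: EXIF tags first, then mtime.
sub stamp_for_file {
    my ($path) = @_;
    require Image::ExifTool;    # CPAN module, loaded only when this is called
    my $exif = Image::ExifTool->new;
    $exif->ExtractInfo($path);
    for my $tag (qw(DateTimeOriginal CreateDate)) {
        my $stamp = exif_to_stamp(scalar $exif->GetValue($tag));
        return $stamp if defined $stamp;
    }
    # Assumption: with no usable EXIF date, the modification time is acceptable.
    my $mtime = (stat $path)[9];
    return defined $mtime ? strftime('%Y%m%d_%H%M%S', localtime $mtime) : undef;
}
```

Checking for undef before building the new name keeps a file without a usable tag from silently receiving a bogus or empty timestamp.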

Related

How to set file last modified file attribute in Perl

Here I use a last_run file stored in the local directory and update its last-modified date if it's not equal to the current date. Is it possible to update the mtime in some other way, without writing a character into the file?
I saw that there is a module File::Touch, but I don't want to use it as I already have an open file, and it will be faster to use the file descriptor as is.
#!/usr/bin/perl
use utf8;
use strict;
use warnings;
use File::stat;
use DateTime;
my $file_name = "last_run";
if (-e $file_name)
{
my $fh;
if (open($fh, "+<", $file_name))
{
my $timestamp = stat($fh)->mtime;
my $now = DateTime->now(time_zone => 'local');
my ($sec, $min, $hour, $day, $month, $year) = localtime($timestamp);
$month += 1;
$year += 1900;
if ($now->day != $day or $now->month != $month or $now->year != $year)
{
print $fh ' '; # <- I have to write something into the file
}
print "$day $month $year\n";
print $now->day.' '.$now->month.' '.$now->year."\n";
}
else
{
print "cannot open +< $file_name: $!";
}
close($fh);
}
else
{
open(my $fh, ">", $file_name)
or print "cannot open > file name: $!";
}
You're looking for the utime function.
Some quotes from the documentation:
Changes the access and modification times on each file of a list of files. The first two elements of the list must be the NUMERIC access and modification times, in that order. Returns the number of files successfully changed. The inode change time of each file is set to the current time.
Since Perl 5.8.0, if the first two elements of the list are undef, the utime(2) syscall from your C library is called with a null second argument. On most systems, this will set the file's access and modification times to the current time.
On systems that support futimes(2), you may pass filehandles among the files. On systems that don't support futimes(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.
So, something like:
utime undef, undef, $fh;
will update the file's modification (and access) time to the current one. If your system is not one of the "most systems" mentioned above, there's another example in the documentation that explicitly uses the current time.
futimes(2) is present on Linux and the BSDs. If you're using a different OS, you'll have to use the filename instead of the handle.
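A minimal sketch of that fallback, assuming you still have both the open handle and the file name at hand. The function name touch_open_file is hypothetical.

```perl
use strict;
use warnings;

# Set the access and modification times of an already open file to "now".
# Tries the filehandle first; where futimes(2) is unavailable utime dies,
# so the eval catches that and the file name is used instead.
sub touch_open_file {
    my ($fh, $name) = @_;
    my $changed = eval { utime undef, undef, $fh };
    return 1 if $changed;
    return utime(undef, undef, $name) ? 1 : 0;
}
```

In the script from the question this would replace the print $fh ' ' line with touch_open_file($fh, $file_name).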

Split my output into multiple files

I have the following list in a CSV file, and my goal is to split this list into directories named YYYY-Month based on the date in each row.
NAME99;2018/06/13;12:27:30
NAME01;2018/06/13;13:03:59
NAME00;2018/06/15;11:33:01
NAME98;2018/06/15;12:22:00
NAME34;2018/06/15;16:58:45
NAME17;2018/06/18;15:51:10
NAME72;2018/06/19;10:06:37
NAME70;2018/06/19;12:44:03
NAME77;2018/06/19;16:36:55
NAME25;2018/06/11;16:32:57
NAME24;2018/06/11;16:32:57
NAME23;2018/06/11;16:37:15
NAME01;2018/06/11;16:37:15
NAME02;2018/06/11;16:37:15
NAME01;2018/06/11;16:37:18
NAME02;2018/06/05;09:51:17
NAME00;2018/06/13;15:04:29
NAME07;2018/06/19;10:02:26
NAME08;2018/06/26;16:03:57
NAME09;2018/06/26;16:03:57
NAME02;2018/06/27;16:58:12
NAME03;2018/07/03;07:47:21
NAME21;2018/07/03;10:53:00
NAMEXX;2018/07/05;03:13:01
NAME21;2018/07/05;15:39:00
NAME01;2018/07/05;16:00:14
NAME00;2018/07/08;11:50:10
NAME07;2018/07/09;14:46:00
What is the smartest way to achieve this without hard-coding a list of paths to append to?
Currently my program writes this list to a directory called YYYY-Month based only on localtime; it does not act on the date in each line.
Perl
#!/usr/bin/perl
use strict;
use warnings 'all';
use feature qw(say);
use File::Path qw<mkpath>;
use File::Spec;
use File::Copy;
use POSIX qw<strftime>;
my $OUTPUT_FILE = 'output.csv';
my $OUTFILE = 'splitted_output.csv';
# Output to file
open( GL_INPUT, $OUTPUT_FILE ) or die $!;
$/ = "\n\n"; # input record separator
while ( <GL_INPUT> ) {
chomp;
my @lines = split /\n/;
my $i = 0;
foreach my $lines ( @lines ) {
# Encapsulate Date/Time
my ( $name, $y, $m, $d, $time ) =
$lines[$i] =~ /\A(\w+);(\d+)\/(\d+)\/(\d+);(\d+:\d+:\d+)/;
# Generate Directory YYYY-Month - #2009-January
my $dir = File::Spec->catfile( $BASE_LOG_DIRECTORY, "$y-$m" ) ;
unless ( -e $dir ) {
mkpath $dir;
}
my $log_file_path = File::Spec->catfile( $dir, $OUTFILE );
open( OUTPUT, '>>', $log_file_path ) or die $!;
# Here I append value into files
print OUTPUT join ';', "$y/$m/$d", $time, "$name\n";
$i++;
}
}
close( GL_INPUT );
close( OUTPUT );
There is no reason to care about the actual date, or to use date functions at all here. You want to split up your data based on a partial value of one of the columns in the data. That just happens to be the date.
NAME08;2018/06/26;16:03:57 # This goes to 2018-06/
NAME09;2018/06/26;16:03:57 #
NAME02;2018/06/27;16:58:12 #
NAME03;2018/07/03;07:47:21 # This goes to 2018-07/
NAME21;2018/07/03;10:53:00 #
NAMEXX;2018/07/05;03:13:01 #
NAME21;2018/07/05;15:39:00 #
The easiest way to do this is to iterate your input data, then stick it into a hash with keys for each year-month combination. But you're talking about log files, and they might be large, so that's inefficient.
We should work with different file handles instead.
use strict;
use warnings;
my %months = ( 6 => 'June', 7 => 'July' );
my %handles;
while (my $row = <DATA>) {
# no chomp, we don't actually care about reading the whole row
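my (undef, $date) = split /;/, $row; # discard name and everything after date
# create the YYYY-Month key; "+ 0" turns "06" into 6 so the hash lookup matches
my ($year, $month) = $date =~ m[^(\d{4})/(\d\d)];
my $dir = "$year-" . $months{ $month + 0 };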
# open a new handle for this year/month if we don't have it yet
unless (exists $handles{$dir}) {
# create the directory (skipped here) ...
open my $fh, '>', "$dir/filename.csv" or die $!;
$handles{$dir} = $fh;
}
# write out the line to the correct directory
print { $handles{$dir} } $row;
}
__DATA__
NAME08;2018/06/26;16:03:57
NAME09;2018/06/26;16:03:57
NAME02;2018/06/27;16:58:12
NAME03;2018/07/03;07:47:21
NAME21;2018/07/03;10:53:00
NAMEXX;2018/07/05;03:13:01
NAME21;2018/07/05;15:39:00
I've skipped the part about creating the directory as you already know how to do this.
This code will also work if your rows of data are not sequential. It's not the most efficient, as the number of open handles grows with the number of distinct months in your data, but as long as you don't have hundreds of them open at the same time that does not really matter.
Things of note:
You don't need chomp because you don't care about working with the last field.
You don't need to assign all of the values after split because you don't care about them.
You can discard values by assigning them to undef.
Always use three-argument open and lexical file handles.
The {} in print { ... } $row are needed to tell Perl that this is the handle we are printing to. See http://perldoc.perl.org/functions/print.html.
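Putting the pieces together, including the directory creation that was skipped, could look like the sketch below. It assumes rows of the form NAME;YYYY/MM/DD;HH:MM:SS, appends rather than overwrites, and silently skips rows without a valid date. The function name split_by_month and its parameters are hypothetical.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;

my @MONTHS = qw(January February March April May June July
                August September October November December);

# Append each row read from $in to <base>/<YYYY-Month>/<file>.
# Returns the number of rows written.
sub split_by_month {
    my ($in, $base, $file) = @_;
    my %handles;
    my $written = 0;
    while (my $row = <$in>) {
        # year and month come from the second ';'-separated field
        my ($y, $m) = $row =~ m{^[^;]*;(\d{4})/(\d\d)/} or next;
        next if $m < 1 || $m > 12;
        my $dir = File::Spec->catdir($base, "$y-$MONTHS[$m - 1]");
        unless ($handles{$dir}) {
            make_path($dir);    # no-op when the directory already exists
            my $path = File::Spec->catfile($dir, $file);
            open my $fh, '>>', $path or die "Cannot open '$path': $!";
            $handles{$dir} = $fh;
        }
        print { $handles{$dir} } $row;
        $written++;
    }
    close $_ or die "Close failed: $!" for values %handles;
    return $written;
}
```

A call such as split_by_month(\*STDIN, 'logs', 'splitted_output.csv') would then read the list from standard input; 'logs' is only a placeholder for your base directory.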

Rolling archival file containing last 1 year's data

A script needs to be developed to create an archive file containing only the last year's data. The script will copy the content of the main data file (CEMI.log) and append it to the archive log file (CEMI.Archive.log). This will happen once a week (every Monday at 3 am). After one year, the script should be able to remove the data that is older than a year from the archive file.
I am stuck at the point where I need to remove the data older than one year. How can this be done using a shell or Perl script?
sample file:
-bash-3.2# more test.txt
2015-01-15,09:17:10,101,20a6475d-4d0c-4fe4-8765-35065ddfe887,_1.1,L,
2015-01-15,09:18:57,70,al Test,20a6475d-4d0c-4fe4-8765-35065ddfe887,1,L,
2015-01-15,10:59:28,1,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:00:52,2,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:00:56,1,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:03:14,1,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:03:38,1,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:04:01,1,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:05:07,2,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,11:06:45,32,,,,Best Practice,9f02745244d6440584b24012d882f935,,L,
2015-01-15,12:57:13,36,,,560909,Best Practice,e8418950-6561-4465-b16b-30e118e826b7,,L,
2015-01-15,13:37:56,1032,,xml-data1,Test,20a6475d-4d0c-4fe4-8765-35065ddfe887,Request_1.1,L,Success
2015-01-15,13:39:01,38,,,Internal Test,20a6475d-4d0c-4fe4-8765-35065ddfe887,Request_1.1,L,
2015-01-15,13:39:50,113,,xml-dat1,al Test,20a6475d-4d0c-4fe4-8765-35065ddfe887,Request_1.1,L,
2015-01-15,13:40:17,74,,, Test,20a6475d-4d0c-4fe4-8765-35065ddfe887,Request_1.1,L,
This will stop at the first line not matching your date format.
#!/usr/bin/env perl
use strict;
use warnings;
use DateTime;
use File::Copy qw/move/;
use File::Temp qw/tempfile/;
# filenames
my $log = "test.txt";
my ( undef, $tmp ) = tempfile( undef, OPEN => 0 ); # get safe temporary filename, but do not open file
# move old log file to temporary location
move $log => $tmp
or die "Cannot rename '$log' to '$tmp': $!";
# open temporary file (contains old log) and new log
open( my $in, "<", $tmp )
or die "Cannot open '$tmp': $!";
open( my $out, ">", $log )
or die "Cannot open '$log': $!";
# calculate a DateTime value one year in the past
my $limit = DateTime->today->subtract( years => 1 );
# skip lines with date that is too old
while (<$in>) {
if (m/^(\d\d\d\d)-(\d\d)-(\d\d)/) {
# get DateTime object from matched date parts
my $dt = DateTime->new(
year => $1,
month => $2,
day => $3,
);
# keep on skipping lines while dates are too old
if ( $dt < $limit ) {
next;
}
}
# if no date was found or date too young, end skipping
last;
}
# copy all remaining lines from temporary file to new log
while (<$in>) {
print $out $_;
}
close $in or warn "Error closing '$in': $!";
close $out or warn "Error closing '$out': $!";
unlink $tmp or die "Cannot delete '$tmp': $!";
How to execute this:
$ cat > test.txt # press Ctrl+D to end text input
2014-01-01,will be removed
stops here
2014-01-01,will be kept because stopped
2015-01-01,would be kept anyway
$ perl rotate_log.pl
$ cat test.txt
2014-01-01,will be kept because stopped
2015-01-01,would be kept anyway
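Note that the script above consumes the line on which it stops skipping (the "stops here" line is absent from the result). If you would rather test every line and keep all lines that carry no date, a variant without DateTime is sketched below. It relies on the fact that YYYY-MM-DD dates sort correctly as plain strings. The function names cutoff_date and prune_old_lines are hypothetical, and filtering every line instead of stopping at the first recent one is a different technique from the one used above.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# ISO date (YYYY-MM-DD) one year before the given epoch time, in local time.
sub cutoff_date {
    my ($now) = @_;
    my @t = localtime $now;
    $t[5]--;    # one year back; strftime normalises Feb 29 to Mar 1
    return strftime('%Y-%m-%d', @t);
}

# Copy lines from $in to $out, dropping those whose leading date is earlier
# than $cutoff. Lines that do not start with a date are kept.
# Returns the number of lines kept.
sub prune_old_lines {
    my ($in, $out, $cutoff) = @_;
    my $kept = 0;
    while (my $line = <$in>) {
        next if $line =~ /^(\d{4}-\d\d-\d\d)/ && $1 lt $cutoff;
        print {$out} $line;
        $kept++;
    }
    return $kept;
}
```

The handles would be the same $in and $out as in the script above, with prune_old_lines($in, $out, cutoff_date(time)) replacing both while loops.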

Load multiple csv file in oracle table from perl

After some research I decided to put my question here for more expert answers. I couldn't find the exact scenario of my problem, so here it goes...
I think it will take me a few days to get something working; I can't even think about how to move forward right now.
DB: 11gR2
OS: Unix
I'm trying to load multiple CSV files into an Oracle table using a Perl script.
List all the CSV files I need to work on, since the directory where the CSV files live contains many other files.
Open each CSV file and insert its rows into the table.
If there is any error, roll back all inserts from that file and move on to the next file.
Record how many inserts were done for each file.
#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;
use DBD::Oracle;
my $exitStatus = 0;
my $dow = `date +%a`; chomp $dow;
my $csvDow = `date -dd +%a`; chomp $csvDow;
# define logfile
my logFile;
$logFile = "log.dbinserts"
# define csv file directory
my $csvLogDir = "Home/log/$csvDow";
# csv Files in array to list all possible match of file
opendir(my $dh, $csvLogDir ) || die "can't opendir $csvLogDir : $!";
my @csvFile = grep { /csv.*host1/ && -f "$csvLogDir/$_" } readdir($dh); chomp @csvFile;
closedir $dh;
foreach my $i (@csvFile)
{
$logFile (CSV File: $i);
}
foreach my $file (@csvFile)
{
chomp ($item);
$logFile-> ("Working under: $file");
&insertRecords($csvLogDir."/".$file);
}
$logFile-> ("Exit status")
#----------------
sub insertRecords
{
my $filetoInsert=shift;
my $row;
open my $fh, "<" or die "$fileToInsert: $!";
my $csv = Text::CSV->new ({
binary =>1,
auto_diag =>1,
});
while ($row = $csv->getline ($fh))
{
print "first column : $row->[0]\n,";
}
close $fh;
}
========
CSV File
=========
date, host, first, number1, number2
20141215 13:05:08, S1, John, 100, 100.20
20141215 13:06:08, S2, Ray, 200, 200.50
...
...
...
=========
Table - tab1
=========
Sample_Date
Server
First
N1
N2
For the first step, it depends on which criteria you need to select your CSV files by.
If it's based on the name of those CSV files, you could simply use opendir and get the list of files with readdir:
my $dirToScan = '/var/data/csv';
opendir(my $dh, $dirToScan ) || die "can't opendir $dirToScan : $!";
my @csvFiles = grep { /\.csv$/ && -f "$dirToScan/$_" } readdir($dh);
closedir $dh;
In this example you'll retrieve an array with all the files that end with .csv (within the designated dir).
After that you'll need to use your foreach on your array.
You can find more examples and explanations here
I don't know the structure of your CSV, but I would advise using a module like Text::CSV. It's a simple CSV parser that wraps Text::CSV_PP, or Text::CSV_XS if that is installed on your system (the XS version is faster than the pure-Perl one because it is written in C/XS).
This module allows you to transform a CSV row into an array like this:
use Text::CSV;
my $file = "listed.csv";
open my $fh, "<", $file or die "$file: $!";
my $csv = Text::CSV->new ({
binary => 1, # Allow special character. Always set this
auto_diag => 1, # Report irregularities immediately
});
while (my $row = $csv->getline ($fh)) {
print "first colum : $row->[0]\n";
}
close $fh;
from : perlmeme.org
You'll need to open() your file (within the foreach loop) and pass the handle to the Text::CSV object (you can declare your parser outside of the loop).
That's the easiest case, where you know the column numbers of your CSV. If you need to use the column names, you'll need to use the getline_hr() method (see the CPAN doc of Text::CSV).
Once you have your values (you should be within the foreach loop over your file list and in the while loop that iterates over the rows of your CSV), you will need to insert this data into your database.
For this you'll need the DBD::Oracle module, which will allow you to connect to the database.
Like with every DBI driver, you'll need to open a connection, using this syntax:
use DBI;
$dbh = DBI->connect("dbi:Oracle:$dbname", $user, $passwd);
And then in your loop (while you're reading your CSV rows) you should be able to do something like this:
$SQL = "INSERT INTO yourTable (foobar,baz) VALUES (?,?)";
$sth = $dbh->prepare($SQL);
$sth->execute($row->[0],$row->[1]);
Here you have three steps: the SQL text with the values replaced by '?' placeholders, its preparation, and its execution with the desired values (you can also use declared variables instead of $row->[0] and so on, if you have a lot of columns).
The placeholders are filled in by execute, in the order in which the values are passed.
To catch a failed request you only have to set RaiseError to 1 when the connection is declared, which would look something like this:
$dbh = DBI->connect("dbi:Oracle:$dbname", $user, $passwd,
{
PrintError => 1,
PrintWarn => 1,
RaiseError => 1
});
And then when executing the request (try/catch here comes from the Try::Tiny module):
use Try::Tiny;
try
{
$sth->execute($row->[0],$row->[1]);
}
catch
{
warn "INSERT error : $_";
$CSVhasFailures = 1;
};
You'll need to set the value of $CSVhasFailures to 0 before each CSV file.
After that, by testing the value of $CSVhasFailures at the end of the while loop you can decide to execute a commit or a rollback, using the commit and rollback methods of the database handle (for these to have any effect, AutoCommit must be turned off when connecting).
If you want to count the number of inserts, you just have to put a $counter++ after the $sth->execute statement.
For more info on DBD::Oracle, I would suggest reading its CPAN documentation page.
Last suggestion: proceed step by step. List your CSV files, read the rows of each CSV, read a column, print a set of columns, and then insert your data into a temporary table.
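The per-file commit/rollback logic could be sketched as below. This is only an outline under several assumptions: $dbh was opened with RaiseError => 1 and AutoCommit => 0, the first line of each CSV is a header, and the table columns accept the CSV values as they are (the Sample_Date column will probably need a TO_DATE conversion in real use). The function name load_csv_file is hypothetical; the table and column names are the ones from the question. It uses a plain eval instead of Try::Tiny to avoid the extra dependency.

```perl
use strict;
use warnings;
use Text::CSV;

# Insert every data row of one CSV file inside a single transaction.
# Returns the number of rows committed, or 0 after a rollback.
sub load_csv_file {
    my ($dbh, $path) = @_;
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1, allow_whitespace => 1 });
    my $count = 0;
    my $ok = eval {
        open my $fh, '<', $path or die "$path: $!\n";
        $csv->getline($fh);    # skip the header line (assumed present)
        my $sth = $dbh->prepare(
            'INSERT INTO tab1 (Sample_Date, Server, First, N1, N2) VALUES (?,?,?,?,?)');
        while (my $row = $csv->getline($fh)) {
            $sth->execute(@$row);    # dies on error because of RaiseError
            $count++;
        }
        close $fh;
        $dbh->commit;
        1;
    };
    return $count if $ok;
    warn "Rolling back $path: $@";
    $dbh->rollback;
    return 0;
}
```

Called once per file from the foreach loop, the return value gives you the number of inserts to write to your log.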

Generate dynamic file name list

I started programming in perl few months back and this is my first question on stackoverflow.com. I hope I can get a solution.
So I want to copy some files from an ftp server. The file names are in this format:
abc_201149_cde_07_fgh_include_internal
In this example the numeric part gets changed on weekly basis, e.g. 201149 says year = 2011 and week = 49. Similarly, 07 says which version it is.
I have copied all the file names into one file called "sdk_link.txt" and I am reading each file name from this and then copying to my local PC:
use Net::FTP;
use File::Copy;
$Login = "<redacted>";
$Pwd = "<redacted>";
$ftpHost = "<redacted>";
$ftpFolder = "/daj/dhakj/ahdakl/abc_201206_def_05";
$ftp=Net::FTP->new($ftpHost,Timeout=>100);
if ($ftp)
{
print $ftp->message;
}
$ftp->login($Login,$Pwd);
print $ftp->message;
$ftp->cwd($ftpFolder);
print $ftp->message;
open FILE,"sdk_link.txt" or die $!;
while($test=<FILE>)
{
chomp($test);
#Copy the file
copy("$test","/testing");
}
$ftp->quit;
I want to run this script every week on Windows. How can I make the numeric part change so that the correct files get downloaded?
Well, the obvious answer is to keep a template on file, and insert the correct numbers. For example:
echo abc_%s_cde_%s_fgh_include_internal |
perl -MPOSIX -nE 'say sprintf $_, strftime("%Y%U", localtime()), "07";'
Output:
abc_201207_cde_07_fgh_include_internal
So that if you'd have a file with templates, you can use %s to insert strings, and provide arguments either from your own list of arguments, or dynamically, as you prefer. E.g.:
my $year = "2011";
my $week = "49";
my $ver = "07"; # Strings or numbers does not really matter
open my $fh, '<', "sdk_link.txt" or die $!;
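while (<$fh>) {
chomp; # drop the newline, or it ends up in the file name
my $file = sprintf $_, $year, $week, $ver; # each template line needs one %s per argument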
copy($file, "/testing") or die "Copy failed for file $file: $!";
}
I am not so sure File::Copy::copy works as intended for remote files, but that's another question. I believe Net::FTP::get() might be what you want.
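A rough sketch of how the template and Net::FTP's get method could be combined, assuming the week number in the file names matches what strftime's %U produces (week of the year, counting from the first Sunday) and that $ftp is an already logged-in Net::FTP object. The function names build_name and fetch_current are hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Fill a template such as "abc_%s_cde_%s_fgh_include_internal" with the
# year+week stamp for the given epoch time and a two-digit version.
sub build_name {
    my ($template, $epoch, $version) = @_;
    my $stamp = strftime('%Y%U', localtime $epoch);
    return sprintf $template, $stamp, sprintf('%02d', $version);
}

# Download this week's file into $local_dir.
# Returns the local file name on success, undef on failure.
sub fetch_current {
    my ($ftp, $template, $version, $local_dir) = @_;
    my $name = build_name($template, time, $version);
    $ftp->binary;    # avoid newline translation on Windows
    return $ftp->get($name, "$local_dir/$name");
}
```

If the server numbers its weeks differently (ISO weeks, for instance), only the format string in build_name would need to change.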