Convert date from file in Perl - perl

In my program, I need to open a text file, find the dates in it, and convert them. I don't know how to convert the dates with my code. Currently the result is in mm/dd/yyyy format; I'd like to change it to dd-mm-yyyy. Is that possible with my code? How should I fix it?
#my program
use strict;
use warnings;
use feature 'say';
open my $file, 'file.txt' or die "Error\n";
my $re = qr/\d{1,2}\/\d{1,2}\/\d{1,4}/i; #mm/dd/yyyy
#my $re = qr/\d{1,2}-\d{1,3}/i; #postcode
while(my $fh = <$file>) {
if (my @match = $fh =~ /$re/g){;
say for @match;
}
}
#my file.txt
Today is 03.02.2020. Tommorow will be 03.03.2020.

To capture the results of a match, put parentheses around the parts you want to keep:
my $re = qr/(\d{1,2})\/(\d{1,2})\/(\d{1,4})/i; #mm/dd/yyyy
Also, your sample file.txt separates the components with dots, not slashes, so you need to change either the file or the regular expression.
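Once the parts are captured, the substitution can reorder them. Here is a minimal sketch, assuming the input really uses mm/dd/yyyy with slashes as the question describes. The helper name mdy_to_dmy is made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: rewrite every mm/dd/yyyy date in a string as dd-mm-yyyy.
sub mdy_to_dmy {
    my ($text) = @_;
    # $1 = month, $2 = day, $3 = year; the replacement swaps the first two.
    $text =~ s{\b(\d{1,2})/(\d{1,2})/(\d{4})\b}{$2-$1-$3}g;
    return $text;
}

say mdy_to_dmy('Report dated 12/31/1999, revised 01/15/2000.');
# prints: Report dated 31-12-1999, revised 15-01-2000.
```

Using s{...}{...} as the delimiters avoids having to escape the slashes inside the pattern.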

If the goal is to change the date format in the file, this can be achieved with the following one-liner:
perl -0777 -pe "s/\b(\d{2})\.(\d{2})\.(\d{2,4})\b/$1-$2-$3/g" -i file.txt
Options:
-0777 reads the whole file at once
-pe executes the script in "..." and prints the result
-i makes the replacement in place (the original file is overwritten with the result)
Note:
On MS Windows the script must be wrapped in double quotes ("perl script"); on UNIX systems wrap it in single quotes ('perl script') so that the shell does not expand $1, $2 and $3.
\b can be omitted if the dates are not required to start and end at a word boundary.
Attention:
Please make a copy of the original file in case you make a mistake.
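The same in-place edit can also be written as a script that keeps a backup automatically. This is only a sketch: the helper name convert_in_place and the .bak suffix are choices made here for illustration, and it processes the file line by line rather than slurping it as -0777 does.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite dd.mm.yyyy as dd-mm-yyyy inside the named file,
# keeping the original content in "<name>.bak".
sub convert_in_place {
    my ($path) = @_;
    # Setting $^I has the same effect as -i.bak on the command line;
    # <> then reads the files listed in @ARGV.
    local ($^I, @ARGV) = ('.bak', $path);
    while (my $line = <>) {
        $line =~ s/\b(\d{2})\.(\d{2})\.(\d{2,4})\b/$1-$2-$3/g;
        print $line;    # while $^I is active, print writes to the rewritten file
    }
    return;
}

# Process every file named on the command line.
my @paths = @ARGV;
convert_in_place($_) for @paths;
```

Run it as perl convert.pl file.txt (the script name is arbitrary); the untouched original remains available as file.txt.bak.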

Goal:
Replace every date of the form 'dd.mm.yyyy' in the file with 'dd-mm-yyyy'.
Errors:
The regex in OP's code is tuned to the date format 'dd/mm/yyyy', which does not match the 'dd.mm.yyyy' data in the input file.
Procedure:
Open the file, read it line by line, replace every occurrence of a 'dd.mm.yyyy' date in each line with 'dd-mm-yyyy', print the result to the console, and close the file.
Note:
OP's code prints the matches one per line to the console, which is definitely not what was intended [corrected]. To substitute in place in the input file, modify the code or use the one-liner provided.
use strict;
use warnings;
use feature 'say';
# Uncomment below to read from a file
# my $filename = 'file.txt';
# open my $file, '<', $filename
# or die "Error\n";
# NOTE:
# The provided file has dates in the format
# 03.02.2020
# but the question refers to
# 03/02/2020
# The code is adjusted accordingly; the substitution applies
# only to dates matching the regex in each line (no need for 'if')
while(<DATA>) { # replace DATA with $file to read from a file
chomp; # strip the end-of-line
s/(\d{1,2})\.(\d{1,2})\.(\d{1,4})/$1-$2-$3/g; # substitute all date occurrences in the line
say;
}
__DATA__
#my file.txt
Today is 03.02.2020. Tommorow will be 03.03.2020.
Output
#my file.txt
Today is 03-02-2020. Tommorow will be 03-03-2020.

Related

Remove mysterious line breaks in CSV file using Perl

I have a CSV file that I'm parsing using Perl. The file is a BOM produced by Solidworks 2015 that was saved as an XLS file, then opened in Excel and saved as a CSV file.
There are cells that have line breaks. When I read a line with such a cell from the file, the line comes in with the line breaks. For example, one of the lines read looks like this:
74,,74,1,1,"SJ-TL303202-DET-074-
001",PDSI,"2.25"" DIA. X 8.00""",A2,513,1,
It reads in as a single line in Perl.
When I turn on Show All Characters in Notepad++, I can see the line breaks are caused by [CR][LF].
So I thought this would work to remove the line feeds:
$line =~ s/[\r\n]+//g;
but it does not.
You don't give much of a sample of your CSV data, but what you show is perfectly valid. A text field may contain newlines if you wish, as long as it is enclosed in double quotes.
The Text::CSV module will process it quite happily as long as you enable the binary option in the constructor call, and you may reformat the data as you wish before you write it back out again.
This program expects the path to the input file as a parameter on the command line, and it will write the modified data to STDOUT, which you can redirect on the command line, like this:
$ perl fix_csv.pl input.csv > output.csv
I've assumed that your data contains only 7-bit ASCII data, and it should work whether you're running it on a Windows system or on Linux.
use strict;
use warnings 'all';
my ($csv_file) = @ARGV;
use Text::CSV;
open my $fh, '<', $csv_file or die qq{Unable to open "$csv_file" for input: $!};
my $csv = Text::CSV->new( { binary => 1 } );
while ( my $row = $csv->getline( $fh ) ) {
tr/\r\n//d for @$row;
$csv->combine(@$row);
print $csv->string, "\n";
}
output
74,,74,1,1,SJ-TL303202-DET-074-001,PDSI,"2.25"" DIA. X 8.00""",A2,513,1,

How to read the contents of a file

I am confused about how to read the contents of a text file. I'm able to read the file's name but can't figure out how to get the contents. By the way, the file is encrypted, which is why I'm trying to decrypt it.
#!/Strawberry/perl/bin/perl
use v5.14;
sub encode_decode {
shift =~ tr/A-Za-z/Z-ZA-Yz-za-y/r;
}
my ($file1) = @ARGV;
open my $fh1, '<', $file1;
while (<$fh1>) {
my $enc = encode_decode($file1);
print my $dec = encode_decode($enc);
# ... do something with $_ from $file1 ...
}
close $fh1;
This line
my $enc = encode_decode($file1)
passes the name of the file to encode_decode.
A loop like while ( <$fh1> ) { ... } puts each line from the file into the default variable $_. You've written so yourself in your comment “do something with $_ from $file1 ...”. You probably want
my $enc = encode_decode($_)
And, by the way, your encode_decode subroutine won't reverse its own encoding. You've written what is effectively a ROT25 encoding, so you would have to apply encode_decode 26 times to get back to the original string.
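The ROT25 behaviour is easy to check. The sketch below reuses the question's translation table (written as ZA-Yza-y, which is the same mapping) and compares it with ROT13, which is its own inverse. The rot13 subroutine is an illustration added here, not something from the question.

```perl
use v5.14;
use warnings;

# Same mapping as the question's subroutine: each letter moves back by one (ROT25).
sub encode_decode {
    my ($s) = @_;
    return $s =~ tr/A-Za-z/ZA-Yza-y/r;
}

# ROT13 maps each letter 13 places on, so applying it twice restores the input.
sub rot13 {
    my ($s) = @_;
    return $s =~ tr/A-Za-z/N-ZA-Mn-za-m/r;
}

my $text = 'Hello, World';

say encode_decode($text);                    # Gdkkn, Vnqkc
say encode_decode(encode_decode($text));     # Fcjjm, Umpjb (not the original)

my $cycled = $text;
$cycled = encode_decode($cycled) for 1 .. 26;
say $cycled;                                 # Hello, World

say rot13(rot13($text));                     # Hello, World
```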
It's also worth noting that your shebang line
#!/Strawberry/perl/bin/perl
is pointless on Windows because the command shell doesn't process shebang lines. Perl itself will check the line for options like -w or -i, but you shouldn't be using those anyway. Just omit the line, or if you want to be able to run your program on Linux as well as Windows, use
#!/usr/bin/env perl
which makes env search the PATH variable for the first perl executable.

Why doesn't chomp() work in this case?

I'm trying to use chomp() to remove all the newline characters from a file. Here's the code:
use strict;
use warnings;
open (INPUT, 'input.txt') or die "Couldn't open file, $!";
my @emails = <INPUT>;
close INPUT;
chomp(@emails);
my $test;
foreach(@emails)
{
$test = $test.$_;
}
print $test;
and the test content for the input.txt file is simple:
hello.com
hello2.com
hello3.com
hello4.com
My expected output is something like this: hello.comhello2.comhello3.comhello4.com
However, I'm still getting the same content as the input file. Any help, please?
Thank you
If the input file was generated on a different platform (one that uses a different EOL sequence), chomp might not strip off all the newline characters. For example, if you created the text file in Windows (which uses \r\n) and ran the script on Mac or Linux, only the \n would get chomp()ed and the output would still "look" like it had newlines.
If you know what the EOL sequence of the input is, you can set $/ before chomp(). Otherwise, you may need to do something like
my @emails = map { s/[\n\r]+$//g; $_ } <INPUT>;
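To see the effect without needing a file, the sketch below uses in-memory strings that end in \r\n, as lines from a Windows-made file would when read on Linux. It assumes $/ still has its default value "\n", and strip_eol is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: remove any run of trailing CR/LF characters from each line.
sub strip_eol {
    my @lines = @_;
    s/[\r\n]+\z// for @lines;
    return @lines;
}

# Lines as they would arrive from a file with Windows line endings.
my @raw = ("hello.com\r\n", "hello2.com\r\n");

# chomp removes only the trailing "\n" (the value of $/), so "\r" stays behind.
my @chomped = @raw;
chomp @chomped;
print "chomp left a carriage return behind\n" if $chomped[0] =~ /\r\z/;

print join('', strip_eol(@raw)), "\n";    # hello.comhello2.com
```

The stray "\r" sends the cursor back to the start of the line on a terminal, which is why the output can look wrong even though the "\n" is gone.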

Saving Data that's Been Run Through ActivePerl

This must be a basic question, but I can't find a satisfactory answer to it. I have a script here that is meant to convert CSV-formatted data to TSV. I've never used Perl before, and I need to know how to save the data that is printed after the Perl script has run it through.
Script below:
#!/usr/bin/perl
use warnings;
use strict;
my $filename = 'data.csv';
open FILE, $filename or die "can't open $filename: $!";
while (<FILE>) {
s/"//g;
s/,/\t/g;
s/Begin\.Time\.\.s\./Begin Time (s)/;
s/End\.Time\.\.s\./End Time (s)/;
s/Low\.Freq\.\.Hz\./Low Freq (Hz)/;
s/High\.Freq\.\.Hz\./High Freq (Hz)/;
s/Begin\.File/Begin File/;
s/File\.Offset\.\.s\./File Offset (s)/;
s/Random.Number/Random Number/;
s/Random.Percent/Random Percent/;
print;
}
All the data that's been analyzed is in the cmd prompt. How do I save this data?
edit:
thank you everyone! It worked perfectly!
From your cmd prompt:
perl yourscript.pl > C:\result.txt
Here you run the Perl script and redirect its output to a file called result.txt.
It's always potentially dangerous to treat all commas in a CSV file as field separators. CSV files can also include commas embedded within the data. Here's an example.
1,"Some data","Some more data"
2,"Another record","A field with, an embedded comma"
In your code, the line s/,/\t/g treats all commas the same, so the embedded comma in the final field will also be converted to a tab. That's probably not what you want.
Here's some code that uses Text::ParseWords to do this correctly.
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;
while (<>) {
my @line = parse_line(',', 0, $_);
$_ = join "\t", @line;
# All your s/.../.../ lines here
print;
}
If you run this, you'll see that the comma in the final field doesn't get updated.
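The difference shows up when the second sample record is run through both approaches. A small sketch follows; the helper names naive_fields and quoted_fields are invented here for the comparison.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper: split on every comma, ignoring quoting.
sub naive_fields {
    my ($line) = @_;
    return split /,/, $line;
}

# Hypothetical helper: split on commas outside quotes; 0 means "drop the quotes".
sub quoted_fields {
    my ($line) = @_;
    return parse_line(',', 0, $line);
}

my $record = '2,"Another record","A field with, an embedded comma"';

my @naive  = naive_fields($record);
my @quoted = quoted_fields($record);

print scalar(@naive),  " fields from a plain split\n";    # 4
print scalar(@quoted), " fields from parse_line\n";       # 3
print join("\t", @quoted), "\n";
```

Note that parse_line also treats single quotes as quoting characters, so data containing apostrophes may need a dedicated CSV parser instead.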

Perl file processing on SHIFT_JIS encoded Japanese files

I have a set of SHIFT_JIS (Japanese) encoded csv files from Windows, which I am trying to process on a Linux server running Perl v5.10.1, using regular expressions to make string replacements.
Here is my requirement:
I want the Perl script's regular expressions to be human-readable (at least to a Japanese person),
i.e. like this:
s/北/0/g;
instead of being littered with hex codes:
s/\x{5317}/0/g;
Right now, I am editing the Perl script in Notepad++ on Windows, and pasting in the string I need to search for from the csv data file onto the Perl script.
I have the following working test script below:
use strict;
use warnings;
use utf8;
open (IN1, "<:encoding(shift_jis)", "${work_dir}/tmp00.csv") or die "Error: tmp00.csv\n";
open (OUT1, "+>:encoding(shift_jis)" , "${work_dir}/tmp01.csv") or die "Error: tmp01.csv\n";
while (<IN1>)
{
print $_ . "\n";
chomp;
s/北/0/g;
s/10:00/9:00/g;
print OUT1 "$_\n";
}
close IN1;
close OUT1;
This successfully replaces 10:00 with 9:00 in the csv file, but the issue is that I was unable to replace 北 (i.e. North) with 0 unless use utf8 is also included at the top.
Questions:
1) In the open documentation, http://perldoc.perl.org/functions/open.html, I didn’t see use utf8 as a requirement, unless it is implicit?
a) If I had use utf8 only, then the first print statement in the loop would print garbage character to my xterm screen.
b) If I had called open with :encoding(shift_jis) only, then the first print statement in the loop would print Japanese character to my xterm screen, but the replacement would not happen. There is no warning that use utf8 was not specified.
c) If I used both a) and b), then this example works.
How does “use utf8” modify the behavior of calling open with :encoding(shift_jis) in this Perl script?
2) I also tried to open the file without any encoding specified, wouldn’t Perl treat the file strings as raw bytes, and be able to perform regular expression match that way if the strings I pasted in the script, is in the same encoding as the text in the original data file? I was able to do file name replacement earlier this way without specifying any encoding whatsoever (please refer to my related post here: Perl Japanese to English filename replacement).
Thanks.
UPDATES 1
Testing a simple localization sample in Perl for filename and file text replacement in Japanese
In Windows XP, copy the 南 character from within a .csv data file to the clipboard, then use it as both the file name (i.e. 南.txt) and the file content (南). In Notepad++, reading the file under encoding UTF-8 shows x93xEC; reading it under SHIFT_JIS displays 南.
Script:
Use the following Perl script south.pl, which will be run on a Linux server with Perl 5.10
#!/usr/bin/perl
use feature qw(say);
use strict;
use warnings;
use utf8;
use Encode qw(decode encode);
my $user_dir="/usr/frank";
my $work_dir = "${user_dir}/test_south";
# forward declare the function prototypes
sub fileProcess;
opendir(DIR, ${work_dir}) or die "Cannot open directory " . ${work_dir};
# readdir OPTION 1 - shift_jis
#my @files = map { Encode::decode("shift_jis", $_); } readdir DIR; # Note filename could not be decoded as shift_jis
#binmode(STDOUT,":encoding(shift_jis)");
# readdir OPTION 2 - utf8
my @files = map { Encode::decode("utf8", $_); } readdir DIR; # Note filename could be decoded as utf8
binmode(STDOUT,":encoding(utf8)"); # setting display to output utf8
say @files;
# pass an array reference of files that will be modified
fileNameTranslate();
fileProcess();
closedir(DIR);
exit;
sub fileNameTranslate
{
foreach (@files)
{
my $original_file = $_;
#print "original_file: " . "$original_file" . "\n";
s/南/south/;
my $new_file = $_;
# print "new_file: " . "$_" . "\n";
if ($new_file ne $original_file)
{
print "Rename " . $original_file . " to \n\t" . $new_file . "\n";
rename("${work_dir}/${original_file}", "${work_dir}/${new_file}") or print "Warning: rename failed because: $!\n";
}
}
}
sub fileProcess
{
# file process OPTION 3, open file as shift_jis, the search and replace would work
# open (IN1, "<:encoding(shift_jis)", "${work_dir}/south.txt") or die "Error: south.txt\n";
# open (OUT1, "+>:encoding(shift_jis)" , "${work_dir}/south1.txt") or die "Error: south1.txt\n";
# file process OPTION 4, open file as utf8, the search and replace would not work
open (IN1, "<:encoding(utf8)", "${work_dir}/south.txt") or die "Error: south.txt\n";
open (OUT1, "+>:encoding(utf8)" , "${work_dir}/south1.txt") or die "Error: south1.txt\n";
while (<IN1>)
{
print $_ . "\n";
chomp;
s/南/south/g;
print OUT1 "$_\n";
}
close IN1;
close OUT1;
}
Result:
(BAD) Uncomment Option 1 and 3, (Comment Option 2 and 4)
Setup: Readdir encoding, SHIFT_JIS; file open encoding SHIFT_JIS
Result: file name replacement failed..
Error: utf8 "\x93" does not map to Unicode at .//south.pl line 68.
\x93
(BAD) Uncomment Option 2 and 4 (Comment Option 1 and 3)
Setup: Readdir encoding, utf8; file open encoding utf8
Result: file name replacement worked, south.txt generated
But south1.txt file content replacement failed , it has the content \x93 ().
Error: "\x{fffd}" does not map to shiftjis at .//south.pl line 25.
... -Ao?= (Bx{fffd}.txt
(GOOD) Uncomment Option 2 and 3, (Comment Option 1 and 4)
Setup: Readdir encoding, utf8; file open encoding SHIFT_JIS
Result: file name replacement worked, south.txt generated
South1.txt file content replacement worked, it has the content south.
Conclusion:
I had to use different encoding scheme for this example to work properly. Readdir utf8, and file processing SHIFT_JIS, as the content of the csv file was SHIFT_JIS encoded.
A good place to start would be to read the documentation for the utf8 module, which says:
The use utf8 pragma tells the Perl parser to allow UTF-8 in the
program text in the current lexical scope (allow UTF-EBCDIC on EBCDIC
based platforms). The no utf8 pragma tells Perl to switch back to
treating the source text as literal bytes in the current lexical
scope.
If you don't have use utf8 in your code, then the Perl compiler assumes that your source code is in your system's native single-byte encoding. And the character '北' will make little sense. Adding the pragma tells Perl that your code includes Unicode characters and everything starts to work.
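The interaction between the two settings can be seen in a small experiment. It is a sketch under two assumptions: the script file itself is saved as UTF-8, and the bytes \x93\xEC are the Shift_JIS form of 南, as reported in the question's update. The helper name contains is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw Shift_JIS bytes for 南, as they sit in the data file.
my $sjis_bytes = "\x93\xEC";
# What a handle opened with "<:encoding(shift_jis)" delivers: one decoded character.
my $decoded = decode('shift_jis', $sjis_bytes);

my ($literal_chars, $literal_bytes);
{
    use utf8;                  # the literal is parsed as a single character
    $literal_chars = '南';
}
{
    no utf8;                   # the literal is the three raw UTF-8 bytes of the source
    $literal_bytes = '南';
}

# Hypothetical helper: does $text contain $pattern as a literal substring?
sub contains {
    my ($text, $pattern) = @_;
    return $text =~ /\Q$pattern\E/ ? 1 : 0;
}

printf "literal under use utf8: length %d\n", length $literal_chars;    # 1
printf "literal under no utf8:  length %d\n", length $literal_bytes;    # 3
printf "decoded text vs. character literal: %d\n", contains($decoded, $literal_chars);       # 1
printf "decoded text vs. byte literal:      %d\n", contains($decoded, $literal_bytes);       # 0
printf "raw SJIS bytes vs. byte literal:    %d\n", contains($sjis_bytes, $literal_bytes);    # 0
```

The last line bears on the second question: matching raw bytes only works when the script is saved in the same encoding as the data, which is not the case for a UTF-8 script and Shift_JIS data.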