A Perl script to delete trailing spaces in a CSV file - perl

I am new to Perl programming.
I have a CSV file with N fields in which the Nth field has trailing spaces in every record. I want to remove all of these trailing spaces. Please help me with this.
I used this substitution in a loop, but it gave me an empty file:
s/\s+$//
Example File
123,ABCD,"AC,BD",21/12/2013
134,CDEF,"CD,BD,ED",23/11/2013
987,TGYH,"HY,-.FDDS",20/11/2013
Output
123,ABCD,"AC,BD",21/12/2013
134,CDEF,"CD,BD,ED",23/11/2013
987,TGYH,"HY,-.FDDS",20/11/2013
Please let me know if you need more details.
Thanks In Advance.

Your regex seems good. You could say:
perl -ple 's/\s+$//' filename
To save the changes in-place to the file, say:
perl -i -ple 's/\s+$//' filename
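For anyone who prefers a script to a one-liner, the same substitution can be wrapped in a small helper. This is only a sketch: the sub name trim_trailing and the sample rows are made up for illustration, and it treats each line as plain text rather than parsing the CSV.

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing whitespace (including the line
# ending) from one line of text and return the cleaned copy.
sub trim_trailing {
    my ($line) = @_;
    $line =~ s/\s+$//;
    return $line;
}

# Sample rows with trailing spaces, standing in for lines read from the file.
my @rows = (
    qq{123,ABCD,"AC,BD",21/12/2013   \n},
    qq{134,CDEF,"CD,BD,ED",23/11/2013 \n},
);

# Print each cleaned row, adding back a single newline.
print trim_trailing($_), "\n" for @rows;
```

Because the helper removes the newline along with the spaces, the caller has to add one back when printing, which is what the -l switch does in the one-liners.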

Steps to remove the leading and trailing spaces from each row of the CSV file:
Open the input file:
open(my $fh, '<', $input_file) or die "Cannot open $input_file: $!";
Loop over every row in the file and use a regex to trim the leading and trailing whitespace of that row. Note that this trims the row as a whole, not each individual field:
while (my $row = <$fh>)
{
$row =~ s/^\s+|\s+$//g; # also strips the trailing newline
print "$row\n";
}
close($fh);
This should give you the results you are expecting.

Related

Concatenate of string in perl

I am trying to concatenate strings in Perl.
e.g.:
my $file = $table_name.".sql";
print $file;
I get output like:
Employee
.sql
(assuming $table_name = Employee)
Please suggest how to make sure the output comes out on a single line without blanks.
Your $table_name variable contains the word 'Employee' followed by a newline character.
You can use chomp to remove the newline.
Add this before you concatenate your variables:
chomp $table_name;
The chomp() function removes the newline character from the end of a string.
See http://perlmeme.org/howtos/perlfunc/chomp_function.html
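A minimal sketch of the fix, assuming the stray newline came from reading the table name from input (the hard-coded "Employee\n" below only simulates that):

```perl
use strict;
use warnings;

# Simulate a value that arrived with a trailing newline, e.g. from <STDIN>.
my $table_name = "Employee\n";

chomp $table_name;                  # removes the trailing newline in place
my $file = $table_name . ".sql";    # concatenation now yields one line

print "$file\n";
```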

Deleting empty fields in pipe delimited printout in perl?

I'm going through each line of a file, looking for a few specific things in each line with regex, and I want to print it so that each row of the output .csv file just contains those things (thing1|thing2|thing3|thing4|) but because it's going through line by line I get things like
||||
then
|||thing4|
then
|thing1||thing3||
and I don't know how to delete the empty pipe delimited areas to shove everything together. Help?
You could filter it afterwards with a regex that collapses each run of consecutive pipes into a single one:
$out =~ s/\|{2,}/|/g;
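An alternative to a regex filter is to split the line on pipes, drop the empty fields, and join what is left. This is a sketch only: the sub name compact_fields is hypothetical, and it assumes no field legitimately contains a pipe or needs to be kept when empty.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the non-empty pipe-delimited fields.
sub compact_fields {
    my ($line) = @_;
    my @fields = grep { length } split /\|/, $line;
    return join('|', @fields);
}

print compact_fields('|thing1||thing3||'), "\n";
print compact_fields('|||thing4|'), "\n";
```

Unlike the substitution, this also removes the leading and trailing pipes, so add them back when printing if the output format needs them.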

perl multiline find and replace, can't get newline working

I have a directory on a Linux host with several property files which I want to edit by replacing hardcoded values with placeholder tags. My goal was to have a perl script which reads a delimited file that contains entries for each of the property files listing the hardcoded value, the placeholder value and the name of the file to edit.
For example, in file.prop I have these values set
<connection targetHostUrl="99.99.99.99"
targetHostPort="9999"
And I want to replace the values with tags as shown below
<connection targetHostUrl="TARGETHOST"
targetHostPort="PORT"
There will be several entries similar to this so I have to match on the unique combination of IP and PORT so I need a multiline match.
To do this I wrote the following script to take the input of the delimited filename, which is delimited with ||. I go get that file from the config directory and read in the values to get the hardcoded value, tag, and filename to edit. Then I read in that property file, do the substitution and then write it out again.
#!/usr/bin/perl
my $config = $ARGV[0];
chomp $config;
my $filename = '/config/' . $config;
my ($hard,$tagg,$prop);
open(DATAFILE, $filename) or die "Could not open DATAFILE $filename.";
while(<DATAFILE>)
{
chomp $_;
($hard,$tagg,$prop) = split('\|\|', $_);
$*=1;
open(INPUT,"</properties/$prop") or die "Could not open INPUT $prop.";
@input_array=<INPUT>;
close(INPUT);
$input_scalar=join("",@input_array);
$input_scalar =~ s/$hard/$tagg/;
open(OUTPUT,">/properties/$prop") or die "Could not open OUTPUT $prop.";
print(OUTPUT $input_scalar);
close(OUTPUT);
}
close DATAFILE;
Inside the config file I have the following entry
<connection targetHostUrl="99.99.999.99"(.|\n)*?targetHostPort="9999"||<connection targetHostUrl="TARGETHOST1"\n targetHostPort="PORT"||file.prop
My output is as shown below. It puts what I hoped to be a newline as a literal \n
<connection targetHostUrl="TARGETHOST"\n targetHostPort="PORT"
I can't find a way to get the \n taken as a newline. At first I thought, no problem, I'll just do a 2nd substitution like
perl -i -pe 's/\\n/\n/o' $prop
and although this works, for some reason it puts ^M characters at the end of every line except the one I did the replacement on. I don't want to do a 3rd replace to strip them out.
I've searched and found other ways of doing the multiline search/replace but they all interpret the \n literally.
Any suggestions?
> My output is as shown below. It puts what I hoped to be a newline as a literal \n
Why would it insert a newline when the string doesn't contain one?
> I can't find a way to get the \n taken as a newline.
There isn't any. If you want to substitute a newline, you need to provide a newline.
If you used a proper CSV parser like Text::CSV_XS, you could put a newline in your data file.
Otherwise, you'll have to write some code to handle the escape sequences you want your code to handle.
> for some reason it puts ^M characters at the end of every line except the one I did the replacement on.
Quite the opposite. It removes it from the one line you did the replacement on.
^M is how some programs represent a Carriage Return. You have a file with CR LF line endings. You could use dos2unix to convert it, or you could leave it as is because XML doesn't care.
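The escape-sequence handling mentioned in this answer could look roughly like the following. It is a sketch under assumptions: the sub name expand_escapes is hypothetical, and it only understands \n, \t and \\ in the replacement string read from the config file.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the two-character sequences \n, \t and \\
# found in a string into the real characters they stand for.
sub expand_escapes {
    my ($text) = @_;
    my %map = ( n => "\n", t => "\t", '\\' => '\\' );
    $text =~ s/\\([nt\\])/$map{$1}/g;
    return $text;
}

# Single quotes keep the backslash and the "n" as two literal characters,
# which is how the value looks after it is read from the config file.
my $tagg = 'targetHostUrl="TARGETHOST"\n targetHostPort="PORT"';
print expand_escapes($tagg), "\n";
```

In the script from the question, the helper would be applied to $tagg after the split and before the substitution.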

how to shift based on a regular expression using perl?

How do I shift the top element from an array based on a regular expression in Perl? These are data records, meaning I have the input record separator ($/) set to
$/='#';
for example, the following text file contains this data record.
#dddddddddd
ccccccccccc
eeeeeeeeeee
fffffffffff
I would like to remove the # sign and keep the text, for example:
dddddddddd
ccccccccccc
eeeeeeeeeee
fffffffffff
If you just want to manipulate a text file, a one-liner seems like the best solution. This will edit the file and keep a backup in "inputfile.txt.bak".
perl -pi.bak -we 's/^#//' inputfile.txt
Or you can do a shell redirection:
perl -wpe 's/^#//' inputfile.txt > outputfile.txt
These will try to alter all the lines in the file. If you just want the first line altered you need something different:
perl -wpe 's/^#// if ($. == 1);' inputfile.txt > outputfile.txt
Don't confuse shift with regex substitution.
shift removes the first element from an array, not the first character of a string.
A regex substitution can deal with removal of the leading '#' sigil.
The first element of the array would be $array[0].
If a regex substitution is applied to this first element, the '#' is removed:
my @array = ( '#dddddddddd', 'ccccccccccc', 'eeeeeeeeeee', 'fffffffffff' );
$array[0] =~ s/^#//;
print $array[0]; # 'dddddddddd'
This does not seem to be related to arrays. It appears you are just dealing with strings.
This removes a leading hash mark for the string $line:
$line =~ s/^\#//;
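Since the question sets $/ to '#', another option is to let chomp remove the separator, because chomp strips whatever $/ currently holds. A sketch on made-up in-memory data (the second record is invented purely to show the record boundary):

```perl
use strict;
use warnings;

# Sample data: two '#'-separated records; the second one is illustrative only.
my $data = "#dddddddddd\nccccccccccc\n#eeeeeeeeeee\nfffffffffff\n";
my @records;

{
    local $/ = '#';                   # records are separated by '#'
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    while (my $record = <$fh>) {
        chomp $record;                # chomp strips the current $/, here '#'
        next unless length $record;   # skip the empty record before the first '#'
        push @records, $record;
    }
    close($fh);
}

print "[$_]" for @records;
```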

Reading a large file in perl, record by record, with a dynamic record separator

I have a script that reads a large file line by line. The record separator ($/) that I would like to use is (\n). The only problem is that the data on each line contains CRLF characters (\r\n), which the program should not treat as the end of a line.
For example, here is a sample data file (with the newlines and CRLFs written out):
line1contents\n
line2contents\n
line3\r\ncontents\n
line4contents\n
If I set $/ = "\n", then it splits the third line into two lines. Ideally, I could just set $/ to a regex that matches \n and not \r\n, but I don't think that's possible. Another possibility is to read in the whole file, then use the split function to split on said regex. The only problem is that the file is too large to load into memory.
Any suggestions?
For this particular task, it sounds pretty straightforward to check your line ending and append the next line as necessary:
$/ = "\n";
...
while(<$input>) {
while( substr($_,-2) eq "\r\n" ) {
my $next = <$input>;
last unless defined $next; # stop at end of file instead of looping forever
$_ .= $next;
}
...
}
This is the same logic used to support line continuation in a number of different programming contexts.
You are right that you can't set $/ to a regular expression.
dos2unix would put a UNIX newline character in for the "\r\n" and so wouldn't really solve the problem. I would use a regex that replaces all instances of "\r\n" with a space or tab character and save the results to a different file (since you don't want to split the line at those points). Then I would run your script on the new file.
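The pre-processing pass described in this answer can be done line by line, so the whole file never has to sit in memory. The sketch below is an assumption-laden illustration: the sub name join_crlf is hypothetical, and it runs on in-memory sample data rather than on real files.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, replacing every
# embedded "\r\n" with a single space so only a bare "\n" ends a record.
sub join_crlf {
    my ($in, $out) = @_;
    local $/ = "\n";
    while (my $line = <$in>) {
        $line =~ s/\r\n\z/ /;   # a line ending in CRLF continues on the next one
        print {$out} $line;
    }
}

# Demo on in-memory data instead of real files.
my $input  = "line1contents\nline3\r\ncontents\nline4contents\n";
my $output = '';
open(my $in,  '<', \$input)  or die "Cannot read sample data: $!";
open(my $out, '>', \$output) or die "Cannot write sample data: $!";
join_crlf($in, $out);
close($in);
close($out);

print $output;
```

With real files, the two open calls would take file names instead of scalar references, and the original script could then read the cleaned copy with $/ set to "\n".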
Try using dos2unix on the file first, and then read in as normal.