Read string within double quotes in perl - perl

I have a file which contains lines like this:
"pin1" Inpin; "pin2" outpin; "pin3" inoutPin;
some other string "pin4" inpin
I want to store just pin1, pin2, pin3, pin4 (basically words within double quotes). Can someone please help..? Basically read one line at a time and grab the word within double quotes only. I tried to split a line by ";" but it doesn't work since ";" may not be present in all lines.
thanks!

join ', ' $str =~ /"([^"]*)"/g

try this code:
my $string = '"pin1" Inpin; "pin2" outpin; "pin3" inoutPin;`';
my #array;
while ( $string =~ /\"(.+?)\"/ )
{
put (array, $1.),
$string =~ s/\"$1\"//;
}

Related

How do you match \'

I need a regex to match \' <---- literally backslash apostrophe.
my $line = '\'this';
$line =~ s/(\o{134})(\o{047})/\\\\'/g;
$line =~ s/\\'/\\\\'/g;
$line =~ s/[\\][']/\\\\'/g;
printf('%s',$line);
print "\n";
All I get out of this is
'this
When what I want is
\\'this
This occurs whether the string is declared using ' or ". This was a test script for tracking down a file parsing bug. I wanted to confirm that the regex was working as expected.
I don't know if when the backslash apostrophe is parsed by the regex it is not treated as 2 characters, but is instead treated as an escaped apostrophe.
Either way. what is the best way to match \' and print out \\'? I don't want to escape any other back slashes or apostrophes and I can't change the text I am parsing, just the way it is handled and outputted.
s/\\'/\\\\'/g
All three of your patterns match a backslash followed by a quote, the above being the simplest.
Your testing was in vain because your string doesn't contain any backslashes. Both string literals "\'this" (from earlier edit) and '\'this' (from later edit) produce the string 'this.
say "\'this"; # 'this
say '\'this'; # 'this
To produce the string \'this, you could use either of the following string literals (among others):
"\\'this"
'\\\'this'
say "\\'this"; # \'this
say '\\\'this'; # \'this
The answer is, of course
s/[\\][']/\\\\'/g
This will match
\'this
And substitute with this
\\'this
This was the only way I could get it to work.
Perl
Too much "regexing" in your snippet. Try:
my $line = '\'this';
$line =~ s/'/\\\\\'/g;
printf('%s',$line);
print "\n";
# \\'this
or... if you want another mode:
my $line = '\'this';
$line =~ s/'/\\'/g;
printf('%s',$line);
print "\n";
# \'this

Perl Single Quote replacement

I've been struggling for the last days in regards to a character replacement in Perl:
I have a String which is surrounded by single quotes, yet, inside that String, I have a name which contains a single quote, let's say O'Neil. Now, given the fact that my String is surrounded by single quotes, Perl recognizes the single quote in the Name, as being the end of the String.
Surrounding the entire string in double quotes is not an option, since it's build from an URL.
Now, I did some research and didn't find anything, now I'm asking y'all:
I've tried to play around with the following syntax:
$Daten =~ s/\'/\\'/g; which of course doesn't work...
$Daten is the entire string which contains the Name O'Neil*
Now, I want to replace the single quote, with a backslash quote: ' -> \'
Anyone has any ideas?
Best regards,
Ionut Sanda
Perhaps something like following code should comply with your requirements
use strict;
use warnings;
my $debug = 1;
while( my $line = <DATA> ) {
$line =~ s/(.*)'(.+)'(.+)'(.*)/$1'$2\\'$3'$4/g;
print $line if $debug;
}
__DATA__
'USER1:O'NEILL:PATRICK:M:lastname_firstname#company.com'
datax 'USER1:O'NEILL:PATRICK:M:lastname_firstname#company.com' datay
output
'USER1:O\'NEILL:PATRICK:M:lastname_firstname#company.com'
datax 'USER1:O\'NEILL:PATRICK:M:lastname_firstname#company.com' datay
Well, as you do not provide a sample or your code I have to improvise
use strict;
use warnings;
my $debug = 1;
while( my $Daten = <DATA> ) {
$Daten =~ s/(.*)'(.+)'(.+)'(.*)/$1'$2\\'$3'$4/g; # Magic happens here
print $Daten if $debug;
}
__DATA__
'USER1:O'NEILL:PATRICK:M:lastname_firstname#company.com'
datax 'USER1:O'NEILL:PATRICK:M:lastname_firstname#company.com' datay
output
'USER1:O\'NEILL:PATRICK:M:lastname_firstname#company.com'
datax 'USER1:O\'NEILL:PATRICK:M:lastname_firstname#company.com' datay
Otherwise I do not have enough information to understand your problem (no sample of data, no snippet of the code).

how to remove date identifier and special charactar from string

I want to remove date identifier and * from string .
$string = "*102015 Supplied air hood";
$output = "Supplied air hood";
i have used
$string =~ s/[#\%&\""*+]//g;
$string =~ s/^\s+//;
what should i used to get string value = "Supplied air hood";
Thanks in advance
To remove everything from the string up to the first space, you can write
$str =~ s/^\S*\s+//;
Your pattern doesn't contain numbers. It would remove the *, but nothing else. If you want to remove a * followed by six digits and a blank at the beginning of the string, do it like this:
$string =~ s/^\*\d{6} //;
However, if that string always contains a pattern like this, you don't need a regular expression substitution. You can simply take a substring.
my $output = substr $string, 8;
That will assign the content of $string starting from the 9th character
The script below does what you want, assuming that the date always appears at the beginning the line, and that it is follow by exactly one space.
use strict;
use warnings;
while (<DATA>)
{
# skip one or more characters not a space
# then skip exactly one space
# then capture all remaining characters
# and assign them to $s
my ($s) = $_ =~ /[^ ]+ (.*)/;
print $s, "\n";
}
__DATA__
*110115 first date
*110115 second date
*110315 third date
Output is:
first date
second date
third date

removing trailing words end with : from a string using perl

i have question on how to remove specific set of words that end with : in a string using perl.
For instance,
lunch_at_home: start at 1pm.
I want to get only "start at 1 pm"after discarding "lunch_at_home:"
note that lunch_at_home is just an example. It can be any string with any length but it should end with ":"
This should do the job.
my $string = "lunch_at_home: start at 1pm."
$string =~ s/^.*:\s*//;
It will remove all char before : including the :
If you want to remove a specific set of words that are set apart from the data you want:
my $string = 'lunch_at_home: start at 1pm.';
$string =~ s/\b(lunch_at_home|breakfast_at_work):\s*//;
That would leave you with start at 1pm. and you can expand the list as needed.
If you just want to remove any "words" (we'll use the term loosely) that end with a colon:
my $string = 'lunch_at_home: start at 1pm.';
$string =~ s/\b\S+:\s*//;
You'd end up with the same thing in this case.
take
my $string = "lunch_at_home: start at 1pm.";
to remove everything up to the last ":" and the period at the end of the entry as in your question:
$string =~ s/.*: (.*)\./$1/;
to remove everything up to the first ":"
$string =~ s/.*?: (.*)\./$1/;
split on : and discard the first part:
my (undef, $value) = split /:\s*/, $string, 2;
The final argument (2), ensures this works correctly if the trailing string contains a :.
You can use split function to achieve this:
my $string = "lunch_at_home: start at 1pm.";
$string = (split /:\s*/, $string)[1];
print "$string\n";

Perl split() Function Not Handling Pipe Character Saved As A Variable

I'm running into a little trouble with Perl's built-in split function. I'm creating a script that edits the first line of a CSV file which uses a pipe for column delimitation. Below is the first line:
KEY|H1|H2|H3
However, when I run the script, here is the output I receive:
Col1|Col2|Col3|Col4|Col5|Col6|Col7|Col8|Col9|Col10|Col11|Col12|Col13|
I have a feeling that Perl doesn't like the fact that I use a variable to actually do the split, and in this case, the variable is a pipe. When I replace the variable with an actual pipe, it works perfectly as intended. How could I go about splitting the line properly when using pipe delimitation, even when passing in a variable? Also, as a silly caveat, I don't have permissions to install an external module from CPAN, so I have to stick with built-in functions and modules.
For context, here is the necessary part of my script:
our $opt_h;
our $opt_f;
our $opt_d;
# Get user input - filename and delimiter
getopts("f:d:h");
if (defined($opt_h)) {
&print_help;
exit 0;
}
if (!defined($opt_f)) {
$opt_f = &promptUser("Enter the Source file, for example /qa/data/testdata/prod.csv");
}
if (!defined($opt_d)) {
$opt_d = "\|";
}
my $delimiter = "\|";
my $temp_file = $opt_f;
my #temp_file = split(/\./, $temp_file);
$temp_file = $temp_file[0]."_add-headers.".$temp_file[1];
open(source_file, "<", $opt_f) or die "Err opening $opt_f: $!";
open(temp_file, ">", $temp_file) or die "Error opening $temp_file: $!";
my $source_header = <source_file>;
my #source_header_columns = split(/${delimiter}/, $source_header);
chomp(#source_header_columns);
for (my $i=1; $i<=scalar(#source_header_columns); $i++) {
print temp_file "Col$i";
print temp_file "$delimiter";
}
print temp_file "\n";
while (my $line = <source_file>) {
print temp_file "$line";
}
close(source_file);
close(temp_file);
The first argument to split is a compiled regular expression or a regular expression pattern. If you want to split on text |. You'll need to pass a pattern that matches |.
quotemeta creates a pattern from a string that matches that string.
my $delimiter = '|';
my $delimiter_pat = quotemeta($delimiter);
split $delimiter_pat
Alternatively, quotemeta can be accessed as \Q..\E inside double-quoted strings and the like.
my $delimiter = '|';
split /\Q$delimiter\E/
The \E can even be omitted if it's at the end.
my $delimiter = '|';
split /\Q$delimiter/
I mentioned that split also accepts a compiled regular expression.
my $delimiter = '|';
my $delimiter_re = qr/\Q$delimiter/;
split $delimiter_re
If you don't mind hardcoding the regular expression, that's the same as
my $delimiter_re = qr/\|/;
split $delimiter_re
First, the | isn't special inside doublequotes. Setting $delimiter to just "|" and then making sure it is quoted later would work or possibly setting $delimiter to "\\|" would be ok by itself.
Second, the | is special inside regex so you want to quote it there. The safest way to do that is ask perl to quote your code for you. Use the \Q...\E construct within the regex to mark out data you want quoted.
my #source_header_columns = split(/\Q${delimiter}\E/, $source_header);
see: http://perldoc.perl.org/perlre.html
It seems as all you want to do is count the fields in the header, and print the header. Might I suggest something a bit simpler than using split?
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/\w+/"Col" . ++$count/eg;
print "$str\n";
Works with most any delimeter (except alphanumeric and underscore), it also saves the number of fields in $count, in case you need it later.
Here's another version. This one uses the character class brackets instead, to specify "any character but this", which is just another way of defining a delimeter. You can specify delimeter from the command-line. You can use your getopts as well, but I just used a simple shift.
my $d = shift || '[^|]';
if ( $d !~ /^\[/ ) {
$d = '[^' . $d . ']';
}
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/$d+/"Col" . ++$count/eg;
print "$str\n";
By using the brackets, you do not need to worry about escaping metacharacters.
#!/usr/bin/perl
use Data::Dumper;
use strict;
my $delimeter="\\|";
my $string="A|B|C|DD|E";
my #arr=split(/$delimeter/,$string);
print Dumper(#arr)."\n";
output:
$VAR1 = 'A';
$VAR2 = 'B';
$VAR3 = 'C';
$VAR4 = 'DD';
$VAR5 = 'E';
seems you need define delimeter as \\|