In Perl (simple program for newbie needed) , after . or ? or ! the next character should be in upper case [duplicate] - perl

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How can I replace a particular character with its upper-case counterpart?
Consider following string :
String = "this is for test. i'm new to perl! Please help. can u help? i hope so."
In the above string, after . or ? or ! the next character should be in upper case. Also the first letter of the sentence (string) is to be in uppercase. How can I do that?
I'm reading string character by character.
your help will be greatly appreciated.
regards,
Amit

In a basic way, you can try this code
#!/usr/bin/perl
$String = "this is for test. i'm new to perl! Please help. can u help? i hope so.";
$String =~ s/ ((^\w)|(\.\s\w)|(\?\s\w)|(\!\s\w))/\U$1/xg;
print "$String\n";
(^\w) : beginning of the line
(\.\s\w) : After a '.' followed by a space
(\?\s\w) : After a '?' followed by a space
(\!\s\w) : After a '!' followed by a space

Perl has a ucfirst function that does exactly what you want. So all you need to do it to split your string into sections, use ucfirst on each section and then join them together again.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my $string = q[this is for test. i'm new to perl! Please help. can u help? i hope so.];
my #bits = split /([.?!]\s+)/, $string;
$string = join '', map { ucfirst } #bits;
say $string;

Related

Perl - Convert integer to text Char(1,2,3,4,5,6)

I am after some help trying to convert the following log I have to plain text.
This is a URL so there maybe %20 = 'space' and other but the main bit I am trying convert is the char(1,2,3,4,5,6) to text.
Below is an example of what I am trying to convert.
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
What I have tried so far is the following while trying to added into the char(in here) to convert with the chr($2)
perl -pe "s/(char())/chr($2)/ge"
All this has manage to do is remove the char but now I am trying to convert the number to text and remove the commas and brackets.
I maybe way off with how I am doing as I am fairly new to to perl.
perl -pe "s/word to remove/word to change it to/ge"
"s/(char(what goes in here))/chr($2)/ge"
Output try to achieve is
select -x1-Q-,-x2-Q-,-x3-Q-
Or
select%20-x1-Q-,-x2-Q-,-x3-Q-
Thanks for any help
There's too much to do here for a reasonable one-liner. Also, a script is easier to adjust later
use warnings;
use strict;
use feature 'say';
use URI::Escape 'uri_unescape';
my $string = q{select%20}
. q{char(45,120,49,45,81,45),char(45,120,50,45,81,45),}
. q{char(45,120,51,45,81,45)};
my $new_string = uri_unescape($string); # convert %20 and such
my #parts = $new_string =~ /(.*?)(char.*)/;
$parts[1] = join ',', map { chr( (/([0-9]+)/)[0] ) } split /,/, $parts[1];
$new_string = join '', #parts;
say $new_string;
this prints
select -x1-Q-,-x2-Q-,-x3-Q-
Comments
Module URI::Escape is used to convert percent-encoded characters, per RFC 3986
It is unspecified whether anything can follow the part with char(...)s, and what that might be. If there can be more after last char(...) adjust the splitting into #parts, or clarify
In the part with char(...)s only the numbers are needed, what regex in map uses
If you are going to use regex you should read up on it. See
perlretut, a tutorial
perlrequick, a quick-start introduction
perlre, the full account of syntax
perlreref, a quick reference (its See Also section is useful on its own)
Alright, this is going to be a messy "one-liner". Assuming your text is in a variable called $text.
$text =~ s{char\( ( (?: (?:\d+,)* \d+ )? ) \)}{
my #arr = split /,/, $1;
my $temp = join('', map { chr($_) } #arr);
$temp =~ s/^|$/"/g;
$temp
}xeg;
The regular expression matches char(, followed by a comma-separated list of sequences of digits, followed by ). We capture the digits in capture group $1. In the substitution, we split $1 on the comma (since chr only works on one character, not a whole list of them). Then we map chr over each number and concatenate the result into a string. The next line simply puts quotation marks at the start and end of the string (presumably you want the output quoted) and then returns the new string.
Input:
select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)
Output:
select%20"-x1-Q-","-x2-Q-","-x3-Q-"
If you want to replace the % escape sequences as well, I suggest doing that in a separate line. Trying to integrate both substitutions into one statement is going to get very hairy.
This will do as you ask. It performs the decoding in two stages: first the URI-encoding is decoded using chr hex $1, and then each char() function is translated to the string corresponding to the character equivalents of its decimal parameters
use strict;
use warnings 'all';
use feature 'say';
my $s = 'select%20char(45,120,49,45,81,45),char(45,120,50,45,81,45),char(45,120,51,45,81,45)';
$s =~ s/%(\d+)/ chr hex $1 /eg;
$s =~ s{ char \s* \( ( [^()]+ ) \) }{ join '', map chr, $1 =~ /\d+/g }xge;
say $s;
output
select -x1-Q-,-x2-Q-,-x3-Q-

Perl: Replace consecutive spaces in this given scenario?

an excerpt of a big binary file ($data) looks like this:
\n1ax943021C xxx\t2447\t5
\n1ax951605B yyy\t10400\t6
\n1ax919275 G2L zzz\t6845\t6
The first 25 characters contain an article number, filled with spaces. How can I convert all spaces between the article numbers and the next column into a \x09 ? Note the one or more spaces between different parts of the article number.
I tried a workaround, but that overwrites the article number with ".{25}xxx»"
$data =~ s/\n.{25}/\n.{25}xxx/g
Anyone able to help?
Thanks so much!
Gary
You can use unpack for fixed width data:
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Useqq=1;
print Dumper $_ for map join("\t", unpack("A25A*")), <DATA>;
__DATA__
1ax943021C xxx 2447 5
1ax951605B yyy 10400 6
1ax919275 G2L zzz 6845 6
Output:
$VAR1 = "1ax943021C\txxx\t2447\t5";
$VAR1 = "1ax951605B\tyyy\t10400\t6";
$VAR1 = "1ax919275 G2L\tzzz\t6845\t6";
Note that Data::Dumper's Useqq option prints whitecharacters in their escaped form.
Basically what I do here is take each line, unpack it, using 2 strings of space padded text (which removes all excess space), join those strings back together with tab and print them. Note also that this preserves the space inside the last string.
I interpret the question as there being a 25 character wide field that should have its trailing spaces stripped and then delimited by a tab character before the next field. Spaces within the article number should otherwise be preserved (like "1ax919275 G2L").
The following construct should do the trick:
$data =~ s/^(.{25})/{$t=$1;$t=~s! *$!\t!;$t}/emg;
That matches 25 characters from the beginning of each line in the data, then evaluates an expression for each article number by stripping its trailing spaces and appending a tab character.
Have a try with:
$data =~ s/ +/\t/g;
Not sure exactly what you what - this will match the two columns and print them out - with all the original spaces. Let me know the desired output and I will fix it for you...
#!/usr/bin/perl -w
use strict;
my #file = ('\n1ax943021C xxx\t2447\t5', '\n1ax951605B yyy\t10400\t6',
'\n1ax919275 G2L zzz\t6845\t6');
foreach (#file) {
my ($match1, $match2) = ($_ =~ /(\\n.{25})(.*)/);
print "$match1'[insertsomethinghere]'$match2\n";
}
Output:
\n1ax943021C '[insertsomethinghere]'xxx\t2447\t5
\n1ax951605B '[insertsomethinghere]'yyy\t10400\t6
\n1ax919275 G2L '[insertsomethinghere]'zzz\t6845\t6

How can I replace a particular character with its upper-case counterpart?

Consider the following string
String = "this is for test. i'm new to perl! Please help. can u help? i hope so."
In the above string after . or ? or ! the next character should be in upper case. how can I do that?
I'm reading from text file line by line and I need to write modified data to another file.
your help will be greatly appreciated.
you could use a regular expression
try this:
my $s = "...";
$s =~ s/([\.\?!]\s*[a-z])/uc($1)/ge; # of course $1 , thanks to plusplus
the g-flag searches for all matches and the e-flag executes uc to convert the letter to uppercase
Explanation:
with [.\?!] you search for your punctuation marks
\s* is for whitespaces between the marks and the first letter of your next word and
[a-z] matches on a single letter (in this case the first one of the next word
the regular expression mentioned above searches with these patterns for every appearance of a punctuation mark followed by (optional) whitespaces and a letter and replaces it with the result of uc (which converts the match to uppercase).
For example:
my $s = "this is for test. i'm new to perl! Please help. can u help? i hope so.";
$s =~ s/([\.\?!]\s*[a-z])/uc(&1)/ge;
print $s;
will find ". i", "! P", ". c" and "? i" and replaces then, so the printed result is:
this is for test. I'm new to perl! Please help. Can u help? I hope so.
You can use the substitution operator s///:
$string =~ s/([.?!]\s*\S)/ uc($1) /ge;
Here's a split solution:
$str = "this is for test. im new to perl! Please help. can u help? i hope so.";
say join "", map ucfirst, split /([?!.]\s*)/, $str;
If all you are doing is printing to a new file, you don't need to join the string back up. E.g.
while ($line = <$input>) {
print $output map ucfirst, split /([?!.]\s*)/, $line;
}
edit - completely misread the question, thought you were just asking to uppercase the is for some reason, apologies for any confusion!
as the answers so far state, you could look at regular expressions, and the substitution operator (s///). No-one has mentioned the \b (word boundary) character though, which may be useful to find the single is - otherwise you are going to have to keep adding punctuation characters that you find to the character class match (the [ ... ]).
e.g.
my $x = "this is for test. i'm new to perl! Please help. can u help? i hope so. ".
\"i want it to work!\". Dave, Bob, Henry viii and i are friends. foo i bar.";
$x =~ s/\bi\b/I/g; # or could use the capture () and uc($1) in eugene's answer
gives:
# this is for test. I'm new to perl! Please help. can u help? I hope so.
# "I want it to work!". Dave, Bob, Henry viii and I are friends. foo I bar.

How to get rid of control characters in perl.. specifically [gs]?

my code is as follows
my $string = $cells[71];
print $string;
this prints the string but where spaces should be there is a box with 01 10 in it. I opened it in Notepad++ and the box turned into a black GS (which i am assuming is group separator).
I looked online and it said to use:
s/[^[:print:]]+//g
but when i set the string to:
my $string =~s/[^[:print:]]+//g
and I run the program i get:
4294967295
How do i resolve this?
I did what HOBBS said and it worked... thanks :)
Is there anyway I could print an enter where each of these characters are ( the box with 1001)?
When doing a regex match, you need to be careful to write $var =~ /pattern/, not $var = ~ /pattern/. When you use the second one, you're doing /pattern/, which is a regex match against $_, returning a number in scalar context. Then you do ~, which takes the bitwise inverse of that number, then ($var =) you assign that result to $var. Not what you wanted at all.
You have to assign the variable first, then do the substitution:
my $string = $cells[71];
$string =~ s/[^[:print:]]+//g;

Can I use the y operator to do a non-one-to-one transliteration in Perl?

The y operator in Perl does character-by-character transliteration. For example, if we do y/abc/dfg to the string "foobar", we get "foofdr". But what if I want to transliterate "ā" to "ei" and "ä" to "a:" and "ō" to "әu" and "o" to "ɒ".
I tried the following line of code but no luck:(
y/āäōo/(ei)(a:)(әu)ɒ/
Do we hopefully have a workaround for this problem? Or do I have to repeatedly use the s operator and do a lot of cumbersome substitutions?
Thanks in advance for any guidance :)
In this case, create a hash and go from the keys to the strings easily.
use warnings;
use strict;
use utf8;
binmode STDOUT, ":utf8";
my $string = "āäōo";
my %trans = qw/ā ei ä a: ō u o ɒ/;
my $keys = join '', keys %trans;
$string =~ s/([$keys])/$trans{$1}/g;
print "$string\n";
You need to alter this if your keys are more than one character long by sorting the keys in order of decreasing length and joining them using ( | | ) instead of [ ].
It sounds like you're trying to do something similar to Text::Unaccent::PurePerl.