Modify the last two characters of a string in Perl - perl

I am looking for a solution to a problem:
I have an NSAP address that is 20 bytes long (40 hexadecimal characters):
39250F800000000000000100011921680030081D
I now have to replace the last two characters of this string with F0 and the final string should look like:
39250F80000000000000010001192168003008F0
My current implementation chops the last two characters and appends F0 to it:
my $nsap = "39250F800000000000000100011921680030081D";
chop($nsap);
chop($nsap);
$nsap = $nsap."F0";
Is there a better way to accomplish this?

You can use substr:
substr ($nsap, -2) = "F0";
or
substr ($nsap, -2, 2, "F0");
Or you can use a simple regex:
$nsap =~ s/..$/F0/;
This is from substr's manpage:
substr EXPR,OFFSET,LENGTH,REPLACEMENT
substr EXPR,OFFSET,LENGTH
substr EXPR,OFFSET
Extracts a substring out of EXPR and returns it.
First character is at offset 0, or whatever you've
set $[ to (but don't do that). If OFFSET is negative
(or more precisely, less than $[), starts
that far from the end of the string. If LENGTH is
omitted, returns everything to the end of the
string. If LENGTH is negative, leaves that many
characters off the end of the string.
Now, the interesting part is that the result of substr can be used as an lvalue, and be assigned:
You can use the substr() function as an lvalue, in
which case EXPR must itself be an lvalue. If you
assign something shorter than LENGTH, the string
will shrink, and if you assign something longer
than LENGTH, the string will grow to accommodate
it. To keep the string the same length you may
need to pad or chop your value using "sprintf".
or you can use the replacement field:
An alternative to using substr() as an lvalue is
to specify the replacement string as the 4th argument.
This allows you to replace parts of the
EXPR and return what was there before in one operation,
just as you can with splice().
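To make the two quoted forms concrete, here is a minimal sketch. The helper name replace_tail is made up for illustration and is not part of Perl. The four-argument call returns the characters it replaced, while the lvalue form simply assigns:

```perl
use strict;
use warnings;

# Hypothetical helper: overwrite the last length($tail) characters of $str.
sub replace_tail {
    my ($str, $tail) = @_;
    my $n = length $tail;
    return $str if $n > length $str;      # too short to patch; leave unchanged
    # 4-argument substr edits $str in place and returns what was there before.
    my $old = substr($str, -$n, $n, $tail);
    return wantarray ? ($str, $old) : $str;
}

my ($new, $old) = replace_tail("39250F800000000000000100011921680030081D", "F0");
print "$new (replaced $old)\n";
# prints: 39250F80000000000000010001192168003008F0 (replaced 1D)

# The lvalue form does the same edit without returning the old text.
my $nsap = "39250F800000000000000100011921680030081D";
substr($nsap, -2) = "F0";
print "$nsap\n";
# prints: 39250F80000000000000010001192168003008F0
```

Because replace_tail works on its own copy of the argument, the caller's string is left alone; drop the copy if you want in-place modification.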

$nsap =~ s/..$/F0/;
replaces the last two characters of a string with F0.

Use the substr( ) function:
substr( $nsap, -2, 2, "F0" );
chop( ) and the related chomp( ) are really intended for removing line ending characters - newlines and so on.
I believe that substr( ) will be quicker than using a regular expression.
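If you would rather measure than take my word for it, the core Benchmark module can compare the approaches. This is only a sketch: the sub names are invented here, and the timings will vary with your Perl version and machine, so no results are shown.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $orig = "39250F800000000000000100011921680030081D";

# Each sub works on a fresh copy so the three variants do equal work.
sub with_substr { my $s = $orig; substr($s, -2, 2, "F0"); return $s }
sub with_regex  { my $s = $orig; $s =~ s/..$/F0/;          return $s }
sub with_chop   { my $s = $orig; chop $s; chop $s;         return $s . "F0" }

# Run each variant a fixed number of times and print a comparison table.
cmpthese(500_000, {
    substr => \&with_substr,
    regex  => \&with_regex,
    chop   => \&with_chop,
});
```

For a one-off replacement the difference is unlikely to matter, so readability is a fair tie-breaker.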

Related

Substitution on string in perl changes string to an integer value

I am trying to delete some characters matching a regex in Perl, and when I do that it returns an integer value.
I have tried substituting multiple spaces in a string with empty string or basically deleting the space.
#! /usr/intel/bin/perl
my $line = "foo/\\bar car";
print "$line\n";
#$line = ~s/(\\|(\s)+)+//; <--Ultimately need this, where backslash and space needs to be deleted. Tried this, returns integer value
$line = ~s/\s+//; # tried this, returns integer value
print "$line\n";
Expected results:
First print: foo/\bar car
Second print: foo/barcar
Actual result:
First print: foo/\\bar car
Second print: 18913234908
The proper solution is
$line =~ s/[\s\\]+//g;
Note:
g flag to substitute all occurrences
no space between = and ~
=~ is a single operator, binding the substitution operator s to the target variable $line.
Inserting a space (as in your code) means s binds to the default target, $_, because there is no explicit target, and then the return value (which is the number of substitutions made) has all its bits inverted (unary ~ is bitwise complement) and is assigned to $line.
In other words,
$line = ~ s/...//
parses as
$line = ~(s/...//)
which is equivalent to
$line = ~($_ =~ s/...//)
If you had enabled use warnings, you would've gotten the following message:
Use of uninitialized value $_ in substitution (s///) at prog.pl line 6.
You've already accepted an answer, but I thought it would be useful to give you a few more details.
As you now know,
$line = ~s/\s+//;
is completely different to:
$line =~ s/\s+//;
You wanted the second, but you typed the first. So what did you end up with?
~ is the "bitwise negation operator". That is, it treats its argument as a binary number and then flips every bit of that number: all the zeros become ones and all the ones become zeros.
So you're asking for the bitwise negation of s/\s+//. Which means the bitwise negation works on the value returned by s/\s+//. And the value returned by a substitution is the number of substitutions made.
We can now work out all of the details.
s/\s+// carries out your substitution (on $_, because nothing is bound to it with =~) and returns the number of substitutions made (an integer).
~s/\s+// returns the bitwise negation of the integer returned by the substitution (which is also an integer).
$line = ~s/\s+// takes that second integer and assigns it to the variable $line.
If the substitution matches, the first step returns 1 (you don't use /g on your s/.../.../, so at most one substitution will be made). It's easy enough to get the bitwise negation of 1.
$ perl -E'say ~1'
18446744073709551614
So that might well be the integer that you're seeing (although it might be different on a 32-bit system).
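Here is a small sketch that puts the two spellings side by side. The variable names $clean and $oops are mine, and $_ is set by hand so that the mistyped version has something to match against:

```perl
use strict;
use warnings;

my $line = "foo/\\bar car";

# Correct: =~ binds the substitution to the copy in $clean.
(my $clean = $line) =~ s/[\s\\]+//g;
print "$clean\n";                 # prints: foo/barcar

# Typo reproduced on purpose: with the space, s/// acts on $_ instead.
local $_ = "a b";                 # give the substitution a target
my $oops = ~ s/\s+//;             # one substitution made -> 1, then ~1
print "$oops\n";                  # a large integer; the value depends on integer width
print "$_\n";                     # prints: ab
```

Note that $line itself is never touched by the mistyped statement; only $_ is modified.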

index argument contains . perl

If the substring contains . meant to stand for any character, index doesn't match on it. How can I make index treat . as any character?
For example,
index($str, $substr)
if $substr contains . anywhere, index will always return -1
thanks
carol
That is not possible. The documentation says:
The index function searches for one string within another, but without
the wildcard-like behavior of a full regular-expression pattern match.
...
Keywords you can use for further searching:
perl regular expression wildcard
Update:
If you just want to know whether your string matches, a regular expression could look like this:
my $string = "Hello World!";
if( $string =~ /ll. Worl/ )
{
print "Ahoi! Position: ".($-[0])."\n";
}
Here the . matches a single character.
$-[0] is the offset into the string of the beginning of the entire
match.
-- http://perldoc.perl.org/perlvar.html
If you want a pattern that matches an arbitrary number of arbitrary characters, you could choose a pattern like...
...
if( $string =~ /ll.*orl/ )
{
...
See perlvar for further information about special Perl variables. You will find the variable @LAST_MATCH_START and some explanation about $-[0] over there. There are several more variables that can help you find submatches and gather other interesting information about your matches...
From perldoc -f index, you can see index() doesn't have any regex syntax:
index STR,SUBSTR
The index function searches for one string within another, but without the wildcard-like behavior of a full regular-
expression pattern match. It returns the position of the first occurrence of SUBSTR in STR at or after POSITION. If
POSITION is omitted, starts searching from the beginning of the string. POSITION before the beginning of the string or after
its end is treated as if it were the beginning or the end, respectively. POSITION and the return value are based at 0 (or
whatever you've set the $[ variable to--but don't do that). If the substring is not found, "index" returns one less than the
base, ordinarily "-1"
A simple test:
$ perl -e 'print index("1234567asdfghj.","j.")'
13
Use regex:
$index = -1;
$index = $-[0] if $str =~ /$substr/; # $-[0] is the offset where the match starts
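Putting the answers together, a regex-based stand-in for index might look like the sketch below. The name regex_index is invented for this example; quotemeta is the built-in that escapes metacharacters when you want the dot to be literal again.

```perl
use strict;
use warnings;

# Hypothetical helper: like index(), but $pattern is treated as a regex,
# so "." matches any character. Returns -1 when nothing matches.
sub regex_index {
    my ($str, $pattern) = @_;
    return $str =~ /$pattern/ ? $-[0] : -1;
}

my $str = "Hello World!";
print index($str, "l. W"), "\n";                    # prints: -1 (no literal "." in $str)
print regex_index($str, "l. W"), "\n";              # prints: 3
print regex_index($str, quotemeta("l. W")), "\n";   # prints: -1 (dot is literal again)
```

Keep in mind that every regex metacharacter in $pattern becomes active this way, not just the dot.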

Read chunks of data in Perl

What is a good way in Perl to split a line into pieces of varying length, when there is no delimiter I can use. My data is organized by column length, so the first variable is in positions 1-4, the second variable is positions 5-15, etc. There are many variables each with different lengths.
Put another way, is there some way to use the split function based on the position in the string, not a matched expression?
Thanks.
Yes there is. The unpack function is well-suited to dealing with fixed-width records.
Example
my $record = "1234ABCDEFGHIJK";
my @fields = unpack 'A4A11', $record; # 1st field is 4 chars long, 2nd is 11
print "@fields"; # Prints '1234 ABCDEFGHIJK'
The first argument is the template, which tells unpack where the fields begin and end. The second argument tells it which string to unpack.
unpack can also be told to ignore character positions in a string by specifying null bytes, x. The template 'A4x2A9' could be used to ignore the "AB" in the example above.
See perldoc -f pack and perldoc perlpacktut for in-depth details and examples.
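As a quick sketch of the skip template mentioned above, plus the difference between the A and a formats (the sample strings are made up for the demonstration):

```perl
use strict;
use warnings;

my $record = "1234ABCDEFGHIJK";

# A4: 4 characters, x2: skip 2 bytes, A9: 9 characters.
my ($id, $rest) = unpack 'A4 x2 A9', $record;
print "$id|$rest\n";                      # prints: 1234|CDEFGHIJK

# "A" strips trailing spaces from each field; "a" keeps them.
my @padded  = unpack 'A4 A6', "ab  cd    ";
my @literal = unpack 'a4 a6', "ab  cd    ";
print map("[$_]", @padded), "\n";         # prints: [ab][cd]
print map("[$_]", @literal), "\n";        # prints: [ab  ][cd    ]
```

Whitespace inside the template is ignored, so you can space the fields out for readability.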
Instead of using split, try the old-school substr method:
my $first = substr($input, 0, 4);
my $second = substr($input, 4, 11); # 0-based offset 4, length 11 (positions 5-15)
# etc...
(I like the unpack method too, but substr is easier to write without consulting the documentation, if you're only parsing out a few fields.)
You could use the substr() function to extract data by offset:
$first = substr($line, 0, 4);
$second = substr($line, 4, 11);
Another option is to use a regular expression:
($first, $second) = ($line =~ /(.{4})(.{11})/);
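If your record description is given as 1-based column ranges, as in the question, it is easy to get the substr arguments off by one. A sketch that does the conversion in one place; the %layout table and the parse_fixed name are assumptions made for this example:

```perl
use strict;
use warnings;

# Hypothetical layout: field name => [first column, last column], 1-based
# and inclusive, as in the question ("positions 1-4", "positions 5-15").
my %layout = (
    first  => [1, 4],
    second => [5, 15],
);

sub parse_fixed {
    my ($line) = @_;
    my %field;
    for my $name (keys %layout) {
        my ($from, $to) = @{ $layout{$name} };
        # substr wants a 0-based offset and a length, not two positions.
        $field{$name} = substr($line, $from - 1, $to - $from + 1);
    }
    return \%field;
}

my $rec = parse_fixed("1234ABCDEFGHIJK");
print "$rec->{first} $rec->{second}\n";   # prints: 1234 ABCDEFGHIJK
```

With many fields, building an unpack template from the same table would likely be faster, but this keeps the arithmetic visible.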

What does Perl's substr do?

My variable $var has the form 'abc.de'. What does this substr exactly do in this statement:
$convar = substr($var,0,index(".",$var));
index() finds one string within another and returns the index or position of that string.
substr() returns the part of a string that starts at a given offset (counting from 0) and runs for a given length.
Looking at the above, I suspect the index method is being used incorrectly (since its definition is index STR, SUBSTR), and it should be
index($var, ".")
to find the '.' within 'abc.de' and determine a substring of "abc.de"
The substr usage implied here is -
substr EXPR,OFFSET,LENGTH
Since the offset is 0, the operation returns the string up to but not including the first '.' position (as returned by the corrected index($var, ".")) into $convar.
Have a look at the substr and index functions in perldoc to clarify matters further.
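A short sketch of the corrected call, next to what the swapped arguments actually do. The helper name before_dot is made up here, and the fallback for strings without a dot is my own choice, not something the question specifies:

```perl
use strict;
use warnings;

# Hypothetical helper: return the part of $var before the first ".",
# or all of $var when it contains no ".".
sub before_dot {
    my ($var) = @_;
    my $pos = index($var, ".");       # string first, then the substring to find
    return $pos < 0 ? $var : substr($var, 0, $pos);
}

print before_dot("abc.de"), "\n";     # prints: abc
print before_dot("abcde"), "\n";      # prints: abcde

# With the arguments swapped, index looks for "abc.de" inside "." and fails.
print index(".", "abc.de"), "\n";     # prints: -1
# A length of -1 then drops only the last character.
print substr("abc.de", 0, -1), "\n";  # prints: abc.d
```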
The Perl substr function has format:
substr [string], [offset], [length]
which returns the string from the index offset to the index offset+length
index has format:
index [str], [substr]
which returns the index of the first occurrence of substr in str.
so substr('abc.de', 0, index(".", $var));
would return the substring starting at index 0 (i.e. 'a') and running up to, but not including, the first occurrence of the string "."
So $convar will have "abc" in the example you have, provided the arguments to index are given in the right order (see edit2)
edit: damn, people are too fast :P
edit2: and Brian is right about index being used incorrectly
Why not run it and find out?
#!/usr/bin/perl
my $var = $ARGV[0];
my $index = index(".",$var);
print "index is $index.\n";
my $convar = substr($var, 0, $index);
print "convar is $convar.\n";
Run that on a bunch of words and see what happens.
Also, you may want to type:
perldoc -f index
perldoc -f substr
Fabulously, you can write data into a substring using substr as the left hand side of an assignment:
$ perl -e '$a="perl sucks!", substr($a,5,5)="kicks ass"; print $a'
perl kicks ass!
You don't even need to stick to the same length - the string will expand to fit.
Technically, this is known as using substr as an lvalue.

How do I get the length of a string in Perl?

What is the Perl equivalent of strlen()?
length($string)
perldoc -f length
length EXPR
length Returns the length in characters of the value of EXPR. If EXPR is
omitted, returns length of $_. Note that this cannot be used on an
entire array or hash to find out how many elements these have. For
that, use "scalar @array" and "scalar keys %hash" respectively.
Note the characters: if the EXPR is in Unicode, you will get the number
of characters, not the number of bytes. To get the length in
bytes, use "do { use bytes; length(EXPR) }", see bytes.
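To see the character/byte distinction in practice, here is a small sketch. Note that it swaps in the core Encode module rather than the "use bytes" approach from the quoted documentation: it encodes the string to UTF-8 explicitly and measures the result. The sample text is made up.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{e9}" is e-acute and "\x{20ac}" is the euro sign: 6 characters in all.
my $text = "caf\x{e9} \x{20ac}";
print length($text), "\n";            # prints: 6

# Encoding to UTF-8 yields a byte string; its length is the byte count.
my $bytes = encode("UTF-8", $text);
print length($bytes), "\n";           # prints: 9
```

The byte count depends on the encoding you pick, which is one reason to make the encoding step explicit.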
Although 'length()' is the correct answer that should be used in any sane code, Abigail's length horror should be mentioned, if only for the sake of Perl lore.
Basically, the trick consists of using the return value of the catch-all transliteration operator:
print "foo" =~ y===c; # prints 3
y///c replaces all characters with themselves (thanks to the complement option 'c'), and returns the number of characters replaced (so, effectively, the length of the string).
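The same return value is what makes tr/// (the other spelling of y///) a handy character counter in ordinary code. A sketch with a made-up sample string:

```perl
use strict;
use warnings;

my $s = "39250F80 0000";

# An empty replacement list with no /d leaves the string unchanged;
# tr/// just returns how many characters matched the search list.
my $total  = ($s =~ tr///c);      # complement of nothing = every character
my $zeros  = ($s =~ tr/0//);      # only the character "0"
my $digits = ($s =~ tr/0-9//);    # any decimal digit

print "$total $zeros $digits\n";  # prints: 13 6 11
```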
length($string)
The length() function:
$string ='String Name';
$size=length($string);
You shouldn't use this, since length($string) is simpler and more readable, but I came across some of these while looking through code and was confused, so in case anyone else does, these also get the length of a string:
my $length = map $_, $str =~ /(.)/gs;
my $length = () = $str =~ /(.)/gs;
my $length = split '', $str;
The first two work by using the global flag to match each character in the string, then using the returned list of matches in scalar context to get the number of characters. The third works similarly, by splitting the string into individual characters instead of regex-matching and using the resulting list in scalar context.
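The "() =" idiom in the second line is worth knowing beyond string lengths, since it counts matches of any pattern. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my $str = "foo bar baz";

# A global match in list context returns every match...
my @words = $str =~ /(\w+)/g;

# ...and assigning a list to () in scalar context yields how many
# elements were on the right-hand side.
my $count = () = $str =~ /(\w+)/g;

print scalar(@words), " $count\n";   # prints: 3 3

# Without the empty list, scalar context makes //g match just once
# and return true or false rather than a count.
my $once = $str =~ /(\w+)/g;
print "$once\n";                     # prints: 1
```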