Adding special character at several places in a string in perl script


I read some raw data from my device. This data contains an IP address, but in a different format. An IP address is usually written as a.b.c.d, but the device gives it to me as abcd. I need to convert it to a.b.c.d. How do I do this in a Perl script?
Regards

First, let us split the hex string into substrings of two characters:
... split /..\K/, "c0a80001";
We treat each fragment as a hex string, and get the numeric value with the hex builtin:
... map hex, ...
Then, we join all numbers with a period:
join '.', ...
Combined:
my $ip = join '.', map hex, split /..\K/, "c0a80001";
print "$ip\n";
Output: 192.168.0.1. This is the usual text representation for an IPv4 address.
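The same steps can be wrapped in a small helper that also rejects malformed input. This is only a sketch: the name hex_to_dotted_quad is made up for illustration, and it assumes the device always sends exactly eight hex digits.

```perl
use strict;
use warnings;

# Hypothetical helper: convert 8 hex digits (e.g. "c0a80001") to dotted-quad text.
# Returns nothing (undef in scalar context) if the input is not exactly 8 hex digits.
sub hex_to_dotted_quad {
    my ($hex) = @_;
    return unless defined $hex && $hex =~ /\A[0-9a-fA-F]{8}\z/;
    # Cut the string into four 2-character pieces and convert each one from hex.
    return join '.', map { hex $_ } unpack '(A2)4', $hex;
}

print hex_to_dotted_quad('c0a80001'), "\n";   # prints 192.168.0.1
```

Validating first means a short or non-hex reading from the device is reported as undef instead of silently producing a wrong address.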

There are many ways. This inserts dots with substring.
map { substr($string,$_,0)='.' } (6,4,2);
Maybe you prefer regexes.
$string =~ s/[0-9a-f]{2}\K(?!\Z)/./g;

It really depends on the approach taken, but in most cases you would need to escape the dots in the IP address with a backslash:
a\.b\.c\.d
Some source code would be nice, by the way.

Related

How to identify a character in a string?

I am trying to write a Powershell code to identify a string with a specific character from a filename from multiple files.
An example of a filename
20190902091031_202401192_50760_54206_6401.pdf
$Variable = $Filename.Substring(15,9)
Results:
202401192 (this is what I am after)
However in some instances the filename will be like below
20190902091031_20240119_50760_54206_6401.pdf
$Variable = $Filename.Substring(15,9)
Results:
20240119_ (this is NOT what I am after)
I am trying to find code that checks the 9th character of that substring:
IF the 9th character = "_"
THEN Set
$Variable = $Filename.Substring(15,8)
Results:
20240119

All credit to TheMadTechnician who beat me to the punch with this answer.
To expand on the technique a bit, use the split method or operator to split a string every time a certain character shows up. Your data is separated by the underscore character, so is a perfect example of using this technique. By using either of the following:
$FileName.Split('_')
$FileName -split '_'
You can turn your long string into an array of shorter strings, each containing one of the parts of your original string. Since you want the 2nd one, you use the array descriptor [1] (0 is 1st) and you're done.
Good luck

Transform data to array with Perl

How do I transform my data to an array with Perl?
Here is my data:
my $data =
"203.174.38.128203.174.38.129203.174.38.1" .
"30203.174.38.131203.174.38.132203.174.38" .
".133203.174.38.134173.174.38.135203.174." .
"38.136203.174.38.137203.174.38.142";
And I want to transform it to be array like this
my @array = (
"203.174.38.128",
"203.174.38.129",
"203.174.38.130",
"203.174.38.131",
"203.174.38.132",
"203.174.38.133",
"203.174.38.134",
"173.174.38.135",
"203.174.38.136",
"203.174.38.137",
"203.174.38.142"
);
Anyone know how to do that with Perl?

If the first part of IP logged is always 203, it's kinda easy:
my @arr = split /(?<=\d)(?=203\.)/, $data;
In the example given it's not, but the first part is always 3-digit, and the second part is always 174, so it's enough to do...
my @arr = split /(?<=\d)(?=\d{3}\.174\.)/, $data;
... to get the correct result.
But please understand that it's close to impossible to give a more generic (and bulletproof) solution here - when these 'marker' parts are... too dynamic. For example, take this string...
11.11.11.22222.11.11.11
The question is, where to split it? Should it be 11.11.11.22; 222.11.11.11? Or 11.11.11.222; 22.11.11.11? Both are quite valid IPs, if you ask me. And it could get even worse, with trying to split '2222' part (can be '2; 222', '22; 22' and even '222; 2').
You can, for example, make a rule: "split each sequence of > 3 digits followed by a dot sign so that the second part of this split would always start from 3 digits":
my @arr = split /(?<=\d)(?=\d{3}\.)/, $data;
... but this will obviously fail to work properly in the ambiguous cases mentioned earlier IF there are IPs with two- or even one-digit first octet in your datastring.
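To make the ambiguity concrete, here is a minimal sketch that applies the "three digits and a dot" rule to the ambiguous string from above. The helper name split_before_three_digit_octet is hypothetical; the regex is the one from this answer.

```perl
use strict;
use warnings;

# Hypothetical helper: split wherever a digit is immediately followed by
# three digits and a dot, i.e. where a 3-digit first octet appears to start.
sub split_before_three_digit_octet {
    my ($s) = @_;
    return split /(?<=\d)(?=\d{3}\.)/, $s;
}

my @parts = split_before_three_digit_octet('11.11.11.22222.11.11.11');
print "$_\n" for @parts;
# prints:
# 11.11.11.22
# 222.11.11.11
```

The rule always picks the reading in which the second address starts with three digits, so 11.11.11.222 followed by 22.11.11.11 can never be produced, even if that was the real data.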

If you write a regex that will match any valid value for one of the numbers in the quartet then you can just search for them all and recombine them in sets of four. This
/25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d/
matches 250-255 or 200-249 or 100-199 or 10-99 or 0-9, and a program to use it is shown below.
There is no way to know which option to take if there is more than one way to split the string, and this solution assigns the longest value to the first of the two ip addresses. For instance, 1.1.1.1234.1.1.1 will split as 1.1.1.123 and 4.1.1.1
use strict;
use warnings;
use feature 'say';
my $data =
"203.174.38.128203.174.38.129203.174.38.1" .
"30203.174.38.131203.174.38.132203.174.38" .
".133203.174.38.134173.174.38.135203.174." .
"38.136203.174.38.137203.174.38.142";
my $byte = qr/25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d/;
my @bytes = $data =~ /($byte)/g;
my @addresses;
push @addresses, join('.', splice(@bytes, 0, 4)) while @bytes;
say for @addresses;
output
203.174.38.128
203.174.38.129
203.174.38.130
203.174.38.131
203.174.38.132
203.174.38.133
203.174.38.134
173.174.38.135
203.174.38.136
203.174.38.137
203.174.38.142

Using your sample, it looks like you have 3 digits for the first and last node. That would prompt using this pattern:
/(\d{3}\.\d{1,3}\.\d{1,3}\.\d{3})/
Add that with a /g switch and it will pull every one.
However, if you have a larger and divergent set of data than what you show for your sample, somebody should have separated the ips before dumping them into this string. If they are separate data points, they should have some separation.
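As a rough illustration, this is how that pattern could be applied to the sample data from the question with the /g switch. It is a sketch that relies on the same assumption as above: the first and last octet of every address have exactly three digits.

```perl
use strict;
use warnings;

my $data =
    "203.174.38.128203.174.38.129203.174.38.1" .
    "30203.174.38.131203.174.38.132203.174.38" .
    ".133203.174.38.134173.174.38.135203.174." .
    "38.136203.174.38.137203.174.38.142";

# In list context, a /g match returns every captured address.
my @ips = $data =~ /(\d{3}\.\d{1,3}\.\d{1,3}\.\d{3})/g;

print scalar(@ips), " addresses found\n";   # prints "11 addresses found"
print "$_\n" for @ips;
```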

Perl CSV without exponential

I am generating CSV, and I want to store numbers without exponential format.
Please give me some suggestion.
I tried:
I used , perfectly,
I tried putting a single quote before the large number. The CSV output was then as expected, but the single quote was shown in front of the number; if I click on that number, it is displayed correctly.
I also tried it with a delimiter, that is, a ' quote before one trailing slash.
So far no luck.

You might try using something like the Math::BigInt module:
use Math::BigInt;
my $num = Math::BigInt->new(2);
$num=$num**128;
print "$num\n";
which will output:
340282366920938463463374607431768211456

How can I create a Unicode character from its bytes when they are stored in different variables in Perl?

I am trying to convert hex representations of Unicode characters to the characters they represent. The following example works fine:
#!/usr/bin/perl
use Encode qw( encode decode );
binmode(STDOUT, ':encoding(utf-8)');
my $encoded = encode('utf8', "\x{e382}\x{af}");
eval { $encoded = decode('utf8', $encoded, Encode::FB_CROAK); 1 }
or print("croaked\n");
print "$encoded\n";
However, the hex digits are stored in three variables.
So if I replace the encode line with this:
my $encoded = encode('utf8', "\x{${byte1}${byte2}}\x{${byte3}}");
where
my $byte1 = "e3"; my $byte2 = "82"; my $byte3 = "af";
It fails as it tries to evaluate the \x immediately and sees the $ sign and { as characters.
Does anyone know how to get around this?

Instead of
my $encoded = encode('utf8', "\x{${byte1}${byte2}}\x{${byte3}}");
You can use
my $encoded = encode('utf8', chr(hex($byte1 . $byte2)) . chr(hex($byte3)));
hex() converts from hexadecimal, and chr() returns the unicode character for a given code point.
[Edit:]
Not related to your question, but I noticed you mix utf-8 and utf8 in your program. I don't know if this is a typo, but you should be aware that these are not the same thing in Perl:
utf-8 (with hyphen, case insensitive) is what the UTF-8 standard says, whereas utf8 (no hyphen, also case insensitive) is Perl's internal encoding, which is more loosely defined (it allows code points that are not valid Unicode code points). In general, you should stick to utf-8 (perlunifaq has the details).

trendel's answer seems pretty good, but Encode::Escape offers an alternative solution:
use Encode::Escape::Unicode;
my $hex = '263a';
my $escaped = "\\x{" . $hex . "}\n";
print encode 'utf8', decode 'unicode-escape', $escaped;

First off, think hard about why you ended up with three variables, $byte1, $byte2, $byte3, each holding one byte's worth of data, as a two-character string, in hex. This part of your program seems hard because of a poor design decision further up. Fix that bad decision, and this part of the code will fall out naturally.
That being said, what you want to do, I think, is this:
my $byte1 = "e3"; my $byte2 = "82"; my $byte3 = "af";
my $str = chr(hex($byte1 . $byte2)) . chr(hex($byte3));
The encoding stuff is a red herring; you shouldn't be worrying about encodings in the middle of your program, only when you do IO.
I'm assuming in the above that you want to get out a two-character string, U+E382 followed by U+AF. That's what you actually asked for. However, since U+E382 is not an assigned character (it sits in the middle of the private use area), that's probably not what you actually wanted. Could you reword the question? Perhaps ask a more basic question, and describe what you are trying to achieve rather than how you are going about it.
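One possible reading, which the question does not confirm, is that the three values are the UTF-8 bytes of a single character. Under that assumption, a sketch could pack the hex digits into raw octets and decode them. The helper name char_from_hex_bytes is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: treat the arguments as hex-encoded UTF-8 bytes
# and return the decoded character string.
sub char_from_hex_bytes {
    my @hex = @_;
    # "e3", "82", "af" becomes the three raw octets 0xE3 0x82 0xAF.
    my $octets = pack 'H*', join '', @hex;
    # Strict UTF-8 decoding; dies if the octets are not valid UTF-8.
    return decode('UTF-8', $octets, Encode::FB_CROAK);
}

my $char = char_from_hex_bytes('e3', '82', 'af');
printf "U+%04X\n", ord $char;   # prints U+30AF
```

With this reading the result is one character, U+30AF (KATAKANA LETTER KU), rather than the two characters produced by chr(hex(...)) on the concatenated digits.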

What's the simplest way of adding one to a binary string in Perl?

I have a variable that contains a 4 byte, network-order IPv4 address (this was created using pack and the integer representation). I have another variable, also a 4 byte network-order, subnet. I'm trying to add them together and add one to get the first IP in the subnet.
To get the ASCII representation, I can do inet_ntoa($ip&$netmask) to get the base address, but it's an error to do inet_ntoa(($ip&$netmask)+1); I get a message like:
Argument "\n\r&\0" isn't numeric in addition (+) at test.pm line 95.
So what's happening, the best as I can tell, is it's looking at the 4 bytes, and seeing that the 4 bytes don't represent a numeric string, and then refusing to add 1.
Another way of putting it: What I want it to do is add 1 to the least significant byte, which I know is the 4th byte? That is, I want to take the string \n\r&\0 and end up with the string \n\r&\1. What's the simplest way of doing that?
Is there a way to do this without having to unpack and re-pack the variable?

What's happening is that you make a byte string with $ip&$netmask and then try to treat it as a number. That is not going to work as such. What you have to feed to inet_ntoa is:
pack("N", unpack("N", $ip&$netmask) + 1)
I don't think there is a simpler way to do it.

Confusing integers and strings. Perhaps the following code will help:
use Socket;
$ip = pack("C4", 192,168,250,66); # why not inet_aton("192.168.250.66")
$netmask = pack("C4", 255,255,255,0);
$ipi = unpack("N", $ip);
$netmaski = unpack("N", $netmask);
$ip1 = pack("N", ($ipi&$netmaski)+1);
print inet_ntoa($ip1), "\n";
Which outputs:
192.168.250.1