Writing a custom base64 encoding function in perl - perl

I'm trying to learn Perl by writing a custom base64 encoding function, but so far I've had no success. What I've come up with is the following, which doesn't work, and I don't have any clue how to proceed.
sub base64($) {
    # Split string into single bits
    my $bitstring = unpack("B*", $_[0]);
    # Pack bits in pieces of six bits at a time
    my @splitsixs = unpack("(A6)*", $bitstring);
    my @enc = ("A".."Z", "a".."z", "0".."9", "+", "/");
    # For each piece of six bits, convert them to integer, and take the corresponding place in @enc.
    my @s = map { $enc[pack("B6", $_)] } @splitsixs;
    join "", @s;
}
Can someone explain to me what I am doing wrong in this conversion? (Please leave aside for now the fact that I'm not considering padding.)

I finally made it! I was erroneously trying to index elements of @enc directly with packed bytes, when I should have converted them to integers first.
You can see this in the lines below.
I copy the entire function, padding included, in the hope that it might be useful to others.
sub base64($) {
    # Split string into single bits
    my $bitstring = unpack("B*", $_[0]);
    # Pack bits in pieces of six bits at a time
    my @sixs = unpack("(A6)*", $bitstring);
    # Compute the amount of zero padding necessary to obtain a 6-aligned bitstring
    my $padding = ((6 - (length $sixs[-1]) % 6) % 6);
    $sixs[-1] = join "", ($sixs[-1], "0" x $padding);
    # Array of mapping from pieces to encodings
    my @enc = ("A".."Z", "a".."z", "0".."9", "+", "/");
    # Unpack bit strings into integers
    @sixs = map { unpack("c", pack("b6", join "", reverse(split "", $_))) } @sixs;
    # For each integer take the corresponding place in @enc.
    my @s = map { $enc[$_] } @sixs;
    # Concatenate string adding necessary padding
    join "", (@s, "=" x ($padding / 2));
}
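As a cross-check, the same steps (bytes to bits, six-bit groups, table lookup, "=" padding) can be sketched in Python. This is only an illustrative sketch in a different language from the Perl above; the names b64_sketch and ALPHABET are made up for this example.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def b64_sketch(data: bytes) -> str:
    # Turn every byte into 8 bits, most significant bit first.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Zero-pad on the right so the length is a multiple of 6.
    pad_bits = (-len(bits)) % 6
    bits += "0" * pad_bits
    # Convert each 6-bit group to an integer, then look up its character.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # 4 padding bits correspond to "==", 2 padding bits to "=".
    return "".join(chars) + "=" * (pad_bits // 2)

print(b64_sketch(b"Ma"))  # TWE=
```

The key point is the same as in the Perl fix: each six-bit group has to become an integer before it is used as an index into the table.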

Related

Convert the position of a character in a string to account for "gaps" (i.e., non alphanumeric characters in the string)

In a nutshell
I have a string that looks something like this ...
---MNTSDSEEDACNERTALVQSESPSLPSYTRQTDPQHGTTEPKRAGHT--------LARGGVAAPRERD
And I have a list of positions and corresponding characters that looks something like this...
position character
10 A
12 N
53 V
54 A
This position/character key doesn't account for hyphen (-) characters in the string. So for example, in the given string the first letter M is in position 1, the N in position 2, the T in position 3, etc. The T preceding the second chunk of hyphens is position 47, and the L after that hyphen chunk is position 48.
I need to convert the list of positions and corresponding characters so that the position accounts for hyphen characters. Something like this...
position character
13 A
15 N
64 V
65 A
I think there should be a simple enough way to do this, but I am fairly new so I am probably missing something obvious, sorry about that! I am doing this as part of bigger script, so if anyone had a way to accomplish this using perl that would be amazing. Thank you so much in advance and please let me know if I can clarify anything or provide more information!
What I tried
At first, I took a substring of characters equal to the position value, counted the number of hyphens in that substring, and added the hyphen count onto the original position. So for the first position/character in my list, take the first 10 characters, and then there are 3 hyphens in that substring, so 10+3 = 13 which gives the correct position. This works for most of my positions, but fails when the original position falls within a bunch of hyphens like for positions 53 and 54.
I also tried grabbing the character by taking out the hyphens and then using the original position value like this...
my @array = ($string =~ /\w/g);
my $character = $array[$position];
which worked great, but then I was having a hard time using this to convert the position to include the hyphens because there are too many matching characters to match the character I grabbed here back to the original string with hyphens and find the position in that (this may have been a dumb thing to try from the start).
The actual character seems not to be relevant. It's enough to count the non-hyphens:
use strict;
use warnings;
use Data::Dumper;
my $s = '---MNTSDSEEDACNERTALVQSESPSLPSYTRQTDPQHGTTEPKRAGHT--------LARGGVAAPRERD';
my @positions = (10,12,53,54);
my @transformed = ();
my $start = 0;
for my $loc (@positions) {
    my $dist = $loc - $start;
    while ($dist) {
        $dist-- if ($s =~ m/[^-]/g);
    }
    my $pos = pos($s);
    push @transformed, $pos;
    $start = $loc;
}
print Dumper \@transformed;
prints:
$VAR1 = [
13,
15,
64,
65
];
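The same idea, counting only non-hyphen characters, can also be expressed as a lookup table from ungapped to gapped positions. The following is a rough Python sketch for comparison, not the answer's method; the function name gapped_positions is hypothetical.

```python
def gapped_positions(aligned, positions, gap="-"):
    # index_of[k] is the 1-based position in the gapped string of the
    # (k+1)-th non-gap character.
    index_of = [i for i, ch in enumerate(aligned, start=1) if ch != gap]
    # Translate each 1-based ungapped position through the table.
    return [index_of[p - 1] for p in positions]

s = "---MNTSDSEEDACNERTALVQSESPSLPSYTRQTDPQHGTTEPKRAGHT--------LARGGVAAPRERD"
print(gapped_positions(s, [10, 12, 53, 54]))  # [13, 15, 64, 65]
```

Building the table once means the positions do not have to be sorted, at the cost of one list the length of the sequence.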

How to format a number into NN.nn style

I am handling a stream of numbers from sensors and want to format them to a 'standard' layout centered on the decimal point, as per the following: 1.00 = 01.00 | 12.9 = 12.90 | 2 = 02.00 | 49.09 = 49.09 etc.
I have tried zfill and round - including combinations but the decimal point moves in everything I have tried so far. The purpose is to fill pre-defined fields for later analysis.
UPDATE
Probably not the most elegant solution, but I came up with this, which works as far as I have been able to test so far:
For padding to the left of decimal point:
def zfl(d, chrs, pad):
    # Pads the provided string with leading 'pad's to suit the specified
    # 'chrs' length.
    # When called, parameters are : d = string, chrs = required length of
    # string and pad = fill characters
    # The formatted string of correct length and added pad characters is
    # returned as string
    frmtd_str = str(d)
    while len(frmtd_str) < chrs:
        # fewer than the required characters
        frmtd_str = pad + frmtd_str
    return frmtd_str
Function for padding to the right of decimal point:
def zfr(d, chrs, pad):
    # Pads the provided string with trailing 'pad's to suit the specified
    # 'chrs' length
    # When called, parameters are : d = string, chrs = required length of
    # string and pad = fill characters
    # The formatted string of correct length and added pad characters is
    # returned as string
    frmtd_str = str(d)
    while len(frmtd_str) < chrs:
        # fewer than the required characters
        frmtd_str = frmtd_str + pad
    return frmtd_str
Example of calling the above functions:
The original data is split into two parts using the decimal point as the separator:
dat_splt = str(Dat[0]).split(".",2)
Then the padding carried out and reconstructed for use:
exampledat = "{}.{}".format(zfl(dat_splt[0],3,'0'), zfr(dat_splt[1],3,'0'))
Notes:
To pad out either side requires the parameters for string, character required and the 'pad' character.
The characters required can be anything (only tested with 1 to 10)
The final returned string can be asymmetrical i.e. nnnnn.nn or n.nnn
The number of characters in each part of the original data is accommodated.
Quite happy with the results from this and it is reusable as common functions. I am sure there are more 'economical/efficient' methods but I haven't found those yet and at least this works, giving nice orderly and stable text string result lists (which is what I was aiming for at this point).
Hope I got the layout correct.. :-)
'{:0>5.2f}'.format(n)
>>> '{:0>5.2f}'.format(1)
'01.00'
>>> '{:0>5.2f}'.format(12.9)
'12.90'
>>> '{:0>5.2f}'.format(49.09)
'49.09'
https://queirozf.com/entries/python-number-formatting-examples#left-padding-with-zeros
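If the field widths need to vary, the format specification can be built from parameters. A minimal sketch, assuming the readings arrive as numbers (or are converted with float() first); fmt_reading, int_digits and frac_digits are illustrative names, not part of any library.

```python
def fmt_reading(value, int_digits=2, frac_digits=2):
    # Total width = integer digits + the decimal point + fractional digits.
    width = int_digits + 1 + frac_digits
    # A "0" directly before the width requests sign-aware zero padding.
    return f"{value:0{width}.{frac_digits}f}"

for reading in (1.00, 12.9, 2, 49.09):
    print(fmt_reading(reading))
```

Note that the "0" flag puts zeros after a minus sign (for example -01.50 with int_digits=3), whereas the '0>' fill form pads to the left of the sign.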

Trouble padding a Perl string array without increasing array length

I have an integer value my $reading = 1200;.
I have an array my @DigitField = "000000000";
I want to replace the right-hand 4 elements of the array with $reading's value, and I want to do this programmatically using Perl's length function as shown below.
I've tried.
my @DigitField = "000000000";
my $reading = 1200;
splice @DigitField, length(@DigitField) + 1, length $reading, $reading;
print @DigitField;
but I'm getting
0000000001200
and I want the string to remain nine characters wide.
What are some other ways to replace part of a Perl string array?
I think you are possibly confused - the @ sigil indicates @DigitField is an array variable. A string is not an array.
I think you want to format the number:
my $reading = 1200;
my $digitfield = sprintf('%09d', $reading);
print $digitfield, "\n";
I added a \n to the end of the print, this adds a newline. Depending on the context of your program, you may or may not want this in the final.

implementation of sha-256 in perl

I'm trying to implement the SHA-256 algorithm and have a problem with padding the message. For SHA-256 you have to append a single 1 bit to the end of the message, which I have done so far with $message .= (chr 0x80);
The next step should be to fill the empty space (512-bit block) with 0's.
I calculated it with this formula: l+1+k=448-l and then appended it to the message.
Now comes my problem: in the last 64-bit block, append the binary representation of the length of the message, and fill the rest with 0's again. Since Perl manages its data types itself, there is no "byte" datatype. How can I figure out which value I should append?
please see also the official specification:
http://csrc.nist.gov/publications/fips/fips180-3/fips180-3_final.pdf
If at all possible, pull something off the shelf. You do not want to roll your own SHA-256 implementation because to get official blessing, you would have to have it certified.
That said, the specification is
5.1.1 SHA-1, SHA-224 and SHA-256
Suppose that the length of the message, M, is l bits. Append the bit 1 to the end of the message, followed by k zero bits, where k is the smallest, non-negative solution to the equation
l + 1 + k ≡ 448 mod 512
Then append the 64-bit block that is equal to the number l expressed using a binary representation. For example, the (8-bit ASCII) message “abc” has length 8 × 3 = 24, so the message is padded with a one bit, then 448 - (24 + 1) = 423 zero bits, and then the message length, to become the 512-bit padded message
                              423      64
                             .-^-. .---^---.
01100001 01100010 01100011 1 00…00 00…011000
  “a”      “b”      “c”                 '-v-'
                                        l=24
The length of the padded message should now be a multiple of 512 bits.
You might be tempted to use vec because it allows you to address single bits, but you would have to work around funky addressing.
If bits is 4 or less, the string is broken into bytes, then the bits of each byte are broken into 8/BITS groups. Bits of a byte are numbered in a little-endian-ish way, as in 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. For example, breaking the single input byte chr(0x36) into two groups gives a list (0x6, 0x3); breaking it into 4 groups gives (0x2, 0x1, 0x3, 0x0).
Instead, a pack template of B* specifies
A bit string (descending bit order inside each byte).
and N
An unsigned long (32-bit) in "network" (big-endian) order.
The latter is useful for assembling the message length. Although pack has a Q parameter for quad, the result is in the native order.
Start with a bit of prep work
our($UPPER32BITS,$LOWER32BITS);
BEGIN {
    use Config;
    die "$0: $^X not configured for 64-bit ints"
        unless $Config{use64bitint};
    # create non-portable 64-bit masks as constants
    no warnings "portable";
    *UPPER32BITS = \0xffff_ffff_0000_0000;
    *LOWER32BITS = \0x0000_0000_ffff_ffff;
}
Then you can define pad_message as
sub pad_message {
    use bytes;
    my($msg) = @_;
    my $l = bytes::length($msg) * 8;
    # smallest $k >= 0 with $l + 1 + $k == 448 (mod 512)
    my $k = (447 - $l) % 512;
    # append 1 bit followed by $k zero bits
    $msg .= pack "B*", 1 . 0 x $k;
    # add big-endian length
    $msg .= pack "NN", (($l & $UPPER32BITS) >> 32), ($l & $LOWER32BITS);
    die "$0: bad length: ", bytes::length $msg
        if (bytes::length($msg) * 8) % 512;
    $msg;
}
Say the code prints the padded message with
my $padded = pad_message "abc";
# break into multiple lines for readability
for (unpack("H*", $padded) =~ /(.{64})/g) {
print $_, "\n";
}
Then the output is
6162638000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000018
which matches the specification.
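For comparison, here is the same padding rule as a short Python sketch. It assumes the message is a whole number of bytes, so the single 1 bit plus the first seven zero bits become the byte 0x80; the function name mirrors the Perl one, but this is only an illustration, not a vetted implementation.

```python
import struct

def pad_message(msg: bytes) -> bytes:
    # Message length l in bits, taken before any padding is added.
    bit_len = len(msg) * 8
    # 0x80 is the 1 bit followed by seven 0 bits.
    padded = msg + b"\x80"
    # Zero bytes until the length is 56 mod 64 bytes (448 mod 512 bits).
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Append l as a 64-bit big-endian unsigned integer.
    padded += struct.pack(">Q", bit_len)
    return padded

print(pad_message(b"abc").hex())
```

The modulo keeps the zero count non-negative even when the message already runs past byte 55 of a block, in which case the padding spills into a second 512-bit block.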
First of all, I hope you are doing this just as an exercise: the core Digest::SHA module already computes SHA-256 just fine.
Note that $message .= (chr 0x80); appends one byte, not one bit. If you really need bitwise manipulation, take a look at the vec function.
To get the binary representation of an integer, you should use pack. To get it to 64 bits, save the message length in bits before you start padding (here in a variable called $bit_length) and do something like
$message .= pack 'Q>', $bit_length;
Note that the 'Q' format is only available on perls built with 64-bit integer support, and the '>' modifier, which forces the big-endian order the specification requires, needs perl 5.10 or later; if yours doesn't qualify, simply concatenate four 0-bytes with a 32-bit value (pack format N, which is big-endian).

How can I modify specific part of a binary scalar in Perl?

I use select(), sysread() and syswrite() to handle socket messages, where messages are sysread() into $buffer (binary) before they are syswritten.
Now I want to change two bytes of the message, which denote the length of the whole message. At first, I used the following code:
my $msglen=substr($buffer,0,2); # Get the first two bytes
my $declen=hex($msglen);
$declen += 3;
substr($buffer,0,2,$declen); # change the length
However, it doesn't work this way. If the final value of $declen is 85, then the modified $buffer will be "0x35 0x35 0x00 0x02...". I inserted a number into $buffer but ended up with ASCII digits!
I also tried this way:
my $msglen=substr($buffer,0,2); # Get the first two bytes,binary
$msglen += 0b11; # Or $msglen += 3;
my $msgbody=substr($buffer,2); # Get the rest part of message, binary
$buffer=join("", $msglen, $msgbody);
Sadly, this method also failed. The result is something like "0x33 0x 0x00 0x02...". I just wonder why two binary scalars can't be joined into one binary scalar?
Can you help me? Thank you!
my $msglen=substr($buffer,0,2); # Get the first two bytes
my $number = unpack("S",$msglen);
$number += 3;
my $number_bin = pack("S",$number);
substr($buffer,0,2,$number_bin); # change the length
Untested, but I think this is what you are trying to do... convert a string with two bytes representing a short int into an actual int object and then back again.
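The unpack, add, pack round trip looks like this in Python's struct module, shown only as a cross-language sketch. It assumes the length field is an unsigned 16-bit big-endian ("network order") integer, which the question does not state; Perl's "S" template uses the machine's native byte order instead. The name bump_length is hypothetical.

```python
import struct

def bump_length(buffer: bytearray, delta: int = 3) -> None:
    # Read the first two bytes as one unsigned 16-bit big-endian integer.
    (length,) = struct.unpack_from(">H", buffer, 0)
    # Write the new value back as two raw bytes, in place.
    struct.pack_into(">H", buffer, 0, length + delta)

buf = bytearray(b"\x00\x52\x00\x02")
bump_length(buf)
print(buf.hex())  # 00550002
```

The two bytes stay binary throughout because the integer is only ever written back through pack, never through string conversion.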
I have found another workable way -- using vec
vec($buffer, 0, 16) += 3;
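Joining two binary buffers in Perl works directly; the join is not the problem. The arithmetic is: adding to $msglen makes Perl treat the two raw bytes as a decimal number string, and the result is stored as ASCII digits. Call unpack to get an integer, do the arithmetic, and call pack to turn the result back into bytes before joining.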