I use a select(), sysread(), syswrite() loop to handle socket messages: each message is read with sysread() into $buffer (binary) and later written out with syswrite().
Now I want to change the first two bytes of the message, which hold the length of the whole message. At first I used the following code:
my $msglen=substr($buffer,0,2); # Get the first two bytes
my $declen=hex($msglen);
$declen += 3;
substr($buffer,0,2,$declen); # change the length
However, this doesn't work. If the final value of $declen is 85, the modified $buffer becomes "0x35 0x35 0x00 0x02...". I inserted a number into $buffer but ended up with ASCII digits!
I also tried this way:
my $msglen=substr($buffer,0,2); # Get the first two bytes,binary
$msglen += 0b11; # Or $msglen += 3;
my $msgbody=substr($buffer,2); # Get the rest part of message, binary
$buffer=join("", $msglen, $msgbody);
Sadly, this method also failed. The result looks like "0x33 0x 0x00 0x02...". Why can't two binary scalars be joined into one binary scalar?
Can you help me? Thank you!
my $msglen=substr($buffer,0,2); # Get the first two bytes
my $number = unpack("S",$msglen);
$number += 3;
my $number_bin = pack("S",$number);
substr($buffer,0,2,$number_bin); # change the length
Untested, but I think this is what you are trying to do... convert a string with two bytes representing a short int into an actual int object and then back again.
I have found another workable way -- using vec
vec($buffer, 0, 16) += 3;
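Joining two binary strings works fine in Perl; what goes wrong is the arithmetic. Applying += to the two-byte string treats it as a decimal number, and the result is turned back into a string of ASCII digits. Call unpack to get a number from the bytes, do the arithmetic, and call pack to turn the result back into bytes before joining.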
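To make the byte-versus-digit distinction concrete, here is the same read-modify-write sketched in Python rather than Perl, since the idea does not depend on the language. It assumes the length field is an unsigned 16-bit integer in network (big-endian) byte order, which the question does not state, and bump_length is a made-up name for illustration.

```python
import struct

def bump_length(buffer: bytes, delta: int = 3) -> bytes:
    """Return a copy of buffer whose leading 16-bit length field is increased by delta."""
    # Assumption: the field is unsigned and big-endian ("network order").
    (length,) = struct.unpack_from(">H", buffer, 0)
    # Pack the new number back into two raw bytes, not into ASCII digits.
    return struct.pack(">H", length + delta) + buffer[2:]

message = bytes([0x00, 0x52, 0x00, 0x02])
print(bump_length(message).hex())
```

In Perl's pack/unpack the corresponding explicit templates are "n" (big-endian) and "v" (little-endian); "S" uses the machine's native order, so it only matches the wire format if the two happen to agree.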
Is there a method for converting unique strings to unique integers in PowerShell?
I'm using a PowerShell function as a service bus between two APIs. The first API produces unique codes, e.g. HG44X10999 (varchars), but the second API, which consumes the output of the first, only accepts integers. I only care about keeping them unique.
I have looked at $string.gethashcode() but this produces negative integers and also changes between builds. Get-hash | $string -encoding ASCII obviously outputs varchars too.
Other examples on SO refer to converting a string of numeric characters to an integer, i.e. $string = 123, but I can't find a way of quickly computing an int from a string of alphanumeric characters.
The Fowler-Noll-Vo hash function seems well-suited for your purpose, as it can produce a 32-bit hash output.
Here's a simple implementation in PowerShell (the offset basis and initial prime are taken from the Wikipedia reference table for 32-bit outputs):
function Get-FNVHash {
param(
[string]$InputString
)
# Initial prime and offset chosen for 32-bit output
# See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function
[uint32]$FNVPrime = 16777619
[uint32]$offset = 2166136261
# Convert string to byte array, may want to change based on input collation
$bytes = [System.Text.Encoding]::UTF8.GetBytes($InputString)
# Copy offset as initial hash value
[uint32]$hash = $offset
foreach($octet in $bytes)
{
# Apply XOR, multiply by prime and mod with max output size
$hash = $hash -bxor $octet
$hash = $hash * $FNVPrime % [System.Math]::Pow(2,32)
}
return $hash
}
Now you can repeatably produce distinct integers from the input strings:
PS C:\> Get-FNVHash HG44X10999
1174154724
If the target API only accepts positive signed 32-bit integers, you can change the modulus to [System.Math]::Pow(2,31) (doubling the chance of collisions, to approx. 1 in 4300 for 1000 distinct inputs).
For further insight into this simple approach, see this page on FNV and have a look at this article exploring short string hashing
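For cross-checking results outside PowerShell, the same loop can be sketched in Python. XOR-ing before multiplying is the FNV-1a variant; the constants are the published 32-bit offset basis and prime, and the function name fnv1a_32 and its bits parameter are my own. Python integers are arbitrary precision, so the multiplication is exact before it is masked back to 32 bits.

```python
FNV_OFFSET_BASIS_32 = 2166136261
FNV_PRIME_32 = 16777619

def fnv1a_32(text: str, bits: int = 32) -> int:
    """Hash text with 32-bit FNV-1a, optionally keeping only the low `bits` bits."""
    h = FNV_OFFSET_BASIS_32
    for octet in text.encode("utf-8"):
        h ^= octet
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the running hash in 32 bits
    return h & ((1 << bits) - 1)

print(fnv1a_32("HG44X10999"))
print(fnv1a_32("HG44X10999", bits=31))  # always fits a positive signed 32-bit int
```

A hash like this makes collisions unlikely, not impossible; if uniqueness must be guaranteed, a lookup table that assigns the next free integer to each new code is the safer design.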
I am very new to Perl. Recently I wrote a program to calculate the correlation coefficient between the atoms of two structures. This is a brief summary of my program.
for($i=1;$i<=2500;$i++)
{
for($j=1;$j<=2500;$j++)
{
calculate the correlation (Cij);
print $Cij;
}
}
This program prints all the correlations serially in a single column. But I need to print the correlations in the form of a matrix, something like:
       Atom1  Atom2  Atom3  Atom4
Atom1    0.5   -0.1    0.6    0.8
Atom2    0.1    0.2    0.3   -0.5
Atom3   -0.8    0.9    1.0    0.0
Atom4    0.3    1.0    0.8   -0.8
I don't know how this can be done. Please help me with a solution or suggest how to do it!
Simple issue you're having: you need to print a newline after you finish printing a row. However, while I have your attention, I'll prattle on.
You should store your data in a matrix using references. This way, the way you store your data matches the concept of your data:
my @atoms; # Storing the data in here
my $i = 300;
my $j = 400;
my $value = ...; # Calculating what the value should be at row 300, column 400.
# Any one of these will work. Pick one (no "my" here: only the array itself is declared):
$atoms[$i][$j] = $value; # Looks just like a matrix!
$atoms[$i]->[$j] = $value; # Reminds you this isn't really a matrix.
${$atoms[$i]}[$j] = $value; # Now this just looks ridiculous, but is technically correct.
My preference is the second way. It's just a light reminder that this isn't actually a matrix. Instead it's an array of my rows, and each row points to another array that holds the column data for that particular row. The syntax is still pretty clean although not quite as clean as the first way.
Now, let's get back to your problem:
my @atoms; # I'll store the calculated values here
....
$atoms[$i]->[$j] = ... # calculated value for row $i column $j
....
# And now to print out my matrix
for my $i (0..$#atoms) {
for my $j (0..$#{ $atoms[$i] } ) {
printf "%4.2f ", $atoms[$i]->[$j]; # Notice no "\n".
}
print "\n"; # Print the NL once you finish a row
}
Notice I use for my $i (0..$#atoms). This syntax is cleaner than the C-style three-part for, which is being discouraged. (Python doesn't have it, and I don't know whether it will be supported in Perl 6.) This is very easy to understand: I'm iterating over my array. I also use $#atoms, which is the last index of my @atoms array, one less than the number of rows in my matrix. This way, as my matrix size changes, I don't have to edit my program.
The column index [$j] is a bit trickier. $atoms[$i] is a reference to an array that contains my column data for row $i, and doesn't really represent a row of data directly. (This is why I like $atoms[$i]->[$j] instead of $atoms[$i][$j]. It gives me this subtle reminder.) To get the actual array that contains my column data for row $i, I need to dereference it. Thus, the actual column values for row $i are stored in the array @{$atoms[$i]}.
To get the last index of an array, you replace the @ sigil with $#, so the last index in my array is $#{ $atoms[$i] }.
Oh, another thing because this isn't a true matrix: each row could have a different number of entries. You can't have that with a real matrix. This makes using an Array of Arrays in Perl a bit more powerful, and a bit more dangerous. If you need a consistent number of columns, you have to manually check for that. A true matrix would automatically create the required columns based upon the largest $j value.
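The row-then-newline pattern, sketched in Python for comparison. The labels and the 4x4 values are the sample data from the question; format_matrix and its width parameter are hypothetical names chosen for this illustration.

```python
def format_matrix(labels, rows, width=7):
    """Return the matrix as text: one header line, then one line per row."""
    lines = [" " * width + "".join(f"{name:>{width}}" for name in labels)]
    for name, row in zip(labels, rows):
        cells = "".join(f"{value:>{width}.1f}" for value in row)
        lines.append(f"{name:<{width}}" + cells)  # the line break comes only after a full row
    return "\n".join(lines)

labels = ["Atom1", "Atom2", "Atom3", "Atom4"]
correlations = [
    [0.5, -0.1, 0.6, 0.8],
    [0.1, 0.2, 0.3, -0.5],
    [-0.8, 0.9, 1.0, 0.0],
    [0.3, 1.0, 0.8, -0.8],
]
print(format_matrix(labels, correlations))
```

Storing the values first and formatting afterwards keeps the calculation and the layout separate, which is the same reason the answer above suggests filling an array of arrays before printing.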
Disclaimer: Pseudo Code, you might have to take care of special cases and especially the headers yourself.
for($i=1;$i<=2500;$i++)
{
print "\n"; # linebreak here.
for($j=1;$j<=2500;$j++)
{
calculate the correlation (Cij);
printf "\t%4f",$Cij; # print a tab followed by your float giving it 4
# spaces of room. But no linebreak here.
}
}
This is of course a very crude, quick and dirty solution. But if you save the output into a .csv file, most CSV-capable spreadsheet programs (OpenOffice) should easily be able to read it into a proper table. If the spreadsheet viewer of your choice cannot understand tabs as a delimiter, you could easily add ; or / or whatever it can use into the printf string.
say pack "A*", "asdf"; # Prints "asdf"
say pack "s", 0x41 * 256 + 0x42; # Prints "BA" (0x41 = 'A', 0x42 = 'B')
The first line makes sense: you're taking an ASCII encoded string, packing it into a string as an ASCII string. In the second line, the packed form is "\x42\x41" because of the little endian-ness of short integers on my machine.
However, I can't shake the feeling that somehow, I should be able to treat the packed string from the second line as a number, since that's how (I assume) Perl stores numbers, as a little-endian sequence of bytes. Is there a way to do so without unpacking it? I'm trying to get the correct mental model for the thing that pack() returns.
For instance, in C, I can do this:
#include <stdio.h>
int main(void) {
char c[2];
short * x = c;
c[0] = 0x42;
c[1] = 0x41;
printf("%d\n", *x); // Prints 16706 == 0x41 * 256 + 0x42
return 0;
}
If you're really interested in how Perl stores data internally, I'd recommend PerlGuts Illustrated. But usually, you don't have to care about stuff like that because Perl doesn't give you access to such low-level details. These internals are only important if you're writing XS extensions in C.
If you want to "cast" a two-byte string to a C short, you can use the unpack function like this:
$ perl -le 'print unpack("s", "BA")'
16706
However, I can't shake the feeling that somehow, I should be able to treat the packed string from the second line as a number,
You need to unpack it first.
To be able to use it as a number in C, you need
char* packed = "\x42\x41";
int16_t int16;
memcpy(&int16, packed, sizeof(int16_t));
To be able to use it as a number in Perl, you need
my $packed = "\x42\x41";
my $num = unpack('s', $packed);
which is basically
use Inline C => <<'__EOI__';
SV* unpack_s(SV* sv) {
STRLEN len;
char* buf;
int16_t int16;
SvGETMAGIC(sv);
buf = SvPVbyte(sv, len);
if (len != sizeof(int16_t))
croak("usage");
Copy(buf, &int16, 1, int16_t);
return newSViv(int16);
}
__EOI__
my $packed = "\x42\x41";
my $num = unpack_s($packed);
since that's how (I assume) perl stores numbers, as little-endian sequence of bytes.
Perl stores numbers in one of the following three fields of a scalar:
IV, a signed integer of size perl -V:ivsize (in bytes).
UV, an unsigned integer of size perl -V:uvsize (in bytes). (ivsize=uvsize)
NV, a floating-point number of size perl -V:nvsize (in bytes).
In all cases, native endianness is used.
I'm trying to get the correct mental model for the thing that pack() returns.
pack is used to construct "binary data" for interfacing with external APIs.
I see pack as a serialization function. It takes as input Perl values, and outputs a serialized form. The fact the output serialized form happens to be a Perl bytestring is more of an implementation detail than a core functionality.
As such, all you're really expected to do with the resulting string is feed it to unpack, though the serialized form is convenient for moving data between processes, hosts, planets.
If you want to read it as a number instead, consider using vec (which reads 16-bit and wider values in big-endian order, hence a different result from unpack "s" on a little-endian machine):
say vec "BA", 0, 16; # prints 16961
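The difference between the two results quoted in this thread (16706 from unpack "s", 16961 from vec) is purely byte order. A small Python sketch of the same two-byte string read both ways; struct's "<h" and ">H" are explicit little-endian and big-endian formats, so the outcome does not depend on the machine it runs on.

```python
import struct

packed = b"BA"  # bytes 0x42, 0x41

(little,) = struct.unpack("<h", packed)  # low byte first
(big,) = struct.unpack(">H", packed)     # high byte first

print(little)  # 16706 == 0x41 * 256 + 0x42
print(big)     # 16961 == 0x42 * 256 + 0x41
```

Either way the bytes have to be decoded explicitly; the string itself is never "already a number".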
To take a closer look at the string's internal representation, take a look at Devel::Peek, though you're not going to see anything surprising with a pure ASCII string.
use Devel::Peek;
Dump "BA";
SV = PV(0xb42f80) at 0xb56300
REFCNT = 1
FLAGS = (POK,READONLY,pPOK)
PV = 0xb60cc0 "BA"\0
CUR = 2
LEN = 16
I have an integer value my $reading = 1200;.
I have an array my @DigitField = "000000000";
I want to replace the right-hand 4 elements of the array with $reading's value, and I want to do this programmatically using Perl's length function as shown below.
I've tried.
my @DigitField = "000000000";
my $reading = 1200;
splice @DigitField, length(@DigitField) + 1, length $reading, $reading;
print @DigitField;
but I'm getting
0000000001200
and I want the string to remain nine characters wide.
What are some other ways to replace part of a Perl string array?
I think you are possibly confused: the @ sigil indicates that @DigitField is an array variable. A string is not an array.
I think you want to format the number:
my $reading = 1200;
my $digitfield = sprintf('%09d', $reading);
print $digitfield, "\n";
I added a \n to the end of the print; this adds a newline. Depending on the context of your program, you may or may not want this in the final version.
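If the goal really is to overwrite the right-hand digits of a fixed-width field, that can also be done with string slicing. A Python sketch of both approaches; overlay_right is a made-up helper, and it assumes the reading is non-negative and no wider than the field.

```python
def overlay_right(field: str, reading: int) -> str:
    """Replace the right-hand characters of field with the digits of reading."""
    digits = str(reading)
    if reading < 0 or len(digits) > len(field):
        raise ValueError("reading does not fit in the field")
    return field[: len(field) - len(digits)] + digits

reading = 1200
print(f"{reading:09d}")                     # formatting, as with sprintf('%09d', ...)
print(overlay_right("000000000", reading))  # slicing, keeps the field nine characters wide
```

In Perl, the slicing variant would correspond to four-argument substr with a negative offset, along the lines of substr($field, -length($reading), length($reading), $reading), applied to a plain scalar string rather than an array.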
I'm trying very hard to implement the SHA-256 algorithm. I have problems with the padding of the message. For SHA-256 you have to append a single 1 bit to the end of the message, which I have done so far with $message .= (chr 0x80);
The next step should be to fill the empty space (512-bit block) with 0's.
I calculated it with this formula: l+1+k=448-l and then appended it to the message.
My problem comes now: in the last 64 bits of the block, append the binary representation of the length of the message, and fill the rest with 0's again. Since Perl manages its data types itself, there is no "byte" data type. How can I figure out which value I should append?
Please see also the official specification:
http://csrc.nist.gov/publications/fips/fips180-3/fips180-3_final.pdf
If at all possible, pull something off the shelf. You do not want to roll your own SHA-256 implementation because to get official blessing, you would have to have it certified.
That said, the specification is
5.1.1 SHA-1, SHA-224 and SHA-256
Suppose that the length of the message, M, is l bits. Append the bit 1 to the end of the message, followed by k zero bits, where k is the smallest, non-negative solution to the equation
l + 1 + k ≡ 448 mod 512
Then append the 64-bit block that is equal to the number l expressed using a binary representation. For example, the (8-bit ASCII) message “abc” has length 8 × 3 = 24, so the message is padded with a one bit, then 448 - (24 + 1) = 423 zero bits, and then the message length, to become the 512-bit padded message
                                 423       64
                                .-^-.   .---^---.
01100001  01100010  01100011  1 00…00   00…011000
  “a”       “b”       “c”                   '-v-'
                                            l=24
The length of the padded message should now be a multiple of 512 bits.
You might be tempted to use vec because it allows you to address single bits, but you would have to work around funky addressing.
If bits is 4 or less, the string is broken into bytes, then the bits of each byte are broken into 8/BITS groups. Bits of a byte are numbered in a little-endian-ish way, as in 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. For example, breaking the single input byte chr(0x36) into two groups gives a list (0x6, 0x3); breaking it into 4 groups gives (0x2, 0x1, 0x3, 0x0).
Instead, a pack template of B* specifies
A bit string (descending bit order inside each byte).
and N
An unsigned long (32-bit) in "network" (big-endian) order.
The latter is useful for assembling the message length. Although pack has a Q parameter for quad, the result is in the native order.
Start with a bit of prep work
our($UPPER32BITS,$LOWER32BITS);
BEGIN {
use Config;
die "$0: $^X not configured for 64-bit ints"
unless $Config{use64bitint};
# create non-portable 64-bit masks as constants
no warnings "portable";
*UPPER32BITS = \0xffff_ffff_0000_0000;
*LOWER32BITS = \0x0000_0000_ffff_ffff;
}
Then you can define pad_message as
sub pad_message {
use bytes;
my($msg) = @_;
my $l = bytes::length($msg) * 8;
my $extra = $l % 512; # bits already used in the last block
my $k = (448 - ($extra + 1)) % 512; # smallest non-negative k; the % 512 covers $extra >= 448
# append 1 bit followed by $k zero bits
$msg .= pack "B*", 1 . 0 x $k;
# add big-endian length
$msg .= pack "NN", (($l & $UPPER32BITS) >> 32), ($l & $LOWER32BITS);
die "$0: bad length: ", bytes::length $msg
if (bytes::length($msg) * 8) % 512;
$msg;
}
Say the code prints the padded message with
my $padded = pad_message "abc";
# break into multiple lines for readability
for (unpack("H*", $padded) =~ /(.{64})/g) {
print $_, "\n";
}
Then the output is
6162638000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000018
which matches the specification.
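As a cross-check of the padding rule, here is a compact sketch in Python. It assumes the message is a whole number of bytes (so the 1 bit plus the first seven zero bits are always the single byte 0x80), and the modulo in the computation of k covers messages whose last block already holds 448 bits or more, which need an extra block. The function name mirrors the Perl one but is otherwise my own.

```python
import struct

def pad_message(msg: bytes) -> bytes:
    """Apply the SHA-256 padding rule of section 5.1.1 to a byte string."""
    bit_len = len(msg) * 8
    # Smallest non-negative k with bit_len + 1 + k ≡ 448 (mod 512).
    k = (448 - (bit_len + 1)) % 512
    # 0x80 is the 1 bit plus seven zero bits; the remaining k - 7 zero bits are whole bytes.
    padding = b"\x80" + b"\x00" * ((k - 7) // 8)
    # 64-bit big-endian message length in bits.
    return msg + padding + struct.pack(">Q", bit_len)

padded = pad_message(b"abc")
print(padded.hex())
```

For "abc" this yields the same 64 bytes as the hex dump above; a 56-byte message pads out to 128 bytes because the length field no longer fits in the first block.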
First of all I hope you do this just as an exercise -- there is a Digest module in core that already computes SHA-256 just fine.
Note that $message .= (chr 0x80); appends one byte, not one bit. If you really need bitwise manipulation, take a look at the vec function.
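To get the binary representation of an integer, you should use pack. To get it to 64 bits, do something like
$message .= pack 'Q>', 8 * $length; # $length: byte length of the original message, saved before padding
Note that the specification wants the length in bits and in big-endian order, hence the 8 * and the > modifier (Perl 5.10 or later). The 'Q' format is only available on 64-bit perls; if yours isn't one, simply concatenate four 0-bytes with a 32-bit big-endian value (pack format N).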