How can I convert a 48 hex string to bytes using Perl? - perl

I have a hex string (length 48 chars) that I want to convert to raw bytes with the pack function in order to put it in a Win32 vector of bytes.
How I can do this with Perl?

my $bytes = pack "H*", $hex;
See perlpacktut for more information.

The steps are:
Extract pairs of hexadecimal characters from the string.
Convert each pair to a decimal number.
Pack the number as a byte.
For example:
use strict;
use warnings;
my $string = 'AA55FF0102040810204080';
my #hex = ($string =~ /(..)/g);
my #dec = map { hex($_) } #hex;
my #bytes = map { pack('C', $_) } #dec;
Or, expressed more compactly:
use strict;
use warnings;
my $string = 'AA55FF0102040810204080';
my #bytes = map { pack('C', hex($_)) } ($string =~ /(..)/g);

I have the string:
"61 62 63 64 65 67 69 69 6a"
which I want to interpret as hex values, and display those as ASCII chars (those values should reproduce the character string "abcdefghij").
Typically, I try to write something quick like this:
$ echo "61 62 63 64 65 67 69 69 6a" | perl -ne 'print "$_"; print pack("H2 "x10, $_)."\n";'
61 62 63 64 65 67 69 69 6a
a
... and then I wonder, why do I get only one character back :)
First, let me note down that the string I have, can also be represented as the hex values of bytes that it takes up in memory:
$ echo -n "61 62 63 64 65 67 68 69 6a" | hexdump -C
00000000 36 31 20 36 32 20 36 33 20 36 34 20 36 35 20 36 |61 62 63 64 65 6|
00000010 37 20 36 38 20 36 39 20 36 61 |7 68 69 6a|
0000001a
_(NB: Essentially, I want to "convert" the above byte values in memory as input, to these below ones, if viewed by hexdump:
$ echo -n "abcdefghij" | hexdump -C
00000000 61 62 63 64 65 66 67 68 69 6a |abcdefghij|
0000000a
... which is how the original values for the input hex string were obtained.
)_
Well, this Pack/Unpack Tutorial (AKA How the System Stores Data) turns out is the most helpful for me, as it mentions:
The pack function accepts a template string and a list of values [...]
$rec = pack( "l i Z32 s2", time, $emp_id, $item, $quan, $urgent);
It returns a scalar containing the list of values stored according to the formats specified in the template [...]
$rec would contain the following (first line in decimal, second in hex, third as characters where applicable). Pipe characters indicate field boundaries.
Offset Contents (increasing addresses left to right)
0 160 44 19 62| 41 82 3 0| 98 111 120 101 115 32 111 102
A0 2C 13 3E| 29 52 03 00| 62 6f 78 65 73 20 6f 66
| b o x e s o f
That is, in my case, $_ is a single string variable -- whereas pack expects as input a list of several such 'single' variables (in addition to a formatting template string); and outputs again a 'single' variable (which could, however, be a sizeable chunk of memory!). In my case, if the output 'single' variable contains the ASCII code in each byte in memory, then I'm all set (I could simply print the output variable directly, then).
Thus, in order to get a list of variables from the $_ string, I can simply split it at the space sign - however, note:
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("H2", split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
a
... that amount of elements to be packed must be specified (otherwise again we get only one character back); then, either of these alternatives work:
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("H2"x10, split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
abcdeghij
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("(H2)*", split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
abcdeghij

Related

How to read currency symbol in a perl script

I have a Perl script where we are reading data from a .csv file which is having some different currency symbol . When we are reading that file and write the content I can see it is printing
Get <A3>50 or <80>50 daily
Actual value is
Get £50 or €50 daily
With Dollar sign it is working fine if there is any other currency code it is not working
I tried
open my $in, '<:encoding(UTF-8)', 'input-file-name' or die $!;
open my $out, '>:encoding(latin1)', 'output-file-name' or die $!;
while ( <$in> ) {
print $out $_;
}
$ od -t x1 input-file-name
0000000 47 65 74 20 c2 a3 35 30 20 6f 72 20 e2 82 ac 35
0000020 30 20 64 61 69 6c 79 0a
0000030
od -t x1 output-file-name
0000000 47 65 74 20 a3 35 30 20 6f 72 20 5c 78 7b 32 30
0000020 61 63 7d 35 30 20 64 61 69 6c 79 0a
0000034
but that is also not helping .Output I am getting
Get \xA350 or \x8050 daily
od -t x1 output-file-name
0000000 47 65 74 20 a3 35 30 20 6f 72 20 5c 78 7b 32 30
0000020 61 63 7d 35 30 20 64 61 69 6c 79 0a
0000034
Unicode Code Point
Glyph
UTF-8
Input File
ISO-8859-1
Output File
U+00A3 POUND SIGN
£
C2 A3
C2 A3
A3
A3
U+20AC EURO SIGN
€
E2 82 AC
E2 82 AC
N/A
5C 78 7B 32 30 61 63 7D
("LATIN1" is an alias for "ISO-8859-1".)
There are no problems with the input file.
£ is correctly encoded in your input file.
€ is correctly encoded in your input file.
As for the output file,
£ is correctly encoded in your output file.
€ isn't found in the latin1 charset, so \x{20ac} is used instead.
Your program is working as expected.
You say you see <A3> instead of £. That's probably because the program you are using is expecting a file encoded using UTF-8, but you provided a file encoded using ISO-8859-1.
You also say you see <80> instead of €. But there's no way you'd see that for the file you provided.

Perl printf line breaks

Looking to add a line break after every 16 bytes of the 100 bytes printed from j. All 100 bytes are currently printed on one line at the moment.
Any assistance would be great full, thanks.
for ($j = 6; $j < 106; $j++)
{
printf("%02X ", hex(#bytes[$j]));
}
printf("\t");
Thanks all. I approached it another way. In the end the visual aspect was not a concern as I ended up printing to file anyway. This data then gets pasted in a hex editor so formatting was a forethought rather than a need be. The below is now outputting all the data I was looking for in the end rather than a predetermined length.
for ($j = 6; $j < #bytes; $j = $j + 1)
{
printf outfile ("%02X ", hex(#bytes[$j]));
}
printf("\n");
Use splice() to grab 16 elements at a time from your array.
I've also switched to using say(), join(), map() and sprintf().
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my #bytes = map { int(rand(255)) } 0..200;
# As splice is destructive, take a copy of the array.
# Also, let's remove those first six elements and any
# after index 106.
my #data = #bytes[6 .. 105];
while (my #row = splice #data, 0, 16) {
say join ' ', map { sprintf '%02X', $_ } #row;
}
Update: A few more Perl tips for you.
Always add use strict and use warnings to the top of your Perl programs. And fix the problems they will show you.
#bytes[$j] is better written as $bytes[$j] (as it's a single value). That's one of the things use warnings will tell you about.
for ($j = 6; $j < 106; $j++) is probably better written as for my $j (6 .. 105). Less chance of "off-by-one" errors.
Actually, as you're just using $j to get the element from the array, you probably want to just iterate across the elements of the array directly - for my $elem (#bytes[6 .. 105]) (and then using $elem in place of $bytes[$j]).
You can utilize some $counter and print \n once $counter % 16 equal 0.
use strict;
use warnings;
my #bytes = map { int(rand(255)) } 0..200;
my $counter = 1;
for ( #bytes[6..106] ) {
printf "%02X ", $_;
print "\n" unless $counter++ % 16;
}
print "\n";
Output sample
C9 CA E7 66 13 F5 56 BE 08 68 E4 22 93 77 E0 14
08 4F F3 AD CC F4 66 DE 6C BB 1B E6 CE F3 13 DD
AE 6A CD 9B 5E 98 1F D4 2E C5 80 4B 3E 8E BC BF
5B 27 F9 0D 97 AB 26 C0 11 2D 1D 95 CE 26 3C D8
3C D8 A4 06 0A 48 0D 45 53 28 7E 5D D2 AD 90 5C
03 32 95 48 F6 DB 20 90 A7 62 41 3D D7 AB 7C 3B
CF 3D 0D C2 DA
Note: index from 6 to 106 gives 101 element
Short and fast:
sub hex_dump {
my $i = 0;
my $d = uc unpack 'H*', pack 'C*', #_;
$d =~ s{ ..\K(?!\z) }{ ++$i % 16 ? " " : "\n" }xseg;
return "$d\n";
}
print hex_dump(#bytes[6..105]);
There's also the very clean splice loop approach.
sub hex_dump {
my $d = '';
$d .= join(" ", map sprintf("%02X", $_), splice(#_, 0, 16)) . "\n" while #_;
return $d;
}
print hex_dump(#bytes[6..105]);
Because this approach produces a line at a time, it's great for recreating the traditional hex dump format (with offset and printable characters shown).
$ hexdump -C a.pl
00000000 75 73 65 20 73 74 72 69 63 74 3b 0a 75 73 65 20 |use strict;.use |
00000010 77 61 72 6e 69 6e 67 73 3b 0a 0a 73 75 62 20 68 |warnings;..sub h|
00000020 65 78 5f 64 75 6d 70 20 7b 0a 20 20 20 6d 79 20 |ex_dump {. my |
...
000000a0 74 65 73 20 3d 20 30 2e 2e 32 35 35 3b 0a 0a 70 |tes = 0..255;..p|
000000b0 72 69 6e 74 20 68 65 78 5f 64 75 6d 70 28 40 62 |rint hex_dump(#b|
000000c0 79 74 65 73 5b 36 2e 2e 31 30 35 5d 29 3b 0a |ytes[6..105]);.|
000000cf
You could also use a for loop. You would iterate over the indexes, and print either a space or a line feed depending on the index of the byte you're printing.
sub hex_dump {
my $d = '';
for my $i (0..$#_) {
$d .= sprintf("%02X%s", $_[$i], $i == $#_ || $i % 16 == 15 ? "\n" : " ");
}
return $d;
}
print hex_dump(#bytes[6..105]);

Can someone tell me why I am getting this error, is it because of the spacing (I know quotations matter)? [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 2 years ago.
Improve this question
create table product(
productid int,
description varchar(20)
);

insert into product (
productid,
description )
Values ( 42 , ' tv');
ERROR: column "description" of relation "product" does not exist
As several people pointed out in comments, there are invisible characters (sometimes called "gremlins") in your SQL that make it invalid. Here's a hex dump of the contents (after copying the code from the question, using macOS commands):
$ pbpaste | xxd -g1
00000000: 63 72 65 61 74 65 20 74 61 62 6c 65 20 70 72 6f create table pro
00000010: 64 75 63 74 28 0a 70 72 6f 64 75 63 74 69 64 20 duct(.productid
00000020: 69 6e 74 2c e2 80 a8 0a 64 65 73 63 72 69 70 74 int,....descript
^^ ^^ ^^ ^^^
00000030: 69 6f 6e 20 76 61 72 63 68 61 72 28 32 30 29 0a ion varchar(20).
00000040: 29 3b 0a e2 80 a8 69 6e 73 65 72 74 20 69 6e 74 );....insert int
00000050: 6f 20 70 72 6f 64 75 63 74 20 28 e2 80 a8 70 72 o product (...pr
00000060: 6f 64 75 63 74 69 64 2c e2 80 a8 64 65 73 63 72 oductid,...descr
^^ ^^ ^^ ^^^
00000070: 69 70 74 69 6f 6e 20 29 e2 80 a8 56 61 6c 75 65 iption )...Value
^^ ^^ ^^ ^^^
00000080: 73 20 28 20 34 32 20 2c 20 27 20 74 76 27 29 3b s ( 42 , ' tv');
00000090: 0a 45 52 52 4f 52 3a 20 20 63 6f 6c 75 6d 6e 20 .ERROR: column
000000a0: 22 64 65 73 63 72 69 70 74 69 6f 6e 22 20 6f 66 "description" of
000000b0: 20 72 65 6c 61 74 69 6f 6e 20 22 70 72 6f 64 75 relation "produ
000000c0: 63 74 22 20 64 6f 65 73 20 6e 6f 74 20 65 78 69 ct" does not exi
000000d0: 73 74 st
(Note that xxd represents bytes that don't correspond to printable ASCII characters as "." in the text display on the right. The "."s that correspond to 0a in hex are newline characters.)
The hex codes e2 80 a8 correspond to the UTF-8 encoding of the unicode "line separator" character. I don't know how that character got in there; you'd have to trace back the origin of that code snippet to figure out where they were added.
I'd avoid using TextEdit for source code (and config files, etc) . Instead, I'd recommend using BBEdit or some other code-oriented editor. I think even in BBEdit's free-demo mode it can show (and let you remove) normally-invisible characters by choosing View menu -> Text Display -> Show Invisibles.
You can also remove non-plain-ASCII characters from a text file from the macOS Terminal with:
LC_ALL=C tr -d '\n\t -~' <infile.txt >cleanfile.txt
(Replacing infile.txt and cleanfile.txt with the paths/names of the input file and where you want to store the output.) Warning: do not try to write the cleaned contents back to the original file, that won't work. Also, don't use this to clean anything except plain text files (if the file has any sections that aren't supposed to be text sections, this may mangle those sections). Keep the original file as a backup until you've verified that the "clean" version works right.
You can also "clean" the paste buffer with:
pbpaste | LC_ALL=C tr -d '\n\t -~' | pbcopy
...so just copy the relevant code from your text editor, run that in Terminal, then paste the cleaned version back into the editor.

Convert Hex to ASCII in PowerShell [duplicate]

This question already has answers here:
PowerShell hex to string conversion
(7 answers)
Closed 6 years ago.
I have a series of hex values which looks like this:
68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08
I now need to convert this hex value to ASCII so that the result looks like:
helloWorld|1/0815|ABC-15
I tried so many things but I never came to the final code. I tried to use the convert-function in every imaginable way without any success.
At the moment I use this website to convert, but I need to do this in my PowerShell script.
Much like Phil P.'s approach, but using the -split and -join operators instead (also, integers not needed, ASCII chars will fit into a [byte]):
$hexString = "68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08"
$asciiChars = $hexString -split ' ' |ForEach-Object {[char][byte]"0x$_"}
$asciiString = $asciiChars -join ''
$asciiString
Well, we can do something terrible and treat the HEX as a string... And then convert it to int16, which then can be converted to a char.
$hexString = "68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08"
We have the string, now we can use the spaces to split and get each value separately. These values can be converted to an int16, which is a ascii code representation for the character
$hexString.Split(" ") | forEach {[char]([convert]::toint16($_,16))}
The only problem is that it returns an array of single characters. Which we can iterate through and concatenate into a string
$hexString.Split(" ") | forEach {[char]([convert]::toint16($_,16))} | forEach {$result = $result + $_}
$result

Hex dump parsing in perl

I have a hex dump of a message in a file which i want to get it in an array
so i can perform the decoding logic on it.
I was wondering if that was a easier way to parse a message which looks like this.
37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35
3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15
6C 71 34 34 73 69 6D 31 5F 33 30 33 31 00 00 00
00 00 01 28 40 00 00 15 74 65 6C 63 6F 72 64 69
74 65 6C 63 6F 72 64 69
Note that the data can be max 16 bytes on any row. But any row can contain fewer bytes too (minimum :1 )
Is there a nice and elegant way rather than to read 2 chars at a time in perl ?
Perl has a hex operator that performs the decoding logic for you.
hex EXPR
hex
Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Remember that the default behavior of split chops up a string at whitespace separators, so for example
$ perl -le '$_ = "a b c"; print for split'
a
b
c
For every line of the input, separate it into hex values, convert the values to numbers, and push them onto an array for later processing.
#! /usr/bin/perl
use warnings;
use strict;
my #values;
while (<>) {
push #values => map hex($_), split;
}
# for example
my $sum = 0;
$sum += $_ for #values;
print $sum, "\n";
Sample run:
$ ./sumhex mtanish-input
4196
I would read a line at a time, strip the whitespace, and use pack 'H*' to convert it. It's hard to be more specific without knowing what kind of "decoding logic" you're trying to apply. For example, here's a version that converts each byte to decimal:
while (<>) {
s/\s+//g;
my #bytes = unpack('C*', pack('H*', $_));
print "#bytes\n";
}
Output from your sample file:
55 57 48 53 50 52 53 52 59 50 49 54 57 51 52 53
59 50 49 54 57 51 52 54 0 0 1 8 64 0 0 21
108 113 52 52 115 105 109 49 95 51 48 51 49 0 0 0
0 0 1 40 64 0 0 21 116 101 108 99 111 114 100 105
116 101 108 99 111 114 100 105
I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units.
Is there some reason you think that's ugly?
If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.