Hex dump parsing in perl

Hex dump parsing in perl - perl

I have a hex dump of a message in a file which i want to get it in an array
so i can perform the decoding logic on it.
I was wondering if that was a easier way to parse a message which looks like this.
37 39 30 35 32 34 35 34 3B 32 31 36 39 33 34 35
3B 32 31 36 39 33 34 36 00 00 01 08 40 00 00 15
6C 71 34 34 73 69 6D 31 5F 33 30 33 31 00 00 00
00 00 01 28 40 00 00 15 74 65 6C 63 6F 72 64 69
74 65 6C 63 6F 72 64 69
Note that the data can be max 16 bytes on any row. But any row can contain fewer bytes too (minimum :1 )
Is there a nice and elegant way rather than to read 2 chars at a time in perl ?

Perl has a hex operator that performs the decoding logic for you.
hex EXPR
hex
Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.
print hex '0xAf'; # prints '175'
print hex 'aF'; # same
Remember that the default behavior of split chops up a string at whitespace separators, so for example
$ perl -le '$_ = "a b c"; print for split'
a
b
c
For every line of the input, separate it into hex values, convert the values to numbers, and push them onto an array for later processing.
#! /usr/bin/perl
use warnings;
use strict;
my #values;
while (<>) {
push #values => map hex($_), split;
}
# for example
my $sum = 0;
$sum += $_ for #values;
print $sum, "\n";
Sample run:
$ ./sumhex mtanish-input
4196

I would read a line at a time, strip the whitespace, and use pack 'H*' to convert it. It's hard to be more specific without knowing what kind of "decoding logic" you're trying to apply. For example, here's a version that converts each byte to decimal:
while (<>) {
s/\s+//g;
my #bytes = unpack('C*', pack('H*', $_));
print "#bytes\n";
}
Output from your sample file:
55 57 48 53 50 52 53 52 59 50 49 54 57 51 52 53
59 50 49 54 57 51 52 54 0 0 1 8 64 0 0 21
108 113 52 52 115 105 109 49 95 51 48 51 49 0 0 0
0 0 1 40 64 0 0 21 116 101 108 99 111 114 100 105
116 101 108 99 111 114 100 105

I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units.
Is there some reason you think that's ugly?
If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.

Related

How to read currency symbol in a perl script

I have a Perl script where we are reading data from a .csv file which is having some different currency symbol . When we are reading that file and write the content I can see it is printing
Get <A3>50 or <80>50 daily
Actual value is
Get £50 or €50 daily
With Dollar sign it is working fine if there is any other currency code it is not working
I tried
open my $in, '<:encoding(UTF-8)', 'input-file-name' or die $!;
open my $out, '>:encoding(latin1)', 'output-file-name' or die $!;
while ( <$in> ) {
print $out $_;
}
$ od -t x1 input-file-name
0000000 47 65 74 20 c2 a3 35 30 20 6f 72 20 e2 82 ac 35
0000020 30 20 64 61 69 6c 79 0a
0000030
od -t x1 output-file-name
0000000 47 65 74 20 a3 35 30 20 6f 72 20 5c 78 7b 32 30
0000020 61 63 7d 35 30 20 64 61 69 6c 79 0a
0000034
but that is also not helping .Output I am getting
Get \xA350 or \x8050 daily
od -t x1 output-file-name
0000000 47 65 74 20 a3 35 30 20 6f 72 20 5c 78 7b 32 30
0000020 61 63 7d 35 30 20 64 61 69 6c 79 0a
0000034

Unicode Code Point
Glyph
UTF-8
Input File
ISO-8859-1
Output File
U+00A3 POUND SIGN
£
C2 A3
C2 A3
A3
A3
U+20AC EURO SIGN
€
E2 82 AC
E2 82 AC
N/A
5C 78 7B 32 30 61 63 7D
("LATIN1" is an alias for "ISO-8859-1".)
There are no problems with the input file.
£ is correctly encoded in your input file.
€ is correctly encoded in your input file.
As for the output file,
£ is correctly encoded in your output file.
€ isn't found in the latin1 charset, so \x{20ac} is used instead.
Your program is working as expected.
You say you see <A3> instead of £. That's probably because the program you are using is expecting a file encoded using UTF-8, but you provided a file encoded using ISO-8859-1.
You also say you see <80> instead of €. But there's no way you'd see that for the file you provided.

Convert Hex to ASCII in PowerShell [duplicate]

This question already has answers here:
PowerShell hex to string conversion
(7 answers)
Closed 6 years ago.
I have a series of hex values which looks like this:
68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08
I now need to convert this hex value to ASCII so that the result looks like:
helloWorld|1/0815|ABC-15
I tried so many things but I never came to the final code. I tried to use the convert-function in every imaginable way without any success.
At the moment I use this website to convert, but I need to do this in my PowerShell script.

Much like Phil P.'s approach, but using the -split and -join operators instead (also, integers not needed, ASCII chars will fit into a [byte]):
$hexString = "68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08"
$asciiChars = $hexString -split ' ' |ForEach-Object {[char][byte]"0x$_"}
$asciiString = $asciiChars -join ''
$asciiString

Well, we can do something terrible and treat the HEX as a string... And then convert it to int16, which then can be converted to a char.
$hexString = "68 65 6c 6c 6f 57 6f 72 6c 64 7c 31 2f 30 38 31 35 7c 41 42 43 2d 31 35 02 08"
We have the string, now we can use the spaces to split and get each value separately. These values can be converted to an int16, which is a ascii code representation for the character
$hexString.Split(" ") | forEach {[char]([convert]::toint16($_,16))}
The only problem is that it returns an array of single characters. Which we can iterate through and concatenate into a string
$hexString.Split(" ") | forEach {[char]([convert]::toint16($_,16))} | forEach {$result = $result + $_}
$result

Function readmtx on matlab

I want to read a matrix that is on my matlab path. I was using the function readmtx but I don't know what to put on 'precision' (mtx = readmtx(fname,nrows,ncols,precision)).
I was wondering if you could help me with that. Or suggest a better way to read the matrix

You could read a matrix from text file with load command. If the first line include text, that should be started with %.
Note that each row of the text file should be values of a row in matrix, which are separated by a space, for Example:
%C1 C2 C3
1 2 3
4 5 6
7 8 9
Then, if you use load command you can read the text file into a matrix, something like:
myMatrix = load('textFileName.txt')
Now, Let's talk about readmtx ;)
About precision as described here:
Both binary and formatted data files can be read. If the file is binary, the precision argument is a format string recognized by fread. Repetition modifiers such as '40*char' are not supported. If the file is formatted, precision is a fscanf and sscanf-style format string of the form '%nX', where n is the number of characters within which the formatted data is found, and X is the conversion character such as 'g' or 'd'. Fortran-style double-precision output such as '0.0D00' can be read using a precision string such as '%nD', where n is the number of characters per element. This is an extension to the C-style format strings accepted by sscanf. Users unfamiliar with C should note that '%d' is preferred over '%i' for formatted integers. MATLAB syntax follows C in interpreting '%i' integers with leading zeros as octal. Formatted files with line endings need to provide the number of trailing bytes per row, which can be 1 for platforms with carriage returns or linefeed (Macintosh, UNIX®), or 2 for platforms with carriage returns and linefeeds (DOS).
Check this example also:
Write and read a binary matrix file:
fid = fopen('binmat','w');
fwrite(fid,1:100,'int16');
fclose(fid);
mtx = readmtx('binmat',10,10,'int16')
mtx =
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
mtx = readmtx('binmat',10,10,'int16',[2 5],3:2:9)
mtx =
13 15 17 19
23 25 27 29
33 35 37 39
43 45 47 49

Why does s/$/,/g replace first character?

OS Ubuntu 12.04
Shell bash
Why does this sed command
sed -e 's/$/,/g' test.in
replace the 5 in test.in?
Here are the contents of test.in
52147398,9480,12/31/2011 23:22,101049000,LNAM,FNAM,80512725,43,0,75,1/1/2012 6:45,101049000
Here are the results of running sed
,2147398,9480,12/31/2011 23:22,101049000,LNAM,FNAM,80512725,43,0,75,1/1/2012 6:45,101049000
I want to put a comma after 1010489000
After replacing the date-time format so that it's in yyyy-mm-dd hh:mm:ss format, now
s/$/,/g
works as expected. Is this because sed got hung up on the mm/dd/yyyy hh:mm format?

Probably your file is in DOS format. The carriage return just before the newline would cause the comma to be written as the first character on the line overwriting an existing character (in this case the 5).
Convert to Unix format first:
tr -d '\r' < file > newfile

$ cat test.txt | sed 's/$/,/g'
52147398,9480,12/31/2011 23:22,101049000,LNAM,FNAM,80512725,43,0,75,1/1/2012 6:45,101049000,
Maybe the file is corrupt? Try copy/pasting into a new file...
Also try,
$ hexdump test.txt
hexdump test.txt
0000000 35 32 31 34 37 33 39 38 2c 39 34 38 30 2c 31 32
0000010 2f 33 31 2f 32 30 31 31 20 32 33 3a 32 32 2c 31
0000020 30 31 30 34 39 30 30 30 2c 4c 4e 41 4d 2c 46 4e
0000030 41 4d 2c 38 30 35 31 32 37 32 35 2c 34 33 2c 30
0000040 2c 37 35 2c 31 2f 31 2f 32 30 31 32 20 36 3a 34
0000050 35 2c 31 30 31 30 34 39 30 30 30 0a
000005c
The first character should be 0x35.

How can I convert a 48 hex string to bytes using Perl?

I have a hex string (length 48 chars) that I want to convert to raw bytes with the pack function in order to put it in a Win32 vector of bytes.
How I can do this with Perl?

my $bytes = pack "H*", $hex;
See perlpacktut for more information.

The steps are:
Extract pairs of hexadecimal characters from the string.
Convert each pair to a decimal number.
Pack the number as a byte.
For example:
use strict;
use warnings;
my $string = 'AA55FF0102040810204080';
my #hex = ($string =~ /(..)/g);
my #dec = map { hex($_) } #hex;
my #bytes = map { pack('C', $_) } #dec;
Or, expressed more compactly:
use strict;
use warnings;
my $string = 'AA55FF0102040810204080';
my #bytes = map { pack('C', hex($_)) } ($string =~ /(..)/g);

I have the string:
"61 62 63 64 65 67 69 69 6a"
which I want to interpret as hex values, and display those as ASCII chars (those values should reproduce the character string "abcdefghij").
Typically, I try to write something quick like this:
$ echo "61 62 63 64 65 67 69 69 6a" | perl -ne 'print "$_"; print pack("H2 "x10, $_)."\n";'
61 62 63 64 65 67 69 69 6a
a
... and then I wonder, why do I get only one character back :)
First, let me note down that the string I have, can also be represented as the hex values of bytes that it takes up in memory:
$ echo -n "61 62 63 64 65 67 68 69 6a" | hexdump -C
00000000 36 31 20 36 32 20 36 33 20 36 34 20 36 35 20 36 |61 62 63 64 65 6|
00000010 37 20 36 38 20 36 39 20 36 61 |7 68 69 6a|
0000001a
_(NB: Essentially, I want to "convert" the above byte values in memory as input, to these below ones, if viewed by hexdump:
$ echo -n "abcdefghij" | hexdump -C
00000000 61 62 63 64 65 66 67 68 69 6a |abcdefghij|
0000000a
... which is how the original values for the input hex string were obtained.
)_
Well, this Pack/Unpack Tutorial (AKA How the System Stores Data) turns out is the most helpful for me, as it mentions:
The pack function accepts a template string and a list of values [...]
$rec = pack( "l i Z32 s2", time, $emp_id, $item, $quan, $urgent);
It returns a scalar containing the list of values stored according to the formats specified in the template [...]
$rec would contain the following (first line in decimal, second in hex, third as characters where applicable). Pipe characters indicate field boundaries.
Offset Contents (increasing addresses left to right)
0 160 44 19 62| 41 82 3 0| 98 111 120 101 115 32 111 102
A0 2C 13 3E| 29 52 03 00| 62 6f 78 65 73 20 6f 66
| b o x e s o f
That is, in my case, $_ is a single string variable -- whereas pack expects as input a list of several such 'single' variables (in addition to a formatting template string); and outputs again a 'single' variable (which could, however, be a sizeable chunk of memory!). In my case, if the output 'single' variable contains the ASCII code in each byte in memory, then I'm all set (I could simply print the output variable directly, then).
Thus, in order to get a list of variables from the $_ string, I can simply split it at the space sign - however, note:
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("H2", split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
a
... that amount of elements to be packed must be specified (otherwise again we get only one character back); then, either of these alternatives work:
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("H2"x10, split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
abcdeghij
$ echo "61 62 63 64 65 67 68 69 6a" | perl -ne 'print "$_"; print pack("(H2)*", split(/ /, $_))."\n";'
61 62 63 64 65 67 68 69 6a
abcdeghij

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Hex dump parsing in perl - perl

I think reading in two characters at a time is the appropriate way to parse a stream whose logical tokens are two-character units. Is there some reason you think that's ugly? If you're trying to extract a particular sequence, you could do that with whitespace-insensitive regular expressions.

Related

How to read currency symbol in a perl script

Convert Hex to ASCII in PowerShell [duplicate]

Function readmtx on matlab

Why does s/$/,/g replace first character?

How can I convert a 48 hex string to bytes using Perl?

Categories

Resources