I am using the unpack function to convert the contents of a binary file to hexadecimal.
I am doing it as follows:
#! /usr/bin/perl
use strict;
use warnings;
my $input=$ARGV[0];
open(INPUT,'<',$input) || die("Couldn't open the file, $input with error: $!\n");
my $value=<INPUT>;
$value=unpack("H*",$value);
print $value,"\n";
This prints the contents of the binary input file as a hex string.
However, the issue is that, while parsing the contents of the binary file, if it comes across the byte 0x0A (newline character), the unpack function stops at that point.
As a result, I get incomplete output in the $value variable.
A few examples:
65 2E 0D 0D 0A 24 00 00 00 00 00 00 00 BA DC 95 DC FE BD
FE FF FF FF 07 00 00 00 08 00 00 00 09 00 00 00 0A 00 00 00 0B 00 00 00 0C 00
All the content after the byte 0x0A is not parsed by unpack.
So, is there a way to use unpack for the complete binary file so that it does not stop parsing once it encounters a new line character?
Thanks.
What do you think
my $value = <INPUT>;
does? Read a line, which is to say read until 0A. Fix:
my $value;
{ local $/; $value = <INPUT>; }
Also, you want to add
binmode(INPUT);
after the open.
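For comparison, here is the same line-versus-whole-file distinction sketched in Python rather than the asker's Perl. The two function names are made up for illustration. Opening with "rb" plays the role of binmode, and read() with no argument plays the role of slurping with $/ undefined.

```python
def file_to_hex(path):
    """Return the entire file as one lowercase hex string."""
    # "rb" reads raw bytes with no newline translation, and read() with
    # no argument returns the whole file rather than a single line.
    with open(path, "rb") as handle:
        return handle.read().hex()


def first_line_to_hex(path):
    """Return only what a line-oriented read sees: bytes up to the first 0x0A."""
    with open(path, "rb") as handle:
        return handle.readline().hex()
```

On a file that starts with 65 2E 0D 0D 0A 24 ..., first_line_to_hex stops right after the 0A, which is the truncation described in the question, while file_to_hex keeps going to the end of the file.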
I'm learning Assembly as part of a malware analysis project and trying to use a few Node.js libraries to scrape executables from GitHub and disassemble them.
Specifically I'm focusing on x86-64 PE.
But a disassembler, such as the one I chose, isn't necessarily supposed to find the instructions in a particular executable format such as PE.
In addition to first needing to know where my instructions should start, when I started using the disassembler, I realized I also needed to set a particular RIP value for the program to start at. I don't fully understand why some programs start at different memory offsets, but supposedly it's to allow other cooperating processes to put memory in the same block. Or something like that.
So my goal is to know:
the correct starting value for the RIP
the correct byte to look for the first instruction, beyond the header.
So I used a library to find meta data, like so:
let metaData = await executableMetadata.getMetadataObjectFromExecutableFilePath_Async(execPath);
Which when passed an exe with a header like this:
0: 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
16: b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
32: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
48: 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
64: 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68
80: 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f
96: 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20
112: 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00
128: 50 45 00 00 4c 01 03 00 91 3f 9a ef 00 00 00 00
144: 00 00 00 00 e0 00 22 00 0b 01 30 00 00 12 00 00
tells us:
{
format: 'PE',
pe_header_offset_16le: 128,
machine_type: 332,
machine_type_object: {
constant: 'IMAGE_FILE_MACHINE_I386',
description: 'Intel 386 or later processors and compatible processors'
},
number_of_sections: 3,
timestamp: -275103855,
coff_symbol_table_offset: 0,
coff_number_of_symbol_table_entries: 0,
size_of_optional_header: 224,
characteristics_bitflag: 34,
characteristics_bitflags: [
{
constant: 'IMAGE_FILE_EXECUTABLE_IMAGE',
description: 'Image only. This indicates that the image file is valid and can be run. If this flag is not set, it indicates a linker error.',
flag_code: 2
},
{
constant: 'IMAGE_FILE_LARGE_ADDRESS_AWARE',
description: 'Application can handle > 2-GB addresses.',
flag_code: 32
}
],
object_type_code: 267,
object_type: 'PE32',
linker: { major_version: 48, minor_version: 0 },
size_of_code: 4608,
size_of_initialized_data: 2048,
size_of_uninitialized_data: 0,
address_of_entry_point: 12586,
base_of_code: 8192,
windows_specific: {
image_base: 4194304,
section_alignment: 8192,
file_alignment: 512,
major_os_version: 4,
minor_os_version: 0,
major_image_version: 0,
minor_image_version: 0,
major_subsystem_version: 6,
minor_subsystem_version: 0,
win32_version: 0,
size_of_image: 32768,
size_of_headers: 512,
checksum: 0,
subsystem: {
constant: 'IMAGE_SUBSYSTEM_WINDOWS_CUI',
description: 'The Windows character subsystem',
subsystem_code: 3
},
dll_characteristics: 34144,
dll_characteristic_flags: [ [Object], [Object], [Object], [Object], [Object] ]
},
base_of_data: 16384
}
And from this, I think maybe I found the two pieces of info I needed:
First instruction byte: windows_specific.size_of_headers (512)
RIP starting value: address_of_entry_point (12586)
But I'm basically guessing. Could anyone more familiar with this meta data explain the correct properties to look at to get the info I need?
A Windows executable file begins with a 16-bit DOS stub. The double word at file offset 60 contains the offset of the DWORD PE signature; in your example it is 60: 80 00 00 00, i.e. 128 in decimal.
The PE signature is immediately followed by the COFF file header (file offset 132).
You may want to compare your hexadecimal dump with the structure of the headers in assembly language. COFF_FILE_HEADER.Machine is 132: 4C 01, i.e. 0x14C, which indicates a 32-bit executable. In a 64-bit executable it would be 0x8664.
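The header walk described so far can be checked against the dump from the question. This is a minimal Python sketch, not the asker's Node.js library; read_pe_headers and the dictionary keys are names invented here, and the byte string is copied from the 160 bytes shown in the question.

```python
import struct

# First 160 bytes of the executable, copied from the dump in the question.
HEADER = bytes.fromhex("""
4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68
69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f
74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20
6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00
50 45 00 00 4c 01 03 00 91 3f 9a ef 00 00 00 00
00 00 00 00 e0 00 22 00 0b 01 30 00 00 12 00 00
""")


def read_pe_headers(data):
    """Decode the PE signature offset and the 20-byte COFF file header."""
    # The DWORD at file offset 60 holds the offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 60)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    machine, sections, _stamp, _symtab, _nsyms, opt_size, flags = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    # The optional header starts right after it; its first word is the magic
    # (0x10B for PE32, 0x20B for PE32+).
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "pe_offset": pe_offset,
        "machine": machine,
        "number_of_sections": sections,
        "size_of_optional_header": opt_size,
        "characteristics": flags,
        "optional_magic": magic,
    }


info = read_pe_headers(HEADER)
```

For this dump the result agrees with the metadata object in the question: pe_offset 128, machine 0x14C (332), 3 sections, optional header size 224, characteristics 34 and magic 0x10B (267, PE32).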
The file header (and, in an executable image, the optional header, whose length is given by SizeOfOptionalHeader) is followed by the COFF section headers. You are interested in those sections that have the bit SCN_MEM_EXECUTE=0x2000_0000 set in COFF_SECTION_HEADER.Characteristics.
COFF_SECTION_HEADER.PointerToRawData specifies the file offset of the start of the code.
Extract .SizeOfRawData bytes starting at this file offset and submit that portion of code to your disassembler.
Beware that at run time the code is in fact mapped at .VirtualAddress, which differs from .PointerToRawData.
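A rough Python sketch of the section walk, for illustration only. The function names are hypothetical, and the section table is located by skipping SizeOfOptionalHeader bytes after the 20-byte file header. The second helper translates a relative virtual address such as address_of_entry_point into a file offset using .VirtualAddress and .PointerToRawData.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000


def executable_sections(image):
    """Return (name, virtual_address, raw_offset, raw_size) for executable sections."""
    (pe_offset,) = struct.unpack_from("<I", image, 60)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    _machine, count, _stamp, _symtab, _nsyms, opt_size, _flags = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4)
    # Section headers (40 bytes each) follow the file header and optional header.
    table = pe_offset + 4 + 20 + opt_size
    found = []
    for index in range(count):
        (name, _vsize, vaddr, raw_size, raw_offset,
         _relocs, _lines, _nrelocs, _nlines, flags) = struct.unpack_from(
            "<8sIIIIIIHHI", image, table + 40 * index)
        if flags & IMAGE_SCN_MEM_EXECUTE:
            label = name.rstrip(b"\0").decode("ascii", "replace")
            found.append((label, vaddr, raw_offset, raw_size))
    return found


def rva_to_file_offset(rva, sections):
    """Translate a relative virtual address into a file offset, or None."""
    for _name, vaddr, raw_offset, raw_size in sections:
        if vaddr <= rva < vaddr + raw_size:
            return raw_offset + (rva - vaddr)
    return None
```

The question does not show the section table, so as a purely assumed example: if .text had VirtualAddress 0x2000 and PointerToRawData 0x200, the entry point 12586 (0x312A) would sit at file offset 0x132A. The address to give the disassembler as the starting instruction pointer would then be image_base plus that relative virtual address.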
I use a card reader to communicate with a SIM card.
For example, I need to get an IMSI from the SIM card.
To do this I send some commands (SELECT 3F00/7F20/6F07):
A0 A4 00 00 02 3F 00
A0 A4 00 00 02 7F 20
A0 A4 00 00 02 6F 07
and here I send READ BINARY command
A0 B0 00 00 09
and after that I receive 90 00 --> Ok - normal ending of the command.
Hey! And where is my IMSI stored?? How can I catch data, which were read by "A0 B0 00 00 09" command?
If I try "A0 C0 00 00 00" command (GET RESPONSE) I will get an Error.
You don't need to send Get Response Command "A0 C0 00 00 00" after Read Data.
There are 9 bytes of data in reply to your Read Data Command "A0 B0 00 00 09".
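In other words, the reply to READ BINARY is the requested data bytes followed by the two status bytes. A small Python sketch of splitting such a reply; split_response is a made-up helper, not part of any card reader API, and the sample bytes in the usage note are placeholders rather than a real IMSI record.

```python
def split_response(response):
    """Split a card reply into (data, sw1, sw2); the last two bytes are the status word."""
    if len(response) < 2:
        raise ValueError("a reply must contain at least SW1 and SW2")
    return response[:-2], response[-2], response[-1]
```

For example, split_response(bytes.fromhex("01 02 03 04 05 06 07 08 09 90 00")) returns the nine data bytes together with 0x90 and 0x00. If the reader library only hands back 90 00, the data was most likely received in a separate buffer of the same exchange.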
In my attempt to get "Steam for Linux" working on Debian, I've run into an issue. libcef (Chromium Embedded Framework) works fine with GLIBC_2.13 (which eglibc on Debian testing can provide), but requires one pesky little extra function from GLIBC_2.15 (which eglibc can't provide):
$ readelf -s libcef.so | grep -E "@GLIBC_2\.1[4567]"
1037: 00000000 0 FUNC GLOBAL DEFAULT UND __fdelt_chk@GLIBC_2.15 (49)
2733: 00000000 0 FUNC GLOBAL DEFAULT UND __fdelt_chk@@GLIBC_2.15
My plan of attack here was to LD_PRELOAD a shim library that provides just these functions. This doesn't seem to work. I really want to avoid installing GLIBC_2.17 (since it is in Debian experimental; even Debian sid still has GLIBC_2.13).
This is what I've tried.
fdelt_chk.c is basically stolen from the GNU C library:
#include <sys/select.h>
# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
# define _strong_alias(name, aliasname) \
extern __typeof (name) aliasname __attribute__ ((alias (#name)));
unsigned long int
__fdelt_chk (unsigned long int d)
{
if (d >= FD_SETSIZE)
__chk_fail ();
return d / __NFDBITS;
}
strong_alias (__fdelt_chk, __fdelt_warn)
My Versions script looks as follows:
GLIBC_2.15 {
__fdelt_chk; __fdelt_warn;
};
I then build the library as follows:
$ gcc -m32 -c -fPIC fdelt_chk.c -o fdelt_chk.o
$ gcc -m32 -shared -nostartfiles -Wl,-s -Wl,--version-script Versions -o fdelt_chk.so fdelt_chk.o
However, if I then run Steam (with a bunch of extra stuff to get it working in the first place), the loader still refuses to find the symbol:
% LD_LIBRARY_PATH="/home/tinctorius/.local/share/Steam/ubuntu12_32" LD_PRELOAD=./fdelt_chk.so:./steamui.so ./steam
./steam: /lib/i386-linux-gnu/i686/cmov/libc.so.6: version `GLIBC_2.15' not found (required by /home/tinctorius/.local/share/Steam/ubuntu12_32/libcef.so)
However, the version symbol is also provided by the .so I just built:
% readelf -s fdelt_chk.so
Symbol table '.dynsym' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FUNC GLOBAL DEFAULT UND __chk_fail@GLIBC_2.3.4 (3)
2: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS _edata
3: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS _end
4: 00000310 44 FUNC GLOBAL DEFAULT 11 __fdelt_warn@@GLIBC_2.15
5: 00000310 44 FUNC GLOBAL DEFAULT 11 __fdelt_chk@@GLIBC_2.15
6: 00000000 0 OBJECT GLOBAL DEFAULT ABS GLIBC_2.15
7: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS __bss_start
At this point, I don't know what I can do to trick the loader (who?) into choosing my symbols. Am I going in the right direction at all?
I ran into this same problem, though not with Steam. What I was trying to run wanted 2.15 for fdelt_chk while my system had 2.14. I found a solution for simple cases like ours where we can easily provide our own implementation for the missing functionality.
I started out from your attempted solution of implementing the functionality and LD_PRELOADing it. Using LD_DEBUG=all (as suggested by osgx) showed that the linker was still looking for 2.15, so just having the right symbol wasn't enough and there was some other versioning mechanism somewhere. I noticed that objdump -p and readelf -V both showed references to 2.15, so I looked up documentation on ELF and found information on version requirements.
So my new goal was to transform references to 2.15 into references to something else. It seemed reasonable that I could just overwrite structures that referred to 2.15 with the structures that referred to some lower version, like 2.1. In the end, after some trial and error, I found just editing the right Elfxx_Vernaux(es?) in .gnu.version_r was sufficient, but caveat hacker, I guess.
The .gnu.version_r section is a list of 16-byte Elfxx_Verneeds and 16-byte Elfxx_Vernauxes. Each Elfxx_Verneed entry is followed by the associated Elfxx_Vernauxes. As far as I could tell, vn_cnt is actually how many associated Elfxx_Vernauxes there are, even though the docs say number of associated verneed array entries. It might just be a misunderstanding on my part, though.
So, to start off making the edits, let's look at some of the info from readelf -V. I snipped out parts we don't care about.
$ readelf -V mybinary
<snip stuff before .gnu.version_r>
Version needs section '.gnu.version_r' contains 5 entries:
Addr: 0x00000000000021ac Offset: 0x0021ac Link: 4 (.dynstr)
<snip libraries that don't refer to GLIBC_2.15>
0x00c0: Version: 1 File: libc.so.6 Cnt: 10
0x00d0: Name: GLIBC_2.3 Flags: none Version: 19
0x00e0: Name: GLIBC_2.7 Flags: none Version: 16
0x00f0: Name: GLIBC_2.2 Flags: none Version: 15
0x0100: Name: GLIBC_2.2.4 Flags: none Version: 14
0x0110: Name: GLIBC_2.1.3 Flags: none Version: 13
0x0120: Name: GLIBC_2.15 Flags: none Version: 12
0x0130: Name: GLIBC_2.4 Flags: none Version: 10
0x0140: Name: GLIBC_2.1 Flags: none Version: 9
0x0150: Name: GLIBC_2.3.4 Flags: none Version: 4
0x0160: Name: GLIBC_2.0 Flags: none Version: 2
From this we see that the section starts at 0x21ac. Each file listed will have an Elfxx_Verneed followed by an Elfxx_Vernaux for each of the subentries (like GLIBC_2.3). I assume the order of the info in the output will always match the order in the file since readelf is just dumping the structures. Here's my entire .gnu.version_r section.
000021A0 01 00 02 00
000021B0 A3 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
000021C0 00 00 11 00 32 0D 00 00 10 00 00 00 10 69 69 0D
000021D0 00 00 0B 00 3C 0D 00 00 00 00 00 00 01 00 02 00
000021E0 BE 0C 00 00 10 00 00 00 30 00 00 00 13 69 69 0D
000021F0 00 00 08 00 46 0D 00 00 10 00 00 00 10 69 69 0D
00002200 00 00 07 00 3C 0D 00 00 00 00 00 00 01 00 02 00
00002210 99 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
00002220 00 00 06 00 32 0D 00 00 10 00 00 00 10 69 69 0D
00002230 00 00 05 00 3C 0D 00 00 00 00 00 00 01 00 02 00
00002240 AE 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
00002250 00 00 12 00 32 0D 00 00 10 00 00 00 10 69 69 0D
00002260 00 00 03 00 3C 0D 00 00 00 00 00 00 01 00 0A 00
00002270 FF 0C 00 00 10 00 00 00 00 00 00 00 13 69 69 0D
00002280 00 00 13 00 46 0D 00 00 10 00 00 00 17 69 69 0D
00002290 00 00 10 00 50 0D 00 00 10 00 00 00 12 69 69 0D
000022A0 00 00 0F 00 5A 0D 00 00 10 00 00 00 74 1A 69 09
000022B0 00 00 0E 00 64 0D 00 00 10 00 00 00 73 1F 69 09
000022C0 00 00 0D 00 70 0D 00 00 10 00 00 00 95 91 96 06
000022D0 00 00 0C 00 7C 0D 00 00 10 00 00 00 14 69 69 0D
000022E0 00 00 0A 00 87 0D 00 00 10 00 00 00 11 69 69 0D
000022F0 00 00 09 00 32 0D 00 00 10 00 00 00 74 19 69 09
00002300 00 00 04 00 91 0D 00 00 10 00 00 00 10 69 69 0D
00002310 00 00 02 00 3C 0D 00 00 00 00 00 00
To briefly talk about the structure here, it starts out with an Elfxx_Verneed. As per the docs, we can see there will be 2 Elfxx_Vernauxes, one offset 16 bytes, and the next Elfxx_Verneed is offset 48 bytes. These offsets are from the start of the current structure. It looks like technically the associated Elfxx_Vernauxes might not be adjacent after the current Elfxx_Verneed but it was actually so in all the files I poked around in.
From this we can find the file we want (libc.so.6) in a few different ways. Cross reference the string (which I won't get into), find the Elfxx_Verneed with a count of 0A 00 (10, matching our readelf output above), or find the last Elfxx_Verneed since it's the last one readelf output. In any case, the right one for my file is at 0x226C. Its first Elfxx_Vernaux starts at 0x227C.
We want to find the Elfxx_Vernaux with a version of 0C 00 (12, again matching our readelf output above). We see the Elfxx_Vernaux that matches is at 0x22CC and the entire structure is 95 91 96 06 00 00 0C 00 7C 0D 00 00 10 00 00 00. We'll be overwriting the first 12 bytes so as to leave the offset alone. We're only modifying the data, not moving around the structures, after all.
To pick the data to overwrite with, we just copy it from a different Elfxx_Vernaux for a version of glibc we can satisfy. I picked one for 2.1, which is at 0x22EC in my file, with the data 11 69 69 0D 00 00 09 00 32 0D 00 00 10 00 00 00. So take the first 12 bytes from this and overwrite the first 12 bytes above, and that's it for the hex editing.
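The same edit can be sketched in Python instead of a hex editor. This assumes a little-endian ELF and the 16-byte layouts described above (Verneed: vn_version, vn_cnt, vn_file, vn_aux, vn_next; Vernaux: vna_hash, vna_flags, vna_other, vna_name, vna_next). The function names are invented for illustration, and offsets are relative to whatever buffer is passed in.

```python
import struct

VERNEED = struct.Struct("<HHIII")  # vn_version, vn_cnt, vn_file, vn_aux, vn_next
VERNAUX = struct.Struct("<IHHII")  # vna_hash, vna_flags, vna_other, vna_name, vna_next


def vernaux_offsets(section, verneed_offset):
    """Map each version index (vna_other) to the offset of its Vernaux."""
    _version, count, _file, aux, _next = VERNEED.unpack_from(section, verneed_offset)
    offsets = {}
    position = verneed_offset + aux  # vn_aux is relative to the Verneed itself
    for _ in range(count):
        _hash, _flags, other, _name, step = VERNAUX.unpack_from(section, position)
        offsets[other] = position
        position += step  # vna_next is relative to the current Vernaux
    return offsets


def retarget_version(section, verneed_offset, unwanted, substitute):
    """Overwrite the first 12 bytes of one Vernaux with those of another.

    The last 4 bytes (vna_next) are left alone so the chain stays intact.
    """
    offsets = vernaux_offsets(section, verneed_offset)
    source = offsets[substitute]
    target = offsets[unwanted]
    section[target:target + 12] = section[source:source + 12]
```

With the libc.so.6 Verneed from the dump above loaded into a bytearray, retarget_version(buffer, 0, unwanted=12, substitute=9) reproduces the manual edit: the entry for version index 12 ends up carrying the hash, index and name of the GLIBC_2.1 entry. Work on a copy of the binary, since this rewrites bytes in place.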
Of course, you might have multiple references to deal with. Your program might have multiple binaries to edit.
At this point, our program still won't run. But instead of being told something like GLIBC_2.15 not found it should complain about missing __fdelt_chk. Now we do the shim and LD_PRELOADing described in the question, except instead of versioning our implementation as 2.15, we use the version we picked while hex editing. At this point the program should run.
This method depends on being able to provide an implementation for whatever's missing. Our __fdelt_chk is extremely simple but I don't doubt that in some cases providing an implementation could be more difficult than just upgrading the system's libc instead.
For what it's worth, the __fdelt_chk function is part of the _FORTIFY_SOURCE feature; the function itself was added in glibc 2.15. The feature enables compile-time and run-time checking for buffer overflows.
If you were able to recompile with the following CFLAGS added, it would build a backwards compatible binary without the extra checking:
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0
I write the following bytes into a file named disk.img
FA 8D 36 1B 7C E8 01 00 F4 AC 3C 00 74 0C B4 0E
BB 07 00 B9 01 00 CD 10 EB EF C3 4D 61 79 20 74
68 65 20 66 6F 72 63 65 20 62 65 20 77 69 74 68
20 79 6F 75 21 0D 0A 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
... enough zeros to make the file 512 bytes in size.
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA
The above bytes are proper instructions plus the magic number, so they should work when loaded as a boot sector. But after I executed "qemu-X86_64 disk.img", an error occurred:
Error -13 while loading disk.img
Does anyone know how to solve the problem or what is the reason that might lead to this error?
Thank you!
I don't know if you can fill an image with just anything and expect it to work just because you have 55 AA in the correct place. Since you seem to be writing a bootloader, make sure your code thinks it is executing at the correct place. It should be at address 0x7C00 (if I remember this correctly, double check that). You set it by writing the line [org 0x7C00] at the top of your assembly file.
Also, I'm not sure you can have only a 512-byte file. Try to make the disk image bigger than that using something like dd if=/dev/zero of=disk.img bs=512 count=2000 and then just copy your bootloader to the first part of the disk using dd again.
Also, you should use the -hda or -fda options, so it would be qemu -hda disk.img. -hda means hard drive image, and -fda means floppy disk image.
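If dd is not handy, the padding and image layout can be sketched in Python. The helper names are hypothetical, and the 2000-sector default simply mirrors the dd example; nothing here is specific to QEMU.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"


def make_boot_sector(code):
    """Pad machine code with zeros to 510 bytes and append the 55 AA signature."""
    if len(code) > SECTOR_SIZE - len(BOOT_SIGNATURE):
        raise ValueError("boot code does not fit in one sector")
    return code.ljust(SECTOR_SIZE - len(BOOT_SIGNATURE), b"\0") + BOOT_SIGNATURE


def make_disk_image(boot_sector, total_sectors=2000):
    """Place the boot sector first and fill the rest of the image with zeros."""
    if len(boot_sector) != SECTOR_SIZE:
        raise ValueError("a boot sector must be exactly 512 bytes")
    return boot_sector + bytes(SECTOR_SIZE * (total_sectors - 1))
```

Writing make_disk_image(make_boot_sector(code)) to disk.img in binary mode gives a zero-filled image of 2000 sectors with the bootloader in the first one.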
How can I unpack a binary file of 4-byte values, stored as in the following example,
into an array or a text file?
input file:
00000000 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 00 |................|
00000001 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 |................|
desired output file:
0,1,1,0,1,1,1,1
For now I'm using the following unpack code:
open(ERROR_ID_BIN, "<", "/error_id.bin") or die $!;
local $/;
my @err_values = unpack("V*", <ERROR_ID_BIN>);
close(ERROR_ID_BIN);
print "\n\n\n\n\t@err_values\n\n\n";
And my problem is that it flips the values and gives me that:
0,16777216,16777216,0,16777216,16777216,16777216,16777216
What should I do ?
V is little-endian (least significant byte first); try N for big-endian (most significant byte first).
From the pack documentation:
N An unsigned long (32-bit) in "network" (big-endian) order.
V An unsigned long (32-bit) in "VAX" (little-endian) order.
Don't you want 'N' to correct your endianness?
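To see the difference on the bytes from the question, here is the same decoding sketched with Python's struct module, where ">" corresponds to Perl's N and "<" to V. unpack_u32 is a helper name made up for this example.

```python
import struct

# The 32 bytes shown in the question's hex dump.
RAW = bytes.fromhex(
    "00000000 00000001 00000001 00000000"
    "00000001 00000001 00000001 00000001"
)


def unpack_u32(data, byte_order):
    """Decode consecutive unsigned 32-bit integers; byte_order is '>' or '<'."""
    count = len(data) // 4
    return list(struct.unpack(f"{byte_order}{count}I", data))


big_endian = unpack_u32(RAW, ">")     # like unpack("N*", ...)
little_endian = unpack_u32(RAW, "<")  # like unpack("V*", ...)
```

Here big_endian is [0, 1, 1, 0, 1, 1, 1, 1], the desired output, while little_endian contains 16777216 wherever a 1 is expected, because 00 00 00 01 read least significant byte first is 0x01000000.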