My managed process is suspected to have caused a BSOD at a client site. I received a full memory dump (i.e.: including kernel, physical pages only) - but still am not able to inspect my process' stacks.
After switching to my process context -
.process /p /r <MyProcAddress>
I see only -
1: kd> k
# ChildEBP RetAddr
00 b56e3b70 81f2aa5d nt!KeBugCheckEx+0x1e
01 b56e3b94 81e7b68d nt!PspCatchCriticalBreak+0x71
02 b56e3bc4 81e6dfd1 nt!PspTerminateAllThreads+0x2d
03 b56e3bf8 8d48159a nt!NtTerminateProcess+0xcd
WARNING: Stack unwind information not available. Following frames may be wrong.
04 b56e3c24 81c845e4 klif+0x7559a
05 b56e3c24 77da6bb4 nt!KiSystemServicePostCall
06 0262f34c 00220065 ntdll!KiFastSystemCallRet
07 0262f390 003e0022 0x220065
08 0262f394 0073003c 0x3e0022
09 0262f398 00730079 0x73003c
0a 0262f39c 006e003a 0x730079
0b 0262f3a0 006d0061 0x6e003a
0c 0262f3a4 00200065 0x6d0061
0d 0262f3a8 00610076 0x200065
0e 0262f3ac 0075006c 0x610076
0f 0262f3b0 003d0065 0x75006c
10 0262f3b4 00770022 0x3d0065
11 0262f3b8 006e0069 0x770022
12 0262f3bc 006f0077 0x6e0069
13 0262f3c0 00640072 0x6f0077
14 0262f3c4 00200022 0x640072
...
Which is natural for managed process. SOS extension does not work for kernel dumps.
Is there anything I can do to view the throwing managed stack? It was previously said to be 'much more difficult', but hopefully not impossible.
PS.
I'm aware of the presence of Kaspersky driver kilf.sys in the stack, and this is my personal suspect. But the question is more general - hopefully there's a way to understand what my process was doing at the time.
the stack as you posted is not correct
it appears to be overwritten or is a result of some other artefact
with such a stack details you will have a hard time deciphering
anything useful at all
the contents of stack converted to a printable range in english looks like this
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 00 3E 00 22 00 22 00 65 00 73 00 3C 00 3E 00 22 .>.".".e.s.<.>."
00000010 00 73 00 79 00 73 00 3C 00 6E 00 3A 00 73 00 79 .s.y.s.<.n.:.s.y
00000020 00 6D 00 61 00 6E 00 3A 00 20 00 65 00 6D 00 61 .m.a.n.:. .e.m.a
00000030 00 61 00 76 00 20 00 65 00 75 00 6C 00 61 00 76 .a.v. .e.u.l.a.v
00000040 00 3D 00 65 00 75 00 6C 00 77 00 22 00 3D 00 65 .=.e.u.l.w.".=.e
00000050 00 6E 00 69 00 77 00 22 00 6F 00 77 00 6E 00 69 .n.i.w.".o.w.n.i
00000060 00 64 00 72 00 6F 00 77 00 20 00 22 00 64 00 72 .d.r.o.w. .".d.r
try !analyze -v and see what is the bsod analysis results
Related
When I use powershell tee-object cmdlet to save the output to a file, blank lines are created between each actual line. Output gets doubled and ugly, in both screen output, as well in the redirected file.
regular command, and output:
# db2 connect to sample
Database Connection Information
Database server = DB2/NT64 11.5.0.0
SQL authorization ID = SAMUEL
Local database alias = SAMPLE
but, when you use Tee-Object against it... here is what happens:
# db2 connect to sample | Tee-Object test.out
Database Connection Information
Database server = DB2/NT64 11.5.0.0
SQL authorization ID = SAMUEL
Local database alias = SAMPLE
In both screen output, and also in the generated file a well:
# type test.out
Database Connection Information
Database server = DB2/NT64 11.5.0.0
SQL authorization ID = SAMUEL
Local database alias = SAMPLE
--- edit ---
#js2010, here is the entire hex-format for better reading.. cant paste it properly in the comments.
# format-hex test.out
Path: E:\PowerShell_Tests\db2mon\test.out
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 FF FE 0D 00 0A 00 0D 00 0A 00 20 00 20 00 20 00 .รพ........ . . .
00000010 44 00 61 00 74 00 61 00 62 00 61 00 73 00 65 00 D.a.t.a.b.a.s.e.
00000020 20 00 43 00 6F 00 6E 00 6E 00 65 00 63 00 74 00 .C.o.n.n.e.c.t.
00000030 69 00 6F 00 6E 00 20 00 49 00 6E 00 66 00 6F 00 i.o.n. .I.n.f.o.
00000040 72 00 6D 00 61 00 74 00 69 00 6F 00 6E 00 0D 00 r.m.a.t.i.o.n...
00000050 0A 00 0D 00 0A 00 0D 00 0A 00 0D 00 0A 00 20 00 .............. .
00000060 44 00 61 00 74 00 61 00 62 00 61 00 73 00 65 00 D.a.t.a.b.a.s.e.
00000070 20 00 73 00 65 00 72 00 76 00 65 00 72 00 20 00 .s.e.r.v.e.r. .
00000080 20 00 20 00 20 00 20 00 20 00 20 00 20 00 3D 00 . . . . . . .=.
00000090 20 00 44 00 42 00 32 00 2F 00 4E 00 54 00 36 00 .D.B.2./.N.T.6.
000000A0 34 00 20 00 31 00 31 00 2E 00 35 00 2E 00 30 00 4. .1.1...5...0.
000000B0 2E 00 30 00 0D 00 0A 00 0D 00 0A 00 20 00 53 00 ..0......... .S.
000000C0 51 00 4C 00 20 00 61 00 75 00 74 00 68 00 6F 00 Q.L. .a.u.t.h.o.
000000D0 72 00 69 00 7A 00 61 00 74 00 69 00 6F 00 6E 00 r.i.z.a.t.i.o.n.
000000E0 20 00 49 00 44 00 20 00 20 00 20 00 3D 00 20 00 .I.D. . . .=. .
000000F0 53 00 41 00 4D 00 55 00 45 00 4C 00 0D 00 0A 00 S.A.M.U.E.L.....
00000100 0D 00 0A 00 20 00 4C 00 6F 00 63 00 61 00 6C 00 .... .L.o.c.a.l.
00000110 20 00 64 00 61 00 74 00 61 00 62 00 61 00 73 00 .d.a.t.a.b.a.s.
00000120 65 00 20 00 61 00 6C 00 69 00 61 00 73 00 20 00 e. .a.l.i.a.s. .
00000130 20 00 20 00 3D 00 20 00 53 00 41 00 4D 00 50 00 . .=. .S.A.M.P.
00000140 4C 00 45 00 0D 00 0A 00 0D 00 0A 00 0D 00 0A 00 L.E.............
00000150 0D 00 0A 00 ....
Also, your 2nd test reveals that the problem is not about using tee-object cmdlet, but actually, just piping the output causes it...
Another information, If I perform a redirect to a file, from a regular windows cmd window, the issue does not happens,
from cmd window:
E:\PowerShell_Tests\db2mon>db2 connect to sample > cmd.out
E:\PowerShell_Tests\db2mon>type cmd.out
Database Connection Information
Database server = DB2/NT64 11.5.0.0
SQL authorization ID = SAMUEL
Local database alias = SAMPLE
but, performing the same redirect from a powershell session, created the double lines again:
# db2 connect to sample > pwsh.out
PS [Samuel]E:\PowerShell_Tests\db2mon
# Get-Content pwsh.out
Database Connection Information
Database server = DB2/NT64 11.5.0.0
SQL authorization ID = SAMUEL
Local database alias = SAMPLE
--- end edit ---
--- edit 2 ---
#js2010
# db2 connect to sample | format-hex
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 20 20 20 44 61 74 61 62 61 73 65 20 43 6F 6E 6E Database Conn
00000010 65 63 74 69 6F 6E 20 49 6E 66 6F 72 6D 61 74 69 ection Informati
00000020 6F 6E on
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 20 44 61 74 61 62 61 73 65 20 73 65 72 76 65 72 Database server
00000010 20 20 20 20 20 20 20 20 3D 20 44 42 32 2F 4E 54 = DB2/NT
00000020 36 34 20 31 31 2E 35 2E 30 2E 30 64 11.5.0.0
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 20 53 51 4C 20 61 75 74 68 6F 72 69 7A 61 74 69 SQL authorizati
00000010 6F 6E 20 49 44 20 20 20 3D 20 53 41 4D 55 45 4C on ID = SAMUEL
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 20 4C 6F 63 61 6C 20 64 61 74 61 62 61 73 65 20 Local database
00000010 61 6C 69 61 73 20 20 20 3D 20 53 41 4D 50 4C 45 alias = SAMPLE
--- end edit 2 ---
Does any one has any clue on what is going on, and how can I "fix" it ?
Thanks
As your Format-Hex output implies, db2 - bizarrely - uses CRCRLF ("`r`r`n" in PowerShell terms) rather than the usual CRLF sequences ("`r`n") as newlines (to separate its output lines) - it is a behavior it shares with sfc.exe.
When you print to the display, this anomaly doesn't surface, but it does when you capture or redirect the output, such as via Tee-Object.
The workaround is to eliminate every other line, which discards the extra lines that result from PowerShell interpreting a CR ("`r") by itself as a newline too:
$i = 0
db2 ... | Where-Object { ++$i % 2 } | Tee-Object test.out
Update: You've since provided a convenient wrapper function based on this solution in your own answer.
for other Db2 DBAs out there trying to use powershell as me..
I have created this small hack, to handle this for all my db2 ps sessions.
Edit your powershell user profile, creating an function and alias as above:
$Home[My ]Documents\PowerShell\Microsoft.PowerShell_profile.ps1 :
# db2 settings for powershell
Set-Item -Path env:DB2CLP -value "**$$**"
# Handle db2 output, avoiding doubled lines due 'CRCRLF' pattern
Function Handle-Db2 {
$i = 0
db2 $args | Where-Object { ++$i % 2 }
}
New-Alias -Name "db2ps" Handle-Db2
Now, if you want to use the hacked version, instead of calling db2 .... you can use db2ps ... instead and have a proper output.
# db2ps describe table employee | Tee-Object employee.out
Data type Column
Column name schema Data type name Length Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
EMPNO SYSIBM CHARACTER 6 0 No
FIRSTNME SYSIBM VARCHAR 12 0 No
MIDINIT SYSIBM CHARACTER 1 0 Yes
LASTNAME SYSIBM VARCHAR 15 0 No
WORKDEPT SYSIBM CHARACTER 3 0 Yes
PHONENO SYSIBM CHARACTER 4 0 Yes
HIREDATE SYSIBM DATE 4 0 Yes
JOB SYSIBM CHARACTER 8 0 Yes
EDLEVEL SYSIBM SMALLINT 2 0 No
SEX SYSIBM CHARACTER 1 0 Yes
BIRTHDATE SYSIBM DATE 4 0 Yes
SALARY SYSIBM DECIMAL 9 2 Yes
BONUS SYSIBM DECIMAL 9 2 Yes
COMM SYSIBM DECIMAL 9 2 Yes
14 record(s) selected.
# db2ps describe table employee | Select-String "DEC"
SALARY SYSIBM DECIMAL 9 2 Yes
BONUS SYSIBM DECIMAL 9 2 Yes
COMM SYSIBM DECIMAL 9 2 Yes
It would be nice if IBM fix this odd CRCRLF behavior on db2 commands on windows.
Until this not happens, enjoy!
Regards
I'm trying to create a minimal C example on a boot sector for educational purposes.
However, I noticed that my example was not being recognized as a boot sector because he magic 0x55aa bytes were not present as the 511th and 512th bytes.
Then, I investigated further, and it seems that this is because the .eh_frame section was getting included in the image, even though it was never mentioned in the linker script.
Why is that?
The exact setup is present here and reproduced below
build.sh:
as -ggdb3 -o entry.o entry.S
gcc -c -ggdb3 -nostartfiles -nostdlib -o main.o main.c
ld -o main.elf -T linker.ld entry.o main.o
ld --oformat binary -o main.img -T linker.ld entry.o main.o
qemu-system-x86_64 -hda main.img
linker.ld
ENTRY(mystart)
SECTIONS
{
.text : {
entry.o(.text)
*(.text)
*(.rodata)
*(.data)
/**(.eh_frame)*/
. = 0x1FE;
SHORT(0xAA55)
}
/* Reserve 16 MiB of stack. */
__stack_bottom = .;
. = . + 0x1000000;
__stack_top = .;
}
entry.S:
.text
.global mystart
mystart:
mov %rsp, __stack_top
call main
jmp .
main.c:
void main(void) {
while (1);
}
If I uncomment the .eh_frame frame above, then it gets included at the specified location, and things work, although it is not ideal and I would rather ignore that section completely.
But why does it try to include the .eh_frame in the final image if I never mention it?
I found out about .eh_frame by doing:
hd main.img
which gives:
00000000 14 00 00 00 00 00 00 00 01 7a 52 00 01 78 10 01 |.........zR..x..|
00000010 1b 0c 07 08 90 01 00 00 1c 00 00 00 1c 00 00 00 |................|
00000020 27 00 00 00 06 00 00 00 00 41 0e 10 86 02 43 0d |'........A....C.|
00000030 06 00 00 00 00 00 00 00 48 89 24 25 38 02 00 01 |........H.$%8...|
00000040 e8 02 00 00 00 eb fe 55 48 89 e5 eb fe 66 2e 0f |.......UH....f..|
00000050 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
00000060 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
00000070 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
00000080 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
00000090 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f |...f.........f..|
000000a0 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
000000b0 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
000000c0 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
000000d0 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
000000e0 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f |...f.........f..|
000000f0 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
00000100 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
00000110 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
00000120 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
00000130 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f |...f.........f..|
00000140 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
00000150 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
00000160 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
00000170 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
00000180 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f |...f.........f..|
00000190 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
000001a0 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
000001b0 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
000001c0 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
000001d0 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f |...f.........f..|
000001e0 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 |.......f........|
000001f0 00 66 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 |.f.........f....|
00000200 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 |.....f.........f|
00000210 2e 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 |.........f......|
00000220 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 0f 1f |...f.........f..|
00000230 84 00 00 00 00 00 55 aa |......U.|
00000238
and then:
objdump -D main.o
which contains:
Disassembly of section .eh_frame:
0000000000000000 <.eh_frame>:
0: 14 00 adc $0x0,%al
2: 00 00 add %al,(%rax)
4: 00 00 add %al,(%rax)
6: 00 00 add %al,(%rax)
8: 01 7a 52 add %edi,0x52(%rdx)
b: 00 01 add %al,(%rcx)
d: 78 10 js 1f <.eh_frame+0x1f>
f: 01 1b add %ebx,(%rbx)
11: 0c 07 or $0x7,%al
13: 08 90 01 00 00 1c or %dl,0x1c000001(%rax)
19: 00 00 add %al,(%rax)
1b: 00 1c 00 add %bl,(%rax,%rax,1)
1e: 00 00 add %al,(%rax)
20: 00 00 add %al,(%rax)
22: 00 00 add %al,(%rax)
24: 06 (bad)
25: 00 00 add %al,(%rax)
27: 00 00 add %al,(%rax)
29: 41 0e rex.B (bad)
2b: 10 86 02 43 0d 06 adc %al,0x60d4302(%rsi)
31: 00 00 add %al,(%rax)
33: 00 00 add %al,(%rax)
35: 00 00 add %al,(%rax)
...
so we can see that in the hd main.img, the first bytes are exactly the same as those in .eh_frame, and the image size is 512 + sizeof(.eh_frame) instead of the expected 512.
Tested on Ubuntu 18.04, GCC 7.3.0, binutils 2.30.
I am using VS Code 1.24.0 on macOS to edit YAML files that are saved to an NFS share (published on a QNAP NAS) and used by an Ubuntu 18 linux system.
When saving the YAML file VS Code often inserts a bunch of non-printable control characters which causes an error parsing the YAML. To fix it I need to open the file with vim and remove them.
00000110 20 73 65 72 76 65 72 3a 20 4e 41 53 31 0a 20 20 | server: NAS1. |
00000120 70 65 72 73 69 73 74 65 6e 74 56 6f 6c 75 6d 65 |persistentVolume|
00000130 52 65 63 6c 61 69 6d 50 6f 6c 69 63 79 3a 20 52 |ReclaimPolicy: R|
00000140 65 74 61 69 6e 00 00 00 00 00 00 00 00 00 00 00 |etain...........|
00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000270 00 00 00 00 00 00 00 00 00 00 00 00 00 |.............|
0000027d
Note 1: It never happens if I use VS Code on the linux system and edit the files locally; but I need to use this as a headless server so this is not how I want to work.
Note 2: This appears to be a similar issue to that raised here some time ago - but no solution is available.
In my attempt to get "Steam for Linux" working on Debian, I've run into an issue. libcef (Chromium Embedded Framework) works fine with GLIBC_2.13 (which eglibc on Debian testing can provide), but requires one pesky little extra function from GLIBC_2.15 (which eglibc can't provide):
$ readelf -s libcef.so | grep -E "#GLIBC_2\.1[4567]"
1037: 00000000 0 FUNC GLOBAL DEFAULT UND __fdelt_chk#GLIBC_2.15 (49)
2733: 00000000 0 FUNC GLOBAL DEFAULT UND __fdelt_chk##GLIBC_2.15
My plan of attack here was to LD_PRELOAD a shim library that provides just these functions. This doesn't seem to work. I really want to avoid installing GLIBC_2.17 (since it is in Debian experimental; even Debian sid still has GLIBC_2.13).
This is what I've tried.
fdelt_chk.c is basically stolen from the GNU C library:
#include <sys/select.h>
# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
# define _strong_alias(name, aliasname) \
extern __typeof (name) aliasname __attribute__ ((alias (#name)));
unsigned long int
__fdelt_chk (unsigned long int d)
{
if (d >= FD_SETSIZE)
__chk_fail ();
return d / __NFDBITS;
}
strong_alias (__fdelt_chk, __fdelt_warn)
My Versions script looks as follows:
GLIBC_2.15 {
__fdelt_chk; __fdelt_warn;
};
I then build the library as follows:
$ gcc -m32 -c -fPIC fdelt_chk.c -o fdelt_chk.o
$ gcc -m32 -shared -nostartfiles -Wl,-s -Wl,--version-script Versions -o fdelt_chk.so fdelt_chk.o
However, if I then run Steam (with a bunch of extra stuff to get it working in the first place), the loader still refuses to find the symbol:
% LD_LIBRARY_PATH="/home/tinctorius/.local/share/Steam/ubuntu12_32" LD_PRELOAD=./fdelt_chk.so:./steamui.so ./steam
./steam: /lib/i386-linux-gnu/i686/cmov/libc.so.6: version `GLIBC_2.15' not found (required by /home/tinctorius/.local/share/Steam/ubuntu12_32/libcef.so)
However, the version symbol is also provided by the .so I just built:
% readelf -s fdelt_chk.so
Symbol table '.dynsym' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FUNC GLOBAL DEFAULT UND __chk_fail#GLIBC_2.3.4 (3)
2: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS _edata
3: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS _end
4: 00000310 44 FUNC GLOBAL DEFAULT 11 __fdelt_warn##GLIBC_2.15
5: 00000310 44 FUNC GLOBAL DEFAULT 11 __fdelt_chk##GLIBC_2.15
6: 00000000 0 OBJECT GLOBAL DEFAULT ABS GLIBC_2.15
7: 0000146c 0 NOTYPE GLOBAL DEFAULT ABS __bss_start
At this point, I don't know what I can do to trick the loader (who?) into choosing my symbols. Am I going in the right direction at all?
I ran into this same problem, though not with Steam. What I was trying to run wanted 2.15 for fdelt_chk while my system had 2.14. I found a solution for simple cases like ours where we can easily provide our own implementation for the missing functionality.
I started out from your attempted solution of implementing the functionality and LD_PRELOADing it. Using LD_DEBUG=all (as suggested by osgx) showed that the linker was still looking for 2.15, so just having the right symbol wasn't enough and there was some other versioning mechanism somewhere. I noticed that objdump -p and readelf -V both showed references to 2.15, so I looked up documentation on ELF and found information on version requirements.
So my new goal was to transform references to 2.15 into references to something else. It seemed reasonable that I could just overwrite structures that referred to 2.15 with the structures that referred to some lower version, like 2.1. In the end, after some trial and error, I found just editing the right Elfxx_Vernaux(es?) in .gnu.version_r was sufficient, but caveat hacker, I guess.
The .gnu.version_r section is a list of 16-byte Elfxx_Verneeds and 16-byte Elfxx_Vernauxes. Each Elfxx_Verneed entry is followed by the associated Elfxx_Vernauxes. As far as I could tell, vn_file is actually how many associated Elfxx_Vernauxes there are, even though the docs say number of associated verneed array entries. It might just be a misunderstanding on my part, though.
So, to start off making the edits, let's look at some of the info from readelf -V. I snipped out parts we don't care about.
$ readelf -V mybinary
<snip stuff before .gnu.version_r>
Version needs section '.gnu.version_r' contains 5 entries:
Addr: 0x00000000000021ac Offset: 0x0021ac Link: 4 (.dynstr)
<snip libraries that don't refer to GLIBC_2.15>
0x00c0: Version: 1 File: libc.so.6 Cnt: 10
0x00d0: Name: GLIBC_2.3 Flags: none Version: 19
0x00e0: Name: GLIBC_2.7 Flags: none Version: 16
0x00f0: Name: GLIBC_2.2 Flags: none Version: 15
0x0100: Name: GLIBC_2.2.4 Flags: none Version: 14
0x0110: Name: GLIBC_2.1.3 Flags: none Version: 13
0x0120: Name: GLIBC_2.15 Flags: none Version: 12
0x0130: Name: GLIBC_2.4 Flags: none Version: 10
0x0140: Name: GLIBC_2.1 Flags: none Version: 9
0x0150: Name: GLIBC_2.3.4 Flags: none Version: 4
0x0160: Name: GLIBC_2.0 Flags: none Version: 2
From this we see that the section starts at 0x21ac. Each file listed will have a Elfxx_Verneed followed by an Elfxx_Vernaux for each of the subentries (like GLIBC_2.3). I assume the order of the info in the output will always match the order in the file since readelf is just dumping the structures. Here's my entire .gnu.version_r section.
000021A0 01 00 02 00
000021B0 A3 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
000021C0 00 00 11 00 32 0D 00 00 10 00 00 00 10 69 69 0D
000021D0 00 00 0B 00 3C 0D 00 00 00 00 00 00 01 00 02 00
000021E0 BE 0C 00 00 10 00 00 00 30 00 00 00 13 69 69 0D
000021F0 00 00 08 00 46 0D 00 00 10 00 00 00 10 69 69 0D
00002200 00 00 07 00 3C 0D 00 00 00 00 00 00 01 00 02 00
00002210 99 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
00002220 00 00 06 00 32 0D 00 00 10 00 00 00 10 69 69 0D
00002230 00 00 05 00 3C 0D 00 00 00 00 00 00 01 00 02 00
00002240 AE 0C 00 00 10 00 00 00 30 00 00 00 11 69 69 0D
00002250 00 00 12 00 32 0D 00 00 10 00 00 00 10 69 69 0D
00002260 00 00 03 00 3C 0D 00 00 00 00 00 00 01 00 0A 00
00002270 FF 0C 00 00 10 00 00 00 00 00 00 00 13 69 69 0D
00002280 00 00 13 00 46 0D 00 00 10 00 00 00 17 69 69 0D
00002290 00 00 10 00 50 0D 00 00 10 00 00 00 12 69 69 0D
000022A0 00 00 0F 00 5A 0D 00 00 10 00 00 00 74 1A 69 09
000022B0 00 00 0E 00 64 0D 00 00 10 00 00 00 73 1F 69 09
000022C0 00 00 0D 00 70 0D 00 00 10 00 00 00 95 91 96 06
000022D0 00 00 0C 00 7C 0D 00 00 10 00 00 00 14 69 69 0D
000022E0 00 00 0A 00 87 0D 00 00 10 00 00 00 11 69 69 0D
000022F0 00 00 09 00 32 0D 00 00 10 00 00 00 74 19 69 09
00002300 00 00 04 00 91 0D 00 00 10 00 00 00 10 69 69 0D
00002310 00 00 02 00 3C 0D 00 00 00 00 00 00
To briefly talk about the structure here, it starts out with an Elfxx_Verneed. As per the docs, we can see there will be 2 Elfxx_Vernauxes, one offset 16 bytes, and the next Elfxx_Verneed is offset 48 bytes. These offsets are from the start of the current structure. It looks like technically the associated Elfxx_Vernauxes might not be adjacent after the current Elfxx_Verneed but it was actually so in all the files I poked around in.
From this we can find the file we want (libc.so.6) in a few different ways. Cross reference the string (which I won't get into), find the Elfxx_Verneed with a count of 0A 00 (10, matching our readelf output above), or find the last Elfxx_Verneed since it's the last one readelf output. In any case, the right one for my file is at 0x226C. Its first Elfxx_Vernaux starts at 0x227C.
We want to find the Elfxx_Vernaux with a version of 0C 00 (12, again matching our readelf output above). We see the Elfxx_Vernaux that matches is at 0x22CC and the entire structure is 95 91 96 06 00 00 0C 00 7C 0D 00 00 10 00 00 00. We'll be overwriting the first 12 bytes so as to leave the offset alone. We're only modifying the data, not moving around the structures, after all.
To pick the data to overwrite with, we just copy it from a different Elfxx_Vernaux for a version of glibc we can satisfy. I picked one for 2.1, which is at 0x22EC in my file, with the data 11 69 69 0D 00 00 09 00 32 0D 00 00 10 00 00 00. So take the first 12 bytes from this and overwrite the first 12 bytes above, and that's it for the hex editing.
Of course, you might have multiple references to deal with. Your program might have multiple binaries to edit.
At this point, our program still won't run. But instead of being told something like GLIBC_2.15 not found it should complain about missing __fdelt_chk. Now we do the shim and LD_PRELOADing described in the question, except instead of versioning our implementation as 2.15, we use the version we picked while hex editing. At this point the program should run.
This method depends on being able to provide an implementation for whatever's missing. Our __fdelt_chk is extremely simple but I don't doubt that in some cases providing an implementation could be more difficult than just upgrading the system's libc instead.
For what it's worth, the __fdelt_chk function is related to the FORTIFY_SOURCE feature which was added in glibc 2.15. It enables compile-time and run-time checking for buffer overflows.
If you were able to recompile with the following CFLAGS added, it would build a backwards compatible binary without the extra checking:
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=0
I write the following bytes into a file named disk.img
FA 8D 36 1B 7C E8 01 00 F4 AC 3C 00 74 0C B4 0E
BB 07 00 B9 01 00 CD 10 EB EF C3 4D 61 79 20 74
68 65 20 66 6F 72 63 65 20 62 65 20 77 69 74 68
20 79 6F 75 21 0D 0A 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
..enough zero to make the size of file 512bytes.
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA
The above bytes are proper instructions and magic number that should work when loading into the boot sector. But after I executed "qemu-X86_64 disk.img", error happens.
Error -13 while loading disk.img
Does anyone know how to solve the problem or what is the reason that might lead to this error?
Thank you!
I don't know if you can fill an image with just anything and expect it to work just because you have 55 AA in the correct place. Since you seem to be writing a bootloader make sure your code thinks it is executing at the correct place. It should be in offset 0x7C00 (if I remember this correctly, double check that). You set it by writing the line [org 0x7C00] at the top of your assembly file.
Also I'm not sure you can have only a 512 byte file. Try to make the disk image bigger than that using something like dd if=/dev/zero of=disk.img bs=512 count=2000 and then just copy your bootloader to the first part of the disk using dd again.
Also, you should use the -hda or -fda tags, so it would be qemu -hda disk.img. -hda means hard drive image, and -fda means floppy disk image.