While debugging a process with lldb I get to this assembly instruction:
-> 0x7ffff79c5187 <+7>: movq %fs:(%r14), %r14
Or in Intel syntax:
-> 0x7ffff79c5187 <+7>: mov r14, qword ptr fs:[r14]
Contents of the registers:
(lldb) register read fs
fs = 0x0000000000000000
(lldb) register read r14
r14 = 0xffffffffffffff08
I don't know how to calculate which address is being accessed here (simply reading 0xffffffffffffff08 fails), so I would like to use the same addressing mode in lldb to get the accessed address and then set a watchpoint on it.
I tried many address expressions, but they are all apparently invalid.
Here are some of the ways I tried to read that memory:
x "$fs:($r14)"
x "%fs:(%r14)"
x "fs:(r14)"
x $rs
x $fs
memory read %fs:(%r14)
memory read 'qword ptr $fs:[$r14]'
memory read '%fs:(%r14)'
memory read '$fs:($r14)'
memory read '*(int **)$fs:($r14)'
memory read '*(int *)$fs:($r14)'
But I always get
error: invalid start address expression.
error: address expression <my address expression> evaluation failed
Friends,
I have a small problem. I'm trying to create a Delphi DLL with a form in RAD Studio, but I don't know how to make it load from DllMain. I want to inject this DLL into a third-party process at runtime afterwards.
I created the DLL project with the form without problems, but I can't find anything good on "how to load it with DllMain", or at least the tutorials I found didn't help me (or I'm just dumb).
Can someone help me? Give me a hint or a site/video where I can learn it?
I really appreciate your time, guys! =)
You could use inline assembly to copy the DllMain arguments from the ebp-based stack frame into some variables. Here is an example:
library Project1;

uses
  System.SysUtils,
  Windows,
  System.Classes;

var
  hInstDLL: THandle;
  fdwReason: DWORD;
  lpReserved: DWORD;

begin
  asm
    push eax;               // Save the current eax
    mov eax, [ebp+$8]       // Put into eax the first argument of the current function (DllMain)
    mov [hInstDLL], eax;    // Put into hInstDLL this argument
    mov eax, [ebp+$c]       // Load into eax the second argument
    mov [fdwReason], eax;   // Save to fdwReason
    mov eax, [ebp+$10]      // Put into eax the last argument
    mov [lpReserved], eax;  // Put into lpReserved (unnecessary)
    pop eax;                // Restore the original eax value
  end;
  if fdwReason = 1 {DLL_PROCESS_ATTACH} then
  begin
    // Do your stuff;
  end;
end.
The example Reset Handler code provided by STMicro for STM32 (in my case it is for STM32H753) is the following:
Reset_Handler:
ldr sp, =_estack
movs r1, #0
b LoopCopyDataInit
...
I don't understand the purpose of the first instruction, which sets the stack pointer.
Indeed the Vector Table is defined as follows:
This means that the Stack Pointer is set by the CPU from the first word in the Vector Table. This is confirmed by debug (when breaking before executing the very first instruction of the Reset Handler, the SP is set properly).
Is there a reason to keep this instruction ldr sp, =_estack in the Reset Handler?
The first entry of the vector table holds the initial stack address. But the programmer might want to set a different value, or set up the two-stack (MSP/PSP) arrangement.
In the linker script you have:
_estack = address ;
and in the very simple startup file:
g_pfnVectors:
.word _estack
.word Reset_Handler
But you can change those values so that they differ, or Reset_Handler may be called from a bootloader. In those cases you need to set the stack pointer to the correct value yourself.
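The bootloader case can be illustrated with a small host-side sketch. It only models the first two words of a vector table and checks whether the stack pointer a bootloader leaves behind matches the application's own initial value. The names vector_head, read_vector_head and sp_needs_reload are hypothetical, and the addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* First two words of a Cortex-M vector table. */
struct vector_head {
    uint32_t initial_sp;     /* word 0: loaded into SP by the core on reset */
    uint32_t reset_handler;  /* word 1: address of Reset_Handler */
};

/* Hypothetical helper: read the two leading words of a vector table image. */
struct vector_head read_vector_head(const uint32_t *table)
{
    assert(table != 0);
    struct vector_head h = { table[0], table[1] };
    return h;
}

/* Returns 1 when the SP currently in use differs from the value in the
   table, i.e. when the entry code would have to load SP itself. */
int sp_needs_reload(const uint32_t *table, uint32_t current_sp)
{
    return read_vector_head(table).initial_sp != current_sp;
}
```

After a real reset the core has already loaded word 0, so sp_needs_reload would report 0. If a bootloader jumps to Reset_Handler while still running on its own stack, for example with SP at 0x2001FF00 while the application table says 0x20020000, it reports 1, which is the situation where ldr sp, =_estack matters.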
I am writing an operating system and am working on context switching.
I can switch from the kernel into a user program and back, but the SVC call does not seem to work properly.
syscall:
svc SYSCALL_SVC_NUMBER
bx lr
When svc is called it triggers the exception, and I can see the control flow go back to the kernel. The hard fault arises when it gets back to the user program,
around here:
--> bx lr
I've checked that all the registers are correctly loaded, except that xPSR lacks the Thumb bit. That is why the hard fault occurs.
But I have no idea why xPSR is cleared to zero...
.global activate
activate:
/* save kernel state in ip register */
mrs ip, psr
push {r4, r5, r6, r7, r8, r9, r10, r11, ip, lr}
/* switch to process stack */
msr psp, r0
mov ip, #2
msr control, ip
ldr ip, [sp, #0x38]
msr psr_nzcvq, ip
/* load user state */
pop {r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, lr}
add sp, #0x8
ldr ip, [sp, #-0x8]
/* this line can branch correctly */
bx ip
Ahh, yes, you are right, but you don't need to modify anything at the same time.
Using this code, for example:
mov r1,#0x22
mov r2,#0x33
mov r3,#0x44
mov r0,#0x55
mov r12,r0
mov r0,#0x66
mov r14,r0
mov r0,#0x11
svc #1
The stack looks like this when it hits the svc handler (I am fairly certain this layout is documented):
20000FD4 FFFFFFF9 return address
20000FD8 00000011 r0
20000FDC 00000022 r1
20000FE0 00000033 r2
20000FE4 00000044 r3
20000FE8 00000055 r12
20000FEC 00000066 r14
20000FF0 0100009E r15
20000FF4 21000000 xPSR
On an exception, a Cortex-M core stacks registers in hardware so that the handler can be a plain C function that follows the calling convention. Understand that the program counter itself has an lsbit of 0. The lsbit of 1 is for BX, BLX and POP: those instructions use it to determine ARM or Thumb mode, then the bit is stripped and the rest is used as the PC.
The return from an SVC can/should look like the above. If you want to use svc or any other interrupt to do a context switch, you need to build the stack to match.
There is a bit of a chicken-and-egg problem, of course: for each thread you build a stack image like the above
20000FD4 FFFFFFF9
20000FD8 00000011
20000FDC 00000022
20000FE0 00000033
20000FE4 00000044
20000FE8 00000055
20000FEC 00000066
20000FF0 01001234
20000FF4 21000000
but you can put don't-cares in the registers, except for the pc, which you set to the entry point of your thread with the lsbit not set. You also need some structure where you keep the state of the other registers.
Then, when you context switch, you save the registers other than the above in a structure somewhere, which includes the sp. You then fill in those registers from the next thread, including its sp, and bx lr to return.
There is a little more to it than that; see atomthreads or chibios or another open-source, functional OS.
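Building such an initial stack image could be sketched in C as follows. This is a minimal sketch, not taken from any particular OS: init_thread_frame and the FR_* indices are hypothetical names, addresses are passed as plain 32-bit values so the code also compiles on a host, and it only covers the eight words from r0 to xPSR. It assumes the 0xFFFFFFF9 word at the lowest address in the dump is an exception-return value that is saved separately.

```c
#include <assert.h>
#include <stdint.h>

#define XPSR_THUMB_BIT 0x01000000u  /* T bit (bit 24) of xPSR */

/* Word offsets inside the stacked frame, lowest address first. */
enum { FR_R0, FR_R1, FR_R2, FR_R3, FR_R12, FR_LR, FR_PC, FR_XPSR, FR_WORDS };

/* Hypothetical helper: carve an initial frame just below stack_top and
   return the stack pointer value to store for the new thread. */
uint32_t *init_thread_frame(uint32_t *stack_top, uint32_t entry,
                            uint32_t thread_exit)
{
    assert(stack_top != 0);
    uint32_t *sp = stack_top - FR_WORDS;

    sp[FR_R0] = sp[FR_R1] = sp[FR_R2] = sp[FR_R3] = 0;  /* don't-cares */
    sp[FR_R12]  = 0;                 /* don't-care */
    sp[FR_LR]   = thread_exit;       /* where the thread function returns to */
    sp[FR_PC]   = entry & ~1u;       /* entry point with the lsbit cleared */
    sp[FR_XPSR] = XPSR_THUMB_BIT;    /* Thumb bit set, flags cleared */
    return sp;
}
```

The xPSR value is the part that relates to the hard fault in the question: if that word is stacked as zero, the Thumb bit is clear when the core unstacks the frame.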
You are correct that the address that was interrupted (in this case the address after the svc), i.e. the program counter, is on the stack without the lsbit set, and that is how it should be. The actual lr used to return from the exception (svc or timer interrupt or whatever) is the special "exception return" value 0xFFFFFFxx, which has the lsbit set.
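As a rough illustration of what the low bits of that exception-return value mean, here is a small C sketch. The function names are hypothetical, and it assumes the basic ARMv7-M encoding in which bit 3 selects a return to thread mode and bit 2 selects the process stack.

```c
#include <assert.h>
#include <stdint.h>

/* True when lr holds an exception-return value of the form 0xFFFFFFxx. */
int is_exc_return(uint32_t lr)
{
    return (lr & 0xFFFFFF00u) == 0xFFFFFF00u;
}

/* Bit 3: 1 = return to thread mode, 0 = return to handler mode. */
int returns_to_thread(uint32_t lr)
{
    assert(is_exc_return(lr));
    return (int)((lr >> 3) & 1u);
}

/* Bit 2: 1 = unstack from the process stack (PSP), 0 = main stack (MSP). */
int returns_on_psp(uint32_t lr)
{
    assert(is_exc_return(lr));
    return (int)((lr >> 2) & 1u);
}
```

With these, 0xFFFFFFF9 from the dump decodes as a return to thread mode on the main stack, while 0xFFFFFFFD would be a return to thread mode on the process stack.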
Basically, what I have is a game which has a function MyFunc(), and it's called on 4 places in the game. One of the addresses is 0x10002000, and the bytes are E8 0B 83 01 00.
I'm injecting a DLL, and want to patch that 0xE8 (call) to my own address. When I do it with Cheat Engine's Auto Assembler and write call MYADDRESS, it generates the proper opcode, and proper bytes.
However, if I do it with the DLL, this is what I get:
What I want to achieve is call 74C611CC. So I need to generate the operand bytes so that the instruction is what I want instead of what it currently is (in the screenshot).
I use this kind of code:
*(BYTE*) dwPatchAddr = 0xE8;
*(DWORD*) (dwPatchAddr + 1) = (DWORD) myFunc;
An e8 instruction is a relative call instruction, not absolute. So the next 4 bytes need to be the difference between the pc when processing this instruction and your target function. So what you want is:
*(BYTE *)dwPatchAddr = 0xE8;
*(DWORD *)(dwPatchAddr + 1) = (DWORD)((char *)myFunc - (char *)(dwPatchAddr + 5));
Note that the PC address used to compute the offset is actually the address of the next instruction after the call (what will also be pushed as the return address).
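To make the arithmetic concrete, here is a small standalone C sketch that computes the displacement and the five encoded bytes for a 32-bit target. The helper names call_rel32 and encode_call are hypothetical, and addresses are passed as plain 32-bit integers so that the calculation can be checked without patching anything.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rel32 operand for an E8 call located at patch_addr
   that should reach target. The displacement is measured from the address
   of the next instruction, which is patch_addr + 5. */
uint32_t call_rel32(uint32_t patch_addr, uint32_t target)
{
    return target - (patch_addr + 5u);  /* unsigned wraparound is intended */
}

/* Writes the 5-byte encoding into out: opcode E8, then rel32 little-endian. */
void encode_call(uint8_t out[5], uint32_t patch_addr, uint32_t target)
{
    assert(out != 0);
    uint32_t rel = call_rel32(patch_addr, target);
    out[0] = 0xE8;
    out[1] = (uint8_t)(rel & 0xFFu);
    out[2] = (uint8_t)((rel >> 8) & 0xFFu);
    out[3] = (uint8_t)((rel >> 16) & 0xFFu);
    out[4] = (uint8_t)((rel >> 24) & 0xFFu);
}
```

For the addresses in the question, a call at 0x10002000 to 0x74C611CC gives a displacement of 0x64C5F1C7, so the bytes become E8 C7 F1 C5 64. Going the other way, the original bytes E8 0B 83 01 00 at 0x10002000 encode a call to 0x10002005 + 0x0001830B = 0x1001A310.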