Calculating jmp's from one segment to another in windows PE files - x86-64

Assume I have a binary on my disk that I load into memory using VirtualAlloc and ReadFile.
If I want to follow a jmp instruction from one section to another, what do I need to add/subtract to get the destination address.
In other words, I want to know how IDA calculates the loc_140845BB8 from jmp loc_140845BB8
Example:
.text:000000014005D74E jmp loc_140845BB8
Jumps to the section seg007
seg007:0000000140845BB8 ; seg007:0000000140845BC4↓j
seg007:0000000140845BB8 and rbx, r14
PE info (seg007 is the section named "")

Segments are arbitary, it jumps where it jumps, without regard for segments. Jump location is calculated as the signed 32-bit value following the 0xE9 JMP opcode, added to the the address of where the next instruction would be (i.e. the location of JMP + 5 bytes).
def GetInsnLen(ea):
insn = ida_ua.insn_t()
return ida_ua.decode_insn(insn, ea)
def MakeSigned(number, size):
number = number & (1<<size) - 1
return number if number < 1<<size - 1 else - (1<<size) - (~number + 1)
def GetRawJumpTarget(ea):
if ea is None:
return None
insnlen = GetInsnLen(ea)
if not insnlen:
return None
result = MakeSigned(idc.get_wide_dword(ea + insnlen - 4), 32) + ea + insnlen
if ida_ida.cvar.inf.min_ea <= result < ida_ida.cvar.inf.max_ea:
return result
return None

Related

Is it possible to disable JVM JIT loop optimisation

I have pretty trivial scala code:
def main(): Int = {
var i: Int = 0
var limit = 0
while (limit < 1000000000) {
i = inc(i)
limit = limit + 1
}
i
}
def inc(i: Int): Int = i + 1
I am playing with JVM JIT method inlining on inc method. When inlining is enabled I get surprisingly good examples 2s vs 4ns - what I would like to make sure or at least validate is that no loop optimisation took palace in the mean time. I took look at machine code which seems ok
0x000000010b22a4d6: mov 0x8(%rbp),%r10d ; implicit exception: dispatches to 0x000000010b22a529
0x000000010b22a4da: cmp $0xf8033d43,%r10d ; {metadata('IncWhile$')}
0x000000010b22a4e1: jne L0001 ;*iload_3
; - IncWhile$::main#4 (line 7)
0x000000010b22a4e3: cmp $0x3b9aca00,%ebx
0x000000010b22a4e9: jge L0000 ;*if_icmpge
; - IncWhile$::main#7 (line 7)
0x000000010b22a4eb: sub %ebx,%r14d
0x000000010b22a4ee: add $0x3b9aca00,%r14d ;*iadd
; - IncWhile$::inc#2 (line 16)
; - IncWhile$::main#12 (line 9)
0x000000010b22a4f5: mov $0x3b9aca00,%ebx ;*if_icmpge
; - IncWhile$::main#7 (line 7)
L0000: mov $0xffffff65,%esi
I also checked flight recorder and didn't find anything suspicious but as I am not regular user I would like to double check with someone more experienced.
Code can be found on github
Of course, the loop has been optimized. No regular CPU can execute
1 billion operations in just a few nanoseconds.
There are many loop optimizations in HotSpot - do you want to disable all of them? E.g.
-XX:LoopUnrollLimit=0
-XX:+UseCountedLoopSafepoints
-XX:-UseLoopPredicate
-XX:-PartialPeelLoop
-XX:-LoopUnswitching
-XX:-LoopLimitCheck
etc. To disable most of loop optimizations, use
-XX:LoopOptsCount=0

How do I print a string in one line in MARIE?

I want to print a set of letters in one line in MARIE. I modified the code to print Hello World and came up with:
ORG 0 / implemented using "do while" loop
WHILE, LOAD STR_BASE / load str_base into ac
ADD ITR / add index to str_base
STORE INDEX / store (str_base + index) into ac
CLEAR / set ac to zero
ADDI INDEX / get the value at ADDR
SKIPCOND 400 / SKIP if ADDR = 0 (or null char)
JUMP DO / jump to DO
JUMP PRINT / JUMP to END
DO, STORE TEMP / output value at ADDR
LOAD ITR / load iterator into ac
ADD ONE / increment iterator by one
STORE ITR / store ac in iterator
JUMP WHILE / jump to while
PRINT, SUBT ONE
SKIPCOND 000
JUMP PR
HALT
PR, OUTPUT
JUMP WHILE
ONE, DEC 1
ITR, DEC 0
INDEX, HEX 0
STR_BASE, HEX 12 / memory location of str
STR, HEX 48 / H
HEX 65 / E
HEX 6C / L
HEX 6C / L
HEX 6F / O
HEX 0 / carriage return
HEX 57 / W
HEX 6F / O
HEX 72 / R
HEX 6C / L
HEX 64 / D
HEX 0 / NULL char
My program ends up halting past two iterations. I can't seem to figure out how to print a set of characters in one line. Thanks.
Your value of STR_BASE is almost certainly incorrect. Based on what is here I would say it needs to be 18 instead of 12. Also you would either want to remove current null char that is between "HELLO" and "WORLD" and replace it with a space or simply remove that line, depending on your intended output.

2 bit branch predictor with two for loops

I got a 2 bit branch predictor, my starting state is weakly taken and I need to calculate the prediction accuracy:
for (int i=0; i < 100; i++)
{
for (int j=0; j < 50; j++)
{
...
}
}
So with i = 0 we take the branch, so we are at i = 0 and j = 0 and set our predictor to strongly taken, right ? So if we iterate j now, does that mean we are not taking a new branch ? As we are still in the i = 0 branch, or does every iteration count as a new branch ?
Let's manually compile it into x86 assembly first for better understanding (any other would do to):
mov ebx, 0 // this is our var i
.L0:
# /------------ inner loop start -----------\
mov eax, 0 // this is our var j
.L1:
// ...
add eax, 1
cmp eax, 50
jl .L1 // jump one
# \------------ inner loop end -------------/
add ebx, 1
cmp ebx, 100
jl .L0 // jump two
I think this code is pretty straight forward even if your not familiar with assembly:
Set ebx to 0
jump two gets back here
Set eax to 0
jump one gets back here
Execute our loop code // ...
add 1 to eax
compare eax to 50 (this sets some bits in a flag register)
jump to label .L1: if eax wasn't 50
add 1 to ebx
compare ebx to 50 (this sets some bits in a flag register)
jump to label .L0: if ebx wasn't 100
End of the loops
So on the first iteration we arrive at jump one and predict it will be taken. Since eax < 50 we take it and update it to strongly taken. Now we do this another 48 times. On the 50 iteration we don't jump because eax == 50. This is a single misprediction and we update to weakly taken.
Now we arrive at jump two for the first time. since ebx < 100 we take it and update it to strongly taken. Now we start all over with that inner loop by jumping to L0. We do this another 98 times. On the 100 iteration of the inner loop we don't jump because ebx == 100. This is a single misprediction and we update to weakly taken.
So we execute the innerloop 100 times with a single misprediction each for a total of 100 mispredictions for jump one and 100 * 49 = 4900 correct predictions. The outer loop is executed only once and has only 1 misprediction and 99 correct predictions.

How to skip a line from execution in windbg everytime it hits?

Suppose I want to skip line 3 of function func everytime it is called
int func() {
int a = 10, b =20;
a = 25;
b = 30;
return a+b
}
so everytime It should be returning 40 (ie doesn't execute 3rd line a=25)
Is there any similar command in windbg like jmp in gdb?
again a very late answer but if messing with assembly is not preferable
set a conditional breakpoint to skip executing one line
in the example below 401034 is the line you do not want to execute
so set a conditional breakpoint on that line to skip it
bp 401034 "r eip = #$eip + size of current instruction";gc"
7 in this case gc = go from conditionl break
jmptest:\>dir /b
jmptest.c
jmptest:\>type jmptest.c
#include <stdio.h>
int func()
{
int a = 10 , b = 20;
a = 25;
b = 30;
return a+b;
}
int main (void)
{
int i , ret;
for (i= 0; i< 10; i++)
{
ret = func();
printf("we want 40 we get %d\n",ret);
}
return 0;
}
jmptest:\>cl /nologo /Zi jmptest.c
jmptest.c
jmptest:\>dir /b *.exe
jmptest.exe
jmptest:\>cdb -c "uf func;q" jmptest.exe | grep 401
00401020 55 push ebp
00401021 8bec mov ebp,esp
00401023 83ec08 sub esp,8
00401026 c745fc0a000000 mov dword ptr [ebp-4],0Ah
0040102d c745f814000000 mov dword ptr [ebp-8],14h
00401034 c745fc19000000 mov dword ptr [ebp-4],19h
0040103b c745f81e000000 mov dword ptr [ebp-8],1Eh
00401042 8b45fc mov eax,dword ptr [ebp-4]
00401045 0345f8 add eax,dword ptr [ebp-8]
00401048 8be5 mov esp,ebp
0040104a 5d pop ebp
0040104b c3 ret
jmptest:\>cdb -c "bp 401034 \"r eip = 0x40103b;gc\";g;q " jmptest.exe | grep wan
t
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
we want 40 we get 40
jmptest:\>
If you're familiar with assembly, you can use the a command to change the assembly (i.e. turn the opcodes for, "a = 25;" into all NOPs). This is what I typically do when I want to NOP out or otherwise change an instruction stream.
Occasionally people will rely on the fact that the byte code for the NOP instruction is 0x90 and use the e command to edit the memory (e.g. "ew #eip 0x9090"). This is the same result as using the a command.
Lastly, if you're hitting this operation infrequently and just want to manually skip the instruction you can use the, "Set Current Instruction" GUI operation:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542851(v=vs.85).aspx
There is a tutorial here that explains how to do this, you can set the offset so that it skips the line: http://cfc.kizzx2.com/index.php/tutorial-using-windbg-to-bypass-specific-functions-windbg-kung-fu-series/ and set the register eip to this value.
Also, you can set the breakpoint and put the command into the breakpoint to do the same: http://japrogbits.blogspot.co.uk/2010/01/using-breakpoints-to-skip-function-in.html and another blog: http://www.shcherbyna.com/?p=1234 and also you can use the .call to achieve the same: http://blogs.msdn.com/b/oldnewthing/archive/2007/04/27/2292037.aspx

Need help identifying and computing a number representation

I need help identifying the following number format.
For example, the following number format in MIB:
0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]
It seems that if the MSB is 1, it means another character follows it. And if it is 0, it is the end of the number.
So the value 2680 is [001 0100] [111 1000], formatted properly is [0000 1010] [0111 1000]
What is this number format called and what's a good way for computing this besides bit manipulation and shifting to a larger unsigned integer?
I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity
This is stored big-endian (most significant byte first), as opposed to the C# BinaryReader.Read7BitEncodedInt method described at Encoding an integer in 7-bit format of C# BinaryReader.ReadString
I am not aware of any method of decoding other than bit manipulation.
Sample PHP code can be found at
http://php.net/manual/en/function.intval.php#62613
or in Python I would do something like
def encode_7bhm(i):
o = [ chr(i & 0x7f) ]
i /= 128
while i > 0:
o.insert(0, chr(0x80 | (i & 0x7f)))
i /= 128
return ''.join(o)
def decode_7bhm(s):
o = 0
for i in range(len(s)):
v = ord(s[i])
o = 128*o + (v & 0x7f)
if v & 0x80 == 0:
# found end of encoded value
break
else:
# out of string, and end not found - error!
raise TypeError
return o