WinDbg: add multiple unresolved breakpoints

I have an unloaded module for which I'd like to add unresolved breakpoints, but I can't get it to work.
I have tried
to use a wildcard as in bm. That doesn't seem to be supported.
bu "RPS32!*"
to explicitly name the methods, but each breakpoint gets assigned ID 0. This only sets the breakpoint for the last one added.
bu "RPS32!RpsConvertBuffer"
bu "RPS32!RpsConvertFile"
to explicitly name the methods and the IDs. The IDs don't seem to stick. Each breakpoint again just redefines ID 0 and only the last one added is actually set.
bu39 "RPS32!RpsConvertBuffer"
bu40 "RPS32!RpsConvertFile"
So my question is actually twofold:
Is it possible to have multiple unresolved breakpoints?
If it's possible, what is wrong with the syntax I'm using?

The chance of myself running into this same issue again is quite high so I'm pretty much answering my own question out of self interest.
Remove the quoting around the methods
Probably this is WinDbg Breakpoint syntax 101 but adding quotes around the method makes WinDbg
use the address of the current instruction to add an unresolved breakpoint
reusing Id 0
and interpreting what's between quotes as a command.
Looking at the breakpoint list, that penny really should have dropped sooner
(1e48.1c10): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000000 ecx=08160000 edx=0012e118 esi=fffffffe edi=00000000
eip=77220ed4 esp=0025f93c ebp=0025f968 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
ntdll!LdrpDoDebuggerBreak+0x2c:
77220ed4 cc int 3
0:000> bu "Unresolved1"
0:000> bu "Unresolved2"
breakpoint 0 redefined
0:000> bl
0 e Disable Clear 77220ed4 0001 (0001) 0:**** ntdll!LdrpDoDebuggerBreak+0x2c "Unresolved2"

Related

Windbg alias replacement broken? Simplified example

Yesterday I asked a question about the behavior of windbg aliases (Strange behavior of windbg alias in loops) and got some helpful answers.
Now I have a simplified example that shows the behavior I am seeing, without any loops. It seems like alias replacement is simply broken, and the documentation about .block {} is basically wrong.
In a foo.windbg script I have the following:
;aS ${/v:foo} 1
al
.block
{
.echo foo
}
I run the script with
$$><foo.windbg
If the alias foo is not defined before running the script (or if it is already defined to 1), this works as expected. However if I already have foo defined to a different value, e.g. 0
;aS ${/v:foo} 0
then when I run the script foo gets set to 1 (I can see that from the al command in the script) but the command .echo foo in the script produces 0, even though the reference to foo is in a .block{}.
It works if the closing curly brace in the block statement is immediately after the reference to foo
;aS ${/v:foo} 1
al
.block
{
.echo foo}
This doesn't help because it means you can't use the alias unless it happens to be at the end of a .block{} or other compound statement. I thought referencing the alias with ${foo} would help but it does not.
From the answers to my previous question I see that deleting (ad) the alias before setting it seems to fix the problem in some cases. Just using
ad foo
will error out the script if foo is not defined so I can't use it. Using
ad *
works but deletes all aliases including ones I have already created and want to use. I tried
.if (${/d:foo}) {ad ${/v:foo}}
but that gives the same behavior, where foo is replaced with the old value when used further down in the script. So I guess the work-around is to start the script with
aS ${/v:foo} dummy
ad ${/v:foo}
which seems to work.
So the basic problem is that alias replacement fails (in some cases) unless the alias name is followed by the closing curly brace of certain statements like .block. All the examples from the windbg documentation just so happen to do exactly this, and/or delete all aliases first and work around the problem that way.
I know I'm beating a dead horse, but does this behavior have an explanation? Am I confused? It seems simply broken and, for the way I want to use aliases, useless.
Thanks,
Dave
It seems you have many questions. It's about deleting aliases, using aliases, running scripts, understanding WinDbg ...
Deleting aliases
One of them is "How to delete an alias without knowing in advance whether it's defined or not?" The answer to that is .block{ad /q ${/v:foo}}.
How to use aliases in your scripts
The next thing I'd advise is: do not use aliases without the alias syntax. This is not so much for WinDbg itself, but rather for the readers of your script and for maintainability.
When I see a line
.echo foo
then I expect it to print "foo" and nothing else. When I see a line
.echo ${foo}
then I know that there is a variable or alias involved.
Conceptual bug with aliases: unable to escape them
Actually I think this is a conceptual bug, because there's no reliable way in WinDbg to echo a literal "foo", but in most echo scenarios I just want it to do that and nothing else:
0:000> as foo oops
0:000> .echo foo
oops
0:000> .echo "foo"
oops
0:000> .echo ${foo}
oops
0:000> .echo "${foo}"
oops
0:000> .echo ${/v:foo}
${/v:foo}
0:000> .echo "${/v:foo}"
${/v:foo}
0:000> .printf "foo"
oops
0:000> .printf "${foo}"
oops
0:000> .printf "${/v:foo}"
${/v:foo}
And then you find bugs like you do:
0:000> .printf "foo\n"
foo
Your use of aliases
I appreciate that you provide small MREs to reproduce your issues. From what I see it looks like you're trying to use aliases like variables.
Let's think about what an alias is: You use aliases on Linux if you have a command that you need often but it's too long or complicated to type. I have aliases defined like ll for ls -l and cd.. for cd .. because I'm somewhat used to the Windows syntax.
The problem is: a WinDbg alias is not an alias as you know it from Linux. It's more like a preprocessor definition doing a stupid search and replace.
In the cases you presented so far, it has nothing to do with having an abbreviation for shorter typing or circumventing typing mistakes. You seem to assign it a value. In that case you should consider using pseudo registers like $t1.
Running scripts
You are running the script with $$>< and that method has the property of
Condenses to single command block: Yes
What it will do: it concatenates all lines in the file, separated by semicolons. And you have to live with the consequences of that. See "further reading" on how that causes problems.
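As a rough mental model of that condensing step, here is a minimal Python sketch. It is an assumption based only on the "condenses to single command block" description above, not WinDbg's actual implementation, and the function name condense is made up for illustration.

```python
def condense(script_text: str) -> str:
    """Join all lines of a script into one command string, separated by ';'.

    Hypothetical model of what '$$><' does; WinDbg's real parser may differ.
    """
    return ";".join(script_text.splitlines())


# The foo.windbg script from the question
script = """;aS ${/v:foo} 1
al
.block
{
.echo foo
}"""

condensed = condense(script)
print(condensed)
```

Seen this way, the whole file reaches the command interpreter as one long line, which may help explain why a script behaves differently from the same commands typed one at a time.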
Further reading
It seems you want to understand WinDbg. That post will make you scratch your head and facepalm. What you'll learn there:
You can't simply concatenate commands using ; as a separator
You can't simply write empty statements
WinDbg is whitespace sensitive (sometimes)
A line is not always a line
String escaping is broken
Comments are not always comments
But why?
Warning: pure speculation ahead. I have never worked for Microsoft.
Why is that? WinDbg has existed for an incredibly long time. The oldest version I have on my computer is version 4.0.18 from 1999, likely developed for Windows NT 4 SP 6. I can imagine that WinDbg 3 versions for Windows NT 3 existed as well, so it potentially goes back to 1993.
At that time, there were no IDE features like "Find references" or "Find usages", so it was quite hard to do an impact analysis and I believe it was often unclear to developers what side effects a new command like as could have.
This, combined with the impression that WinDbg was a Microsoft-internal tool only and that "the developers will know what they are doing", gives us a useful but overall buggy tool today.
WinDbg was probably not developed with a clear roadmap in mind. It probably does not have something you'd call "architecture". Some developer was debugging something complicated and thought it would be great to provide a sort of script to others in order to make debugging easier. So WinDbg is likely more a collection of useful code snippets we call commands today, but they have no common lexer or parser, for example.
I don't see a solid use case in the question, so I won't go into the alias debate.
Also, the question is a bit of an XY problem: you assume aliases are the way out of your contrived problem instead of describing the actual problem you are trying to solve.
But if you want some modern user variables, try checking out JavaScript; you can probably pull what you want out into standard ES6-compliant code.
0:000> dx Debugger.State.UserVariables
Debugger.State.UserVariables
0:000> dx #$foo = "sugar"
#$foo = "sugar" : sugar
Length : 0x5
0:000> dx #$blah = "honey"
#$blah = "honey" : honey
Length : 0x5
0:000> dx -r0 "i want "+ #$foo+ " mixed With "+#$blah
"i want "+ #$foo+" mixed With "+#$blah : i want sugar mixed With honey
Just to complete the loop, here is a JavaScript script that compares a user-provided argument with the already existing user variables and does some debug printing:
"use strict";
// Print the value of the user variable whose name matches argu.
function foo(argu)
{
    for (var i in host.namespace.Debugger.State.UserVariables)
    {
        if (i === argu)
        {
            host.diagnostics.debugLog("hi " + i + " with me ");
            // Build the full property path as a string and evaluate it
            var tmp = "host.namespace.Debugger.State.UserVariables." + i;
            var boo = eval(tmp);
            host.diagnostics.debugLog(boo + "\n");
        }
    }
}
Create a few user variables with
dx #$foo = "sugar"
dx #$bar = "honey"
dx #$blah = "milk"
0:000> dx Debugger.State.UserVariables
Debugger.State.UserVariables
foo : sugar
bar : honey
blah : milk
Load the script and play with it:
0:000> .scriptload d:\test.js
JavaScript script successfully loaded from 'd:\test.js'
0:000> dx #$scriptContents.foo("bar")
hi bar with me honey
#$scriptContents.foo("bar")
0:000> dx #$scriptContents.foo("foo")
hi foo with me sugar
#$scriptContents.foo("foo")
0:000> dx #$scriptContents.foo("blah")
hi blah with me milk
#$scriptContents.foo("blah")

Why is RAX not used to pass a parameter in System V AMD64 ABI?

I don't understand what the benefit is of not passing a parameter in RAX.
Since the return value is in RAX, it is going to be clobbered by the callee anyway.
Can someone explain?
x86-64 System V does use AL for variadic functions: the caller passes the number of FP args in XMM registers.
(This is only an optimization to allow the callee to not dump all the vector regs into an array; the number in AL is allowed to be higher than the number of FP args. In practice, gcc's code-gen for variadic functions just checks if it's non-zero and dumps either none or all 8 of xmm0..7. I think the ABI guarantees that it's safe to always pass al=8 even if there aren't actually any FP args, and that you can't pass FP args on the stack instead by setting al=0.)
But why not use r9b for that, and use RAX for the 6th arg? Or RAX for some earlier arg?
Because RAX has so many implicit uses in x86, and experiments when designing the calling convention (http://web.archive.org/web/20140414124645/http://www.x86-64.org/pipermail/discuss/2000-November/001257.html) found that using RAX tended to require extra instructions in the caller or callee. e.g. because RAX was often needed as part of computing other args in the caller, or was needed while doing something with one of the other args before the code gets around to using the arg that was passed in RAX.
RAX is used for rep stos (which gcc used to use more aggressively to inline memset), and it's used for div and widening (one-operand) mul/imul, which gcc uses for division by a compile-time constant. (Why does GCC use multiplication by a strange number in implementing integer division?).
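To illustrate the "multiplication by a strange number" trick mentioned above, here is a small Python sketch of unsigned 32-bit division by 10 done with one multiply and one shift. The constant and shift are the well-known pair for this divisor; the function name div10 is made up for illustration. On x86 the one-operand mul leaves the wide product in the RDX:RAX pair (EDX:EAX for 32-bit operands), which is one of the implicit uses of RAX being discussed.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT = 35


def div10(x: int) -> int:
    """Divide an unsigned 32-bit value by 10 using multiply and shift only."""
    assert 0 <= x < 2**32
    product = x * MAGIC  # up to 64 bits wide: the "widening" multiply
    return product >> SHIFT  # keep only the high bits of the product


samples = [0, 9, 10, 99, 12345, 2**31, 2**32 - 1]
results = [div10(x) for x in samples]
print(results)
```

Python integers are arbitrary precision, so the sketch only models the arithmetic, not the register allocation a compiler would choose.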
Most of the other RAX special uses are just shorter encodings of things you can also do with other registers, like cdqe vs. movsxd rax, eax (or between any other registers). Or add eax,imm32 (no ModRM) vs. add r/m32, imm32 (or most other ALU instructions). See one of my answers on Tips for golfing in x86/x64 machine code.
Original 8086 lacked many of the longer non-AX alternatives, but between 8086 and 386, stuff like imul r32,r32 and movsx/movzx were added. Other RAX-only instructions aren't worth using when optimizing for speed (like xlatb, lodsd), or were made obsolete by P6 / AMD64 extensions (lahf as part of FP compares obsoleted by fucomi and using SSE/SSE2 ucomisd for FP math), or are specialized instructions like cmpxchg or cpuid that are too rare to have an impact on calling convention design. Compilers didn't use the BCD instructions like aaa anyway, and AMD64 removed them.
The designers of the x86-64 System V calling convention (primarily Jan Hubička for the integer arg-passing register design) generally aimed to avoid registers with many / common implicit uses. rdx comes before rcx in the arg-passing order, because cl is needed for variable shift counts (without BMI2). These are maybe more common than mul and div, because 2-operand imul reg,reg allows normal non-widening multiplies without clobbering RDX:RAX.
The choice of rdi and rsi as the first 2 args was apparently motivated by inlining memset or memcpy as rep stos or rep movs (which gcc did back in 2000, even though it wasn't actually a good choice in many of the cases where gcc did that). Even though rep-string instructions use RCX as the counter, they still found it on average saved instructions to pass the 3rd arg in RDX instead of RCX, so the calling convention doesn't quite work out for memcpy to be rep movsb/ret.
Jan Hubička evaluated multiple variations on arg-passing registers by compiling SpecInt with a then-current version of x86-64 gcc. See my answer on Why does Windows64 use a different calling convention from all other OSes on x86-64? for some more details and links.
One of the arg-register orders he evaluated was RAX, RDX, RCX, RBX, RSI, RDI, but he found that less good than other options. (See the mailing list message linked above).
It's fairly common for RISC calling conventions to pass the first arg in the first return-value register. ARM does this (r0), and I think so does PowerPC. Others (like MIPS) don't. But all of those architectures have no implicit uses of most integer registers, often just a link register and maybe the stack pointer.
x86-64 SysV and Windows do this for FP args: xmm0 for passing and returning.

How to interpret double entries in Windbg "x /2" result?

I'm debugging a dumpfile (memory dump, not a crashdump), which seems to contain two times the amount of expected objects. While investigating the corresponding symbols, I've noticed the following:
0:000> x /2 <product_name>!<company>::<main_product>::<chapter>::<subchapter>::<Current_Object>*
012511cc <product_name>!<company>::<main_product>::<chapter>::<subchapter>::<Current_ObjectID>::`vftable'
012511b0 <product_name>!<company>::<main_product>::<chapter>::<subchapter>::<Current_ObjectID>::`vftable'
01251194 <product_name>!<company>::<main_product>::<chapter>::<subchapter>::<Current_Object>::`vftable'
0125115c <product_name>!<company>::<main_product>::<chapter>::<subchapter>::<Current_Object>::`vftable'
For your information, the entries Current_Object and Current_ObjectID are present in the code, no problem there.
What I don't understand, is that there seem to be two entries for every symbol, and their memory addresses are very close to each other.
Does anybody know how I can interpret this?
It can be due to a variety of reasons, optimizations and redundant code elimination at link time being one of them (the pdb is normally made when you compile). See this link by Raymond Chen for an overview.
Quoting the relevant paragraph from the link:
And when you step into the call to p->GetValue() you find yourself in Class1::GetQ.
What happened?
What happened is that the Microsoft linker combined functions that are identical
at the code generation level.
?GetQ@Class1@@QAEPAHXZ PROC NEAR ; Class1::GetQ, COMDAT
00000 8b 41 04 mov eax, DWORD PTR [ecx+4]
00003 c3 ret 0
?GetQ@Class1@@QAEPAHXZ ENDP ; Class1::GetQ
?GetValue@Class2@@UAEHXZ PROC NEAR ; Class2::GetValue, COMDAT
00000 8b 41 04 mov eax, DWORD PTR [ecx+4]
00003 c3 ret 0
?GetValue@Class2@@UAEHXZ ENDP ; Class2::GetValue
Observe that at the object code level, the two functions are identical.
(Note that whether two functions are identical at the object code level is
highly dependent on which version of what compiler you're using, and with
which optimization flags. Identical code generation for different functions
occurs with very high frequency when you use templates.) Therefore, the
linker says, "Well, what's the point of having two identical functions? I'll
just keep one copy and use it to stand for both Class1::GetQ and
Class2::GetValue."
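The folding described in the quote can be modelled with a short Python sketch. This is only an illustration of the idea, not how the Microsoft linker is implemented; the function name fold_identical and the third entry (Class3::Other) are made up, while the byte sequence 8b 41 04 c3 is taken from the listing above.

```python
def fold_identical(functions: dict) -> dict:
    """Map each function name to the name of the copy that is kept.

    Functions whose machine code is byte-for-byte identical share one copy.
    """
    kept = {}    # machine code bytes -> name of the first function seen with it
    result = {}
    for name, code in functions.items():
        result[name] = kept.setdefault(code, name)
    return result


functions = {
    "Class1::GetQ": bytes.fromhex("8b4104c3"),      # mov eax,[ecx+4] / ret
    "Class2::GetValue": bytes.fromhex("8b4104c3"),  # identical bytes
    "Class3::Other": bytes.fromhex("8b4108c3"),     # hypothetical, different offset
}

folded = fold_identical(functions)
for name, survivor in folded.items():
    print(name, "->", survivor)
```

Which of the identical copies survives is an arbitrary choice in this sketch (the first one seen); the point is only that several symbol names can end up referring to the same code.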

Is it possible to replace every instance of a particular function with a dummy in a compiled binary?

Is it possible to alter the way that an existing x86-64 binary references and/or calls one particular function? Specifically, is it possible to alter the binary such that nothing happens (similar to a nop) at the times when that function would normally have executed?
I realize that there are powerful speciality tools out there (ie decompilers/disassemblers) for just this sort of task, but what I'm really wondering is if the executable formats are human-readable "enough" to be able to do this sort of thing (on small programs, at least) with just vim and a hex editor.
Are certain executable file formats (eg mach-o, elf, whatever the heck windows uses, etc.) more readable than others? Are they all just completely incomprehensible gibberish? Any expert views and/or good jumping off points/references would be greatly appreciated.
Disclaimer
Someone came by and quickly downvoted the initial version of this question, so I want to make this perfectly clear: I am not interested in disabling any serial or security checks or anything of the sort. Originally I had wanted a program to stop making a really irritating noise, but now I'm just curious about how compilers and executables work.
I'm in this for the educational value, and I think that other people on SE will be interested in the answer. However, I appreciate that others might not be as comfortable with this topic. If you have a concern about something I've said, please leave a comment and I promise I'll change my post.
This is trivial to do when the function in question is in the binary itself and uses standard calling conventions. Example:
#include <stdio.h>
void make_noise() { printf("Quack!\n"); }
int fn1() { puts("fn1"); make_noise(); return 1; }
int fn2() { puts("fn2"); make_noise(); return 2; }
int main() { puts("main"); return fn1() + fn2() - 3; }
gcc -w t.c -o a.out && ./a.out
This outputs (expected):
main
fn1
Quack!
fn2
Quack!
Now let's get rid of the noise:
gdb -q --write ./a.out
(gdb) disas/r make_noise
Dump of assembler code for function make_noise:
0x000000000040052d <+0>: 55 push %rbp
0x000000000040052e <+1>: 48 89 e5 mov %rsp,%rbp
0x0000000000400531 <+4>: bf 34 06 40 00 mov $0x400634,%edi
0x0000000000400536 <+9>: e8 d5 fe ff ff callq 0x400410 <puts@plt>
0x000000000040053b <+14>: 5d pop %rbp
0x000000000040053c <+15>: c3 retq
End of assembler dump.
This tells us a few things:
The function that we want to get rid of starts at address 0x40052d
The op-code of retq instruction is 0xC3.
Let's patch retq as the first instruction of make_noise, and see what happens:
(gdb) set *(char*)0x40052d = 0xc3
(gdb) disas make_noise
Dump of assembler code for function make_noise:
0x000000000040052d <+0>: retq
0x000000000040052e <+1>: mov %rsp,%rbp
0x0000000000400531 <+4>: mov $0x400634,%edi
0x0000000000400536 <+9>: callq 0x400410 <puts@plt>
0x000000000040053b <+14>: pop %rbp
0x000000000040053c <+15>: retq
End of assembler dump.
It worked!
(gdb) q
Segmentation fault (core dumped) ## This is a long-standing GDB bug
And now let's run the patched binary:
$ ./a.out
main
fn1
fn2
Voila! No noise.
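The same one-byte patch can also be made without GDB by writing to the file directly. The following Python sketch is a minimal illustration under stated assumptions: patch_byte is a made-up helper, and it takes a file offset, not the virtual address 0x40052d shown by GDB. Translating the address to a file offset (via the ELF program headers) is left out, so the demo runs on a scratch file that holds the first four bytes of make_noise from the listing.

```python
import os
import tempfile

RET_OPCODE = 0xC3  # retq, as seen in the disassembly above


def patch_byte(path: str, file_offset: int, value: int) -> int:
    """Overwrite one byte at file_offset and return the byte that was there."""
    with open(path, "r+b") as f:
        f.seek(file_offset)
        old = f.read(1)
        if len(old) != 1:
            raise ValueError("offset is past the end of the file")
        f.seek(file_offset)
        f.write(bytes([value]))
    return old[0]


# Scratch file containing: push %rbp; mov %rsp,%rbp
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes.fromhex("554889e5"))
    demo_path = tmp.name

old = patch_byte(demo_path, 0, RET_OPCODE)
with open(demo_path, "rb") as f:
    patched = f.read()
os.remove(demo_path)

print(hex(old), patched.hex())
```

Returning the old byte makes it easy to undo the patch later by calling the helper again with that value.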
If the function is in a different binary, the LD_PRELOAD technique mentioned by Florian Weimer is usually easier than binary patching.
ELF dynamic linking implementations often support LD_PRELOAD and LD_AUDIT modules, which can both intercept calls into another shared object. LD_AUDIT offers more control, and exists on GNU/Linux (but the Solaris documentation is the canonical reference).
For calls within the same shared object, this may not be possible if the target function is not exported (or the call is executed via a hidden alias; glibc does this a lot). If you have debugging information, you can use systemtap to intercept the call. If the function is inlined, intercepting the call might not be possible even with systemtap because there is no exact place in the instruction stream where the call takes place.

How to avoid executing variables in lc3 assembly

I am making my first steps in lc3 assembly programming and I noticed that every time I try to store a negative value in memory, for example using "ST" instruction, there is some kind of error. In this memory location is stored "TRAP xFF" instead...
Does anybody know how I can get around it?
You're getting that error because your variables are a part of the run-time code. It's usually best practice to put your variables at the end of your code, after the HALT instruction.
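        .ORIG x3000
MAIN
        LD  R0, VAR1      ; R0 = 5
        NOT R0, R0        ; negate (two's complement): invert the bits ...
        ADD R0, R0, #1    ; ... and add 1, so R0 = -5
        ST  R0, VAR2      ; store -5 into VAR2
        HALT              ; execution stops here, before the data
VAR1    .FILL #5
VAR2    .FILL #0
        .END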
The reason you were getting those errors is that, after you stored numbers into your variables, the simulator went on to treat them as instructions. The TRAP instruction has an opcode of 1111, and in two's complement a small negative number also starts with the bits 1111 (for example, -1 is xFFFF, which decodes as TRAP xFF). When the simulator ran into your variable it couldn't figure out what type of TRAP it was, thus the error. By preventing the simulator from executing your variables you won't get that error.
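To make the connection between negative numbers and TRAP concrete, here is a small Python sketch that decodes a stored value the way an LC-3 instruction would be decoded: the top four bits are the opcode, and for TRAP the low eight bits are the trap vector. The helper names are made up for illustration.

```python
TRAP_OPCODE = 0b1111


def to_word(value: int) -> int:
    """Return the 16-bit two's complement representation of value."""
    return value & 0xFFFF


def opcode(word: int) -> int:
    """Top four bits of a 16-bit LC-3 word."""
    return (word >> 12) & 0xF


def looks_like_trap(value: int) -> bool:
    """True if the stored value would decode as a TRAP instruction."""
    return opcode(to_word(value)) == TRAP_OPCODE


def trap_vector(value: int) -> int:
    """Low eight bits, which TRAP interprets as the trap vector."""
    return to_word(value) & 0xFF


for v in (-1, -5, 5):
    print(v, hex(to_word(v)), looks_like_trap(v), hex(trap_vector(v)))
```

A stored -1 becomes xFFFF, which is exactly the "TRAP xFF" from the question, while a positive value such as 5 has opcode bits 0000 and is not mistaken for a TRAP.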