How to find memory used by NSString? - iphone

I got stuck when trying to look at the memory used by NSString:
My NSString theKey shows fine in lldb:
(lldb) p *theKey
(NSString) $5 = {
NSObject = {
isa = __NSCFConstantString
}
}
The content is visible:
(lldb) po theKey
BBBBBBBBBB
I also get the memory of the NSString object:
(lldb) fr v
(ViewController *const) self = 0x0897eb10
(SEL) _cmd = "viewDidLoad"
(NSString *) theKey = 0x00004664 @"BBBBBBBBBB"
When I look at the memory of the NSString object, I only get strange values:
(lldb) memory read 0x00004664
0x00004664: 50 93 87 01 c8 07 00 00 68 37 00 00 0b 00 00 00 P.......h7......
0x00004674: 50 93 87 01 c8 07 00 00 74 37 00 00 0b 00 00 00 P.......t7......
That is, an NSString is not an array of char, so what does this memory contain? How can I find the memory address where the characters stored in my NSString are stored?

You asked:
How can I find the memory address where the characters stored in my
NSString are stored?
In short, you can't. NSString is effectively opaque. The closest you're going to get is probably to call the -UTF8String method on it, which will give you a const char*, but if the string itself is not natively encoded with UTF8 encoding, then that pointer will not point to the "actual" data in the NSString, so...
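For the specific dump in the question, the 16 bytes per line are consistent with the layout compilers commonly emit for constant strings on 32-bit targets: four little-endian words holding the isa pointer, a flags word, a pointer to the characters, and the length. That layout is an assumption here (it is not documented API and does not apply to strings created at runtime), and the function and field names below are made up for illustration. A minimal sketch that decodes the first line of the dump under that assumption:

```javascript
// Bytes copied from the first line of `memory read 0x00004664` in the question.
const dump = [
  0x50, 0x93, 0x87, 0x01, 0xc8, 0x07, 0x00, 0x00,
  0x68, 0x37, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00,
];

// ASSUMED 32-bit constant-string layout: isa, flags, character pointer, length.
function decodeConstantString(bytes) {
  const view = new DataView(Uint8Array.from(bytes).buffer);
  const word = (offset) => view.getUint32(offset, true); // little-endian read
  return {
    isa: word(0),
    flags: word(4),
    characters: word(8),
    length: word(12),
  };
}

const fields = decodeConstantString(dump);
for (const [name, value] of Object.entries(fields)) {
  console.log(name.padEnd(10), "0x" + value.toString(16).padStart(8, "0"));
}
```

If the assumption holds, the third word (0x00003768) is the address of the character data, so `memory read 0x00003768` in lldb would be the place to look. This only explains this particular constant string; it does not contradict the point that NSString storage is opaque in general.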

I recently had the same problem. You can use the following program to find
the C pointer in memory of an NSString.
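#import <Foundation/Foundation.h>

int main(int argc, const char * argv[]) {
    NSString *a = @"###########";
    // CFStringGetCStringPtr may return NULL when the internal storage is
    // not directly available in the requested encoding.
    const char *ptr = CFStringGetCStringPtr((__bridge CFStringRef)a,
                                            kCFStringEncodingMacRoman);
    if (ptr == NULL) {
        NSLog(@"no direct C string pointer available");
        return 1;
    }
    puts(ptr); // find ptr
    NSLog(@"address:%p", ptr);
    NSLog(@"content:%s", ptr);
    return 0;
}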
I found that the solution is to use CFStringGetCStringPtr to find the C pointer, which can then be used to find the NSString in memory.
Refer to: http://opensource.apple.com/source/CF/CF-1153.18/CFString.c

I had the same question. I wanted to see the bytes ( in hex ) that were stored in memory for an NSString.
I started with this byte array:
const char bytes1[7] = { 0x05, 0x31, 0x32, 0x33, 0x34, 0x35, 0x00 }; // 0x05, then "12345", then NUL
I placed a breakpoint after nsStrInput was initialized. As @ipmcc said, the UTF8String method returned a character array pointer:
(lldb) po [nsStrInput UTF8String]
0x0000000100769138
(lldb) mem read -c10 0x0000000100769138
0x100769138: 05 31 32 33 34 35 00 00 00 00 .12345....
I could see that my instance of NSString did hold the unprintable ASCII character (0x05). That was the answer I was hoping to find.

ebpf: how to use BPF_FUNC_trace_printk in eBPF assembly program

I have a small socket filter type eBPF program, where I'm trying to print a protocol value read from __sk_buff context:
struct bpf_insn prog[] = {
BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct __sk_buff, protocol)),
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4),
BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -4),
BPF_MOV64_IMM(BPF_REG_2, 4),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_trace_printk),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
...
I create a raw socket and bind it to the lo interface, then setsockopt(fd, SOL_SOCKET, SO_ATTACH_BPF, ...). It compiles and loads with no problems, however whenever I ping 127.0.0.1 I never see traces in the trace_pipe.
So, to make sure that BPF_FUNC_trace_printk can actually work, I changed the program so that it prints a static string on the stack, and it does print on every packet hitting the loopback.
What am I doing wrong?
Read the friendly manual :)
I don't believe you are calling the bpf_trace_printk() helper correctly (BPF_FUNC_trace_printk is just an integer, by the way). Its signature, documented in the kernel UAPI header bpf.h and in the bpf-helpers man page, is as follows:
long bpf_trace_printk(const char *fmt, u32 fmt_size, ...);
This means that the first argument must point to a constant, null-terminated format string. Your program instead passes a pointer to the protocol value itself, so there is no format string at all.
What does clang do?
I understand you are attaching your eBPF programs to sockets and cannot compile the whole program from C. However, why not compile that specific part as a generic networking eBPF program to see what the bytecode should look like? Let's write the C code:
#include <linux/bpf.h>
static long (*bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) BPF_FUNC_trace_printk;
int printk_proto(struct __sk_buff *skb) {
char fmt[] = "%d\n";
bpf_trace_printk(fmt, sizeof(fmt), skb->protocol);
return 0;
}
Compile to an object file. For the record, this would not load unless we provided both a valid license string (because bpf_trace_printk() needs a GPL-compatible program) and a compatible program type at load time. But that does not matter in our case; we just want to look at the generated instructions.
$ clang -O2 -g -emit-llvm -c prink_protocol.c -o - | \
llc -march=bpf -mcpu=probe -filetype=obj -o prink_protocol.o
Dump the bytecode:
$ llvm-objdump -d prink_protocol.o
prink_protocol.o: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <printk_proto>:
0: b4 02 00 00 25 64 0a 00 w2 = 680997
1: 63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r2
2: 61 13 10 00 00 00 00 00 r3 = *(u32 *)(r1 + 16)
3: bf a1 00 00 00 00 00 00 r1 = r10
4: 07 01 00 00 fc ff ff ff r1 += -4
5: b4 02 00 00 04 00 00 00 w2 = 4
6: 85 00 00 00 06 00 00 00 call 6
7: b4 00 00 00 00 00 00 00 w0 = 0
8: 95 00 00 00 00 00 00 00 exit
We can see that in the first two instructions, the program writes the format string (in little endian) onto the stack: 680997 is 0x000a6425, \0\nd%. r1 is then pointed at that stack slot and r2 is set to the length of the format string (4 bytes, including the terminating NUL). The protocol value is loaded into r3, the third argument for the call to bpf_trace_printk().
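The little-endian claim can be checked by hand. The sketch below splits the immediate 680997 into the four bytes that a 32-bit little-endian store would write to the stack; the function name is hypothetical and nothing here touches the kernel or the BPF loader:

```javascript
// Split a 32-bit immediate into the bytes a little-endian store writes.
function immediateToBytes(imm) {
  const buf = new ArrayBuffer(4);
  new DataView(buf).setUint32(0, imm, true); // true = little-endian
  return Array.from(new Uint8Array(buf));
}

const bytes = immediateToBytes(680997);

// Hex view, lowest address first.
console.log(bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ")); // 25 64 0a 00

// The same bytes as characters: '%', 'd', newline, NUL.
console.log(JSON.stringify(String.fromCharCode(...bytes)));
```

The byte order matches the last four bytes of the first instruction in the disassembly (25 64 0a 00), which is the string "%d\n" followed by its terminating NUL.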

How do I loop over the search results for a byte string and offset the resultant pointer (in WinDbg)?

I'm attempting to search for an arbitrarily long byte string in WinDbg and print out the address if an integer in the vicinity meets some criteria.
Pseudo-register $t0 contains the starting address I want to search.
Here's something that, based on the Windows docs, maybe could work (though it clearly doesn't).
.foreach (place { s -[1] @$t0 L?30000 00 00 00 00 00 20 00 00 }) { .if ( (place +0x8) <= 0x1388) { .printf "0x%x\n", place } }
Search
First, the search command doesn't quite work correctly. I only want the address of the match (not the data).
s -[1] @$t0 L?30000 00 00 00 00 00 20 00 00
The docs say that the 1 flag will only return the address. When I issue that command, WinDbg replies
^ Syntax error in 's -1 @$t0 L?30000 00 00 00 00 00 20 00 00 '
If I leave out the -1, it finds two matches.
What am I doing wrong here?
Condition
I don't think the condition is behaving the way I want. I want to look at the third dword starting at place, i.e. place+8, and verify that it's smaller than 5000 (decimal). The .if inside the .foreach isn't printing a meaningful value for place (i.e. the address returned from the search). I think it's dereferencing place first and comparing the value of that integer to 5000. How do I look at the value of, say, *(int*)(place+8)?
Documentation?
The docs are not helping me very much. They only have sparse examples, none of which correspond to what I need.
Is there better documentation somewhere besides MS's Hardware Dev Center?
You can write the logic in JavaScript, which is a more legible way of scripting the debugger.
old way
0:000> s -b vect l?0x1000 4d
00007ff7`8aaa0000 4d 5a 90 00 03 00 00 00-04 00 00 00 ff ff 00 00 MZ..............
00007ff7`8aaa00d4 4d 90 80 d2 df f9 82 d3-4d 90 80 d2 52 69 63 68 M.......M...Rich
00007ff7`8aaa00dc 4d 90 80 d2 52 69 63 68-4c 90 80 d2 00 00 00 00 M...RichL.......
0:000> s -[1]b vect l?0x1000 4d
0x00007ff7`8aaa0000
0x00007ff7`8aaa00d4
0x00007ff7`8aaa00dc
using javascript
function search(addr, len)
{
    var index = []
    // read len bytes starting at addr
    var mem = host.memory.readMemoryValues(addr, len)
    for (var i = 0; i < len; i++)
    {
        if (mem[i] == 0x4d) // 'M'
        {
            index.push(addr + i)
        }
    }
    return index
}
When executed, it returns the matching addresses, which you can manipulate further:
0:000> dx -r1 @$scriptContents.search(0x00007ff78aaa0000,1000)
@$scriptContents.search(0x00007ff78aaa0000,1000) : 140701160046592,140701160046804,140701160046812
length : 0x3
[0x0] : 0x7ff78aaa0000
[0x1] : 0x7ff78aaa00d4
[0x2] : 0x7ff78aaa00dc
improving the script a little to find something based on first result
we will try to find the index of Rich string that follows the character 'M'
modified script
function search(addr, len)
{
    var index = []
    var Rich = []
    var result = []
    var mem = host.memory.readMemoryValues(addr, len)
    for (var i = 0; i < len; i++)
    {
        if (mem[i] == 0x4d) // 'M'
        {
            index.push(addr + i)
            // read one 4-byte value located 4 bytes after the match
            var temp = host.memory.readMemoryValues(addr + i + 4, 1, 4)[0]
            host.diagnostics.debugLog(temp + "\t")
            if (temp == 0x68636952) // "Rich" as a little-endian dword
            {
                Rich.push(addr + i)
            }
        }
    }
    result.push(index)
    result.push(Rich)
    return result
}
Result: only the third occurrence of the character "M" is followed by the Rich string.
0:000> dx -r2 @$scriptContents.search(0x00007ff78aaa0000,1000)
3 3548576223 1751345490 @$scriptContents.search(0x00007ff78aaa0000,1000) : 140701160046592,140701160046804,140701160046812,140701160046812
length : 0x2
[0x0] : 140701160046592,140701160046804,140701160046812
length : 0x3
[0x0] : 0x7ff78aaa0000
[0x1] : 0x7ff78aaa00d4
[0x2] : 0x7ff78aaa00dc
[0x1] : 140701160046812
length : 0x1
[0x0] : 0x7ff78aaa00dc
0:000> s -b vect l?0x1000 4d
00007ff7`8aaa0000 4d 5a 90 00 03 00 00 00-04 00 00 00 ff ff 00 00 MZ..............
00007ff7`8aaa00d4 4d 90 80 d2 df f9 82 d3-4d 90 80 d2 52 69 63 68 M.......M...Rich
00007ff7`8aaa00dc 4d 90 80 d2 52 69 63 68-4c 90 80 d2 00 00 00 00 M...RichL.......
Load the extension jsprovider.dll: .load jsprovider
Write a script, say foo.js
Load the script: .scriptload ...\path\foo.js
Execute any function inside the script you wrote with dx @$scriptContents.myfunc(myargs)
See below. I am using cdb just for ease of copy and paste; WinDbg works the same way.
F:\>type mojo.js
function hola_mojo ()
{
host.diagnostics.debugLog("hola mojo this is javascript \n")
}
F:\>cdb -c ".load jsprovider;.scriptload .\mojo.js;dx @$scriptContents.hola_mojo();q" cdb | f:\usr\bin\grep.exe -A 6 -i reading
0:000> cdb: Reading initial command '.load jsprovider;.scriptload .\mojo.js;dx @$scriptContents.hola_mojo();q'
JavaScript script successfully loaded from 'F:\mojo.js'
hola mojo this is javascript
@$scriptContents.hola_mojo()
quit:
If I read this part of the documentation
s [-[[Flags]Type]] Range Pattern
correctly, you cannot leave out Type when specifying flags. That's because the flags are inside two square brackets. Otherwise it would have been noted as s [-[Flags][Type]] Range Pattern.
Considering this, the example works:
0:000> .dvalloc 2000
Allocated 2000 bytes starting at 00ba0000
0:000> eb 00ba0000 01 02 03 04 05 06 07 08 09
0:000> eb 00ba1000 01 02 03 04 05 06 07 08 09
0:000> s -[1]b 00ba0000 L?2000 01 02 03 04 05 06 07 08
0x00ba0000
0x00ba1000
Also note that you'll have a hidden bug for the use of place: it should be ${place}. By default, that will work with the address (line break for readability on SO):
0:000> .foreach (place {s -[1]b 00ba0000 L?2000 01 02 03 04 05 06 07 08 })
{ .if ( (${place} +0x8) < 0xba1000) { .printf "0x%x\n", ${place} } }
0xba0000
In order to read a DWord from that address, use the dwo() MASM operator (line break for readability on SO):
0:000> .foreach (place {s -[1]b 00ba0000 L?2000 01 02 03 04 05 06 07 08 })
{ .if ( (dwo(${place} +0x8)) < 0xba1000)
{ .printf "0x%x = 0x%x\n", ${place}, dwo(${place}+8) } }
0xba0000 = 0x9
0xba1000 = 0x9
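The search-then-check logic from the question (find an 8-byte pattern, then test the dword at match + 8 against 5000) can also be sketched in plain JavaScript on an in-memory buffer. This sketch deliberately uses no debugger host APIs, so it runs outside WinDbg; the function names and the demo buffer are made up for illustration, and inside a debugger script the buffer would have to come from something like host.memory.readMemoryValues instead:

```javascript
// Return every offset at which `pattern` occurs in `bytes`.
function findPattern(bytes, pattern) {
  const hits = [];
  for (let i = 0; i + pattern.length <= bytes.length; i++) {
    if (pattern.every((b, j) => bytes[i + j] === b)) hits.push(i);
  }
  return hits;
}

// Read a little-endian 32-bit value at `offset`.
function readDword(bytes, offset) {
  return new DataView(bytes.buffer, bytes.byteOffset).getUint32(offset, true);
}

// Keep only the hits whose dword at hit + 8 is at most `limit`.
function filterHits(bytes, pattern, limit) {
  return findPattern(bytes, pattern).filter(
    (hit) => hit + 12 <= bytes.length && readDword(bytes, hit + 8) <= limit
  );
}

// Demo buffer: two matches, followed by the dwords 9 and 0x2000 respectively.
const pattern = [0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x00];
const memory = new Uint8Array(64);
memory.set(pattern, 0);
memory.set([0x09, 0x00, 0x00, 0x00], 8);
memory.set(pattern, 32);
memory.set([0x00, 0x20, 0x00, 0x00], 40);

console.log(findPattern(memory, pattern));      // [ 0, 32 ]
console.log(filterHits(memory, pattern, 5000)); // [ 0 ]
```

The point of the split is the same as with dwo(): the comparison is made on the value stored at the offset address, not on the address itself.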

UnsafeMutableRawPointer to UnsafeMutablePointer<T>: EXC_BAD_ACCESS on pointee

I'm trying to get an UnsafeMutablePointer from an UnsafeMutableRawPointer obtained using Unmanaged.passUnretained().toOpaque():
class C { var foo = 42, bar = "bar" }
let c = C()
let rawPointer = Unmanaged.passUnretained(c).toOpaque()
let pointer = rawPointer.bindMemory(to: C.self, capacity: 1)
let pointee = pointer.pointee
print(pointee.foo) // EXC_BAD_ACCESS
Here's some LLDB output, which looks strange to me as everything seems alright in pointer until I ask for its pointee:
(lldb) frame variable -L c
scalar: (memtest2.C) c = 0x0000000101d00030 {
0x0000000101d00040: foo = 42
0x0000000101d00048: bar = "bar"
}
(lldb) frame variable -L rawPointer
0x00000001005e2e08: (UnsafeMutableRawPointer) rawPointer = {
scalar: _rawValue = 0x0000000101d00030 {
0x0000000101d00040: foo = 42
0x0000000101d00048: bar = "bar"
}
}
(lldb) frame variable -L pointer
0x00000001005e2e10: (UnsafeMutablePointer<memtest2.C>) pointer = 0x0000000101d00030
(lldb) frame variable -L pointer._rawValue
scalar: (memtest2.C) pointer._rawValue = 0x0000000101d00030 {
0x0000000101d00040: foo = 42
0x0000000101d00048: bar = "bar"
}
(lldb) frame variable -L pointee
0x00000001005e2e18: (memtest2.C) pointee = 0x00000001005b65d8 {
0x00000001005b65e8: foo = 140736790071664
0x00000001005b65f0: bar = ""
}
I've also tried assumingMemoryBound(to:), load(as:), or simply:
let pointer = UnsafePointer<C>(bitPattern: Int(bitPattern: rawPointer))!
print(pointer.pointee.foo) // EXC_BAD_ACCESS
But I always get this EXC_BAD_ACCESS error. What is going on here?
My rawPointer points to where the c instance data is located. It is not, as I expected, a pointer to the reference. Classes are reference types: the value of c is the memory address of where the class instance data is located. Yet the toOpaque() doc was clear:
Unsafely converts an unmanaged class reference to a pointer.
(toOpaque() actually calls unsafeBitCast(c, UnsafeMutableRawPointer.self))
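To get a pointer to the reference itself, c must be declared with var, and the pointer should only be used inside the closure:
withUnsafeMutablePointer(to: &c) { referencePointer in print(referencePointer.pointee.foo) }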
As my rawPointer points to the instance data, doing pointer.pointee tells the runtime that the first word of the instance data is a (or its) reference. Which of course isn't true nor makes sense.
To illustrate (I've slightly changed my initial code: both foo and bar are now Int):
(lldb) frame variable c
(testmem.C) c = 0x0000000101833580 (foo = 42, bar = 84)
This (0x0000000101833580) is where the instance data is located. Let's see what the memory contains at this address:
(lldb) memory read 0x0000000101833580
0x101833580: e0 65 5b 00 01 00 00 00 02 00 00 00 00 00 00 00
0x101833590: 2a 00 00 00 00 00 00 00 54 00 00 00 00 00 00 00
I learned that the first word (e0 65 5b 00 01 00 00 00) is type data, the second (02 00 00 00 00 00 00 00) is reference counts (I don't know more about this), and the rest is the instance data. Indeed, 0x2a is 42 and 0x54 is 84: the values of foo and bar.
Doing pointer.pointee means telling the runtime that the first word (e0 65 5b 00 01 00 00 00, or 0x00000001005b65e0) is a reference pointing to where our instance data is located (which is obviously not the case)! This implies that pointer.pointee.foo is located at 0x00000001005b65e0 + 16 (0x00000001005b65f0) and bar at + 24 (0x00000001005b65f8).
(lldb) memory read 0x00000001005b65e0
0x1005b65e0: a9 65 5b 00 01 80 1d 00 80 62 5b 00 03 00 00 00
0x1005b65f0: 70 e9 10 9e ff 7f 00 00 00 00 00 00 00 00 00 00
foo contains 0x00007fff9e10e970, which in decimal is 140735845296496 and corresponds to:
(lldb) frame variable -L pointee
0x00000001005e2e18: (testmem.C) pointee = 0x00000001005b65e0 {
0x00000001005b65f0: foo = 140735845296496 // this
0x00000001005b65f8: bar = 0
}
And as this data wasn't allocated by my program we don't have access to it, hence the EXC_BAD_ACCESS error.
Life makes sense now 😂.
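The arithmetic in this answer can be replayed outside the debugger. The sketch below copies the bytes from the two memory dumps and reads them as little-endian 64-bit words; the helper name word and the array names are made up for illustration, and the word offsets (0, 8, 16, 24) simply follow the description above rather than any documented object layout:

```javascript
// 32 bytes at 0x101833580: the real instance of C.
const instance = Uint8Array.from([
  0xe0, 0x65, 0x5b, 0x00, 0x01, 0x00, 0x00, 0x00,
  0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x2a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x54, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
]);

// 32 bytes at 0x1005b65e0: what the first word of the instance points to.
const wrongTarget = Uint8Array.from([
  0xa9, 0x65, 0x5b, 0x00, 0x01, 0x80, 0x1d, 0x00,
  0x80, 0x62, 0x5b, 0x00, 0x03, 0x00, 0x00, 0x00,
  0x70, 0xe9, 0x10, 0x9e, 0xff, 0x7f, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
]);

// Read one little-endian 64-bit word as a BigInt.
function word(bytes, offset) {
  return new DataView(bytes.buffer).getBigUint64(offset, true);
}

// First word of the instance: the address that pointer.pointee treats as an object.
console.log("0x" + word(instance, 0).toString(16)); // 0x1005b65e0

// The real stored properties sit at offsets 16 and 24 of the instance.
console.log(word(instance, 16).toString(), word(instance, 24).toString()); // 42 84

// The same offsets applied to the wrong address give the bogus foo and bar.
console.log(word(wrongTarget, 16).toString()); // 140735845296496
console.log(word(wrongTarget, 24).toString()); // 0
```

The last two values match the foo and bar that LLDB shows for pointee.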

How can I convert * )xÿª5c’√K’Rk¬É* to plain text in objective C?

Please check; here is my code:
int asciiCode = @"29 78 D8 BB 35 12 63 D5 C3 4B 18 D5 52 6B C2 83";
NSString *strings = [NSString stringWithFormat:@"%c", asciiCode];
NSLog(@"%c",strings);
}
If the value is 29 78 D8 BB 35 12 63 D5 C3 4B 18 D5 52 6B C2 83, it should display as india (my string value), but it is displaying as )xÿª5c’√K’Rk¬É
int asciiCode = @"29 78 D8 BB 35 12 63 D5 C3 4B 18 D5 52 6B C2 83";
This is invalid. @"..." is an NSString, not an int.
You could have a char array, like this:
char asciiCode[] = {0x29, 0x78, 0xD8, 0xBB, ..., 0x00}; // must be NUL-terminated
Which you could turn into an NSString using
NSString *strings = [NSString stringWithCString:asciiCode encoding:NSUTF8StringEncoding];
The characters you have shown aren't ASCII, as they have their 8th bit set, but they could be UTF-8 encoded? If so, you simply need to treat them like a NUL-terminated C-String:
unsigned char cstr[] = { 0x29, 0x78, 0xD8, 0xBB, 0x35, 0x12, 0x63, 0xD5, 0xC3,
0x4B, 0x18, 0xD5, 0x52, 0x6B, 0xC2, 0x83, 0x0 };
NSString *string = [[NSString alloc] initWithUTF8String:(const char *)cstr];
NSLog(@"%@", string);
or:
NSString *string = [NSString stringWithUTF8String:(const char *)cstr];
NSLog(@"%@", string);
Depending on whether you want a retained or autoreleased NSString object.
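Whether the bytes really are UTF-8 can be checked before handing them to NSString. The sketch below uses the standard TextDecoder in strict mode as a stand-in for that check; the function and variable names are hypothetical, and it says nothing about which encoding the data actually uses:

```javascript
// Returns true when `bytes` decodes as UTF-8 without errors.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(Uint8Array.from(bytes));
    return true;
  } catch (err) {
    return false; // a malformed sequence was found
  }
}

// The 16 bytes from the question.
const bytesFromQuestion = [
  0x29, 0x78, 0xd8, 0xbb, 0x35, 0x12, 0x63, 0xd5,
  0xc3, 0x4b, 0x18, 0xd5, 0x52, 0x6b, 0xc2, 0x83,
];

console.log(isValidUtf8(bytesFromQuestion)); // false
console.log(isValidUtf8(Array.from(new TextEncoder().encode("india")))); // true
```

The check fails because 0xD5 starts a two-byte sequence but is followed by 0xC3, which is not a continuation byte. For these bytes the UTF-8 initializers would therefore be expected to return nil, which suggests the data is in some other encoding or is not plain text at all.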
You could also prepend your character codes with \x. For example:
NSString *string = @"\x29 \x78 \xD8 \xBB \x35 \x12 \x63 \xD5 \xC3 \x4B \x18 \xD5 \x52 \x6B \xC2 \x83";
NSLog(@"%@",string);
But beware that the codes must be in hexadecimal. For example, 29 here is hexadecimal 29 (0x29), not decimal 29.

problem reading hex values from file with NSString or NSData

I have a file which I need to grab some data from, but I only get nil when I read it.
The file contains
K vU™"3Dô
U0?*
The hex values are
1B 1B 01 1B 4B 07 20 1B 76 02 06 55 AA 11 22 33 44 99 20 0A 1B 55 30 1B 1B 3F 0C 1B 2A 01 0C 0C
I have tried reading the file a few ways, but I always get nil when I turn it into a string.
NSData *textData = [[NSData alloc] initWithContentsOfFile:filename];
NSLog(@"%@",[textData description]);
//prints out <1b1b011b 4b07201b 76020655 aa112233 4499200a 1b55301b 1b3f0c1b 2a010c0c>
text = [[NSString alloc] initWithData:textData encoding:NSUTF8StringEncoding];
//at this point text is nil
text = [[NSString alloc] initWithContentsOfFile:filename encoding:NSUTF8StringEncoding error:err];
//at this point text is still nil
text = [NSString stringWithContentsOfFile:filename encoding:NSUTF8StringEncoding error:err];
//at this point text is still nil
I know it's reading the file. I can print out the attributes and see that the size is 30. What did I miss, or what am I doing wrong?
Your file is probably simply not encoded in UTF8. That's why trying to interpret the data into some UTF8 string fails.
Are you sure it is not UTF-16 (LE or BE) or any other encoding? (or maybe simply ASCII -- with some non-printable characters)
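To see where UTF-8 decoding would break on the bytes from the question, a simplified scanner can report the first offset that does not start a well-formed sequence. This is a sketch, not a full validator (it ignores overlong and surrogate encodings), and the function name is hypothetical:

```javascript
// Return the offset of the first byte that cannot start a well-formed
// UTF-8 sequence, or -1 if the whole buffer passes this simplified check.
function firstInvalidUtf8Offset(bytes) {
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let extra; // number of continuation bytes expected after b
    if (b < 0x80) extra = 0;                      // ASCII
    else if (b >= 0xc2 && b <= 0xdf) extra = 1;   // 2-byte lead
    else if (b >= 0xe0 && b <= 0xef) extra = 2;   // 3-byte lead
    else if (b >= 0xf0 && b <= 0xf4) extra = 3;   // 4-byte lead
    else return i;                                // stray continuation or bad lead
    for (let k = 1; k <= extra; k++) {
      const c = bytes[i + k];
      if (c === undefined || (c & 0xc0) !== 0x80) return i;
    }
    i += extra + 1;
  }
  return -1;
}

// The hex values listed in the question.
const fileBytes = [
  0x1b, 0x1b, 0x01, 0x1b, 0x4b, 0x07, 0x20, 0x1b,
  0x76, 0x02, 0x06, 0x55, 0xaa, 0x11, 0x22, 0x33,
  0x44, 0x99, 0x20, 0x0a, 0x1b, 0x55, 0x30, 0x1b,
  0x1b, 0x3f, 0x0c, 0x1b, 0x2a, 0x01, 0x0c, 0x0c,
];

console.log(firstInvalidUtf8Offset(fileBytes)); // 12
```

Offset 12 is the byte 0xAA, a continuation byte with no lead byte before it, so a strict UTF-8 decoder rejects the data. The many 0x1B (ESC) bytes also suggest the file holds binary control sequences rather than text, in which case keeping it as NSData may be the better fit.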
How about text = [[NSString alloc] initWithString:[textData description]];? :)