What's the CFString Equiv of NSString's UTF8String? - iphone

I'm stuck on stoopid today as I can't convert a simple piece of ObjC code to its Cpp equivalent. I have this:
const UInt8 *myBuffer = [(NSString*)aRequest UTF8String];
And I'm trying to replace it with this:
const UInt8 *myBuffer = (const UInt8 *)CFStringGetCStringPtr(aRequest, kCFStringEncodingUTF8);
This is all in a tight unit test that writes an example HTTP request over a socket with CFNetwork APIs. I have working ObjC code that I'm trying to port to C++. I'm gradually replacing NS API calls with their toll-free bridged equivalents. Everything has been one for one so far until this last line. This is the last piece that needs to be completed.

This is one of those things where Cocoa does all the messy stuff behind the scenes, and you never really appreciate just how complicated things can be until you have to roll up your sleeves and do it yourself.
The simple answer for why it's not 'simple' is because NSString (and CFString) deal with all the complicated details of dealing with multiple character sets, Unicode, etc, etc, while presenting a simple, uniform API for manipulating strings. It's object oriented at its best- the details of 'how' (NS|CF)String deals with strings that have different string encodings (UTF8, MacRoman, UTF16, ISO 2022 Japanese, etc) is a private implementation detail. It all 'just works'.
It helps to understand how [@"..." UTF8String] works. This is a private implementation detail, so this isn't gospel, but based on observed behavior. When you send a string a UTF8String message, the string does something approximating (not actually tested, so consider it pseudo-code, and there are actually simpler ways to do the exact same thing, so this is overly verbose):
- (const char *)UTF8String
{
NSUInteger utf8Length = [self lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
NSMutableData *utf8Data = [NSMutableData dataWithLength:utf8Length + 1UL];
char *utf8Bytes = [utf8Data mutableBytes];
[self getBytes:utf8Bytes
maxLength:utf8Length
usedLength:NULL
encoding:NSUTF8StringEncoding
options:0UL
range:NSMakeRange(0UL, [self length])
remainingRange:NULL];
return(utf8Bytes);
}
You don't have to worry about the memory management issues of dealing with the buffer that -UTF8String returns because the NSMutableData is autoreleased.
A string object is free to keep the contents of the string in whatever form it wants, so there's no guarantee that its internal representation is the one that would be most convenient for your needs (in this case, UTF8). If you're using just plain C, you're going to have to deal with managing some memory to hold any string conversions that might be required. What was once a simple -UTF8String method call is now much, much more complicated.
Most of NSString is actually implemented in/with CoreFoundation / CFString, so there's obviously a path from a CFStringRef -> -UTF8String. It's just not as neat and simple as NSString's -UTF8String. Most of the complication is with memory management. Here's how I've tackled it in the past:
void someFunction(void) {
CFStringRef cfString; // Assumes 'cfString' points to a (NS|CF)String.
const char *useUTF8StringPtr = NULL;
UInt8 *freeUTF8StringPtr = NULL;
CFIndex stringLength = CFStringGetLength(cfString), usedBytes = 0L;
if((useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingUTF8)) == NULL) {
if((freeUTF8StringPtr = malloc(stringLength + 1L)) != NULL) {
CFStringGetBytes(cfString, CFRangeMake(0L, stringLength), kCFStringEncodingUTF8, '?', false, freeUTF8StringPtr, stringLength, &usedBytes);
freeUTF8StringPtr[usedBytes] = 0;
useUTF8StringPtr = (const char *)freeUTF8StringPtr;
}
}
long utf8Length = (long)((freeUTF8StringPtr != NULL) ? usedBytes : stringLength);
if(useUTF8StringPtr != NULL) {
// useUTF8StringPtr points to a NULL terminated UTF8 encoded string.
// utf8Length contains the length of the UTF8 string.
// ... do something with useUTF8StringPtr ...
}
if(freeUTF8StringPtr != NULL) { free(freeUTF8StringPtr); freeUTF8StringPtr = NULL; }
}
NOTE: I haven't tested this code, but it is modified from working code. So, aside from obvious errors, I believe it should work.
The above tries to get the pointer to the buffer that CFString uses to store the contents of the string. If CFString happens to have the string contents encoded in UTF8 (or a suitably compatible encoding, such as ASCII), then it's likely CFStringGetCStringPtr() will return non-NULL. This is obviously the best, and fastest, case. If it can't get that pointer for some reason, say if CFString has its contents encoded in UTF16, then it allocates a buffer with malloc() that is large enough to contain the entire string when it is transcoded to UTF8. Then, at the end of the function, it checks to see if memory was allocated and free()'s it if necessary.
And now for a few tips and tricks... CFString 'tends to' (and this is a private implementation detail, so it can and does change between releases) keep 'simple' strings encoded as MacRoman, which is an 8-bit wide encoding. MacRoman, like UTF8, is a superset of ASCII, such that all characters < 128 are equivalent to their ASCII counterparts (or, in other words, any character < 128 is ASCII). In MacRoman, characters >= 128 are 'special' characters. They all have Unicode equivalents, and tend to be things like extra currency symbols and 'extended western' characters. See Wikipedia - MacRoman for more info. But just because a CFString says it's MacRoman (CFString encoding value of kCFStringEncodingMacRoman, NSString encoding value of NSMacOSRomanStringEncoding) doesn't mean that it has characters >= 128 in it. If a kCFStringEncodingMacRoman encoded string returned by CFStringGetCStringPtr() is composed entirely of characters < 128, then it is exactly equivalent to its ASCII (kCFStringEncodingASCII) encoded representation, which is also exactly equivalent to the string's UTF8 (kCFStringEncodingUTF8) encoded representation.
Depending on your requirements, you may be able to 'get by' using kCFStringEncodingMacRoman instead of kCFStringEncodingUTF8 when calling CFStringGetCStringPtr(). Things 'may' (probably) be faster if you require strict UTF8 encoding for your strings but use kCFStringEncodingMacRoman, then check to make sure the string returned by CFStringGetCStringPtr(string, kCFStringEncodingMacRoman) only contains characters that are < 128. If there are characters >= 128 in the string, then go the slow route by malloc()ing a buffer to hold the converted results. Example:
CFIndex stringLength = CFStringGetLength(cfString), usedBytes = 0L;
// Fast path: ask for the MacRoman bytes, then accept them only if they are pure ASCII.
useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingMacRoman);
for(CFIndex idx = 0L; (useUTF8StringPtr != NULL) && (useUTF8StringPtr[idx] != 0); idx++) {
// Compare as unsigned: a plain (possibly signed) char is never >= 128.
if((unsigned char)useUTF8StringPtr[idx] >= 128) { useUTF8StringPtr = NULL; }
}
if((useUTF8StringPtr == NULL) && ((freeUTF8StringPtr = malloc(stringLength + 1L)) != NULL)) {
CFStringGetBytes(cfString, CFRangeMake(0L, stringLength), kCFStringEncodingUTF8, '?', false, freeUTF8StringPtr, stringLength, &usedBytes);
freeUTF8StringPtr[usedBytes] = 0;
useUTF8StringPtr = (const char *)freeUTF8StringPtr;
}
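The `>= 128` test only works if each byte is treated as unsigned; where plain `char` is signed, a byte such as 0xC3 reads as a negative number and the comparison is never true. A minimal sketch of the ASCII check on its own, in portable C (the name `is_all_ascii` is made up for this illustration and is not a CoreFoundation function):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every byte of the NUL-terminated string is < 128,
   i.e. when the MacRoman, ASCII and UTF-8 encodings of the text coincide. */
bool is_all_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Cast first: a signed char holding 0x80..0xFF is negative. */
        if ((unsigned char)*s >= 128) {
            return false;
        }
    }
    return true;
}
```

If the check fails, fall back to the malloc() path and transcode to UTF8.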
Like I said, you don't really appreciate just how much work Cocoa does for you automatically until you have to do it all yourself. :)

In the sample code above, the following appears:
CFIndex stringLength = CFStringGetLength(cfString)
stringLength is then being used to malloc() a temporary buffer of that many bytes, plus 1.
But the header file for CFStringGetLength() expressly says it returns the number of 16-bit Unicode characters, not bytes. So if some of those Unicode characters are outside the ASCII range, the malloc() buffer won't be long enough to hold the UTF-8 conversion of the string.
Perhaps I'm missing something, but to be absolutely safe, the number of bytes needed to hold N arbitrary Unicode characters is at most 4*N when they're all converted to UTF-8.
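The 4*N bound is safe but not tight. CFStringGetLength() counts UTF-16 code units: a single code unit needs at most 3 bytes of UTF-8, and a character outside the BMP uses two code units (a surrogate pair) for 4 bytes, so 3*N + 1 bytes should also be enough, including the terminator. CoreFoundation's CFStringGetMaximumSizeForEncoding() can also be asked for an upper bound directly. The sketch below shows the arithmetic in portable C; the function names are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed for `count` UTF-16 code units.
   An unpaired surrogate is counted as 3 bytes (e.g. replaced by U+FFFD). */
size_t utf8_length_of_utf16(const uint16_t *units, size_t count)
{
    assert(units != NULL || count == 0);
    size_t bytes = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        if (u < 0x80) {
            bytes += 1;                       /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;                       /* e.g. Latin-1 letters */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < count &&
                   units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            bytes += 4;                       /* surrogate pair: 2 units, 4 bytes */
            i++;
        } else {
            bytes += 3;                       /* rest of the BMP */
        }
    }
    return bytes;
}

/* Worst-case buffer size, including the terminating NUL. */
size_t utf8_worst_case_size(size_t utf16_units)
{
    return 3 * utf16_units + 1;
}
```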

From the documentation:
Whether or not this function returns a valid pointer or NULL depends on many factors, all of which depend on how the string was created and its properties. In addition, the function result might change between different releases and on different platforms. So do not count on receiving a non-NULL result from this function under any circumstances.
You should use CFStringGetCString if CFStringGetCStringPtr returns NULL.

Here's some working code. I started with @johne's answer, replaced CFStringGetBytes with CFStringGetCString for simplicity, and made the correction suggested by @Doug.
const char *useUTF8StringPtr = NULL;
char *freeUTF8StringPtr = NULL;
if ((useUTF8StringPtr = CFStringGetCStringPtr(cfString, kCFStringEncodingUTF8)) == NULL)
{
CFIndex stringLength = CFStringGetLength(cfString);
CFIndex maxBytes = 4 * stringLength + 1;
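freeUTF8StringPtr = (char *)malloc(maxBytes);
// Only use the buffer if both the allocation and the conversion succeeded.
if (freeUTF8StringPtr != NULL &&
CFStringGetCString(cfString, freeUTF8StringPtr, maxBytes, kCFStringEncodingUTF8))
useUTF8StringPtr = freeUTF8StringPtr;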
}
// ... do something with useUTF8StringPtr...
if (freeUTF8StringPtr != NULL)
free(freeUTF8StringPtr);

If it's destined for a socket, perhaps CFStringGetBytes() would be your best choice?
Also note that the documentation for CFStringGetCStringPtr() says:
This function either returns the requested pointer immediately, with no memory allocations and no copying, in constant time, or returns NULL. If the latter is the result, call an alternative function such as the CFStringGetCString function to extract the characters.

Here's a way to printf a CFStringRef which implies we get a '\0'-terminated string from a CFStringRef:
// from: http://lists.apple.com/archives/carbon-development/2001/Aug/msg01367.html
// by Ali Ozer
// gcc -Wall -O3 -x objective-c -fobjc-exceptions -framework Foundation test.c
#import <stdio.h>
#import <Foundation/Foundation.h>
/*
This function will print the provided arguments (printf style varargs) out to the console.
Note that the CFString formatting function accepts "%@" as a way to display CF types.
For types other than CFString and CFNumber, the result of %@ is mostly for debugging
and can differ between releases and different platforms. Cocoa apps (or any app which
links with the Foundation framework) can use NSLog() to get this functionality.
*/
void show(CFStringRef formatString, ...) {
CFStringRef resultString;
CFDataRef data;
va_list argList;
va_start(argList, formatString);
resultString = CFStringCreateWithFormatAndArguments(NULL, NULL, formatString, argList);
va_end(argList);
data = CFStringCreateExternalRepresentation(NULL, resultString,
CFStringGetSystemEncoding(), '?');
if (data != NULL) {
printf ("%.*s\n", (int)CFDataGetLength(data), CFDataGetBytePtr(data));
CFRelease(data);
}
CFRelease(resultString);
}
int main(void)
{
// To use:
int age = 25;
CFStringRef name = CFSTR("myname");
show(CFSTR("Name is %@, age is %d"), name, age);
return 0;
}

Related

Wrong charset of file names after unzip

I have the following problem: I extracted a zip file via SSZipArchive (in a Swift app) and there are some file names with "invalid" characters.
I think the reason is that I zipped the files under Windows and so the names are now coded in ANSI.
Is there a way to convert all the "corrupted" folder and file names during the unzip process?
Or later? It would be no problem if I have to iterate over the folder tree and rename the files.
But I have no idea how to find out which names are set in ANSI and I also don't know how to correct the charset.
The official spec says that the path should be either encoded in Code Page 437 MS-DOS Latin US or UTF-8 (if Bit 11 of the general purpose field is set):
D.1 The ZIP format has historically supported only the original IBM PC
character encoding set, commonly referred to as IBM Code Page 437.
This limits storing file name characters to only those within the
original MS-DOS range of values and does not properly support file
names in other character encodings, or languages. To address this
limitation, this specification will support the following change.
D.2 If general purpose bit 11 is unset, the file name and comment
should conform to the original ZIP character encoding. If general
purpose bit 11 is set, the filename and comment must support The
Unicode Standard, Version 4.1.0 or greater using the character
encoding form defined by the UTF-8 storage specification. The
Unicode Standard is published by the The Unicode Consortium
(www.unicode.org). UTF-8 encoded data stored within ZIP files is
expected to not include a byte order mark (BOM).
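In code, the rule quoted above comes down to testing one bit of the 16-bit general purpose flag field of each entry. A minimal sketch in C (the macro and function names are invented for this illustration; they do not come from SSZipArchive or any other library):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 11 of the general purpose bit flag: names and comments are UTF-8. */
#define ZIP_FLAG_UTF8 (1u << 11)

/* true  -> decode the file name as UTF-8
   false -> decode it with the original ZIP encoding (IBM Code Page 437) */
bool zip_name_is_utf8(uint16_t general_purpose_flags)
{
    return (general_purpose_flags & ZIP_FLAG_UTF8) != 0;
}
```

Archives written by some Windows tools leave the bit unset while storing names in a local "ANSI" code page, which is why a decoder may still need an encoding-detection fallback.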
I recently released a Swift open source implementation of the ZIP file format called ZIPFoundation. It conforms to the standard and should be able to detect Windows path names and decode them properly.
Probably fixed in latest SSZipArchive (currently 2.1.1). I've implemented support for non-Unicode filenames in a way similar to the code below, so you can reuse it to process your filenames yourself if you want.
OK, it's in Objective-C, but as SSZipArchive already includes the fix, you shouldn't need it anymore. Otherwise, either add a bridging header to include the Objective-C code in your Swift app, or convert it to Swift (should be easy).
@implementation NSString (SSZipArchive)
+ (NSString *)filenameStringWithCString:(const char *)filename size:(uint16_t)size_filename
{
// unicode conversion attempt
NSString *strPath = @(filename);
if (strPath) {
return strPath;
}
// if filename is non-unicode, detect and transform Encoding
NSData *data = [NSData dataWithBytes:(const void *)filename length:sizeof(unsigned char) * size_filename];
// supported encodings are in [NSString availableStringEncodings]
[NSString stringEncodingForData:data encodingOptions:nil convertedString:&strPath usedLossyConversion:nil];
if (strPath) {
return strPath;
}
// if filename encoding is non-detected, we default to something based on data
// note: hexString is more readable than base64RFC4648 for debugging unknown encodings
strPath = [data hexString];
return strPath;
}
@end
@implementation NSData (SSZipArchive)
// initWithBytesNoCopy from NSProgrammer, Jan 25 '12: https://stackoverflow.com/a/9009321/1033581
// hexChars from Peter, Aug 19 '14: https://stackoverflow.com/a/25378464/1033581
// not implemented as too lengthy: a potential mapping improvement from Moose, Nov 3 '15: https://stackoverflow.com/a/33501154/1033581
- (NSString *)hexString
{
const char *hexChars = "0123456789ABCDEF";
NSUInteger length = self.length;
const unsigned char *bytes = self.bytes;
char *chars = malloc(length * 2);
// TODO: check for NULL
char *s = chars;
NSUInteger i = length;
while (i--) {
*s++ = hexChars[*bytes >> 4];
*s++ = hexChars[*bytes & 0xF];
bytes++;
}
NSString *str = [[NSString alloc] initWithBytesNoCopy:chars
length:length * 2
encoding:NSASCIIStringEncoding
freeWhenDone:YES];
return str;
}
@end

Objective C: looping issue

I'm just starting out with Objective-C (coming from Java) and I'm working on a calculator program just to practice with the syntax and some basic stuff. The way I'm going about it is having the user input a string and looking through for operators (taking order of operations into account) and then finding the term surrounding that operator, calculating it, replacing the term with the answer, and repeating for all the terms; however, I'm having an issue with the method I'm using to calculate the term. I pass in the index of the operator and have it loop backwards until it hits another operator to find the number immediately before it, and do the same forwards for the number after. My issue is that the loop does not stop when it hits the operators, and instead just continues until the end of the string in both directions. It's probably something really simple that I've overlooked but I've been trying to figure this out for a while and can't seem to get it. I've included an SSCCE of just the first half of the method, with a predetermined string and operator index. (Also, a secondary question: is there any better way to post code blocks on this site rather than manually putting in 4 spaces before every line?)
#import <Foundation/Foundation.h>
int firstNumInTerm(int index);
NSString *calculation;
int main(int argc, const char * argv[])
{
@autoreleasepool {
calculation = @"51-43+378*32";
int firstNumber = firstNumInTerm(9);
NSLog(@"The number before the term is: %i", firstNumber);
}
return 0;
}
int firstNumInTerm(int index){
int firstNumIndex = index - 1;
int firstNumLength = 1;
NSRange prevChar = NSMakeRange(firstNumIndex - 1, 1);
while ([calculation substringWithRange:prevChar] != @"*" &&
[calculation substringWithRange:prevChar] != @"/" &&
[calculation substringWithRange:prevChar] != @"+" &&
[calculation substringWithRange:prevChar] != @"-" &&
firstNumIndex > 0) {
NSLog(@"prevChar: %@", [calculation substringWithRange:prevChar]);//TEST
firstNumIndex--; firstNumLength++;
prevChar = NSMakeRange(firstNumIndex - 1, 1);
}
NSRange firstRange = NSMakeRange(firstNumIndex, firstNumLength);
int firstNum = [[calculation substringWithRange:firstRange] intValue];
NSLog(@"firstNum String: %@", [calculation substringWithRange:firstRange]);//TEST
NSLog(@"firstNum int: %i", firstNum);//TEST
return firstNum;
}
The problem with this line:
[calculation substringWithRange:prevChar] != @"*" is that you are comparing the value of two pointers. [calculation substringWithRange:prevChar] returns a pointer to an NSString object, as does the NSString literal @"*". The simplest way to compare two strings is by using the isEqualToString: method of NSString. For example:
NSString *myName = @"Stephen";
NSString *yourName = @"Matt";
if([myName isEqualToString:yourName]){
printf("We have the same name!");
}
else{
printf("We do not have the same name");
}
If you are going to be doing a lot of string comparisons, it might be wise to write a macro, such as:
#define STREQ(x,y) [x isEqualToString:y]
Regarding copy/pasting code into StackOverflow:
Since I use Xcode 99% of the time, I find it handy to select the text I am going to copy and then hit Cmd-]. This shifts the text to the right one tab-width. I then Cmd-c to copy and then Cmd-[ to undo the right-shift.
You can't do that in Objective-C: [calculation substringWithRange:prevChar] != @"*"
Instead, you need to do:
[[calculation substringWithRange:prevChar] compare:@"*"] != NSOrderedSame
(I know, it's longer, but comparison operators aren't overloaded for strings: != compares the two pointers, not the characters they point to, just as it compares references rather than contents in Java.)
I see others have answered this to correct the issue with your string comparison operations, but a better way to split this string up would be using NSString's native parsing methods. For example:
NSArray *numbers = [ calculation componentsSeparatedByCharactersInSet:
[ NSCharacterSet characterSetWithCharactersInString: @"*/+-" ] ];
Will give you an array containing each of the numbers (in order) in your string. You could come up with custom parsing routines, but using NSString's is going to likely be more straightforward and a lot less buggy. It will also be easier for someone else to read and understand.
while((![[calculation substringWithRange:prevChar] isEqualToString:@"*"]) && …){
}
or
NSArray *operators = @[@"+", @"-", @"*", @"/"];
while(![operators containsObject:[calculation substringWithRange:prevChar]])
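For comparison, the backward scan from the question can be sketched in plain C, where the same pitfall exists (== on two char * values compares addresses, not contents). Working with single characters sidesteps it, because a char is compared by value. The function names below are invented for the illustration, and the sketch assumes the term consists of decimal digits only:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A char is compared by value, so no string-equality call is needed. */
static bool is_operator(char c)
{
    return c != '\0' && strchr("*/+-", c) != NULL;
}

/* Returns the number that ends just before the operator at operator_index.
   Assumes the characters between the previous operator and this one are digits. */
int first_num_in_term(const char *calculation, int operator_index)
{
    assert(calculation != NULL && operator_index > 0);

    /* Walk left until the previous operator or the start of the string. */
    int start = operator_index - 1;
    while (start > 0 && !is_operator(calculation[start - 1])) {
        start--;
    }

    /* Accumulate the digits in [start, operator_index). */
    int value = 0;
    for (int i = start; i < operator_index; i++) {
        value = value * 10 + (calculation[i] - '0');
    }
    return value;
}
```

The bounds test comes first in the loop condition, so the character before index 0 is never read.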

copy NSData into int32_t variable

I have a big NSDictionary full of entries that are all of type NSData. I have several entries that need to be of type int32_t however I am not 100% sure how to copy the data in the entries of the NSDictionary across..
is it as simple as doing the following -
.h
//..
int32_t myint;
}
@property (assign) int32_t myint;
//..
.m
//..
@synthesize cardID;
//..
- (void)assignSearchData:(NSData*)searchData
{
myint = [searchData objectForKey:@"IntKey"];
}
//..
or do I need some type of data conversion inside my method?
and a quick side question, have I even declared the int32_t correctly? I have looked for an example in the docs and on here but am struggling to find one.
Well, you can access the raw bytes in the data object directly.
void const *dataPtr = [data bytes];
Now that you have a pointer to raw memory, you can copy it any way you want (these rules apply to any data transfer, not just iOS). If you need to consider alignment boundaries, you need to use memcpy.
int32_t myInt;
memcpy(&myInt, dataPtr, sizeof(myInt));
Otherwise, if on an architecture that allows integer manipulation across alignment boundaries...
int32_t myInt = *(int32_t const *)dataPtr;
Now, ARM supports access across alignment boundaries, but it's much slower. I have not done a performance comparison, but you are not continuing to use the misaligned pointer, so it may be better than the memcpy function call (though, to be honest, that is probably way too much performance consideration for you).
The biggest concern is byte-order of the data. If it's provided by you, then do whatever you want, but you should prefer one standard.
If it's coming from a third party, it's probably in network byte order (aka big-endian). You may need to convert to your host endian representation. Fortunately, that's straightforward with hton and ntoh and their friends.
FWIW, Intel is little-endian, and network-byte-order is big-endian, modern Macs and iOS devices are little-endian, older Macs are big-endian.
// Convert from network order to host order.
// Does the right thing wherever your code is running
myInt = ntohl(myInt);
In short, either...
int32_t myInt = ntohl(*(int32_t const *)[data bytes]);
or
int32_t myInt;
memcpy(&myInt, [data bytes], sizeof(myInt));
myInt = ntohl(myInt);
So, the data has to get in there somehow. It's the inverse...
int32_t myInt = 42;
myInt = htonl(myInt);
NSData *data = [NSData dataWithBytesNoCopy:&myInt length:sizeof(myInt) freeWhenDone:NO];
Of course, use the right Data initializer... that one will just use those raw bytes on the stack, so you better not use it after the stack unwinds.
You don't have to worry about alignment on the data you send, unless you are guaranteeing the receiver that the data will be aligned to some boundary.
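Putting the memcpy and byte-order steps together, a round trip can be sketched in plain C as below. The helper names are invented for the illustration; ntohl() and htonl() come from the POSIX header <arpa/inet.h>, and the sketch assumes the wire format is a 4-byte big-endian integer:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit integer stored in network (big-endian) byte order.
   memcpy makes this safe even when `bytes` is not 4-byte aligned. */
int32_t int32_from_network_bytes(const void *bytes)
{
    assert(bytes != NULL);
    uint32_t raw;
    memcpy(&raw, bytes, sizeof raw);
    return (int32_t)ntohl(raw);
}

/* Writes `value` into `out` (4 bytes) in network byte order. */
void int32_to_network_bytes(int32_t value, void *out)
{
    assert(out != NULL);
    uint32_t raw = htonl((uint32_t)value);
    memcpy(out, &raw, sizeof raw);
}
```

With NSData, the pointer passed to the first helper would be `[data bytes]`, after checking that `[data length]` is at least 4.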
Yes, int32_t is fine. So you have a stream of bytes. What you need to know is what the layout of the bytes is. If you know what the data is, it will be pretty easy to construct it.
Given a NSData object with a length of 4 (size of int32_t), then you would :
int32_t val;
if([data length] == sizeof(uint32_t)) {
const void *bytes = [data bytes];
// if the layout is same as iOS then
memcpy(&val, bytes, sizeof(int32_t) );
}
if that is not the case, then you can try:
unsigned char val[4] = {0,0,0,0};
if([data length] == sizeof(uint32_t)) {
const void *bytes = [data bytes];
memcpy(val, bytes, sizeof(int32_t) );
// then rearrange the bytes
}

iOS - libical / const char * - memory usage

I am using the libical library to parse the iCalendar format and read the information I need out of it. It is working absolutely fine so far, but there is one odd thing concerning ical.
This is my code:
icalcomponent *root = icalparser_parse_string([iCalData cStringUsingEncoding:NSUTF8StringEncoding]);
if (root)
{
icalcomponent *currentEvent = icalcomponent_get_first_component(root, ICAL_VEVENT_COMPONENT);
while (currentEvent)
{
while(currentProperty)
{
icalvalue *value = icalproperty_get_value(currentProperty);
char *icalString = icalvalue_as_ical_string_r(value); //seems to leak
NSString *currentValueAsString = [NSString stringWithCString:icalString
encoding:NSUTF8StringEncoding];
icalvalue_free(value);
//...
//import data
//...
icalString = nil;
currentValueAsString = nil;
icalproperty_free(currentProperty);
currentProperty = icalcomponent_get_next_property(currentEvent, ICAL_ANY_PROPERTY);
} //end while
} //end while
icalcomponent_free(currentEvent);
}
icalcomponent_free(root);
//...
I used Instruments to check my memory usage and was able to find out that this line seems to leak:
char *icalString = icalvalue_as_ical_string_r(value); //seems to leak
If I copy and paste this line five or six times, my memory usage grows by about 400 KB and is never released.
There is no free method for the icalvalue_as_ical_string_r method because it's returning a char *.
Any suggestions how to solve this issue? I would appreciate any help!
EDIT
Taking a look at the apple doc says the following:
To get a C string from a string object, you are recommended to use UTF8String. This returns a const char * using UTF8 string encoding.
const char *cString = [@"Hello, world" UTF8String];
The C string you receive is owned by a temporary object, and will become invalid when automatic deallocation takes place. If you want to get a permanent C string, you must create a buffer and copy the contents of the const char * returned by the method.
But how do I release a char * string properly when using ARC?
I tried to add @autoreleasepool {...} in front of my while-loop, but without any effect. Memory usage still increases...
Careful with the statement "no free method...because it's returning a char*"; that is never something you can just assume.
In the absence of documentation you can look at the source code of the library to see what it does; for example:
http://libical.sourcearchive.com/documentation/0.44-2/icalvalue_8c-source.html
Unfortunately this function can do a lot of different things. There are certainly some cases where calling free() on the returned buffer would be right but maybe that hasn't been ensured in every case.
I think it would be best to request a proper deallocation method from the maintainers of the library. They need to clean up their own mess; the icalvalue_as_ical_string_r() function has at least a dozen cases in a switch that might have different deallocation requirements.
icalvalue_as_ical_string_r returns a char * because it has done a malloc() for your result string. If your pointer is non-NULL, you have to free() it after use.
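The ownership rule described in that last answer is the usual C convention for functions that hand back a freshly allocated buffer: the caller owns it and must free() it, and neither ARC nor an autorelease pool will do that for a plain char *. Setting the pointer to nil only forgets the address. A minimal sketch of the pattern, using a hypothetical stand-in function (`make_value_string_r` is not part of libical):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a "_r" style function: formats `value` into a
   heap buffer that the CALLER owns. Returns NULL on failure. */
char *make_value_string_r(int value)
{
    int needed = snprintf(NULL, 0, "%d", value);   /* length without the NUL */
    if (needed < 0) {
        return NULL;
    }
    char *buffer = malloc((size_t)needed + 1);
    if (buffer != NULL) {
        snprintf(buffer, (size_t)needed + 1, "%d", value);
    }
    return buffer;
}

/* Typical call site: copy what you need, then release the buffer. */
int value_string_length(int value)
{
    char *text = make_value_string_r(value);
    assert(text != NULL);
    int length = 0;
    while (text[length] != '\0') {
        length++;
    }
    free(text);            /* without this call, every iteration leaks */
    return length;
}
```

In the loop from the question, the matching spot for free(icalString) would be right after the NSString has been created from it, since stringWithCString:encoding: copies the bytes.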

NSString stringWithFormat swizzled to allow missing format numbered args

Based on this SO question asked a few hours ago, I have decided to implement a swizzled method that will allow me to take a formatted NSString as the format arg into stringWithFormat, and have it not break when omitting one of the numbered arg references (%1$@, %2$@).
I have it working, but this is the first copy, and seeing as this method is going to be potentially called hundreds of thousands of times per app run, I need to bounce this off of some experts to see if this method has any red flags, major performance hits, or optimizations
#define NUMARGS(...) (sizeof((int[]){__VA_ARGS__})/sizeof(int))
@implementation NSString (UAFormatOmissions)
+ (id)uaStringWithFormat:(NSString *)format, ... {
if (format != nil) {
va_list args;
va_start(args, format);
// $@ is an ordered variable (%1$@, %2$@...)
if ([format rangeOfString:@"$@"].location == NSNotFound) {
//call apples method
NSString *s = [[[NSString alloc] initWithFormat:format arguments:args] autorelease];
va_end(args);
return s;
}
NSMutableArray *newArgs = [NSMutableArray arrayWithCapacity:NUMARGS(args)];
id arg = nil;
int i = 1;
while (arg = va_arg(args, id)) {
NSString *f = [NSString stringWithFormat:@"%%%d\$\@", i];
i++;
if ([format rangeOfString:f].location == NSNotFound) continue;
else [newArgs addObject:arg];
}
va_end(args);
char *newArgList = (char *)malloc(sizeof(id) * [newArgs count]);
[newArgs getObjects:(id *)newArgList];
NSString* result = [[[NSString alloc] initWithFormat:format arguments:newArgList] autorelease];
free(newArgList);
return result;
}
return nil;
}
The basic algorithm is:
search the format string for the %1$@, %2$@ variables by searching for $@
if not found, call the normal stringWithFormat and return
else, loop over the args
if the format has a position variable (%i$@) for position i, add the arg to the new arg array
else, don't add the arg
take the new arg array, convert it back into a va_list, and call initWithFormat:arguments: to get the correct string.
The idea is that I would run all [NSString stringWithFormat:] calls through this method instead.
This might seem unnecessary to many, but click on to the referenced SO question (first line) to see examples of why I need to do this.
Ideas? Thoughts? Better implementations? Better Solutions?
Whoa there!
Rather than tampering with a core method, which will very probably introduce subtle bugs, just turn on "Static Analyzer" in your project options so that it runs on every build. If you get the arguments wrong, it will issue a compiler warning for you.
I appreciate your desire to make the application more robust, but I think rewriting this method is more likely to break your application than to save it.
How about defining your own interim method instead of using format specifiers and stringWithFormat:? For example, you could define your own method replaceIndexPoints: to look for ($1) instead of %1$@. You would then format your string and insert translated replacements independently. This method could also take an array of strings, with NSNull or empty strings at the indexes that don't exist in the “untranslated” string.
Your method could look like this (if it were a category method for NSMutableString):
- (void) replaceIndexPointsWithStrings:(NSArray *) replacements
{
// 1. look for largest index in "self".
// 2. loop from the beginning to the largest index, replacing each
// index with corresponding string from replacements array.
}
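As a rough illustration of that idea, here is how the ($1), ($2), ... substitution could look in plain C. This is a sketch under assumptions, not the answerer's implementation: the function names are invented, indexes are 1-based, and a placeholder with no matching (or a NULL) replacement expands to nothing.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Walks the template once. When `out` is NULL it only measures;
   otherwise it writes the expanded, NUL-terminated text. Returns the length. */
static size_t expand_into(const char *tmpl, const char *const *reps,
                          size_t count, char *out)
{
    size_t n = 0;
    for (const char *p = tmpl; *p != '\0'; ) {
        if (p[0] == '(' && p[1] == '$' && isdigit((unsigned char)p[2])) {
            const char *q = p + 2;
            size_t index = 0;
            while (isdigit((unsigned char)*q) && index < 1000000) {
                index = index * 10 + (size_t)(*q - '0');
                q++;
            }
            if (*q == ')') {            /* a complete "($N)" placeholder */
                if (index >= 1 && index <= count && reps[index - 1] != NULL) {
                    size_t len = strlen(reps[index - 1]);
                    if (out != NULL) memcpy(out + n, reps[index - 1], len);
                    n += len;
                }
                p = q + 1;
                continue;
            }
        }
        if (out != NULL) out[n] = *p;   /* ordinary character: copy as-is */
        n++;
        p++;
    }
    if (out != NULL) out[n] = '\0';
    return n;
}

/* Returns a malloc()ed string the caller must free(), or NULL on failure. */
char *replace_index_points(const char *tmpl, const char *const *reps, size_t count)
{
    assert(tmpl != NULL && (reps != NULL || count == 0));
    char *result = malloc(expand_into(tmpl, reps, count, NULL) + 1);
    if (result != NULL) expand_into(tmpl, reps, count, result);
    return result;
}
```

Because missing indexes simply expand to nothing, a translation that omits one of the placeholders cannot shift or crash the remaining substitutions.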
Here's a few issues that I see with your current implementation (at a glance):
The __VA_ARGS__ thingy explained in the comments.
When you use while (arg = va_arg(args, id)), you are assuming that the arguments are nil terminated (such as for arrayWithObjects:), but with stringWithFormat: this is not a requirement.
I don't think you're required to escape the $ and @ in your string format in your arg-loop.
I'm not sure this would work well if uaStringWithFormat: was passed something larger than a pointer (i.e. long long if pointers are 32-bit). This may only be an issue if your translations also require inserting unlocalised numbers of long long magnitude.