I want to encode a short title in filenames. The problem is that occasionally the title will contain a character such as a colon or a slash. Is there a standard encoding that would be typical/appropriate for this?
EDIT: to clarify, I want to encode the title in such a way that the encoded title could be used as a filename. Or is that called percent escaping?
The way I do this is with a category on NSURL, which I use to get the NSURL for a filename in a particular directory. Once I have this NSURL, I can fetch or save the file using the URL after performing the usual checks about whether or not the file already exists and handling those cases accordingly.
The relevant code snippet is:
+ (NSURL *)adnURLForFileName:(NSString *)fileName inDirectory:(NSSearchPathDirectory)searchDirectory {
    NSString *percentEscapedFileName = [fileName stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
    NSFileManager *fileManager = [[NSFileManager alloc] init];
    NSURL *URLForDirectory = [[fileManager URLsForDirectory:searchDirectory inDomains:NSUserDomainMask] objectAtIndex:0];
    return [NSURL URLWithString:percentEscapedFileName relativeToURL:URLForDirectory];
}
You can download the full category code from GitHub - NSURL+ADNFileHelpers
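For comparison, here is a minimal Python sketch of the same percent-escaping idea applied to a title before it is used as a path component. The helper names `title_to_filename` and `filename_to_title` are hypothetical, and passing `safe=""` is a choice made here so that the slash gets escaped as well (by default `quote` leaves it alone).

```python
from urllib.parse import quote, unquote


def title_to_filename(title: str) -> str:
    """Percent-escape a title so it can serve as a single path component."""
    # safe="" makes quote() escape "/" too; by default it is left untouched.
    return quote(title, safe="")


def filename_to_title(filename: str) -> str:
    """Reverse title_to_filename()."""
    return unquote(filename)


if __name__ == "__main__":
    name = title_to_filename("Notes: draft 1/2")
    print(name)
    print(filename_to_title(name))
```

The round trip is lossless, at the cost of filenames that are less pleasant to read.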
You could use -stringByReplacingOccurrencesOfString:withString: to replace the slash character with U+2044, the "solidus" aka "fraction slash". It looks like this: ⁄
http://en.wikipedia.org/wiki/Solidus_(punctuation)
The slash is not allowed in the Unix APIs. The colon is not allowed in HFS and in the old File Manager APIs. The same filename character will show up as a colon in the former and as a slash in the latter. In practice: you can use the Finder to rename a file to "/" (because the Finder uses the traditional Mac separator of :), but it will show up as ":" if you use ls.
If you need to allow both colons and slashes, you need to encode the characters somehow. You could use URL-style escaping, but if you expect the user to look at the filename in the Finder or in some other program, it's going to look horrible. It's better to escape just the path separator. For example, if you're using the Unix style APIs (path separator /), you could encode / as :- and : as :: (to avoid ambiguity). Or you could use some other little-used character for the escape.
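The `:-` / `::` scheme described above can be sketched as follows. This is only an illustration in Python, assuming the Unix-style APIs (so `/` is the character that must not appear); the function names `escape_title` and `unescape_title` are made up for the example.

```python
def escape_title(title: str) -> str:
    """Encode ':' as '::' and '/' as ':-' so the result contains no '/'."""
    out = []
    for ch in title:
        if ch == ":":
            out.append("::")
        elif ch == "/":
            out.append(":-")
        else:
            out.append(ch)
    return "".join(out)


def unescape_title(name: str) -> str:
    """Reverse escape_title(); raises ValueError on a malformed escape."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != ":":
            out.append(ch)
            i += 1
            continue
        # A colon always starts a two-character escape sequence.
        nxt = name[i + 1 : i + 2]
        if nxt == ":":
            out.append(":")
        elif nxt == "-":
            out.append("/")
        else:
            raise ValueError(f"malformed escape at index {i}")
        i += 2
    return "".join(out)
```

Because every literal colon is doubled, a title that already contains `:-` still round-trips correctly.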
I have approached this problem by filtering the title before using it in the filename. NSString has some useful methods, such as stringByStandardizingPath and stringByReplacingOccurrencesOfString:withString:. The filtering approach is lossy, in that the original title information might not be restorable. Similarly, I don't think encoding would work because iOS allows such a wide range of characters in its filenames. One possible alternative solution could be a plist archive with key=filename, value=title.
Related
I am looking for a way to decode quoted-printables.
The quoted-printables are for arabic characters and look like this:
=D8=B3=D8=B9=D8=A7=D8=AF
I need to convert it to a string, and store or display it.
I've seen posts on Stack Overflow for the other direction (encoding), but couldn't find anything on decoding.
Uhm, it's a little hacky but you could replace the = characters with a % character and use NSString's stringByReplacingPercentEscapesUsingEncoding: method. Otherwise, you could essentially split the string on the = characters, convert each element to a byte value (easily done using NSScanner), put the byte values into a C array, and use NSString's initWithBytes:length:encoding: method.
Note that your example isn't technically in quoted-printable format, which specifies that a quoted-printable is a three character sequence consisting of an = character followed by two hex digits.
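To make the second approach concrete (turn each escape into a byte, then build the string from the bytes), here is a rough Python sketch rather than the Objective-C version. The function names are invented for the example, the input is assumed to be plain ASCII, and UTF-8 is assumed as the target encoding; Python's standard `quopri` module is included alongside it for comparison.

```python
import quopri
import re

# An "=" followed by exactly two hex digits.
_HEX_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def decode_qp_manual(text: str, encoding: str = "utf-8") -> str:
    """Turn each =XX escape into the byte it names, then decode the bytes."""
    raw = text.encode("ascii")
    data = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return data.decode(encoding)


def decode_qp_stdlib(text: str, encoding: str = "utf-8") -> str:
    """Same result using the standard library's quopri module."""
    return quopri.decodestring(text.encode("ascii")).decode(encoding)


if __name__ == "__main__":
    print(decode_qp_manual("=D8=B3=D8=B9=D8=A7=D8=AF"))
```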
In my case I was coming from EML... bensnider's answer worked great... quoted-printable (at least in EML) uses an = sign followed by \r\n to signify a line wrapping, so this was the code needed to cleanly translate:
(Made as a category cause I loves dem)
@interface NSString (QuotedPrintable)
- (NSString *)quotedPrintableDecode;
@end
@implementation NSString (QuotedPrintable)
- (NSString *)quotedPrintableDecode
{
    NSString *decodedString = [self stringByReplacingOccurrencesOfString:@"=\r\n" withString:@""]; // Ditch the line wrap indicators
    decodedString = [decodedString stringByReplacingOccurrencesOfString:@"=" withString:@"%"]; // Change the ='s to %'s
    decodedString = [decodedString stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]; // Replace the escaped strings.
    return decodedString;
}
@end
Which worked great for decoding my EML / UTF-8 objects!
Bensnider's answer is correct, and it is the easy way to do it.
You'll need to replace each "=" with "%":
NSString *s = @"%D8%B3%D8%B9%D8%A7%D8%AF";
NSString *s2 = [s stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
s2 then holds "سعاد", which makes sense, so this should work in a straightforward way without really being a hack.
In some cases the line ends are not "=\r\n" but are only "=\n", in which case you need another step:
decodedString = [decodedString stringByReplacingOccurrencesOfString:@"=\n" withString:@""];
Otherwise, the final step fails due to the unbalanced "%" at the end of a line.
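Both line-ending variants can be handled in one pass. Here is a small Python sketch of that idea, assuming ASCII input and UTF-8 content; the names `strip_soft_breaks` and `decode_wrapped_qp` are hypothetical, and the actual decoding is delegated to the standard `quopri` module.

```python
import quopri
import re

# A soft line break is "=" directly followed by CRLF or a bare LF.
_SOFT_BREAK = re.compile(r"=\r?\n")


def strip_soft_breaks(text: str) -> str:
    """Remove quoted-printable soft line breaks in either line-ending style."""
    return _SOFT_BREAK.sub("", text)


def decode_wrapped_qp(text: str, encoding: str = "utf-8") -> str:
    """Unwrap the soft breaks first, then decode the remaining =XX escapes."""
    unwrapped = strip_soft_breaks(text)
    return quopri.decodestring(unwrapped.encode("ascii")).decode(encoding)
```

Removing the wrap markers before touching the other `=` characters is what keeps the later decoding step from seeing a dangling escape at the end of a line.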
I know nothing of the iPhone, but most email processing libraries will contain functions to do this, as email is where this format is used. I suggest searching for MIME decoding functions.
The earlier poster's approach also seems fine to me. I feel he is being a little too self-deprecating in describing it as hacky :)
Please see a working solution that takes strings containing quoted-printable sequences and resolves those graphemes. The only thing you should pay attention to is the encoding (that answer is based on UTF-8, but it can easily be switched to any other): https://stackoverflow.com/a/32903103/2799410
I'm saving an NSString inside an NSArray, and that NSArray inside an NSDictionary. When I do this with a string like Hi I'm XYZ, what I see in my NSDictionary has the corresponding UTF character code stored in place of the single quote.
How can I avoid this, or how can I get my actual text, special characters included, back from the NSArray or the NSDictionary?
Any help is appreciated.
NSString internally uses Unicode characters. So it easily can handle all sorts of characters from different languages.
You cannot choose the internal encoding of NSString. It's always Unicode. If you have an encoding problem, then you have either created the NSString instance incorrectly or you have output the instance the wrong way.
And there's no such thing as a UTF character.
Please better describe your problem and show the relevant source code.
I am using libxml2 in my iPhone app. I have an NSString that holds the pathname to an XML file. The pathname may include non-ASCII characters. I want to get a C string representation of the NSString to pass to xmlReadFile(). It appears that cStringUsingEncoding: gives me the representation I seek, but I am not clear on which encoding to use.
I wonder if there is a "default" encoding in iPhone OS that I can use here and ensure that I can roundtrip non-ASCII pathnames.
Use NSString's fileSystemRepresentation. If the string contains characters that are not representable in the file system's encoding then this method will raise an exception.
To convert back, use NSFileManager's stringWithFileSystemRepresentation:length:
Currently I have this for my file path and file...
NSURL *storeUrl = [NSURL fileURLWithPath: [[self applicationDocumentsDirectory] stringByAppendingPathComponent: @"Shared\PartyPlanner.sqlite"]];
This allows me to share the file with iTunes, but instead of having 'PartyPlanner.sqlite' in 'applicationDocumentsDirectory\Shared',
I end up with "SharedPartyPlanner.sqlite" in 'applicationDocumentsDirectory'.
Is there a cleaner or easier way to get to the Shared folder inside applicationDocumentsDirectory?
In UNIX-like systems (including the iPhone OS), the directory separator is /, not \, so the path component should be written Shared/PartyPlanner.sqlite.
Also, in C-like languages (including Objective-C), the \ in a string is used to escape a character, e.g. \n → a new line. You need to type Shared\\PartyPlanner.sqlite if you really need a backslash in the file name.
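As a side illustration in Python (whose string literals treat `\\` and `\n` the same way), the sketch below builds the path with forward slashes and shows that an escaped backslash is a single character. The helper name `shared_store_path` and the example directory are assumptions made for the example only.

```python
import posixpath


def shared_store_path(documents_dir: str) -> str:
    """Join path components with '/', the separator on UNIX-like systems."""
    return posixpath.join(documents_dir, "Shared", "PartyPlanner.sqlite")


if __name__ == "__main__":
    # "/var/Documents" is a placeholder, not a real iOS path.
    print(shared_store_path("/var/Documents"))
    # "\\" in source code is one backslash character in the string.
    print(len("\\"))
```

Joining components one at a time also sidesteps the question of which separator to type inside the literal.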
I'm trying to write a small Python script to parse the .strings file in my iPhone application project and determine which keys might not be in use. I'm also doing some string matching to filter out some of the results. This is where my problems start :). If I try something like
for file_line in strings_file:
    if 'search_keyword' in file_line:
        ...
the search keyword will often not match, even though, if I print every file line in the same for loop, I seem to be reading the text correctly and my search keywords appear.
The problem is these .strings files are in some binary format. Does anyone know of a proper way to parse these files?
Use the correct encoding to open the .strings file and in your source code. According to the documentation, the encoding of your file could be UTF-16.
# -*- coding: utf-8 -*-
import codecs
for line in codecs.open(u'your_file.strings', encoding='utf-16'):
    if u'keyword' in line:
        pass  # process line
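Going one step further toward the original goal, the following Python 3 sketch reads a UTF-16 .strings file and lists keys that do not occur in some source text. It assumes the simple `"key" = "value";` layout and ignores everything else; the regular expression and the names `read_strings_keys` and `unused_keys` are illustrative, not a complete parser for the format.

```python
import re
from pathlib import Path

# Matches lines of the form: "key" = "value";
# Quoted parts may contain backslash-escaped characters.
_ENTRY = re.compile(
    r'^\s*"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;',
    re.MULTILINE,
)


def read_strings_keys(path, encoding="utf-16"):
    """Return a dict of key -> value read from a .strings file."""
    text = Path(path).read_text(encoding=encoding)
    return {m.group(1): m.group(2) for m in _ENTRY.finditer(text)}


def unused_keys(strings, source_text):
    """List keys that never appear in the given source text."""
    return [key for key in strings if key not in source_text]
```

If the file turns out to be UTF-8 instead, passing `encoding="utf-8"` to `read_strings_keys` should be all that changes.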
No experience with those .strings files, but here is the reason why you don't find matches:
strings_file.read()
returns a string with the full content of the file. Iterating over a string iterates over single characters, i.e. in your for loop, file_line isn't a line, it's always just one single character (a string of length 1), which obviously can't contain a multi-character search word.
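The difference is easy to see in a small sketch. The two helper names below are made up for the example: one iterates over lines, the other over the string itself.

```python
def matching_lines(text: str, keyword: str) -> list:
    """Search line by line, which is what the loop was meant to do."""
    return [line for line in text.splitlines() if keyword in line]


def matching_chars(text: str, keyword: str) -> list:
    """Iterate the string directly: each item is a single character."""
    return [ch for ch in text if keyword in ch]


if __name__ == "__main__":
    content = '"a_key" = "A";\n"b_key" = "B";\n'
    print(matching_lines(content, "b_key"))
    print(matching_chars(content, "b_key"))
```

With a multi-character keyword, the second function can never find anything, which matches the behaviour described in the question.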
It sounds like the strings file was saved as data. If Python can't read it as is, you can convert it to a plain text file in Objective-C.
Just: (1) read the strings file into a string with the proper encoding, (2) convert it to a dictionary, (3) write the dictionary to another file.
So:
NSError *error = nil;
NSString *strings = [NSString stringWithContentsOfFile:filePath encoding:NSUTF16StringEncoding error:&error];
NSDictionary *dict = [strings propertyList];
[dict writeToFile:anotherFilePath atomically:NO];