NSScanner behavior - iphone

I am very new to iOS development. I am trying to parse a simple CSV file that has about 10 lines, with fields separated by commas. I am using the code below, but I don't understand why NSScanner, when parsing the fields (fields in the code below), does not move on to the next string after the comma. I have to execute the line
[fields scanCharactersFromSet:fieldCharSet intoString:nil];
to make it go past the delimiter. However, I don't have to do the same thing for lines: NSScanner automatically moves to the next line, past the newline. In both cases I am using the same method, scanUpToCharactersFromSet:intoString:. Is there something I am not understanding?
Here is the test file I am trying to parse:
Name,Location,Number,Units
A,AA,4,mm
B,BB,3.5,km
C,CC,10.2,mi
D,DD,2,mm
E,EE,6,in
F,FF,2.8,m
G,GG,3.7,km
H,HH,4.3,mm
I,II,4,km
Here is my code:
-(void)parseFile {
    NSCharacterSet *lineCharSet = [NSCharacterSet newlineCharacterSet];
    NSCharacterSet *fieldCharSet = [NSCharacterSet characterSetWithCharactersInString:self.separator];
    // import the file
    NSStringEncoding *encoding = nil;
    NSError *error = nil;
    NSString *data = [[NSString alloc] initWithContentsOfURL:self.absoluteURL usedEncoding:encoding error:&error];
    NSString *line,*field;
    NSScanner *lines = [NSScanner scannerWithString:data];
    while (![lines isAtEnd]) {
        [lines scanUpToCharactersFromSet:lineCharSet intoString:&line];//automatically sets to next line - why?
        NSLog(@"%@\n",line);
        NSScanner *fields = [NSScanner scannerWithString:line];
        while (![fields isAtEnd]) {
            [fields scanUpToCharactersFromSet:fieldCharSet intoString:&field];
            [fields scanCharactersFromSet:fieldCharSet intoString:nil]; //have to do this otherwise will not go to next symbol
            NSLog(@"%@\n", field);
        }
    }
}

That's just the way NSScanner works. When you use scanUpToCharactersFromSet:intoString:, it scans characters up to but not including the characters in the set. If you want it to move past characters in the set, you have two options:
1. Make it scan those characters. You are doing this now using scanCharactersFromSet:intoString:. Another way you could do it is [fields scanString:self.separator intoString:nil].
2. Tell the scanner that the separator character is to be skipped, using setCharactersToBeSkipped:. However, this will make it hard for you to detect empty fields.
The scanner's default set of characters-to-be-skipped includes the newline. That's why your outer scanner skips the newline.
You could do this entirely using componentsSeparatedByString:, instead of using NSScanner. Example:
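-(void)parseFile {
    NSError *error = nil;
    // Pass NULL for usedEncoding: because the detected encoding is not needed here
    NSString *data = [[NSString alloc] initWithContentsOfURL:self.absoluteURL usedEncoding:NULL error:&error];
    if (data == nil) {
        NSLog(@"Could not read file: %@", error);
        return;
    }
    for (NSString *line in [data componentsSeparatedByString:@"\n"]) {
        // Skip blank lines, such as the one after a trailing newline
        if (line.length == 0)
            continue;
        NSLog(@"line: %@", line);
        for (NSString *field in [line componentsSeparatedByString:self.separator]) {
            NSLog(@" field: %@", field);
        }
    }
}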
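To illustrate the scanner behavior described in this answer, here is a minimal sketch of splitting one line with NSScanner while still keeping empty fields. The helper name FieldsInLine is made up for this example, it assumes a non-empty separator, and it does no CSV quoting. It sets charactersToBeSkipped to nil so the scanner consumes nothing on its own, then consumes exactly one separator after each field.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): splits one line into
// fields with NSScanner and keeps empty fields. No support for quoted fields.
static NSArray *FieldsInLine(NSString *line, NSString *separator) {
    NSMutableArray *fields = [NSMutableArray array];
    NSScanner *scanner = [NSScanner scannerWithString:line];
    // Turn off the default skipping of whitespace and newlines so that
    // nothing is consumed implicitly.
    scanner.charactersToBeSkipped = nil;

    while (YES) {
        NSString *field = nil;
        // Returns NO and leaves `field` nil when the field is empty.
        [scanner scanUpToString:separator intoString:&field];
        [fields addObject:(field ?: @"")];
        // Consume exactly one separator; if there is none, the line is done.
        if (![scanner scanString:separator intoString:NULL]) {
            break;
        }
    }
    return fields;
}
```

With this sketch, FieldsInLine(@"A,,4,mm", @",") yields four fields, the second of which is the empty string.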

Related

parsing string starting with @ and # in objective-C

So I am trying to parse a string that has the following format:
baz#marroon#red#blue #big#cat#dog
or, it can also be separated by spaces:
baz #marroon #red #blue #big #cat #dog
and here's how I am doing it now:
- (void) parseTagsInComment:(NSString *) comment
{
if ([comment length] > 0){
NSArray * stringArray = [comment componentsSeparatedByString:@" "];
for (NSString * word in stringArray){
}
}
}
I've got the components separated by space working, but what if it has no space.. how do I iterate through these words? I was thinking of using regex.. but I have no idea on how to write such regex in objective-C. Any idea, for a regex that would cover both of these cases?
Here's my first attempt:
NSError * error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(@|#)\\S+" options:NSRegularExpressionCaseInsensitive error:&error];
NSArray* wordArray = [regex matchesInString:comment
options:0 range:NSMakeRange(0, [comment length])];
for (NSString * word in wordArray){
}
Which doesn't work.. I think my regex is wrong.
Here is a way to do it using NSScanner that puts the separated strings and a string representation of their ranges into an array (this assumes that your original string started with a # -- if it doesn't and you need it to, then just prepend the hash to the string at the start).
NSMutableArray *array = [NSMutableArray array];
NSString *str = @"#baz#marroon#red#blue #big#cat#dog";
NSScanner *scanner = [NSScanner scannerWithString:str];
NSCharacterSet *searchSet = [NSCharacterSet characterSetWithCharactersInString:@"@#"];
NSString *outputString;
while (![scanner isAtEnd]) {
    [scanner scanUpToCharactersFromSet:searchSet intoString:nil];
    [scanner scanCharactersFromSet:searchSet intoString:&outputString];
    NSString *symbol = [outputString copy];
    [scanner scanUpToCharactersFromSet:searchSet intoString:&outputString];
    NSString *wholePiece = [[symbol stringByAppendingString:outputString] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    NSString *rangeString = NSStringFromRange([str rangeOfString:wholePiece]);
    [array addObject:wholePiece];
    [array addObject:rangeString];
}
NSLog(@"%@",array);
I think the regular expression you really want is [@#]?\\w+. It will find groups of word characters (letters, digits, and underscores) optionally preceded by an @ or #. Your expression wouldn't work because it looks for any non-space character, which includes @ and #. (Depending on what can be in the "words," you might want something more or less specific than \w, but it isn't clear from the question.)
If you need the ranges, then NSRegularExpression probably works well:
NSString *comment = @"#baz#marroon#red#blue #big#cat#dog";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[@#]\\w+" options:0 error:nil];
NSArray* wordArray = [regex matchesInString:comment
                                    options:0
                                      range:NSMakeRange(0, [comment length])];
for (NSTextCheckingResult *result in wordArray)
    NSLog(@"%@", [comment substringWithRange:result.range]);
Or, [@#][a-zA-Z]+ works if you're ok with ASCII alpha words only.
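As a small sketch of why the \S+ pattern from the question merges adjacent tags while a \w+ pattern does not, the hypothetical helper below (MatchesOfPattern is a name made up for this example) returns every substring matched by a pattern, so the two can be compared on the same input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring of `text` matched by `pattern`.
static NSArray *MatchesOfPattern(NSString *pattern, NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray *found = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, text.length);
    for (NSTextCheckingResult *result in [regex matchesInString:text options:0 range:whole]) {
        // Each result carries the range of one match in the original string.
        [found addObject:[text substringWithRange:result.range]];
    }
    return found;
}
```

On the made-up input @"#red#blue @big", the pattern @"(@|#)\\S+" gives two matches ("#red#blue" and "@big") because \S+ happily runs through the second #, while @"[@#]\\w+" gives three ("#red", "#blue", "@big").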

Objective-C: Find consonants in string

I have a string that contains words with consonants and vowels. How can I extract only consonants from the string?
NSString *str = @"consonants.";
Result must be:
cnsnnts
You could make a character set with all the vowels (@"aeiouy")
+ (id)characterSetWithCharactersInString:(NSString *)aString
then use the
- (NSString *)stringByTrimmingCharactersInSet:(NSCharacterSet *)set
method.
EDIT: As pointed out in the other post, this only removes vowels at the beginning and end of the string. What you could do instead is use
- (NSArray *)componentsSeparatedByCharactersInSet:(NSCharacterSet *)separator
then stick the components back together. You may also need to include capitalized versions of the vowels in the set, and if you want to also deal with accents (à á è è ê ì etc...) you'll probably have to include that also.
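A minimal sketch of that split-and-rejoin idea, assuming ARC. The helper name ConsonantsInString is made up for this example, and it makes two assumptions of its own: y counts as a consonant, and every non-letter (such as the trailing period in the question's sample) is dropped as well. Accented vowels are not handled and would pass through as if they were consonants.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes vowels and all non-letters, keeps the rest.
static NSString *ConsonantsInString(NSString *input) {
    // Start from "everything that is not a letter", then add the vowels in
    // both cases. Treating y as a consonant is an assumption of this sketch.
    NSMutableCharacterSet *unwanted =
        [[[NSCharacterSet letterCharacterSet] invertedSet] mutableCopy];
    [unwanted addCharactersInString:@"aeiouAEIOU"];

    // Splitting on the unwanted characters leaves only runs of consonants
    // (plus empty strings), which are then glued back together.
    NSArray *pieces = [input componentsSeparatedByCharactersInSet:unwanted];
    return [pieces componentsJoinedByString:@""];
}
```

For the question's sample, ConsonantsInString(@"consonants.") returns @"cnsnnts".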
Unfortunately stringByTrimmingCharactersInSet: won't work, as it only trims leading and trailing characters, but you could try a regular expression and substitution like this:
[[NSRegularExpression
regularExpressionWithPattern:@"[^bcdfghjklmnpqrstvwxz]"
options:NSRegularExpressionCaseInsensitive
error:NULL]
stringByReplacingMatchesInString:str
options:0
range:NSMakeRange(0, [str length])
withTemplate:#""]
You probably want to tune the regex and options for your needs.
Possible, for sure not-optimal, solution. I'm printing intermediate results for your learning. Take care of memory allocation (I didn't care). Hopefully someone will send you a better solution, but you can copy and paste this for the moment.
NSString *test = @"Try to get all consonants";
NSMutableString *found = [[NSMutableString alloc] init];
NSInteger loc = 0;
NSCharacterSet *consonants = [NSCharacterSet characterSetWithCharactersInString:@"bcdfghjklmnpqrstvwxyz"];
while(loc!=NSNotFound && loc<[test length]) {
    NSRange r = [[test lowercaseString] rangeOfCharacterFromSet:consonants options:0 range:NSMakeRange(loc, [test length]-loc)];
    if(r.location!=NSNotFound) {
        NSString *temp = [test substringWithRange:r];
        NSLog(@"Range: %@ Temp: %@",NSStringFromRange(r), temp);
        [found appendString:temp];
        loc=r.location+r.length;
    } else {
        loc=NSNotFound;
    }
}
NSLog(@"Found: %@",found);
Here is an NSString category method that does the job:
- (NSString *)consonants
{
    NSString *result = [NSString stringWithString:self];
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:@"aeiou"];
    while(1)
    {
        NSRange range = [result rangeOfCharacterFromSet:characterSet options:NSCaseInsensitiveSearch];
        if(range.location == NSNotFound)
            break;
        result = [result stringByReplacingCharactersInRange:range withString:@""];
    }
    return result;
}

Find characters from the given string with numbers.

How do I get string using NSScanner from a string which contains string as well as numbers too?
i.e. 001234852ACDSB
The result should be 001234852 and ACDSB
I am able to get the numbers from the string using NSScanner and the characters by using stringByReplacingOccurrencesOfString, but I want to know: is it possible to get the letters using NSScanner or any other built-in method?
I would like to know the Regex for the same.
If you can guarantee that the string always consists of numbers followed by letters, then you could do the following with NSScanner:
NSScanner *scanner = [NSScanner scannerWithString:@"001234852ACDSB"];
NSString *theNumbers = nil;
[scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
                    intoString:&theNumbers];
NSString *theLetters = nil;
[scanner scanCharactersFromSet:[NSCharacterSet letterCharacterSet]
                    intoString:&theLetters];
A regular expression capturing the same things would look like this:
([0-9]+)([a-zA-Z]+)
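As a sketch of how that regular expression could be applied with NSRegularExpression, the hypothetical helper below (SplitDigitsAndLetters is a name made up for this example) reads the two capture groups from the first match. It only covers the case the answer assumes, digits immediately followed by ASCII letters.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns @[digits, letters], or nil when `input`
// contains no run of digits immediately followed by ASCII letters.
static NSArray *SplitDigitsAndLetters(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"([0-9]+)([a-zA-Z]+)"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, input.length)];
    if (match == nil) {
        return nil;
    }
    // Range 0 is the whole match; ranges 1 and 2 are the capture groups.
    NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
    NSString *letters = [input substringWithRange:[match rangeAtIndex:2]];
    return @[digits, letters];
}
```

Because the digits come back as a string, the leading zeros of 001234852 are preserved.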
Finally, after googling and going through some information on the net, I reached my goal. I'm posting the code here; it may help others who are facing the same problem.
NSString *str = @"001234852ACDSB";
NSScanner *scanner = [NSScanner scannerWithString:str];
// set it to skip non-numeric characters
[scanner setCharactersToBeSkipped:[[NSCharacterSet decimalDigitCharacterSet] invertedSet]];
int i;
while ([scanner scanInt:&i])
{
    NSLog(@"Found int: %d",i); // logs 1234852 (scanInt: drops the leading zeros)
}
// reset the scanner to skip numeric characters
[scanner setScanLocation:0];
[scanner setCharactersToBeSkipped:[NSCharacterSet decimalDigitCharacterSet]];
NSString *resultString;
while ([scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet] intoString:&resultString])
{
    NSLog(@"Found string: %@",resultString); //ACDSB
}
You don't have to use a scanner to do it.
NSString *mixedString = @"01223abcdsadf";
NSString *numbers = [[mixedString componentsSeparatedByCharactersInSet:[[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet]] componentsJoinedByString:@""];
NSString *characters = [[mixedString componentsSeparatedByCharactersInSet:[[NSCharacterSet characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyz"] invertedSet]] componentsJoinedByString:@""];
For other possible solutions, see the question "Remove all but numbers from NSString".

NSScanner vs. componentsSeparatedByString

I have a large text file (about 10 MB). In the text file there are values like (without the empty lines between the rows, I couldn't format it here properly):
;string1;stringValue1;
;string2;stringValue2;
;string3;stringValue3;
;string4;stringValue4;
I'm parsing all the 'stringX' values to an Array and the 'stringValueX' to another string, using a pretty ugly solution:
words = [rawText componentsSeparatedByString:@";"];
NSEnumerator *word = [words objectEnumerator];
while(tmpWord = [word nextObject]) {
    if ([tmpWord isEqualToString: @""] || [tmpWord isEqualToString: @"\r\n"] || [tmpWord isEqualToString: @"\n"]) {
        // NSLog(@"%@*** NOTHING *** ",tmpWord);
    }else { // here I add tmpWord the arrays...
I've tried to do this using NSScanner by following this example: http://www.macresearch.org/cocoa-scientists-part-xxvi-parsing-csv-data
But I received memory warnings and then it all crashed.
Shall I do this using NSScanner and if so, can anyone give me an example of how to do that?
Thanks!
In most cases NSScanner is better suited than componentsSeparatedByString:, especially if you are trying to conserve memory.
Your file could be parsed by a loop like this:
while (![scanner isAtEnd]) {
    NSString *firstPart = @"";
    NSString *secondPart = @"";
    [scanner scanString: @";" intoString: NULL];
    [scanner scanUpToString: @";" intoString: &firstPart];
    [scanner scanString: @";" intoString: NULL];
    [scanner scanUpToString: @";" intoString: &secondPart];
    [scanner scanString: @";" intoString: NULL];
    // TODO: add firstPart and secondPart to your arrays
}
You probably need to add error-checking code to this in case you get an invalid file.
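One possible shape for that error checking, as a sketch only: each scan method returns a BOOL, so the five steps can be chained and the loop abandoned at the first record that does not fit. The helper name ParseRecords and its parameters are made up for this example, and it treats an empty key or value as malformed, which may or may not suit your data.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses records of the form ";key;value;" and stops at
// the first malformed record. Returns YES only if all of the text was parsed.
static BOOL ParseRecords(NSString *text, NSMutableArray *keys, NSMutableArray *values) {
    NSScanner *scanner = [NSScanner scannerWithString:text];
    // The default skip set (whitespace and newlines) is kept, so the line
    // breaks between records are ignored.
    while (![scanner isAtEnd]) {
        NSString *key = nil;
        NSString *value = nil;
        // && short-circuits, so scanning stops at the first step that fails.
        BOOL ok = [scanner scanString:@";" intoString:NULL]
               && [scanner scanUpToString:@";" intoString:&key]
               && [scanner scanString:@";" intoString:NULL]
               && [scanner scanUpToString:@";" intoString:&value]
               && [scanner scanString:@";" intoString:NULL];
        if (!ok) {
            NSLog(@"Malformed record near offset %lu", (unsigned long)scanner.scanLocation);
            return NO;
        }
        [keys addObject:key];
        [values addObject:value];
    }
    return YES;
}
```

Returning NO on bad input also guarantees the loop terminates, whereas an unchecked loop can spin forever if the scanner stops advancing.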
You should use fast enumeration. It's far better than the one using objectEnumerator. Try this
for (NSString *word in words) {
// do the thing you need
}

UTF-8 conversion

I am grabbing a JSON array and storing it in an NSArray. However it includes JSON encoded UTF-8 strings, for example pass\u00e9 represents passé. I need a way of converting all of these different types of strings into the actual character. I have an entire NSArray to convert. Or I can convert it when it is being displayed, which ever is easiest.
I found this chart http://tntluoma.com/sidebars/codes/
Is there a convenient method for this or a library I can download?
thanks,
BTW, there is no way I can find to change the server so I can only fix it on my end...
You can use an approach based on NSScanner. The following code (not bug-proof) gives you an idea of how it can work:
NSString *source = @"Le pass\\u00e9 compos\\u00e9 a \\u00e9t\\u00e9 d\\u00e9compos\\u00e9.";
NSLog(@"source=%@", source);
NSMutableString *result = [[NSMutableString alloc] init];
NSScanner *scanner = [NSScanner scannerWithString:source];
[scanner setCharactersToBeSkipped:nil];
while (![scanner isAtEnd]) {
    NSString *chunk = nil;
    // Scan up to the Unicode marker and append only if something was read,
    // so two adjacent escapes do not append a stale chunk
    if ([scanner scanUpToString:@"\\u" intoString:&chunk]) {
        [result appendString:chunk];
    }
    // Skip the Unicode marker
    if ([scanner scanString:@"\\u" intoString:nil]) {
        // Read the Unicode value (assumes exactly four hex digits follow)
        unsigned int value = 0;
        NSRange range = NSMakeRange([scanner scanLocation], 4);
        NSString *code = [source substringWithRange:range];
        [[NSScanner scannerWithString:code] scanHexInt:&value];
        unichar c = (unichar) value;
        // Append the character
        [result appendFormat:@"%C", c];
        // Move the scanner past the Unicode value
        [scanner scanString:code intoString:nil];
    }
}
NSLog(@"result=%@", result);
If you use the JSON Framework, then all you do is get your JSON string and convert it to an NSArray like so:
NSString * aJSONString = ...;
NSArray * array = [aJSONString JSONValue];
The library is well-written, and will automatically handle UTF8 encoding, so you don't need to do anything beyond this. I've used this library several times in apps that are on the store. I highly recommend using this approach.
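For comparison, Foundation's built-in NSJSONSerialization (available since iOS 5) also decodes \uXXXX escapes while parsing. Note that this is a different API from the JSON Framework recommended above, shown here only as a sketch; the helper name ArrayFromJSONString is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes a JSON array from text. The parser converts
// JSON \uXXXX escapes into the actual characters.
static NSArray *ArrayFromJSONString(NSString *json) {
    NSData *data = [json dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) {
        return nil;
    }
    NSError *error = nil;
    id object = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
    // The top-level value may be a dictionary or nil on error; accept arrays only.
    if (![object isKindOfClass:[NSArray class]]) {
        NSLog(@"Not a JSON array: %@", error);
        return nil;
    }
    return object;
}
```

Parsing the JSON text ["pass\u00e9"] this way yields an array whose first element is the string passé.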