I am struggling with a text file that I have to read in. In this file, there are two types of line:
133 0102764447 44 11 54 0.4 0 0.89 0 0 8 0 0 7 Attribute_Name='xyz' Type='string' 02452387764447 884
134 0102256447 44 1 57 0.4 0 0.81 0 0 8 0 0 1 864
What I want to do here is to textscan all the lines and then try to determine the number of 'xyz' (and the total number of lines).
I tried to use:
fileID = fopen('test.txt','r') ;
data = textscan(fileID, '%d %d %d %d %d %d %d %d %d %d %d %d %d %s %s %d %d', '\n');
And then I would access data{i,16} to count how many entries equal Attribute_Name='xyz', but that doesn't seem efficient.
What would be a proper way to read the data? (What interests me is counting how many Attribute_Name='xyz' I have.) Thanks
You could simply use count which is referenced here.
In your case you could use it in this way:
filetext = fileread("test.txt");
A = count(filetext , "xyz")
fileread reads the whole text file into a single string. Afterwards you can process that string with count, which returns the number of occurrences of the given pattern.
An alternative when using older versions of MATLAB is this one. It will work with R2006a and above.
filetext = fileread('test.txt'); % single quotes: double-quoted strings need R2017a or later
A = length(strfind(filetext, 'xyz'));
strfind returns an array of the starting indices of every match, so the length of that array, obtained with length, is the number of occurrences of the specified string.
There is the option of strsplit. You may do something like the following:
count = 0;
fid = fopen('test.txt','r');
while ~feof(fid)
    line = fgetl(fid);
    words = strsplit(line);
    % Assume only one instance per line; drop the 1 and adjust the rest of the code to count more
    ind = find(strcmpi(words, 'Attribute_Name=''xyz'''), 1);
    if ~isempty(ind)
        count = count + 1;
    end
end
fclose(fid);
So at the end count will give you the number.
I do not understand why
char test = '\032';
converts to
26 dec
'\032' seems to be interpreted as octal, but I want it to be treated as a decimal number.
I think I am confused with the character encoding.
Can anybody clarify this for me and give me a hint on how to convert it the way I want it?
In C, '\octal-digit' begins an octal-escape-sequence. There is no decimal-escape-sequence.
Code could simply use:
char test = 32;
To assign the value of 32 to a char, code has many options:
// octal escape sequence
char test1 = '\040'; // \ and then 1, 2 or 3 octal digits
char test2 = '\40';
// hexadecimal escape sequence
char test3 = '\x20'; // \x and then 1 or more hexadecimal digits
// integer decimal constant
char test4 = 32; // 1-9 and then 0 or more decimal digits
// integer octal constant
char test5 = 040; // 0 and then 0 or more octal digits
char test6 = 0040;
char test7 = 00040;
// integer hexadecimal constant
char test8 = 0x20; // 0x or 0X and then 1 or more hexadecimal digits
char test9 = 0X20;
// universal-character-name
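char testA = '\u0020'; // \u & 4 hex digits; C99/C11 reject values below 00A0 other than $, @ and `, so this form may not compile
char testB = '\U00000020'; // \U & 8 hex digits; same restriction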
// character constant
char testC = ' '; // When the character set is ASCII
The syntax you are using (a backslash followed by up to three digits) is an octal escape. To use decimal, you can just do:
char test = (char)32;
After hours of research I gave up.
I receive text data from a web service. In some cases the text is in Japanese, and the WS returns it in escaped Unicode form. For example: \U00e3\U0082\U008f
I know that this is a Japanese char.
I am trying to display this Unicode char or string inside a UILabel.
Since the simple setText method doesn't display the correct characters, I used this (copied) routine:
unichar unicodeValue = (unichar) strtol([[[p innerData] valueForKey:@"title"] UTF8String], NULL, 16);
char buffer[2];
int len = 1;
if (unicodeValue > 127) {
buffer[0] = (unicodeValue >> 8) & (1 << 8) - 1;
buffer[1] = unicodeValue & (1 << 8) - 1;
len = 2;
} else {
buffer[0] = unicodeValue;
}
[[cell title] setText:[[NSString alloc] initWithBytes:buffer length:len encoding:NSUTF8StringEncoding] ];
But no success: the UILabel is empty.
I know that one way could be convert the chars to hex and then from hex to String...is there a simpler way?
SOLVED
First, make sure that your server is sending UTF-8 and not escaped Unicode code points. The only way I found is to json_encode the strings that contain Unicode characters.
Then, on iOS, unescape the text as described in "Using Objective C/Cocoa to unescape unicode characters, ie \u1234".
I am trying to escape double-byte (usually Japanese or Chinese) characters from a string so that they can be included in an RTF file. Thanks to poster falconcreek, I can successfully escape special characters (e.g. umlaut, accent, tilde) that are single-byte.
- (NSString *)stringFormattedRTF:(NSString *)inputString
{
NSMutableString *result = [NSMutableString string];
for ( int index = 0; index < [inputString length]; index++ ) {
NSString *temp = [inputString substringWithRange:NSMakeRange( index, 1 )];
unichar tempchar = [inputString characterAtIndex:index];
if ( tempchar > 127) {
[result appendFormat:@"\\\'%02x", tempchar];
} else {
[result appendString:temp];
}
}
return result;
}
It appears this is looking for any unicode characters with a decimal value higher than 127 (which basically means anything not ASCII). If I find one, I escape it and translate that to a hex value.
EXAMPLE: Small "e" with acute accent gets escaped and converted to its hex value, resulting in "\'e9"
While Asian characters are above decimal 127, the code above appears to read the first byte of the double-byte Unicode character, encode that, and then pass the second byte through as is. The end user ends up seeing ????.
Suggestions are greatly appreciated. Thanks.
UPDATED Code sample based on suggestion. Not detecting. :(
NSString *myDoubleByteTestString = @"blah は凄くいいアップです blah åèüñ blah";
NSMutableString *resultDouble = [NSMutableString string];
for ( int index = 0; index < [myDoubleByteTestString length]; index++ )
{
NSString *tempDouble = [myDoubleByteTestString substringWithRange:NSMakeRange( index, 1 )];
NSRange doubleRange = [tempDouble rangeOfComposedCharacterSequenceAtIndex:index];
if(doubleRange.length > 2)
{
NSLog(@"%@ is a double-byte character. Escape it.", tempDouble);
// How to escape double-byte?
[resultDouble appendFormat:tempDouble];
}
else
{
[resultDouble appendString:tempDouble];
}
}
Take a look at the code at rangeOfComposedCharacterSequenceAtIndex: to see how to get all the characters in a composed character. You'll then need to encode each of the characters in the resulting range.
I have got below value(dynamic) from the server:
drwxr-xr-x 9 0 0 4096 Jan 10 05:30 California
Now I want to get the values like this:
drwxr-xr-x
9
0
0
4096
Jan 10
05:30
California
Please help me with this question.
You can try something like this:
NSArray* components = [initialString componentsSeparatedByString:@" "];
See NSString componentsSeparatedByString for your answer.
As others have mentioned, you can use NSString's member function componentsSeparatedByString: or componentsSeparatedByCharactersInSet:
As an alternative (for more powerful tokenizing), look into the Objective-C NSScanner class in the foundation framework of Mac OS X.
You could do something like this:
NSString *str = @"drwxr-xr-x 9 0 ... ";
NSScanner *scanner = [NSScanner scannerWithString:str];
In order to obtain each token in string form, use NSScanner's scanUpToCharactersFromSet:intoString: member function.
NSString *token = nil;
NSCharacterSet *div = [NSCharacterSet whitespaceCharacterSet];
[scanner scanUpToCharactersFromSet:div intoString:&token]; // pass the address so the scanner can fill token in
// token now contains @"drwxr-xr-x"
Subsequent calls to the above would return 9, 0, and so on.
Note: the code above has not been tested.
[myStringValue componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
may be useful as well.
Use a regex: RegexKitLite.
This is a "complete example" of a way to use a regex to do what you want with a lot of explanation, so it's a bit of a long answer. The regex used is just one way to do this, and is "fairly permissive" in what it accepts. The example shows:
How to match more than "one line / directory" at once.
A possible way to handle different date formats (Jan 10 05:30 and Apr 30 2009)
How to create an "array of arrays" of matches.
Iterate over the matched array and create a NSDictionary based on the parsed results.
Create a "comma separated values" version of the results.
Note: The example splits up some of its long strings across multiple lines. A string literal in the form of @"string1 " @"string2" will be "automagically" concatenated by the compiler to form a string that is equivalent to @"string1 string2". I note this only because this might look a bit unusual if you're not used to it.
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"
int main(int argc, char *argv[]) {
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSString *stringToMatch =
@"drwxr-xr-x 9 0 0 4096 Jan 10 05:30 California\n"
@"-rw-r--r-- 1 johne staff 1335 Apr 30 2009 tags.m"; // A random entry from my machine with an "older" date.
NSString *regex =
@"(?m)^" // (?m) means: to "have ^ and $ match new line boundaries". ^ means: "Match the start of a line".
// Below,
// (...) means: "Capture for extraction the matched characters". Captures start at 1, capture 0 matches "everything the regex matched".
// [^\\p{Z}]+ says: "Match one or more characters that are NOT 'Separator' characters (as defined by Unicode, essentially white-space)".
// In essence, '[^\\p{Z}]+' matches "One or more non-white space characters."
// \\s+ says: Match one or more white space characters.
// ([^\\p{Z}]+)\\s+ means: Match, and capture, the non-white space characters, then "gobble up" the white-space characters after the match.
@"([^\\p{Z}]+)\\s+" // Capture 1 - Permission
@"([^\\p{Z}]+)\\s+" // Capture 2 - Links (per `man ls`)
@"([^\\p{Z}]+)\\s+" // Capture 3 - User
@"([^\\p{Z}]+)\\s+" // Capture 4 - Group
@"([^\\p{Z}]+)\\s+" // Capture 5 - Size
@"(\\w{1,3}\\s+\\d+\\s+(?:\\d+:\\d+|\\d+))\\s+" // Capture 6 - The "date" part.
// \\w{1,3} means: One to three "word-like" characters (ie, Jan, Sep, etc).
// \\d+ means: Match one or more "digit-like" characters.
// (?:...) means: Group the following, but don't capture the results.
// (?:.A.|.B.) (the '|') means: Match either A, or match B.
// (?:\\d+:\\d+|\\d+) means: Match either '05:30' or '2009'.
@"(.*)$"; // Capture 7 - Name. .* means: "Match zero or more of any character (except newlines)". $ means: "Match the end of the line".
// Use RegexKitLites -arrayOfCaptureComponentsMatchedByRegex to create an
// "array of arrays" composed of:
// an array of every match of the regex in stringToMatch, and for each match,
// an array of all the captures specified in the regex.
NSArray *allMatchesArray = [stringToMatch arrayOfCaptureComponentsMatchedByRegex:regex];
NSLog(@"allMatchesArray: %@", allMatchesArray);
// Here, we iterate over the "array of array" and create a NSDictionary
// from the results.
for(NSArray *lineArray in allMatchesArray) {
NSDictionary *parsedDictionary =
[NSDictionary dictionaryWithObjectsAndKeys:
[lineArray objectAtIndex:1], @"permission",
[lineArray objectAtIndex:2], @"links",
[lineArray objectAtIndex:3], @"user",
[lineArray objectAtIndex:4], @"group",
[lineArray objectAtIndex:5], @"size",
[lineArray objectAtIndex:6], @"date",
[lineArray objectAtIndex:7], @"name",
nil];
NSLog(@"parsedDictionary: %@", parsedDictionary);
}
// Here, we use RegexKitLites -stringByReplacingOccurrencesOfRegex method to
// create a new string. We use it to essentially transform the original string
// in to a "comma separated values" version of the string.
// In the withString: argument, '$NUMBER' means: "The characters that were matched
// by capture group NUMBER."
NSString *commaSeparatedString = [stringToMatch stringByReplacingOccurrencesOfRegex:regex withString:@"$1,$2,$3,$4,$5,$6,$7"];
NSLog(@"commaSeparatedString:\n%@", commaSeparatedString);
[pool release];
pool = NULL;
return(0);
}
Compile and run with:
shell% gcc -Wall -Wmost -arch i386 -g -o regexExample regexExample.m RegexKitLite.m -framework Foundation -licucore
shell% ./regexExample
2010-01-14 00:10:38.868 regexExample[49409:903] allMatchesArray: (
(
"drwxr-xr-x 9 0 0 4096 Jan 10 05:30 California",
"drwxr-xr-x",
9,
0,
0,
4096,
"Jan 10 05:30",
California
),
(
"-rw-r--r-- 1 johne staff 1335 Apr 30 2009 tags.m",
"-rw-r--r--",
1,
johne,
staff,
1335,
"Apr 30 2009",
"tags.m"
)
)
2010-01-14 00:10:38.872 regexExample[49409:903] parsedDictionary: {
date = "Jan 10 05:30";
group = 0;
links = 9;
name = California;
permission = "drwxr-xr-x";
size = 4096;
user = 0;
}
2010-01-14 00:10:38.873 regexExample[49409:903] parsedDictionary: {
date = "Apr 30 2009";
group = staff;
links = 1;
name = "tags.m";
permission = "-rw-r--r--";
size = 1335;
user = johne;
}
2010-01-14 00:10:38.873 regexExample[49409:903] commaSeparatedString:
drwxr-xr-x,9,0,0,4096,Jan 10 05:30,California
-rw-r--r--,1,johne,staff,1335,Apr 30 2009,tags.m