Unable to get data from a div tag using HTML parsing (hpple) in iPhone

I am trying to parse the below link using hpple:
http://www.decanter.com/news/wine-news/529748/mimimum-pricing-opponents-slam-cameron-speech
Code:
- (void)parseURL:(NSURL *)url {
    NSData *htmlData = [NSData dataWithContentsOfURL:url];
    TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
    NSArray *elements = [xpathParser searchWithXPathQuery:@"<div class=\"body\" id=\"article-529748-body\">"];
    NSLog(@"elements %@", elements);
    TFHppleElement *element = [elements objectAtIndex:0];
    NSString *myTitle = [element content];
    [xpathParser release];
}
but it is crashing. Crash Report:
XPath error : Invalid expression
<div class="body" id="article-529748-body">
^
XPath error : Invalid expression
<div class="body" id="article-529748-body">
^
How can I solve this issue? Why is my elements array empty? Am I parsing in the wrong way? I want to get the information available in that div tag.

Check that your elements array is not empty before reading from it:
- (void)parseURL:(NSURL *)url {
    NSData *htmlData = [NSData dataWithContentsOfURL:url];
    TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:htmlData];
    NSArray *elements = [xpathParser searchWithXPathQuery:@"<div class=\"body\" id=\"article-529748-body\">"];
    NSLog(@"elements %@", elements);
    // Only touch index 0 when the query actually matched something.
    if ([elements count]) {
        TFHppleElement *element = [elements objectAtIndex:0];
        NSString *myTitle = [element content];
        NSLog(@"myTitle %@", myTitle);
    }
    [xpathParser release];
}

Try changing this:
NSArray *elements = [xpathParser searchWithXPathQuery:@"<div class=\"body\" id=\"article-529748-body\">"];
To:
NSArray *elements = [xpathParser searchWithXPathQuery:@"//div[@class='body'][@id='article-529748-body']"];

Writing this (2 years later!) in case it's useful to someone else with a similar problem.
In order to parse the HTML within the div, you need to:
1. use syntax similar to that quoted by JamMySon on this page (single quotes don't need to be escaped);
2. remember that [element content] only gives you the content (if any) for that node, NOT its children.
Because of this you may need to use recursion to walk through the div's node tree.
Code (ARC):
- (void)decanterHpple {
    NSURL *url = [NSURL URLWithString:@"http://www.decanter.com/news/wine-news/529748/mimimum-pricing-opponents-slam-cameron-speech"];
    NSData *htmlData = [NSData dataWithContentsOfURL:url];
    TFHpple *pageParser = [TFHpple hppleWithHTMLData:htmlData];
    // 1. Works with unescaped single quotes (') AND 2. no need for class='' when using id=''
    NSString *queryString = @"//div[@id='article-529748-body']";
    NSArray *elements = [pageParser searchWithXPathQuery:queryString];
    // old code ~ slightly amended
    if ([elements count]) {
        TFHppleElement *element = [elements objectAtIndex:0];
        NSString *myTitle = [element content];
        NSLog(@"myTitle:%@", myTitle);
    }
    // new code
    NSString *theText = [self stringFromWalkThruNodes:elements];
    NSLog(@"theText:%@", theText);
}
using this recursive method:
- (NSString *)stringFromWalkThruNodes:(NSArray *)nodes {
    // level is only useful for keeping track of recursion when stepping through with a breakpoint
    static int level = 0;
    level++; // put breakpoint here...
    NSString *text = @"";
    for (TFHppleElement *element in nodes) {
        // Append this node's own text, if it has any.
        if (element.content) {
            text = [text stringByAppendingString:element.content];
        }
        // Then descend into the children and append their text too.
        if (element.children) {
            NSString *innerText = [self stringFromWalkThruNodes:element.children];
            text = [text stringByAppendingString:innerText];
        }
    }
    level--;
    return text;
}
This gives the output:
2014-10-22 19:44:07.996 Decanted[10148:a0b] myTitle:(null)
2014-10-22 19:44:07.997 Decanted[10148:a0b] theText:
On a visit to a hospital in north-east England, Mr Cameron is to call for the drinks industry to do more to tackle a problem which
costs the National Health Service £2.7bn a year.A ban on the sale of
alcohol below cost price - less than the tax paid on it - is set to be
introduced in England and Wales from 6 April, but ministers are
expected to push for a higher minimum price for drink.Opponents of a
minimum unit price say it is unfair because it penalises all drinkers,
not just binge or problem drinkers.Responding to the Prime Minister’s
comments, Wine and Spirit Trade Association spokesman Gavin Partington
reiterated the drinks indusry’s commitment ‘to helping the Government
tackle alcohol misuse, alongside other stakeholders.‘This is why we
are working hard through the Public Health Responsibility Deal on a
range of initiatives to promote responsible drinking.’These
initiatives, Partington said, include the expansion of Community
Alcohol Partnerships across the UK and a national campaign by
retailers to raise consumer awareness about the units of alcohol in
alcoholic drinks.Partington said, ‘Unlike these measures, minimum unit
pricing is a blunt tool which would both fail to address the problem
of alcohol misuse and punish the vast majority of responsible
consumers. As Government ministers acknowledge, it is also probably
illegal'.Decanter is also against the scheme, calling it
‘fundamentally flawed.’‘The real problem,’ editor Guy Woodward has
said, ‘lies with supermarkets who use wine as a loss-leader, slashing
margins, bullying suppliers and dragging down prices in order to
attract customers…Selling wine at a loss helps neither consumers nor
the trade.’Other opponents of the scheme include the British Beer and
Pub Association, which told the BBC there was ‘a danger it would be
done through higher taxation, which would be hugely damaging to
pub-goers, community pubs and brewers, costing thousands of vital
jobs.’It is thought any move toward minimum pricing could also be
illegal under European competition law, which is aimed at pushing down
prices for consumers and allowing firms to operate in a free
market.
PS. Only started playing with Hpple this p.m. after reading the aforementioned Wenderlich tutorial; I'm sure someone more experienced may come up with a more elegant solution!

Related

Truncate string values from txt file and add to array

I have a txt file which was copied to Supporting Files of my Xcode project. The data in the txt file is of this format:
abacus@frame with balls for calculating
abate@to lessen to subside
abdication@giving up control authority
aberration@straying away from what is normal
....................around 4000 lines
I have successfully extracted data from the file using the below code:
NSString *greFileString = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"grewords" ofType:@"txt"] encoding:NSUTF8StringEncoding error:nil];
self.greWordsArray = [NSMutableArray arrayWithArray:[greFileString componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]];
When I print greWordsArray, I see the output below in the log:
"abacus@frame",
with,
balls,
for,
calculating,
"",
"abate@to",
lessen,
to,
subside,
"",
"abdication@giving",
up,
control,
authority,
"",
"aberration@straying",
away,
from,
what,
is,
normal,
"",
But I want the values in two separate arrays: one holding abacus, abate, abdication, aberration and the other holding frame with balls for calculating, to lessen to subside, giving up control authority, straying away from what is normal; i.e. one array holding the string before the @ symbol and one holding the string after it.
I know there are several approaches (checking for the special character, replacing occurrences of a string, using a character set), but since my string greFileString holds the whole file with multiple lines, only abacus gets added to the array when I try any of them. I want abacus, abate, abdication, aberration to be added.
EDIT
Following H2CO3's suggestion, I implemented it the following way:
NSString *greFileString = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"grewords" ofType:@"txt"] encoding:NSUTF8StringEncoding error:nil];
NSArray *greData = [NSArray arrayWithArray:[greFileString componentsSeparatedByString:@"@"]];
self.greWordsArray = [NSMutableArray array];
self.greWordHints = [NSMutableArray array];
for (NSString *greWord in greData)
{
    if ([greWord characterAtIndex:0] == (unichar)'@')
    {
        [greWordHints addObject:greWord];
    }
    else
    {
        [greWordsArray addObject:greWord];
    }
}
NSLog(@"gre words are %@", greWordsArray);
NSLog(@"gre hints are %@", greWordHints);
Here is the logged output:
gre words are (
abacus,
"frame with balls for calculating
\nabate",
"to lessen to subside
\nabdication",
"giving up control authority
\naberration",
"straying away from what is normal
\nabet",
"help/encourage somebody (in doing wrong)
\nabeyance",
"suspended action
\nabhor",
"to hate to detest
gre hints are (
)
Can someone please guide me on this?
It's quite trivial: if the first character of the string is a '@', then put it in the one array, else put it in the other one.
NSArray *words = [greFileString componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSMutableArray *at = [NSMutableArray array];
NSMutableArray *noAt = [NSMutableArray array];
for (NSString *s in words) {
    // Splitting on whitespace produces empty strings; characterAtIndex:0 would throw on them.
    if ([s length] == 0)
        continue;
    if ([s characterAtIndex:0] == (unichar)'@')
        [at addObject:s];
    else
        [noAt addObject:s];
}
Disregard the above - OP was lying to me >.< The text file actually consists of lines in which an at-symbol delimits the word and the explanation, i.e.
word1@explanation one
word2@explanation two
etc. This means that the lines should be retrieved first (perhaps using -[NSString componentsSeparatedByString:]), then each line is to be split into two parts (the same method is useful here too).
Finally got it working. My initial idea was right, but I couldn't execute it properly. First we need to fetch the data from the txt file and split it into an array of lines. Then, as H2CO3 mentioned, we loop through the lines and call componentsSeparatedByString: on each one. The object at index 0 of the result goes into the words array and the object at index 1 goes into the hints array, i.e.:
NSString *greFileString = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"grewords" ofType:@"txt"] encoding:NSUTF8StringEncoding error:nil];
self.greWordHints = [NSMutableArray array];
self.greWordsArray = [NSMutableArray array];
NSArray *greWords = [greFileString componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
for (NSString *greWord in greWords)
{
    if (greWord && greWord.length)
    {
        NSArray *trimmedGreData = [greWord componentsSeparatedByString:@"@"];
        // Skip lines without a delimiter; objectAtIndex:1 would throw on them.
        if ([trimmedGreData count] < 2)
        {
            continue;
        }
        [self.greWordsArray addObject:[trimmedGreData objectAtIndex:0]];
        [self.greWordHints addObject:[trimmedGreData objectAtIndex:1]];
    }
}
Hope it helps someone, thanks :)
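For anyone reusing the approach above, here is a minimal, slightly more defensive sketch using only Foundation. The function name SplitWordList and its delimiter parameter are illustrative, not part of the original answers. It assumes each line has the form word, delimiter, hint, and splits on the first delimiter only, so a hint that itself contains the delimiter stays in one piece.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): splits "word<delimiter>hint" lines
// into two parallel arrays, skipping blank lines and lines without a delimiter.
static void SplitWordList(NSString *contents, NSString *delimiter,
                          NSMutableArray *words, NSMutableArray *hints) {
    NSCharacterSet *spaces = [NSCharacterSet whitespaceCharacterSet];
    NSArray *lines = [contents componentsSeparatedByCharactersInSet:
                      [NSCharacterSet newlineCharacterSet]];
    for (NSString *line in lines) {
        // Locate the first delimiter; everything after it is the hint.
        NSRange split = [line rangeOfString:delimiter];
        if (split.location == NSNotFound) {
            continue; // blank or malformed line
        }
        NSString *word = [[line substringToIndex:split.location]
                          stringByTrimmingCharactersInSet:spaces];
        NSString *hint = [[line substringFromIndex:NSMaxRange(split)]
                          stringByTrimmingCharactersInSet:spaces];
        if ([word length] == 0) {
            continue; // nothing before the delimiter
        }
        [words addObject:word];
        [hints addObject:hint];
    }
}
```

Because both arrays are appended to in the same iteration, the word at index i always pairs with the hint at index i.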

Yahoo! weather in an iphone app

I'm developing an iPhone app using the Yahoo! weather service (I have a key).
I have 2 questions:
Can I use it in my app commercially (for example, publishing the app on the App Store, whether free or not)?
Why are the XML and JSON results different:
http://weather.yahooapis.com/forecastrss?w=29330057&u=c
and
http://weather.yahooapis.com/forecastjson?w=29330057&u=c
Is there anything I need to do to make them match? (The first one has the location I want.)
Thank you.
I suspect this is an issue with XML namespaces. Depending on the framework used and the actual full XML you'd have to access the elements by their namespace. You might want to switch to another, DOM-based framework (not using NSXMLParser), for example GDataXMLNode by Google. In a DOM-based framework you can access the individual nodes in a tree-like structure instead of building one on your own.
There are plenty of examples for this on the net, for example Building an RSS reader or How to read and write XML documents with GDataXML. But to give a quick example how this might look:
NSError *error = nil;
GDataXMLDocument *doc = [[GDataXMLDocument alloc] initWithData:data options:0 error:&error];
if (doc == nil) { return nil; }
NSMutableDictionary *result = [[NSMutableDictionary alloc] init];
NSArray *lists = [doc nodesForXPath:@"/result/list" error:nil];
if ([lists count] > 0)
{
    for (GDataXMLNode *list in lists) {
        int listid = [self integerInNode:list forXPath:@"listid"];
        NSString *listname = [self stringInNode:list forXPath:@"name"];
        [result setValue:[NSNumber numberWithInt:listid] forKey:listname];
    }
}
[doc release];
return [result autorelease];
Yes, Yahoo! lets you use their APIs under a fair-use policy, even commercially. Don't be an ass about it, though: give them enough props, e.g. their icon or logo with a link to their website.
I don’t think that it’s important to know why there are differences in both output formats. Use what is better / easier for you. Personally I prefer using JSON and Apple’s NSJSONSerialization class.
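If you go the JSON route, a minimal sketch with NSJSONSerialization might look like the following. The helper name ValueAtKeyPath is illustrative, and the field names used in the usage note are made up for the example; they are not the actual structure of the Yahoo! response, which you would need to inspect yourself.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any Yahoo! SDK): parses a JSON payload and
// walks down a list of dictionary keys, returning nil if any step is missing.
static id ValueAtKeyPath(NSData *jsonData, NSArray *path, NSError **error) {
    // Returns nil (and fills *error, if provided) when the data is not valid JSON.
    id node = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:error];
    for (NSString *key in path) {
        // Stop as soon as the current node is not a dictionary (or is nil).
        if (![node isKindOfClass:[NSDictionary class]]) {
            return nil;
        }
        node = [(NSDictionary *)node objectForKey:key];
    }
    return node;
}
```

For a payload such as {"location":{"city":"Paris"}} (an invented shape), ValueAtKeyPath(data, @[@"location", @"city"], NULL) would give back the string Paris, while a path through a missing key gives nil instead of crashing.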

TouchXML problem

I have got following xml which I need to parse using TouchXML.
<?xml version="1.0"?>
<categories>
<category0>
<title>Alcoholic Drinks</title>
<description>Buy beers, wines, sprits and champagne from the top online alocholic drink stores.
Whatever your tipple you are sure to find a drinks supplier from our top shops below:
</description>
<status>1</status>
<popularStatus></popularStatus>
<order></order>
<link>alcoholic-drinks</link>
<id>1</id>
</category0>
<category1>
<title>Art and Collectibles</title>
<description>Are you looking to buy contemporary or fine art, or do you prefer to make your own artwork?&#
Whether type of artwork or craft materials you are looking for, you are certain to find one of the shops below more than helpful:
</description>
<status>1</status>
<popularStatus></popularStatus>
<order></order>
<link>art-and-collectibles</link>
<id>2</id>
</category1>
<category2>
<title>Auctions</title>
<description>Are you looking for the UK's biggest and best Auction Sites?
The team at safebuyer.co.uk have scoured the web to find the UK's favourite auctions, so why wait, start your bidding now!
</description>
...
...
...
I am thinking of creating two loops from the root node in order to fetch the title and link, but couldn't figure out how to do it. Can anybody help, please?
If you can change your XML file, make all the category tags the same. You can put all ... instead of ... and ....
That would be pretty easy to parse. You just need to make a category class, and all the tags will be parsed automatically if you have correct XML parsing code.
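// node is assumed to already point at the root element of the parsed document.
CXMLNode *node;
for (int i = 0; i < 10; i++) {
    NSString *xpath = [NSString stringWithFormat:@"//category%d/title", i];
    // nodesForXPath:error: returns an array of nodes, so take the first match.
    NSArray *matches = [node nodesForXPath:xpath error:nil];
    NSString *title = [matches count] ? [[matches objectAtIndex:0] stringValue] : nil;
}
Use the above code.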
CXMLElement *element;
NSArray *titleItems = [[NSArray alloc] initWithArray:[element nodesForXPath:@"//category" error:nil]];
for (CXMLElement *item in titleItems) {
    NSString *title = [[item selectSingleNode:@"title"] stringValue];
}
Note: the category node should be repeating.
The code below gives you a dictionary where keys are titles and data are the links. Of course, if your XML document is "big", this is not the best way to do it.
CXMLDocument *doc = [[[CXMLDocument alloc] initWithXMLString:theXML options:0 error:nil] autorelease];
NSArray *categories = nil;
NSMutableDictionary *results = nil;
// Match every child of <categories> whose name starts with "category".
categories = [doc nodesForXPath:@"/categories/*[starts-with(name(), 'category')]" error:nil];
if (categories != nil && [categories count] > 0)
{
    results = [NSMutableDictionary dictionaryWithCapacity:[categories count]];
    for (CXMLElement *category in categories)
    {
        NSArray *titles = [category elementsForName:@"title"];
        NSArray *links = [category elementsForName:@"link"];
        // A dictionary cannot store nil, so only add entries that have both a title and a link.
        if ([titles count] > 0 && [links count] > 0)
        {
            [results setObject:[[links objectAtIndex:0] stringValue]
                        forKey:[[titles objectAtIndex:0] stringValue]];
        }
    }
}

GData Objective C client memory leak

I have a method where I fetch GDataFeedBase entries and return these as an array to another function
NSMutableArray *tempFeedArray = [NSMutableArray array];
NSURL *feedURL = [[NSURL alloc] initWithString:escapedUrlString];
NSData *data = [NSData dataWithContentsOfURL:feedURL];
GDataFeedBase *feedBase = [[GDataFeedBase alloc] initWithData:data];
[tempFeedArray addObjectsFromArray:[feedBase entries]];
[feedURL release];
[feedBase release];
return tempFeedArray;
.....
I have another function where I retrieve required values from tempFeedArray object that is GDataEntryYouTubeVideo
for(int count = 0; count < loopCount; count ++){
NSMutableDictionary *feedBaseEntryDict = [[NSMutableDictionary alloc] init];
entry = [tempFeedArray objectAtIndex:count];
youTubeUrl = [[entry alternateLink] href];
if ([entry statistics]!= nil) {
noOfVws= [[[entry statistics] viewCount] intValue];
}
duratn = [[[entry mediaGroup] duration] stringValue];
descr = [[[entry mediaGroup] mediaDescription] stringValue];
authorName = [[[entry authors] objectAtIndex:0] name];
publishedDt = [[entry publishedDate] stringValue];
rating = [[[entry rating] average] stringValue];
imageURL = [[[[entry mediaGroup] mediaThumbnails] objectAtIndex:0] URLString];
videoTitle = [[[entry mediaGroup] mediaTitle] stringValue];
.....
}
......
For the first time everything works fine. But the next time, it shows memory leak at
GDataXMLNode stringFromXMLString:
Did anyone else face this issue?
I found similar issue raised in gdata developer forum:
http://groups.google.com/group/gdata-objectivec-client/browse_thread/thread/f88de5a7bb784719/cab328a8725ee6c5
but the solution doesn't solve the issue.
Any help is much appreciated.
Looks like it might not be your code but the client library; there were a few other threads on the same issue. This one has a workaround, but I have not tried it myself.
The other options you have would be to upgrade to latest version (1.12 was released on Apr 11th 2011), take a look at the source and try to track down your problem, or submit an issue (it looks like the project is still actively developed).
Since the code is "stealing" entries from the feed, leaving them pointing to their parent feed (rather than copying the entries, which would create independent versions) there may be an issue with the strings cache. Try disabling the cache by commenting out -addStringsCacheToDoc in GDataXMLNode.m

Simple Way to Strip Tags In Obj-C

I am just learning objective-c and iPhone development, and I am really struggling with some very basic tasks. I am only on my 3rd day of the learning process - so that is to be expected somewhat. I'm still almost ashamed to ask such a simple question.
Anyhow, here's my question. I have a .NET web service which I call using a GET for http://somehost/ping
it returns 'pong'
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">pong</string>
The simplest of test cases.
Back on the iPhone when I retrieve the URL I have the above string as a result. I only want the 'pong' part. This seems like programming 101, but I can't seem to find a simple example of how to do it that doesn't involve defining delegates or other seemingly complex processing steps.
The problem is simple enough, find the first '>' and extract everything from there until the first '<' as an NSString. That's all I need to do.
Does anyone have a basic example of how to do this?
This is dry-coded, and kinda ugly imho. But here is a more direct answer.
NSString *xml = @"<tag>pong</tag>";
// Drop everything up to and including the first '>'.
NSRange range = [xml rangeOfString:@">"];
xml = [xml substringFromIndex:range.location + 1];
// Then cut at the next '<'.
range = [xml rangeOfString:@"<"];
xml = [xml substringToIndex:range.location];
Hey Sylvanaar, I'm having to do similar types of parsing inside of the client. My general methodology for parsing XML responses is like this. I'm pretty sure the classes are available on the iPhone side too. Note: it may not be the absolute best method, but it does work.
- (id)initWithXMLNode:(NSXMLNode *)node {
    self = [super init];
    if (self != nil) {
        NSError *error = nil;
        NSArray *objects;
        // Get the fingerprint
        objects = [node objectsForXQuery:@"for $fingerprint in ./Fingerprint return data($fingerprint)" error:&error];
        handleErrorInInit(error)
        fingerprint = getFingerprint(objects);
        // Get the moduleName
        objects = [node objectsForXQuery:@"for $moduleName in ./Foldername return data($moduleName)" error:&error];
        handleErrorInInit(error)
        moduleName = getNSString(objects);
    }
    return self;
}
Worth showing this too. Note that NSXMLDocuments are a subclass of NSXMLNodes.
- (NSXMLDocument *)xmlDocumentFromData:(NSData *)data {
    NSError *error = nil;
    NSXMLDocument *document = [[[NSXMLDocument alloc] initWithData:data options:0 error:&error] autorelease];
    // Check the return value; error is only meaningful when parsing failed.
    if (document == nil) {
        [NSApp presentError:error];
        return nil;
    }
    return document;
}
Sometimes a full on XML parse makes sense, but a quick index/substring routine can be appropriate as well:
NSRange startBracket = [xmlFragment rangeOfString:@">"];
if (startBracket.location != NSNotFound) {
    // The value starts right after the '>'.
    NSUInteger valueStart = NSMaxRange(startBracket);
    NSRange endBracket = [xmlFragment rangeOfString:@"<"
                                            options:0
                                              range:NSMakeRange(valueStart,
                                                                [xmlFragment length] - valueStart)];
    if (endBracket.location != NSNotFound) {
        // Take only the characters between '>' and the following '<'.
        NSString *value = [xmlFragment substringWithRange:
                           NSMakeRange(valueStart, endBracket.location - valueStart)];
        // Do something with value...
    }
}
(Not tested, needs more error handling, yadda yadda yadda..)
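The same "text between the first '>' and the next '<'" idea can also be sketched with NSScanner, which avoids the manual index arithmetic. The function name TextInsideFirstTag is illustrative; this is a quick extraction for a single, flat element like the one in the question, not a substitute for a real XML parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the first '>' and the next '<',
// or nil when the fragment does not contain both.
static NSString *TextInsideFirstTag(NSString *fragment) {
    NSScanner *scanner = [NSScanner scannerWithString:fragment];
    scanner.charactersToBeSkipped = nil; // keep whitespace inside the element

    // Move past the opening tag.
    [scanner scanUpToString:@">" intoString:NULL];
    if (![scanner scanString:@">" intoString:NULL]) {
        return nil; // no '>' at all
    }

    // Collect everything up to the next '<'.
    NSString *value = nil;
    if (![scanner scanUpToString:@"<" intoString:&value]) {
        value = @""; // the element is empty
    }
    if ([scanner isAtEnd]) {
        return nil; // ran off the end without finding '<'
    }
    return value;
}
```

Called with the web service response from the question, this should return just the pong part; for input without any tags it returns nil.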