Does the w3c.org site have documentation on "select"? - select

I cannot work out where this documentation might be; I'm assuming they do have something on it. I realise this is a dead simple question, but no amount of searching is bringing it up for me.
Searching with Bing, DuckDuckGo, etc. finds nothing particularly relevant, and the only w3c.org links I followed went to "functions", which "select" apparently isn't.
EDIT (apologies for the ambiguity): I am looking for the definition of something along the lines of:
<xsl:variable name="variableName" select="some/path/here" />

That is an XSLT variable. Like many other XSLT elements, it has a select attribute that takes an XPath expression as a value. The value of this attribute just typically happens to be an XPath expression, but the attribute itself isn't directly related to XPath, so you won't find it documented in the XPath spec.

Your code fragment is in XSLT, so the specification you want is either "XSLT 1.0" or "XSLT 2.0", which you can find very easily by using these as Google search terms. The value of the select attribute is an XPath expression, so you may also want the XPath 1.0 or XPath 2.0 specification; these can be found the same way.

Related

DOM4J Xpath Truncating Results

We are maintaining an application that relies heavily on DOM4J and XPath. Once in a while we see strange behavior in the results of XPath execution via DOM4J: the text result is simply truncated.
We tried applying the recommendation provided here: http://www.mail-archive.com/dom4j-user@lists.sourceforge.net/msg02688.html
The problem seems to occur less frequently, but it still manifests itself. The last time we saw it was after applying XPath to the clone of a document parsed as indicated by the previous link.
This also seems to be similar to the issue mentioned here: Use of text() function when using xPath in dom4j

driver.findElement doesn't find the tab element

I have this problem with my test: the call
driver.findElement(By.xpath("//html/body/div[2]/div/div/div[2]/div[2]/div/div[2]/div/div/div/div/div/div/div/ul/li[2]/a[2]/em/span/span/span")).click();
doesn't find the element.
Eclipse shows this error message:
Cannot locate a node using
//html/body/div[2]/div/div/div[2]/div[2]/div/div[2]/div/div/div/div/div/div/div/ul/li[2]/a[2]/em/span/span/span
EDIT : Post edited to reflect answer to actual problem. Original answer follows.
Long XPath expressions are fragile, and tests that rely on them are prone to fail: a completely unrelated change somewhere else in the document can break everything, and even if you're aware of the problem, the test code is harder to maintain.
In this particular case, since the site is generated by GWT, it's even worse - there is little control over the actual HTML changes. A good solution when using GWT is to use the ensureDebugId method (see link in comments).
Are you sure that this XPath expression is correct? Do other tests work with this driver?
I'd recommend avoiding long XPath expressions like that. Wouldn't it be safer in the long term to start the expression at a div identified by an id somewhere in the page rather than at the root of the DOM?
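To illustrate why anchoring at an id is more robust, here is a minimal sketch. It uses Python's standard-library ElementTree rather than Selenium so that it runs on its own, and the markup, the `tabs` id and the helper `tab_text` are all hypothetical, not taken from the page in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the tab list lives in a div with a known id.
ORIGINAL = """<html><body>
<div>header</div>
<div id="tabs"><ul><li><a>First</a></li><li><a>Second</a></li></ul></div>
</body></html>"""

# The same page after an unrelated banner div was added at the top.
CHANGED = """<html><body>
<div>banner</div>
<div>header</div>
<div id="tabs"><ul><li><a>First</a></li><li><a>Second</a></li></ul></div>
</body></html>"""

ABSOLUTE = "./body/div[2]/ul/li[2]/a"       # positional path from the root
ANCHORED = ".//div[@id='tabs']/ul/li[2]/a"  # path starting at the id'd div


def tab_text(markup, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


print(tab_text(ORIGINAL, ABSOLUTE))  # Second
print(tab_text(CHANGED, ABSOLUTE))   # None: div[2] is now the header
print(tab_text(CHANGED, ANCHORED))   # Second
```

With Selenium, the anchored expression would simply be the string passed to By.xpath.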

Regular Expressions (HTML parsing on iPhone)

I am trying to pull data from a website using objective-c. This is all very new to me, so I've done some research. What I know now is that I need to use xpath, and I have another wrapper for that called hpple for the iPhone. I've got it up and running in my project.
I am confused about the way I retrieve information from the site. Apparently I am to use regular expressions in this line of code:
NSArray * a = [doc search:@"//a[@class='sponsor']"];
This is just an example. Is that stuff in the search:@"...." the regular expression? If so, I guess I can develop the hundreds of patterns that I will need for my program to parse the site (I need a lot of data), but is there a better way? I'm very lost in this. Any help is appreciated.
The parameter is an XPath, not a regular expression. Here's a breakdown:
All XPaths are interpreted relative to a context node. In this case, it's the root node.
// is an abbreviation meaning "all descendants"
a means "all child element nodes named 'a'" (in HTML, that's anchors)
[...] contains a predicate, refining just which a to match
@ is an abbreviation for attribute nodes
@class means an attribute named "class"
@class='sponsor' means a class attribute equal to "sponsor". Note this will not match nodes with a class containing "sponsor", such as <a class="big sponsor" ...>; the class must be equal.
All together, we have "'a' nodes descending from the root that have class equal to 'sponsor'".
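As a quick way to see the exact-match behaviour, here is a small sketch using Python's standard-library ElementTree, which supports this subset of XPath. The sample page and its href values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page.
PAGE = """<html><body>
<a class="sponsor" href="/one">One</a>
<a class="big sponsor" href="/two">Two</a>
<p><a class="sponsor" href="/three">Three</a></p>
<a href="/four">Four</a>
</body></html>"""

root = ET.fromstring(PAGE)

# ".//a" looks at every descendant <a>; the predicate keeps only those
# whose class attribute is exactly "sponsor".
matches = [a.get("href") for a in root.findall(".//a[@class='sponsor']")]
print(matches)  # ['/one', '/three']
```

The "big sponsor" link is skipped because the attribute value is compared as a whole string, and the nested link is found because // reaches descendants at any depth.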
That is an XPath expression, not a regular expression. The W3C has an XPath reference here: http://www.w3.org/TR/xpath/. Basically you are searching for <a> elements with the class "sponsor".
Note that this is a good thing! Regular expressions are bad for parsing HTML.

Extracting function call list from DOxygen XML Output

I posted a question on the DOxygen forums and also am posting it here for a better response.
I have a moderately sized C project of about 2,900 functions. I am using DOxygen 1.5.9 and it is successfully generating a call graph for the functions. Is there a way to extract this out for further analysis? A simple paired list would be sufficient, e.g.
Caller,Callee
FunctionX, FunctionY
...
I am comfortable with XSLT but I must say that the DOxygen XML output is complex. Has anyone done this before and can provide some guidance on how to parse the XML files?
Thanks in advance!
Based on what I see in the contrived example that I created,
Parse files with a name similar to ^_(.+)\d+(c|cpp|h|hpp)\.xml$, if my regex-fu is right.
Find all <memberdef kind="function">. It has a unique id attribute. I believe the XPath for this is //memberdef[@kind='function'].
Within that element, find all <references>.
For each of those tags, the refid attribute uniquely refers to the id attribute of the corresponding <memberdef> that is being called.
The text node within each <references> corresponds to the <name> of the corresponding <memberdef> that is being called.
This seems like a nice, straightforward way to express call graphs. You should have no trouble using XSLT or any other sane XML-parsing suite to get the desired results.
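The steps above could be sketched roughly as follows in Python with the standard-library ElementTree. The XML fragment is a simplified, hypothetical sample built only from the elements described in this answer (memberdef, name, references); real Doxygen output contains much more, and the function name `call_pairs` is an invention for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for one Doxygen XML file.
SAMPLE = """<doxygen><compounddef><sectiondef kind="func">
<memberdef kind="function" id="main_8c_1a1"><name>FunctionX</name>
<references refid="main_8c_1a2">FunctionY</references>
<references refid="main_8c_1a3">FunctionZ</references>
</memberdef>
<memberdef kind="function" id="main_8c_1a2"><name>FunctionY</name></memberdef>
<memberdef kind="variable" id="main_8c_1a9"><name>counter</name></memberdef>
</sectiondef></compounddef></doxygen>"""


def call_pairs(xml_text):
    """Return (caller, callee) name pairs found in one XML document."""
    root = ET.fromstring(xml_text)
    pairs = []
    # Same selection as //memberdef[@kind='function'].
    for member in root.findall(".//memberdef[@kind='function']"):
        caller = member.findtext("name")
        # Each <references> child names one called function.
        for ref in member.findall("references"):
            pairs.append((caller, ref.text))
    return pairs


for caller, callee in call_pairs(SAMPLE):
    print(f"{caller},{callee}")
```

For a whole project, the same function could be applied to each matching XML file and the pairs concatenated; the refid attribute is available via ref.get("refid") if names alone are ambiguous.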

Non-trivial screen scraping selections using pQuery

I'm using pQuery (a Perl port of jQuery) to select elements and retrieve text from an HTML document.
Consider the following markup:
<x>
<y>code1</y>
<z>stuff</z>
<y>code2</y>
<z>foobar</z>
</x>
And the following pQuery code:
my $target_value = pQuery($markup)->find($pquery_selector)->text;
I'm trying to formulate $pquery_selector so that it matches <z>foobar</z> in the markup above using the following rule: find the z-element that follows after a y-element which has a body containing "code2". While this is possible using jQuery I'm not sure that the pQuery syntax is powerful enough to handle such an expression.
Is this type of selection possible using the pQuery syntax?
In jQuery it might be possible to write a selector like 'y:contains(code2)+z'. However, pQuery is still unfinished (as of version 0.07), and a selector like x+z just gives an error, which shows that the module developer hasn't gotten around to translating that part of the jQuery code.
Since pQuery hasn't been touched since 2008, I'd recommend either fixing it yourself (the code is on CPAN and GitHub), or using a more mature module like HTML::TreeBuilder::XPath (which does require learning XPath syntax, but actually works for non-trivial things).
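The XPath equivalent of the above jQuery selector would be "//y[contains(text(), 'code2')]/following-sibling::z". Strictly speaking, following-sibling::z selects every later z sibling, whereas + matches only the immediately adjacent element; "//y[contains(text(), 'code2')]/following-sibling::*[1][self::z]" is the exact counterpart.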
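For comparison, the selection rule from the question (the z element immediately after a y element whose body contains a given string) can be sketched in plain Python. The standard-library ElementTree does not support the following-sibling axis or contains(), so this walks adjacent sibling pairs by hand; the function name `z_after_y` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Markup from the question, on one line so no whitespace nodes matter.
MARKUP = "<x><y>code1</y><z>stuff</z><y>code2</y><z>foobar</z></x>"


def z_after_y(markup, needle):
    """Text of the first <z> directly following a <y> containing needle."""
    root = ET.fromstring(markup)
    for parent in root.iter():
        children = list(parent)
        # Pair every child with the sibling immediately after it.
        for current, following in zip(children, children[1:]):
            if (current.tag == "y"
                    and needle in (current.text or "")
                    and following.tag == "z"):
                return following.text
    return None


print(z_after_y(MARKUP, "code2"))  # foobar
```

A library with full XPath 1.0 support would let the expression be used directly instead of this manual walk.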