Which locator is more efficient in Protractor: by.css, by.xpath, or by.id? - protractor

Which one is better to use in terms of performance: by.css, by.xpath, or by.id?
I have a really lengthy XPath:
by.xpath('//*[@id="logindiv"]/div[3]/div/div[1]/div/nav/div/div[1]/form/div/div/button')
The same element could be located with other strategies like by.css or by.id.
But it is not very clear which one is better.

Protractor uses selenium-webdriver underneath for element lookup, interaction, etc., so this is not a Protractor-specific question but rather a selenium-webdriver one.
CSS selectors perform far better than XPath, and this is well documented in the Selenium community. Here are some reasons:
XPath engines are different in each browser, which makes XPath locators inconsistent across browsers.
Last time I checked, IE does not have a native XPath engine, so selenium-webdriver injects its own XPath engine for compatibility of its API. We therefore lose the advantage of the native browser features that selenium-webdriver inherently promotes.
XPath expressions tend to become complex, like your example, which makes them hard to read and maintain, in my opinion.
However, there are some situations where you need to use XPath, for example when searching for a parent element or searching for an element by its text (I wouldn't recommend the latter).
You can read the blog post from Simon (creator of selenium-webdriver) here. He also recommends CSS over XPath.
So I would recommend you use id, name, etc. for the fastest lookups. If those aren't available, use CSS, and finally fall back to XPath if nothing else suits your situation.
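For illustration, here is how the same button might be located with each strategy. The id below is hypothetical, since the button's actual attributes are unknown:
// Fastest: a direct id lookup, if the button has one (hypothetical id here).
element(by.id('loginButton')).click();
// CSS: scoped under the container id taken from the question's XPath.
element(by.css('#logindiv form button')).click();
// XPath: the last resort; even then, prefer a shorter, attribute-based path.
element(by.xpath('//*[@id="logindiv"]//form//button')).click();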

Are there reliable algorithms for generating robust DOM node selectors, given only the target node?

In writing a scraper, we typically use some kind of selector to identify particular nodes of interest. Ideally the selectors should continue to work even as the page changes over time. A lot of the common approaches like grabbing nodes by id are fragile on frequently updated pages and impossible on some nodes. I'm trying to find good algorithms for generating robust selectors, but since there doesn't seem to be a standard terminology for this problem, it's hard to find everything that's out there.
Here are the selector DSLs I already know.
XPath selectors - Implemented everywhere from JS to the popular Python and Ruby scraping libraries.
CSS selectors - Found in many of the places where you can find XPath selectors.
High level selectors - Here I'll give the example of Chickenfoot, which allows users to write click("begin tutorial") to find a link with the text "Begin Tutorial." Usually these are implemented on top of XPath and CSS selectors (a toy sketch follows this list). I'd love to find out about more members of this language family.
Visual selectors - This would be the approach taken by, for instance, Sikuli, which makes it appear as though the program is calling a function on a screengrab of the relevant node. I don't know any web-specific instances of this approach, but I imagine there are some.
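To make the high-level-selector idea concrete, here is a toy sketch of a Chickenfoot-style click-by-text helper in plain browser JavaScript. This is my illustration of the idea, not Chickenfoot's actual implementation:
function clickByText(text) {
    // Scan all links and click the first whose visible text matches.
    var links = document.getElementsByTagName('a');
    var wanted = text.trim().toLowerCase();
    for (var i = 0; i < links.length; i++) {
        if (links[i].textContent.trim().toLowerCase() === wanted) {
            links[i].click();
            return true;
        }
    }
    return false; // no matching link found
}
// e.g. clickByText("begin tutorial");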
Here are the selector generation algorithms I already know. By a selector generation algorithm I mean an algorithm that takes a node as input and produces a robust selector as output.
iMacros: Finds all elements with the same node type and text as the target element, and finds the target element's index in this list. Uses the node type, text, and index as the selector. Also includes id for forms and form elements.
CoScripter: Uses the element's text if available. If not, uses preceding text.
Selenium: Uses id where available. Otherwise uses various other attributes, such as image alt text, links' displayed text, and buttons' displayed text.
Wargo System: Uses element text.
Many systems: Many systems use the XPath from the root to the target node, or some suffix of that XPath.
All of these selector generation algorithms fail on some nodes. Are there better approaches out there? Or other approaches that I could combine with these algorithms to produce a better hybrid algorithm?
When I started investigating this topic for some work I am doing, I was also surprised by how little information is available on this topic.
I did find this 2003 paper, but unfortunately, I only have access to the abstract:
Abe, Mari, and Masahiro Hori. "Robust Pointing by XPath Language: Authoring Support and Empirical Evaluation." In Proceedings of the 2003 Symposium on Applications and the Internet, 156–. SAINT '03. Washington, DC, USA: IEEE Computer Society, 2003.
For my own use, I followed the approach in Tim Vasil's 50-line jQuery plugin. I won't reproduce the code, which is available at that link; instead I'll describe it:
It recursively traverses up the DOM tree from the element, building a selector "backwards". At each level:
If the node has an ID, just use that and skip all the parents; they aren't added to the selector.
If the node has a tag name or a set of classes that is unique among its siblings, use that as the selector component. Otherwise, use :nth-child.
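A minimal sketch of that procedure in plain DOM JavaScript (my paraphrase of the idea, not Vasil's actual plugin code; it skips the class-uniqueness check for brevity):
function buildSelector(el) {
    // Walk up from the element, assembling a CSS selector right to left.
    var parts = [];
    while (el && el.nodeType === 1) {
        if (el.id) {
            // An id is assumed unique document-wide: use it and stop climbing.
            parts.unshift('#' + el.id);
            break;
        }
        var part = el.tagName.toLowerCase();
        var siblings = el.parentNode ? el.parentNode.children : [];
        var sameTag = 0;
        for (var i = 0; i < siblings.length; i++) {
            if (siblings[i].tagName === el.tagName) sameTag++;
        }
        if (sameTag > 1) {
            // The tag name alone is ambiguous among siblings: fall back to :nth-child.
            part += ':nth-child(' + (Array.prototype.indexOf.call(siblings, el) + 1) + ')';
        }
        parts.unshift(part);
        el = el.parentNode;
    }
    return parts.join(' > ');
}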
Since I will be storing element contents between visits to a page, I'm thinking of implementing some "blunder detection" here, maybe using a percentage change from last visit to detect if the selector may be grabbing the wrong element.

Has anyone implemented DITA 1.2 Subject scheme in their work?

I would like to know if there is anyone who has implemented the subject scheme maps of DITA 1.2 in their work. If yes, can you please break up your example to show:
how to do it?
when not to use it?
I am aware of the theory behind it, but I have yet to implement it myself, and I wanted to know if there are things I must keep in mind during the planning and implementation phases.
An example is here:
How to use DITA subjectSchemes?
The DITA 1.2 spec also has a good example (3.1.5.1.1).
What you can currently do with subject scheme maps is:
define a taxonomy
bind the taxonomy to a profiling or flagging attribute, so that the attribute only takes values that you have defined (see the sketch after this list)
filter or flag elements that have a defined value with a DITAVAL file.
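For instance, a minimal subject scheme map that defines a taxonomy and binds it to the platform attribute might look like this (the element names follow the DITA 1.2 grammar, but the key values are hypothetical):
<subjectScheme>
    <!-- A small taxonomy of operating systems (hypothetical keys) -->
    <subjectdef keys="os">
        <subjectdef keys="linux"/>
        <subjectdef keys="windows"/>
        <subjectdef keys="macos"/>
    </subjectdef>
    <!-- Bind the taxonomy to @platform so it only accepts these values -->
    <enumerationdef>
        <attributedef name="platform"/>
        <subjectdef keyref="os"/>
    </enumerationdef>
</subjectScheme>
With that in place, a DITAVAL file can include or exclude platform="linux" content, and excluding the parent value os would filter all of its children at once.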
Advantage 1: Since you have a taxonomy, filtering a parent value also filters its children, which is convenient.
Advantage 2: You can fully define and thus control the list of values, which prevents tag bloat.
Advantage 3: You can reuse the subject scheme map in many topic maps, in the usual modular DITA way, so you can apply the same taxonomies anywhere.
These appear to be the main uses for a subject scheme map at present.
The only disadvantage I have found is that I can think of other hypothetical uses for subject scheme maps, such as faceted browsing, but I don't think any implementation exists; the DITA-OT doesn't have anything like that yet, anyway.

HTML XPath tree dump using Ruby Watir?

Help! In carefully stepping through irb to control a browser (Firefox and Chrome) using the Watir library, it seems the XPath addresses are too shifty to rely on. E.g., one moment the XPath for one's balance seems consistent, so I use that address in my script. Sometimes it works, but too often it crashes with "element not found", although every time I manually step through, the webpage data is there (Firebug inspect to confirm).
Yes, these are Ajax-reliant sites, but not ones that change that much... bank websites that pretty much remain the same across visits.
So, the question: is there some way watir-webdriver can simply give me a long, verbose dump of everything it sees in the DOM at the moment, in the form of an XPath tree? It would help me troubleshoot.
The big answer is to not use XPath, but instead to use Watir as the UI is intended to be used.
When it comes to specifying elements in browser automation, XPath is by and large evil: it is SLOW, the code it creates is often (as you are finding) very, very brittle, and it's nearly impossible to read and make sense of. Use it only as a means of last resort, when nothing else will work.
If you are using a Watir API (as with Watir or Watir-webdriver), then you want to identify the element based on its own attributes, such as class, name, id, text, etc. If that doesn't work, then identify it based on the closest container that wraps the element and can be found uniquely. If that doesn't work, identify it by a sibling or sub-element and use the .parent method as a way to walk 'up' the DOM to the parent container element.
On the points of brittleness and readability, compare the following XPath (taken from the comments) and consider code using element_by_xpath with it:
/html/body/form/div[6]/div/table/tbody/tr[2]/td[2]/table/tbody/tr[2]/td/p/table[2]/tbody/tr/td[2]
and then compare it to this (where the entire line of code is shorter than the XPath alone):
browser.cell(:text => "Total Funds Avail. for Trading").parent.cell(:index => 1).text
or, to be a bit less brittle, replace the index with some attribute of the cell whose text you want:
browser.cell(:text => "Total Funds Avail. for Trading").parent.cell(:class => "balanceSnapShotCellRight").text
The XPath example is very difficult to make any sense of: it gives no idea of what element you are after or why the code might be selecting that element. And since there are so many index values, any change to the page design, or just extra rows in the table above the one you want, will break that code.
The second is much easier to make sense of: I can tell just by reading it what the script is trying to find on the page and how it is locating it. Extra rows in the table, or other changes to the page layout, will not break the code (with the exception of re-arranging the columns in the table, and even that could be avoided if I were to make use of a class or some other characteristic of the target cell, as in the example below).
For that matter, if the use of the class is unique to that element on the page, then
browser.cell(:class => 'balanceSnapShotCellRight').text
would work just fine, as long as there is only one cell with that class in the table.
Now, to be fair, I know there are ways to use XPath more elegantly to do something similar to what the Watir code above does. But even so, it's still not as easy to read and work with, and it's not how most people commonly (mis)use XPath to select objects, especially if they have used recorders that generate brittle, cryptic XPath code similar to the sample above.
The answers to this SO question describe the three basic approaches to identifying elements in Watir. Each answer covers an approach, which one you would use depends on what works best in a given situation.
If you are finding a challenge on a given page, start a question here about it and include a sample of the HTML before/after/around the element you are trying to work with, and the folks here can generally point you the way.
If you've not done so, work through some of the tutorials in the Watir wiki, and notice how seldom XPath is used.
Lastly, you mention Firewatir. Don't use Firewatir: it's out of date, no longer being developed, and will not work with any recent version of Firefox. Instead, use Watir-Webdriver to drive Firefox or Chrome (or IE).
You just need to output the "innerXml" (I don't know Watir) of the node selected by this XPath expression:
/
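(If a plain dump would do: in watir-webdriver, browser.html returns the current page's DOM serialized as HTML, which you can print from irb. Note that this gives you HTML rather than a tree of XPath expressions.)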
Update:
In case that by "dump" you mean something different, such as a set of the XPath expressions each selecting a node, then have a look at the answer of this question:
https://stackoverflow.com/a/4747858/36305

Looking for example of dojo.store with dijit.Tree over REST

I'm looking for an end to end example using dojo.store with dijit.Tree over REST.
There are many existing examples that use the older dojo api, dojo.data.api, but a dearth of ones using the dojo.store api.
Is the reason that dijit.Tree doesn't fully support dojo.store yet?
If so, do I need to use the dojo.data.ObjectStore wrapper to encapsulate dojo.store for use with dijit.Tree?
I saw one example of working around this by extending StoreFileCache:
http://dojo-toolkit.33424.n3.nabble.com/New-object-store-and-dijit-Tree-td2680201.html
Is that the recommended option, or should I
a) stick to dojo.data.api until dijit.Tree supports dojo.store directly, or
b) use the dojo.data.ObjectStore wrapper
Thanks
There is now a tutorial on the DTK website that seems to cover pretty much exactly this topic.
http://staging.dojotoolkit.org/documentation/tutorials/1.6/store_driven_tree/
However, as I know linking to something without giving an answer is considered poor practice, the general idea is that rather than wrapping your store in a dojo.data.ObjectStore and then potentially shoving it through a ForestStoreModel, you can simply augment your dojo.store-based store to add the methods that the Tree will look for. Here's a simple example from the tutorial:
var usGov = new dojo.store.JsonRest({
    target: "data/",
    mayHaveChildren: function(object){
        // see if it has a children property
        return "children" in object;
    },
    getChildren: function(object, onComplete, onError){
        // retrieve the full copy of the object
        this.get(object.id).then(function(fullObject){
            // copy to the original object so it has the children array as well
            object.children = fullObject.children;
            // now that we have the full object, call back with its array of children
            onComplete(fullObject.children);
        }, onError);
    },
    getRoot: function(onItem, onError){
        // get the root object; we will do a get() and call back with the result
        this.get("root").then(onItem, onError);
    },
    getLabel: function(object){
        // just get the name
        return object.name;
    }
});
It's worth noting that in this case we're making some assumptions about what the data looks like. You'd need to know how your children relate and customize the methods above for that purpose, but it's hopefully fairly clear how to do that for yourself.
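With those methods in place, the augmented store can be handed to the Tree directly as its model. A minimal usage sketch, assuming Dojo 1.6 and a placeholder element with id "tree" (both hypothetical here):
dojo.require("dijit.Tree");
dojo.ready(function(){
    // The store itself acts as the tree model, thanks to the added methods.
    var tree = new dijit.Tree({ model: usGov }, "tree");
    tree.startup();
});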
You can also just stick to the dojo.data APIs for now, but this approach definitely feels more lightweight. It takes a couple of layers out of the stack, and customizing a dojo.store-based store is much easier.
Given the two options you outlined, I'd say it's a matter of how well you know the different APIs.
dojo.store is more lightweight and perhaps easier to understand, but the wrapper adds some overhead. If you think your project will live for a long time, this is probably the best way to go.
dojo.data is a legacy API, which will be phased out eventually. If your project is short-lived and only based on dijit.Tree, this might be your best option for now.
Personally, I'd go with dojo.store and write my own TreeStoreModel to get the best of both worlds. This approach is very similar to Brian's suggestion.
In case you're interested, I've written up a two-part series on how to use dijit.Tree with the ObjectStore wrapper, and on implementing a JsonRest backend in PHP.

Is it bad to use $$ in Prototype?

The Prototype JS API documentation mentions the $$() function, which allows you to select and extend elements based on CSS selectors, like the $() function in jQuery does.
However, on that page, $$ is presented like some sort of last resort:
Sometimes the usual tools from your DOM arsenal just aren't enough to quickly find elements or collections of elements. If you know the DOM tree structure, you can simply resort to CSS selectors to get the job done.
Why is that? Should I stay away from $$ and just use document.getElementsByClassName (ugh) instead?
Based on the quote you posted, I'd say they are encouraging you to use $$(). $$() offers a cross-browser way to access elements quickly and easily. On the other hand, document.getElementsByClassName() is either buggy or not functional in IE versions up to and including version 8.
In a complicated project, I try to stay away from using $$() so that I don't accidentally select something I don't want. For a smaller project, I wouldn't worry. I can usually accomplish what I need with $(element).childElements() or $(element).immediateDescendants() instead.
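For illustration, scoping a query to a container keeps a broad $$() selector from matching similar elements elsewhere on the page. The sidebar markup here is hypothetical:
// Global query: matches anywhere in the document.
var links = $$('#sidebar a.nav');
// Scoped query: Element#select restricts matching to one container.
var scoped = $('sidebar').select('a.nav');
// Direct children of the container only:
var kids = $('sidebar').childElements();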