PhantomJS can't open filename - CoffeeScript

I'm writing a bit of CoffeeScript with PhantomJS to take screenshots of multiple URLs. Every time I try running it, though, I get an error message: Can't open <filename>. What gives? Here's my code:
page = require('webpage').create()
page.viewportSize =
  width: 1024
  height: 760

urls = phantom.args
i = 1
for url in urls
  do (url) ->
    output = "screenshot-#{i}.png"
    page.open url, (status) ->
      if status isnt 'success'
        console.log "Error opening url \"#{page.reason_url}\": #{page.reason}"
        phantom.exit(1)
      else
        console.log "Page opened.."
        window.setTimeout (->
          page.clipRect =
            top: 0
            left: 0
            width: 1024
            height: 760
          page.render(output)
        ), 200
    i += 1
phantom.exit()
I tried commenting bits out and it seems that the part that's failing is the page.open(url). Oddly the error message says the file itself can't be opened.

(As a quick aside: I think there is a cut-and-paste error in your error-message line. I believe you want #{url} in place of #{page.reason_url}.)
I see two issues here, or one issue manifest in a couple of ways.
The page.open method is asynchronous -- that's why it accepts a callback function (your (status)->... stuff).
That's causing two types of problem for your code:
Your final phantom.exit() call (the last line of your script) is outside of the scope of both the for loop and the page.open callback.
Because the open call is asynchronous, when urls is ['http://google.com'], the for loop will call page.open('http://google.com/') and then jump to phantom.exit() BEFORE the page has finished loading or the callback is invoked.
You need to defer your phantom.exit() call until the page.render(output) call is complete. Moving phantom.exit() right after -- and at the same level of indentation as -- page.render(output) will make your program work for a single URL.
Even with phantom.exit inside the callback, you'll still have a problem when more than one URL is passed to your program. Again, because page.open is asynchronous, you're effectively calling page.open several times in quick succession (and then exiting before all of those pages have loaded).
Moreover, like a regular, single, web-browser tab, Phantom's page object can't really open multiple pages at a time.
To fix this, your program needs to ensure that (a) one URL has been loaded and rendered before it moves on to the next one, and (b) phantom.exit() isn't called until all of the URLs have been rendered.
There are a few ways to go about this (google for "nodejs asynchronous for loop" or something like that; a sketch of one such approach follows the single-URL version below), but depending upon your needs, it may be easier to just render one page per invocation of the program.
Here's a version of your program, updated for PhantomJS 2 and set up to render exactly one URL:
system = require('system')
page = require('webpage').create()
page.viewportSize =
  width: 1024
  height: 760

# system.args[0] is the script name; the URL is the first real argument
url = system.args[1]
output = "screenshot-#{Date.now()}.png"

page.open url, (status) ->
  if status isnt 'success'
    console.error "Error opening url \"#{url}\": #{page.reason}"
    phantom.exit(1)
  else
    console.log "Page opened..."
    window.setTimeout (->
      page.clipRect =
        top: 0
        left: 0
        width: 1024
        height: 760
      page.render(output)
      console.log "Page rendered..."
      phantom.exit()
    ), 200
Since PhantomJS no longer parses CoffeeScript directly, to run this you'll need to first compile it to JavaScript, like so:
coffee -c foo.coffee
phantomjs foo.js "http://www.google.com/"
This seems to work properly for me.
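For reference, here's a minimal sketch of one way to handle multiple URLs, written in plain JavaScript (it has to be compiled to JS to run anyway). It chains the page.open calls recursively so each page finishes rendering before the next one starts; the 200 ms settle delay is carried over from above, and the names are illustrative rather than canonical:

var system = require('system');
var page = require('webpage').create();
var urls = system.args.slice(1);    // everything after the script name

page.viewportSize = { width: 1024, height: 760 };

function renderNext(i) {
    if (i >= urls.length) {
        phantom.exit();             // all URLs rendered; now it's safe to exit
        return;
    }
    page.open(urls[i], function (status) {
        if (status !== 'success') {
            console.error('Error opening url "' + urls[i] + '": ' + page.reason);
            phantom.exit(1);
            return;
        }
        window.setTimeout(function () {
            page.clipRect = { top: 0, left: 0, width: 1024, height: 760 };
            page.render('screenshot-' + (i + 1) + '.png');
            renderNext(i + 1);      // only move on once this page is rendered
        }, 200);
    });
}

renderNext(0);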

Related

Javascript DOM addressing into a sub-window DOM element

Given this screenshot of a Firefox DOM rendering, I'm interested in reading the highlighted element down a ways there and writing to the "hidden" attribute 3 lines above it. I don't know the JavaScript hierarchy nomenclature for traversing into that sub-window indexed "0", which shows in the first line under the window indexed "3" that is the root context of my code's hierarchy. The innerText element I'm after does not appear anywhere else in the DOM, at least that I can find...and I've looked and looked for it elsewhere.
Just looking at this DOM, I would say I could address that info as follows: Window[3].Window[0].contentDocument.children[0].innerText (no body, interestingly enough).
How this DOM came about is a little strange in that Window[0] is generated by the following code snippet located inside an onload event. It makes a soft EMBED element, so that Window[0] and everything inside is transient. FWIW, the EMBED element is simply a way for the script to offload the task of asynchronously pulling in the next .mp4 file name from the server while the previous .mp4 is playing so it will be ready instantly onended; no blocking necessary to get it.
// create the EMBED element once, if it isn't already there
if (elmnt.contentDocument.body.children[1] == null)
{
    var mbed = document.createElement("EMBED");
    var attsrc = document.createAttribute("src");
    mbed.setAttributeNode(attsrc);
    var atttyp = document.createAttribute("type");
    mbed.setAttributeNode(atttyp);
    var attwid = document.createAttribute("width");
    mbed.setAttributeNode(attwid);
    var atthei = document.createAttribute("height");
    mbed.setAttributeNode(atthei);
    elmnt.contentDocument.body.appendChild(mbed);
}
elmnt.contentDocument.body.children[1].src = elmnt.contentDocument.body.children[0].currentSrc + '?nextbymodifiedtime';
elmnt.contentDocument.body.children[1].type = 'text/plain';
I know better than to think Window[3].Window[0]...... is valid. Can anyone throw me a clue how to address the DOM steps into the contentDocument of that Window[0]? Several more of those soft Windows from soft EMBED elements will eventually exist as I develop the code, so keep that in mind. Thank you!
elmnt.contentWindow[0].document.children[0].innerText does the trick
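For anyone landing here later, a minimal sketch of that addressing, assuming elmnt is the element hosting the nested browsing context from the snippet above (variable names are illustrative):

// contentWindow[0] is the first nested window -- Window[0] in the screenshot
var nestedDoc = elmnt.contentWindow[0].document;

// read the text payload (the next .mp4 file name fetched by the EMBED)
var nextName = nestedDoc.children[0].innerText;

// a "hidden" attribute in that same document can be written the same way
// (illustrative -- point this at whichever element actually carries it)
nestedDoc.children[0].hidden = true;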

How to use WebUI.getUrl().contains('atlassian') with timeout value

I have a piece of code that has a 5-second delay followed by a getUrl call. If I don't delay the execution, getUrl returns false because the site hasn't loaded yet.
WebUI.delay(5)
assert WebUI.getUrl().contains('atlassian')
On the website there is a div which leads to another window when clicked. This code checks whether the opened page is an Atlassian webpage. However, I don't want a fixed 5-second delay (the load may take much longer, or much less). Is there a way to set a timeout, for instance wait up to 1 minute for the page to load and fail the execution if it hasn't loaded by then?
Try waiting for page load
WebUI.waitForPageLoad(5, FailureHandling.STOP_ON_FAILURE)
assert WebUI.getUrl().contains('atlassian')
This will wait up to 5 seconds for the page to load and stop execution with a test failure if the page isn't loaded in that time.
Alternatively, you could use WebUI.waitForElementPresent(to, timeout), where to is a test object you are certain is present once the page is loaded.

Reload page if 'not available'?

I've a standalone Raspberry Pi which shows a webpage from another server.
It reloads after 30 minutes via JavaScript on the webpage.
In some cases, the server isn't reachable for a very short time and Chromium shows the usual This webpage is not available message and stops reloading
(because no JavaScript from the page triggers a reload).
In this case, how can I still reload the webpage after a few seconds?
Now I had the idea to fetch the page via AJAX and replace the current page's content with the result whenever it is available.
Rather than refreshing the webpage every few minutes, what you can do is ping the server using JavaScript (pingjs is a nice library that can do that).
Now, if the ping is successful, reload the page. If it is not, wait 30 more seconds and ping it again. Doing this continuously will basically make you wait until the server is reachable again (i.e. you can ping it).
I think this is a much simpler method than building your own Java browser or writing a browser plugin.
Extra info: you should use an exponential back-off to avoid unnecessary processing overhead, i.e. the first time you find the ping fails, wait for 30 seconds; the second time, wait for 30*(2^1) sec; the third time, 30*(2^2) sec; and so on until you reach a maximum value.
Note - this assumes your server is really unreachable ... and not just that the HTML page is unavailable (there's a small but appreciable difference)
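Here's a minimal sketch of that ping-and-backoff loop, using a plain fetch() HEAD request in place of pingjs (whose exact API isn't shown here); the 30-second base interval comes from the answer above, and the cap is an assumed value:

var delay = 30 * 1000;                 // base wait: 30 seconds
var maxDelay = 8 * 60 * 1000;          // assumed cap on the back-off

function pingAndReload() {
    // a HEAD request is enough to learn whether the server is responding
    fetch(window.location.href, { method: 'HEAD' })
        .then(function () {
            window.location.reload();  // server is back: reload the real page
        })
        .catch(function () {
            delay = Math.min(delay * 2, maxDelay);  // 30s, 60s, 120s, ...
            setTimeout(pingAndReload, delay);
        });
}

setTimeout(pingAndReload, delay);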
My favored approach would be to copy the web page locally using a script every 30 minutes and point Chromium at the local copy.
The advantage is that the script can run every 30 seconds and check whether a successful page pull happened in the last 30 minutes. If YES, it does nothing. If NO, it keeps attempting to pull the page. In the meantime the browser can be set to refresh the page every 5 seconds, but because it is pulling a local page it does little to no work for each refresh. You can then detect whether what it pulled back has the required content in it.
This approach assumes that your goal is to avoid refreshing the remote page every few seconds, thereby reducing the load on the remote server.
Use these options to grab the whole page....
# exit if age of last reload is less than 1800 seconds (30 minutes)
AGE_IN_SECS=$(( $( perl -e 'print time();' ) - $(stat -c "%Y" /success/directory/index.html) ))
[[ $AGE_IN_SECS -lt 1800 ]] && exit
# copy whole page to the current directory
cd /temporary/directory
wget -p -k http://www.example.com/
and then you need to test the page in some way to ensure you have what you need, for example (using bash script)....
RESULT=$(grep -ci "REQUIRED_PATTERN_MATCH" expected_file_name)
[[ $RESULT -gt 0 ]] && cp -r /temporary/directory/* /success/directory
rm -rf /temporary/directory/*
NOTE:
This is only the bare bones of what you need, as I don't know the specifics of your setup. But you should also look at trying to ...
ensure you have a timeout on the wget, so that you do not end up with multiple wgets running.
create some form of back-off so that you do not hammer the remote server while it is in trouble.
ideally show some message on the page if it is over 40 minutes old, so that the viewer knows a problem is being experienced.
you could use a Chromium refresh plugin to reload the local copy.
you can use your script to alter the page once you have downloaded it if you want to add in additional or altered formatting (e.g. replace the CSS file?).
I see three solutions:
Load the page in an iframe (if not blocked), and check the content/response.
Create a simple browser in Java (not so hard, even if you don't know the language, using WebView).
Create a plugin for your browser.
Reloading a page via JavaScript is pretty easy:
function refresh() {
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function() {
        if (xhr.readyState !== 4) return;                  // wait until the request finishes
        if (xhr.status === 200)
            document.body.innerHTML = xhr.responseXML.body.innerHTML;
        else
            setTimeout(refresh, 1500);                     // server unreachable: retry shortly
    };
    xhr.open('GET', window.location.href);
    xhr.responseType = "document";
    xhr.send();
}
setInterval(refresh, 30 * 60 * 1000);                      // refresh every 30 minutes
This should work as you requested.

How to know any UI rendering is completed in automation code

I want to know whether a button is rendered on the main window UI or not. The button's rendering depends on the server's response (the app is written in Objective-C). If the server response arrives correctly, the button is rendered (VISIBLE); otherwise it is not present (INVISIBLE). Whenever it becomes visible, I tap on it to continue with the next step of the process.
I wrote this code:
UIATarget.localTarget().pushTimeout(200);
//My code
UIATarget.localTarget().popTimeout();
With the above code I have to wait up to 200 seconds, but I don't want to sit in WAITING MODE once the object is already on screen; I want to proceed as soon as it appears.
How do I write that in my automation code?
Thanks
OK, this might give you an idea of how to follow up:
For your view, implement an accessibilityValue method which returns a JSON-formatted value:
- (NSString *)accessibilityValue
{
    return [NSString stringWithFormat:
                @"{'MyButtonisVisible':%@}",
                self.MyButton.isHidden ? @"false" : @"true"];
}
Then you can access it from your test JavaScript:
var thisproperty = eval("(" + element.value() + ")");
if (thisproperty.MyButtonisVisible) {
    UIATarget.localTarget().tap({"x":100, "y":100});
}
Hope that helps.
If you make the name different when you enable the button, you can do this:
var awesomeButton = target.frontMostApp().mainWindow().buttons()[0];
UIATarget.localTarget().pushTimeout(200);
awesomeButton.withName("My Awesome Button");
if (!awesomeButton.isVisible()) {
    UIALogger.logError("Error: no awesome button!");
}
UIATarget.localTarget().popTimeout();
withName will repeatedly test the name, and control will return to your script once the name matches or the timeout is reached.
Per Apple's Doc
withName:
Tests if the name attribute of the element has the given string value. If the match fails, the test is retried until the current timeout expires.
Timeout Periods:
If the action completes during the timeout period, that line of code returns, and your script can proceed. If the action doesn’t complete during the timeout period, an exception is thrown.
https://developer.apple.com/library/etc/redirect/xcode/ios/e808aa/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/UsingtheAutomationInstrument/UsingtheAutomationInstrument.html#//apple_ref/doc/uid/TP40004652-CH20
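If renaming the button isn't an option, a hedged alternative is to poll for visibility yourself; this sketch assumes the same awesomeButton lookup as above, and the one-minute cap is an arbitrary choice:

var target = UIATarget.localTarget();
var awesomeButton = target.frontMostApp().mainWindow().buttons()[0];

// poll up to 60 times, one second apart (roughly one minute in total)
var found = false;
for (var i = 0; i < 60 && !found; i++) {
    if (awesomeButton.isVisible()) {
        found = true;
        awesomeButton.tap();       // proceed as soon as the button appears
    } else {
        target.delay(1);           // sleep briefly instead of blocking on a timeout
    }
}
if (!found) {
    UIALogger.logError("Button never became visible");
}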

Perl CGI gets parameters from a different request to the current URL

This is a weird one. :)
I have a script running under Apache 1.3, with Apache::PerlRun option of mod_perl. It uses the standard CGI.pm module. It's a regularly accessed script on a busy server, accessed over https.
The URL is typically something like...
/script.pl?action=edit&id=47049
Which is then brought into Perl the usual way...
my $action = $cgi->param("action");
my $id = $cgi->param("id");
This has been working successfully for a couple of years. However we started getting support requests this week from our customers who were accessing this script and getting blank pages. We already had a line like the following that put the current URL into a form we use for customers to report an issue about a page...
$cgi->url(-query => 1);
And when we view source of the page, the result of that command is the same URL, but with an entirely different query string.
/script.pl?action=login&user=foo&password=bar
A query string that we recognise as being from a totally different script elsewhere on our system.
However crazy it sounds, it seems that when users access a URL with a query string, the query string the script sees is one from a previous request to another script. Of course the script can't handle that action and outputs nothing.
We have some automated test scripts running to see how often this happens, and it's not every time. To throw some extra confusion into the mix, after an Apache restart, the problem seems to initially disappear completely only to come back later. So whatever is causing it is somehow relieved by a restart, but we can't see how Apache can possibly take the request from one user and mix it up with another.
This, it appears, is an interesting combination of Apache 1.3, mod_perl 1.31, CGI.pm and Apache::GTopLimit.
A bug was logged against CGI.pm in May last year: RT #57184
Which also references CGI.pm params not being cleared?
CGI.pm registers a cleanup handler in order to clean up all of its cache... (line 360):
$r->register_cleanup(\&CGI::_reset_globals);
Apache::GTopLimit (like Apache::SizeLimit mentioned in the bug report) also has a handler like this:
$r->post_connection(\&exit_if_too_big) if $r->is_main;
In pre-1.31 mod_perl, post_connection and register_cleanup appear to push onto a stack of handlers, while in 1.31 the GTopLimit handler appears to clobber the CGI.pm entry. So if your GTopLimit function fires because the Apache process has grown too large, CGI.pm won't be cleaned up, leaving it open to returning the same parameters the next time it is used.
The solution seems to be to change line 360 of CGI.pm to:
$r->push_handlers( 'PerlCleanupHandler', \&CGI::_reset_globals);
Which explicitly pushes the handler onto the list.
Our restart of Apache temporarily resolved the problem because it reduced the size of all the processes and gave GTopLimit no reason to fire.
And we assume it has appeared over the past few weeks because we have increased the size of the Apache processes through new developments that included something that wasn't there before.
All tests so far point to this being the issue, so fingers crossed it is!