I'm using CasperJS to do some screen scraping and I am running into a strange problem. When I navigate to a URL using my web browser the DOM looks entirely different from what CasperJS encounters when it navigates to the same URL.
To this end, I'd like to dump the remote DOM via CasperJS in order to troubleshoot what is going on.
Has anyone done such a thing?
Tips appreciated!
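For reference, here is a minimal sketch of what such a dump could look like with the standard CasperJS API (the URL and output filename are placeholders): getHTML() returns the serialized DOM exactly as CasperJS currently sees it, which you can echo to the console or write to disk for diffing against what your browser shows.

var casper = require('casper').create();
var fs = require('fs');

casper.start('http://example.com/', function () {
    // Dump the DOM as CasperJS sees it at this point in the run
    this.echo(this.getHTML());
    // Also save it to disk so it can be diffed against the browser's DOM
    fs.write('casper-dom.html', this.getHTML(), 'w');
});

casper.run();

If the page builds its DOM asynchronously, adding a waitForSelector() step before the dump may be needed to capture the final state.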
I'm trying to learn how to use lit as a dev tool for making web components and I'm having issues with making it run on my system.
In the documentation it states that you just need to run "npm i lit" in the folder of your project, and once it installs successfully you should be up and running.
I did the "simple greeting" test that is also available in the documentation, but it returns a blank page. I even copy-pasted both the TS and the HTML. Still a blank page.
In my HTML, if I put something directly into the page, it shows up, so I know the problem is with the custom web component (named simple-greeting).
I already did a course on native web components and I understand how they work, but I've never worked with Lit.
Isn't it just a matter of importing the necessary things from the library (like html, css, LitElement) and using them in TS?
Am I missing something? I am really confused and can't find anything online.
Thanks in advance.
It seems I forgot to run the web server. Still a long way to go, I guess.
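For anyone else hitting the same blank page: the component itself is usually fine; the bare import from 'lit' simply never resolves when the HTML is opened straight from disk, so nothing registers and the page stays blank. As a rough illustration (the file name and the dev-server command are assumptions), a plain-JavaScript version of the documented simple-greeting example looks like this:

// simple-greeting.js -- plain-JS equivalent of the documented Lit example
import { LitElement, html, css } from 'lit';

class SimpleGreeting extends LitElement {
  // Scoped styles for the component's shadow DOM
  static styles = css`p { color: blue; }`;

  // Reactive property, settable via the name="" attribute
  static properties = { name: {} };

  constructor() {
    super();
    this.name = 'Somebody';
  }

  render() {
    return html`<p>Hello, ${this.name}!</p>`;
  }
}

customElements.define('simple-greeting', SimpleGreeting);

The HTML then loads it with <script type="module" src="./simple-greeting.js"></script> and uses <simple-greeting name="World"></simple-greeting>, and the whole thing has to be served by something that resolves bare module specifiers, for example npx @web/dev-server --node-resolve (or any bundler), which is exactly the web-server step that was missing.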
I'm working on a project that runs headless Chrome through Puppeteer, and I recently found a website (https://chrome.browserless.io/) that can show the Puppeteer session in the browser. The site embeds Chrome DevTools, which looks like magic.
I tried to figure out how it works, and I found that the site injects a Chrome DevTools iframe whose URL usually looks like this:
https://chrome-devtools-frontend.appspot.com/serve_file/#7f3cdc3f76faecc6425814688e3b2b71bf1630a4/inspector.html?wss=chrome.browserless.io/devtools/page/(4BDC5841A823B95BF9B6107801819A31)&remoteFrontend=true
I think the version hash in front of inspector.html refers to the Puppeteer/Chrome build, but I don't know how this works.
I think this is implemented on top of the DevTools Protocol. I searched the documentation but found nothing about how to embed a Chrome DevTools iframe in the browser.
Does anyone know how to do this, or of any documentation about it?
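Not an answer to how Browserless wires it up internally, but as a sketch of the general mechanism: Chrome started with a remote debugging port exposes an HTTP endpoint, /json/list, that describes every debuggable page and includes a ready-made devtoolsFrontendUrl of exactly the hosted-frontend form quoted above. Roughly (the port and target URL here are assumptions):

const http = require('http');
const puppeteer = require('puppeteer');

(async () => {
  // Launch Chrome with a fixed remote-debugging port (assumes 9222 is free)
  const browser = await puppeteer.launch({
    args: ['--remote-debugging-port=9222'],
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // /json/list returns one entry per target, each with a devtoolsFrontendUrl
  // that a separate tab or iframe can load to inspect that page.
  http.get('http://127.0.0.1:9222/json/list', (res) => {
    let body = '';
    res.on('data', (chunk) => (body += chunk));
    res.on('end', () => {
      for (const target of JSON.parse(body)) {
        console.log(target.title, '->', target.devtoolsFrontendUrl);
      }
    });
  });
})();

The wss=... parameter in the URL you quoted is presumably that same per-page DevTools WebSocket endpoint, just proxied through chrome.browserless.io.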
The Browserless Chrome Debugger you mentioned in your question has an instance of the CodeMirror text editor embedded in its left-hand section.
In the right-hand section, an iframe element displays the result of the code you executed.
Simply put, you can simulate this type of behavior and allow users to execute Puppeteer code directly from your website by following a series of steps (a rough server-side sketch follows after the note below):
Sandbox a section of your system with a system container manager, such as LXD.
Install Node.js, NPM, and Puppeteer.
Install a web-based code editor, like CodeMirror, and embed it into a web page.
Validate and send an AJAX request with the code from the text editor to your sandbox server.
Sanitize the code, and then pass the code to Puppeteer.
Return the result to your callback function in your AJAX request on the client-side.
Format and sanitize the result before displaying it in an iframe.
Note: This is a naive implementation of this concept, intended only to explain the bare necessities required to achieve the goal in the question.
Make sure that you follow all generally recommended security practices.
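As a bare-bones illustration of steps 4-6 (the /run route, port and request shape are assumptions, and all of the validation, sanitizing, timeouts and resource limits described above are omitted), the piece running inside the sandboxed container could look roughly like this:

const express = require('express');
const puppeteer = require('puppeteer');

const app = express();
app.use(express.json());

// Receives { code: '...' } from the CodeMirror editor via AJAX.
app.post('/run', async (req, res) => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  try {
    // Treat the submitted text as the body of an async function that
    // receives the Puppeteer page. Only tolerable because the whole host
    // is sandboxed (step 1) and the code has been validated/sanitized.
    const userScript = new Function(
      'page',
      `return (async () => { ${req.body.code} })();`
    );
    const result = await userScript(page);
    res.json({ ok: true, result });
  } catch (err) {
    res.status(400).json({ ok: false, error: String(err) });
  } finally {
    await browser.close();
  }
});

app.listen(3000);

The client side then just POSTs the editor contents to /run and renders the JSON it gets back into the result iframe.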
I have installed SilverStripe on several servers successfully in the past (but I'm not a SilverStripe expert). This time my SS install fails to work and I'm at a loss how to fix it.
The Problem
SilverStripe 2.4.6 installed correctly on the server (AFAIK).
The front end works as expected. (It shows the default theme and all pages load correctly.)
I am able to log into the CMS admin section successfully. The CMS loads, but when changing site pages in the CMS using the browser pane on the left, the CMS shows the circular loading symbol and the new page load never completes.
Using the console of Firebug in Firefox - When attempting to change pages in the CMS (by clicking on the page browser pane) the CMS tries to load two pages. The second page request 404s.
The first GET request is from the initial page load.
The following POST+GET requests fire when clicking on the page tree to change pages.
Attempting to Find the Solution
I've tried deleting and re-installing SilverStripe twice (2.4.7 and 2.4.6). Both times the problem recurs.
A strange thing is that this server is already running two other SilverStripe sites (both of which I installed without a hitch). All three websites are accessed via different domains. I tried accessing this install via another domain, thinking there might be something wrong with how this third domain is configured, but that didn't help either.
What should I try now? I'm stumped.
Thanks in advance.
Responses to Comments
Check your root .htaccess file. Make sure RewriteBase is set to /
Checked. Full .htaccess on PasteBin
Indeed the JavaScript URL is strange. Check if there is anything unusual about what's being returned from the previous POST request. Is the site running in dev, test or live mode?
I can't see anything unusual in the POST request.
Clue found: The site is running in DEV mode. Switching to LIVE mode makes the problem disappear. Also, the second GET request only shows up in DEV mode.
Example POST request with response.
Example GET request with response.
This is a workaround more than a fix, but if you'd rather be coding than bug hunting it might be worth a go! (Remember to log out of SilverStripe before applying it.)
In your mysite/_config.php file, change

Director::set_environment_type("dev");

to

// Only the presence of the isDev parameter matters here, not its value.
if (!isset($_GET['isDev'])) {
    Director::set_environment_type("dev");
} else {
    Director::set_environment_type("live");
}
Then you can develop the website in dev mode as normal, and to use the admin in live mode (and avoid the bug) you just go to: http://{your_domain}/admin?isDev=0
N.B. I might find a proper answer when pastebin.com isn't overloaded and I can see your responses!
I am working on a web application that needs silent printing (without print dialog box) with client side printer.
After some research we found that we can make it work using ActiveX and Foxit Reader.
Currently it works great, but it constrains us to IE only, and we want to make it work with Firefox and Chrome as well.
I know there is no direct code to make it work, but there must be a workaround?
What I need is a starting point, e.g. a Chrome/Firefox plugin to access the local printer, a Windows service that runs in the background on the client side, changing browser settings, using ActionScript, etc.
It would also be great if someone could illustrate how Facebook accesses the local webcam from its website, as that may shed some light on accessing clients' peripherals from a website. Thanks in advance.
For anyone who still wants information regarding printing on Chrome and/or FireFox without the use of ActiveX, extensions or client-side scripting, please see my answer to another question similar to this.
I have a website with two frames. Actions performed in one frame (entering data in a text box, selecting a radio button, clicking a link) cause the other frame to load data with JavaScript. I need to be able to enter data in the first frame and scrape the data in the second. What can I do for this?
Load the website in Firefox, then turn on the Firebug extension, enable the 'Net' tab, and have a look at the HTTP data being sent to and from the browser.
Sometimes it can help to try to forget what the webpage looks like, and concentrate on the posts and responses you see in Firebug's Net tab -- that's all you need to reproduce to get your data out.
You can either:
Reverse engineer the JS (monitoring HTTP traffic can help) to figure out what data actually gets sent to the server and then replicate that in your Perl.
Use WWW::Mechanize::Firefox to run a complete browser stack and interrogate it to read the results.