I am trying to scrape a page that is modified by JavaScript after the initial load, using Scrapy on a Raspberry Pi.
I tried to install Docker and scrapinghub/splash to render the page before passing it to Scrapy, but realized Splash doesn't support ARM yet. Are there other options for scraping JavaScript-rendered pages with Scrapy on a Raspberry Pi?
Currently, a normal Scrapy request to the site returns only this HTML, because the site loads first and JavaScript then renders the entire content. So before the JavaScript runs, the page source looks empty:
<body class="notie8 notie9 lang-{{html.lang}}">
<!--<![endif]-->
<div loading-line></div>
<div page-layout>
<div ng-view></div>
</div>
</body>
</html>
For reference, the site I am referring to is: https://www.sreality.cz/hledani/prodej/byty?region=brno
Sreality uses an API; isn't that the way to go? For your URL, there's this API call: https://www.sreality.cz/api/cs/v2/estates?category_main_cb=1&category_type_cb=1&per_page=20&region=brno&tms=1502631428897 (look for XHR requests in your browser's developer tools).
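As a rough illustration of that suggestion, here is a minimal sketch that rebuilds the API call above with Python's standard library only. The helper names build_api_url and fetch_json are made up for this example, the parameter values are copied from the URL above, and treating tms as a millisecond timestamp is an assumption based on how the value looks. The layout of the returned JSON is not shown in this thread, so the sketch stops at decoding it.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint copied from the XHR call quoted above.
API_BASE = "https://www.sreality.cz/api/cs/v2/estates"


def build_api_url(region, per_page=20):
    """Build the listing URL using the query parameters seen in the XHR call."""
    params = {
        "category_main_cb": 1,
        "category_type_cb": 1,
        "per_page": per_page,
        "region": region,
        # Assumption: tms looks like a millisecond timestamp.
        "tms": int(time.time() * 1000),
    }
    return f"{API_BASE}?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Download the URL and decode the body as JSON (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_api_url("brno")
print(url)
# data = fetch_json(url)  # uncomment to hit the network
```

Inside a Scrapy spider the same URL can be passed to scrapy.Request, and the callback can decode the body with json.loads(response.text), so no JavaScript rendering is needed on the Pi.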
Related
I am trying to get the Facebook plugin working on my wiki. Following the guidelines provided by Facebook, I copied the JavaScript SDK into MediaWiki:Common.js, which should run it for all users.
Now I am trying to enter the simple HTML code, also generated by Facebook:
<div id="Facebook" class="fb-page" data-href="(my FB page)" data-small-header="false" data-adapt-container-width="true" data-hide-cover="false" data-show-facepile="true" data-show-posts="false"></div>
'My FB page' is of course replaced by the actual URL.
I tried two ways of entering the code.
Making a Template:Facebook page and linking to it with {{Facebook}} on the actual page.
Even though this is not recommended, forcing the HTML by setting $wgRawHtml = true; and entering the code in the ... tags.
Neither of these ways resulted in the Facebook plugin appearing on my wiki.
I am trying to implement the Facebook Registration Plugin on my website for landing pages.
I am following this guide:
https://developers.facebook.com/docs/plugins/registration/v2.0
I've written the following code:
<div id="registration">
<iframe src="https://www.facebook.com/plugins/registration?client_id=660604224016242&redirect_uri=http://automaton.in/store_user_data.php?&fields=[{"name":"name"},{"name":"email"},{"name":"password"},{"name":"gender"},{"name":"birthday"}]">
</iframe>
</div>
But nothing shows up except a box with a border.
I've also created my Facebook app with app ID 660604224016242.
Also, I am running my site locally without a local server. Do I need to run this on a server?
Please help me, I really need to implement this plugin!
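One thing that stands out in the snippet above is that the JSON in fields contains raw double quotes, which end the src attribute early. The following is only a sketch of how the query string could be percent-encoded before it is placed in the attribute. The helper name registration_src is made up for this example, and the sketch does not check whether the plugin itself is still served by Facebook.

```python
import json
from urllib.parse import urlencode

# Base URL copied from the iframe in the question.
PLUGIN_URL = "https://www.facebook.com/plugins/registration"


def registration_src(client_id, redirect_uri, field_names):
    """Return an iframe src whose query values are percent-encoded."""
    # Compact JSON list of {"name": ...} objects, as in the question.
    fields = json.dumps(
        [{"name": name} for name in field_names],
        separators=(",", ":"),
    )
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "fields": fields,
    })
    return f"{PLUGIN_URL}?{query}"


src = registration_src(
    "660604224016242",
    "http://automaton.in/store_user_data.php",
    ["name", "email", "password", "gender", "birthday"],
)
print(src)
```

The resulting string contains no double quotes, so it can sit inside src="..." without breaking the markup.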
I am trying to understand what kind of web applications Apache Wicket is suitable for, and it seems to be page-based from what I have seen. How can it be used to make single-page web apps as well?
At our company we have 3 SPAs built on Wicket. They all work :) Basically, our structure is:
<html>
<head>
</head>
<body>
<div wicket:id="menu">
<!-- navi links -->
</div>
<div wicket:id="main">
</div>
</body>
</html>
And we replace the main panel and its inner panels, effectively getting an SPA.
Ajax support in Wicket 6 is very good.
Yes, it's basically for page-based web apps, so it can also easily be used for a single-page web app.
I suggest just reading this short example of Hello world
After that, you can easily edit your equivalent of HelloWorld.html and HelloWorld.java to make really easy HTML in Java.
I need to take data from an HTML page, so I am using LWP to get the page content.
The response I got is partial, not the full source of the page.
...
<div style="display:none" id="QUERY" query=""></div>
<div style="display:none" id="COLL" idcoll=""></div>
<div style="display:none" id="BROWSE" field=""></div>
<div id="center"></div>
<div id="loading"></div>
...
When using a web debugger (Firebug) I can see hidden content under:
<div id="center"></div>
<div id="loading"></div>
How can I get the hidden data using Perl?
This has been breaking my mind for 3 days now!
Thanks ahead.
Let's say it is JS running... How can I see the content?
You could use WWW::Mechanize::Firefox. It seems to support JavaScript.
If the content is indeed added using JavaScript, you might be able to use WWW::Scripter with the JavaScript or Ajax plugin.
If it is not present in the HTML source that LWP fetches, it is added in some other way. There is probably JavaScript running, or the web server serves you and LWP different pages because of cookies or the user agent string.
Install Firebug or use the Safari Develop menu to see what AJAX/XHR requests are being done to the server, and with what POST/GET parameters. You can then use LWP or any other HTTP client module to do such a request.
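The same idea can be sketched in Python rather than Perl (the thread itself only names LWP). This is an illustration under assumptions: the endpoint and form fields below are hypothetical placeholders, and the real ones have to be copied from the network panel. The X-Requested-With header is what many AJAX libraries send, but whether this particular server checks it is not known from the thread.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_xhr_request(url, form_fields):
    """Recreate a POST that the page's JavaScript sends in the background."""
    body = urlencode(form_fields).encode("utf-8")
    headers = {
        # Some servers return different content depending on the user agent.
        "User-Agent": "Mozilla/5.0",
        # Header that many AJAX libraries add to their requests.
        "X-Requested-With": "XMLHttpRequest",
    }
    return Request(url, data=body, headers=headers)


def send(request, timeout=10):
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical endpoint and parameters, for illustration only.
request = build_xhr_request("http://example.com/search", {"q": "perl", "page": 1})
print(request.get_method(), request.full_url, request.data)
# html = send(request)  # uncomment to hit the network
```

If the response is the fragment that ends up inside the empty div, it can be parsed directly without running any JavaScript.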
Hey guys, I have developed a small site that I would like to embed into a tab on a Facebook page.
Previously I used this code to load it in an iframe, and it worked great:
<a onClick="outside_location.setInnerFBML(link_1);" style="cursor: pointer;">Link 1</a> | <a class="red" onClick="outside_location.setInnerFBML(link_2);" style="cursor: pointer;">Link 2</a>
<div id="outside_location"></div>
<fb:js-string var="link_1"><fb:iframe width="760" height="1280" frameborder='0' src='http://www.WebWhispers.in' /></fb:js-string>
<fb:js-string var="link_2"><fb:iframe width="760" height="1280" frameborder='0' src='http://google.com/' /></fb:js-string>
<script type="text/javascript" charset="utf-8">
var outside_location = document.getElementById('outside_location');
</script>
However, it has stopped working. I don't think Facebook allows iframes inside pages, only applications.
How can I load this page in without learning FBML? The site uses jQuery, so I can't use FBML anyway.
I know applications can use iframes. Can I make it an application and then embed the application into a page tab somehow?
No. Tab pages cannot contain iframes. They must be written using FBML and FBJS.
One reason for this is that Facebook does not want to enable Tab pages to detect who looks at them. All requests (including images) on tab pages are proxied through Facebook for this reason. If iframes were allowed then the application would be able to detect who looked at it, which would present a privacy issue for Facebook users.
This is either a policy change by Facebook or, more likely, a bug. I say it's unlikely to be a policy change as it throws a script error, whereas a policy change would more likely strip the code out before it's rendered.
There's a bug report you can add votes to and follow here.