I am working on a project that modifies Apache Nutch. We have already swapped Nutch's original module for our own, built using HtmlUnit. I need to download a whole Facebook user page (e.g. http://www.facebook.com/profile.php?id=100002517096832), which will then be parsed by our own parser. Unfortunately, Facebook uses a mechanism called BigPipe (http://www.facebook.com/note.php?note_id=389414033919), so most of the page's content is hidden inside <!-- --> comment tags.
Normally, when you scroll down a Facebook page, new content is unpacked each time you are about to hit the bottom of the page. I tried using JavaScript to scroll my htmlPage (an HtmlPage object from HtmlUnit), but eventually realized that scrolling does not trigger the loading of new content on a Facebook user page.
How can I find out which event triggers content loading on the current Facebook page? Or should I approach the problem from a different side, for example by extracting the BigPipe "things" on my own? Has anyone done that?
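As a rough illustration of the "extract it on my own" idea, here is a minimal Python sketch that collects markup stored inside HTML comments. It is generic comment parsing, not BigPipe's actual protocol; the sample markup and the names CommentCollector and hidden_fragments are invented for the example.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the raw text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->"
        self.comments.append(data.strip())


def hidden_fragments(html):
    """Return the comment bodies that themselves look like markup."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return [c for c in collector.comments if c.startswith("<")]


# Invented sample: one comment holds markup, the other is a plain note.
sample = (
    '<div id="block_a"><code><!-- <div>first block</div> --></code></div>'
    "<!-- plain note -->"
)
print(hidden_fragments(sample))
```

Each returned fragment could then be handed to the regular parser as if it were ordinary page content.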
Before getting to your question … what kind of project are you trying to build there?
Since Apache Nutch is open source web-search software, I think you are trying to build some kind of search engine that scrapes Facebook user profiles/feeds to get data and make it searchable on some third-party website?
Well, that would be a violation of the Facebook Platform Policies:
I. Features and Functionality
12. You must not include data obtained from us in any search engine or directory without our written permission.
So, do you have that written permission?
Related
I have implemented AMP successfully for my web pages and Google has started indexing them, as I can see in the Webmaster tool. I am facing some issues that appear and then disappear within a short span of time.
The issues logged are:
User authored JavaScript found on page
The pages don't contain any script tags except the schema markup.
This error is shown for only a few of the 120 pages, even though they all follow the same template.
I have some more questions:
I have observed that different AMP URLs get redirected to their original page when the same AMP URL is opened in a web browser.
Is Google taking care of that, or is it on us to do the redirection?
I am planning to implement sign-in and share buttons on my web pages, which will use JavaScript. But if I do so, I get a validation error. So what is the right approach?
Can anyone please help me with this?
Please ensure that all script tags, apart from the AMP runtime and component scripts, are of type application/ld+json. There should be no executable code in these script tags.
Redirection is something that you must do on your end. Google doesn't do any sort of redirection from AMP to non-AMP pages if the URL is hit directly. In fact, the URL scheme that Google uses in its carousel is entirely its own, and just includes the path to your page inside it. E.g. https://cdn.ampproject.org/v/www.yoursitehere.com/path/to/article.html
Social sharing using JavaScript inserted in the page is not allowed, as no JavaScript is allowed. If you want to use social sharing, use a non-JavaScript implementation, or try out amp-social-share.
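The first point above can be checked roughly before submitting pages. The sketch below, in Python, flags every script tag that is neither JSON-LD nor loaded from the AMP CDN. The names ScriptAudit and find_suspect_scripts are made up for this example, the sample page is invented, and the check is far looser than the official AMP validator, which remains the authority.

```python
from html.parser import HTMLParser

# Assumption: AMP runtime and component scripts are served from this host.
AMP_CDN_PREFIX = "https://cdn.ampproject.org/"


class ScriptAudit(HTMLParser):
    """Records <script> tags that are neither JSON-LD nor AMP-hosted."""

    def __init__(self):
        super().__init__()
        self.suspect = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)
        script_type = (attr.get("type") or "").lower()
        src = attr.get("src") or ""
        if script_type == "application/ld+json":
            return  # structured data, allowed
        if src.startswith(AMP_CDN_PREFIX):
            return  # AMP runtime or component script
        # Keep the line number so the tag is easy to find in the template.
        self.suspect.append((self.getpos()[0], attr))


def find_suspect_scripts(html):
    """Return (line, attributes) for every script tag worth a closer look."""
    audit = ScriptAudit()
    audit.feed(html)
    audit.close()
    return audit.suspect


page = """
<html amp>
<head>
<script async src="https://cdn.ampproject.org/v0.js"></script>
<script type="application/ld+json">{"@type": "Article"}</script>
<script>console.log("tracking");</script>
</head>
</html>
"""
for line, attributes in find_suspect_scripts(page):
    print("suspect script at line", line, attributes)
```

Running this over the handful of pages that report the error, and comparing against pages that do not, may show whether something outside the template injects a script.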
Thanks for the response. Regarding each point:
Script tags: I am not using any scripts at the moment except the AMP ones.
Redirection: understood.
Social sharing: I implemented amp-social-share and it is working fine.
Can we implement AMP for eCommerce sites, which may include a lot of JavaScript, forms and plugins? As far as I know, AMP wants to keep things simple and therefore restricts most JavaScript, and the form tag is not valid at all. So is there any chance we can implement AMP on eCommerce sites?
I need to create a custom Facebook Page Tab app which will show an external site in an iframe. It needs to have adverts on it, but I'm not sure if this is possible, as the site is hosted externally.
I'm also not sure whether I need to sign up to the Facebook Audience Network to get approved, etc.
Any help or advice would be great.
Browsers will refuse to show an external site in an iframe when that site forbids it (for example with the X-Frame-Options header). Imagine working hard to create a site and then seeing others show all your content in iframes. That is naturally frustrating.
However, there is a candidate-solution: Let's suppose you create a page which sends a request to the other site and appends all the content into the body and head of your page. This is very much possible, so the solution is to:
1. Create a page in your site, let's call it outsider
2. In the server-side code of your outsider page send a request to the desired page to be shown
3. You will get the html of the page. Process it and include its content into the head and body of outsider. This includes:
3.1. Checking all the CSS to be reached, as the target page might refer to local CSS, which is unreachable locally at your end. Process the URLs of CSS files
3.2. Checking all the Javascript to be reached, as the target page might refer to local JS, which is unreachable locally at your end. Process the URLs of JS files
3.3. Apply the idea described in 3.1. and 3.2. for other resources, like images, until you are satisfied with the content of outsider
4. Create an iframe, having the source to point to outsider. outsider is inside your scope, so it should be shown
NOTE: If the site owning the target page does not like the possibility of you showing their content inside iframes, they might protect it by, let's say, having Javascript in their code, which checks whether the page is inside an iframe. Remove that code while processing the response to your request. If nothing else prevents you from showing the page in an iframe, then you should achieve success.
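The URL processing described in 3.1 to 3.3 can be sketched in Python as below. This is only an illustration under assumptions: the function name absolutize, the base URL and the sample markup are invented, and a regular expression is used for brevity where a real HTML parser would be more robust. It does not touch url(...) references inside CSS files.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." with single or double quotes.
ATTR_RE = re.compile(r'''\b(href|src)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)


def absolutize(html, base_url):
    """Rewrite relative href/src values so they resolve against base_url."""

    def fix(match):
        name, quote, value = match.groups()
        # urljoin leaves already-absolute URLs unchanged.
        return f"{name}={quote}{urljoin(base_url, value)}{quote}"

    return ATTR_RE.sub(fix, html)


# Hypothetical target page and a fragment of its markup.
base = "https://example.com/articles/page.html"
snippet = '<link rel="stylesheet" href="/css/site.css"><img src="img/logo.png">'
print(absolutize(snippet, base))
```

After this pass, the CSS, JS and image references in the fetched HTML point back at the original host, so they still load when the content is served from outsider.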
I inserted the Facebook code where required and absolutely nothing happens or shows up. I am using Dreamweaver CS6, and in live view or when testing in ANY browser, nothing shows up. It's a blank box. Any thoughts? I read somewhere that the website must be made into a Facebook application. Is that true?
Specifically, after activating the plugin and properly connecting it to the Facebook app, which does work and has been used extensively beforehand, comments simply fail to load. There is no comment box, and no past comments are pulled in either (note that the app already has historical comments).
In the plugin settings area on the right side of the screen where the comments check box exists, there is a read more link that takes you to the facebook developers page where it discusses adding the facebook comments iframe.
If there is some other location where comment settings might also exist (past the three fields that require the 2 keys and the app name) I could not find them and this was not referenced in the plugin setup.
We use the javascript SDK for login and sharing (iframes for the Like button). Javascript is loaded after the page load. We're seeing 1.5 to 3 second slower full page loads with Facebook enabled. What can we do to identify the cause and optimize perceived and real page load speed?
Make sure all JS includes come after the CSS includes, so that rendering is not held up.
Remote javascript loads are at the mercy of whoever is hosting them. Sometimes you can locally host them, but then you don't get the latest version, and some JS includes won't work if they're not included remotely.
Try putting the Facebook include as the very last element in the <body> tag. The actual Facebook logic won't run until the rest of the document has loaded, however.
I have successfully setup an iFrame based App using the Javascript SDK, and we are trying to enable it on a Page Tab.
It seems Facebook has changed some things lately, because the app breaks when added to the Page Tab. I even went as far as making sure that all external scripts were included in the main index.php file, and that the body tags were taken out.
No, I'm trying to find out if it is even possible to use methods such as the stream.publish within a Profile Tab at all.
It seems like it isn't. As far as I can tell, you can no longer use any social methods on the Profile Tab.
Here were two related articles on the subject:
insidefacebook.com/2010/08/19/facebook-moving-toward-iframes-over-fbml-for-canvas-apps-and-page-tabs/
-and-
developers.facebook.com/roadmap
If anyone can confirm or deny this, it would be a huge help. The Facebook docs are just all over the place.
Here's a link to the working App Canvas as it stands now: http://apps.facebook.com/votetesting/
I know that on tab pages, you cannot do any JS until the user clicks on something first. Maybe that is the problem.
-Roozbeh