Hashbang link previews and crawling in Facebook and other popular services - facebook

Do any popular websites currently support previewing links that contain hashbangs?
For example, pasting a link like
https://groups.google.com/forum/#!topic/pigfi/cF6GzPfeuO8
into a Facebook chat message.
(This translates to https://groups.google.com/forum/?_escaped_fragment_=topic/pigfi/cF6GzPfeuO8 )
There is a spec for crawling hashbang URLs: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started
However, I have found little information about which websites or popular services, besides Googlebot, currently support crawling or previewing this kind of hashbang URL.
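For reference, the translation shown above can be sketched in a few lines of Python. This is only a minimal sketch based on the AJAX crawling scheme linked above: the function name `to_escaped_fragment` is made up for illustration, and the choice of which characters to leave unescaped is a simplification rather than a faithful copy of the spec.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment(url: str) -> str:
    """Rewrite a #! URL into the form a crawler following the scheme would request."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hashbang URL, nothing to translate
    # Percent-encode the fragment value. "/" and "=" are kept readable here
    # (a simplification); "&", "#", "+", "%" and spaces are escaped.
    value = quote(parts.fragment[1:], safe="/=")
    token = "_escaped_fragment_=" + value
    # Append to an existing query string if there is one.
    query = parts.query + "&" + token if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment("https://groups.google.com/forum/#!topic/pigfi/cF6GzPfeuO8"))
```

A service that wanted to preview such links would fetch the rewritten URL instead of the original one, which only helps if the target site actually answers `_escaped_fragment_` requests.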

Related

Crawl Errors in Google Webmaster Tools

I have a selection of crawl errors in my Google Webmaster tools for links that no longer exist on my site. These are a result of an old hack, where a pharmacy hack linked to PDFs. These have all been removed months ago, but external sites are still linking to these pages, which are then causing crawl errors.
Is there a way to alert Google that these links are fake/spam?
There is a special page where Google lets you report various kinds of spam pages; you should check it:
https://www.google.com/webmasters/tools/spamreport?hl=en&pli=1

How does the server know when to serve an AMP page

I understand that there will be a version of a site with HTML designed for desktop devices and then the AMP pages.
Is there anything I need to do so that the site serves AMP content to mobile devices?
Good question!
Summary:
AMP supplies no automatic means of redirection, only the markup that lets Google's search engine (and potentially other websites, apps, and search engines) send users to the AMP page.
Old methods of redirecting mobile users to mobile sites could be used, typically by detecting mobile user agents and sending them to the AMP page with a 301/302 redirect.
Redirecting mobile users might not be worth doing, because those old methods are unreliable.
Full answer:
In terms of Google and the search engine results page (SERP), you will need to include this in your desktop markup:
<link rel="amphtml"
href="https://www.example.com/url/to/amp/document.html">
and this in your AMP markup:
<link rel="canonical"
href="https://www.example.com/url/to/standard/document.html">
so that Google and other high-traffic networks like Twitter, LinkedIn or Pinterest, will detect the amphtml signature and direct mobile browsers to the AMP page accordingly. I would say Facebook but since AMP is a competitor product to Facebook Instant Articles, I suspect they will drag their heels.
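To make the discovery step concrete, here is a rough sketch of how a consumer of your desktop page could pick up the `amphtml` link. It uses only Python's standard `html.parser`; the names `AmpLinkFinder` and `find_amp_url` are invented for this example, and real crawlers will certainly do more validation than this.

```python
from html.parser import HTMLParser
from typing import Optional


class AmpLinkFinder(HTMLParser):
    """Remembers the href of the first <link rel="amphtml"> it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.amp_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.amp_url is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if "amphtml" in rels:
            self.amp_url = attr.get("href")


def find_amp_url(html: str) -> Optional[str]:
    """Return the AMP URL advertised by the page, or None if there is none."""
    finder = AmpLinkFinder()
    finder.feed(html)
    return finder.amp_url


page = """<html><head>
<link rel="amphtml"
      href="https://www.example.com/url/to/amp/document.html">
</head><body></body></html>"""
print(find_amp_url(page))
```

The point is that the link is only a hint for whoever reads the markup; nothing in it causes a browser to navigate anywhere on its own.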
AMP is of course a completely different animal, being both open-source and a web technology as opposed to a native app platform for content, but the web and native platforms stand in opposition to one another and while Google provides a great number of apps, it's clear from technologies like ServiceWorker that they are pushing for the web as a content platform—which should come as no surprise because time spent in Facebook or Apple apps is time spent away from Google search and its advertising, from which Google derives its revenue.
But I digress: obviously this rel="amphtml" declaration only instructs Google et al. to send mobile users to the AMP page from their own result pages. This is because a redirection policy was never the intention of Google or the AMP team, who rather envisage a world where everybody arrives through Google or another big player instead of visiting a site directly or following a link in an email or something.
In theory, it might one day be implemented at the browser level, but it takes browser vendors long enough to standardise essential layout/styling properties and JavaScript APIs, let alone a non-standard consideration such as AMP currently is. Apple will drag its heels when it comes to the browser because it would compete with their own News app.
We can probably expect that AMP redirection will be implemented in the Chrome browser (and therefore Opera), but even that could be a while yet. So, in order to force mobile devices to redirect to the AMP pages as opposed to your standard ones, you'd ultimately need to configure your web server to sniff for mobile user-agents (or less commonly supported MIME types) and redirect (use 302 for the sake of SEO) them to the AMP pages.
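As a rough illustration of that server-side sniffing, the decision could look like the sketch below. Everything specific in it is an assumption: the token list is deliberately short and will go stale, the function name `amp_redirect` is made up, and the "/amp/" suffix is just one hypothetical URL layout.

```python
import re
from typing import Optional, Tuple

# Illustrative tokens only; real user-agent lists are longer and need upkeep.
MOBILE_UA = re.compile(
    r"android|blackberry|iemobile|iphone|ipod|opera mobi|webos", re.IGNORECASE
)


def amp_redirect(path: str, user_agent: str) -> Optional[Tuple[int, str]]:
    """Return (status, location) for a mobile request, or None to serve the page as-is."""
    if path.rstrip("/").endswith("/amp"):
        return None  # already on the AMP page; avoid a redirect loop
    if not MOBILE_UA.search(user_agent or ""):
        return None  # not recognised as mobile: serve the standard page
    # Hypothetical layout: the AMP version lives under "<page>/amp/".
    return 302, path.rstrip("/") + "/amp/"


iphone = "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46"
print(amp_redirect("/article", iphone))
```

Whatever framework or web server you use, the shape is the same: check that the request is not already for an AMP URL, test the user agent, and answer with a temporary redirect.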
This may seem like something of a regression to past habits, and you'd be right to think so. The redirection will immediately slow the journey down a little, though AMP is valuable for its on-page optimisations as well as its HTTP response/transport time. Before the advent and current zenith of responsive web design, this is how mobile users would be catered for, especially in the WAP days. Websites would serve a mobile-friendly version under a subdomain like mob.website.com or m.website.com. There were flavours of XHTML targeting mobile devices, which are still used by Google+ for its "basic" pages (note the DOCTYPE). These "basic" pages are reserved for devices with a low screen resolution, as we can see from this line:
<link rel="alternate"
media="only screen and (max-width: 640px)"
href="/app/basic/+SOME_PAGE">
This approach may have even served as an inspiration for AMP.
A similar redirection practice should hopefully not present a difficulty for you, because you probably intend to use amp.website.com or perhaps a subdirectory for your AMP pages anyway.
Since all websites should be responsive anyway — in terms of SEO, and because targeting only mobile devices is made more difficult by the unreliability of redirection techniques and of using user-agents and MIME types as a detection method — you might be tempted to try to estimate the connection speed or physical location of the visitor.
Then, if the connection speed is low, or if the user is located far from your origin server, it might be best to redirect them to the AMP page (since it is served from Google's CDN and uses HTTP/2 + heavy caching to serve content faster).
However, any CDN can be used for all pages to deliver them faster to everybody, not just slow connection users or people located far from your origin server; the point of AMP is only partially to deliver content via CDN and perhaps more about serving responsibly constructed pages to devices which are well known for their crappy JavaScript execution times, like mobile phones.
Ultimately, I wouldn't enforce a redirect for all mobile users. I would leave it to Google to direct visitors arriving via its search engine to be sent to the AMP pages. If AMP is going to catch on and be a long-lived product, browsers will implement it eventually.
Come to think of it, if you're serving content to mobile devices, it might be irresponsible to serve AMP pages to people using older Windows Phone or Blackberry devices whose browsers may not even properly support AMP.
There's a lot to think about but I hope I have provided an answer to your question, and if not, at least some considerations to bear in mind before deciding on the right answer for your product.
For more information about separate mobile sites, you can read this documentation on the subject provided by Google.
For examples of how to configure your web server to detect mobile user-agents and send them to a different subdomain, you can find articles or code samples quite easily if you search for them.
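Just for completeness, I used the following redirect to serve AMP pages to certain user agents. It's a .htaccess file for an Apache web server with mod_rewrite enabled:
<IfModule mod_rewrite.c>
# Rewrite rules are ignored unless the engine is switched on.
RewriteEngine On
RewriteBase /
# Skip requests that already point at an AMP URL, to avoid a redirect loop.
RewriteCond %{REQUEST_URI} !/amp/$ [NC]
# Match common mobile user agents (case-insensitive).
RewriteCond %{HTTP_USER_AGENT} (android|blackberry|googlebot\-mobile|iemobile|iphone|ipod|opera\ mobile|palmos|webos) [NC]
# Temporary (302) redirect from /page to /page/amp/.
RewriteRule ^([a-zA-Z0-9-]+)([\/]*)$ https://www.yoursite.com/$1/amp/ [L,R=302]
</IfModule>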
IMHO this question deserves more interest.
amdouglas's answer is great and fully covers the front-end aspects. As for the server side, Jesús Diéguez Fernández's answer is a good start, but its list of user-agent signatures would need to be maintained over time to remain accurate.
To complete this (even though it was not a PHP-related question), below is a PHP snippet I use server-side to redirect requests from mobile devices.
It uses Mobile-Detect (+23M downloads), which holds an incredibly complete and up-to-date list of user-agent strings (and should be easy to adapt to any programming language).
<?php
// Mobile-Detect library; adjust the path to wherever it is installed.
require_once "libs/Mobile_Detect.php";
// The class shipped in Mobile_Detect.php is named Mobile_Detect.
$detector = new Mobile_Detect();
// True when the request's user agent matches a known mobile device.
$is_mobile = $detector->isMobile();

redirect for smartphones and Googlebot-mobile

I'm building a mobile version of my site for smart-phones
(iPhone/Blackberry/Android/WebOS)
and I want to redirect to the mobile version from my main site whenever the user agent is of one of the kinds listed above (my mobile site is on a different url than my Desktop site).
My mobile version is more like a WebApp and does not contain the same content as the Desktop site.
After reading This Post by Google I understand that the Googlebot expects smartphones to display the Desktop version of the site (Googlebot-Mobile is not used for smartphones)
I'm afraid that if I redirect smartphones to the mobile version, Google will penalize me for cloaking. How can I avoid this?
I know that including a link from the main site to the mobile version and vice versa helps a lot.
Any other advice/best practices on how to be google friendly when creating mobile versions of the site for smartphones?
From the article:
For Googlebot and Googlebot-Mobile, it does not matter what the URL structure is as long as it returns exactly what a user sees too.
The key thing is you must be consistent in the content you give to the bot and the one you serve to the user.
Another interesting excerpt from the article:
For now, we expect smartphones to handle desktop experience content so there is no real need for mobile-specific effort from webmasters. However, for many websites it may still make sense for the content to be formatted differently for smartphones, and the decision to do so should be based on how you can best serve your users.
You can also serve a different page/content/styling based on the UA string, as stated in the article:
If you serve all types of content from www.example.com, i.e. serving desktop-optimized content or mobile-optimized content from the same URL depending on the User-agent, this will also lead to correct crawling by Googlebot and Googlebot-Mobile. This is not considered cloaking by Google.
I think it all boils down to how different the content/styling is. If it's only slightly different, I would probably go with the same URL serving both. If it's dramatically different, I would use a different URL for smartphones.
Hope this helps!
Updating this with current information. Google now crawls with a smartphone Googlebot-Mobile user agent. See: Google blog post
Google's SEO PDF explains how to avoid cloaking penalties. Specifically, see Page 27. See: SEO PDF
The gist is, the content you serve a desktop user can be different from the content you serve a mobile user, as long as Googlebot is always served the same content you serve to any desktop user, and Googlebot-Mobile is always served the same content you serve to any mobile user. To abide by this, it seems to me you should not configure your site to serve mobile content based on finding "Googlebot-Mobile" in the user agent. The bot will supply a typical smartphone user agent string as part of its own user agent. That is the part to rely on; otherwise, if a new device comes out that you do not yet account for, you'll serve desktop content to it, but mobile content to Googlebot-Mobile impersonating that device.
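Here is a small sketch of what relying on the device part looks like in practice. The token list and the function name `wants_mobile` are illustrative assumptions, and the user-agent strings are only modelled on the general shape of published ones (exact versions vary), so treat this as a way to check your own rules rather than a reference list.

```python
import re

# Illustrative device tokens. Note that no crawler name appears in the list.
DEVICE_TOKENS = re.compile(r"iphone|ipod|android|blackberry|iemobile|webos", re.IGNORECASE)


def wants_mobile(user_agent: str) -> bool:
    """Classify a request purely by the device tokens in its user agent."""
    return bool(DEVICE_TOKENS.search(user_agent))


# Example strings for illustration; real ones differ in version details.
IPHONE = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 "
    "(KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"
)
SMARTPHONE_BOT = IPHONE + " (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"
DESKTOP_BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# The bot impersonating a phone is classified exactly like the phone itself.
print(wants_mobile(IPHONE), wants_mobile(SMARTPHONE_BOT), wants_mobile(DESKTOP_BOT))
```

Because the rule never mentions the bot, the crawler and the real device can only ever fall into the same bucket, which is the consistency the cloaking guidance asks for.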
You could use a subdomain for your mobile site and redirect Google's mobile bot there together with smartphones.

Facebook Applications

I just wanted to know if there is a way to host a Facebook application on Facebook's servers and not elsewhere. Does Facebook provide hosting for applications?
Thanks
Facebook does not provide hosting for Facebook applications. There are currently two types of Facebook applications: iframe and FBML. Iframe apps can be coded using the SDKs in your language of choice and are a bit more open as far as JavaScript, databases, and other functionality go. FBML apps must be written using Facebook's markup language (FBML), FBJS, and FQL for queries. This route is a bit more limited, as you can only use the FB markup, JS, and query languages. Whichever of these paths you choose, you will need to host your code yourself.
You should check this out:
http://developers.facebook.com/docs/guides/canvas/
Facebook recently updated these docs with the release of the Graph API, and they are much better than before. It's a good place to get started.

Facebook App vs. Facebook Connect site

I'm reading Facebook's documentation so I can figure out how to enable Facebook Connect on my site. What confuses me is which parts apply to Facebook applications and which to Facebook Connect, because I'll be reading along, thinking I'm learning about Facebook Connect, but then I'll reach a section that mentions Facebook applications. For example, here's an excerpt from the page on Data.getCookies:
This method returns all cookies for a given user and application.
Cookies only apply to Web applications; they do not apply to desktop applications.
I think of my website as a Web application, but I can't tell if "Web applications" simultaneously refers to Facebook Connect sites and Facebook applications. How can I tell if what I'm reading applies to Facebook Connect and not just Facebook apps?
In that context, "Web applications" refers to canvas based apps with Facebook. "Desktop apps" is the other type mentioned there, and refers to a non-web app like a widget for your system tray in Windows.
I would look at the Facebook platform as a set of APIs:
Facebook canvas applications (Apps you use in FB. What users think of as "Facebook apps")
  FBML / FBJS apps
  Iframe canvas apps
Facebook desktop applications (Rare)
Facebook connect applications (Websites with elements of FB in them. CNN, Digg)
  Web
  iPhone
Note that all of these can access the Facebook API, the REST and FQL interface. Most of the documentation is for FBML canvas applications. On the left side of the Facebook developer wiki you can see a few top-level options:
API (you can always use this)
FBML (canvas apps only)
XFBML (Facebook connect only)
FQL (you can always use this)
FBJS (mostly canvas apps, some connect functionality)
I'm sure you've seen:
http://wiki.developers.facebook.com/index.php/Facebook_Connect
Which is the main connect documentation. I hope this helps you get organized.
Good luck!
Many aspects of FB web applications (like FBML and FQL) are common to both FB apps and FB Connect. I would say that FB Connect is more likely to be used on sites implementing more of FB's visual elements (FBML). Additionally, FB Connect can be used off-line (where the user does not have a current session directly with FB).
I admit that the documentation is fairly scattered and often quite vague - but once you keep reading more and more about it, the concepts become clearer. At least that was my experience.
I recently found a great blog post that describes the differences between FBML canvas pages, iframe canvas pages and Facebook Connect sites. It focuses more on the technical difference between FBML and iframe apps, but since these technologies are mentioned throughout Facebook's documentation, it seems almost essential for Facebook Connect developers to have a basic understanding of regular Facebook apps, even though they won't be working with them directly. I think knowing about this page a few months ago would have saved me a lot of heartache.