How do I check if a page is Google cached in Perl? - perl

I have tried many modules, but none of them seems to work well.
Is there a way to check whether a page I supply, for example:
http://bloggingheads.tv/forum/member.php?u=12129
is available in the Google cache?
I'm using Perl.

Try connecting to
http://www.google.com/search?q=incache:[url]
(Note that the [url] part must be URL-encoded.)
For example:
http://www.google.com/search?q=incache%3Ahttp%3A%2F%2Fbloggingheads.tv%2Fforum%2Fmember.php%3Fu%3D12129
If your page is stored in the Google cache, the response contains a search result. If it isn't, the response contains the text "Your search did not match any document".
You can parse the response for that text to find out whether your page is in the Google cache.
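The steps above can be sketched as follows. This is a minimal illustration in Python rather than Perl (in Perl, `uri_escape` from URI::Escape and LWP::UserAgent would be the likely counterparts). The function names are hypothetical, the "not found" marker is simply the text quoted in the answer, and Google may block or change its response to automated requests, so treat the result as a hint only.

```python
import urllib.request
from urllib.parse import quote

# Text the answer says appears when the page is NOT in the cache (assumption:
# Google still words its "no results" message this way).
NOT_FOUND_MARKER = "did not match any document"


def build_cache_query_url(page_url):
    """Build the search URL from the answer, with the query fully URL-encoded."""
    return "http://www.google.com/search?q=" + quote("incache:" + page_url, safe="")


def looks_cached(result_html):
    """Return True when the 'no results' marker is absent from the result page."""
    return NOT_FOUND_MARKER not in result_html


def is_cached(page_url):
    """Fetch the result page and inspect it (performs a real network request)."""
    request = urllib.request.Request(
        build_cache_query_url(page_url),
        headers={"User-Agent": "Mozilla/5.0"},  # hypothetical browser-like UA
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return looks_cached(html)
```

Calling `build_cache_query_url("http://bloggingheads.tv/forum/member.php?u=12129")` reproduces the encoded example URL shown above.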

Related

Is it possible to create a canonical URL for pages with just a pageid in confluence?

I need to group pages by space in the Piwik web analytics software. The cleanest way would be for every URL to contain the space key, so that each space owner can easily get a complete view of their space and keep all functionality, such as browsing the site with an analytics overlay.
Some URLs are canonical, but others look like /pages/viewpage.action?pageId=199921170
Is there some way, through the AJS API or another method, to find or force a working URL of the form /display/spacekey/title-of-page?
The most important part is to have the space key in the URL. If there's no workaround, I might just generate an invalid URL by inserting the space key and let each space owner fix their page titles if they want working analytics :-)
We are running Confluence version 5.10.7.
(There's an unresolved open issue at https://jira.atlassian.com/browse/CONF-11285 concerning the broader problem of Confluence sometimes producing ugly URLs.)
Instead of adding an incorrect URL, you could use a custom variable to record the space key, e.g.
_paq.push(['setCustomVariable','1','Space Key', AJS.params.spaceKey]);

AMP errors in web master tool

I have implemented AMP for my web pages, and Google has started indexing them, as I can see in Webmaster Tools. However, I am seeing some issues that appear and then disappear again within a short time.
The issue logged is:
User authored JavaScript found on page
The pages don't contain any script tags except for the schema markup.
The error is shown for only a few of the 120 pages, even though they all follow the same template.
I have some more questions:
I have noticed that different AMP URLs get redirected to their original pages when the AMP URL is opened directly in a web browser.
Does Google take care of this, or is it up to us to do the redirection?
I am planning to add sign-in and share buttons to my web pages, which will use JavaScript. But if I do so, I get a validation error. What is the right approach?
Can anyone please help me with this?
Please ensure that all script tags are of type application/ld+json. There should be no executable code in these script tags.
Redirection is something that you must be doing on your end. Google doesn't do any sort of redirection from AMP to non-AMP pages if the URL is hit directly. In fact, the URL scheme that Google uses in its carousel is entirely its own and just includes the path to your page inside it, e.g. https://cdn.ampproject.org/v/www.yoursitehere.com/path/to/article.html
Social sharing using JavaScript inserted in the page is not allowed, as no JavaScript is allowed. If you want to use social sharing, use a non-JavaScript implementation, or try out the amp-social-share component.
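To track down which pages trigger the "User authored JavaScript" error, a small audit script can help. The sketch below is only an approximation of the real AMP validator: it assumes that the only acceptable script tags are `application/ld+json` blocks and `async` scripts served from `https://cdn.ampproject.org/`, and it flags everything else. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: AMP runtime and component scripts are loaded from this origin.
AMP_CDN_PREFIX = "https://cdn.ampproject.org/"


class ScriptAudit(HTMLParser):
    """Collect script tags that are neither JSON-LD nor async AMP CDN scripts."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)
        script_type = (attr.get("type") or "").lower()
        src = attr.get("src") or ""
        if script_type == "application/ld+json":
            return  # structured data, not executable
        if src.startswith(AMP_CDN_PREFIX) and "async" in attr:
            return  # AMP runtime or an AMP component
        # Record the line number and attributes of anything else.
        self.suspicious.append((self.getpos()[0], attr))


def find_suspicious_scripts(html):
    """Return a list of (line_number, attributes) for flagged script tags."""
    audit = ScriptAudit()
    audit.feed(html)
    audit.close()
    return audit.suspicious
```

Running `find_suspicious_scripts` over the saved HTML of each of the affected pages should show whether a plugin or template include injects an extra script tag on those pages only.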
Thanks for the response. Regarding the points I asked about:
"Please ensure that all script tags are of type application/ld+json. There should be no executable code in these script tags." - I am not using any scripts at the moment except the AMP ones.
"Redirection is something that you must be doing on your end. Google doesn't do any sort of redirection from AMP to non-amp pages if the URL is hit directly. In fact that URL schema that Google uses in their carousel is entirely their own, and just includes the path to your page inside it. E.g. https://cdn.ampproject.org/v/www.yoursitehere.com/path/to/article.html" - Understood.
"Social sharing using Javascript inserted in the page is not allowed, as no Javascript is allowed. If you want to use social sharing, use a non-javascript implementation, or try out the amp-social-share" - I implemented amp-social-share and it's working fine.
Can we implement AMP for e-commerce sites, which may include a lot of JavaScript, forms, and plugins? As far as I know, AMP aims to keep pages simple and therefore restricts JavaScript heavily, and the plain form tag is not valid. So is there any chance we can implement AMP on e-commerce sites?

Tumblr share url

I came across this page, https://www.tumblr.com/examples/share/sharing-links-to-articles.html, which shows a possible way to build a custom share URL for Tumblr.
Simplified version of what they have:
Click to share
http://jsfiddle.net/m5ow6bhs/2/
This takes you to the login page, or straight to the share page if you're already logged in. However, if you change the http%3A%2F%2F part to a plain http://, it loads a "Not Found" page instead: http://jsfiddle.net/m5ow6bhs/3/ What the hell, Tumblr?
Do you have any idea what's going on, or what the correct code is to share something to Tumblr?
Cheers.
As with most share services, the URL should be passed as an encoded string. This matches the OP's observation about http%3A%2F%2F (encoded) versus http:// (raw).
Tumblr provides variable transformations in the theme operators to handle encoding, but sadly they don't work with custom variables.
One quick solution is to drop the http:// part. Example: http://jsfiddle.net/L9jd8dhz/
I have recently discovered that the share URL needs to be updated as follows:
https://www.tumblr.com/widgets/share/tool?shareSource=legacy&canonicalUrl=<-urlencode(share_url)->&posttype=link
The &posttype= parameter seems to be a new requirement for the share to work correctly.
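As an illustration of the encoding step, here is a minimal Python sketch that builds the share URL in the form given above. The function name is hypothetical, and the parameter set (`shareSource`, `canonicalUrl`, `posttype`) is taken from this answer rather than from any official Tumblr documentation.

```python
from urllib.parse import quote, urlencode


def tumblr_share_url(page_url):
    """Build a Tumblr share link with the target URL percent-encoded."""
    query = urlencode(
        {
            "shareSource": "legacy",
            "canonicalUrl": page_url,  # encoded below, so http:// becomes http%3A%2F%2F
            "posttype": "link",
        },
        quote_via=quote,  # use %20 rather than + for spaces
    )
    return "https://www.tumblr.com/widgets/share/tool?" + query
```

For example, `tumblr_share_url("http://example.com/post?id=1")` yields a link whose `canonicalUrl` value is `http%3A%2F%2Fexample.com%2Fpost%3Fid%3D1`.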

Site results redirect to Google

A site I'm working on has been hacked. The CMS (which I didn't build) was accessed and some files (e.g. "km2jk4.php.jpg") were uploaded in image fields. I have since deleted them (a week ago). Now, when I search for the site on Google, then click the result, it either:
a) simply redirects me to the Google search page
OR
b) a download dialogue appears asking me to download a zip file, with the source domain something like gb.celebritytravelnetwork.com
Clearly the site's been compromised. But if I simply type the URL in the address bar, the site loads fine. This only happens when I click through Google results.
There is no .htaccess file on the server, and this is not a virus on my computer, since many other people have reported the same thing happening, so this question is not relevant.
Any ideas please?
Thanks.
Your source files have been changed.
Check all the files included by the index page, such as the header and footer pages.
Also try using the Fetch as Googlebot tool.
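Since the redirect only happens when arriving from Google, one rough way to reproduce it outside the browser is to request the page twice, once without and once with a Google `Referer` header, and compare where each request ends up. The sketch below assumes the injected code reacts to the `Referer` header and redirects at the HTTP level; a JavaScript-based redirect would not be detected. All names and header values are hypothetical.

```python
import urllib.request

BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"  # hypothetical browser-like UA
GOOGLE_REFERER = "https://www.google.com/"


def final_url(url, headers):
    """Follow HTTP redirects and return the URL finally reached (network call)."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.geturl()


def compare_direct_and_google_visit(url, resolve=final_url):
    """Compare the final URL for a direct visit and a visit 'from Google'."""
    direct = resolve(url, {"User-Agent": BROWSER_UA})
    via_google = resolve(
        url, {"User-Agent": BROWSER_UA, "Referer": GOOGLE_REFERER}
    )
    return {"direct": direct, "via_google": via_google, "differs": direct != via_google}
```

If `differs` is true, some server-side file is still inspecting the referrer, which narrows the search to the PHP files that run on every request.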

The title, link and description don't work

I've been reading guides and examples for hours, but I can't get this to work. I tried all the HTML meta tags, such as title, description, and the Open Graph (og:) properties. I also tried the link sharer, and I created a new blank page containing only the information I want to share to Facebook, as a test. I also tried generating a random URL variable in PHP so that the URL is always different (both the URL to share and the URL of the main page containing the script). I also ran the URL through the URL linter many times to clear Facebook's cache. Facebook always uses the title of the site domain, or the URL itself, as the shared title and description. I don't know what to do.
The main website runs on Joomla. In Joomla's index file I added a PHP include that runs if the URL has the variable "articolo" set to an id. The included PHP page has the regular head, body, etc. So maybe Facebook checks Joomla's main meta tags first? I have now tried opening a popup with just the page for sharing. Look here: link
It's possible that the title is locked in, meaning that after X number of likes Facebook doesn't allow you to change it anymore. Can you give us an example URL you're having issues with?
EDIT
OK, now the link you provided shows some very interesting output. http://modernolatina.it/wjs/index.php?option=com_content&view=article&id=96&Itemid=258&autore=6&articolo=6
First, your web server is sending back a 500 status code instead of a 200.
Second, the HTML your web server is sending back has two html tags (view the source of the returned content).
Fix up those two issues and I think the linter will be happier with your page.
Test your page here:
http://developers.facebook.com/tools/debug
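The two checks described above (status code and duplicate html tags) can also be run locally before going back to the debugger. This is a minimal sketch with hypothetical function names; it only looks for those two problems and does not validate the Open Graph tags themselves.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class HtmlTagCounter(HTMLParser):
    """Count how many <html> start tags a document contains."""

    def __init__(self):
        super().__init__()
        self.html_tags = 0

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            self.html_tags += 1


def count_html_tags(markup):
    counter = HtmlTagCounter()
    counter.feed(markup)
    counter.close()
    return counter.html_tags


def fetch_status_and_body(url):
    """Return (status, body); error responses such as 500 are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8", errors="replace")


def diagnose(status, markup):
    """List the problems named in the answer that apply to this response."""
    problems = []
    if status != 200:
        problems.append("status code is %d, expected 200" % status)
    tag_count = count_html_tags(markup)
    if tag_count != 1:
        problems.append("found %d html tags, expected 1" % tag_count)
    return problems
```

A typical use would be `diagnose(*fetch_status_and_body(url))`, which returns an empty list once both issues are fixed.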