Google Search Console deadlock: Unable to remove URL so I can index a different URL - google-search-console

I recently tried some SEO tweaks to see if I could improve traffic to my personal website. One of these changes was shortening some of my URLs. This worked just fine for the 4-5 URLs that I changed, except for one.
The problematic URL is
https://aleksandrhovhannisyan.github.io/blog/dev/how-to-add-a-copy-to-clipboard-button-to-your-jekyll-blog/
Which I shortened to
https://aleksandrhovhannisyan.github.io/blog/dev/jekyll-copy-to-clipboard-button/
The former URL no longer exists on my website, and I did not set up a 301 redirect. Instead, I requested that Google remove it from search results via Google Search Console. I also requested that it index the new URL.
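(For reference: on a Jekyll site hosted on GitHub Pages, a redirect like this is usually set up with the jekyll-redirect-from plugin, which GitHub Pages supports. A minimal sketch, assuming that plugin is enabled, added to the renamed post's front matter:

---
redirect_from:
  - /blog/dev/how-to-add-a-copy-to-clipboard-button-to-your-jekyll-blog/
---

The plugin generates a stub page at the old URL with a meta refresh and a rel=canonical link pointing at the new URL, giving Google an explicit canonical signal instead of leaving it to guess between the two.)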
Basically, I'm in a bizarre deadlock situation where Google refuses to index my new URL, claiming it selected the old one as the canonical because of "duplicate content".
According to the Removals page, the old URL was in fact removed.
But when I inspect this old URL, Google Search Console claims the URL is on Google.
How do I get out of this mess?

Related

How to remove Google search results for 303 redirect?

I run a dynamic site that may or may not redirect a certain route based on user preferences.
Let's say it's http://clientname.example.com/maybe. Our backend has a response for /maybe, but if the client decides they would rather use their site for the information on that page, we instead use a 303 Redirect to their page on a separate domain.
All of our content pages use the <meta name="robots" content="noindex"> tag, so google will not index any of our pages. HOWEVER, when I search google for "site:our_domain_name.com", I get a bunch of results that all trace back to those dynamic routes that return a 303. When I click on the search results in google, the 303 is followed as expected and I arrive at the client's site. What I want, is for my piece of the puzzle to not show in results at all.
I was troubleshooting it this morning, and I realized that our noindex meta tag was obviously not being seen by the robot, since it was following the redirect, so I added a rule on the server that adds the 'X-Robots-Tag: noindex' header to redirect responses.
Is that enough? If I wait long enough, will those search results be removed?
No, because if an external page links to your site, Google will follow the link to your site, then your 303 (if you return such a code), and won't see the noindex.
Don't return a 303 for Google bots and you should be fine. It may take a bit of time, because Google needs to reprocess the page and see the noindex to remove it.
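A minimal sketch of that, written as ASP.NET to match the code elsewhere on this page (the asker's actual stack isn't stated; the /maybe route and the target URL are placeholders from the question):

// In Global.asax: serve crawlers the page itself, which already carries
// the noindex meta tag, instead of the 303, so the directive is seen.
protected void Application_BeginRequest(object sender, EventArgs e)
{
    if (!Request.Path.Equals("/maybe", StringComparison.OrdinalIgnoreCase))
        return; // only the conditionally redirecting route is affected

    string userAgent = Request.UserAgent ?? "";
    if (userAgent.IndexOf("Googlebot", StringComparison.OrdinalIgnoreCase) >= 0)
    {
        Response.AddHeader("X-Robots-Tag", "noindex"); // belt and braces
        return; // fall through to the normal, noindex'd page
    }

    // Everyone else still gets the 303 to the client's page.
    Response.StatusCode = 303;
    Response.AddHeader("Location", "https://client.example.com/their-page");
    Response.End();
}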

How to prevent Google from indexing redirect URL I do not own

A domain name that I do not own is redirecting to my domain. I don't know who owns it or why it is redirecting to my domain.
This domain, however, is showing up in Google's search results. Doing a whois also returns this message:
"Domain:http://[baddomain].com webserver returns 307 Temporary Redirect"
Since I do not own this domain, I cannot set up a 301 redirect or disable it. When clicking the bad domain in Google, it shows the content of my website, but baddomain.com stays visible in the URL bar.
My question is: How can I stop Google from indexing and showing this bad domain in the search results and only show my website instead?
Thanks.
Some thoughts:
You cannot directly stop Google from indexing other sites, but what you can do is add the canonical tag to your pages so Google can see that the original content is located on your domain and not on the 'bad domain'.
For example, check out: https://support.google.com/webmasters/answer/139394?hl=en
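In practice that's a single tag in the <head> of each of your pages, pointing at the page's own URL on your domain (the href below is a placeholder):

<link rel="canonical" href="https://www.yourdomain.com/your-page" />

With that in place, Google can attribute the content to your domain even when the scraping domain serves an identical copy.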
Other actions can be taken SEO-wise if 'baddomain' is outscoring you in the search rankings, because then it sounds like your site could use some optimizing.
The better your site and domain rank in the SERPs, the less likely it is that people will see the scraped content and 'baddomain'.
You could, however, also look at the referrer of the request, and if it is 'baddomain', do a redirect to your own domain, change the content, etc., because the code is being run from your own server.
But that might be more trouble than it's worth, as you'd need to investigate how 'baddomain' is doing things and code accordingly (probably an iframe or similar, from what you describe, but that can still be circumvented using scripts).
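For what it's worth, the referrer check itself is short; a hedged sketch in ASP.NET, for consistency with the code later on this page ('baddomain.com' and the target host are placeholders):

// In Global.asax: if the request came in via the bad domain,
// bounce the visitor to the same path on our own domain.
protected void Application_BeginRequest(object sender, EventArgs e)
{
    Uri referrer = Request.UrlReferrer;
    if (referrer != null &&
        referrer.Host.EndsWith("baddomain.com", StringComparison.OrdinalIgnoreCase))
    {
        Response.RedirectPermanent("https://www.yourdomain.com" + Request.RawUrl);
    }
}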
Depending on what country you and 'baddomain' are located in, there are also legal options, such as DMCA complaints. This, however, can be quite a task, and it's often not worth it because a new domain will just pop up.

Error parsing input URL, no data was scraped: only with new pages on my site

The problem I have is that I own a website where other people can post content, creating new pages on my domain. The problem that occurred today is that all the new post pages created today are malfunctioning: sharing doesn't load the thumbnail picture, the title, and so on. The weird thing is that all the posts (new pages) created before today are working fine.
What caused an error to occur out of nowhere?
I also cannot debug any of the URLs of my website; they all give the same error: Error parsing input URL, no data was scraped.
The website I'm having problems with is here: http://www.vabameedia.ee/vm/184/h%C3%A4da-ei-anna-h%C3%A4beneda.html
This is one of the pages where it says there is no error on the page, but Facebook still can't reach it: http://www.vabameedia.ee/vm/178/craig-parks-%C3%BChek%C3%A4eline-krossisoitja.html
For people experiencing the same problem but for different causes: I discovered a few interesting things about how Facebook "scrapes" pages by checking my server's logs while running some trials.
First of all: if you have never tried to share a page with FB, FB has never tried to scrape it, and it will not try to do so if you only put the URL in the Debug tool.
That's the first reason you get the error: it just states that FB has no information on the page; you must "force" it to scrape the page.
The first time you try to share a page, FB scrapes it (it requests the first 40 KB of the page from your server and analyses the Open Graph tags).
What can happen then is that you do not see the image: Facebook Share Dialog does not display thumbnails on first load.
The reason is that FB is still scraping your page and caching the image behind the scenes. The next time, in fact, the image is there as well.
How to solve it? Pre-caching: https://developers.facebook.com/docs/sharing/best-practices#precaching
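The pre-caching those docs describe amounts to forcing a scrape before anyone shares the page; one documented way was a single POST to the Graph API (the page URL below is a placeholder and should be URL-encoded):

POST https://graph.facebook.com/?id=https://example.com/your-new-page&scrape=true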
or simply add the image dimensions, so the crawler does not have to fetch and process the image before showing the first preview:
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
I was pulling my hair out trying to fix this issue. Hours and hours of troubleshooting to no avail. After speaking with one of our programmers about an unrelated topic, I thought of something to try as a long shot.
Much to my surprise, it worked!!!
This is the reason behind the problem and my solution for it:
When you draft a post in WordPress, it generates a link based on your article's title (unless you manually change it). The title of my article included special characters; however, the auto-generated link didn't display these special characters, only hyphens to replace the spaces. Should be fine, right? Wrong! Somewhere in the metadata and code of the WordPress platform, those special characters persist, and they mess up the way Facebook pulls info from the article being linked to. This is a problem because certain special characters invalidate hyperlinks.
For example:
Article Title: R[eloaded]
Auto-generated hyperlink DISPLAYED in WordPress "Permalink" field: http://www.example.com/reloaded
Actual WordPress Auto-generated hyperlink: http://www.example.com/r[eloaded]
Those brackets will invalidate the link and Facebook will be unable to pull any information (ie pictures) from it.
Solution:
(1) Simply, manually change the WordPress hyperlink address to something that doesn't include any special characters (this will not change the title of your article).
(2) Click "Update" to change the post to include the new hyperlink.
(3) Click "Purge from Cache" in the WordPress window
(4) Refresh your Facebook browser window
(5) Paste the new hyperlink for your article
(6) Enjoy your Facebook post with a preview image and information
Sidenote: Don't pull your hair out over Facebook, it's not worth it. =)
If you're using WordPress, edit the post in question to change the permalink (just alter it slightly), then update the post. Using the new permalink in the Facebook OG debugger should now work.
It's a weird fix, but I think it takes care of a problem caused by special characters being used in the title of a post, which is then used to make the permalink.
It was all a DNS issue; I was having the same problem and resolved it by updating the domain's name servers to the actual name servers.
In my case, my domain was pointed at ns1.websterz.net and ns2.websterz.net, and on that server I had a DNS redirect to my other server (where the website is hosted). I just updated the domain's name servers to the actual name servers of the server my website is hosted on. This was an account-migration case; I had forgotten to update the name servers to those of the new server.
Everything works fine now.

301 Redirect appears to be losing referrer information

We've just put a new website live, and I have various URL rewrites in place to handle the old indexed pages, performing a 301 redirect to the equivalent page location on the new site.
We've noticed since the day the new site went live that in Google Analytics, the stats in general have plummeted substantially :(
One of our SEO guys has pointed out that when you click one of the old indexed pages in Google, it correctly 301s to the new location. However, if you view the __utmz Google Analytics cookie, it contains 'direct', whereas he believes it should be 'organic'.
He thinks the referrer information is being lost during the 301 redirect, and as a result this traffic is being treated as direct instead of organic.
The new website is an ASP.NET 4.0 Web Forms application and is using Routing for the new URLs. I am generating the new route/URL for old pages within global.asax, in the Application_BeginRequest routine.
If a 301 is needed for the request, this is the code that is executed:
Response.Clear();                            // discard anything already written to the response
Response.Status = "301 Moved Permanently";   // permanent-redirect status line
Response.AddHeader("Location", newUrl);      // where the old page now lives
Response.End();                              // stop processing and send the redirect
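As an aside, ASP.NET 4.0 (which this site targets) has a built-in one-liner that does the same thing; a sketch using the same newUrl variable:

Response.RedirectPermanent(newUrl); // sets the 301 status and Location header, then ends the response

Either way, a plain 301 shouldn't strip the referrer by itself; browsers generally carry the original referrer through the redirect.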
Is there anything here that would indicate what the problem might be, or any ideas beyond the above what might be causing such an issue?
I located the problem: a silly error on my part with a relative URL to a file that accompanies our Google Analytics tag. It worked in some locations of the website, but in others the include was returning a 404! My bad.

How best to setup 301 redirects from an old site that has many duplicate entries indexed on Google?

I am currently working with a client to redevelop their website. One of the final things I need to do before launch, is to make sure that their old website's pages are correctly redirected to the new URL structure of the new website.
Unfortunately, when I check Google to see how their current site is indexed, this relatively small website appears to have over 1500 pages indexed.
When I look at the indexed links on Google, many appear to be duplicates of the same page, but because of the terrible URI structure used on the old website, Google treats them differently.
For example, the 'Map' page is indexed at least twice on Google, under the following 2 URLs:
www.website.com/frame_page-map.html?mp_session=iris7k85851j05q55piqci31u3&mp_session=iris7k85851j05q55piqci31u3?page_code=map&mp_session=iris7k85851j05q55piqci31u3&mp_session=iris7k85851j05q55piqci31u3
www.website.com/frame_page-map.html?mp_session=sel6m8j5cu8lulep4dqa32sne7&mp_session=sel6m8j5cu8lulep4dqa32sne7?page_code=map&mp_session=sel6m8j5cu8lulep4dqa32sne7&mp_session=sel6m8j5cu8lulep4dqa32sne7
Only the session name is different in the URL (and I have no idea why it is repeated four times in a single URL, either).
For reference, the replacement URL for this page is:
www.website.com/contact/map
My question is: how do I set up a redirect for these multiple records on Google? Do I simply set up the redirect for the old URL minus all of the URI parameters (i.e. www.website.com/frame_page-map.html), or is there another, better method?
Thanks for any help you might be able to offer!
It depends on what your goals are. If you don't care about the querystrings, then set up a 301 (permanent redirect) that points to just your root page, map.html. To prevent Google from indexing querystring params as separate pages, use the canonical tag and have it reference the parent. This isn't guaranteed to work, but Google takes your canonical into consideration when indexing.
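On the old pages, that canonical tag would look something like this, referencing the parent page without its session parameters:

<link rel="canonical" href="http://www.website.com/frame_page-map.html" />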
If you care about the querystring values, then you will have to set up a redirect for each one. There is also a marker you can append to your redirect target that will cause the incoming querystring to be ignored, so you don't have to write a regex that detects it.
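If the old site runs on Apache (an assumption; the server isn't stated), presumably the marker meant here is mod_rewrite's trailing "?" on the substitution, which discards the incoming querystring rather than carrying it over:

RewriteEngine On
# Match the old page by path only; the trailing "?" on the target drops
# the incoming querystring (mp_session and friends) from the redirect.
RewriteRule ^frame_page-map\.html$ http://www.website.com/contact/map? [R=301,L]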