Octopress, GitHub Pages, CNAME domain and Google website search

My blog was successfully transferred to Octopress and GitHub Pages. My problem, though, is that the website's search uses Google search, and the search results, as you can see, point to the old (WordPress) links. These links have since changed structure, following the default Octopress structure.
I don't understand why this is happening. Is it possible that Google has stored the old links in its DB (my blog was on the first page for some searches, but gathered just 3,000 hits/month... not much by internet standards) and that this will change with time, or is it something I can change myself?
Thanks.

1. You can wait for Google to crawl and re-index your pages, or you can use the URL Removal Request tool to expedite removal of the old pages from the index:
http://www.google.com/support/webmasters/bin/answer.py?answer=61062
According to that page, the removal process "usually takes 3-5 business days."
Consider also submitting a Sitemap (a minimal example is sketched below):
http://www.google.com/support/webmasters/bin/answer.py?answer=40318
That page also lets you resubmit an existing sitemap.
More information about Sitemaps:
http://www.google.com/support/webmasters/bin/answer.py?answer=34575
http://www.google.com/support/webmasters/bin/topic.py?topic=8467
http://www.google.com/support/webmasters/bin/topic.py?topic=8477
https://www.google.com/webmasters/tools/docs/en/protocol.html
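A sitemap is just an XML file listing the URLs you want crawled. A minimal sketch following the sitemaps.org protocol (example.com, the paths and the dates are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- one <url> entry per page -->
      <url>
        <loc>http://example.com/blog/2011/05/01/my-post/</loc>
        <!-- lastmod helps crawlers spot pages that changed -->
        <lastmod>2011-05-01</lastmod>
      </url>
      <url>
        <loc>http://example.com/about/</loc>
        <lastmod>2011-04-15</lastmod>
      </url>
    </urlset>

Octopress can generate this file for you with a sitemap plugin, so you would rarely write it by hand; it just needs to be reachable (typically at /sitemap.xml) when you submit it.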
2. Perhaps your company might consider the Google Mini? You could set up the Mini to crawl the site every night or even 'continuously':
http://www.google.com/enterprise/mini/
According to the US pricing page, the Mini currently starts at $1,995 for a 50,000-document license with a year of support.
Here is the Google Mini discussion group:
http://groups.google.com/group/Google-Mini
Hosted vs. appliance comparison (click "show all descriptions"):
http://www.google.com/enterprise/hosted_vs_appliance.html
Detailed Google Mini FAQ:
http://www.google.com/support/mini/

Related

GA landing pages (not set), but URI is known AND organic traffic down, direct/none up?

I need help troubleshooting two main issues with our Google Analytics data. Both started occurring around May 5, 2020. I've worked through the troubleshooting recommendations in a few blog posts but have had no luck. Can anyone point me in the right direction for how to troubleshoot these issues?
Organic traffic has dropped considerably on our /blog/ pages while direct/none traffic has increased. When I check Google Search Console's organic search data, I see numbers that reflect the organic + direct/none traffic in Google Analytics.
When I look at the landing page report, there is a huge increase in (not set) landing pages on our blog. I saw 14,000% and 26,000% increases... Our overall landing page traffic is down by 15%. Weirdly, the URI is known, but the landing page is (not set)...?
Please check out this video to see the data in GA - http://m.bixel1.net/jxe9ei
One potential cause is that we have a homepage redirect for anyone using Chrome. The redirect goes from / to /c/ for anyone on Chrome and is hard-coded. We've been testing this since the beginning of the year, and we switched the test to serve 100% of Chrome visitors on March 26, 2020. Could this possibly be causing our traffic issues?
The redirect certainly creates anomalies (consider that when landing on the page, both / and /c/ are tracked).
I noticed, for example by accessing a blog page from Google, that the pageview has no referrer, while the events sent after 30 seconds do have one (and it is google.com).
Check your configuration in Google Tag Manager for any unusual setting on the referral, or anything configured that could interfere with the referrer.
In any case, this missing referrer is surely the reason you see (not set).
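If the hard-coded redirect is client-side JavaScript, one possible mitigation (a sketch under assumptions, not the poster's actual setup: it presumes analytics.js with the standard ga() snippet already loaded) is to carry the original referrer across the hop as a query parameter and restore it before the pageview is sent:

    <!-- on / : redirect Chrome visitors before any pageview is sent,
         passing the original referrer along -->
    <script>
      // crude check; also matches other Chromium-based browsers
      if (/Chrome/.test(navigator.userAgent)) {
        window.location.replace('/c/?ref=' + encodeURIComponent(document.referrer));
      }
    </script>

    <!-- on /c/ : restore the referrer, then track the landing -->
    <script>
      var match = window.location.search.match(/[?&]ref=([^&]*)/);
      if (match && match[1]) {
        // 'referrer' is a standard analytics.js field (the dr parameter)
        ga('set', 'referrer', decodeURIComponent(match[1]));
      }
      ga('send', 'pageview');
    </script>

A server-side 301/302 from / to /c/ would sidestep the issue entirely, since the browser keeps the original referrer across an HTTP redirect; the sketch above is only needed when the redirect happens in the page itself.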

Different Google Index Information

More or less three months ago, I launched my own website. On the first day, I also verified my website in Google Webmaster Tools, combined it with the Google Analytics account and submitted a sitemap index file linked to five sitemap files.
But until now, I have been receiving different Google Index Status information:
In Webmaster Tools:
Menu: Crawl -> Sitemaps: 123,861 URLs submitted, 64,313 URLs indexed
Menu: Google Index -> Index Status: 65,375 URLs indexed
When I type "site:www.mysite.de" into google.de, I receive 103,000 results.
When I check my website with push2check.net, I receive 110,000 URLs in the Google index.
What is wrong there? I understand that it's impossible for Google to deliver accurate data because of the distributed processing, and that the result also depends on the location you are searching from and so on. But between 65,000 and 110,000 is a huge gap. What's the reason?
Thanks in advance!
Toby
The two numbers come from different methods. When you search google.de for “site:www.mysite.de”, Google displays a rough estimate of all pages of your site that it has indexed. push2check.net, on the other hand, runs its own query and reports its own count of your site's URLs in Google. Since both are estimates produced in different ways, the two results differ.

Rich Snippet not showing in Google Search result page

About a month ago we implemented Rich Snippets on the product detail pages for our e-commerce site (example).
We used the http://schema.org/ syntax for the structured data, as it seems to be the route Google are taking moving forward.
The data appears to be correct in the Rich Snippet Testing Tool and the data has started to appear in Google Webmaster Tools.
However, the data has yet to appear on the SERP.
We have followed the rich data guide on Google to the letter and still no results. Is this a case of just waiting?
Here is an additional piece of information that makes it all the more puzzling: we initially went with a Microformats implementation, and within 24 hours the data started showing up on the SERP. However, we moved away from this because the Schema.org approach seemed a better bet.
I suppose it is one of the reasons explained in my Wiki post at
http://wiki.goodrelations-vocabulary.org/FFAQ#Why_is_Google_not_showing_rich_snippets_for_my_pages.3F
While that one refers to GoodRelations markup, the situation should be the same for schema.org.
Martin
Quote:
If you have added GoodRelations (manually or via a shop extension module) to your shop and still do not get rich snippets in Google search results, this can have one of the following reasons:
Google has not yet re-crawled your page or pages. Google dedicates just a limited amount of crawling time to a site, depending on its global relevance. It may be that Google has simply not yet re-indexed your page. Wait 2 - 8 weeks ;-)
The markup is invalid. Try the Google Validator. If that shows a rich snippet in the preview, you may just have to wait 4 - 12 weeks until Google will notice and white-list your pages. If it does not show a rich snippet, you either do not have valid GoodRelations markup in the page, you are missing properties that Google requires (e.g. gr:validThrough for prices), the price of the item has expired, or you use markup for which Google does not show rich snippets. Currently, Google shows snippets only for products and offers.
Google cannot see that your page changed. Your XML sitemap (http://example.com/sitemap.xml or similar) does not contain a lastmod attribute or the lastmod attribute was not updated after you added GoodRelations/schema.org. This attribute is important for crawlers to notice which pages need to be reindexed.
Low ranking of your item pages. Your item pages have a low ranking and what you see in your Google results are category pages or other pages summarizing multiple items. GoodRelations shop extensions add markup only to the "deep" item pages, because those are best for rich snippets. Use the title / product name of one of your products and restrict the Google search to your site with the additional statement site:www.example.com.
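For reference, a product rich snippet needs only a small amount of valid markup. A minimal schema.org sketch in microdata (product name, price and values are placeholders, not the poster's actual page):

    <div itemscope itemtype="http://schema.org/Product">
      <span itemprop="name">Example Widget</span>
      <!-- the nested offer carries the price shown in the snippet -->
      <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
        <span itemprop="price">19.99</span>
        <meta itemprop="priceCurrency" content="USD">
        <link itemprop="availability" href="http://schema.org/InStock">
      </div>
    </div>

If the Rich Snippet Testing Tool shows a preview for markup like this, the markup itself is fine and the remaining causes are the crawl-timing and ranking ones listed above.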

Domain blocked and no data scraped

I recently purchased the domain www.iacro.dk from UnoEuro and installed WordPress, planning to integrate blogging with Facebook. However, I cannot even share a link to the domain.
When I try to share any link on my timeline, it gives the error "The content you're trying to share includes a link that's been blocked for being spammy or unsafe: iacro.dk". Searching, I came across Sucuri SiteCheck, which showed that McAfee TrustedSource had marked the site as having malicious content. Strange, considering that I just bought it, it contains nothing but WordPress, and I can't find any previous history of ownership. But I got McAfee to reclassify it, and it now shows up green at SiteCheck. However, a few days later, Facebook still blocks it. Clicking the "let us know" link in the FB block dialog took me to a "Blocked from Adding Content" form that I submitted, but this just triggered a confirmation mail stating that individual issues are not processed.
I then noticed the same behavior as here and here: when I type any iacro.dk link on my Timeline, it generates a blank preview with "(No Title)". It doesn't matter if it's the front page, an .htm document or even an image - nothing is returned. So I tried the debugger, which returns the very generic "Error Parsing URL: Error parsing input URL, no data was scraped.". Searching on this site, a lot of people suggest that missing "og:" tags might cause no scraping. I installed a WP plugin for that and verified tag generation, but nothing changed. And since FB can't even scrape plain .htm / .jpg files from the domain, I assume tags can be ruled out.
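For reference, the basic Open Graph tags the scraper looks for are along these lines; the values here are placeholders:

    <meta property="og:title" content="Example page title">
    <meta property="og:type" content="website">
    <meta property="og:url" content="http://iacro.dk/example-page/">
    <meta property="og:image" content="http://iacro.dk/example-image.jpg">

As noted above, the scraper returns no data even for plain .htm and .jpg files that never carry such tags, so missing tags do seem unlikely to be the cause here.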
Here someone suggests 301 redirects being a problem, but I haven't set up any redirection - I don't even have a .htaccess file.
So, my questions are: is this all because the domain has been marked as "spammy"? If so, how can I get the FB ban lifted? On the other hand, I have seen examples of other "spammy" sites where the preview is generated just fine, e.g. http://dagbok.nu, described in this question. So if the blacklist is not the only problem, what else is wrong?
This is driving me nuts so thanks a lot in advance!
I don't know the details, but it is a problem Facebook has with web sites hosted on shared servers, i.e. where the server hosting your web site also hosts a number of other web sites.

Preventing search engines from crawling my login page

I have a login page (login.aspx) that is currently indexed in Google when somebody does a search.
I have created a robots.txt file with the following:
User-agent: *
Disallow: /login.aspx
My question is: how long will it take for this to take effect, so that my login.aspx page is no longer indexed by Google? Is there anything else necessary to tell Google not to index my login page?
It could take up to 90 days before the page is removed from Google's index, but realistically a week or two for it to update. You could also ask Google to remove that page in Webmaster Tools, but that works the same way as the crawler.
You might also want to log in to Google Webmaster Tools and use the "Remove URL" feature under Site Configuration/Crawler access, and also increase the crawling speed under Site Configuration/Settings. This might help accelerate the removal of the URL.
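One caveat worth adding (not from the answers above, but well documented): robots.txt only blocks crawling, and a URL that is blocked from crawling can still stay in the index if other pages link to it. To tell Google explicitly not to index the page, put a robots meta tag in the page's head and leave the page crawlable:

    <head>
      <!-- lets Google fetch the page but tells it not to index it -->
      <meta name="robots" content="noindex">
    </head>

Note that the two mechanisms conflict: if login.aspx is disallowed in robots.txt, Googlebot never fetches the page and therefore never sees the noindex tag.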