How to remove duplicate pages from search engine - google-search-console

I have duplicate content from my home page.
In Google Webmasters they tell me that I have a problem with duplicate content:
For example:
www.example.com/page1/ www.example.com/page2/ www.example.com/page2/
How can I remove it?

What that page says is not that you have duplicate pages but that you have several pages with the same meta description:
Meta descriptions are HTML attributes that provide concise summaries of webpages. They commonly appear underneath the blue clickable links in a search engine results page.
Usually each page should have its own meta description that describes its content (which is why Google warns you about duplicates), but sometimes it's OK for several pages to share the same description.
For example, based on your screenshot, your site appears to be about mobile phones. Let's say that the duplicates with 2 pages are one page for the summary of a phone and another for its technical specifications (I'm guessing, as I don't understand Arabic). The meta descriptions of both pages could be similar but not identical, because each should reflect that the pages cover different aspects of the phone (summary vs. technical specs).
On the other hand, the duplicate with 14 pages appears to be several pages from a product list, maybe phones with the same tag. If this is correct, then it's OK that all those pages have the same meta description, as they are just parts of the same topic split into several pages.
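If you want to check your own pages before Google reports them, a rough sketch like the following groups pages by their meta description. It assumes you already have the HTML of each page in memory; the function names and the `pages` mapping are made up for illustration and are not part of any Google tool.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remembers the content of <meta name="description"> if the page has one."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so fall back to "".
        if (attr_map.get("name") or "").lower() == "description":
            self.description = (attr_map.get("content") or "").strip()


def extract_description(html):
    """Return the meta description of an HTML document, or None if absent."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


def find_duplicate_descriptions(pages):
    """pages maps URL -> HTML source (hypothetical input).

    Returns a dict of description -> sorted list of URLs, keeping only
    descriptions shared by more than one page.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        description = extract_description(html)
        if description:
            groups[description].append(url)
    return {d: sorted(urls) for d, urls in groups.items() if len(urls) > 1}
```

Whether a reported group is a real problem is still a judgment call, as described above: paginated lists can reasonably share one description, while pages covering different aspects should not.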

Related

Rich Snippets Questions

I am introducing rich snippets on my site and have some questions I can't find answers to:
Do I need to put the main company snippet only on the home page, or on all pages (contacts, social networks, etc.)? In other words, should I copy the code to every page?
How do I get the nice two-column snippets with the main site links, and how do I define what these main links are? For example, when we search for Facebook we see: Facebook Login, Facebook Register, Facebook Profile, etc., each with a brief description below. Are these separate pages that each contain a snippet, with Google identifying the most relevant ones? What code should I put on each page?
If you are trying to add knowledge graph items, you only need to mark up your homepage. For individual rich snippet items like star ratings, breadcrumbs, etc., you'll have to mark up each page for them to show up in all of your search results.
Contact and Social profiles as you mentioned are knowledge graph items.
In the second part I am assuming you are referring to the links below the branded search result. These are not rich snippets; they are called sitelinks and are generated if you have good on-site structure and internal linking. Sitelinks are picked by Google itself and you have little control over them.
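As an illustration of the homepage-only markup for contact and social profile data, here is a minimal sketch that builds a schema.org `Organization` block as JSON-LD. JSON-LD is my choice of syntax here, not something the question specified; the function name and the example values are hypothetical, and whether Google actually displays anything from it is up to Google.

```python
import json


def build_organization_jsonld(name, url, social_profiles, phone):
    """Return a <script> tag containing schema.org Organization markup.

    Intended for the homepage only; all argument values are supplied
    by the caller (nothing here is looked up automatically).
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs lists the organization's official social profile URLs.
        "sameAs": list(social_profiles),
        "contactPoint": [
            {
                "@type": "ContactPoint",
                "telephone": phone,
                "contactType": "customer service",
            }
        ],
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Example with placeholder values:
snippet = build_organization_jsonld(
    "Example Co",
    "https://www.example.com",
    ["https://www.facebook.com/example"],
    "+1-555-0100",
)
```

Per-page items such as ratings or breadcrumbs would need their own markup on each page, as noted above.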

Entity Property in URL and SEO

I have coded my ASP.NET MVC application in a way that allows stored entities to be retrieved via a friendly name in the URL, for example:
www.mysite.com/artists/james-brown/songs
Where james-brown is a URL friendly string stored on my Artist entity.
Now imagine I add an artist that no one has heard of before, and no one ever navigated to that artist's songs page.
How would Google, Yahoo, or other search engines know that my site does indeed have songs for that unknown artist?
Do I create a sitemap and maintain it through code as I add / remove artists?
There are a few well-known ways to make new links visible to search engines.
XML and HTML Sitemap:
Add it to sitemap and submit it through webmaster tools.
HTML sitemaps are another way to achieve it. If your site has footer sitemap, you can add it to them.
Internal Links
Create internal links from your high-ranking or frequently crawled pages to the new pages. Google and other search engines tend to crawl pages whose content changes frequently, so if you have pages with regularly refreshed content, try adding the links there; chances are high that the new pages will be discovered quickly.
External Links
Create links from external blogs, company blogs and sites like pagetube.org which can help it to be discovered.
Yeah, just add them to a sitemap, internal links, or even external links.
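To make the "maintain the sitemap through code" idea concrete, here is a rough sketch that regenerates an XML sitemap from the stored artists. It is written in Python rather than ASP.NET MVC purely for illustration; the `build_sitemap` name and the `(slug, last_modified)` input shape are assumptions, and the same logic could live in a controller action that serves `/sitemap.xml`.

```python
from datetime import date
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, artists):
    """Build sitemap XML for artist song pages.

    artists: iterable of (slug, last_modified) pairs, where last_modified
    is a datetime.date (hypothetical shape; adapt to your Artist entity).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for slug, last_modified in artists:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/artists/{slug}/songs"
        # lastmod tells crawlers when the page last changed.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    xml_body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body


# Example with a placeholder date:
sitemap_xml = build_sitemap(
    "http://www.mysite.com",
    [("james-brown", date(2024, 1, 15))],
)
```

Generating the file on request (or whenever an artist is added or removed) avoids keeping a hand-edited sitemap in sync with the database.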

Rich Snippet not showing in Google Search result page

About a month ago we implemented Rich Snippets on the product detail pages for our e-commerce site (example).
We used the http://schema.org/ syntax for the structured data, as it seems to be the route Google are taking moving forward.
The data appears to be correct in the Rich Snippet Testing Tool and the data has started to appear in Google Webmaster Tools.
However, the data has yet to appear on the SERP.
We have followed Google's rich data guide to the letter and still see no results. Is this just a case of waiting?
Here is an additional piece of information that makes it all the more puzzling: we initially went with a Microformats implementation, and within 24 hours the data started showing up on the SERP. However, we moved away from this because the Schema.org approach seemed a better bet.
I suppose it is one of the reasons explained in my Wiki post at
http://wiki.goodrelations-vocabulary.org/FFAQ#Why_is_Google_not_showing_rich_snippets_for_my_pages.3F
While that one refers to GoodRelations markup, the situation should be the same for schema.org.
Martin
Quote:
If you have added GoodRelations (manually or via a shop extension module) to your shop and still do not get rich snippets in Google search results, this can have one of the following reasons:
Google has not yet re-crawled your page or pages. Google dedicates just a limited amount of crawling time to a site, depending on its global relevance. It may be that Google has simply not yet re-indexed your page. Wait 2 - 8 weeks ;-)
The markup is invalid. Try the Google Validator. If that shows a rich snippet in the preview, you may just have to wait 4 - 12 weeks until Google will notice and white-list your pages. If it does not show a rich snippet, you either do not have valid GoodRelations markup in the page, you are missing properties that Google requires (e.g. gr:validThrough for prices), the price of the item has expired, or you use markup for which Google does not show rich snippets. Currently, Google shows snippets only for products and offers.
Google cannot see that your page changed. Your XML sitemap (http://example.com/sitemap.xml or similar) does not contain a lastmod attribute or the lastmod attribute was not updated after you added GoodRelations/schema.org. This attribute is important for crawlers to notice which pages need to be reindexed.
Low ranking of your item pages. Your item pages have a low ranking and what you see in your Google results are category pages or other pages summarizing multiple items. GoodRelations shop extensions add markup only to the "deep" item pages, because those are best for rich snippets. Use the title / product name of one of your products and restrict the Google search to your site with the additional statement site:www.example.com.
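The sitemap point in the quoted list is easy to check mechanically. The sketch below lists sitemap entries that have no `lastmod` value; note that in the sitemaps.org protocol `lastmod` is a child element of each `url` entry rather than an XML attribute. The function name is hypothetical, and the sitemap text is assumed to be already downloaded.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_missing_lastmod(sitemap_xml):
    """Return the <loc> values of sitemap entries without a usable <lastmod>."""
    root = ET.fromstring(sitemap_xml)
    missing = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            missing.append(loc)
    return missing
```

This only shows that a date is present; it cannot tell whether the date was actually updated after the markup was added.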

From where does Google get the abstract for each of its site results, that it displays on its search result page?

I am working on a project in which I have to search for terms on a search engine and then cluster the results by their contextual sense, so I have to treat each result as a document. Unfortunately, the data shown with each result on the result page is too little for clustering. Hence, I wanted to know where the search engines get the abstract they show for each result. If I could get the entire abstract, I could cluster the results by treating them as separate documents.
Where does Google get the abstract from?
For example, if you search for "1000 Mile" on Google, the second result shows the following abstract:
"The women's 1000 Mile Collection is based on classic designs and reflects Wolverine's long heritage of crafting quality footwear. Complementing these classics ..."
This abstract is not present in the Meta tags of the page.
Where does Google find this data?
Thanks
From Does Google use the Meta Description Tag for Description of Page?
Google will choose your search results snippets from the following places (not necessarily in this order):
The page's Meta Description tag
The page's Open Directory Project (ODP) Listing
Page content relevant to the search query
If you do not want Google to use the ODP listing's description then you can tell them not to do so with the following Meta tag:
<meta name="robots" content="NOODP">
If you want to encourage Google to use your Meta Description tag then make sure it is unique to each page. Also make sure it contains an accurate description of the page's content.
In the absence of an ODP description and Meta Description tag, Google will use a portion of the page's text as the description. This text will contain the closest matches to the search query. I have not seen any official limit to how long this can be, but a couple of sentences seems about right.
On a related note, if you don't want a snippet to be shown with a particular page you can use the following Meta tag to prevent one from being shown:
<meta name="robots" content="nosnippet">
See this blog post for Google's tips on using the meta description tag.
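For the clustering project, a rough approximation of the fallback order described above could look like this. To be clear, this is only a sketch of the idea and not Google's actual algorithm: it ignores the ODP listing (which needs external data), it prefers the meta description whenever one exists, and it uses a naive word-overlap score to pick a sentence. All names are made up for illustration.

```python
import re
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects named meta tags and visible text from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name"):
            self.meta[attr_map["name"].lower()] = attr_map.get("content") or ""
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def choose_snippet(html, query):
    """Pick a snippet: honor nosnippet, then meta description, then page text."""
    scanner = PageScanner()
    scanner.feed(html)
    if "nosnippet" in scanner.meta.get("robots", "").lower():
        return None
    description = scanner.meta.get("description", "").strip()
    if description:
        return description

    query_words = set(re.findall(r"\w+", query.lower()))
    sentences = re.split(r"(?<=[.!?])\s+", " ".join(scanner.text_parts))

    def score(sentence):
        # Number of distinct query words that appear in the sentence.
        return len(query_words & set(re.findall(r"\w+", sentence.lower())))

    best = max(sentences, key=score, default="")
    return best if best and score(best) > 0 else None
```

If you need more text per result than the snippet offers, fetching each result page and extracting its visible text this way gives you a full document to cluster on.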
According to this site, "The meta description should typically be at most 145 to 150 characters in length as these are the maximum number of characters typically displayed at Yahoo! and Google, respectively."
That site is Flash-based, and Google can index Flash content, so given that the snippet isn't in the HTML source of the page as you point out, nor is it in the cached version of the page, I'm guessing that it's somewhere in the Flash movie.
It's kind of arbitrary that the snippet mentions 'The women's 1000 Mile Collection' while the site link itself is to the parent category of 1000 mile, not just women's, so I'm guessing here that gathering snippet-friendly metadata from a Flash site is an imprecise science. That's my best guess.
In this Google Webmaster blog post, they explain how they use external text or HTML files loaded into the Flash movie, and in one of the comments Jonathan Simon says (sorry):
"We try our best to crawl Flash content but the results can sometimes be less than ideal. You are only seeing a title in the search results for your site because that's the only bit of HTML text that you have outside of your Flash content. You could add a Meta description element to offer more information in HTML. You could also add some other text that's not a part of your Flash content. Just doing this should improve the snippet you see associated with your site in the search results."

Linking to a Page that "contains" a specific Web Content Article in Liferay 6

I'm building a Portlet for a site powered by Liferay EE 6.0 SP1 that will suggest related or otherwise interesting content depending on what the user is currently looking at.
For example, suppose the user is on a Page that contains a Web Content Display portlet that is displaying Web Content Article 5. My portlet will contain HTML links to the Pages where the user can view Web Content Articles 6 and 7 (which contain content that is determined to be similar to the content in Web Content 5).
The problem comes in because I don't want my portlet to display HTML links to Web Content Articles 6 and 7 (assuming such a concept is even valid), I want my portlet to display links to the Pages on which those items are displayed (i.e., links to the Pages that contain Web Content Display portlets configured to show those Web Content Articles).
Is there a way to:
Associate a Web Content Article with a Page so that if I have the former, I can fetch the latter?
Or, determine the page(s) that contain portlets that display a Web Content Article?
Alternatively, if there were a way to get all portlet instances associated with a particular page, that might lead to a solution as well.
One approach to this problem appears to be to add a "Link to Page" control to the Web Content Article's Structure. Content managers can use this to create many-to-one relationships between Web Content Articles and Pages.
This solution is problematic, though, because there is no constraint on what page is selected when the Web Content is edited.
For example, a content manager might create a Web Content Article entitled "Our History", but specify the "Products" page as the value of that Article's "Link to Page" control. When the related content portlet renders the "Our History" Article, it will create a hyperlink to the "Products" page which in this case does not display the "Our History" Article anywhere.
Arguably, this could be considered a feature, but perhaps there is a better way to do it.
I'm afraid this is a feature that does not exist yet in Liferay. At least, on the Liferay pages there is a feature request on the very same topic. The discussion dates from March 2011, so probably something is coming soon :)
Another solution that we are currently considering is to create a custom view mode for the portal (i.e., "VIEW", "PRINT", etc.) called "XML". When the portal detects that the browser is requesting the XML mode (similarly to how Sitecore detects which device to use), it bypasses the theme, and all portlets that support this XML mode would render their content in XML format.
The output might look something like this:
<?xml version="1.0" encoding="utf-8" ?>
<portal>
  <portlet id="..." title="..." ...>
    <JournalArticle>
      <uuid_>...</uuid_>
      ...
    </JournalArticle>
    ...
  </portlet>
</portal>
A periodic process would then crawl the site in XML mode and update a Lucene index.
The obvious problem with this approach is that it requires that every portlet we use on the site be custom-developed. For various reasons (some would call it an over-ambitious creative department; I call it a significantly deficient existing feature set), we might end up having to go this route anyway.
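To give an idea of what the periodic crawl might do with that output, here is a small sketch that maps Web Content Article UUIDs to the pages that display them. It uses a plain in-memory dictionary instead of a Lucene index, it assumes the XML for each page has already been fetched, and it assumes the element layout shown above (`portal` > `portlet` > `JournalArticle` > `uuid_`). The function names and the example values are hypothetical.

```python
from collections import defaultdict
from xml.etree import ElementTree as ET


def article_uuids_on_page(xml_text):
    """Return the article UUIDs found in one page's XML-mode output."""
    root = ET.fromstring(xml_text)
    uuids = []
    for element in root.findall("./portlet/JournalArticle/uuid_"):
        if element.text and element.text.strip():
            uuids.append(element.text.strip())
    return uuids


def build_article_page_index(pages):
    """pages maps page URL -> XML-mode output for that page.

    Returns a dict of article UUID -> sorted list of page URLs that
    display the article.
    """
    index = defaultdict(set)
    for page_url, xml_text in pages.items():
        for uuid in article_uuids_on_page(xml_text):
            index[uuid].add(page_url)
    return {uuid: sorted(urls) for uuid, urls in index.items()}


# Example with made-up identifiers:
example_pages = {
    "/about/history": (
        '<portal><portlet id="p1" title="Web Content Display">'
        "<JournalArticle><uuid_>article-6</uuid_></JournalArticle>"
        "</portlet></portal>"
    ),
}
index = build_article_page_index(example_pages)
```

With such an index, the related content portlet could look up an article's UUID and link to the page (or pages) that actually display it, which sidesteps the unconstrained "Link to Page" problem.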