How to ignore Microdata due to JSON-LD? - schema.org

We are using a theme which unfortunately relies on Microdata like: <div itemscope itemtype="http://schema.org/Article">
We would like to use JSON-LD instead. However, the theme is constantly updated by the company that created it, and removing the Microdata again after every update would take too much time and labor. Is there a tag that says "ignore Microdata", so the theme could stay as it is and we could include our JSON-LD snippet without modifying the whole template?

There is no way to convey that the Microdata should be ignored.
In the ideal case, you would give the Microdata and the JSON-LD items that are about the same thing the same URI (itemid in Microdata, @id in JSON-LD).
<div itemscope itemtype="http://schema.org/Article" itemid="#the-article">
</div>
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Article",
"@id": "#the-article"
}
</script>
That way, supporting consumers can learn that these items describe the same thing, i.e., there are not two articles, only one, and properties added to one item are also relevant for the other item.
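As a rough illustration of what a supporting consumer could do with the shared URI, here is a minimal Python sketch. The extracted items, the page URL http://example.com/post, and the merge_by_id helper are hypothetical; real consumers may combine data quite differently.

```python
from collections import defaultdict

def merge_by_id(items):
    """Group property values from items that share the same "@id".

    Assumes every item carries an "@id" (already resolved to an absolute URI).
    """
    merged = defaultdict(lambda: defaultdict(list))
    for item in items:
        for prop, value in item.items():
            if prop == "@id":
                continue
            # Keep each distinct value once per property.
            if value not in merged[item["@id"]][prop]:
                merged[item["@id"]][prop].append(value)
    return {uri: dict(props) for uri, props in merged.items()}

# Hypothetical extraction results from one page: one item came from the
# Microdata, the other from the JSON-LD block, and both use the same URI.
from_microdata = {
    "@id": "http://example.com/post#the-article",
    "@type": "Article",
    "headline": "Hello",
}
from_json_ld = {
    "@id": "http://example.com/post#the-article",
    "@type": "Article",
    "author": "Alice",
}

merged = merge_by_id([from_microdata, from_json_ld])
print(merged)
```

Because both items share one URI, the result contains a single article that carries the properties from both syntaxes.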
If that’s not possible, you could try to "destroy" the Microdata without making the document invalid. You could do this with a script, after each release of a new version of the theme. Simply remove every itemtype attribute. Your document will still keep the Microdata, but it’s no longer using a vocabulary, so the structured data will likely not be re-used.
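A minimal sketch of such a script in Python, assuming the theme templates can be processed as plain text. The strip_itemtype name is made up for illustration, and the regular expression is a simplification: it would also remove text that merely looks like an itemtype attribute (for example inside a script or a code sample), so review the result before deploying it.

```python
import re

# Matches an itemtype attribute with a double-quoted, single-quoted, or
# unquoted value, together with the whitespace in front of it.
ITEMTYPE_ATTR = re.compile(
    r"""\s+itemtype\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def strip_itemtype(html: str) -> str:
    """Return the markup with every itemtype attribute removed."""
    return ITEMTYPE_ATTR.sub("", html)

sample = (
    '<div itemscope itemtype="http://schema.org/Article">'
    '<h1 itemprop="headline">Hi</h1></div>'
)
print(strip_itemtype(sample))
```

The itemscope and itemprop attributes are left untouched, so the document keeps valid Microdata that simply no longer names a vocabulary.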

Schema dot org and alternate languages

I have schema dot org markup on my website. But I also have an alternate language; each of my pages has a French version in a different page with proper hreflang tags.
Google's instructions don't really mention different languages, neither does schema dot org. For example, I have an "Organization" schema set up on the homepage. Do I need to translate it on the French homepage or leave it in English, and if so, do I change the URL to point to the French homepage as well? Wouldn't this cause Google to think there are two different organizations? Same question would apply to schemas like "Product".
hreflang is not directly related to Schema.org (that's why you didn't find any references in Google's or Schema.org's documentation).
Schema.org is a set of extensible schemas that enables webmasters to
embed structured data on their web pages for use by search engines and
other applications. https://schema.org/
VS
Hreflang specifies the language and optional geographic restrictions
for a document. Hreflang - Google Support. The hreflang attribute on each page should include a reference to itself as well as to all the pages that serve as alternates for it https://moz.com/learn/seo/hreflang-tag.
Two pages example
Microdata is used here, but the same idea applies to JSON-LD and to any schema type.
Your English version
/en/about
<div itemscope itemtype="http://schema.org/LocalBusiness">
<h1><span itemprop="name">Hello World</span></h1>
<p itemprop="description">A superb collection of fine gifts and clothing</p>
</div>
hreflang:
<link rel="alternate" href="http://example.com/en/about" hreflang="en" />
<link rel="alternate" href="http://example.com/fr/about" hreflang="fr-fr" />
Your French version
/fr/about
<div itemscope itemtype="http://schema.org/LocalBusiness">
<h1><span itemprop="name">Bonjour le monde</span></h1>
<p itemprop="description">Une superbe collection de beaux cadeaux et vêtements</p>
</div>
hreflang:
<link rel="alternate" href="http://example.com/en/about" hreflang="en" />
<link rel="alternate" href="http://example.com/fr/about" hreflang="fr-fr" />
itemprop="name" above gives extra semantic data about your LocalBusiness; each page uses its own language (specified by hreflang).
One of Google's guidelines is:
Don't mark up content that is not visible to readers of the page. For
example, if the JSON-LD markup describes a performer, the HTML body
should describe that same performer. https://developers.google.com/search/docs/guides/sd-policies
There is no official Google answer on this topic, but it's better to translate the JSON-LD data as well. With WordPress or another CMS, it should be easy to pull the translated data.
In any case, JSON-LD is not related to site indexing (unlike hreflang or canonical), so there is no need to change a URL because of a schema. You can find reports (status, errors, rich results) about your schema in Google Search Console; see its docs.
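As a sketch of pulling translated data into the markup, the following Python snippet builds a JSON-LD block per language. The TRANSLATIONS table and the local_business_json_ld helper are hypothetical stand-ins for whatever your CMS provides; the field values are taken from the two-page example above.

```python
import json

# Hypothetical translated fields, e.g. pulled from a CMS.
TRANSLATIONS = {
    "en": {
        "name": "Hello World",
        "description": "A superb collection of fine gifts and clothing",
    },
    "fr": {
        "name": "Bonjour le monde",
        "description": "Une superbe collection de beaux cadeaux et vêtements",
    },
}

def local_business_json_ld(lang: str) -> str:
    """Return the JSON-LD text for the page in the given language."""
    fields = TRANSLATIONS[lang]
    data = {
        "@context": "http://schema.org",
        "@type": "LocalBusiness",
        "name": fields["name"],
        "description": fields["description"],
    }
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps(data, ensure_ascii=False, indent=2)

print(local_business_json_ld("fr"))
```

Each language version of the page would embed its own output in a script element of type application/ld+json.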
Live example (from the Nike site): rich card previews of the English schema and of the French schema in the Data Testing Tool (screenshots not reproduced).
Google's structured data guidelines require:
Relevance
Your structured data should be a true representation of the
page content.
and further:
Location
Put the structured data on the page that it describes, unless
specified otherwise by the documentation. If you have duplicate pages
for the same content, we recommend placing the same structured data on
all page duplicates, not just on the canonical page.
Thus, if your home page has a separate page with the same content in French, the structured data on that page MUST provide its content in French.
This is completely justified in terms of semantics. Google uses structured data to search for entities with the Google Knowledge Graph API, for rich search results, for voice search, and for machine learning. Users who search in French expect, and will receive, search results in French.

RDFa OfferCatalog Syntax

I have been trying to find the best way to link two items together using RDFa, specifically linking a Person to multiple SoftwareApplication entries.
The way I currently do this on the author page is:
<div class="container text-center" vocab="http://schema.org/" typeof="Person">
...
<span property="hasOfferCatalog" typeof="OfferCatalog">
<meta property="numberOfItems" content="10" />
<span property="itemListElement" typeof="CreativeWork">
<meta property="name" content="Project Name" />
<meta property="url" content="https://www.my-domain.tld/ProjectName/" />
</span>
...
As shown above, the project is actually a SoftwareApplication, and the URL has a complete RDFa/Schema.org definition of it, but if I put:
typeof="SoftwareApplication"
on the author's page, then, as expected, Google's Structured Markup validator throws errors about required values not being present for it. CreativeWork throws no errors but is less specific. I don't really want to repeat the entire SoftwareApplication metadata everywhere the project is referenced; I'd rather just say "go look at this URL".
What is the correct/best way to cross-reference the SoftwareApplication pages from the author page? In the project, the reverse reference is easy, as there is an author property, which can be of type Person and is acceptable with just a name and URL.
Once I know the correct RDFa way of referencing I'll apply the tags to content in the page rather than using meta tags.
To link items together, you need a suitable property. Like author (to state which Person is the creator of the SoftwareApplication), or like hasOfferCatalog (to state which SoftwareApplication is offered by the Person).
Inverse properties
In most cases, Schema.org defines its properties only for one direction. So there is only author, and no authorOf. If you need the property for the other direction, you can use RDFa’s rev attribute.
Linking instead of repeating
If you don’t want to repeat your data (i.e., only define it once and link/refer to this definition instead), you can provide a URL value. Schema.org allows this for all properties, even if URL is not listed as expected type. If you want to follow Semantic Web best practices, give your entities URLs (as identifiers) with RDFa’s resource attribute, and use these URLs as property values to refer to the entities.
For this, simply use one of the linking elements (e.g., elements with href or src attribute).
Example
Using the author case as example:
<!-- on the page about the software: /software/5 -->
<div typeof="schema:SoftwareApplication" resource="/software/5#this">
Author:
<a property="schema:author" typeof="schema:Person" href="/persons/alice#i">Alice</a>
</div>
<!-- on the page about the person: /persons/alice -->
<div typeof="schema:Person" resource="/persons/alice#i">
Author of:
<a rev="schema:author" typeof="schema:SoftwareApplication" href="/software/5#this">Software 5</a>
</div>
Errors in Google’s SDTT
If the Structured Data Testing Tool gives errors about missing properties, note that it doesn’t mean that something is wrong with your markup. Schema.org never requires a property.
It just means that these properties are required for getting a certain Google search feature. So ignore these errors if you don’t want to get the feature (or if you can’t provide all required properties).
Thank you for the other response, I'll have a read over the linked resources, I have also found a solution to the specific case in my question.
Google Search Console has a page on Carousels which shows that you can use ListItem, which only "needs" URL, to populate the hasOfferCatalog property. E.g.
<span property="itemListElement" typeof="ListItem">
<meta property="position" content="1" />
<meta property="url" content="https://www.my-domain.tld/ProjectName/" />
</span>

Duplicate or link to WebSite JSON-LD?

I'm replacing the microdata (itemscope et al) on our sites with JSON-LD. Do I need to declare the WebSite on every page, or can I place it once on the home page?
If the latter, will processors (by which I mean Google) tie each page to it automatically via the domain name, or is there some way to link to it? Given that "Linked Data" is right there in the name, I've found no examples that make use of it. They all replicate or embed the data directly in the thing that's linking.
For example, I want to link to our YouTube videos that we embed in articles, but Google doesn't understand a URL for the video property. If I expand it into a VideoObject, Google complains that I don't know the width, height, duration, etc. All that data is on youtube.com at the URL I'm specifying. Why can't it pull the video information itself?
Do I need to declare the WebSite on every page, or can I place it once on the home page?
From the perspectives of Schema.org and Linked Data, it’s perfectly fine (and I would say it’s even the best practice) to provide an item only once, and reference it via its URI whenever it’s needed.
In JSON-LD, this can be done with @id. For example:
<!-- on the homepage -->
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "WebSite",
"@id": "http://example.com/#site",
"hasPart": {
"@type": "WebPage",
"@id": "http://example.com/"
}
}
</script>
<!-- on another page -->
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "WebPage",
"@id": "http://example.com/foobar",
"isPartOf": {"@id": "http://example.com/#site"}
}
</script>
Whether Google actually follows these references is not clear (as far as I know, it’s undocumented)¹. It’s clear that their testing tool doesn’t show the data from referenced URIs, but that doesn’t have to mean much. At least their testing tool displays the URI (as "ID") in case one is provided.
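To make the idea of following a reference concrete, here is a small Python sketch of how a consumer that has already parsed the JSON-LD from both pages might look up a referenced node. The index_by_id and resolve helpers and the "name" value are assumptions for illustration; nothing here describes how Google actually processes the data.

```python
# Hypothetical parsed JSON-LD from two pages of the same site.
homepage = {
    "@context": "http://schema.org",
    "@type": "WebSite",
    "@id": "http://example.com/#site",
    "name": "Example Site",  # illustrative value, not from the original example
}
other_page = {
    "@context": "http://schema.org",
    "@type": "WebPage",
    "@id": "http://example.com/foobar",
    "isPartOf": {"@id": "http://example.com/#site"},
}

def index_by_id(documents):
    """Map each node's URI to the node itself."""
    return {doc["@id"]: doc for doc in documents if "@id" in doc}

def resolve(value, index):
    """Replace a bare {"@id": ...} reference with the full node, if known."""
    if isinstance(value, dict) and set(value) == {"@id"}:
        return index.get(value["@id"], value)
    return value

index = index_by_id([homepage, other_page])
site = resolve(other_page["isPartOf"], index)
print(site["@type"], site["name"])
```

A consumer that never fetched the homepage would be left with only the URI, which matches what the testing tool displays as "ID".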
If you want to provide a URL value for the video property, note that URL is not one of its expected values. While Schema.org still allows this (any property can have a text or URL value), it’s likely that some consumers will handle only expected values. It’s also perfectly fine to provide a VideoObject value if you only provide a url property. The fact that Google’s testing tool gives errors doesn’t mean that something’s wrong; it just means that Google won’t consider this video for their video-related rich results.
¹ But for the few rich result features Google offers, authors would typically not need to reference something from another page anyway, I guess. Referencing of URIs is typically done for other Semantic Web and Linked Data cases.

Why does Google Testing Tool use the "id" attribute to generate a URL for the Microdata item?

I'm using some Microdata to describe a blog post, and I'm surprised by the value returned for Schema.org’s BlogPosting by the Google Developers Testing Tool.
I would have expected it to be the itemprop url, not a merge of the website URL and the item id.
Am I doing something wrong, or is it only a Google display issue?
<div itemscope="itemscope"
itemprop="blogPost"
itemtype="http://schema.org/BlogPosting"
id="foobar">
<a itemprop="url" href="/realone">real</a>
</div>
Value returned by https://developers.google.com/structured-data/testing-tool/:
BlogPosting: http://www.example.com/foobar
url: http://www.example.com/realone
This is strange.
It’s definitely not conforming to the Microdata Note. Apart from Microdata’s itemref attribute, HTML5’s id attribute has no special meaning in Microdata.
If Google wants to use the id value anyway, they should at least generate the URL with a fragment identifier, i.e., http://www.example.com/#foobar.
My guess is that they are (probably unintentionally) handling HTML5’s id attribute the same way as Microdata’s itemid attribute. If using itemid instead of id in your example, Google’s Testing Tool output is the same, but this time correct.
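The difference between the two URLs can be reproduced with ordinary relative URL resolution. This Python sketch assumes the page itself lives at http://www.example.com/ (inferred from the tool output above); it only illustrates the guess and says nothing certain about Google's implementation.

```python
from urllib.parse import urljoin

page_url = "http://www.example.com/"  # assumed URL of the page being tested

# The id value resolved like a relative itemid URL (what the tool reports):
as_itemid = urljoin(page_url, "foobar")

# The id value used as a fragment identifier (what an id actually names):
as_fragment = urljoin(page_url, "#foobar")

print(as_itemid)
print(as_fragment)
```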

Using Schema.org’s "url" property on a Product page without adding a visual link

After a bit of research I found recommendations as in:
<div itemscope itemtype="http://schema.org/Product">
<a itemprop="url" href="URLOFPRODUCT">Link</a>
</div>
But I am trying to avoid linking to the product, on the product page.
Another approach I've noticed is the use of meta tags but outside the head, which is a big 'no no'.
Any suggestions?
For providing a URL in Microdata, you must use "a URL property element". Currently these are:
a, area, audio, embed, iframe, img, link, object, source, track, and video.
a and link are the only "generic" elements from this set.
If you don’t want to provide a visual link (by using a), go with link (which is typically hidden in browser default stylesheets). This is not "a big 'no no'", as link elements are allowed in the body if used for Microdata.