What is the correct hierarchical schema.org markup for a category page of articles?

I am working on a section of a website that is a combination of 'how-to' articles and 'faq' articles. When excerpted groups of those articles are displayed in a list by category, I am not sure what schema to use for the container and the individual articles. Blog and BlogPosting are for blogs, and this is not a blog. The articles are not dated or in chronological order, so I am thinking each one is either 'CreativeWork' or 'Article'. But I am not sure what the container's schema should be when they are displayed in excerpted groups or categories.
Edit:
Just to clarify.
Here's a simple version of my markup:
<div itemscope="" itemtype=" ??????? ">
<article itemscope itemtype="http://schema.org/Article"></article>
<article itemscope itemtype="http://schema.org/Article"></article>
<article itemscope itemtype="http://schema.org/Article"></article>
</div>

Firstly, don't get too hung up on lists; they are not as important [to the machine] as you think.
Try identifying the categories as well as you can. Not having a view into your world, I'll pick a couple of random categories "World War II" & "Europe". Create a page for each (quite possibly these could be your current list pages) and add the Schema.org specific to the category term itself.
{
"@context": "http://schema.org",
"@type": ["Place","DefinedTerm"],
"@id": "http://example.com/concepts/europe",
"name": "Europe",
"sameAs": "http://www.wikidata.org/entity/Q46",
....
{
"@context": "http://schema.org",
"@type":"DefinedTerm",
"@id": "http://example.com/concepts/wwii",
"name": "World War II",
"sameAs": "http://www.wikidata.org/entity/Q362",
....
Then for your articles use the "about" property to reference them to the categories:
{
"@context": "http://schema.org",
"@type":"Article",
"@id": "http://example.com/articles/A123",
"name": "World War II in Europe",
"about": ["http://example.com/concepts/wwii",
"http://example.com/concepts/europe"],
.....
That, in theory, is all you need to do for the crawler, which should have crawled all your pages, to understand your articles and what they are about.
If you want to be a bit more explicit, on the category pages you could add in the reverse relationships using the subjectOf property:
"subjectOf": ["http://example.com/articles/A123",
"http://example.com/articles/A033"],
Lists of things are of more use to humans; machines (e.g. the Knowledge Graph) can work it out from the relationship information you provide.
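To make the about/subjectOf pairing concrete, here is a minimal Python sketch that derives the reverse subjectOf values for a category from the articles' about values. It reuses the example URIs from this answer; the second article's name and the subject_of_index helper are hypothetical, and it mirrors the plain-URL values used above rather than claiming this is how any particular crawler works.

```python
import json
from collections import defaultdict

# Article nodes using the example URIs from the answer.
# The name of A033 is a made-up placeholder.
articles = [
    {
        "@context": "http://schema.org",
        "@type": "Article",
        "@id": "http://example.com/articles/A123",
        "name": "World War II in Europe",
        "about": [
            "http://example.com/concepts/wwii",
            "http://example.com/concepts/europe",
        ],
    },
    {
        "@context": "http://schema.org",
        "@type": "Article",
        "@id": "http://example.com/articles/A033",
        "name": "Placeholder article title",
        "about": "http://example.com/concepts/wwii",
    },
]


def subject_of_index(nodes):
    """Map each category URI to the sorted article URIs that are `about` it."""
    index = defaultdict(list)
    for node in nodes:
        about = node.get("about", [])
        if isinstance(about, str):  # a single value is allowed too
            about = [about]
        for category_uri in about:
            index[category_uri].append(node["@id"])
    return {uri: sorted(ids) for uri, ids in index.items()}


index = subject_of_index(articles)

# Category node with the reverse relationship filled in.
wwii = {
    "@context": "http://schema.org",
    "@type": "DefinedTerm",
    "@id": "http://example.com/concepts/wwii",
    "name": "World War II",
    "subjectOf": index["http://example.com/concepts/wwii"],
}
print(json.dumps(wwii, indent=2))
```

Generating subjectOf from about like this keeps the two directions from drifting apart when articles are re-categorised.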

copyright information for image repositories (schema.org, json-ld)

This is not exactly a technical question, it's about understanding the elements of CreativeWork and ImageObject in schema.org.
When creating structured data for an image, there are some attributes I can use to convey the copyright information, notably:
creator
copyrightHolder
license
This looks quite straightforward to me (even considering the legal differences of "copyright" in the Anglo-American and European worlds).
What I'm looking for is how to include the copyright information from pictures bought on image repositories like iStock. The information provided looks like: istock.com ©acreator ID-000000000.
How is this information meant to be used with schema.org?
As copyrightHolder is meant to be a Person or Organization, using the string 'istock.com ©someone' as the person's or organisation's name doesn't seem to be the right thing to do.
So I came up with this JSON-LD code:
[
{
"@context": "https://schema.org",
"@type": "ImageObject",
"contentUrl": "https://path/to/my/image",
"copyrightHolder": {
"name": "istock.com"
},
"creator": {
"name": "someone"
},
"copyrightNotice": "istock.com ©acreator ID-000000000",
"license": "https://www.istockphoto.com/en/legal/license-agreement"
}
]
I'm still not sure where to put the ID of the image, and also I'm not happy using the id on istock.com as "name" attribute for "creator".
I could also only use the copyrightNotice and not use "creator" and "copyrightHolder" but I'm not sure if I'd meet the legal requirement then.
The schema.org information is not really related to any legal requirements (it is a vocabulary for search engines).
The credit should be visible to the user (with or without schema.org markup) if the license requires adding a credit. Who is the copyrightHolder versus the creator is also not a schema.org question.
More important:
Don't mark up content that is not visible to readers of the page. Google Guidelines
So only add structured data that is visible to the users. I guess your web design doesn't include clickable text linking to the iStock license, so you don't need to add it.
Even creditText could be enough.
Example (The markup visible to readers):
Basic example - Without markup
<img src="italy-beach.jpg" alt="italy beach."/>
By Jane Doe
copyrights istock.com. Author Leonardo
Basic example - With markup
<div itemscope itemtype="https://schema.org/ImageObject">
<img src="italy-beach.jpg"
alt="italy beach."
itemprop="contentUrl" />
By <span itemprop="author">Jane Doe</span>
copyrights <span itemscope itemtype="https://schema.org/Organization" itemprop="copyrightHolder"><span itemprop="name">istock.com</span></span> Author: <span>Leonardo</span>
</div>
image id
You can use https://schema.org/identifier.
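As a rough sketch of how the identifier could sit next to creditText and copyrightNotice, the Python below builds an ImageObject node. The image_object helper, the creditText string, and the "iStock ID" label used for propertyID are assumptions for illustration; the URL and notice are the placeholders from the question.

```python
import json


def image_object(content_url, credit_text, copyright_notice, stock_id=None):
    """Build a schema.org ImageObject node as a plain dict."""
    node = {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "contentUrl": content_url,
        "creditText": credit_text,
        "copyrightNotice": copyright_notice,
    }
    if stock_id is not None:
        # identifier accepts text, a URL, or a PropertyValue;
        # the propertyID label here is an assumption, not an official name.
        node["identifier"] = {
            "@type": "PropertyValue",
            "propertyID": "iStock ID",
            "value": stock_id,
        }
    return node


image = image_object(
    content_url="https://path/to/my/image",
    credit_text="istock.com / acreator",  # hypothetical visible credit line
    copyright_notice="istock.com ©acreator ID-000000000",
    stock_id="000000000",
)
print(json.dumps(image, indent=2, ensure_ascii=False))
```

Keeping the repository ID in identifier avoids stuffing it into the name of a creator or copyrightHolder.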

How should schema.org relational properties be interpreted?

I am currently looking into schema.org to use it with API platform, but there are certain properties that I don't understand.
Let's take https://schema.org/Organization for example:
A Thing (in this case an Organization) has properties, like a name and an address. What I don't understand is the property department: in real life an organization doesn't have just a single department; it has several at least.
Shouldn't that property be oneToMany?
Or do I not understand it and does department link to the parent company, which makes the child Organization (the one with the department property) a department? But if that was to be the case I'd think there would be a Department object instead (extending from the Organization object).
When I define this property in my API Platform's schema.yaml, it expects a single value, just like I would expect from the schema.org's documentation.
Am I missing something? Can someone please explain how I should interpret and use such properties?
Edit: I found out that API Platform expects every property to have a single value, unless specified otherwise. So I have to set up the department property to be oneToMany.
That combined with the great explanation below (the accepted answer) explains it all.
All Schema.org properties can have multiple values. Usually it doesn’t make sense for every property (e.g., birthDate), but it’s possible anyway.
For the department property, the domain (the item which has this property) is the parent organization, and the range (the item which is the value of this property) is the department. In cases like this, where the domain and range expect the same types, you have to interpret the textual definition to make sure for which "direction" the property is intended.
(If, for some reason, you can’t provide multiple values for a property, note that you can use every Schema.org property in the other direction, too, even if no inverse property is defined.)
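For code that produces or consumes such data, the multi-value rule above can be sketched in Python as two small helpers: one that always reads a property as a list, and one that appends a value without caring whether the property currently holds nothing, one value, or an array. The helper names values_of and add_value are hypothetical, and the "#1" to "#3" identifiers are the ones used in this answer's examples.

```python
def values_of(node, prop):
    """Return the values of a property as a list (empty if absent)."""
    value = node.get(prop)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def add_value(node, prop, value):
    """Add one more value to a property, switching to an array when needed."""
    values = list(values_of(node, prop))
    values.append(value)
    # A single value stays a bare value; two or more become an array.
    node[prop] = values[0] if len(values) == 1 else values
    return node


org = {
    "@context": "http://schema.org/",
    "@type": "Organization",
    "@id": "#1",
}
add_value(org, "department", {"@type": "Organization", "@id": "#2"})
add_value(org, "department", {"@type": "Organization", "@id": "#3"})

for department in values_of(org, "department"):
    print(department["@id"])
```

Reading every property through something like values_of means the same code handles birthDate (normally one value) and department (often several).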
Examples
An organization (#1) has two departments (#2, #3).
JSON-LD
Using an array ([]):
{
"@context": "http://schema.org/",
"@type": "Organization",
"@id": "#1",
"department": [
{
"@type": "Organization",
"@id": "#2"
},
{
"@type": "Organization",
"@id": "#3"
}
]
}
Microdata
Repeating the property:
<div itemscope itemtype="http://schema.org/Organization" itemid="#1">
<div itemprop="department" itemscope itemtype="http://schema.org/Organization" itemid="#2"></div>
<div itemprop="department" itemscope itemtype="http://schema.org/Organization" itemid="#3"></div>
</div>
RDFa
Repeating the property:
<div typeof="schema:Organization" resource="#1">
<div property="schema:department" typeof="schema:Organization" resource="#2"></div>
<div property="schema:department" typeof="schema:Organization" resource="#3"></div>
</div>

Schema.org practices for small company: 'Organization' and 'WebSite' in JSON-LD on every page, Microdata for everything else

I'm wondering how to build my Schema.org markup. I'm using a mixed approach with both JSON-LD and Microdata elements. I don't use them to describe one thing in 2 different ways. I need some guidelines about what to include.
For now I have description of our company on every page:
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "Organization",
"url" : "https://our.url",
"logo" : "https://our.url/logo2.svg",
"contactPoint" : [{
"@type" : "ContactPoint",
"telephone" : "",
"contactType" : "Customer Service"
}],
"sameAs" :[],
"name" : "Our Small Company"
}
</script>
Then I have a small description of our webpage, again in JSON-LD:
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "WebSite",
"url" : "http://our.url",
"potentialAction" : {
"@type" : "SearchAction",
"target" : "http://our.url/search",
"query-input" : "required name=search_term_string"
}
}
</script>
From there on, I have Microdata for all elements. For example, search results are an ItemList with products, etc.
Does this seem Ok? Should I include JSON-LD company description on every page or only on the home page or not at all? Do I need to dig down and provide more specific description for every page (for example search page could be SearchResultsPage instead of WebSite)?
Providing some data in JSON-LD and some data in Microdata should be fine (but if both were about the same entities, you should denote this explicitly). It can become problematic if you want to connect the entities, though.
Connecting WebSite and Organization
Speaking of connecting entities, I would recommend doing this for your WebSite and Organization items. For example, you could state that your Organization is the publisher of the WebSite, and/or that the WebSite is about the Organization.
There are two ways to achieve this in JSON-LD:
use one script element and embed the Organization node as value
keep both script elements (or one script element with @graph), give each node a URI (with @id) and reference these URIs as values
The former probably has better consumer support, the latter makes it more suitable to provide multiple properties (e.g., author and publisher) without having to duplicate the whole data (but you could use a mixed way, too).
Example for the former way:
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "WebSite",
"publisher" : {
"@type" : "Organization"
}
}
</script>
Example for the latter way:
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "Organization",
"@id" : "/#org"
}
</script>
<script type="application/ld+json">
{
"@context" : "http://schema.org",
"@type" : "WebSite",
"publisher" : {"@id": "/#org"},
"about" : {"@id": "/#org"},
"mainEntity" : {"@id": "/#org"},
"author" : {"@id": "/#org"}
}
</script>
(where /#org is the URI that represents the organization itself, not just a page/site about or of the organization)
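If the JSON-LD is generated by a template or build step, the second way (nodes connected through @id, here in a single @graph) could be sketched in Python as below. The ref, iter_refs and dangling_refs helpers are hypothetical names; the organization name and site URL are taken from the question, and /#org from this answer. The small check at the end only confirms that every {"@id": ...} reference points at a node that is actually in the graph.

```python
import json

ORG_ID = "/#org"  # URI for the organization itself, as in the answer


def ref(node_id):
    """A reference to another node by its @id."""
    return {"@id": node_id}


def iter_refs(value):
    """Yield every {"@id": ...} reference nested inside a property value."""
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            yield value["@id"]
        else:
            for child in value.values():
                yield from iter_refs(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_refs(item)


def dangling_refs(graph):
    """Return referenced @id values that no node in the graph defines."""
    defined = {node["@id"] for node in graph if "@id" in node}
    used = set()
    for node in graph:
        for key, value in node.items():
            if key != "@id":
                used.update(iter_refs(value))
    return sorted(used - defined)


organization = {"@type": "Organization", "@id": ORG_ID, "name": "Our Small Company"}
website = {
    "@type": "WebSite",
    "url": "http://our.url",
    "publisher": ref(ORG_ID),
    "about": ref(ORG_ID),
}
document = {"@context": "http://schema.org", "@graph": [organization, website]}

script = (
    '<script type="application/ld+json">\n'
    + json.dumps(document, indent=2)
    + "\n</script>"
)
print(script)
print("dangling references:", dangling_refs(document["@graph"]))
```

With references instead of embedded copies, adding author or mainEntity later is one extra line rather than a duplicated Organization node.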
Providing WebPage
You can provide a WebPage item for each page. It can be helpful in many situations. But like it’s the case with any other type, too, there is no requirement whatsoever.
If you want to provide such an item, using the more specific types (like SearchResultsPage) where applicable is of course preferable. But if that’s not possible, using WebPage everywhere is way better than not providing it at all.
In your case, you would have to decide in which syntax to provide it. JSON-LD would allow you to provide it as hasPart of the WebSite according to the former way, as explained above. But that would make it hard to connect the WebPage with your page’s main entity (which you specify in Microdata) via the mainEntity property. As I think this is an important relation, I would specify the WebPage in Microdata and connect the WebSite and the WebPage via URIs.
You could do this from the JSON-LD WebSite node with:
"hasPart" : {"@id": "/current-page.html"}
(You could also do this from the WebPage Microdata with the inverse property isPartOf, but then you’d have to provide an @id for the WebSite.)
Having the WebPage in Microdata, e.g., on the body element, allows you to provide the mainEntity property:
<body itemscope itemtype="http://schema.org/WebPage">
<article itemprop="mainEntity" itemscope itemtype="http://schema.org/Article">
<!-- for an article that is the main content of the page -->
</article>
</body>
<body itemscope itemtype="http://schema.org/SearchResultsPage">
<ul itemprop="mainEntity" itemscope itemtype="http://schema.org/ItemList">
<!-- for a search result list that is the main content of the page -->
</ul>
</body>
Connecting WebPage and Organization
If you prefer, you could explicitly state that the Organization is the publisher/author/etc. of the WebPage, too:
<link itemprop="author publisher" href="/#org" />
(It could be deduced because you state this for the WebSite and every WebPage is connected via hasPart, but this is probably too advanced for many consumers, so stating it explicitly could help.)

Google does not correctly merge microdata and json+ld in the same page using same URI id

I have a product page with both Microdata and JSON-LD markup. Both refer to the same @id URI (http://www.example.org/product#this), so I would expect their properties to be merged, but instead the Structured Data Testing Tool shows 2 individual products. So:
1- Does Google support using two syntaxes in the same page?
2- Is this well implemented? Can I point both at the same object using itemid for Microdata and @id for JSON-LD?
3- Can this damage my page in terms of structure data indexing?
thanks
You can check it out using this code in test tool:
<div itemscope itemtype="http://schema.org/Product" itemid="http://www.example.org/product#this">
<a itemprop="url" href="http://www.example.org/product">
<div itemprop="name"><strong>Product Name</strong></div></a>
<div itemprop="description">Product Description</div>
<div itemprop="brand" itemscope itemtype="http://schema.org/Organization"><span itemprop="name">Product Brand</span></div>
<div itemprop="offers" itemscope itemtype="http://schema.org/Offer"> <span itemprop="price">100</span><link itemprop="itemCondition" href="http://schema.org/NewCondition" /> New</div>
</div>
<script type="application/ld+json">
{
"@context": "http://schema.org/",
"@id": "http://www.example.org/product#this",
"name": "Product Name",
"@type": "Product",
"image": "http://www.example.com/anvil_executive.jpg",
"mpn": "925872",
"brand": {
"@type": "Thing",
"name": "ACME"
},
"offers": {
"@type": "Offer",
"priceCurrency": "USD",
"price": "119.99",
"itemCondition": "http://schema.org/UsedCondition",
"availability": "http://schema.org/InStock"
}
}
</script>
My guess would be that Google’s Structured Data Testing Tool doesn’t support this for different syntaxes, as it seems to work if using the same syntax. But as they still display the URIs correctly (http://www.example.org/product#this in both cases), you could argue that it’s just the tool’s interface that doesn’t merge them.
However, as far as I know, Google does not document support for these subject URIs anyway (but this doesn’t necessarily mean that they don’t support them), so it might not matter for them.
Your example works fine if using http://linter.structured-data.org/: it creates one item with both brands and both offers.
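To illustrate what "one item with both brands and both offers" means, here is a simplified Python sketch of merging two node descriptions that share a subject URI. It is not how the linter or Google works internally; merge_nodes is a hypothetical helper, and the first dict is a hand transcription of part of the question's Microdata, not the output of a real Microdata parser.

```python
PRODUCT_ID = "http://www.example.org/product#this"

# Hand-transcribed from the Microdata in the question (subset of properties).
from_microdata = {
    "@id": PRODUCT_ID,
    "@type": "Product",
    "name": "Product Name",
    "brand": {"@type": "Organization", "name": "Product Brand"},
    "offers": {"@type": "Offer", "price": "100"},
}

# Subset of the JSON-LD block in the question.
from_jsonld = {
    "@id": PRODUCT_ID,
    "@type": "Product",
    "name": "Product Name",
    "brand": {"@type": "Thing", "name": "ACME"},
    "offers": {"@type": "Offer", "priceCurrency": "USD", "price": "119.99"},
}


def merge_nodes(*nodes):
    """Union the property values of nodes that describe the same @id."""
    ids = {node["@id"] for node in nodes}
    if len(ids) != 1:
        raise ValueError("nodes describe different subjects: %s" % sorted(ids))
    collected = {}
    for node in nodes:
        for key, value in node.items():
            if key == "@id":
                continue
            bucket = collected.setdefault(key, [])
            for item in value if isinstance(value, list) else [value]:
                if item not in bucket:  # identical values are kept once
                    bucket.append(item)
    merged = {"@id": ids.pop()}
    for key, values in collected.items():
        merged[key] = values[0] if len(values) == 1 else values
    return merged


merged = merge_nodes(from_microdata, from_jsonld)
print(merged["name"])          # identical in both sources, so kept once
print(len(merged["brand"]))    # both brand values survive
print(len(merged["offers"]))   # both offers survive
```

Note that a merge like this keeps conflicting values side by side (two brands, two prices), which is one reason to describe a product in a single syntax when the two descriptions disagree.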
While technically it is feasible to merge data coming from two different syntaxes (read microdata and json-ld) and the Structured Data Linter confirms so, Google does not support it, which means properties won't be merged (and won't satisfy Rich Snippets' requirements).
We have a final confirmation by several actors in the SEO World, including Dan Brickley and Jarno van Driel.
in general you can use both syntaxes side by side, but you won't get
the fine-grained merging of triples by ID that a pure RDF application
might expect (Dan Brickley on Twitter, Jan 14th, 2020, bold mine)
--
I think @danbri already was pretty clear. Highly doubt you'll get a
different answer from other Googlers. (Jarno van Driel on Twitter, Jan 14th, 2020)
The proposed solution so far is to parse the microdata and publish it as JSON-LD.

Any example implementation JSON-LD document on a web page

I need to look at any document that has an embedded JSON-LD object in the HTML.
I have embedded a JSON-LD object on my architecture. I have context from 2 sources: schema.org and my custom set of vocab, which is further negotiated with the schema.org reference, which is www.mysitename/vocab.
Here I have provided a download link with .jsonld file extension. I need to see a reference from any HTML document that has embedded a JSON-LD object in it to get an idea about it.
Below is the embedded JSON-LD object in the source-code of my HTML document:
<script type="application/ld+json">
{
"@context": [
"http://schema.org/",
"http://puneet.ys/vocab"
],
"@id": "http://puneet.ys/seahawks",
"@type": "SportsTeam",
"name": "Seattle Seahawks",
"url": "http://puneet.ys/seahawks",
"image": "/2011/12/08/35/team/222398/large.jpg",
"interactionCount": "124 UserLikes",
"logo": "/2011/12/08/35/team/222398/large.jpg",
"description": "The Seattle Seahawks are […]",
"discipline": "http://puneet.ys/sport/football",
"subOrganizationOf": "http://puneet.ys/company/nfl",
"location": {
"@id": "http://puneet.ys/seattle-wa",
"@type": "Place",
"name": "Seattle, WA",
"url": "http://puneet.ys/seattle-wa",
"image": "/2011/12/08/35/city/28545/large.jpg",
"interactionCount": "25 UserLikes",
"photo": "/2011/12/08/35/city/28545/large.jpg",
"sameAs": [
"http://www.freebase.com/m/0d9jr",
"http://en.wikipedia.org/wiki/Seattle",
"http://www.seattle.gov/"
]
},
"claimedBy": "http://puneet.ys/fan/chris-mccoy",
"sameAs": [
"http://www.freebase.com/m/070xg",
"http://www.facebook.com/Seahawks",
"http://www.twitter.com/seahawks",
"http://en.wikipedia.org/wiki/Seattle_Seahawks",
"http://www.seahawks.com"
]
} </script>
In this Google Webmaster Tools Answer it says:
The data, enclosed within the <script type="application/ld+json"> ... </script> tags as shown in the examples below may be placed in either the <HEAD> or <BODY> region of the page. Either way, it won’t affect how your document appears in users’ web browsers.
I wrapped your script above in <html><header> tags and it worked with the Google Email Markup-Tester, although for some reason the Google Webmaster Structured Data Testing Tool doesn't pick up JSON-LD yet.
You can also use Google's Structured Data Markup Helper, enter some dummy URL or HTML, click "Start Tagging" (blue button), add at least one tag, then "Create HTML" (red button) and finally change the "Microdata"-dropdown to "JSON-LD".
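If you just want to check that a page's embedded JSON-LD is present and parses, a small standard-library Python sketch along these lines may help. The JsonLdExtractor class is a hypothetical name, and the sample page is a cut-down version of the markup in the question placed in the head element; it only verifies that the JSON is well-formed, not that the vocabulary is used correctly.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # json.loads raises ValueError if the block is not valid JSON.
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


page = """<html><head>
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@id": "http://puneet.ys/seahawks",
  "@type": "SportsTeam",
  "name": "Seattle Seahawks"
}
</script>
</head><body><p>Team page</p></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(page)
extractor.close()

for document in extractor.documents:
    print(document["@type"], document["name"])
```

The same extractor works whether the script element sits in the head or the body, since it only looks at the element's type attribute.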