Mixing vocabularies, Google's structured data testing tool and Schema.org extensions - schema.org

We are using several vocabularies along with schema.org and struggle with the structured data testing tool from Google. Is it even possible to completely pacify it when mixing vocabularies?
Some of the classes and properties we use are specializations of classes and properties of schema.org.
I have read the page about the extension mechanism. It is completely unclear to me what external extensions actually are. It is completely unclear to me if and how it is possible to communicate to Google that a class/property is a specialization of a schema.org class/property (so that Google uses RDFS reasoning to get statements involving the schema.org namespace).
The example I am using is http://www.netestate.de/imgtag_schema_example/lio.html
The RDFa in that page describes the image shown. The <img> tag in the source has a typeof attribute.
If I use typeof="lio:Image", I get 1 error about lio:Image not being known to Google. Makes sense. Validation URL: http://www.netestate.de/imgtag_schema_example/lio.html
If I use typeof="lio:Image schema:ImageObject", I get exactly the same error. Validation URL: http://www.netestate.de/imgtag_schema_example/lioschema.html
If I use typeof="schema:ImageObject", I get 19 errors about properties not recognized as compatible with ImageObject. Validation URL: http://www.netestate.de/imgtag_schema_example/schema.html
If I use typeof="schema:ImageObject lio:Image", I get 1 error about a class that is not known to Google (the class is not named but "ImageObject" is red!). Validation URL: http://www.netestate.de/imgtag_schema_example/schemalio.html
If I use typeof="lio:Image" and add the statement lio:Image rdfs:subClassOf schema:ImageObject to the RDFa, the validator separates the triples about http://purl.org/net/lio#Image ("class not defined, no errors") and the image (unknown class #__sid=rd0, 1 error). Validation URL: http://www.netestate.de/imgtag_schema_example/liosubclass.html
Where does the relative URI #__sid=rd0 come from?
Why is the error about #__sid=rd0 missing in this simpler example?
http://www.netestate.de/imgtag_schema_example/minimal.html

Don't let any Google Structured Data Testing Tool complaints about unknown vocabulary bother you. Its main purpose is to help publishers understand when they are using structures which Google products/features expect and use. Generally it will only understand the schema.org parts (and won't exploit subtypes to other vocabularies). You might find that using the additionalType property helps make some errors go away. The __sid=rd0 ID is just a generated URI for what RDF would consider a 'blank node' in the graph.

Related

What is the difference between BasicHttpRequest and HttpGet, HttpPost, etc in Apache HTTP Client 4.3 ?

I am creating HTTP request using Apache HTTP Client version 4.3.4. I see there are some classes like HttpGet,... and there is also a class BasicHttpRequest. I am not sure which one to use.
What's the difference, and which one should be used in which situation?
BasicHttpRequest is provided by the core library. As its name suggests it is pretty basic: it enforces no particular method name or type, nor does it attempt to validate the request URI. The URI parameter can be any arbitrary garbage. HttpClient will dutifully transmit it to server as is, if it is unable to parse it to a valid URI.
The HttpUriRequest variety, on the other hand, enforces a specific method type and requires a valid URI. Another important feature is that an HttpUriRequest can be aborted at any point of its execution.
By default, you should use classes that implement HttpUriRequest.
I was just browsing the 4.3.6 javadoc attempting to locate your BasicHttpRequest and was unable to find it. Do you have a reference to the javadoc of this class?
I would be under the impression that BasicHttpRequest would be a base class providing operations and attributes common to more than one HttpRequest. It may be extremely generic for extension purposes.
To the first part of your question, use HttpGet, HttpPost, etc. for their specific operations. If you only need to GET information, use HttpGet; if you need to post a form or document body, use HttpPost. If you want to use methods such as HEAD, PUT, or DELETE, use the corresponding HttpXXX class.

Which one is the correct approach for form validation ? Colander's Schema validation or Deform's form validation?

I have just started using Pyramid for one of my projects, and I have a case where I need to validate a form field by taking its value and making a web-service call to check that the value is correct. For example, there is a field called your bank's CUSTOMER-ID. I need to take that field alone as input and validate it on the server by making a web-service call (say, http://someotherdomain/validate_customer_id/?customer_id=<input_value>).
I am using Colander for form schema management and Deform for all form validation logic. I am confused about where I need to place my validation logic for the CUSTOMER-ID case. Is it at MySchema().bind(customer_id=<input_value>) (which has a deferred validator that queries the web-service) or somewhere around form.validate(request.POST.items())? If I take the deferred validator's path, then MySchema().bind raises a colander.Invalid error for an incorrect CUSTOMER-ID. That's fine. But that error is not at the form level but at the schema level. So how would I tell the user about this in a sane way?
I have good experience with Django forms so I was expecting something like clean method. A form error like form['customer_id'].error is what I am expecting at the template level. Is it possible with Pyramid's Deform or with Colander ?
So I think the big problem you're having is understanding the separation of concerns between Colander and Deform. Colander is what people like to call a general schema validation library. That means we define a schema where each node has a particular data type and some nodes may be required or optional. Colander is then able to validate that schema and tell us whether or not the data we passed to Colander conforms to it. As an example, in my web apps, I am often building APIs that accept GET/POST params that need to be validated. So in Pyramid, let's say I have this scenario:
request.POST = {
    'post_id': 1,
    'author_id': 1,
    'unnecessary_attr': 'stuff'
}
I can then validate it like so:
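# schema
from colander import Integer, Mapping, SchemaNode

schema = SchemaNode(Mapping(),
                    SchemaNode(Integer(), name='post_id'),
                    SchemaNode(Integer(), name='author_id'))
# Raises colander.Invalid if the data does not fit the schema;
# unknown keys such as 'unnecessary_attr' are ignored by default.
schema.deserialize(request.POST)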
And it will error if it can't conform the data to the specified schema. So you can see, colander can actually be used to validate ANY set of data, whether that comes from POST/GET/JSON data. Deform on the other hand is a form library, and helps you create/validate forms. It uses colander for all of the validation needs and as you can see it pretty much just completely delegates validation to colander. So to answer your question, you would do all of your validation stuff in colander, and deform would mostly handle the rendering of your forms.
To see a vivid pyramid example application and deform in action look at todopyramid as a part of IndyPy Python Web Shootout. A todo application was implemented in pyramid, django, flask and bottle. I studied the pyramid example - it is well written, shows deform schema validation and uses bootstrap to show validation messages.
Find more pyramid tutorials here:

grave in the Go Language

After looking around for a while I was able to understand how the json: tags are used in the Go language. However two tags I have come across I'm still lost on, and can't seem to find documentation on it.
Both pertain to a REST api service and the full code can be found here-> code.google.com
What is the root: tag used for?
gorest.RestService `root:"/orders-service/" consumes:"application/json" produces:"application/json"`
And how does the method: tag work?
userDetails gorest.EndPoint `method:"GET" path:"/users/{Id:int}" output:"User"`
I didn't know if anyone had any links to a site or document that might explain this more, from the examples I can learn enough to use it. However, I would really like to fully understand it.
Thanks for your time!
Tags are nothing but strings, they don't have any meaning per-se.
Libraries can use reflection to introspect struct fields and interpret their tags. See reflect.StructTag.
In your case, gorest parses the following tags on Services:
root
consumes
produces
and these on Endpoints:
realm
method
path
output
input
role
postdata
Their meaning is described in gorest's documentation.
These are gorest tags. See gorest wiki http://code.google.com/p/gorest/wiki/GettingStarted

GWT - internationalization of entity properties

I'm looking for an elegant solution for the following problem:
In my database, I have some predefined(!) entities. These entities have names and descriptions (Strings). Around the data access layer, there are some EJBs containing business logic to load/search for/etc. those entities.
Now for the frontend, we are developing a GWT application which calls the EJB methods on our backend.
The problem is, that the name and the descriptions of the entities mentioned above must be internationalized - e.g., depending on the user's locale, an entity's description must be "My cool description" (English) or "Beschreibung bla" (German) or whatever :)
My first approach was to use a resource string in the database. So entity A has a description "descriptionA", entity B has a description "descriptionB"... Later on, the GWT app (or any other client) translates this resource string into the actual description using some kind of "resource bundle". E.g.:
*resources_en.properties*:
descriptionA=Actual Description of Entity A
descriptionB=Actual Description of Entity B
*resources_de.properties*:
descriptionA=Beschreibung A
descriptionB=Beschreibung B
(Remember, the entities are predefined, so it's possible to "know" all descriptions at compile time. BUT it would be better if the resource bundle could be enhanced without having to recompile the application).
Is this possible with GWT? How can I do this? Is it better to "translate" on the server or on the client side?
Otherwise, I've to deal with all that i18n stuff on the backend side. Well, this would allow to keep data together (instead of defining the descriptions on the client side). But the big drawback is that the backend must be aware of the caller's locale.
Regards,
Frank
It's mainly a decision between download time/speed vs flexibility. If you compile the messages in, GWT inlines them and can generate slightly faster code, because no string lookup has to be done. However, if you need to make changes without recompiling, or want to let users alter messages dynamically, you need dynamic messages.
Regarding the latter case, the Dictionary class can help you with this, see also: http://code.google.com/webtoolkit/doc/latest/DevGuideI18n.html#DevGuideDynamicStringInternationalization
With the Dictionary you generate all messages in the static page served to the user. The user's locale can be found in the Accept-Language header, which is sent by the browser when a page is requested.
In either case (compiled or dynamic) you might want to serve the locale set by the user in some configuration property and in that case you still need logic for both cases on the server side to serve the locale to the user.
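As a rough illustration of the Accept-Language lookup mentioned above, here is a minimal server-side sketch in Python (not GWT code). The function name pick_locale and the set of supported languages are made up for this example, and the parsing is deliberately simplified: it only looks at the primary language subtag and the q value.

```python
def pick_locale(accept_language, supported, default="en"):
    """Return the best supported language for an Accept-Language header value."""
    candidates = []
    for item in accept_language.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q value as "not acceptable"
        # Compare on the primary language only ("de-DE" -> "de").
        candidates.append((quality, tag.strip().split("-")[0].lower()))
    # sorted() is stable, so entries with equal quality keep header order.
    for quality, language in sorted(candidates, key=lambda c: -c[0]):
        if quality > 0 and language in supported:
            return language
    return default
```

For a header such as "de-DE,de;q=0.9,en;q=0.8" and supported languages {"en", "de"}, this picks "de"; the result can then decide which Dictionary contents are written into the host page.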
Everything is possible for those who try...
Back to your question: there are several ways to resolve your issue. One would be to introduce some kind of i18n facade and treat your descriptions and names as resource keys. Then you could define convenience methods to access translations i.e. public String translate(String message, Locale locale);. This method could use standard Java ResourceBundle class to access resources at runtime.
The only real problem I see is how to deal with compound messages (i.e. "Blah, blah 4 items" where 4 is a placeholder). Well, what we did in one project in similar situation, we added delimiter and actual resource key then another delimiter and count: "Blah, blah 4 items##items.in.your.whatever##4". In the case of English you could simply trim the first part and for other languages you would need to process whole string.
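To make the facade idea and the delimiter trick a bit more concrete, here is a small sketch in Python rather than Java. Everything in it is an assumption for illustration: the in-memory BUNDLES dictionary stands in for real resource bundles, the German text for the compound message is invented, and the "{0}" placeholder syntax is just one possible convention.

```python
# Hypothetical in-memory "resource bundles"; a real application would load
# these from .properties files or a database.
BUNDLES = {
    "en": {
        "descriptionA": "Actual Description of Entity A",
        "items.in.your.whatever": "Blah, blah {0} items",
    },
    "de": {
        "descriptionA": "Beschreibung A",
        "items.in.your.whatever": "Bla, bla {0} Elemente",
    },
}

DELIMITER = "##"


def translate(message, locale, default_locale="en"):
    """Translate a plain resource key or a compound 'text##key##arg' message."""
    bundle = BUNDLES.get(locale, BUNDLES[default_locale])
    parts = message.split(DELIMITER)
    if len(parts) == 1:
        # Plain resource key; fall back to the key itself if it is unknown.
        return bundle.get(message, message)
    fallback_text, key, *args = parts
    template = bundle.get(key)
    if template is None:
        # No translation available: use the pre-rendered first part.
        return fallback_text
    return template.format(*args)
```

With this, translate("descriptionA", "de") yields the German description, and the compound string from the example above is either re-rendered from the resource key or cut down to its first part when no translation exists.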

How to version REST URIs

What is the best way to version REST URIs? Currently we have a version # in the URI itself, ie.
http://example.com/users/v4/1234/
for version 4 of this representation.
Does the version belong in the queryString? ie.
http://example.com/users/1234?version=4
Or is versioning best accomplished another way?
Do not version URLs, because ...
you break permalinks
The url changes will spread like a disease through your interface. What do you do with representations that have not changed but point to the representation that has? If you change the url, you break old clients. If you leave the url, your new clients may not work.
Versioning media types is a much more flexible solution.
Assuming that your resource is returning some variant of application/vnd.yourcompany.user+xml all you need to do is create support for a new application/vnd.yourcompany.userV2+xml media type and through the magic of content negotiation your v1 and v2 clients can co-exist peacefully.
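A minimal sketch of that negotiation, in Python and independent of any web framework, might look like this. The renderer functions and the XML layouts are invented for illustration; only the two media type names come from the paragraph above. The matching is simplified on purpose: it ignores q values and wildcards and compares media types case-sensitively.

```python
# Hypothetical renderers, one per supported media type.
def render_user_v1(user):
    return "<user><name>{first} {last}</name></user>".format(**user)


def render_user_v2(user):
    return "<user><first>{first}</first><last>{last}</last></user>".format(**user)


RENDERERS = {
    "application/vnd.yourcompany.user+xml": render_user_v1,
    "application/vnd.yourcompany.userV2+xml": render_user_v2,
}


def negotiate(accept_header):
    """Return the first media type in the Accept header that the server supports."""
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in RENDERERS:
            return media_type
    return None


def get_user(accept_header, user):
    """Return (status, content type, body) for the same URL, whatever the version."""
    media_type = negotiate(accept_header)
    if media_type is None:
        return 406, None, None  # 406 Not Acceptable
    return 200, media_type, RENDERERS[media_type](user)
```

Both clients request the same URL; a v1 client sends Accept: application/vnd.yourcompany.user+xml and a v2 client sends the userV2 type, so neither has to know about the other.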
In a RESTful interface, the closest thing you have to a contract is the definition of the media-types that are exchanged between the client and the server.
The URLs that the client uses to interact with the server should be provided by the server, embedded in previously retrieved representations. The only URL that needs to be known by the client is the root URL of the interface. Adding version numbers to URLs only has value if you construct URLs on the client, which you are not supposed to do with a RESTful interface.
If you need to make a change to your media-types that will break your existing clients then create a new one and leave your urls alone!
And for those readers currently saying that this makes no sense if you are using application/xml and application/json as media types: how are we supposed to version those? You're not. Those media types are pretty much useless to a RESTful interface unless you parse them using code-download, at which point versioning is a moot point.
I would say making it part of the URI itself (option 1) is best because v4 identifies a different resource than v3. Query parameters like in your second option can be best used to pass-in additional (query) info related to the request, rather than the resource.
Ah, I'm putting my old grumpy hat on again.
From a ReST perspective, it doesn't matter at all. Not a sausage.
The client receives a URI it wants to follow, and treats it as an opaque string. Put whatever you want in it, the client has no knowledge of such a thing as a version identifier on it.
What the client knows is that it can process the media type, and I'd advise you to follow Darrel's advice. Also, I personally feel that needing to change the format used in a RESTful architecture 4 times should raise huge warning signs that you're doing something seriously wrong and completely bypassing the need to design your media type for change resilience.
But either way, the client can only process a document with a format it can understand, and follow links in it. It should know about the link relationships (the transitions). So what's in the URI is completely irrelevant.
I personally would vote for http://localhost/3f3405d5-5984-4683-bf26-aca186d21c04
A perfectly valid identifier that will prevent any further client developer or person touching the system to question if one should put v4 at the beginning or at the end of a URI (and I suggest that, from the server perspective, you shouldn't have 4 versions, but 4 media types).
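To illustrate the "opaque URI" point, here is a small Python sketch of a client that only knows link relations and never builds or inspects URIs itself. The JSON-like shape of the representation (a "links" list with "rel" and "href") and the relation name "orders" are assumptions made up for this example, not part of any particular media type.

```python
def find_link(representation, rel):
    """Return the href of the first link with the given relation, or None."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link["href"]
    return None


# A representation as the server might return it; the hrefs are opaque to the client.
user = {
    "name": "firstname lastname",
    "links": [
        {"rel": "self", "href": "http://localhost/3f3405d5-5984-4683-bf26-aca186d21c04"},
        {"rel": "orders", "href": "http://localhost/users/v4/1234/orders"},
    ],
}

orders_url = find_link(user, "orders")
```

Whether the href contains v4, a GUID, or nothing recognizable makes no difference to this client; it only breaks if the relation names or the format itself change.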
You should NOT put the version in the URL, you should put the version in the Accept Header of the request - see my post on this thread:
Best practices for API versioning?
If you start sticking versions in the URL you end up with silly URLs like this:
http://company.com/api/v3.0/customer/123/v2.0/orders/4321/
And there are a bunch of other problems that creep in as well - see my blog:
http://thereisnorightway.blogspot.com/2011/02/versioning-and-types-in-resthttp-api.html
These (less-specific) SO questions about REST API versioning may be helpful:
Versioning RESTful services?
Best practices for web service REST API versioning
There are 4 different approaches to versioning the API:
Adding version to the URI path:
http://example.com/api/v1/foo
http://example.com/api/v2/foo
When you have breaking change, you must increment the version like: v1, v2, v3...
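You can implement a controller in your code like this:
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class FooVersioningController {
    // Version 1 returns the name as a single string.
    @GetMapping("v1/foo")
    public FooV1 fooV1() {
        return new FooV1("firstname lastname");
    }

    // Version 2 returns a structured name.
    @GetMapping("v2/foo")
    public FooV2 fooV2() {
        return new FooV2(new Name("firstname", "lastname"));
    }
}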
Request parameter versioning:
http://example.com/api/v2/foo/param?version=1
http://example.com/api/v2/foo/param?version=2
The version parameter can be optional or required depending on how you want the API to be used.
The implementation can be similar to this:
@GetMapping(value = "/foo/param", params = "version=1")
public FooV1 paramV1() {
    return new FooV1("firstname lastname");
}

@GetMapping(value = "/foo/param", params = "version=2")
public FooV2 paramV2() {
    return new FooV2(new Name("firstname", "lastname"));
}
Passing the version in a header (here the Accept header with a vendor media type):
http://localhost:8080/foo/produces
With header:
headers[Accept=application/vnd.company.app-v1+json]
or:
headers[Accept=application/vnd.company.app-v2+json]
The largest advantage of this scheme is mostly semantic: you aren’t cluttering the URI with anything to do with the versioning.
Possible implementation:
@GetMapping(value = "/foo/produces", produces = "application/vnd.company.app-v1+json")
public FooV1 producesV1() {
    return new FooV1("firstname lastname");
}

@GetMapping(value = "/foo/produces", produces = "application/vnd.company.app-v2+json")
public FooV2 producesV2() {
    return new FooV2(new Name("firstname", "lastname"));
}
Changing Hostnames or using API Gateways:
Essentially, you’re moving the API from one hostname to another. You might even just call this building a new API to the same resources.
Also, you can do this using API Gateways.
I wanted to create versioned APIs and I found this article very useful:
http://blog.steveklabnik.com/posts/2011-07-03-nobody-understands-rest-or-http
There is a small section on "I want my API to be versioned". I found it simple and easy to understand. The crux is to use Accept field in the header to pass version information.
If the REST services require authentication before use, you could easily associate the API key/token with an API version and do the routing internally. To use a new version of the API, a new API key could be required, linked to that version.
Unfortunately, this solution only works for auth-based APIs. However, it does keep versions out of the URIs.
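A rough Python sketch of that internal routing could look like the following. The key registry, the key strings, and the two handler functions are all hypothetical; a real service would look keys up in its auth store and return proper HTTP responses.

```python
# Hypothetical key registry: each API key is bound to exactly one API version.
API_KEY_VERSIONS = {
    "key-legacy-123": 1,
    "key-current-456": 2,
}


def user_v1(user_id):
    return {"id": user_id, "name": "firstname lastname"}


def user_v2(user_id):
    return {"id": user_id, "name": {"first": "firstname", "last": "lastname"}}


HANDLERS = {1: user_v1, 2: user_v2}


def handle_get_user(api_key, user_id):
    """Route a request for the same URI to the handler for the key's version."""
    version = API_KEY_VERSIONS.get(api_key)
    if version is None:
        return 401, {"error": "unknown API key"}
    return 200, HANDLERS[version](user_id)
```

The URI stays the same for every client; moving a client to the new version means issuing it a new key, not changing its URLs.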
If you use URIs for versioning, then the version number should be in the URI of the API root, so every resource identifier can include it.
Technically, a REST API does not break because of URL changes (a result of the uniform interface constraint). It breaks only when the related semantics (for example an API-specific RDF vocab) change in a non-backward-compatible way (rare). Currently a lot of people do not use links for navigation (HATEOAS constraint) or vocabs to annotate their REST responses (self-descriptive message constraint); that's why their clients break.
Custom MIME types and MIME type versioning do not help, because putting the related metadata and the structure of the representation into a short string does not work. Of course the metadata and the structure will change frequently, and so will the version number...
So to answer your question: the best way is to annotate your requests and responses with vocabs (Hydra, linked data) and forget versioning, or use it only for non-backward-compatible vocab changes (for example, if you want to replace one vocab with another).
I'd include the version as an optional value at the end of the URI. This could be a suffix like /V4 or a query parameter like you've described. You might even redirect the /V4 to the query parameter so you support both variations.
I vote for doing this in the MIME type, not in the URL.
But my reason is not the same as the others'.
I think the URL should be unique (redirects aside) for locating a unique resource.
So, if you accept /v2.0 in URLs, why not /ver2.0 or /v2/ or /v2.0.0? Or even -alpha and -beta? (At that point it becomes semver in its entirety.)
So, putting the version in the MIME type is more acceptable than putting it in the URL.