Validate if graph is following ontology file - dotnetrdf

Imagine I have two RDF (turtle) files, one that contains my custom ontology (a.ttl) and one that contains values according to the ontology (b.ttl).
Is it possible to check if b.ttl is respecting all the definitions defined in a.ttl using .NET RDF?
I can load a.ttl using the OntologyGraph class, can I use this in some way to validate that the graph loaded from b.ttl is following the specification?

It depends on how your definitions are expressed.
If they are expressed in SHACL, then yes - dotNetRDF supports SHACL validation (which is sadly not yet written up in the docs, but take a look at this sample code).
If they are expressed in OWL, then no - dotNetRDF does not have an OWL inference engine so it cannot determine if your data is consistent with the ontology (in general OWL is actually for asserting new facts, and OWL "validation" is a process of determining if the facts asserted remain consistent with the ontology). You may need to look to one of the reasoners listed here to do this sort of processing.
A simple RDF-Schema based set of constraints (like just subclass, property domain, property range) could probably be converted to SHACL fairly easily, but that would be an additional step to add to your process.

Related

Meaningful data generation for Java POJO attributes

I have large number of APIs which I received in swagger json files. I want to stub these hundreds of APIs for which I require to autofill the returnable POJOs with some meaningful data( and not random data ). The aim is to mock these APIs with minimal effort. The closest thing which I found is Podam java library but it generates random data (as shown here - https://mtedone.github.io/podam/). It also generates attribute level meaningful data (in section "Defining an attribute-level strategy" on the previous link), but it mounts to same effort one would take to individually make POJOs. Can anybody prvide suggestion as to how should I solve this with minimal effort.

How to "flow" tagged values in Enterprise Architect from one instance to another

My questions is about bringing a concept to reality through technical availability of EA.
I am looking for a way to connect instances at an object diagram through which I can transfer tagged values. Let me explain the background of the project.
Purpose is to first have Stereotypes for specific roles in the system, such as "Calculation", "Transmission", "Decision", "Qualification", "Abstraction" etc.
Each of these stereotypes have specific tagged values suitable for their purpose.
Then I am creating instances from these stereotypes, eg. "MotorTorque:Calculation" and "LimitedTorque:Abstraction"
Each of these instances have a common tagged value, "criticality", boolean and I want this tagged value to progress from "MotorTorque:Calculation" to "LimitedTorque:Abstraction" through an output port > some sort of flow > input port kind of way.
Questions are:
1- Is this approach technically achievable in EA? If so what would be the correct way to do it?
2- The purpose is to have this "connection" readable in XMI export of the diagram which I will be using as an input for another purpose.
I have created an MDG Technology for my project with stereotypes and tagged values, however, I am having difficulty achieving this "connection", this "flow" of values.
Thank you for your time.
What you are asking for is not directly achievable. However, many ways lead to Rome.
One way would be to <<trace>> connect those objects to a Status class (or what ever you like to name it) and have this carry the "shared TV".
Another way is (by far more complex) to use an add-in. You would anyway need ways to create groups which share the TV. From your current explanation I can't see what that might be. Maybe the instantiating class of those instances? If so, you make a script that propagates a TV setting from ist current to all other linked instances. I'm not sure if the add-in events fire when a TV is changes (I do have some doubts here). If needed I could look that up.
What you propose is partially feasible.
There is a tagged value inheritance chain in EA, in which tagged values are inherited down the generalization chain, and from a classifier to its instances. In the GUI, inherited tagged values are shown separately from the instance's own ones, and in the API they are accessed using the Element.TaggedValuesEx property. Inherited tagged values can also be overridden.
Since the correct way to create a port (or part/property) is to make it an instance of a component, a port will inherit any tagged values from that component. So if your Calculation stereotype applies to component, ports which are instances of Calculation components will inherit the MotorTorque tagged value.
However, there is no way to "flow" tagged values from one port to another. If you want such a function, you'll have to implement it yourself with an Add-In.
Regarding XMI, first you must understand that an XMI export is based on a package, not a diagram. The XMI format itself is extensible, which means that different tool vendors create their own extensions which are typically not publicly documented. Crucially, diagram layouts are part of these non-standardized extensions. In EA's case, the image data is some sort of UU-encoded bitmap which you won't be able to extract any useful information from.
Elements' tagged values are included in an XMI export, but again, the EA extensions are not publicly documented. In other words, you can import EA:s XMI format in another program, but you will need to reverse-engineer the format. Not impossible, but it's probably better to either write your own specialized export function, or export via CSV. Note, however, that CSV export cannot be automated -- there's no call for it in the API.

DBpedia.org Ontology versus Schema.org Ontology

First off, I'm trying to define database tables with attributes from Schema.org, eg., for example I have a table named "JobPosting" that more or less has the same attributes as those defined in http://schema.org/JobPosting (baseSalary, etc.,), same goes for another table named "Organisation"
I have recently come across dbpedia.org (http://dbpedia.org/ontology/Organisation), the schema details seem to be much more richer, but I'm am confused as to:
Is dbpedia.org ontology an extension of those listed in schema.org?
Are dbpedia.org schemas recognized by major search engines (as those from schema.org)
What's the difference between Microdata and RFDs?
I'm going a little stir crazy trying to find the details...I couldn't find any comparisons vis-a-vis dbpedia.og vs schema.org.
Schema.org is one of countless vocabularies (resp. ontologies). The DBpedia Ontology is another one. Both vocabularies are independent of each other. Another vocabulary, related to your example, would be The Organization Ontology.
Which search engines recognize which vocabularies is a question without a definite answer. Search engines might recognize vocabularies without documenting it, or they might not recognize some (parts of) vocabularies although their documentation says otherwise. On top of that, all this might change daily.
You asked for the difference between Microdata and RFDs RDFs, but it’s likely that you mean RDFa in this context. Both are syntaxes which can be used to annotate content with the help of vocabularies. See my answer about differences between Microdata and RDFa.
(RDFS is "just" another vocabulary which can be used to describe vocabularies.)
I will try to answer all your questions, with understandable explanations.
Is dbpedia.org ontology an extension of those listed in schema.org?
No, it's not. There are countless ontologies available online, and any of them can be used combined, or alone, as long as their namespace (i.e. https://www.w3.org/2004/02/skos/ for SKOS or http://rdfs.org/sioc/spec/ for SIOC) is a valid URI.
Are dbpedia.org schemas recognized by major search engines (as those from schema.org)?
dbpedia schemas are as good as any other, and, as stated in the answer for the first question, it really doesn't matter which ontology you decide to use, as long as it best fits your content.
You can even create your own ontology in OWL-RDF.
What's the difference between Microdata and RFDa (not RDFs)?
The only difference between these 2 attribute sets is the way they're written, while they both do the same thing.
Other information:
RDFs stands for Resource Description Format Schema, and it's a format used to write the ontologies, together with OWL
OWL stands for Web Ontology Language, and it was created especially for writing ontologies
RDFa stands for Resource Description Format in Attributes, and it's an attribute set used to create structured data mapped on the existent HTML code
Microdata is an attribute set used to create structured data mapped on the existent HTML code

REST best practice for getting a subset list

I read the article at REST - complex applications and it answers some of my questions, but not all.
I am designing my first REST application and need to return "subset" lists to GET requests. Which of the following is more "RESTful"?
/patients;listType=appointments;date=2010-02-22;user_id=1234
or
/patients/appointments-list;date=2010-02-22;user_id=1234
or even
/appointments/2010-02-22/patients;user_id=1234
There will be about a dozen different lists that I need to return. In some of these, there will be several filtering parameters and I don't want to have big 'if' statements in my server code to select the subsets based on which parameters are present. For example, I might need all patients for a specific doctor where the covering doctor is another and the primary doctor is yet another. I could select with
/patients;rounds=true;specific_id=xxxx;covering_id=yyyy;primary_id=zzzz
but that would require complicated branching logic to get the right list, where asking for a specific subset (rounds-list) will achieve that same thing.
Note that I need to use matrix parameters instead of query parameters because I need to do filtering at several levels of the URL. The framework I am using (RestEasy), fully supports matrix parameters.
Ralph,
the particular URI patterns are orthogonal to the question how RESTful your application will be.
What matters with regard to RESTfulness is that the client discovers how to construct the URIs at runtime. This can be achieved either with forms or URI templates. Both hypermedia controls tell the client what parameters can be used and where to put them in the URI.
For this to work RESTfully, client and server must know the possible parameters at design time. This is usually achieved by making them part of the specification of the link relationship.
You might for example define a 'my-subset' link relation to have the meaning of linking to subsets of collections and with it you would define the following parameters:
listType, date, userID.
In a link template that spec could be used as
<link rel="my-subset' template="/{listType}/{date}/patients;user_id={userID}"/>
Note how the actual parameter name in the URI is decoupled from the specified parameter name. The value for userID is late-bound to the URI parameter user_id.
This makes it possible for the URI parameter name to change without affecting the client.
You can look at OpenSearch description documents (http://www.opensearch.org) to see how this is done in practice.
Actually, you should be able to leverage OpenSearch quite a bit for your use case. Especially the ability to predefine queries would allow you to describe particular subsets in your 'forms'.
But see for yourself and then ask back again :-)
Jan
I would recommend that you use this URL structure:
/appointments;user_id=1234;date=2010-02-22
Why? I chose /appointments because it is simple and clear. (If you have more than one kind of appointment, let me know in the comments and I can adjust my answer.) I chose the semicolons because they don't imply hierarchy between user_id and date.
One more thing, there is no reason why you should limit yourself to just one URL. It is just fine to have multiple URL structures that refer to the same resource. So you might also use:
/users/1234/appointments;date=2010-02-22
To return a similar result.
That said, I would not recommend using /dates/2010-02-22/appointments;user_id=1234. Why? I don't think, in practice, that /dates refers to a resource. Date is an attribute of an appointment but is not a noun on its own (i.e. it is not a first-class kind of thing).
I can relate to what David James answered.
The format of your URIs can be like he suggested:
/appointments;user_id=1234;date=2010-02-22
and / or
/users/1234/appointments;date=2010-02-22
while still maintaining the discoverability (at runtime) of your resource's URIs (like Jan Algermissen suggested).

REST Media type explosion

In my attempt to redesign an existing application using REST architectural style, I came across a problem which I would like to term as "Mediatype Explosion". However, I am not sure if this is really a problem or an inherent benefit of REST. To explain what I mean, take the following example
One tiny part of our application looks like:
collection-of-collections->collections-of-items->items
i.e the top level is a collection of collections and each of these collection is again a collection of items.
Also, each item has 8 attributes which can be read and written individually. Trying to expose the above hierarchy as RESTful resources leaves me with the following media types:
application/vnd.mycompany.collection-of-collections+xml
application/vnd.mycompany.collection-of-items+xml
application/vnd.mycompany.item+xml
Further more, since each item has 8 attributes which can be read and written to individually, it will result in another 8 media types. e.g. one such media type for "value" attribute of an item would be:
application/vnd.mycompany.item_value+xml
As I mentioned earlier, this is just a tiny part of our application and I expect several different collections and items that needs to be exposed in this way.
My questions are:
Am I doing something wrong by having these huge number of media types?
What is the alternative design method to avoid this explosion of media types?
I am also aware that the design above is highly granular, especially exposing individual attributes of the item and having separate media types for each them. However, making it coarse means I will end up transferring unnecessary data over the wire when in reality the client only needs to read or write a single attribute of an item. How would you approach such a design issue?
One approach that would reduce the number of media types required is to use a media type defined to hold lists of other media-types. This could be used for all of your collections. Generally lists tend to have a consistent set of behavior.
You could roll your own vnd.mycompany.resourcelist or you could reuse something like an Atom collection.
With regards to the specific resource representations like vnd.mycompany.item, what you can do depends a whole lot on the characteristics of your client. Is it in a browser? can you do code-download? Is your client a rich UI, or is it a data processing client?
If the client is going to do specific data processing then you pretty much need to stick with the precise media types and you may end up with a large number of them. But look on the bright side, you will have less media-types than you would have namespaces if you were using SOAP!
Remember, the media-type is your contract, if your application needs to define lots of contracts with the client, then so be it.
However, I would not go as far as defining contracts to exchange single attribute values. If you feel the need to do that, then you are doing something else wrong in your design. Distributed interface design needs to have chunky conversations, not chatty ones.
I think I finally got the clarification I sought for the above question from Ian Robinson's presentation and thought I should share it here.
Recently, I came across the statement "media type for helping tune the hypermedia engine, schema for structure" in a blog entry by Jim Webber. I then found this presentation by Ian Robinson of Thoughtworks. This presentation is one of the best that I have come across that provides a very clear understanding of the roles and responsibilities of media types and schema languages (the entire presentation is a treat and I highly recommend for all). Especially lookout for the slides titled "You've Chosen application/xml, you bstrd." and "Custom media types". Ian clearly explains the different roles of the schemas and the media types. In short, this is my take away from Ian's presentation:
A media type description includes the processing model that identifies hypermedia controls and defines what methods are applicable for the resources of that type. Identifying hypermedia controls means "How do we identify links?" in XHTML, links are identified based on tag and RDF has different semantics for the same. The next thing that media types help identify is what methods are applicable for resources of a given media type? A good example is ATOM (application/atom+xml) specification which gives a very rich description of hyper media controls; they tell us how the link element is defined? and what we can expect to be able to do when we dereference a URI so it actually tells something about the methods we can expect to be able to apply to the resource. The structural information of a resource represenation is NOT part of or NOT contained within the media type description but is provided as part of appropriate schema of the actual representation i.e the media type specification won’t necessarily dictate anything about the structure of the representation.
So what does this mean to us? simply that we dont need a separate media type for describing each resource as described above in my original question. We just need one media type for the entire application. This could be a totally new custom media type or a custom media type which reuses existing standard media types or better still, simply a standard media type that can be reused without change in our application.
Hope this helps.
In my opinion, this is the weak link of the REST concept. As an architectural and interface style, REST is outstanding and the work done by Roy F. and others has advanced the state of the art considerably. But there is an upper limit to what can be communicated (not just represented) by standard media types.
For people to understand and use your REST-ish API, they need to understand the meaning of the data. There are APIs where the media types tell most of the story; e.g. if you have a text-to-speech API, the input media type is text/plain and the output media type is audio / mp4, then someone familiar with the subject matter could probably make do. Text in, audio out, probably enough to go on in this case.
But many APIs can't communicate much of their meaning with just media type. Let's say you have an API that handles airline ticketing. The inputs and outputs will mostly be data. The media types on input and output of every API could be application/json or application/xml, so the media type doesn't transmit a lot of information. So then you would look at the individual fields in the inputs & outputs. Maybe there's a field called "price". Is that in dollars or pennies? USD or some other currency? I don't know how a user would answer those questions without either (a) very descriptive names, like "price_pennies_in_usd", or (b) documentation. Not to mention format conventions. Is an account number provided with or without dashes, must letters be all-caps and so on. There is no standard media type that defines these issues.
It's one thing when we're in situations where the client doesn't need a semantic understanding of the data. That works well. The fact that browsers can visually render any compliant document, and interact with any compliant resource, is really great. That's basically the "media" use case.
But it's entirely different when the client (or actually, the developer/user behind the client) needs to understand the semantics of the data. DATA IS NOT MEDIA. There is no way to explain data in all its real-world meaning and subtlety other than documenting it. This is the "data" use case.
The overly-academic definition of REST works in the media use case. It doesn't work, and needs to be supplemented with non-pure but useful things like documentation, for other use cases.
You're using the media type to convey details of your data that should be stored in the representation itself. So you could have just one media type, say "application/xml", and then your XML representations would look like:
<collection-of-collections>
<collection-of-items>
<item>
</item>
<item>
</item>
</collection-of-items>
<collection-of-items>
<item>
</item>
<item>
</item>
</collection-of-items>
</collection-of-collections>
If you're concerned about sending too much data, substitute JSON for XML. Another way to save on bytes written and read is to use gzip encoding, which cuts things down about 60-70%. Unless you have ultra-high performance needs, one of these approaches ought to work well for you. (For better performance, you could use very terse hand-crafted strings, or even drop down to a custom binary TCP/IP protocol.)
Edit One of your concerns is that:
making [the representation] coarse means I will end up transferring unnecessary data over the wire when in reality the client only needs to read or write a single attribute of an item
In any web service there is quite a lot of overhead in sending messages (each HTTP request might cost several hundred bytes for the start line and request headers and ditto for each HTTP response as in this example). So in general you want to have less granular representations. So you would write your client to ask for these bigger representations and then cache them in some convenient in-memory data structure where your program could read data from them many times (but be sure to honor the HTTP expiration date your server sets). When writing data to the server, you would normally combine a set of changes to your in-memory data structure, and then send the updates as a single HTTP PUT request to the server.
You should grab a copy of Richardson and Ruby's RESTful Web Services, which is a truly excellent book on how to design REST web services and explains things much more clearly than I could. If you're working in Java I highly recommend the RESTlet framework, which very faithfully models the REST concepts. Roy Fielding's USC dissertation defining the REST principles may also be helpful.
A media type should be seldomly created and time should be invested in making sure the format can survive change.
As you're relying on xml, there is no particular reason why you couldn't create one media type, provided that media type is described in one source.
Choosing ATOM over having one host media type that supports multiple root elements doesn't necessarily bring you anything: you'll still need to start reading the message within the context of a specific operation before deciding if enough information is present to process the request.
So i would suggest that you could happily have one media type, represented by one root element, and use a schema language to specify which of the elements can be contained.
In other words, a language like xsd can let you type your media type to support one of multiple root elements. There is nothing inherently wrong with application/vnd.acme.humanresources+xml describing an xml document that can take either or as a root element.
So to answer your question, create as few media types as you can possibly afford, by questioning if what you put in the documentation of the media type will be understandable and implementeable by a developer.
Unless you intend on registering these media types you should pick one of the existing mime types instead of trying to make up your own formats. As Jim mentions application/xml or text/xml or application/json works for most of what gets transmitted in a REST design.
In reply to Darrel here is Roy's full post. Aren't you trying to define typed resources by creating your own mime types?
Suresh, why isn't HTTP+POX Restful?