I am currently working on DBpedia datasets and would like to achieve the mapping between schema.org and DBpedia through wikidata. Hence I would like to know if there exists any mapping between schema.org and wikidata.
Not yet, I think, but there is some work in progress:
Schema.org should have mappings to Wikidata terms where possible
Related
I have to model a product, which has properties that aren't listed in the Schema.org Product type. After seeking in many places, I didn't find anything that fits to my need.
How can I extend the Schema.org Product type?
You could always use other vocabularies (that offer the properties you need) in addition to Schema.org. But if you want to use only the vocabulary Schema.org, you have two options in general:
Propose new properties (or classes).
You can do this on Schema.org W3C Community Group’s mailing list, or on Schema.org’s GitHub issue tracker.
See: How can I get involved? How can I propose new schemas or other improvements?
If accepted, it might become part of the core (if it’s something "the most common web applications need"), or it might become an extension.
(deprecated!) Extend existing properties.
Extending existing properties is documented at http://schema.org/docs/old_extension.html, but note that this mechanism is considered outdated.
For specific types (including Product), you can use Schema.org’s additionalProperty property:
A property-value pair representing an additional characteristics of the entitity, e.g. a product feature or another characteristic for which there is no matching property in schema.org.
I'm developing a semi-automatic annotation tool for medical texts and I am completely lost in finding the RDF triplets for annotation.
I am currently trying to use an NLP based approach. I have already looked into Stanford NER and OpenNLP and they both do not have models for extracting disease names.
My question is:
* How can I create a new NER model for extracting disease names? and can I get any help from the OpenNLP or Standford NERs?
* Is there another approach all-together - other than NLP - to extracting the RDF triplets from a text?
Any help would be appreciated! Thanks.
I have done something similar to what you need both with OpenNLP and LingPipe.
I found the exact dictionary-based chunking of LingPipe good enough for my use case and used that. Documentation available here: http://alias-i.com/lingpipe/demos/tutorial/ne/read-me.html
You can find a small demo here:
https://github.com/castagna/nerdf
If a gazetteer/dictionary approach isn't good enough for you, you can try creating your own model, OpenNLP has API for training models as well. Documentation is here: http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.namefind.training
Extracting RDF triples from natural language is a different problem than identify named entities. NER is a related and perhaps necessary step, but not enough. To extract an RDF statement from natural language not only you need to identify entities such as the subject and the object of a statement. But you also need to identify the verb and/or relationship of those entities and also you need to map those to URIs.
I'm starting to work with graph databases, and in my team we've started modeling a graph for our software. The problem comes when we try to "document" the model, to see the structure of our database. With SQL databases you only have to look at the SQL schema.
We've spent some time reading neo4j blogs and documentation, but we've seen that the usual way to show how a graph works is with a minimal graph showing some sample data (Random samples: sample1, sample2, etc). That's great for educational purposes, but we'd love to be able to do it in a little more formal way. We'd like to set what kind of node can relate with another one, and with what kind of relationship, that kind of stuff.
Using Spring you can wrap the graph with classes, but it's very specific to Java and OO model, and we're working with Erlang. We're looking for some kind of formal language (SQL Schema equivalent), or a E-R model equivalent, or something like that.
One way to do this is to put the "meta-model" of your graph (a type network) in the graph as well and then connect the instances (nodes) to their meta-model-type. So you can visualize the meta-model using the graph visualization and at the same time use the meta-model to enforce additional constraints (by storing constraint information in the meta-model and using that when the actual model is updated) and also use the type-nodes of the meta-model to quickly access all "instance"-nodes of this type.
What is the domain you want to model?
A quick idea - could you use a subset of UML? Graph modeling seems to be closer to the domain, so maybe that's reasonable.
What we do is a generalization of the "example data" approach, where we include cardinality on each side of a relationship, as well as type and direction. I also often include a node "type" in the diagram (or some other specification of it's role/relation to domain models) instead of example data, and of course note the expected properties, their types, and whether they are optional. It's less than formal, but has served well so far.
I have some documents and an ontology for some concepts. Are there any frameworks that automatically extracts those concepts from the given documents and creates triples? The ontology must contain special properties?
I found UIMA, but as far as I understood with UIMA I can do only something like this:
create some dictionaries which keep associations with the ontology
use this dictionary with ConceptMapper
write a CAS consumer that creates the triples and persists them -
I don't like this approach because I have to keep in sync the concepts from the ontology and the dictionary.
Can be UIMA used differently, or are there any advanced frameworks that can use directly my ontology with lets say some custom properties as input and based on it annotate the documents?
I want to use ontologies as domain model because I want to create further a knowledge base and ontologies seem more flexible than for example relational model.
Thanks.
After spending more time searching on Google I found GATE and more specifically OntoRoot Gazetter and Large KB Gazetteer.
OntoRoot Gazetteer is a type of a dynamically created gazetteer that is, in combination with few other generic GATE resources, capable of producing ontology-based annotations over the given content with regards to the given ontology. This gazetteer is a part of ‘Gazetteer_Ontology_Based’ plugin that has been developed as a part of the TAO project.
I didn't test them but these ones seem good solution candidates for my problem.
I'm working with a large hierarchical data set in sql server - modelled using the standard "EntityID, ParentID" kind of approach. There are about 25,000 nodes in the whole tree.
I often need to access subtrees of the tree, and then access related data that hangs off the nodes of the subtree. I built a data access layer a few years ago based on table-valued functions, using recursive queries to fetch an arbitrary subtree, given the root node of the subtree.
I'm thinking of using Entity Framework, but I can't see how to query hierarchical data like
this. AFAIK there is no recursive querying in Linq, and I can't expose a TVF in my entity data model.
Is the only solution to keep using stored procs? Has anyone else solved this?
Clarification: By 25,000 nodes in the tree I'm referring to the size of the hierarchical dataset, not to anything to do with objects or the Entity Framework.
It may the best to use a pattern called "Nested Set", which allows you to get an arbitrary subtree within one query. This is especially useful if the nodes aren't manipulated very often: Managing hierarchical data in MySQL.
In a perfect world the entity framework would provide possibilities to save and query data using this data pattern.
Everything IS possible with Entity Framework but you have to hack and slash your way in to it. The database I am currently working against has too many "holder tables" since Points for instance is shared with both teams and users. Both users and teams can also have a blog.
When you say 25 000 nodes do you mean navigational properties? If so I think it could be tricky to get the data access in place. It's not hard to navigate, search etc with entity framework but I tend to model on paper then create the database based on how I want to navigate while using entity framework. Sounds like you don't have that option.
Thanks for these suggestions.
I'm beginning to realise that the answer is to remodel the data in the database - either along the lines of nested sets as Georg suggests, or maybe a transitive closure table, which I've just come across.
That way, I'm hoping to get two key benefits:
a) faster querying aginst arbitrary subtrees
b) a data model which no longer requires recursive querying - so perhaps bringing it within easy reach of the Entity Framework!
It's always amazing how so often the right answer to a difficult problem is not to answer it, but to do something else instead!