I'm developing a semi-automatic annotation tool for medical texts and I am completely lost when it comes to extracting the RDF triples for annotation.
I am currently trying an NLP-based approach. I have already looked into Stanford NER and OpenNLP, and neither has models for extracting disease names.
My question is:
* How can I create a new NER model for extracting disease names? And can I get any help from the OpenNLP or Stanford NERs?
* Is there another approach altogether, other than NLP, for extracting the RDF triples from a text?
Any help would be appreciated! Thanks.
I have done something similar to what you need, both with OpenNLP and LingPipe.
I found LingPipe's exact dictionary-based chunking good enough for my use case and used that. Documentation is available here: http://alias-i.com/lingpipe/demos/tutorial/ne/read-me.html
You can find a small demo here:
https://github.com/castagna/nerdf
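For illustration, the dictionary-based chunking looks roughly like this with the LingPipe API (a minimal sketch; the dictionary entries and the DISEASE type are placeholders you would replace with your disease vocabulary):

```java
import com.aliasi.chunk.Chunk;
import com.aliasi.chunk.Chunking;
import com.aliasi.dict.DictionaryEntry;
import com.aliasi.dict.ExactDictionaryChunker;
import com.aliasi.dict.MapDictionary;
import com.aliasi.tokenizer.IndoEuropeanTokenizerFactory;

public class DiseaseChunkerDemo {
    public static void main(String[] args) {
        // Build a dictionary mapping disease names to a chunk type.
        MapDictionary<String> dictionary = new MapDictionary<String>();
        dictionary.addEntry(new DictionaryEntry<String>("diabetes mellitus", "DISEASE", 1.0));
        dictionary.addEntry(new DictionaryEntry<String>("myocardial infarction", "DISEASE", 1.0));

        // Exact, case-insensitive matching over the token stream.
        ExactDictionaryChunker chunker = new ExactDictionaryChunker(
                dictionary,
                IndoEuropeanTokenizerFactory.INSTANCE,
                true,   // return all matches
                false); // case-insensitive

        String text = "The patient has a history of diabetes mellitus.";
        Chunking chunking = chunker.chunk(text);
        for (Chunk chunk : chunking.chunkSet()) {
            System.out.println(text.substring(chunk.start(), chunk.end())
                    + " -> " + chunk.type());
        }
    }
}
```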
If a gazetteer/dictionary approach isn't good enough for you, you can try creating your own model; OpenNLP has an API for training models as well. The documentation is here: http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.namefind.training
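Training a custom model then looks roughly like this, assuming the OpenNLP 1.5.x API from the manual linked above and a training file disease.train (a name I made up) with one sentence per line and diseases marked with <START:disease> ... <END> tags:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.util.Collections;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public class TrainDiseaseModel {
    public static void main(String[] args) throws Exception {
        // Training data: one sentence per line, entities marked like
        //   ... <START:disease> diabetes mellitus <END> ...
        ObjectStream<String> lines = new PlainTextByLineStream(
                new FileInputStream("disease.train"), Charset.forName("UTF-8"));
        ObjectStream<NameSample> samples = new NameSampleDataStream(lines);

        // Train with the default parameters, as in the manual's example.
        TokenNameFinderModel model = NameFinderME.train(
                "en", "disease", samples, TrainingParameters.defaultParams(),
                null, Collections.<String, Object>emptyMap());
        samples.close();

        OutputStream out = new FileOutputStream("en-ner-disease.bin");
        model.serialize(out);
        out.close();
    }
}
```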
Extracting RDF triples from natural language is a different problem from identifying named entities. NER is a related, and perhaps necessary, step, but it is not enough. To extract an RDF statement from natural language, you not only need to identify entities such as the subject and the object of the statement; you also need to identify the verb and/or relationship between those entities, and you need to map all of these to URIs.
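Once you have resolved the subject, the relationship and the object to URIs, emitting the actual triple is the easy part, e.g. with Apache Jena (pre-3.x package names; all the URIs below are made-up placeholders):

```java
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;

public class TripleDemo {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();

        // Hypothetical URIs that NER + entity linking would have produced.
        Resource patient = model.createResource("http://example.org/patient/42");
        Property diagnosedWith = model.createProperty("http://example.org/vocab#diagnosedWith");
        Resource disease = model.createResource("http://example.org/disease/diabetes-mellitus");

        model.add(patient, diagnosedWith, disease);
        model.write(System.out, "TURTLE");
    }
}
```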
Excuse the incorrect terminology on this one....
So, we are heavily using Breeze in our application to work with our REST resources, and for the most part it is fine.
However, we are typically using the following pattern as a standard:
/entity/{entityId}/anotherEntity/{anotherEntityId}/andAnotherEntity
Defining everything in metadata, we use navigation properties and foreign keys to bind this all together, and it works beautifully. We also extend the Breeze Labs Abstract REST DataServiceAdapter in our application.
But we are now thinking that, to keep the API in a cleaner shape, we may want to 'group' some of the resources into more logical places.
This would then mean the pattern would look something like:
/entity/{entityId}/entitySpecificGrouping/anotherEntity/{anotherEntityId}/andAnotherEntity
So, my questions here:
Should we even try to do this? Is it RESTful?
How do we go about this in Breeze metadata?
I appreciate any input on this matter.
I have an Ecore model saved to file.
What I want to do is modify the Ecore model (add elements, supertypes, attributes, delete attributes...).
But I don't want to do it by hand; I want a script / M2M transformation.
What language or tool would you use?
What you want is to transform your Ecore model into another Ecore model. You should have a look at the model transformation projects of the Eclipse Foundation. I would recommend ATL among those projects, as it's easy to master for your need (you do not seem to need very complex transformations). If you need a really simple transformation, I'd suggest doing it directly in Java, as it may be easier for you to integrate into your workflow.
Full disclosure: I work for one of the companies contributing to ATL.
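If you go the plain-Java route, a minimal sketch with the EMF API could look like this (the file name and the example rule of adding an "id" attribute to every EClass are just for illustration):

```java
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.EAttribute;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EClassifier;
import org.eclipse.emf.ecore.EPackage;
import org.eclipse.emf.ecore.EcoreFactory;
import org.eclipse.emf.ecore.EcorePackage;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.emf.ecore.resource.impl.ResourceSetImpl;
import org.eclipse.emf.ecore.xmi.impl.EcoreResourceFactoryImpl;

public class EcoreModifier {
    public static void main(String[] args) throws Exception {
        // Load the .ecore file.
        ResourceSet rs = new ResourceSetImpl();
        rs.getResourceFactoryRegistry().getExtensionToFactoryMap()
          .put("ecore", new EcoreResourceFactoryImpl());
        Resource resource = rs.getResource(URI.createFileURI("model.ecore"), true);
        EPackage pkg = (EPackage) resource.getContents().get(0);

        // Example rule: add a String attribute "id" to every EClass.
        for (EClassifier classifier : pkg.getEClassifiers()) {
            if (classifier instanceof EClass) {
                EAttribute id = EcoreFactory.eINSTANCE.createEAttribute();
                id.setName("id");
                id.setEType(EcorePackage.Literals.ESTRING);
                ((EClass) classifier).getEStructuralFeatures().add(id);
            }
        }

        resource.save(null);
    }
}
```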
Can you be more precise please?
I understand you want to modify your source model; however, I suppose you want to do it according to some rules. Is that true? Can you give an example of these rules?
That said, I would recommend that you start with Epsilon. It is a powerful language that allows you to define a variety of model-to-model and model-to-text transformations.
Since you're writing a model-to-model transformation, you should use Java, as nothing beats Java for the sorts of navigation, iteration and fine-grained access that you'll need. If you wanted to generate code from the model, I'd suggest one of the templating languages, however.
I'm starting to work with graph databases, and in my team we've started modeling a graph for our software. The problem comes when we try to "document" the model, to see the structure of our database. With SQL databases you only have to look at the SQL schema.
We've spent some time reading neo4j blogs and documentation, but we've seen that the usual way to show how a graph works is with a minimal graph showing some sample data (random samples: sample1, sample2, etc.). That's great for educational purposes, but we'd love to be able to do it in a slightly more formal way. We'd like to specify what kind of node can relate to which other one, and with what kind of relationship, that sort of thing.
Using Spring you can wrap the graph with classes, but that's very specific to Java and the OO model, and we're working with Erlang. We're looking for some kind of formal language (an SQL schema equivalent), or an E-R model equivalent, or something like that.
One way to do this is to put the "meta-model" of your graph (a type network) in the graph as well, and then connect the instances (nodes) to their meta-model type. That way you can visualize the meta-model using the graph visualization, use the meta-model to enforce additional constraints (by storing constraint information in the meta-model and consulting it when the actual model is updated), and use the type nodes of the meta-model to quickly access all "instance" nodes of a given type.
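A rough sketch of that idea with the Neo4j 1.x-era embedded Java API (the property names and the INSTANCE_OF / CAN_RELATE_TO relationship types are just illustrative; the same structure works over the REST API from Erlang):

```java
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class MetaModelDemo {
    // Relationships wiring instance nodes to their meta-model type nodes.
    private static final RelationshipType INSTANCE_OF =
            DynamicRelationshipType.withName("INSTANCE_OF");
    private static final RelationshipType CAN_RELATE_TO =
            DynamicRelationshipType.withName("CAN_RELATE_TO");

    public static void main(String[] args) {
        GraphDatabaseService db =
                new GraphDatabaseFactory().newEmbeddedDatabase("data/graph.db");
        Transaction tx = db.beginTx();
        try {
            // Meta-model: two type nodes plus a constraint between them.
            Node userType = db.createNode();
            userType.setProperty("metaType", "User");
            Node orderType = db.createNode();
            orderType.setProperty("metaType", "Order");
            userType.createRelationshipTo(orderType, CAN_RELATE_TO);

            // An instance node, linked to its type.
            Node alice = db.createNode();
            alice.setProperty("name", "Alice");
            alice.createRelationshipTo(userType, INSTANCE_OF);

            tx.success();
        } finally {
            tx.finish();
        }
        db.shutdown();
    }
}
```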
What is the domain you want to model?
A quick idea - could you use a subset of UML? Graph modeling seems to be closer to the domain, so maybe that's reasonable.
What we do is a generalization of the "example data" approach, where we include cardinality on each side of a relationship, as well as type and direction. I also often include a node "type" in the diagram (or some other specification of its role/relation to domain models) instead of example data, and of course note the expected properties, their types, and whether they are optional. It's less than formal, but it has served us well so far.
I have some documents and an ontology for some concepts. Are there any frameworks that automatically extract those concepts from the given documents and create triples? Does the ontology have to contain special properties?
I found UIMA, but as far as I understand, with UIMA I can only do something like this:
* create some dictionaries which keep associations with the ontology
* use this dictionary with ConceptMapper
* write a CAS consumer that creates the triples and persists them (see the sketch below)
I don't like this approach because I have to keep the concepts from the ontology and the dictionary in sync.
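For what it's worth, the CAS-consumer step would look roughly like this (a sketch only: modern UIMA lets you write it as an ordinary annotator, DiseaseMention is a hypothetical annotation type that the ConceptMapper step would produce, and the URIs and feature names are placeholders):

```java
import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.Type;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;

public class TripleWriter extends JCasAnnotator_ImplBase {

    private final Model model = ModelFactory.createDefaultModel();
    private final Property mentions =
            model.createProperty("http://example.org/vocab#mentions");

    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        // Hypothetical annotation type produced by the ConceptMapper step,
        // carrying the matched concept's URI in a "conceptURI" feature.
        Type diseaseType = jcas.getTypeSystem()
                .getType("org.example.DiseaseMention");
        Feature uriFeature = diseaseType.getFeatureByBaseName("conceptURI");

        Resource doc = model.createResource("http://example.org/doc/1");
        FSIterator<Annotation> it =
                jcas.getAnnotationIndex(diseaseType).iterator();
        while (it.hasNext()) {
            Annotation mention = it.next();
            Resource concept =
                    model.createResource(mention.getStringValue(uriFeature));
            model.add(doc, mentions, concept);
        }
        // A real component would persist the model, e.g. in
        // collectionProcessComplete(); printing is just for the demo.
        model.write(System.out, "N-TRIPLE");
    }
}
```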
Can UIMA be used differently, or are there any more advanced frameworks that can directly take my ontology (with, let's say, some custom properties) as input and annotate the documents based on it?
I want to use ontologies as the domain model because I want to go on to create a knowledge base, and ontologies seem more flexible than, for example, the relational model.
Thanks.
After spending more time searching on Google I found GATE, and more specifically the OntoRoot Gazetteer and the Large KB Gazetteer.
OntoRoot Gazetteer is a type of dynamically created gazetteer that is, in combination with a few other generic GATE resources, capable of producing ontology-based annotations over the given content with regard to the given ontology. This gazetteer is part of the ‘Gazetteer_Ontology_Based’ plugin that was developed as part of the TAO project.
I didn't test them, but they seem like good candidates for solving my problem.
I Googled for tutorials and documentation on Entity Framework and read a couple of articles, and I also referred to the MSDN documentation, but I am still not able to understand it clearly.
The little I have followed is this:
(1) Each table, along with its rows, is considered a single unit.
(2) It provides a way to handle a sudden table name change without affecting the application.
(3) It reduces a lot of code.
Can someone explain it to me in an easier way, with illustrations? Please don't be too technical.
Check out:
Entity Framework Overview
Intro to Entity Framework with SQL Server
Beginner's Guide to Entity Framework (has lots of articles, videos, etc.)
It's rather hard to find something that's not too technical and just shows nice graphical representations.
But basically you have three "layers" inside an EF model:
* the physical database model - what tables and columns do you have?
* the conceptual model - the business objects/entities you want to work with (which can be very similar to, or quite different from, your physical model)
* the mapping layer that defines the mappings between those two worlds