Advantage of ontology over RDBMS in an offline application

Is there any advantage of using ontology based database (linked data) instead of RDBMS in an offline application? Does linked data provide more relations and reasoning capabilities using SPARQL than SQL? Can I not achieve the same using joins in SQL?
Suppose I am storing the details of various mobile phones. This database should answer user-centric queries like:
1. List of all mobiles with a good (quantified) touch interface
2. Mobiles similar to the Samsung Galaxy S4
Can I not retrieve efficient results using an RDBMS itself with joins? If the answer is yes, then would the performance of answering these queries be the deciding factor between the two database models? Basically, what is the edge that I get by using ontologies in such scenarios?

The main advantage of using ontologies is the formalized semantics. A reasoner can automatically infer new statements without you having to write specific code.
But it is true that you can model any Linked Data in an RDBMS and the other way around. The same is true for querying with SPARQL or SQL: you can achieve the same results. SPARQL has an advantage when your SQL query requires multiple joins, since the same pattern can be expressed far more concisely in SPARQL.
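To make the comparison concrete, here is a sketch of the phone example from the question in both query languages. All table, column, and vocabulary names below are made up for illustration; adjust them to your actual schema or ontology.

```sql
-- Hypothetical schema: phone(id, name), feature(id, name),
-- phone_feature(phone_id, feature_id, rating).
-- "All mobiles with a well-rated touch interface" needs two joins in SQL:
SELECT p.name
FROM phone p
JOIN phone_feature pf ON pf.phone_id = p.id
JOIN feature f        ON f.id = pf.feature_id
WHERE f.name = 'touch interface'
  AND pf.rating >= 4;
```

The same question as a SPARQL graph pattern (the `ex:` vocabulary is invented):

```sparql
PREFIX ex: <http://example.org/>
SELECT ?phone WHERE {
  ?phone   ex:hasFeature ?feature .
  ?feature ex:name   "touch interface" ;
           ex:rating ?rating .
  FILTER (?rating >= 4)
}
```

Both return the same answer; the SPARQL version just states the relationship path directly instead of spelling out join conditions.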
The disadvantage of ontology-based databases today is that they still lag behind RDBMSs in performance.

Related

Using Apacheage with PostgreSQL?

I have just started with Apache AGE and am wondering what the main differences are between using PostgreSQL alone and using Apache AGE with PostgreSQL for data processing. I know Apache AGE is an extension that adds a graph database to Postgres, but what is the benefit of using Apache AGE with Postgres?
Apache AGE is an extension for PostgreSQL that enables users to leverage a graph database on top of their existing relational database. AGE is an acronym for "A Graph Extension", and it originated as part of a multi-model database fork of PostgreSQL. The basic principle of the project is to create a single store that handles both the relational and the graph data model, so that users can use standard SQL alongside openCypher, one of the most popular graph query languages today.
For more information you can visit the Apache AGE GitHub repository.
PostgreSQL is a relational database management system (RDBMS). AGE is an extension of PostgreSQL that adds the functionality of a graph database. With PostgreSQL alone we cannot create a graph, add nodes to it, and get that functionality, and that is why we use Apache AGE with PostgreSQL.
Apache AGE basically enhances PostgreSQL's relational database capabilities by incorporating graph database features. Data can be stored, accessed, and analyzed as a graph using Apache AGE, which is especially helpful for large, interconnected data sets. Using AGE, users may model and query relationships between data by using graph database features including nodes, edges, and properties.
Also, AGE integrates with PostgreSQL's SQL engine, which means that users can leverage their existing knowledge of SQL to query and analyze graph data. For visualization you can use Apache AGE Viewer.
AGE also supports many of PostgreSQL's advanced SQL features, such as window functions and CTEs (common table expressions).
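A minimal sketch of what this looks like in practice: AGE exposes graphs through the `cypher()` function, which is called from ordinary SQL. The graph name, labels, and property values below are all made up for illustration.

```sql
-- Load the extension and put its catalog on the search path:
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Create a graph and two connected nodes (all names are hypothetical):
SELECT create_graph('phones');
SELECT * FROM cypher('phones', $$
    CREATE (:Phone {name: 'Galaxy S4'})-[:SIMILAR_TO]->(:Phone {name: 'Galaxy S3'})
$$) AS (result agtype);
```

The openCypher statement runs inside a regular SQL query, which is exactly the "single storage, two models" idea described above.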
You can check their website for more details.
Although the other answers are essentially correct, I want to provide a bit of context.
1. Apache AGE is a powerful open-source extension of Postgres that adds graph database functionality to the relational database.
To understand this better, you should know what graph databases are. In short, you can leverage open-source extensions like Apache AGE to extend Postgres's capabilities and model complex relationships in your data.
This combination is particularly useful in scenarios where data is both structured and interconnected, such as social networks, recommendation engines, or fraud detection systems.
The following use cases of Apache AGE should further clear things up. I hope this helps! Let me know if you have any additional questions.
Use Cases of Apache Age:
Ability to store and query graph data using SQL
Combining the strengths of both graph databases and relational databases
Efficiently managing structured and interconnected data
Finding insights and relationships that might be difficult to find using traditional SQL queries alone.
Using Apache Age with PostgreSQL can provide several benefits, such as:
Graph Database Functionality: With Apache AGE, users can add graph database functionality to their PostgreSQL database. This lets them model and store data in a way that is better suited to graph data than traditional relational structures.
Improved Querying: Apache AGE supports openCypher, a query language designed specifically for graph data. This can make it easier to query complex, interconnected data and can give better performance for certain types of queries.
Integration with Existing PostgreSQL Systems: Because Apache AGE is a PostgreSQL extension, it integrates seamlessly with existing PostgreSQL systems. Users can keep their existing tools and interfaces and easily incorporate graph database functionality into their workflows. There are many more benefits besides.
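As a sketch of the querying side, here is how an openCypher pattern match can be combined with plain SQL in the same statement. This assumes a graph named `phones` containing `Phone` nodes linked by `SIMILAR_TO` edges; all of these names are invented for the example.

```sql
-- Match inside the graph with openCypher, then sort the
-- result set with ordinary SQL:
SELECT similar
FROM cypher('phones', $$
    MATCH (:Phone {name: 'Galaxy S4'})-[:SIMILAR_TO]->(q:Phone)
    RETURN q.name
$$) AS (similar agtype)
ORDER BY similar;
```

The graph query produces rows like any other table expression, so existing SQL skills (filtering, sorting, joining against relational tables) carry over directly.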

OLAP and postgresql- tool or methodology?

I was reviewing some documents on making my database perform better and came across the term "OLAP" pre-aggregation. I was wondering whether OLAP is a tool, a methodology, or an approach. For example, my DBMS is PostgreSQL and I am working on a big database. To speed things up I have to use some aggregation and pre-aggregation methods. How can OLAP be helpful?
OLAP is a database role. When storing OLAP data in the db, you typically aren't running live transactional workloads off it, but rather keeping it around for analytical and business-intelligence purposes.
It isn't a tool. It isn't an approach either: some techniques are specific to OLAP, while others are just as helpful in transactional environments.
In general you shouldn't think about speeding up an application by incorporating OLAP into it. Instead, look at separating reporting functions onto a separate db server, importing the data periodically, then separating data feeds from operational data stores, and so on. This is a very different field from transactional application development.
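Since the question mentions pre-aggregation specifically: one common way to pre-aggregate in PostgreSQL is a materialized view. The fact table and columns below are hypothetical.

```sql
-- Hypothetical fact table: sales(sold_at timestamp, amount numeric).
-- Pre-aggregate to daily totals so reports don't scan the raw rows:
CREATE MATERIALIZED VIEW daily_sales AS
SELECT date_trunc('day', sold_at) AS day,
       sum(amount)                AS total
FROM sales
GROUP BY 1;

-- Refresh periodically, e.g. from a nightly job:
REFRESH MATERIALIZED VIEW daily_sales;
```

Reporting queries then read `daily_sales` instead of the transactional table, which is the core idea behind OLAP-style pre-aggregation regardless of which tool you use.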

Java ORMs on NoSQL DB like HBase

I have recently started getting familiarized with NoSQL (HBase). I am definitely a noob.
I was investigating about ORMs and high level clients which can be used on HBase and came across a few.
Some ORM libraries like Kundera provide SQL-like data query functionality. I am finding this a little counterintuitive.
Can any one help me understand why we would again need SQL like querying if the whole objective was to move away from it?
Also can anyone comment on your experiences with ORMs for HBase? I looked at a few of them from http://wiki.apache.org/hadoop/SupportingProjects and started looking at Kundera.
Another related question - Does data query with Kundera run map reduce jobs internally?
Kundera or Spring Data might provide a user-friendly ORM layer over NoSQL databases, but the underlying entity model still has to be NoSQL-friendly. This means that NoSQL users should not blindly follow RDBMS modeling strategies, but should design ORM entities in a way that lets all NoSQL capabilities be used.
As a rule of thumb, Kundera ORM entities should be designed query-first: define the queries first, derive the primary keys from them, and keep the relationship model as minimal as possible. Querying on random columns and full scans should be avoided, so data might have to be replicated across entities to reduce multiple entity look-ups. Transaction management also needs to be planned; FYI, Kundera does not support transactions (beyond the single-row transactions supported by HBase/Cassandra).
Reason for using Kundera:
1) If you are looking for SQL-like support over HBase. As Kundera is built on top of the HBase native API, it simply transforms these SQL-like queries into the corresponding GET or PUT calls.
2) Currently it supports HBase 0.20.6 only. Kundera 2.0.6 will enable support for the HBase 0.90.x versions.
3) Kundera does not do anything out of the box to provide map-reduce over SQL-like queries. However, support for this will be provided in Kundera 2.0.6 by enabling support for Hive native queries only!
It is totally JPA compliant, so there is no need to learn something new. It simply hides complexity from the developer with very minimal effort.
SQL-like querying is for ease of development, quick development, fewer errors, and reusability, of course!
-Vivek

Migrating an ORM based Spring project (Hibernate / JPA) to noSQL (MongoDB/Cassandra/CouchDB)

I am fairly new to the noSQL world, and although I understand the benefits of performance and "cloud" friendliness, the RDBMS world seems much simpler, more standardized, and limited to fewer players.
I have worked with SQL Server, Oracle, DB2, Sybase, Teradata, MySQL and others, and they seem to have much more in common (in terms of query language, indexing, ACID, etc.) than the noSQL family does.
My question is mainly this
Is it at all a valid concept to move an existing Spring/Java EE + JPA app to noSQL storage, or will it require a complete re-architecture of the system beyond the medium of storage?
Hoping it's a valid goal, are there any migration paths that were case studied as best practices?
Is there an equivalent to the concept in "noSQL" that is comparable to ORM for RDBMS? e.g. any layer of separating the storage implementation from concept (I know GAE BigTable supports JDO and JPA but only partially, is there a newer JSR for a more noSQL friendly JPA?)
Are there any attempts to standardize "noSQL" the same way RDBMSs are (query language, API)?
Is "noSQL" too wide a term? Should I ask the question per implementation type (key-value/document)?
Try Kundera: https://github.com/impetus-opensource/Kundera. It is an open-source JPA 2.0 compliant solution. You can also join http://groups.google.com/group/kundera-discuss/subscribe for further discussion.
-Vivek
DataNucleus allows JPA persistence to RDBMS, MongoDB, HBase and various others. That is one way you can tackle the problem, giving you a starting point for using your app with other datastores. From there you could modify class hierarchies to get around some of the problems these other datastores bring. Use of JPA with other datastores is not part of any JSR and never will be, since JPA is designed solely around RDBMS. JDO, on the other hand, is already a standard for all datastores, as it was designed to be (and is also supported by DataNucleus).
EclipseLink 2.4 supports JPA with MongoDB and other NoSQL data sources.
http://java-persistence-performance.blogspot.com/2012/04/eclipselink-jpa-supports-mongodb.html
PlayOrm is another solution with its Scalable-SQL; it is JPA-like but breaks from JPA in that it supports noSQL patterns that JPA cannot.

Jira using enterprise architecture by OfBiz

The 'Open For Business' project (OFBiz) is an enterprise framework.
It so happens Jira uses this, and I was pretty shocked at how much work is involved to pull data for a particular entity (say an issue/bug in Jira's case).
Imagine getting a list of all the issues: it first has to get all the columns (or properties) to display for the table, then pull in the values for each. For an enterprise solution this sounds sub-optimal (but I understand how it adds flexibility).
You can read how its used in Jira practically: http://confluence.atlassian.com/display/JIRA/Database+Schema
main site: http://ofbiz.apache.org/docs/entity.html
I'm just confused as to how to list all issues. Meaning, what would the SQL queries look like?
It's one thing to pull a single issue, but to get a list you have to do a lot of work to get the values. I don't think it can be done with a single query using joins, can it?
(Disclaimer: I work for Atlassian, but I'm not on the JIRA team)
OFBiz EE is just an abstraction layer for moving between database tables and fancy maps called GenericValues. It has no influence over the database schema itself. Your real issue here seems to be that JIRA's database schema is complicated.
The reason it's complicated is because it has to support a data model where an issue is an arbitrary collection of arbitrary fields, at some point in an arbitrary workflow. The fields themselves can be defined by third-party plugins. It's very hard to produce a friendly-looking RDBMS schema to fit this kind of dynamic data model, and JIRA tries as best it can.
You can get information directly out of the database if you want (the database schema is documented in the link above), or you can go up a layer or twelve of abstraction and talk through one of JIRA's many APIs.
A good place to ask questions about getting data out of JIRA is the forums on http://forums.atlassian.com/
The entity engine used in JIRA is a database abstraction layer (with a very rich and easy-to-use API) that connects your application with one or more datasources. But the databases are still relational, so you can use SQL if you want to. As for the issue info you want to pull, I'd say it wouldn't be very easy with joins alone. I'd recommend using the procedural language of the RDBMS (e.g. PL/SQL, PL/pgSQL).
SELECT * FROM jiraissue;
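To go one step further than listing raw issues, you can join out to related tables. This is only a sketch: JIRA's table and column names vary between versions, so check the schema documentation linked above before relying on any of the names used here.

```sql
-- Sketch only: names like pname, summary and stringvalue may differ
-- in your JIRA version. Issues with their project name and one
-- custom-field value per row:
SELECT p.pname          AS project,
       i.summary        AS issue,
       cv.stringvalue   AS custom_value
FROM jiraissue i
JOIN project p ON p.id = i.project
LEFT JOIN customfieldvalue cv ON cv.issue = i.id
ORDER BY p.pname;
```

Note that because custom-field values live in a separate generic table, an issue with several custom fields comes back as several rows, which is exactly the "lots of work to get the values" effect described in the question.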