Quick question, but one that Google hasn't yielded any results for.
Whilst comparing different graph databases, I came across a useful slide deck on the internals of Neo4j. Slides 3-12 were particularly interesting, as they explain how the graph (nodes/edges/properties) is represented/serialized on disk.
I'm now looking closely at OrientDB as my likely DB of choice. Can anyone with knowledge of OrientDB's architecture please describe the internal representation it uses to store graph data, analogous to what the slides linked above show for Neo4j? Many thanks.
I am using Rexster on top of Titan to visualize the graph. I can view the vertices and get the edges for each of them, but how can I view the complete graph from Titan instead of just getting details for each vertex? I will have thousands of vertices and edges; if I have to view them vertex by vertex, it will be really tough. Please let me know if there is a way to visualize the whole graph.
Rexster's user interface isn't designed to visualize an entire graph. In fact, you will find that when graphs get sufficiently large, viewing the whole graph just isn't possible (or useful). For now though, let's set memory/usability issues aside to answer your question. You will want to use a different graph visualization tool, such as Gephi or Cytoscape, to visualize your graph. Both of those tools (and others you will find) can take GraphML as input, and using TinkerPop's reader/writer classes you can export your graph to that format (see the sketch below).
You asked about Rexster so I assume that you were using TinkerPop 2.x and Titan versions prior to 1.0. If you were using Titan 1.0 (and thus TinkerPop 3.x), you could use the Gephi Plugin for the Gremlin Console to help make the integration more seamless.
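For the TinkerPop 2.x case, the export itself is only a few lines. Here is a minimal sketch, assuming Titan 0.x with the Blueprints GraphMLWriter; the config file path and output file name are placeholders you would replace with your own:

    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import com.thinkaurelius.titan.core.TitanFactory;
    import com.tinkerpop.blueprints.Graph;
    import com.tinkerpop.blueprints.util.io.graphml.GraphMLWriter;

    public class ExportToGraphML {
        public static void main(String[] args) throws Exception {
            // Open the Titan graph; the config file path is an assumption.
            Graph graph = TitanFactory.open("conf/titan-cassandra.properties");
            try (OutputStream out = new FileOutputStream("graph.graphml")) {
                // Serialize the whole graph as GraphML, which Gephi and Cytoscape can import.
                GraphMLWriter.outputGraph(graph, out);
            }
            graph.shutdown();
        }
    }

You would then open graph.graphml in Gephi or Cytoscape instead of trying to browse it through Rexster.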
Most user interface patterns develop a performance issue at about the same time they develop a usability issue. Developers who ask "how can I stop my 100,000-element drop-down list from being so slow" may be missing the point that such a drop-down list is useless for an end user, and that they should instead switch to a different paradigm, such as a typeahead search.
Similarly, once your graph has more than a few hundred vertices in it, visualizing it in its entirety is nearly impossible. You must move to viewing aggregate statistics, simplified versions of the graph, or much more impressionistic sketches that are really only good for seeing clusters.
I come from an SQL background, where grasping the possible relationships between different models and schemas seems to be quite straightforward to me.
How can I shift the same thing to the MEAN world? For example, let's just assume I have a basic blog engine with a posts table and a comments table, where posts have many comments and each comment belongs to a post. While this is easy to code in, say, Rails, I'm getting stuck here and couldn't find good tutorials.
Also, I'm not sure if adding authors to the party is any more complicated - let's just say posts and comments each have an author, and the author has many comments and also has many posts (once I get this, I think highlighting "OP" comments is just a matter of a query).
Can you give me a guideline regarding the differences between what I've been used to in Rails and the approach I need now?
You are used to thinking in terms of normalization. NoSQL databases let you design your data model as structured documents, meaning you can denormalize your data. This has advantages such as data locality and atomicity, but can suffer from redundancy and inconsistency.
An example would be embedding the comments inside each post. That way you don't have several collections/tables, and you can access your data swiftly.
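To make that concrete for the blog example, here is a minimal sketch of the denormalized shape, shown with the MongoDB Java driver (3.7+) purely for illustration; the database, collection, and field names are all assumptions, and in a MEAN app you would build the same documents from Node/Mongoose:

    import java.util.Arrays;

    import org.bson.Document;

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;

    public class BlogExample {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> posts =
                        client.getDatabase("blog").getCollection("posts");

                // One denormalized document: the post embeds its comments, and each
                // comment carries a copy of its author's name (redundant on purpose).
                Document post = new Document("title", "My first post")
                        .append("author", new Document("name", "alice"))
                        .append("comments", Arrays.asList(
                                new Document("author", new Document("name", "bob"))
                                        .append("text", "Nice post!"),
                                new Document("author", new Document("name", "alice"))
                                        .append("text", "Thanks!")));
                posts.insertOne(post);

                // A single read returns the post together with all of its comments.
                System.out.println(posts.find(new Document("title", "My first post")).first());
            }
        }
    }

The flip side is the inconsistency mentioned above: if alice changes her name, every embedded copy of it has to be updated.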
I advise you to read the book MongoDB Applied Design Patterns to better understand the benefits you would gain.
I am completely new to neo4j, and I am very sorry for asking such a basic question. I have installed neo4j and I am using the shell at "localhost:7474/webadmin/#/console/".
I am looking for a good example that uses some shell commands to read from a pre-existing graph database, traverse and modify it, and then perform some queries, in order to learn it. I don't want to use any Java or Python; all I want is some command-line examples that will allow me to learn neo4j.
I searched a lot but could not find good sample code, except for one matrix example.
I appreciate any help.
One of the virtues of Neo4J is the excellent documentation and learning material it provides (especially compared to other graph-enabled DBs).
As mentioned, The Cypher Tutorial is a good starting point.
Then, as you learn the basics, check out The Neo4J Manual, which has detailed documentation on each and every Cypher language command (as well as much other interesting material).
Finally, when you start writing your own queries, keep a copy of the Cypher Cheat Sheet close at hand; it summarizes all the commands.
You can even take a look at Cypher without installing or running a Neo4J server, by going to the Neo4J Console and testing your queries online (you can even create links to them).
Caveat: when you start reading you may encounter Gremlin, a general graph query language also supported by Neo4J. It is quite awkward and very different from Cypher, so if you are going with Neo4J you should stick to Cypher: it has more features, and most of the development is done against it.
Cypher is your friend (there are several samples on this page):
http://www.neo4j.org/learn/cypher
Check out the Cypher-specific webinars:
http://watch.neo4j.org/
And finally, the Cypher cheatsheet:
http://neo4j.org/resources/cypher
I've got a table that I'd like to present. However, a lot of the information in it is only useful in aggregated or visual form.
For example, the country column by itself is boring, but aggregating all the entries for a country would be really useful. Coordinates are in there as well, so any solution should be able to present things on a map.
Note that the solution can be non-web, but I'd really prefer a web application everyone can access. What I've found so far is just the Google Maps API, but that's not very good at showing non-geographical information, is it?
Note that the table has a lot of dimensions, often nominal or ordinal (i.e. no numbers), so visual and plotting-focussed libraries are not that good.
EDIT: maybe this will help you, in the absence of other answers
Today, this article popped into my RSS reader: Patterns of Destruction?: Visualizing Earthquake Data w/Tableau.
The author uses Tableau to visualize his data, and also mentions Data Applied and GoodData.
Combine the Google Maps API with something like the Javascript Visualization Toolkit?
There are many libraries out there that might do the trick as well:
Raphael
Axis
...
I was wondering if there exist any open source frameworks that will help me include the following type of functionality in my website:
1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating, for example, what other people in my region (or with any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.
2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one another's posts, for example), use of the website in terms of areas most navigated, products bought in common, etc.
I am not looking for an open source website with this functionality, but something like an object model into which I can feed information about users and their use of the site, including rules about relationships, and then at a later point ask it the questions described in (1) and (2) above.
Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.
(I am the developer of Taste, which is now part of Apache Mahout)
1) You're really asking for two things here:
a) Recommend items I might like
b) Favor items that are similar to the thing I am currently looking at.
Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and ask any questions on mahout-user@apache.org.
For 1b) in particular, Mahout has two answers:
If you are only interested in which items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this can tell you the estimated similarity between any two items; you'd then just pick the most similar items. In fact, look at the TopItems class in Mahout, which can figure this out for you quickly.
But you can also combine a) and b) by computing recommendations and then applying a Rescorer implementation that favors items similar to the currently-viewed item.
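To make a) and b) concrete, here is a minimal sketch against the Taste API. The ratings file name and the user/item IDs are made up, and it uses GenericItemBasedRecommender.mostSimilarItems rather than calling TopItems directly (it amounts to the same lookup):

    import java.io.File;

    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;
    import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

    public class SimilarItemsExample {
        public static void main(String[] args) throws Exception {
            // ratings.csv holds lines of userID,itemID,rating; the file name is assumed.
            DataModel model = new FileDataModel(new File("ratings.csv"));
            ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
            GenericItemBasedRecommender recommender =
                    new GenericItemBasedRecommender(model, similarity);

            // b) the five items most similar to item 42
            for (RecommendedItem item : recommender.mostSimilarItems(42L, 5)) {
                System.out.println(item.getItemID() + " " + item.getValue());
            }

            // a) five personalized recommendations for user 1
            for (RecommendedItem item : recommender.recommend(1L, 5)) {
                System.out.println(item.getItemID() + " " + item.getValue());
            }
        }
    }

The Rescorer approach described above plugs into the same recommend call via its rescorer-accepting overload.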
2) Likewise, you would be interested in the UserSimilarity abstraction, its implementations, etc. This would deduce similarities based on item ratings. Mahout, however, does not help you deduce those ratings by, say, looking at user behavior. That is domain-specific and up to you.
Sound confusing? Read the docs, and feel free to follow up on mahout-user@apache.org, where I can tell you more.
I am researching the same topic, as I'm working on a project to help people decide how to vote on California's complicated ballot measures. Here are some open-source collaborative filtering engines that I've found:
Vogoo (PHP)
acts_as_recommendable (Ruby on Rails)
Mahout (formerly Taste) (Java)
There's also a good overview of these engines here.
There are also the Duine framework and OpenSlopeOne.
But in my opinion, Mahout is still the best.
You can find a survey about Open Source Recommender Systems here:
http://girlincomputerscience.blogspot.com.br/2012/11/open-source-recommendation-systems.html
Hope it helps!
You can find a List of Recommender Systems here