Storing Sleuth messages in a NoSQL store - spring-cloud

I would like to store the Spring Cloud Sleuth messages that arrive at my Zipkin server (a la @EnableZipkinStreamServer) in a NoSQL store.
I know that the original Zipkin implementation used Cassandra, which would work for me, but I am curious about, say, MongoDB or Couchbase.
I looked in the documentation (http://cloud.spring.io/spring-cloud-sleuth/spring-cloud-sleuth.html#_zipkin_consumer) and saw that you can use spring-boot-starter-jdbc for this, and that it will write to the MySQL DB instance that you define.
I don't see an API/SPI for putting this into anything else. Is there one, and if so, what is it? I can implement the NoSQL portion myself.

Related

NoSQL Database for Blog / Content Management System? (MongoDB / Cassandra)

My company has used Oracle for a long time, but we would like to find a NoSQL database as a replacement, for faster querying and flexible schema design.
I have tried MongoDB, which is probably the most popular NoSQL database nowadays. I connected it to Spring Data to run some simple queries; it was quite easy to set up and to code against. Since we are using Spring MVC for web development, Spring Data seems well suited for integration.
However, I have heard that Cassandra has better write and read performance, especially in large-scale systems. I am not sure whether it is worth moving to Cassandra, or how to compare the performance of MongoDB and Cassandra.
Here are some requirements for my system:
focusing on article fetching
tagging of articles, so users can easily search for their favorites or related articles
non-distributed system, but with load balancing and fail-over
Java based, Spring MVC for web development
articles would be stored as XML
probably provide user-defined tables (collections) and fields (keys)
Therefore I would like to raise some questions:
Which Database is the most suitable for my case? You may also raise other databases apart from MongoDB and Cassandra.
If I use Cassandra, which framework would be suitable for integrating with Spring MVC?
Thank you so much in advance.
I have experience using Spring and Cassandra together, but I have always written my own data access layer.
Using the ORMs out there for Cassandra will not allow you to leverage its full power, and you will, most likely, introduce bugs because your SQL background will make you expect certain behaviours that are just not what Cassandra will give you.
My advice: write the code that accesses Cassandra yourself, and do not be afraid to denormalize A LOT. Think more about how you want to query (or find) your data than about the format in which you want to save it.
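A minimal sketch of the query-first, denormalized approach described above, in plain Python with dictionaries standing in for Cassandra tables. The table names (`articles_by_id`, `articles_by_tag`) and the helpers `save_article` and `find_by_tag` are hypothetical and not part of any Cassandra driver; the only point is that each article is written once per query pattern.

```python
from collections import defaultdict

# Two in-memory "tables", one per query we want to serve.
# Hypothetical names; in Cassandra each would be its own table.
articles_by_id = {}
articles_by_tag = defaultdict(list)


def save_article(article_id, title, body_xml, tags):
    """Write the article once per query pattern (denormalized)."""
    row = {"id": article_id, "title": title, "body_xml": body_xml, "tags": list(tags)}
    articles_by_id[article_id] = row
    for tag in tags:
        # Duplicate the fields needed for listing, so reading a tag
        # never requires a second lookup or a join.
        articles_by_tag[tag].append({"id": article_id, "title": title})


def find_by_tag(tag):
    """Serve the 'articles for a tag' query from its own table."""
    return list(articles_by_tag.get(tag, []))


save_article(1, "Intro to NoSQL", "<article>...</article>", ["nosql", "databases"])
save_article(2, "Cassandra basics", "<article>...</article>", ["nosql", "cassandra"])
titles = [a["title"] for a in find_by_tag("nosql")]
```

The trade-off is that every write touches several tables, which is usually acceptable when reads (article fetching, tag search) dominate.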
I also strongly recommend reading this amazing article: Cassandra Data Modeling Best Practices part 1 part 2
Another DB which might suit your application better is CouchDB (I like using BigCouch). It is another document-based NoSQL database and is, in my opinion, superior to MongoDB. It offers a better solution for scaling and emphasizes availability (just like Cassandra).
I'd like to point you to this question about the difference between CouchDB and MongoDB.
As far as frameworks go, the Play framework has a lot of plugins for working with NoSQL systems, so you might give it a try. You could try playorm, which is the last one I experimented with.
EDIT: I forgot to mention Kundera as well, as an ORM for Cassandra.
Choosing between Cassandra and MongoDB depends on the type of storage you need. MongoDB is primarily for document-based storage, where you get an edge from its various SQL-like features.
If you require a columnar database with high availability and multi-DC replication, go for Cassandra.
http://db-engines.com/en/system/Cassandra%3BHBase%3BMongoDB

What is the use of MongoDB in GrayLog2?

GrayLog2 requires both ElasticSearch and MongoDB, while Logstash uses only ElasticSearch for persisting and searching the logs. What does MongoDB provide in Graylog2?
Graylog2 uses MongoDB for the web interface entities: streams, alerts, users, settings, cached stream counts, etc. That is pretty much everything you see and edit in the web interface, except for the logs themselves.

NoSQL as local storage for logging and tracing

We are developing an application that will run on many physical servers. We want to use NoSQL for logging and tracing, since it does not require structured data.
We don't want centralized logging.
Can we install a NoSQL database (any one) on each server and store logging/tracing details there? Will it impact the actual processes on the server? Is this a good idea?
Problem 1: Data Collection
Many people use NoSQL solutions for storing application logs. The first challenge you may face is how to collect a huge amount of data from various data sources reliably and with easy management. One concern with not having a log collection layer is database lock contention caused by high write throughput.
So having a log collection layer is generally recommended. There are several open-source log collector implementations, such as syslog, Fluentd, Scribe, and Flume.
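To illustrate why a collection layer eases write contention, here is a minimal sketch that buffers records in a queue and hands them to the store in batches. The class `BatchingLogCollector` and its `sink` parameter are hypothetical; real collectors such as Fluentd also handle retries, persistence and routing, which this sketch ignores.

```python
import queue
import threading


class BatchingLogCollector:
    """Buffers log records and hands them to a sink in batches.

    Hypothetical sketch: `sink` is any callable taking a list of records,
    for example a function that does one bulk insert into a database.
    """

    def __init__(self, sink, batch_size=100):
        self._sink = sink
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def log(self, record):
        # Producers only enqueue; they never touch the database directly.
        self._queue.put(record)

    def close(self):
        # A None sentinel tells the worker to flush and stop.
        self._queue.put(None)
        self._worker.join()

    def _run(self):
        batch = []
        while True:
            record = self._queue.get()
            if record is None:
                break
            batch.append(record)
            if len(batch) >= self._batch_size:
                self._sink(batch)
                batch = []
        if batch:
            self._sink(batch)  # flush whatever is left


batches = []  # stands in for the database: one entry per bulk write
collector = BatchingLogCollector(sink=batches.append, batch_size=3)
for i in range(7):
    collector.log({"seq": i, "msg": "request handled"})
collector.close()
batch_sizes = [len(b) for b in batches]
```

With a single writer thread, the database sees a few bulk writes instead of one write per log call from every application thread.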
Problem 2: Storage & Processing
The next big problem is how to store and process the data. The backend infrastructure requires a lot of changes as the data volume increases. At first you can use MongoDB to store all of your data, but at some point you may end up using Apache Hadoop to build a massively scalable architecture.
Here's an example architecture that uses Fluentd for log collection and MongoDB for log storage and processing.
Here are some links on putting Apache logs into Amazon S3, MongoDB, or Hadoop HDFS with Fluentd:
Store Apache Logs into Amazon S3
Store Apache Logs into MongoDB
Fluentd + HDFS: Instant Big Data Collection
Disclaimer: I'm a committer on the Fluentd project.
This is definitely a good idea; NoSQL is a better fit for this than SQL.
In logging and tracing, the volume of data is high and the data is also retrieved frequently.
For logging and tracing you also need complex reports for analysis, so NoSQL suits you better.
NoSQL also supports distributed environments, so you can set up infrastructure in different geographic locations.

Express on Node - what data store?

I'm doing my first project in node/express.
I'm looking to implement a data store and noticed that express uses redis as a session store. Does this mean that express installs redis by default? The reason I ask is that I am pondering whether to install mongodb, but if redis is already there to use, I'll go with that.
I'm new to node and express, so any advice is much appreciated.
Last time I checked, express used a built-in in-memory data store by default and connect-redis was a separate package.
express installs neither redis (the database executable) nor node-redis (redis API binding for node) nor connect-redis (session store for connect and express that uses redis).
redis is a very simple database compared to mongodb. Mongodb is a full document-oriented database and redis is just an in-memory key-value store.
Also, express relies on connect for most stuff including session management, and sessions are in fact provided by connect.
"The reason I ask is that I pondering whether to install mongodb but if redis is already there to use, I'll go with that."
The question you should ask instead is whether redis would be the right data store for your data. Redis doesn't support querying, for example, which might be crucial for your data retrieval; if that is one of your requirements, you should go with mongodb.
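The difference between key-only access and querying by field can be sketched with plain Python structures. This is not the redis or mongodb API; `kv_store`, `documents`, `kv_get` and `find` are hypothetical stand-ins that only illustrate the two access patterns.

```python
# Key-value style: values are opaque, access is by key only.
kv_store = {
    "user:1": '{"name": "ann", "city": "Oslo"}',
    "user:2": '{"name": "bob", "city": "Lima"}',
}

# Document style: the store understands the fields and can filter on them.
documents = [
    {"_id": 1, "name": "ann", "city": "Oslo"},
    {"_id": 2, "name": "bob", "city": "Lima"},
]


def kv_get(key):
    """Fetch by exact key; the only access path in a pure key-value store."""
    return kv_store.get(key)


def find(collection, **criteria):
    """Return documents whose fields match all criteria (hypothetical helper)."""
    return [d for d in collection if all(d.get(k) == v for k, v in criteria.items())]


by_key = kv_get("user:2")                # works only if you already know the key
by_field = find(documents, city="Oslo")  # works without knowing any key
```

If every lookup in your application starts from a known key, the first pattern is enough; if you need to ask "which records have this property", you need the second.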

Is NoSQL suitable for Selling Tickets Web Application?

I want to write a highly scalable web application for selling event tickets. I want to use a NoSQL database, like Bigtable or MongoDB, and a cloud service like Google App Engine (GAE) or Amazon Elastic Compute Cloud (Amazon EC2).
Is it possible, using this type of database, to be sure that two clients will not be able to buy a ticket for the same place simultaneously? Or will I have to use an RDBMS and forget about Google App Engine?
Things like GAE's datastore can still support transactional semantics, for example:
http://code.google.com/appengine/docs/python/datastore/transactions.html
So yes, it is possible to do what you're seeking to do. (Note - GAE's Datastore is not exactly NoSQL, since it uses SQL-like queries.)
I have a problem with this question. Not all NoSQL databases are created equal, and different NoSQL databases store data in different ways. Generally, the thing you should be worried about is whether data is actually written to disk and not just into memory. Most NoSQL databases can do this, but not by default. Let's say this is not a problem: you can usually tell a database like Mongo or Cassandra to write data to disk, and can even specify the minimum number of servers the data should be written to.
The problem is that you may not get true transactional support. When you deal with ecommerce, it's important to have all-or-nothing transactions, where several operations either all succeed or are all rolled back. There must be absolutely no chance that only part of your data is saved. For example, if you need to write data to more than one table (collection or document in NoSQL lingo) and the server goes down in the middle of the process, so that your data is written to only one table, that's usually unacceptable in ecommerce.
I am not familiar with all NoSQL databases, but the ones I know don't have this option yet.
MySQL, on the other hand, does.
If transactional support, or the lack of it, does not bother you, then I think it's OK to use NoSQL as long as you tell it to save data to disk and not just into memory.
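The all-or-nothing behaviour described above can be sketched with Python's built-in `sqlite3` module, used here only as a convenient stand-in for a transactional RDBMS such as MySQL. The `seats` and `orders` tables, the `buy_seat` function and the `fail_midway` flag (which simulates a crash between the two writes) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seats (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, seat_id INTEGER, buyer TEXT)")
conn.execute("INSERT INTO seats (id, owner) VALUES (1, NULL)")
conn.commit()


def buy_seat(conn, seat_id, buyer, fail_midway=False):
    """Claim the seat and record the order; both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "UPDATE seats SET owner = ? WHERE id = ? AND owner IS NULL",
                (buyer, seat_id),
            )
            if cur.rowcount != 1:
                raise ValueError("seat already taken")
            if fail_midway:
                raise RuntimeError("simulated crash between the two writes")
            conn.execute(
                "INSERT INTO orders (seat_id, buyer) VALUES (?, ?)",
                (seat_id, buyer),
            )
        return True
    except (ValueError, RuntimeError):
        return False


first = buy_seat(conn, 1, "alice", fail_midway=True)  # seat update is rolled back
owner_after_failure = conn.execute("SELECT owner FROM seats WHERE id = 1").fetchone()[0]
second = buy_seat(conn, 1, "bob")    # succeeds, seat was still free
third = buy_seat(conn, 1, "carol")   # fails, seat now belongs to bob
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The conditional `UPDATE ... WHERE owner IS NULL` is what prevents two buyers from getting the same seat; the transaction is what prevents a seat from being marked sold without a matching order.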
The answer is 'maybe.'
Depending on what you're trying to build, you may be able to use some of the techniques in this post:
http://kylebanker.com/blog/2010/06/07/mongodb-inventory-transactions/
Using something like get_or_insert, you can easily ensure that two clients do not receive the same resource simultaneously on Google App Engine. However, there are big differences between GAE and an RDBMS, so make sure you study them further before you make a decision.
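The idea behind an atomic get-or-insert can be sketched in plain Python. This is not the App Engine API: `TinyStore` is a hypothetical in-memory stand-in whose `get_or_insert` uses a lock to make "check, then write" a single step, and the seat key `seat:A12` is made up for the example.

```python
import threading


class TinyStore:
    """In-memory stand-in for a datastore with an atomic get-or-insert."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get_or_insert(self, key, value):
        # The lock makes check-then-write a single atomic step.
        with self._lock:
            if key not in self._data:
                self._data[key] = value
            return self._data[key]


store = TinyStore()
winners = []


def try_to_buy(buyer):
    # Whoever ends up stored under the seat key owns the seat.
    owner = store.get_or_insert("seat:A12", buyer)
    if owner == buyer:
        winners.append(buyer)


threads = [threading.Thread(target=try_to_buy, args=(f"client-{i}",)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However many clients race for the seat, exactly one of them sees its own name come back and so wins the purchase; the rest see the winner's name and can be told the seat is gone.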