mongoDB as a file storage for Grails application - mongodb

I recently came across a need to store a larger number of files in my application, and because the PaaS platform that hosts the application provides mongo, I would like to use it.
However, because I'm quite inexperienced with mongo, I have almost no idea what the current state of mongo-related plugins and tools for grails is. What should I use? As I want to keep domain classes in an SQL database and use mongo only to store related files (in this case mostly a bunch of PDFs and text documents related to a domain instance), the mongoDB ORM [1] plugin seems too "heavy". Unfortunately, mongoDB ORM is probably the only mongo plugin for grails in active development at the moment.
In short, what would be the best plugin / library tool-set for this purpose? The closest match I've found is the grails-mongo-files plugin [2], which is probably a little outdated and no longer developed. So far it seems that I will have to use mongo's java driver (or the gmongo wrapper) and write a storage service and taglib myself (which is not necessarily a bad thing).
[1] http://grails.org/plugin/mongodb
[2] https://github.com/quirklabs/grails-mongo-file

There is also the mongodb gridfs plugin. http://grails.org/plugin/mongodb-gridfs
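One thing to consider is that GridFS effectively makes two calls to mongo: one to retrieve the file information and one to retrieve the file itself. So it might not be a good fit if your files are under 16 megabytes, which is the maximum size of a single document; files that small can be stored directly in a document.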
Here is a post on how to do this manually if you want to bypass plugins - http://jameswilliams.be/blog/entry/171

Related

Springboot 2.0.5 and MongoDB TimeSeries

I have a project with 16 micro-services using Spring Boot 2.0.5. One of the services calculates OHLCV over a very large set of trade data stored in a Mongo collection.
MongoDB 6.0 has time series functionality that could make our life easier. But going through the Mongo and Spring docs, I realized that I would need to upgrade my Spring Boot version to 2.7.x, because the spring-boot-starter-data-mongodb version I am currently using doesn't have time series support.
I also tried to migrate the entire project to 2.7.4, removing Netflix Zuul, which is literally the backbone of the entire architecture, but it was way too much of a change. So I decided to roll back to what was, and still is, working fine. It doesn't feel logical to migrate such a large code base for the sake of one feature.
Is there any other way to use time series functionality in Mongo (or another DB) that works with Spring Boot 2.0.5? The data in question is millions of documents.
For anyone coming to this hoping for an easy answer, there isn't one (at least I couldn't find it).
The way I solved it was to NOT use any dependency hack to support the latest MongoDB drivers in an old Spring Boot version.
Rather, I used PostgreSQL with the TimescaleDB extension, which supports OHLCV out of the box with minimal code. It is also much faster, and much easier than working around the dependency version issues.
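For readers wondering what the OHLCV calculation actually involves, here is a minimal in-memory sketch in plain Java. It is not the code from the project above and not how TimescaleDB does it internally; the `Trade` and `Candle` classes and the `aggregate` method are hypothetical names made up for illustration, and it assumes the trades arrive sorted by time. For millions of documents you would still want the database to do this bucketing.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: groups trades into fixed-size time buckets ("candles").
class OhlcvSketch {

    // Illustrative trade shape; real trade documents will have more fields.
    static final class Trade {
        final long epochMillis;
        final double price;
        final double volume;

        Trade(long epochMillis, double price, double volume) {
            this.epochMillis = epochMillis;
            this.price = price;
            this.volume = volume;
        }
    }

    // One OHLCV candle: open, high, low, close and summed volume.
    static final class Candle {
        final double open, high, low, close, volume;

        Candle(double open, double high, double low, double close, double volume) {
            this.open = open;
            this.high = high;
            this.low = low;
            this.close = close;
            this.volume = volume;
        }
    }

    // Assumes the input is sorted by time, so the first trade seen in a bucket
    // is the open and the last one seen is the close.
    static Map<Long, Candle> aggregate(List<Trade> tradesSortedByTime, long bucketMillis) {
        Map<Long, Candle> candles = new TreeMap<>();
        for (Trade t : tradesSortedByTime) {
            long bucketStart = Math.floorDiv(t.epochMillis, bucketMillis) * bucketMillis;
            Candle single = new Candle(t.price, t.price, t.price, t.price, t.volume);
            candles.merge(bucketStart, single, (current, next) -> new Candle(
                    current.open,
                    Math.max(current.high, next.high),
                    Math.min(current.low, next.low),
                    next.close,
                    current.volume + next.volume));
        }
        return candles;
    }
}
```

Calling `OhlcvSketch.aggregate(trades, 60_000L)` would return one candle per minute, keyed by the start of that minute in epoch milliseconds.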

Spring Data : Embedded /Non embedded?

I'm using Spring Data for Neo4j and MongoDB, I find it awesome, but now I just found out about the embedded and not embedded DB stuff.
Here's my situation :
I'm using Spring Data with the annotations, repositories and templates, on the assumption that I just need to change the DB address to make it work elsewhere.
My questions :
1) I don't even understand what they mean by embedded vs non embedded (on the same machine vs on a distant machine ?)
2) Do I have to change all the work I've done to make it work with a 'non embedded' DB ?
What I want to do is deploy my Spring Boot app that uses Neo4j to Heroku or CloudFoundry and use GrapheneDB (a Neo4j PaaS) for the DB. But when I saw all this talk about Spring Data working only with embedded databases, I lost all the hope and happiness I had when building my app.
3) If 2) is Yes, is it an easy transition ? is there a lot of things to change ?
EDIT :
Here's what I'm talking about :
http://inserpio.wordpress.com/2014/04/30/extending-the-neo4j-server-with-spring-data-neo4j/
He's adding some custom boilerplate code to make it work with a non-embedded DB. Is that OK? Why doesn't it work like any other DB (as with JPA, where you just specify the address of the DB)?
inserpio here. Don't lose your happiness, please: the Spring Data Neo4j team is working hard on a new release that improves remote performance.
When Spring Data Neo4j started, neither Cypher nor Neo4j Server existed; only the embedded version was available. When the server version was delivered, the SDN team provided a quick solution that works well if you only use repositories, but becomes a little too chatty if you want to use @Entity too. The problem is matching those @Entity classes with the returned nodes.
Since the new version is not completed yet, for the moment you could move your persistence logic closer to the database as a server extension. I explained it in the link you mentioned. It's a really fast refactoring: just move your entities and repositories to a new simple java project, install the resulting jar in the 'plugins' folder, add one line of configuration to neo4j-server.properties, and expose your queries as simple REST services.
Hope this could help.
Do not hesitate to contact me for any further question.
Cheers,
Lorenzo

Memcached vs Memcache vs Jcache

Please don't mark this question as a duplicate. I read the previous questions, but I am still unable to understand it.
I am currently working on a project written in Java which uses MongoDB for persistence. Due to some performance issues with it, I have been asked to use Memcached. But I am unable to figure out how Memcached can help me with this.
While searching, I got more confused because of caching services named both Memcache and Memcached. Can someone please explain how these are different, and why PHP comes up in some answers when Memcached is asked about?
I request a clear answer, with an example of how I could use Memcached in my project. What are Memcache, Memcached, JCache and SpyMemcached?
If possible, please provide a link to complete Memcached example somewhere.
Memcache and Memcached are the same thing, the "correct name" being Memcached ( http://memcached.org/ ).
JCache is the name of a standard Java API (JSR 107 - https://jcp.org/en/jsr/detail?id=107 ) that provides a generic API to interact with caching layer/solutions. (get/set/remove data from a Key/Value cache to simplify)
So if you really want to use a caching layer on top of MongoDB in your Java application, you have to:
Install Memcached somewhere on your infrastructure, if it is not installed already. You can test it quickly with telnet: the default port is 11211, so you can run telnet localhost 11211 to see if it is working.
Use a JCache implementation for Memcached, for example this one: https://github.com/toelen/spymemcached-jcache This will allow you to store data in, and get data from, a Memcached process running somewhere in your infrastructure.
Since you are talking about JCache, you are using Java, so it is also possible to use a Java-based cache that works directly in your JVM without a 3rd party cache process (memcached). You can find many of them, for example Ehcache or JBoss Cache, and most of them expose their API through the standard JCache API.
Code your data access layer to get the data out of MongoDB and put it into the cache using the JCache API. In this code you will have to check whether the data is in the cache; if not, query the data from MongoDB, set it in the cache and use it. Be careful about the eviction strategy.
This document about using JCache in Google App Engine documentation is interesting to see the "pseudo code" https://developers.google.com/appengine/docs/java/memcache/usingjcache (your code will be different but it should help you to see what you have to do in your code.)
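The "check the cache, otherwise query MongoDB and store the result" logic described above can be sketched roughly like this. It is only an illustration under assumptions: `CacheAsideReader` is a made-up name, the `ConcurrentHashMap` stands in for a real JCache or Memcached client, and the `loader` function stands in for your MongoDB query. A real cache would also handle expiry and eviction, which this sketch does not.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through helper; names are illustrative, not from any library.
class CacheAsideReader<K, V> {

    // Stand-in for a JCache / Memcached client (no expiry or eviction here).
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // Stand-in for the MongoDB lookup, e.g. a DAO or repository call.
    private final Function<K, V> loader;

    CacheAsideReader(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;              // cache hit: no database call
        }
        V loaded = loader.apply(key);   // cache miss: query the database
        if (loaded != null) {
            cache.put(key, loaded);     // remember the result for next time
        }
        return loaded;
    }

    // Call this after updating the document so stale data is not served.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

You would construct it with something like `new CacheAsideReader<String, String>(id -> findInMongo(id))`, where `findInMongo` is whatever query method you already have (a hypothetical name here). The first `get` for a key hits the database, later ones are served from the cache until you call `invalidate`.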
The reason why you often see Memcached and PHP together is simply that Memcached is the most common caching layer for PHP applications, with many APIs and frameworks using it. In Java we have many options, from a pure Java layer to Memcached or others.
That is the overall approach, but before doing any of this I would check why you are saying that MongoDB is slow, and solve that issue first.

Mongodb document versioning using spring data

I am using Spring Data in my Java application to connect to MongoDb and have a requirement around versioning the documents (basically storing the history).
It seems that it's pretty straightforward in Ruby if one uses Mongoid.
I was wondering if Spring Data has something similar for Java, or whether you are better off implementing your own.
Yes, Spring Data has a very good feature for this, which is auditing. You can refer to the following link:
http://www.javacodegeeks.com/2013/11/auditing-entities-in-spring-data-mongodb-2.html
After a lot of research I found JaVers: https://javers.org/documentation/spring-boot-integration/. It is rock solid and very easy to implement.
This library stores the full history of changed fields, makes that history easy to query, and is well supported. A sample POC is shared here: https://nullbeans.com/auditing-using-spring-boot-mongodb-and-javers/

Faceted search. Are Solr with MongoDb good for it? Does someone know about modules, libraries for faceted search?

Good day, everyone.
I have an e-commerce website (Kohana php framework + Mysql + Sphinx search).
I want to integrate faceted search (also called faceted navigation, guided navigation, or parametric search) on my e-commerce shop.
1. I found several opinions that Solr was the best solution for faceted browsing? I want to be sure in my choice. Is Solr the best in doing that?
2. Also I want to migrate products (with attributes) from mysql to MongoDb. Is MongoDb good in cooperation with Solr?
3. Does someone know modules, ui, api for faceted search? Maybe there is some Zend library, Rest api...
Thanks for your help.
I found several opinions that Solr was the best solution for faceted browsing?
I am pretty sure that those who say that are most likely the same salesmen who say that MongoDB pwns MySQL in every role.
Sphinx (the tech you are currently using) has been proven to support massive sets in a performant manner, for example: http://infegy.com/ uses it for a result set of 22 billion records ( http://sphinxsearch.com/info/powered/ )!
Solr is dead fast too, but saying one is better than the other when both support facets and both are super fast on super big result sets is just utter nonsense.
Also I want to migrate products (with attributes) from mysql to MongoDb. Is MongoDb good in cooperation with Solr?
Solr uses a separate XML schema and files to represent its internal files (Lucene here), unlike Sphinx, which can do this transparently without you knowing.
So getting MongoDB to work with Solr is no different from using MySQL with it. You build up the XML files and commit (or soft commit) them to Solr.
As for faceting, here is a very simple page on it with links to the places you need to go: http://wiki.apache.org/solr/SolrFacetingOverview
Solr has a built-in REST (JSON) API on top of Jetty ( http://jetty.codehaus.org/jetty/ ) which you can use to fetch all the facets you need easily.
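To give an idea of what such a request looks like, here is a small sketch that only builds the query URL. The `facet=true` and `facet.field` parameters are Solr's standard faceting parameters; everything else is an assumption for illustration: the `SolrFacetUrl` class is made up, and the host, the `/solr/<core>/select` path, the core name and the field names depend on your Solr version and schema.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a Solr facet query URL; it does not send the request.
class SolrFacetUrl {

    static String build(String baseUrl, String core, String query, String... facetFields) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/solr/").append(core).append("/select") // assumed path layout
                .append("?q=").append(encode(query))
                .append("&rows=0")       // only facet counts are wanted, no documents
                .append("&facet=true")   // switch faceting on
                .append("&wt=json");     // ask for a JSON response
        for (String field : facetFields) {
            url.append("&facet.field=").append(encode(field));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `SolrFacetUrl.build("http://localhost:8983", "products", "*:*", "brand", "color")` produces a URL you can open in a browser or call from PHP to get the counts per brand and per color, assuming a core called products with those two fields.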