MongoDB stored procedures NOT in javascript - mongodb

I have some collection and I want to perform action on every insert into that collection. The problem is that the code, that will do this actions is in Java. In Oracle it was possible to wrap Java or even C code into PL/SQL procedure, and then use this procedure in trigger. In CouchDB we could write a view. What would be the closest analog for MongoDB?
The best possibility I can think of is to wrap my code into REST server, and then interact with it using stored javascript.
I've already seen this question, but due to dependency on java libs, I can't use just javascript in my workflow, neither I don't want to run a new heavy service along with mongodb if there is some other way to do this.

There are a number of things to say about your request:
I have some collection and I want to perform action on every insert into that collection.
1) What you're asking for here is not really a "stored procedure", but really is a "database trigger". MongoDB does not provide any sort of "database trigger" functionality.
This is consistent with the general design goals of MongoDB, which is to provide a very fast, scalable data store without the heavy weight of traditional DBMS systems. See this presentation for more details about the design goals of MongoDB: http://www.10gen.com/presentations/mongosf2011/whymongodb
2) If there is some data processing that you'd like to perform on every insert, you'll need to do it on the client side of the MongoDB connection. This will necessarily involve writing some code in your application.
3) I'd suggest that you avoid running JavaScript within the mongod server if at all possible. The JavaScript is interpreted on the server side, so the speed of your queries will be affected. In addition, all JavaScript run in the mongod server is single-threaded, so there is no concurrency of any JavaScript execution.
I wish I had a better answer for you.

Related

Should I filter data in PostgreSQL or server backend?

I am working on a project which uses graphql and PostgreSQL where we want to select data from the database with a value after a certain date. It is currently selecting all data from the database and then filtering it on the server:
.filter(({time}) => moment(time).isAfter(startTime))
However I would have thought it would be best to do this filtering in the database query as the full dataset is never used.
Is there any benefit to doing it on the server rather than in the database query?
Barring some unusual edge case -- such as other parts of your backend code really do need all the data for some reason -- it would definitely be more efficient to filter everything on the Postgres side via the SQL that is being used to fetch the data in the first place.
This is true for several reasons:
Assuming the table is properly indexed, the filtering will be able to occur much faster within the database.
The unneeded data will not need to be serialized and sent over the wire to the backend, only to then be discarded by the backend's own filtering.
The memory footprint should be reduced on both the Postgres and server end due to needing to process only a portion of the results.
I've not worked with GraphQL myself, but from doing a bit of poking around through its docs, it appears GraphQL often uses other mechanisms in different layers (outside of the database) to try to improve performance.
It would be worth seeing what the actual SQL is that your GraphQL query is generating (that may be possible via a function in GraphQL; it could also be done by enabling certain log settings on the Postgres server and correlating the log output to the query). That may lead to further optimization possibilities if you want to keep things purely GraphQL.
Jumping down to a raw query seems like it would be a good possibility though. Certainly that is something that is often done with ORMs like Django and ActiveRecord.

Any way to get Meteor using a native ACID compliant db?

I am seriously considering Meteor framework for building every POC and apps in the future...but, I can't get ride of an ACID compliant database as I have few usages of multi-documents atomic transaction that require this compliance.
Meteor strongly rely on MongoDB syntax and storage engine at the moment (it means there are no "Transaction" related syntax available...)
I am currently evaluating any solution allowing this ACID capability :
Using a MySQL native driver for Meteor (different syntax than MongoDB?)
Using a PostgreSQL native driver for Meteor (SQL syntax)
Using a TokuMX (a MongoDB fork with ACID compliance...same syntax than MongoDB appart from transaction related commands that would be required to add)
Those 3 solutions are good candidates for the Meteor roadmap as shown here
What pros/cons about those solution ? Which are the most advanced one ?
What would you although consider as a solution to keep Meteor while storing documents in a NoSQL like ACID compliant db ?
sqlAndMeteor
If you are like me, you love Meteor but hate Mongo. In Meteor's Trello Roadmap (https://trello.com/b/hjBDflxp/meteor-roadmap), the most voted feature is SQL Support, either PostgreSQL or MySQL.
Since there is no date for that in Meteor, here I summarize the partial solutions I have found.
1.- Use SQL only for client-side querys.
Let's face it, Mongo sucks on common data operations, so having the ability to use SQL to query data (with JOINS, GRUP BY and so on) would relief a lot of pain. There are packages which let you use SQL in the client, at least for querys: The simplest one is a old (2010) utility, SqlLike (http://www.thomasfrank.se/sqlike.html). The new player in town is alaSQL, which is actively developed by #agershun (https://github.com/agershun/alasql). The SqlLike advantage is that it only has 10k. AlaSQL, is a lot more powerful, of course, but for using SQL to replace mongo sintax in unions and aggregations, SqlLike is OK.
With both of them you can do something like this in your helper:
productsSold:function(){
var customerSalesHistory=salesHistory.find({cutomerId:Session.get('currentCustomer')}).fetch();
var items=products.find().fetch();
return alasql("select item.name, sales.ordered as sumaVentas from ? sales, ? items
where items.Id=sales.itemId",[customerSalesHistory,items]);
}
2.- Experiment with direct SQL support.
Some packages try to replace Mongo (and minimongo) with MySql or PostgreSQL. #numtel's MySql package is Meteor-MySql https://github.com/numtel/meteor-mysql, and PostgreSQL is Meteor-pg (https://github.com/numtel/meteor-pg). Both are good attempts to solve the problem, but have some issues yet and are somehow cumbersome to adapt.
A team from Hack Reactor has formed Meteor Stream, and its first product is a PostgreSql integration with Meteor, meteor-postgres (https://github.com/meteor-stream/meteor-postgres). It looks very good and uses alaSql on the client to replace minimongo.
Both approaches are good, but they have some problems:
They broke deployment to meteor.
They are very, very young and not near production ready AFAIK
They still require tweaks to the usual pub-sub sintax we are used to, which could raise compatibility issues with other meteor packages.
3.- Still use Mongo, but as simple repository for your MySql database.
This option maintains all Meteor's characteristics and uses Mongo as a temporal repository for your MySql or PostgreSql databases.
A brilliant attempt to that is mysql-shadow by #perak (https://github.com/perak/mysql-shadow). It does what it says, keeps Mongo synchronized both ways with MySql and let's you work your data in MySql.
The bad news is that the developer will not continue maintaining it, but what is done is enough to work with simple scenarios where you don't have complex triggers that update other tables or stuff like that.
For a full featured synchronization you can use SymmetricsDS (http://www.symmetricds.org), a very well tested database replicator. This involves setting up a new java server, of course, but is by far the best way to be sure that you will be able to convert your Mongo database in a simple repository of your real MySql, PostgreSQL, SQL Server , Informix database. I have to check it myself yet.
For now MySQL Shadow seems like a good enough solution.
One advantage of this approach is that you can still use all standard Meteor features, packages, meteor deployment and so on. You donĀ“t have to do anything but set up the synch mechanism, and you are not breaking anything.
Also, if someday the Meteor team uses some of the dollars raised in SQL integration, your app is more likely to work as is.
If MySQL works for you, I've used the meteor-mysql package and it works well.
I finally forged my own conclusion...
I will have TWO platforms :
a Meteor front with business data in PostgreSQL and some front data or easy-to-replicate data in MongoDB
a Java data backend (server2server only) handling all atomic operations on my business data in PostgreSQL...plus technical adapters (SAP, Salesforce), a BPMN 2.0 workflow engine (Actility), and any registered SOA needed from other systems
Any comments are still very welcomed and will be considered and answered

NoSql Injection in Python

when trying to make this question, i got this one it is using Java, and in the answer it gave a Ruby example, and it seems that the injection happens only when using Json? because i've an expose where i'll try to compare between NoSQL and SQL and i was trying to said: be happy, nosql has no sql injection since it's not sql ...
can you please explain me:
how sql injection happens when using Python driver (pymongo).
how to avoid it.
the comparison using the old way sql injection using the comment in the login form.
There are a couple of concerns with injection in MongoDB:
$where JS injection - Building JavaScript functions from user input can result in a query that can behave differently to what you expect. JavaScript functions in general are not a responsible method to program MongoDB queries and it is highly recommended to not use them unless absolutely needed.
Operator injection - If you allow users to build (from the front) a $or or something they could easily manipulate this ability to change your queries. This of course does not apply if you just take data from a set of text fields and manually build a $or from that data.
JSON injection - Quite a few people recently have been trying to convert a full JSON document sent (saw this first in JAVA, ironically) from some client side source into a document for insertion into MongoDB. I shouldn't need to even go into why this is bad. A JSON value for a field is fine since, of course, MongoDB is BSON.
As #Burhan stated injection comes from none sanitized input. Fortunately for MongoDB it has object orientated querying.
The problem with SQL injection comes from the word "SQL". SQL is a querying language built up of strings. On the other hand MongoDB actually uses a BSON document to specify a query (an Object). If you keep to the basic common sense rules I gave you above you should never have a problem with an attack vector like:
SELECT * FROM tbl_user WHERE ='';DROP TABLE;
Also MongoDB only supports one operation per command atm (without using eval, don't ever do that though) so that wouldn't work anyway...
I should add that this does not apply to data validation only injection.
SQL injection has nothing to do with the database. It is a type of vulnerability that allows for execution of arbitrary SQL commands because the target system does not sanitize the SQL that is given to the SQL server.
It doesn't matter if you are on NoSQL or not. If you have a system running on mongodb (or couchdb, or XYZ db), and you provide a front end where users can enter records - and you don't correctly escape and sanitize the input coming from the front end; you are open to SQL injection.

Mongodb access through sql like syntax

Is there any library where i can access mongodb by using sql like syntax.
Example
use db
select * from table1
insert into table1 values (a,b,c)
delete from table
select a,b,count(*) from table1 group by a,b
select a.field1,b.field2 from a,b where a.id=b.id
Thanks
Raman
The learning curve is small only if you are only doing extremely simple sql queries. If the extent of your SQL querying is "select * from X", then MongoDB looks like a brilliant idea to cut through all the too-complicated SQL. But if you need to perform left outer joins, test for null, check for ranges, subselects, grouping and summation, then you will soon end up with a round concave dent in your desk after being moved to Mongo. The sick punchline is that half the time, the thing you are trying to do can't be done in the Mongo interface. Mongo represents a bold new world where instead of databases doing things like aggregation and query optimization, it just stores data and all the magic is done by retrieving everything, slowly, storing it in app memory, and doing all that stuff in code instead.
YES!
A company called UnityJDBC makes a JDBC driver for mongodb. Unlike the mongo java driver, this JDBC driver allows you to run SQL queries against MongoDB and the driver is supported by any Java appliaction that uses JDBC.
to download this driver go to...
http://www.unityjdbc.com/mongojdbc/mongo_jdbc.php
Its free to download too!
hope this helps
MoSQL might satisfy your needs. It'll require you to run a new PostgreSQL instance but from there you can query your entire Mongo dataset with SQL.
"MoSQL imports the contents of your MongoDB database cluster into a PostgreSQL instance, using an oplog tailer to keep the SQL mirror live up-to-date. This lets you run production services against a MongoDB database, and then run offline analytics or reporting using the full power of SQL."
Have a look at this recent project: http://www.mongosql.com/. I've been looking at it over the last few weeks and it looks very promising.
For those of you who have questioned the usefulness of SQL against MongoDB, consider the large number of not-very-technical users in many organizations, like business analysts, who may know SQL, but don't want to make the leap to JavaScript and JSON. Tools like mongoSQL can help push the adoption of MongoDB in an organization.
There are a few solutions out there, but nearly all of them fail to truly represent the MongoDB data model in a way that the "relationally" minded ODBC/JDBC applications and users desire/require. A recent commercial product was released that addresses these challenges
ODBC:
http://www.progress.com/products/datadirect-connect/odbc-drivers/data-sources/mongodb
JDBC:
http://www.progress.com/products/datadirect-connect/jdbc-drivers/data-sources/mongodb
To address the need for ODBC/JDBC (SQL) access...While there are strong arguments for writing new applications using Mongo's clients, there is still a strong need in the marketplace for quality ODBC/JDBC and SQL based access to MongoDB. This need largely arises from all the reporting, analytic, and BI applications that rely on ODBC/JDBC connectivity and do not offer native integration with MongoDB.
Free NoSQL Viewer supports conversion of SQL queries to MongoDB shell syntax. Furthermore, in SQL Viewer you can even use SQL SELECT statements to query MongoDB collections data without knowing MongoDB query syntax. Check out NoSQL Viewer here www.spviewer.com/nosqlviewer.html
Mongodb and its current driver do not support direct SQL like syntax.
However, all operations are easily doable with the driver specific operations.
Here is a brief mapping of mongodb operations to corresponding SQL like query :
http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart
There are a couple projects underway to emulate a SQL interface for MongoDB. While they provide a familiar interface, in general they should be avoided. They operate on a fundamentally flawed premise in that they parse strings and translate them into method calls.
Once you work with MongoDB you will find the approach of using classes and methods a much more accessible interface as it works exactly like all other parts of your application. Yes there is a small learning curve as you first start, but for the most part, the interface in MongoDB works how you would expect it to.

entity framework performance

I am using Entity Framework to layer on my SQL Server 2008 database. The EF is present in my web service and the webservice is invoked by a Silverlight client.
I am seeing a serious performance issue in terms of the duration taken by a query to execute in the EF. This wouldn't happen in the consecutive calls.
A little bit of googling revealed that, it's caused per app domain to construct the in-memory model of the db objects. I found this Microsoft link explaining pre-generation of views for performance improvement. Even after implementing the steps, the performance actually degraded instead of improving. I am curious, if anyone has tried this approach successfully and if there are any other avenues for improving performance.
I am using .NET 3.5.
A couple areas to look at for EF performance
Do as much of the processing before calling things like tolist(). ToList will bring everything in the set into memory. By default, EF will keep building the expression tree and only actually process it when you need the data in memory. That first query will be against the database, but afterwards the processing will be in memory. When working with large data, you definitely want as much of the heavy lifting done by the database as possible.
EF 1 only has the option to pull the entire row back. Therefore if you have a column that is a large string or binary blob, it is going to be pulled down and into memory whether you need it or not. You can create a projection that doesn't include this column, but then you don't get the benefits of having it be an entity.
You can look at the sql generated by EF using the suggestion in this post
How do I view the SQL generated by the Entity Framework?
The same laws of physics apply for EF queries as they do for ordinary SQL. Check your database tables and make sure that you have indexes on primary and foreign keys, that your database is properly normalized, and so forth. If performance is degrading after Microsoft's suggestions, then that's my guess as to the problem area.
Are you hosting the webservice in IIS? Is it running on the same site as the Silverlight App? What about the database itself? Is it running on a dedicated machine? Are there other apps hitting it? The first call to a dormant database is painful (I've had situations where it would actually time out in my environment.)
There are a number of factors to take into consideration here. But it comes down to more than just EF's overhead.
edit I didn't fully qualify but the process of opening the first connection to SQL Server is slow regardless of your data access solution.
Use SQL Profiler to check how many queries executed to retrieve your data.If it's large number use Include() method of ObjectQuery to retrieve child objects with parent in one query.