MongoDB running ordinary SQL

Why can't I use regular SQL queries such as select * from student (assuming there is a table called student) in MongoDB, even though NoSQL is said to mean "Not only SQL"?

SQL is a domain-specific language used in programming and designed for
managing data held in a relational database management system (RDBMS)
Wouldn't it be quite confusing to use such a specific language to query a DBMS (MongoDB) that is structurally different from an RDBMS: one with no concept of tables, rows and columns, where joins don't exist, where you can nest documents inside other documents without any defined schema, and so on?
The point is that the difference between SQL and the Mongo query language is not just the syntax, it is also the semantics. A Mongo query does not say the same thing as a SQL query in a different language; it says something different altogether.
Sure, you can find a direct Mongo translation for a basic SQL query, like a simple SELECT item, status FROM inventory WHERE status = "A". But how would you translate a JOIN to Mongo, or how would you query a nested document using SQL?
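To make the nested-document point concrete, here is a minimal Python sketch of equality-only matching with dotted paths. It is a toy written for illustration, not MongoDB's implementation: the helper names get_path and find and the sample inventory data are assumptions, and real MongoDB matching (arrays, operators, projections) does much more.

```python
from typing import Any, List

_MISSING = object()  # sentinel for "path not present in this document"


def get_path(doc: dict, path: str) -> Any:
    """Follow a dotted path such as 'size.uom' into nested dicts."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def find(collection: List[dict], query: dict) -> List[dict]:
    """Return documents where every queried path equals the given value."""
    return [
        doc
        for doc in collection
        if all(get_path(doc, path) == value for path, value in query.items())
    ]


# Hypothetical sample data: each document nests a 'size' sub-document.
inventory = [
    {"item": "journal", "status": "A", "size": {"h": 14, "uom": "cm"}},
    {"item": "notebook", "status": "A", "size": {"h": 8.5, "uom": "in"}},
    {"item": "paper", "status": "D", "size": {"h": 8.5, "uom": "in"}},
]

# Rough analogue of: SELECT ... FROM inventory WHERE status = "A"
flat = find(inventory, {"status": "A"})

# A condition on a nested field, which a flat table has no direct way to express
nested = find(inventory, {"status": "A", "size.uom": "in"})
```

The flat query maps onto SQL easily; the nested one only maps if the sub-document is first flattened into columns or moved to its own table, which is the semantic gap described above.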

I'm not sure I fully understand the question, but I'll give it a go.
NoSQL refers to non-relational databases and is often expanded as "Not only SQL". You can't use SQL with it because MongoDB has its own query language and terminology.
In MongoDB terms, a collection is the counterpart of a table: it is a grouping of MongoDB documents. A document is a record in a collection and the basic unit of data in MongoDB.
In order to translate your statement
select * from student
to MongoDB we would use
db.student.find()
Notice the different syntax in the two statements. Each has its use case; it is just a matter of finding which one fits yours. There are numerous differences between the two beyond syntax, such as schema, architecture and how they work.
For more information on this see the following link:
MongoDB terminology versus SQL

Related

mongodb duplicate a collection within the same database

I want to clone an existing collection, including data and indexes, to a new collection with another name within the same database, using mongodb JSON interface (not the command-line interface).
I've tried:
cloneCollection - didn't work; it is for cloning across databases.
aggregate with an $out operator - that copies just the data but not the indexes.
The aggregate command I've tried:
{"aggregate":"orig_coll", "pipeline":[{"$out":"orig_clone"}]}
There is no way to do this in one JSON query.
So there are two solutions here:
Use mongodump/mongorestore, as proposed in What's the fastest way to copy a collection within the same database?
Use two queries: one to create the destination collection with its indexes, and the aggregate query that you already have. I understand that it's not a perfect solution, since you have to keep the index definitions for the destination collection in sync with those on the source collection, but there's no other way to do this.
What you need to understand is that the JSON interface, as you call it, is not a database interface but a JavaScript-based query language. You can pass queries to it, not commands. In fact, it is not an interface at all, just a query DSL. The interface is the mongo shell, any of the mongo drivers (Java, Perl, ...), or any of the mongo admin tools ...
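The two-query approach can be sketched as a pair of JSON documents in the same style as the aggregate command from the question. This Python snippet only builds and prints them. The helper name clone_commands and the email index are made up for illustration, and whether your JSON interface accepts a createIndexes document is an assumption you would have to check. As far as I know, $out keeps the indexes of an already existing target collection and newer servers require the cursor option on aggregate, but verify both for your version.

```python
import json
from typing import List


def clone_commands(source: str, target: str, index_specs: List[dict]) -> List[dict]:
    """Build two command documents: create the indexes first, then copy the data."""
    create_indexes = {"createIndexes": target, "indexes": list(index_specs)}
    copy_data = {
        "aggregate": source,
        "pipeline": [{"$out": target}],
        "cursor": {},  # assumption: required by newer server versions
    }
    return [create_indexes, copy_data]


# Hypothetical index definition; it must mirror the indexes on the source collection.
commands = clone_commands(
    "orig_coll",
    "orig_clone",
    [{"key": {"email": 1}, "name": "email_1"}],
)

for command in commands:
    print(json.dumps(command))
```

The index list is the part that has to be maintained by hand, which is the drawback mentioned above.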

Is it possible to use mongodb with sqlalchemy?

I can't find any information about connecting to mongodb in the documents of sqlalchemy and google search.
Is it possible to use mongodb with sqlalchemy? Thanks.
According to SQLAlchemy's own description, you cannot use it:
SQLAlchemy considers the database to be a relational algebra engine,
not just a collection of tables. Rows can be selected from not only
tables but also joins and other select statements; any of these units
can be composed into a larger structure. SQLAlchemy's expression
language builds on this concept from its core.
SQLAlchemy is most famous for its object-relational mapper (ORM), an
optional component that provides the data mapper pattern, where
classes can be mapped to the database in open ended, multiple ways -
allowing the object model and database schema to develop in a cleanly
decoupled way from the beginning.
The main goal of SQLAlchemy is to change the way you think about
databases and SQL!
You may use MongoAlchemy instead.

SQL view in mongodb

I am currently evaluating mongodb for a project I have started but I can't find any information on what the equivalent of an SQL view in mongodb would be. What I need, that an SQL view provides, is to lump together data from different tables (collections) into a single collection.
I want nothing more than to clump some documents together and label them as a single document. Here's an example:
I have the following documents:
cc_address
us_address
billing_address
shipping_address
But in my application, I'd like to see all of my addresses and be able to manage them in a single document.
In other cases, I may just want a couple of fields from collections:
I have the following documents:
fb_contact
twitter_contact
google_contact
reddit_contact
Each of these documents has fields that align, like firstname, lastname and email, but they also have fields that don't align. I'd like to be able to compile them into a single document that only contains the fields that align.
This can be accomplished with views in SQL, correct? Can I accomplish this kind of functionality in MongoDB?
The question is quite old already. However, since mongodb v3.2 you can use $lookup in order to join data of different collections together as long as the collections are unsharded.
Since mongodb v3.4 you can also create read-only views.
There are no "joins" in MongoDB. As said by JonnyHK, you can either denormalize your data, use embedded documents, or perform multiple queries.
However, you could also use Map-Reduce.
Or, if you're prepared to use the development branch, you could test the new aggregation framework, though maybe it's too much? This new framework will be in the soon-to-be-released 2.2, which is production-ready, unlike 2.1.x.
Here's the SQL-Mongo chart also, which may be of some help in your learning.
Update: Based on your re-edit, you don't need Map-Reduce or the Aggregation Framework because you're just querying.
You're essentially doing joins, querying multiple documents and merging the results. The place to do this is within your application on the client-side.
MongoDB queries never span more than a single collection as there is no support for joins. So if you have related data you need available in the results of a query you must either add that related data to the collection you're querying (i.e. denormalize your data), or make a separate query for it from another collection.
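A minimal sketch of the client-side merge described above, in Python. It assumes the results of the separate per-collection queries are already in memory as lists of dicts. The helper names, the sample contacts and the extra "source" field are all made up for illustration.

```python
from typing import Dict, List


def shared_fields(docs: List[dict]) -> set:
    """Field names that appear in every document."""
    if not docs:
        return set()
    shared = set(docs[0])
    for doc in docs[1:]:
        shared &= set(doc)
    return shared


def merge_contacts(sources: Dict[str, List[dict]]) -> List[dict]:
    """Combine several query results, keeping only the fields they all share."""
    all_docs = [doc for docs in sources.values() for doc in docs]
    keep = sorted(shared_fields(all_docs))
    merged = []
    for name, docs in sources.items():
        for doc in docs:
            row = {field: doc[field] for field in keep}
            row["source"] = name  # remember which collection the row came from
            merged.append(row)
    return merged


# Hypothetical results of two separate queries, one per collection.
sources = {
    "fb_contact": [
        {"firstname": "Ann", "lastname": "Lee", "email": "ann@example.com", "fb_id": 17}
    ],
    "twitter_contact": [
        {"firstname": "Bo", "lastname": "Ray", "email": "bo@example.com", "handle": "@bo"}
    ],
}

contacts = merge_contacts(sources)
```

Fields that do not align (fb_id, handle) are dropped, which is the behaviour asked for in the question.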
I am currently evaluating mongodb for a project I have started but I
can't find any information on what the equivalent of an SQL view in
mongodb would be
In addition to this answer, MongoDB now has on-demand materialized views. In a nutshell, this feature lets you use aggregate with $merge (available from 4.2) to create or update a "quick view" collection that you can query faster. The strategy is to update the quick view collection whenever a record in the main collection changes. Unlike a plain SQL view, this has the side effect of increasing your data storage size, but the benefits can be huge depending on your querying needs.
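As a rough illustration of that strategy, the Python snippet below builds an aggregate command whose last stage is $merge. The collection names, the grouping field and the helper refresh_view_command are assumptions made up for the example. Only the stage names and the $merge options (into, whenMatched, whenNotMatched) are taken from MongoDB itself.

```python
import json


def refresh_view_command(source: str, view: str, group_field: str) -> dict:
    """Build an aggregate command that (re)writes per-group counts into `view`."""
    return {
        "aggregate": source,
        "pipeline": [
            # Summarise the main collection: one output document per group value.
            {"$group": {"_id": "$" + group_field, "count": {"$sum": 1}}},
            # Upsert the summary into the quick view collection.
            {
                "$merge": {
                    "into": view,
                    "whenMatched": "replace",
                    "whenNotMatched": "insert",
                }
            },
        ],
        "cursor": {},
    }


# Hypothetical names: an 'addresses' collection summarised by 'kind'.
command = refresh_view_command("addresses", "addresses_by_kind", "kind")
print(json.dumps(command, indent=2))
```

Re-running the command after the main collection changes refreshes the quick view collection, at the cost of storing the summary twice.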

What is the fundamental difference between MongoDB / NoSQL which allows faster aggregation (MapReduce) compared to MySQL

Greetings!
I have the following problem. I have a table with a huge number of rows which I need to search, and then I need to group the search results by many parameters. Let's say the table is
id, big_text, price, country, field1, field2, ..., fieldX
And we run a request like this
SELECT .... WHERE
[use FULLTEXT index to MATCH() big_text] AND
[use some random clauses that anyway render indexes useless,
like: country IN (1,2,65,69) and price<100]
This will be displayed as search results, and then we need to take these search results and group them by a number of fields to generate search filters
(results) GROUP BY field1
(results) GROUP BY field2
(results) GROUP BY field3
(results) GROUP BY field4
This is a simplified case of what I need. The actual task at hand is even more problematic; for example, sometimes the first results query also does its own GROUP BY. An example of such functionality would be this site
http://www.indeed.com/q-sales-jobs.html
(search results plus filters on the left)
I've done, and am still doing, deep research into how MySQL works, and at this point I don't see how this is possible in MySQL. Roughly speaking, a MySQL table is just a heap of rows lying on the HDD, and indexes are tiny versions of these tables sorted by the index field(s) and pointing to the actual rows. That's a huge oversimplification of course, but the point is that I don't see how to fix this at all, i.e. how to use more than one index and still do fast GROUP BYs (by the time the query reaches GROUP BY, the index is completely useless because of range searches and other things). I know that MySQL (and similar databases) have various helpful features such as index merges, loose index scans and so on, but this is simply not adequate: the queries above will still take forever to execute.
I was told that the problem can be solved by NoSQL which makes use of some radically new ways of storing and dealing with data, including aggregation tasks. What I want to know is some quick schematic explanation of how it does this. I mean I just want to have a quick glimpse at it so that I could really see that it does that because at the moment I can't understand how it is possible to do that at all. I mean data is still data and has to be placed in memory and indexes are still indexes with all their limitation. If this is indeed possible, I'll then start studying NoSQL in detail.
PS. Please don't tell me to go and read a big book on NoSQL. I've already done this for MySQL only to find out that it is not usable in my case :) So I wanted to have some preliminary understanding of the technology before getting a big book.
Thanks!
There are essentially 4 types of "NoSQL", but three of the four are actually similar enough that an SQL syntax could be written on top of them (including MongoDB and its crazy query syntax [and I say that even though Javascript is one of my favorite languages]).
Key-Value Storage
These are simple NoSQL systems, like Redis, that are basically a really fancy hash table. You have a value you want to get later, so you assign it a key and stuff it into the database. You can only query a single object at a time, and only by a single key.
You definitely don't want this.
Document Storage
This is one step up above Key-Value Storage and is what most people talk about when they say NoSQL (such as MongoDB).
Basically, these are objects with a hierarchical structure (like XML files, JSON files, and any other sort of tree structure in computer science), but the values of different nodes on the tree can be indexed. They have a higher "speed" relative to traditional row-based SQL databases on lookup because they sacrifice performance on joining.
If you're looking up data in your MySQL database from a single table with tons of columns (assuming it's not a view/virtual table), and assuming you have it indexed properly for your query (that may be your real problem here), Document Databases like MongoDB won't give you any Big-O benefit over MySQL, so you probably don't want to migrate over for just this reason.
Columnar Storage
These are the most like SQL databases. In fact, some (like Sybase) implement an SQL syntax while others (Cassandra) do not. They store the data in columns rather than rows, so adding and updating are expensive, but most queries are cheap because each column is essentially implicitly indexed.
But, if your query can't use an index, you're in no better shape with a Columnar Store than a regular SQL database.
Graph Storage
Graph Databases expand beyond SQL. Anything that can be represented by Graph theory, including Key-Value, Document Database, and SQL database can be represented by a Graph Database, like neo4j.
Graph Databases make joins as cheap as possible (as opposed to Document Databases), but they have to, because even a simple "row" query would require many joins to retrieve.
A table-scan type query would probably be slower than a standard SQL database because of all of the extra joins to retrieve the data (which is stored in a disjointed fashion).
So what's the solution?
You've probably noticed that I haven't answered your question, exactly. I'm not saying "you're finished," but the real problem is how the query is being performed.
Are you absolutely sure you can't better index your data? There are things such as Multiple Column Keys that could improve the performance of your particular query. Microsoft's SQL Server has a full text key type that would be applicable to the example you provided, and PostgreSQL can emulate it.
The real advantage most NoSQL databases have over SQL databases is Map-Reduce -- specifically, the integration of a full Turing-complete language that runs at high speed that query constraints can be written in. The querying function can be written to quickly "fail out" of non-matching queries or quickly return with a success on records that meet "priority" requirements, while doing the same in SQL is a bit more cumbersome.
Finally, however, the exact problem you're trying to solve: text search with optional filtering parameters, is more generally known as a search engine, and there are very specialized engines to handle this particular problem. I'd recommend Apache Solr to perform these queries.
Basically, dump the text field, the "filter" fields, and the primary key of the table into Solr, let it index the text field, run the queries through it, and if you need the full record after that, query your SQL database for the specific primary keys you got from Solr. It uses some more memory and requires a second process, but will probably best suit your needs here.
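To show the shape of the result you would be after (matching rows plus counts per filter value), here is a toy Python sketch. It is not Solr and says nothing about how Solr works internally: the naive substring match stands in for real full-text search, and the sample rows and the helper names search and facets are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def search(rows: List[dict], term: str, countries: Iterable[int], max_price: float) -> List[dict]:
    """Naive stand-in for the full-text query plus the extra WHERE clauses."""
    term = term.lower()
    allowed = set(countries)
    return [
        row
        for row in rows
        if term in row["big_text"].lower()
        and row["country"] in allowed
        and row["price"] < max_price
    ]


def facets(results: List[dict], fields: List[str]) -> Dict[str, Counter]:
    """Count values for every filter field in a single pass over the results."""
    counts: Dict[str, Counter] = {field: Counter() for field in fields}
    for row in results:
        for field in fields:
            counts[field][row[field]] += 1
    return counts


# Hypothetical rows following the table layout from the question.
rows = [
    {"id": 1, "big_text": "Sales manager wanted", "price": 50, "country": 1, "field1": "full-time", "field2": "NY"},
    {"id": 2, "big_text": "Senior sales rep", "price": 80, "country": 2, "field1": "part-time", "field2": "NY"},
    {"id": 3, "big_text": "Sales director", "price": 150, "country": 1, "field1": "full-time", "field2": "LA"},
    {"id": 4, "big_text": "Java developer", "price": 60, "country": 65, "field1": "full-time", "field2": "NY"},
]

results = search(rows, "sales", [1, 2, 65, 69], 100)
filters = facets(results, ["field1", "field2"])
matching_ids = [row["id"] for row in results]  # keys to fetch full records from SQL
```

All the GROUP BY style counts come from one pass over the already filtered result set, instead of one query per field.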
Why all of this text to get to this answer?
Because the title of your question doesn't really have anything to do with the content of your question, so I answered both. :)

Do you need Solr/Lucene for MongoDB, CouchDB and Cassandra?

If you have an RDBMS, you probably have to use Solr to index your relational tables into fully nested documents.
I'm new to non-SQL databases like MongoDB, CouchDB and Cassandra, but it seems to me that the data you save is already in a document structure, like the documents saved in Solr/Lucene.
Does this mean that you don't have to use Solr/Lucene when using these databases?
Is it already indexed so that you can do full-text search?
It depends on your needs. They have full-text search; in CouchDB the search is Lucene (the same engine Solr uses). Unfortunately, this is just a full-text index. If you need complex scoring or DisMax-type searching, you'll likely want the added capabilities of an independent Solr index.
Solr (Lucene) uses an algorithm to return the documents relevant to a query. It returns a score indicating how relevant each document is to the query.
This differs from what a database (relational or not) does, which is to return the results that match a query and leave out those that do not.
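The difference between matching and scoring can be illustrated with a toy Python sketch. The scoring here is plain term frequency, chosen only because it is short. It is not the algorithm Lucene or Solr actually uses, and the function names and sample documents are made up.

```python
from typing import List, Tuple


def matches(doc: str, query: str) -> bool:
    """Database-style answer: the document either matches or it does not."""
    words = doc.lower().split()
    return all(term in words for term in query.lower().split())


def score(doc: str, query: str) -> float:
    """Search-style answer: fraction of the document's words that are query terms."""
    words = doc.lower().split()
    if not words:
        return 0.0
    hits = sum(words.count(term) for term in query.lower().split())
    return hits / len(words)


def ranked(docs: List[str], query: str) -> List[Tuple[float, str]]:
    """Return (score, document) pairs with a positive score, best first."""
    scored = [(score(doc, query), doc) for doc in docs]
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)


docs = [
    "sales manager wanted urgently today",
    "sales jobs in sales",
    "engineering jobs",
]

hits = [doc for doc in docs if matches(doc, "sales")]  # unordered yes/no result
ordering = ranked(docs, "sales")                       # ordered by relevance
```

Both approaches find the same two documents, but only the scored version says which one should be shown first.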