Imitate a join in a NoSQL document database (MongoDB)

Are there any workarounds for executing a join-like query against a NoSQL document database?
Example: we need to select last month's articles by users with a rating above 1000.
The SQL solution would be:
SELECT a.* FROM Articles AS a
INNER JOIN Users AS u ON a.UserId = u.Id
WHERE a.Date > NOW() - INTERVAL '1 month' AND u.Rating > 1000
I can imagine several NoSQL solutions. The first is a two-query solution:
Retrieve users with a rating above 1000
Retrieve last month's articles for those users
I don't like it: it takes two queries, and I have to retrieve every user with rating > 1000 (what if I have a million users?).
The other NoSQL solution that comes to mind is denormalization, but I am not a big fan of it. I would not mind embedding a comments collection in a post entity (because the comments belong to the post), but I don't like putting the user inside an article or the articles inside a user.
Are there any other solutions?

You can do this in RavenDB using Multi Map/Reduce indexes.

RavenDB's Multi Map indexes handle this scenario very well:
http://ayende.com/blog/89089/ravendb-multi-maps-reduce-indexes

Another solution may be playOrm, which lets you partition a table and then select and join on partitions. It is basically like Hibernate, with JQL and all, except that you query within partitions. If you partition by month, you could run a plain old select on that month's partition and join it with something else. So NoSQL now has joins, through playOrm ;). Of course, it does not do joins on huge tables: a partition needs to be comparable in size to an RDBMS table for joins to work, but the table itself can be infinite in size (i.e., you can have infinitely many partitions).

Related

Google Data Studio: Connect to multiple schemas in multi-tenant postgres DB

I have a multi-tenant database in Postgres: one schema per customer, and each schema has a fixed set of tables.
When I connect to the DB using Google Data Studio (GDS), I only see the table names without their associated schemas.
How do I connect to tables belonging to one or more schemas?
Also, what do I do if my tables have more than 700k rows, since GDS has a limit on the number of rows that can be queried, right?
You'll have to use the "Custom Query" option instead of the basic table selection if you need anything more complex.
Regarding the row limit: I wasn't aware of it, but if it exists, I'd suggest using the Custom Query to pre-group your rows into whatever granularity makes sense (days, months, etc.) to bring the row count down.
Data Studio will likely choke on anywhere near that many rows and make for a horrible user experience. Let Postgres do as much of the heavy lifting as you can.
Answering only: "what do I do if my tables have more than 700k rows, since GDS has a limit on the number of rows that can be queried, right?"
Not exactly. The limit is on the number of rows returned, not the number of rows queried. That matters because Data Studio will almost always push queries down to connectors.
Here's an example: let's say you have a purchase table in a PostgreSQL DB with 1M+ rows, where each record is a purchase event. You add this table as a data source in your report and add a bar chart that shows the average purchase by customer type. Let's say you have 12 customer types. Data Studio will push the GROUP BY clause down to the PostgreSQL DB, so your result will have only 12 rows of data instead of 1M+. For most chart types, Data Studio aggregates or pages the results, issuing a query statement that limits the number of rows returned.
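In SQL terms, the statement pushed down to PostgreSQL might look roughly like this (the purchase table and its columns are illustrative, following the example above):

-- Aggregation is pushed down to PostgreSQL: one row per customer type
SELECT customer_type, AVG(purchase_amount) AS avg_purchase
FROM purchase
GROUP BY customer_type;

Only the ~12 aggregated rows travel back to Data Studio, which is why the 1M+ source rows never touch the row limit.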
You will only run into the limit if you create a scenario where Data Studio cannot apply aggregation or paging to the query results, or if the aggregated results themselves exceed the row limit.

Can I have a list of foreign keys as a single field? [duplicate]

Newbie here, trying to figure out the best way to design a Postgres DB for the following use case.
There is an Account table for the business customers, and a Contacts table related to it via a foreign key column:
account.pk_id, ….
contacts.pk_id, contacts.fk_accountid …
Thousands of different businesses in the Accounts table will each store millions of contacts in the Contacts table.
Over time, each contact record will belong to between 1 and 100 different categories, lists, and products.
If I use a classic SQL parent/child relationship, I potentially end up with millions and millions of rows in junction tables such as contacts_categories, contacts_lists, and contacts_products, referencing the Categories, Lists, and Products tables.
Alternatively, I could store the related keys (UUIDs) for categories, lists, and products in three character varying[] array columns on the contact record row. This would eliminate the need for the contacts_categories, contacts_lists, and contacts_products tables, which would otherwise be quite large.
With tools like unnest(), array_append(), and array indexing, it seems like a smart solution, but I'm curious whether it is better to stick with normalized relations, and accept the extra tables and row counts, for performance and/or storage cost.
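For illustration, the array variant might look like this (the column names are made up):

CREATE TABLE contacts (
    pk_id        uuid PRIMARY KEY,
    fk_accountid uuid NOT NULL REFERENCES account (pk_id),
    category_ids character varying[],  -- keys into Categories
    list_ids     character varying[],  -- keys into Lists
    product_ids  character varying[]   -- keys into Products
);

-- "Which contacts are in a given category?" without a junction table:
SELECT pk_id
FROM contacts
WHERE 'some-category-uuid' = ANY (category_ids);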
Anybody tried this before ?
Too many people have tried that, and it is a bad idea. Many of your queries, particularly joins, will become complicated and slow. Besides, you won't be able to have foreign key constraints to guarantee data integrity.
Relational databases are good at coping with millions of rows in a table. Keep your schema normalized.
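As a minimal sketch of what staying normalized looks like (uuid keys assumed, table names follow the question):

CREATE TABLE contacts_categories (
    contact_id  uuid NOT NULL REFERENCES contacts (pk_id),
    category_id uuid NOT NULL REFERENCES categories (pk_id),
    PRIMARY KEY (contact_id, category_id)
);

-- The "contacts in a given category" lookup stays a plain indexed join,
-- and the REFERENCES clauses guarantee no orphaned keys:
SELECT c.*
FROM contacts AS c
JOIN contacts_categories AS cc ON cc.contact_id = c.pk_id
WHERE cc.category_id = '00000000-0000-0000-0000-000000000000';  -- placeholder uuid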

How to fetch data from multiple tables in NoSQL using the sembast plugin

I am not able to fetch data across multiple tables that have relationships. Can anyone help, as per the image below?
In the end, I need data like
ORDER_ID, USER_FULL_NAME, PRODUCT_NAME, PRODUCT_PRICE from 3 different tables.
Please help me out.
Sembast does not provide a way to query multiple stores in one join query. However, getting an entity by id (here a user or a product) is almost immediate (store.record(id).get(db) or store.records(ids).get(db)), so I think your best bet is to query the order_items store and fetch the users and products by id.
Basically, using 3 requests in a transaction to ensure data integrity should perform the join you want.

Restrict list of employees in NMBRS to just a few companies

I am creating a report on sick leave on nmbrs.nl using Invantive SQL.
By default this query retrieves data across all companies:
select *
from employees emp
join employeeabsence(emp.id)
This takes an enormous amount of time, since one SOAP request is done per company, plus one SOAP request per employee to retrieve the absence data.
Is there an efficient way to restrict it to just a few companies instead of thousands?
You can use the 'use' statement to select a partition, which in this case is actually a company.
With use you can use a query like:
use select code from systempartitions#datadictionary where lower(name) like '%companyname%' limit 10
to retrieve the first 10 companies with a specific name.
Also see answer on use with alias on how to also specify the data container alias when running distributed queries.
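Putting the two together, a sketch (the company-name filter is illustrative):

use select code from systempartitions#datadictionary where lower(name) like '%companyname%' limit 10

select *
from employees emp
join employeeabsence(emp.id)

After the use statement, the select runs only against the chosen partitions (companies), so SOAP requests are made only for their employees.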

TSQL - Deleting with Inner Joins and multiple conditions

My question is a variation on one already asked and answered (TSQL Delete Using Inner Joins), but I have an extra level of complexity and couldn't see a solution to it.
My requirement is to delete Special Prices which haven't been accessed in 90 days. Special Prices are keyed on customer ID and product ID, and the products have to be matched against a Customer Order Detail table which also contains a customer ID and a product ID. I want to write one statement that looks at the Special Price table for each customer, compares each of that customer's products with the Customer Order Detail table, and deletes the row from the Special Price table if the maximum order date is more than 90 days before today.
I know I could use a CURSOR (slow but effective) but would prefer a single query like the one in the TSQL Delete Using Inner Joins example. Any ideas, and/or is more information required?
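A single-statement sketch under assumed names (SpecialPrice, CustomerOrderDetail, CustomerId, ProductId, and OrderDate are guesses based on the description):

-- Delete special prices whose most recent matching order is older than 90 days
DELETE sp
FROM SpecialPrice AS sp
INNER JOIN (
    SELECT CustomerId, ProductId, MAX(OrderDate) AS LastOrderDate
    FROM CustomerOrderDetail
    GROUP BY CustomerId, ProductId
) AS lastOrders
    ON lastOrders.CustomerId = sp.CustomerId
   AND lastOrders.ProductId = sp.ProductId
WHERE lastOrders.LastOrderDate < DATEADD(DAY, -90, GETDATE());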
I can't dig deeper into the details of your system, but if it is an option for you, check out the MERGE statement; it might help you avoid cursors.
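A sketch of the MERGE route, using the same assumed names as the DELETE above:

-- WHEN MATCHED AND <condition> THEN DELETE removes stale special prices in one pass
MERGE SpecialPrice AS sp
USING (
    SELECT CustomerId, ProductId, MAX(OrderDate) AS LastOrderDate
    FROM CustomerOrderDetail
    GROUP BY CustomerId, ProductId
) AS lastOrders
    ON sp.CustomerId = lastOrders.CustomerId
   AND sp.ProductId = lastOrders.ProductId
WHEN MATCHED AND lastOrders.LastOrderDate < DATEADD(DAY, -90, GETDATE())
    THEN DELETE;

Either form avoids the cursor; MERGE mainly becomes interesting if you later also want to update or insert rows in the same statement.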