New collection or nested array? - mongodb

Suppose we have a database called A. The structure of A can be one of:
1) A(a, b, c, d).
a, b, c and d are collections, and each element in a collection looks like
{ _id : id, data : data }
2) A(k).
k is a collection, and a, b, c and d are elements inside k, each shaped like
{
  type : 'a' / 'b' / 'c' / 'd',
  data : [
    { _id : id1, data : data1 },
    { _id : id2, data : data2 },
    ...
  ]
}
The daily operations are: get from, insert an element into, and empty the elements of a, b, c and d.
Which one is better in terms of efficiency?

@Markus-W-Mahlberg is right about your actual use case.
Since MongoDB stores documents rather than tabular data (as MS SQL does), both of your approaches work fine, and with the right indexes you get the same performance.
But in my opinion, if your types (a, b, c and d) have different structures (different properties, queries, update scenarios, aggregation plans and so on), use way 1; otherwise use way 2 with the right index.
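
As a rough sketch, here is what the two layouts and the daily operations might look like in pymongo; the connection, names and values are illustrative assumptions, not code from the question:

from pymongo import MongoClient

db = MongoClient().A  # assumed local connection; database "A" as in the question

# Way 1: one collection per type.
db.a.insert_one({"_id": 1, "data": "data1"})   # insert an element into a
doc = db.a.find_one({"_id": 1})                # get an element of a
db.a.delete_many({})                           # empty a

# Way 2: a single collection k with one document per type.
db.k.create_index("type")                      # the "right index" mentioned above
db.k.update_one(
    {"type": "a"},
    {"$push": {"data": {"_id": 1, "data": "data1"}}},
    upsert=True,
)                                              # insert an element into a
doc = db.k.find_one({"type": "a"})             # get all elements of a
db.k.update_one({"type": "a"}, {"$set": {"data": []}})  # empty a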


The right way to normalize database into 3NF

I have this database:
R(A, B, C, D, E)
Keys: A
F = {A -> B, D -> E, C -> D}
I normalize it into 3NF like this:
R(A, B, C, D, E)
Keys: AD
F = {AD -> B, AD -> E, C -> D}
What I do: when I check D -> E, D is not a superkey and E is not a key attribute, so I treat D and A together as the superkey {AD}. When I check C -> D, C is not a key but D is a key attribute, so it's OK.
Is my normalization correct?
There is a problem in your input data. If the relation R has the dependencies F = {A -> B, D -> E, C -> D}, then A cannot be a key. In fact, a key is a set of attributes whose closure determines all the attributes of the relation, which is not the case here, since:
A+ = AB
From F, the (only) possible key is AC, in fact
AC+ = ABCDE
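Such closures are easy to check mechanically; here is a small sketch of the standard attribute-closure algorithm, written in plain Python for this answer:

def closure(attrs, fds):
    # fds is a list of (lhs, rhs) pairs of attribute sets
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F = [({"A"}, {"B"}), ({"D"}, {"E"}), ({"C"}, {"D"})]
print(sorted(closure({"A"}, F)))       # ['A', 'B']
print(sorted(closure({"A", "C"}, F)))  # ['A', 'B', 'C', 'D', 'E']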
Normalizing means reducing redundancy by decomposing a relation into other relations in which the functional dependencies do not violate the normal form, and such that by joining the decomposed relations one can obtain the original relation.
In your solution you do not decompose the relation; you only replace the set of dependencies with other dependencies that are not implied by the first set.
A correct decomposition would instead be the following:
R1 < (A, B), { A → B } >
R2 < (C, D), { C → D } >
R3 < (D, E), { D → E } >
R4 < (A, C), { } >
The algorithm to decompose a relation into 3NF can be found in any good book on databases.
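If it helps, here is a sketch of that synthesis algorithm in Python; it assumes the dependencies already form a minimal cover (as F does here) and is meant as an illustration, not production code:

def synthesize_3nf(fds, key):
    # group dependencies by their left-hand side: one schema per group
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(frozenset(lhs), set()).update(rhs)
    schemas = [set(lhs) | rhs for lhs, rhs in groups.items()]
    # if no schema contains a candidate key, add one
    if not any(set(key) <= s for s in schemas):
        schemas.append(set(key))
    # drop any schema contained in another schema
    return [s for s in schemas if not any(s < t for t in schemas)]

F = [({"A"}, {"B"}), ({"D"}, {"E"}), ({"C"}, {"D"})]
for s in synthesize_3nf(F, {"A", "C"}):
    print(sorted(s))  # ['A', 'B'], ['D', 'E'], ['C', 'D'], ['A', 'C'] -- i.e. R1..R4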

Anorm is just locking on executeUpdate

I have a very simple Play 2.1 Scala project. As in, this is the only code in it so far. I have a task which I am running with an Akka.system.scheduler. I have some code to select from the database (currently the standard test H2 instance) and I'm following the documentation example almost exactly.
DB.withConnection { implicit c =>
  Logger.info("2")
  var x = SQL("insert into x (a, b, c) values ({a, b, c})").on(
    'a -> a,
    'b -> b,
    'c -> c
  )
  Logger.info("2.5")
  x.executeUpdate()
  Logger.info("3")
}
It never gets past 2.5. I haven't got any other database operations happening (except for evolutions).
Help?
Based on your link, shouldn't your SQL statement look like:
var x = SQL("insert into x (a, b, c) values ({a}, {b}, {c})").on(
  "a" -> a,
  "b" -> b,
  "c" -> c
)
In the question, the placeholders don't have individual braces: {a, b, c} instead of {a}, {b}, {c}.

Will the real unique index please stand up?

I recently encountered a DB2 table that has three different indexes that are unique.
Index 1 (Columns: A, B, C)
Index 2 (Columns: A, B, C, D)
Index 3 (Columns: A, B, C, D, E)
Is the most specific one the actual unique index? Or does the definition of uniqueness differ depending on which index DB2 uses to access the table?
I'm a bit confused, since Index 1 suggests that as long as my values for A, B and C are unique, I can have duplicate values for D and E. But then there's Index 3 saying that A, B, C, D, E must be unique, so I can't have duplicate values for D and E after all?
Quite the opposite: for uniqueness, the only unique index that counts is Index 1. Once (A, B, C) is unique, any superset such as (A, B, C, D) or (A, B, C, D, E) is automatically unique as well, so Indexes 2 and 3 add no further restriction on the data.
I haven't tried it, but for access purposes DB2 will use whichever index is best for the actual query you are performing.
For instance, if you are querying { A=1, B=2, C=3 } it should use Index 1; if you are querying { A=1, B=2, C=3, D=4 } it should use Index 2, even though it could just use Index 1, and you won't see any performance gain either way.
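
To see why only Index 1 actually constrains the data, here is a small illustration in plain Python (not DB2): rows can violate the narrow column set while satisfying the wider ones, but never the other way around.

rows = [
    (1, 2, 3, 4, 5),
    (1, 2, 3, 4, 6),  # same (A, B, C, D), different E
]

def is_unique(rows, cols):
    # True if no two rows share the same values in the given column positions
    keys = [tuple(row[c] for c in cols) for row in rows]
    return len(keys) == len(set(keys))

print(is_unique(rows, (0, 1, 2)))        # False: these rows would violate Index 1 (A, B, C)
print(is_unique(rows, (0, 1, 2, 3, 4)))  # True: yet they satisfy Index 3 (A, B, C, D, E)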

Talend: Aggregation summary?

I've seen how to use tMap in TOS to map different fields in a SQL-like JOIN. How do I aggregate based on certain fields?
If I have two tables:
[ A, B, C, D ]
and
[ B, E, F, G ]
that are tMap'ped to [ B, C, F, G ],
how can I aggregate the results so that, instead of many entries for each non-unique B, I see something like:
[ B, count(B), avg(C), avg(F), avg(G) ]
Thanks!
You certainly can. Use the tAggregate component to do that. You can group by column B and then compute all of the different aggregations, like count, sum, and average in the other columns.
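
For illustration only, the tAggregate step computes the equivalent of this pandas group-by; the data is made up and the column names follow the question:

import pandas as pd

# the output of the tMap join, with the columns from the question (made-up data)
df = pd.DataFrame({
    "B": [1, 1, 2],
    "C": [10.0, 20.0, 30.0],
    "F": [1.0, 2.0, 3.0],
    "G": [5.0, 6.0, 7.0],
})

summary = df.groupby("B").agg(
    count_B=("B", "size"),
    avg_C=("C", "mean"),
    avg_F=("F", "mean"),
    avg_G=("G", "mean"),
).reset_index()
print(summary)  # one row per unique B: [B, count(B), avg(C), avg(F), avg(G)]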

Nested array comprehensions in CoffeeScript

In Python
def cross(A, B):
"Cross product of elements in A and elements in B."
return [a+b for a in A for b in B]
returns a one-dimensional array if you call it with two arrays (or strings); for example, cross([1, 2], [3, 4]) returns [4, 5, 5, 6].
But in CoffeeScript
cross = (A, B) -> (a+b for a in A for b in B)
returns a two-dimensional array.
Do you think it's by design in CoffeeScript or is it a bug?
How do I flatten arrays in CoffeeScript?
First I would say that two array comprehensions on one line is not a very maintainable pattern. So let's break it down a little.
cross = (A, B) ->
  for a in A
    for b in B
      a + b

alert JSON.stringify(cross [1,2], [3,4])
What's happening here is that the inner loop creates a closure, which has its own comprehension collector. So it runs all the b's, then returns the results as an array, which gets pushed onto the parent comprehension's result collector: with the inputs above you get [[4,5],[5,6]] rather than [4,5,5,6]. You are sort of expecting a return value from an inner loop, which is a bit funky.
Instead I would simply collect the results myself.
cross = (A, B) ->
  results = []
  for a in A
    for b in B
      results.push a + b
  results

alert JSON.stringify(cross [1,2], [3,4])
Or if you still wanted to do some crazy comprehension magic:
cross = (A, B) ->
  results = []
  results = results.concat a+b for b in B for a in A
  results

alert JSON.stringify(cross [1,2], [3,4])
Whether this is a bug in CS or not is a bit debatable, I suppose. But I would argue it's good practice to do more explicit comprehension result handling when dealing with nested iterators.
https://github.com/jashkenas/coffee-script/issues/1191