I recently encountered a DB2 table that has three different unique indexes:
Index 1 (Columns: A, B, C)
Index 2 (Columns: A, B, C, D)
Index 3 (Columns: A, B, C, D, E)
Is the most specific one the actual unique index? Or does the definition of uniqueness differ depending on which index DB2 uses to access the table?
I'm a bit confused: index 1 suggests that as long as my values for A, B, C are unique, I can have duplicate values for D and E. But then there's index 3 saying that A, B, C, D, E must be unique, so I can't have duplicate values for D and E after all?
Quite the opposite: the only unique index that counts for uniqueness is Index 1. If no two rows can share the same (A, B, C), then no two rows can share the same (A, B, C, D) or (A, B, C, D, E) either, so the uniqueness of Indexes 2 and 3 is already implied by Index 1.
I haven't tried it, but for access purposes, DB2 would use whichever index is best for the actual query you are performing.
For instance, if you are querying { A=1, B=2, C=3 }, it should use Index 1;
if you are querying { A=1, B=2, C=3, D=4 }, it should use Index 2, although it could just as well use Index 1 and you would not see any performance difference.
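To make this concrete, here is a minimal sketch in Python (not DB2 code) showing a pair of rows that Index 1 rejects but Index 3 alone would allow:

# Two rows that share (A, B, C) but differ in D and E.
rows = [(1, 2, 3, 4, 5), (1, 2, 3, 9, 9)]

def is_unique(rows, cols):
    # A column set is unique if no two rows share the same key built from it.
    keys = [tuple(row[i] for i in cols) for row in rows]
    return len(keys) == len(set(keys))

print(is_unique(rows, (0, 1, 2)))        # False: Index 1 (A, B, C) rejects these rows
print(is_unique(rows, (0, 1, 2, 3, 4)))  # True: Index 3 alone would have allowed them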
I started using OrientDB and got stuck on some weird behavior with complex queries.
I'd like to get the graph neighborhood of several nodes: the nodes whose string attribute contains the substring "AD", the nodes whose ontology_id attribute has the value D017628, and the nodes related to them up to depth 3.
Could anyone explain why I get different results for the combined query with "OR":
TRAVERSE * FROM (SELECT * FROM V WHERE ontology_id="D017628" or string LIKE "%AD%") WHILE $depth<=3
I can get more results if I decompose the query above into two separate queries and run them separately:
TRAVERSE * FROM (SELECT * FROM V WHERE ontology_id="D017628") WHILE $depth<=3
TRAVERSE * FROM (SELECT * FROM V WHERE string LIKE "%AD%") WHILE $depth<=3
Why does this happen? Isn't it correct to join different criteria with an OR inside the SELECT?
To understand the TRAVERSE statement, you have to take into consideration two details:
by default, the traversal is depth-first
traversed nodes are never re-traversed.
Now consider a graph made of four vertices:
A, B, C, D
connected as follows:
A -> B -> C -> D
A -> C
(a triangle made of A, B, C, plus an edge from C to D)
Now, suppose you do the TRAVERSE starting from A, with $depth <= 2
You can get this traversal:
A (depth = 0)
A -> B (depth = 1)
B -> C (depth = 2)
C -> D (depth = 3, so D is discarded)
A -> C (you encounter C, which was already traversed, so this path is discarded)
The result of this traversal is, of course, A, B, C.
Now suppose that (by pure luck) the traversal starts with the other path:
A (depth = 0)
A -> C (depth = 1)
C -> D (depth = 2)
A -> B (depth = 1)
B -> C (you encounter C, which was already traversed, so this path is discarded)
The result in this case is A, C, D, B.
As you can see, you have two different results based on how the graph is traversed.
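To see this outside OrientDB, here is a minimal Python sketch of a depth-limited, depth-first traversal that never revisits nodes, run over the example graph with the two neighbor orders (the dict-of-lists encoding is just for illustration):

def traverse(graph, start, max_depth):
    visited = []
    def dfs(node, depth):
        # Discard paths that are too deep or that reach an already-traversed node.
        if depth > max_depth or node in visited:
            return
        visited.append(node)
        for neighbor in graph.get(node, []):
            dfs(neighbor, depth + 1)
    dfs(start, 0)
    return visited

g1 = {"A": ["B", "C"], "B": ["C"], "C": ["D"]}  # B is explored first
g2 = {"A": ["C", "B"], "B": ["C"], "C": ["D"]}  # C is explored first

print(traverse(g1, "A", 2))  # ['A', 'B', 'C']: D is only reachable at depth 3
print(traverse(g2, "A", 2))  # ['A', 'C', 'D', 'B']: D is reached at depth 2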
A possible solution here is to try to force the traversal to use a breadth-first strategy:
TRAVERSE * FROM (SELECT * FROM V WHERE ontology_id="D017628" or string LIKE "%AD%") WHILE $depth<=3 STRATEGY BREADTH_FIRST
This will help, but it probably won't completely solve the problem.
Another possible solution is to write a single query with a unionAll():
SELECT unionAll($a, $b)
LET
$a = (TRAVERSE * FROM (SELECT * FROM V WHERE ontology_id="D017628") WHILE $depth<=3),
$b = (TRAVERSE * FROM (SELECT * FROM V WHERE string LIKE "%AD%") WHILE $depth<=3)
If you are doing an actual graph traversal (so you can replace TRAVERSE * with TRAVERSE both()), you can rewrite the query with a MATCH statement (the MATCH does not discard duplicates):
MATCH
{class:V, as:a, where:(ontology_id="D017628" or string LIKE "%AD%")} -E- {as:b, while:($depth < 3)}
RETURN $elements
Suppose we have a database called A. The structure of A can be:
1) A(a, b, c, d)
a, b, c, d are collections,
and the element in each collection looks like { _id: id, data: data }
2) A(k)
k(a, b, c, d)
k is a collection, and a, b, c, d are elements inside k.
a, b, c, d look like:
{
type : 'a / b / c / d',
data : [
{_id : id1, data : data1 },
{_id : id2, data : data2},
...
]
}
The daily operations on a, b, c and d are: get, insert an element, and empty.
Which one is better in terms of efficiency?
@Markus-W-Mahlberg is right about your actual use case.
Since you are using MongoDB, which stores documents rather than tabular data structures (as MS SQL does), both of your approaches work fine, and if you define the right index you get the same performance.
But in my opinion, if your types (a, b, c and d) have different structures (different properties, different queries, different update scenarios, different aggregation plans, and so on), use way 1; otherwise use way 2 with the right index.
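To compare the two layouts concretely, here is a minimal sketch of the three daily operations with pymongo; the collection names and document contents are placeholders, and a locally running mongod is assumed:

from pymongo import MongoClient

db = MongoClient()["A"]

# Way 1: a, b, c, d are separate collections.
db.a.insert_one({"_id": 1, "data": "payload"})  # insert an element into a
docs = list(db.a.find())                        # get the elements of a
db.a.delete_many({})                            # empty a

# Way 2: a single collection k with one document per type.
db.k.update_one({"type": "a"},
                {"$push": {"data": {"_id": 1, "data": "payload"}}},
                upsert=True)                            # insert an element into a
doc = db.k.find_one({"type": "a"})                      # get the elements of a
db.k.update_one({"type": "a"}, {"$set": {"data": []}})  # empty a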
I have 2 collections: a is a sequence of Scala objects of class C; b is a sequence of strings. C has a string field, name, that could possibly match an item in b. What I want is to loop through a and find all the c.name values that match one of the items in b. How do I do this in Scala?
Iterating through both a and b can get expensive because one loop nested inside another yields O(n^2) time. If b is sufficiently large, you probably want to make it into a Set first to bring this down to O(n).
val bSet = b.toSet
a.filter(c => bSet.contains(c.name))
I would read this as "Apply the following filter to a: for each item c in a, include it in the result if and only if the name of c is in b."
Here's the equivalent as a for comprehension with yield:
for (c <- a if bSet.contains(c.name)) yield c.name
I've seen how to use tMap in TOS to map different fields in a SQL-like JOIN. How do I aggregate based on certain fields?
If I have two tables:
[ A, B, C, D ]
[ B, E, F, G ]
that are tMap'ped to [ B, C, F, G ],
how can I aggregate the results so that, instead of the many entries of the non-unique B, I can see something like:
[ B, count(B), avg(C), avg(F), avg(G) ]
Thanks!
You certainly can. Use the tAggregateRow component to do that: group by column B, then compute all of the different aggregations, like count, sum, and average, in the other columns.
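Outside Talend, a small Python sketch can show what that aggregation should produce; the rows below are made-up placeholders for the output of the tMap join:

from collections import defaultdict

# Joined rows as they come out of the tMap: (B, C, F, G)
rows = [(1, 2.0, 3.0, 4.0), (1, 4.0, 5.0, 6.0), (2, 1.0, 1.0, 1.0)]

groups = defaultdict(list)
for b, c, f, g in rows:
    groups[b].append((c, f, g))

def avg(values):
    return sum(values) / len(values)

# One output row per distinct B: [ B, count(B), avg(C), avg(F), avg(G) ]
for b, vals in groups.items():
    print(b, len(vals),
          avg([v[0] for v in vals]),
          avg([v[1] for v in vals]),
          avg([v[2] for v in vals]))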
In Python
def cross(A, B):
"Cross product of elements in A and elements in B."
return [a+b for a in A for b in B]
returns a one-dimensional list if you call it with two lists (or strings).
But in CoffeeScript
cross = (A, B) -> (a+b for a in A for b in B)
returns a two-dimensional array.
Do you think it's by design in CoffeeScript or is it a bug?
How do I flatten arrays in CoffeeScript?
First I would say that two array comprehensions on one line are not a very maintainable pattern, so let's break it down a little.
cross = (A, B) ->
for a in A
for b in B
a+b
alert JSON.stringify(cross [1,2], [3,4])
What's happening here is that the inner loop creates a closure, which has its own comprehension collector. It runs through all the b's, then returns the results as an array, which gets pushed onto the parent comprehension's result collector. You are effectively expecting a return value from an inner loop, which is a bit funky.
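For comparison, the CoffeeScript above corresponds to the nested comprehension in Python, not the flat one from the question:

A, B = [1, 2], [3, 4]
# One comprehension with two for clauses flattens: [4, 5, 5, 6]
flat = [a + b for a in A for b in B]
# A comprehension nested inside another keeps the structure: [[4, 5], [5, 6]]
nested = [[a + b for b in B] for a in A]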
Instead I would simply collect the results myself.
cross = (A, B) ->
results = []
for a in A
for b in B
results.push a + b
results
alert JSON.stringify(cross [1,2], [3,4])
Or if you still wanted to do some crazy comprehension magic:
cross = (A, B) ->
results = []
results = results.concat a+b for b in B for a in A
results
alert JSON.stringify(cross [1,2], [3,4])
Whether this is a bug in CS or not is a bit debatable, I suppose. But I would argue that it's good practice to handle comprehension results more explicitly when dealing with nested iterators.
https://github.com/jashkenas/coffee-script/issues/1191