Can someone tell me if I did this correctly? I am not sure, especially about one thing shown in the second diagram: does the green region mean values in X AND Z, or rather X OR Z?
I made some corrections to the code, but it seems that I am not using parentheses correctly. I don't know if this code is good.
-- 1
/*
// Values stored in Y, that are parts of X and Z
"Y NOT IN (Y EXCEPT (UNION OF X AND Y))"
*/
SELECT Val FROM Y
EXCEPT
SELECT Val FROM X
EXCEPT
SELECT Val FROM Z
-- 2
/*
// Values stored in Y, that are parts of X and Z
"Y NOT IN (Y EXCEPT (UNION OF X AND Y))"
*/
SELECT VAL FROM Y
INTERSECT (
SELECT Val FROM Y
EXCEPT
SELECT Val FROM X
EXCEPT
SELECT Val FROM Z
)
-- 3
/*
// Values stored in X and Z, that are not a part of Y
"(UNION OF X & Z) EXCEPT Y"
*/
SELECT VAL FROM X
UNION
SELECT VAL FROM Z
EXCEPT
SELECT VAL FROM Y
-- 4
/*
// Every value of X, and same values from Y and Z
"(Y NOT IN (Y EXCEPT (UNION OF X AND Y))) UNION X"
*/
SELECT Val FROM X
UNION(
SELECT Val FROM Y
INTERSECT
SELECT Val FROM Z)
I agree with 1, 3 and 4, but 2 should be:
SELECT VAL FROM Y
EXCEPT (
SELECT Val FROM Y
EXCEPT
SELECT Val FROM X
EXCEPT
SELECT Val FROM Z
)
or alternatively:
(SELECT Val FROM Y
INTERSECT
SELECT Val FROM X)
UNION
(SELECT Val FROM Y
INTERSECT
SELECT Val FROM Z)
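As a quick sanity check of the two forms, here is a minimal sketch that uses Python sets in place of the tables. The sample values for X, Y and Z are made up for illustration; SQL EXCEPT maps to set difference, INTERSECT to `&`, and UNION to `|`. It assumes the intended region for query 2 is the part of Y that overlaps X or Z, i.e. (Y INTERSECT X) UNION (Y INTERSECT Z).

```python
# Hypothetical sample data standing in for the Val columns of X, Y and Z.
X = {1, 2, 3, 4}
Y = {3, 4, 5, 6, 7}
Z = {4, 6, 8}

# SELECT Val FROM Y EXCEPT (SELECT Val FROM Y EXCEPT ... X EXCEPT ... Z)
# A chain of EXCEPTs is evaluated left to right, so the inner part is (Y - X) - Z.
only_in_y = (Y - X) - Z
except_form = Y - only_in_y

# (SELECT Val FROM Y INTERSECT ... X) UNION (SELECT Val FROM Y INTERSECT ... Z)
intersect_form = (Y & X) | (Y & Z)

# Both expressions describe the values of Y that also occur in X or in Z.
print(sorted(except_form))
print(sorted(intersect_form))
```

With this data both forms give the same set, which suggests the green region reads as "Y AND (X OR Z)" rather than "Y AND X AND Z".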
I'd like to approximate a given n x m matrix A with n >> m as a weighted sum W of some k rows B (ideally selected from A, but could also be arbitrary). The weights must sum up to 1 and need to be positive.
import numpy as np
n = 1000 # rows
m = 3 # columns
k = 2 # hidden rank
# create random matrix with rank k
A = np.random.rand(n, k).dot(np.random.rand(k, m))
# estimate hidden rank
u, s, vt = np.linalg.svd(A, full_matrices=False, compute_uv=True)
k_est = np.count_nonzero(~np.isclose(s, 0))
# truncate to k_est
B = np.diag(s[:k_est]) @ vt[..., :k_est, :]
W = u[..., :k_est]
# do some magic with B and W to come up with
assert np.all(W >= 0)
assert np.all(np.isclose(W.sum(1), 1))
assert np.all(np.isclose(A, W @ B))
I tried SVD, which is able to reproduce A as W @ B, but the weights are negative and don't sum up to 1.
From my gut feeling it seems like I'm searching for a convex hull of A, but with only k_est points.
I have a matrix M of size (L x N) and I want to add the same vector v of length L to every column of the matrix. Is there a way to do this using Scala Breeze?
I tried:
val H = DenseMatrix.zeros(L,N)
for (j <- 0 until N) {
H (::,j) = M(::,j) + v
}
but this doesn't really fit Scala's immutability as H is then already defined and therefore gives a reassignment to val error. Any suggestions appreciated!
To add a vector to all columns of a matrix, you don't need to loop through the columns; you can use the column broadcasting feature. For your example:
H(::,*) + v // assume v is breeze dense vector
Should work.
import breeze.linalg._
val L = 3
val N = 2
val v = DenseVector(1.0,2.0,3.0)
val H = DenseMatrix.zeros[Double](L, N)
val result = H(::,*) + v
//result: breeze.linalg.DenseMatrix[Double] = 1.0 1.0
// 2.0 2.0
// 3.0 3.0
I am running the following "simple query" on tables a1, a2, ..., a20. Each table a1, a2, ..., a20 has millions of rows, and all of them have the same columns X, Y, Z.
CREATE TABLE A_bis as
SELECT
X, Y, Z
FROM a1
WHERE
Y= 3
UNION
SELECT
X, Y, Z
FROM a2
WHERE
Y= 3
UNION
SELECT
X, Y, Z
FROM a3
WHERE
Y= 3
UNION
...
SELECT
X, Y, Z
FROM a20
WHERE
Y= 3
and I get table A_bis, but it takes at least 20 minutes.
I'd like to:
a) optimize the query so it is faster.
b) improve the code (with a loop?) so I don't have to literally write 7 lines for each of the tables a1, ..., a20 and end up with 130 lines of code.
The comments already answered your question A (basically: add an index on each aX table).
For the question B, you can use PostgreSQL inheritance:
CREATE TABLE aParent (x INT, y INT, z INT);
ALTER TABLE a1 INHERIT aParent;
ALTER TABLE a2 INHERIT aParent;
...
ALTER TABLE a20 INHERIT aParent;
Then you can do
SELECT X, Y, Z FROM aParent WHERE Y = 3;
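If changing the table hierarchy is not an option, the repetitive statement can also be generated by a short script instead of being typed out. Below is a minimal sketch in Python that builds the same UNION query as a string. The helper name `build_union_query` is hypothetical; the table names a1 ... a20 and the filter Y = 3 come from the question. The demo at the end runs the generated SQL against an in-memory SQLite database with three tiny made-up tables, only to show that the text it produces is valid.

```python
import sqlite3


def build_union_query(table_names, y_value=3):
    """Return one 'SELECT X, Y, Z FROM t WHERE Y = ...' per table, joined by UNION."""
    # Table names cannot be passed as bound parameters, so they must come
    # from a trusted list, never from user input.
    selects = [
        f"SELECT X, Y, Z FROM {name} WHERE Y = {int(y_value)}"
        for name in table_names
    ]
    return "\nUNION\n".join(selects)


# The full statement for the twenty tables from the question.
tables = [f"a{i}" for i in range(1, 21)]
query = "CREATE TABLE A_bis AS\n" + build_union_query(tables)

# Small demo on made-up data (three tables only).
conn = sqlite3.connect(":memory:")
demo_tables = ["a1", "a2", "a3"]
for name in demo_tables:
    conn.execute(f"CREATE TABLE {name} (X INT, Y INT, Z INT)")
conn.execute("INSERT INTO a1 VALUES (1, 3, 10)")
conn.execute("INSERT INTO a2 VALUES (1, 3, 10)")  # duplicate row, removed by UNION
conn.execute("INSERT INTO a3 VALUES (2, 4, 20)")  # filtered out by Y = 3
conn.execute("CREATE TABLE A_bis AS\n" + build_union_query(demo_tables))
rows = conn.execute("SELECT X, Y, Z FROM A_bis").fetchall()
print(rows)
```

If duplicate rows across the tables are impossible or acceptable, swapping UNION for UNION ALL in the join string skips the de-duplication step, which may also help with question A.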
I have a LogiQL file with many "complicated" rules.
Here are some examples:
tuple1(x), tuple2(x), function1[y, z] = x <- in_tuple1(x), in_tuple2(x, y), in_tuple3[x, y] = z.
tuple1(x,y) <- (in_tuple1(x,z), in_tuple2(y,z)); in_tuple2(x,y)
For my purposes it would be much better to have only rules in the simple form: only one derived tuple per rule and no "OR" combinations of rules.
Does LogicBlox offer some intermediate representation output that only consists of the simpler rules?
I think there are intermediate representations created, but I don't know how to unearth them. Even if I did, I think my first advice would be to write the simpler rules you want.
I'm quite confident that the first example can be re-written as follows.
Example 1 Before
tuple1(x),
tuple2(x),
function1[y, z] = x
<-
in_tuple1(x),
in_tuple2(x, y),
in_tuple3[x, y] = z.
Example 1 After
tuple1(x) <- in_tuple1(x), in_tuple2(x, y), in_tuple3[x, y] = _.
tuple2(x) <- in_tuple1(x), in_tuple2(x, y), in_tuple3[x, y] = _.
/** alternatively
tuple1(x) <- function1[_, _] = x.
tuple2(x) <- function1[_, _] = x.
**/
function1[y, z] = x
<-
in_tuple1(x),
in_tuple2(x, y),
in_tuple3[x, y] = z.
I'm a little less confident with the second one. No conflicts between the two rules jump out at me. If there is a problem here you may get a functional dependency violation, which you'll know by the output or logging of "Error: Function cannot contain conflicting records."
Example 2 Before (assumed complete clause with "." at end)
tuple1(x,y)
<-
(
in_tuple1(x,z),
in_tuple2(y,z)
)
;
in_tuple2(x,y).
Example 2 After
tuple1(x,y)
<-
in_tuple1(x,z),
in_tuple2(y,z).
tuple1(x,y)
<-
in_tuple2(x,y).
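To see why the split in Example 2 should preserve the result, here is a minimal sketch that evaluates both versions over small made-up relations. It uses Python sets of tuples rather than LogicBlox itself, so it only illustrates the logic, not the engine's behavior; the facts in `in_tuple1` and `in_tuple2` are invented for illustration.

```python
# Made-up input facts; each relation is a set of tuples.
in_tuple1 = {("a", 1), ("b", 2)}
in_tuple2 = {("c", 1), ("d", 2), ("e", 3)}

# Every value that appears anywhere, used as the candidate domain.
domain = {v for fact in in_tuple1 | in_tuple2 for v in fact}

# Before: tuple1(x,y) <- (in_tuple1(x,z), in_tuple2(y,z)) ; in_tuple2(x,y).
# The disjunctive body is evaluated as one boolean condition per (x, y).
before = {
    (x, y)
    for x in domain
    for y in domain
    if any((x, z) in in_tuple1 and (y, z) in in_tuple2 for z in domain)
    or (x, y) in in_tuple2
}

# After: two rules with the same head; the derived predicate is the
# union of what each rule produces on its own.
rule_a = {
    (x, y)
    for (x, z1) in in_tuple1
    for (y, z2) in in_tuple2
    if z1 == z2
}
rule_b = set(in_tuple2)
after = rule_a | rule_b

print(before == after)
```

Because `tuple1` is a plain relation and not a functional predicate, two rules deriving the same tuple simply overlap in this model; a conflict is only conceivable for heads of the form `f[...] = v`.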
Is there any way to execute the following SQL query in HiveQL?
select * from my_table
where (a,b,c) not in (x,y,z)
where a,b,c correspond respectively to x,y,z
Thanks:)
You'll have to break these down to separate conditions:
SELECT *
FROM my_table
WHERE a != x AND b != y AND c != z
Is this what you intend?
where a <> x or b <> y or c <> z
Or this?
where a not in (x, y, z) and
b not in (x, y, z) and
c not in (x, y, z)
Or some other variation?
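The variations are not interchangeable. Assuming the intent is a row-wise comparison, i.e. that the triple (a, b, c) differs from the triple (x, y, z), De Morgan's law gives the OR form; the AND form is stricter and rejects every row where even one column matches. Here is a minimal sketch that checks this by brute force over small made-up values (the helper names are hypothetical, and SQL NULL handling is not modeled):

```python
from itertools import product


def tuple_differs(a, b, c, x, y, z):
    # Row-wise comparison: the triple (a, b, c) is not the triple (x, y, z).
    return (a, b, c) != (x, y, z)


def or_form(a, b, c, x, y, z):
    # WHERE a <> x OR b <> y OR c <> z
    return a != x or b != y or c != z


def and_form(a, b, c, x, y, z):
    # WHERE a != x AND b != y AND c != z
    return a != x and b != y and c != z


# Every combination of three small values for the six columns.
rows = list(product(range(3), repeat=6))

# The OR form agrees with the row-wise comparison on every input.
or_matches = all(tuple_differs(*r) == or_form(*r) for r in rows)

# The AND form drops rows that the row-wise comparison would keep.
dropped_by_and = [r for r in rows if tuple_differs(*r) and not and_form(*r)]

print(or_matches)
print(dropped_by_and[0])
```

Which form is right depends on what the question meant by "not in"; the sketch only shows that the choice changes the result set.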