OrientDB: Efficient way to select records with a value equal to the max of all such values? - orientdb

I'm not sure how to do this without using a JOIN (which ODB doesn't have, of course). In "generic" SQL, you might do something like this:
Select * FROM table
INNER JOIN
(SELECT max(field) AS max_of_field, key FROM table GROUP BY key) sub
ON table.field = sub.max_of_field AND table.key = sub.key
Is there an efficient way to do this in ODB, using SELECT and/or MATCH?
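For reference only, here is a minimal sketch of what the generic SQL join above returns, run against SQLite through Python's sqlite3 module. It does not use OrientDB, and the table name `readings` and the columns `grp` (standing in for `key`) and `note` are hypothetical.

```python
import sqlite3

# Hypothetical sample data: two groups, each with a low and a high value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (grp TEXT, field INTEGER, note TEXT);
    INSERT INTO readings VALUES
        ('a', 1, 'a-low'), ('a', 5, 'a-high'),
        ('b', 3, 'b-high'), ('b', 2, 'b-low');
""")

# Join each row against the per-group maximum and keep only the rows
# whose value equals that maximum.
max_rows = conn.execute("""
    SELECT t.grp, t.field, t.note
    FROM readings AS t
    INNER JOIN (
        SELECT grp, max(field) AS max_of_field
        FROM readings
        GROUP BY grp
    ) AS sub
      ON t.field = sub.max_of_field AND t.grp = sub.grp
    ORDER BY t.grp
""").fetchall()

print(max_rows)
```

If several rows in a group tie for the maximum, this join returns all of them.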

Related

Transpose/Pivot a table in Postgres

I have been trying for hours to transpose one table into another one this way:
My idea is to take an expression (which can be a simple SELECT * FROM X INNER JOIN Y ...) and transpose its result into a MATERIALIZED VIEW.
The problem is that the original table can have an arbitrary number of rows (hence columns in the transposed table). So I was not able to find a working solution, not even with colpivot.
Can this ever be done?
Use conditional aggregation:
select "user",
max(value) filter (where property = 'Name') as name,
max(value) filter (where property = 'Age') as age,
max(value) filter (where property = 'Address') as address
from the_table
group by "user";
A fundamental restriction of SQL is that all columns of a query must be known to the database before it starts running that query.
There is no way you can have a "dynamic" number of columns (evaluated at runtime) in SQL.
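To try the conditional aggregation without a Postgres server, here is a rough sketch using Python's sqlite3 module. It swaps the `filter (where ...)` clause for the equivalent `max(CASE WHEN ... END)` form, which also works on engines and older SQLite builds that lack `FILTER`. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table ("user" TEXT, property TEXT, value TEXT);
    INSERT INTO the_table VALUES
        ('alice', 'Name', 'Alice'), ('alice', 'Age', '30'),
        ('alice', 'Address', '1 Main St'),
        ('bob', 'Name', 'Bob'), ('bob', 'Age', '41');
""")

# One output column per known property; CASE yields NULL for other
# properties and max() ignores NULLs, so each column picks its own value.
pivot = conn.execute("""
    SELECT "user",
           max(CASE WHEN property = 'Name' THEN value END) AS name,
           max(CASE WHEN property = 'Age' THEN value END) AS age,
           max(CASE WHEN property = 'Address' THEN value END) AS address
    FROM the_table
    GROUP BY "user"
    ORDER BY "user"
""").fetchall()

print(pivot)
```

A user with no row for a property (bob has no Address here) gets NULL in that column.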
Another alternative is to aggregate everything into a JSON value:
select "user",
jsonb_object_agg(property, value) as properties
from the_table
group by "user";
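If the database at hand has no JSON aggregate, the same "one object per user" shape can be built on the client. This is only a sketch in Python with sqlite3 and made-up rows; it is a client-side stand-in for `jsonb_object_agg`, not a Postgres feature.

```python
import json
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table ("user" TEXT, property TEXT, value TEXT);
    INSERT INTO the_table VALUES
        ('alice', 'Name', 'Alice'), ('alice', 'Age', '30'),
        ('bob', 'Name', 'Bob'), ('bob', 'Shoe size', '44');
""")

# Collect every (property, value) pair per user; no column list is needed,
# so an arbitrary set of properties is handled.
properties = defaultdict(dict)
for user, prop, value in conn.execute(
    'SELECT "user", property, value FROM the_table'
):
    properties[user][prop] = value

# Serialize each user's properties to a JSON string.
as_json = {user: json.dumps(props, sort_keys=True)
           for user, props in properties.items()}
print(as_json)
```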

Multiple attributes condition in T-SQL

I'd like to check whether a pair of attributes appears in the result of another query.
I tried the following query, but the syntax isn't valid.
SELECT ID
FROM Table1
WHERE (Col_01, Col_02) IN
(
SELECT Col_01, Col_02
FROM Table2
)
Is it possible to do something like that in T-SQL?
You can use EXISTS and a correlated subquery:
SELECT ID
FROM Table1 t1
WHERE EXISTS
(
SELECT *
FROM Table2 t2
WHERE t2.Col_01 = t1.Col_01 AND
t2.Col_02 = t1.Col_02
)
Your initial attempt was a good one, though. Some database systems do allow row constructors that create arbitrary tuples, with syntax quite similar to what you showed, but T-SQL doesn't support them in this part of the syntax, so you have to go this slightly more verbose route.
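A small sketch of why the pair has to be tested as a pair, using Python's sqlite3 module instead of SQL Server and made-up rows. Row 2 has a Col_01 and a Col_02 that each appear somewhere in Table2, but never together, so two separate IN tests would wrongly accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Col_01 TEXT, Col_02 TEXT);
    CREATE TABLE Table2 (Col_01 TEXT, Col_02 TEXT);
    INSERT INTO Table1 VALUES (1, 'a', 'x'), (2, 'a', 'y'), (3, 'b', 'x');
    INSERT INTO Table2 VALUES ('a', 'x'), ('b', 'x'), ('c', 'y');
""")

# Correlated EXISTS: both columns must match within the same Table2 row.
pair_match = [row[0] for row in conn.execute("""
    SELECT ID
    FROM Table1 t1
    WHERE EXISTS (
        SELECT *
        FROM Table2 t2
        WHERE t2.Col_01 = t1.Col_01 AND t2.Col_02 = t1.Col_02
    )
    ORDER BY ID
""")]

# Two independent IN tests: each column may match a different Table2 row.
separate_in = [row[0] for row in conn.execute("""
    SELECT ID
    FROM Table1
    WHERE Col_01 IN (SELECT Col_01 FROM Table2)
      AND Col_02 IN (SELECT Col_02 FROM Table2)
    ORDER BY ID
""")]

print(pair_match, separate_in)
```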

Hive: How to do a SELECT query to output a unique primary key using HiveQL?

I have the following dataset, which I want to transform into a table that can be exported to SQL. I am using Hive. The input is as follows:
call_id,stat1,stat2,stat3
1,a,b,c,
2,x,y,z,
3,d,e,f,
1,j,k,l,
The output table needs to have call_id as its primary key so it needs to be unique. The output schema should be
call_id,stat2,stat3,
1,b,c, or (1,k,l)
2,y,z,
3,e,f,
The problem is that when I use the keyword DISTINCT in the Hive query, the DISTINCT applies to all the columns combined. I want to apply the DISTINCT operation only to the call_id. Something along the lines of
SELECT DISTINCT(call_id), stat2,stat3 from intable;
However, this is not valid in Hive (I am not well-versed in SQL either).
The only legal query seems to be
SELECT DISTINCT call_id, stat2,stat3 from intable;
But this returns multiple rows with same call_id as the other columns are different and the row on the whole is distinct.
NOTE: There is no arithmetic relation between a,b,c,x,y,z, etc. So any trick of averaging or summing is not viable.
Any ideas how I can do this?
One quick idea, not the best one, but it will do the job:
hive>create table temp1(a int,b string);
hive>insert overwrite table temp1
select call_id,max(concat(stat1,'|',stat2,'|',stat3)) from intable group by call_id;
hive>insert overwrite table intable
select a,split(b,'\\|')[0],split(b,'\\|')[1],split(b,'\\|')[2] from temp1;
"I want to apply the DISTINCT operation only to the call_id"
But how would Hive then know which row to eliminate?
Without knowing the amount of data or the size of the stat fields you have, the following query can do the job:
select distinct i1.call_id, i1.stat2, i1.stat3 from (
select call_id, MIN(concat(stat1, stat2, stat3)) as smin
from intable group by call_id
) i2 join intable i1 on i1.call_id = i2.call_id
AND concat(i1.stat1, i1.stat2, i1.stat3) = i2.smin;
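The same "pick the row with the smallest concatenated stats" idea can be checked locally. This is a sketch in Python with sqlite3 rather than Hive, using the sample rows from the question. Unlike the query above, it puts a `|` separator between the fields, on the assumption that `|` never occurs in the data, so that values like ('ab', 'c') and ('a', 'bc') cannot collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE intable (call_id INTEGER, stat1 TEXT, stat2 TEXT, stat3 TEXT);
    INSERT INTO intable VALUES
        (1, 'a', 'b', 'c'), (2, 'x', 'y', 'z'),
        (3, 'd', 'e', 'f'), (1, 'j', 'k', 'l');
""")

# For each call_id, find the smallest concatenated key, then join back to
# fetch the stat2/stat3 values of exactly that row.
unique_rows = conn.execute("""
    SELECT DISTINCT i1.call_id, i1.stat2, i1.stat3
    FROM (
        SELECT call_id,
               min(stat1 || '|' || stat2 || '|' || stat3) AS smin
        FROM intable
        GROUP BY call_id
    ) AS i2
    JOIN intable AS i1
      ON i1.call_id = i2.call_id
     AND i1.stat1 || '|' || i1.stat2 || '|' || i1.stat3 = i2.smin
    ORDER BY i1.call_id
""").fetchall()

print(unique_rows)
```

Which duplicate survives is arbitrary but deterministic: the one whose concatenated stats sort first.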

a dual variable not in statement?

I have the need to look at two tables that share two variables and get a list of the data from one table that does not have matching data in the other table. Example:
Table A
xName
Date
Place
xAmount
Table B
yName
Date
Place
yAmount
I need to write a query that checks Table A and finds the entries that have no corresponding entry in Table B. If only one variable were involved, I could use a NOT IN statement, but I can't think of a way to do that with two variables. A left join doesn't look like it would work either. Filtering by a specific date or place name is not an option, since we are talking about thousands of dates and hundreds of place names.
Thanks in advance to anyone who can help out.
SELECT TableA.Date,
TableA.Place,
TableA.xName,
TableA.xAmount,
TableB.yName,
TableB.yAmount
FROM TableA
LEFT OUTER JOIN TableB
ON TableA.Date = TableB.Date
AND TableA.Place = TableB.Place
WHERE TableB.yName IS NULL
OR TableB.yAmount IS NULL
SELECT * FROM A WHERE NOT EXISTS
(SELECT 1 FROM B
WHERE A.xName = B.yName AND A.Date = B.Date AND A.Place = B.Place AND A.xAmount = B.yAmount)
in ORACLE:
select xName , xAmount from tableA
MINUS
select yName , yAmount from tableB
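To compare the approaches, here is a sketch in Python with sqlite3 that matches on the two shared columns (Date and Place) only, with made-up rows. It runs a LEFT JOIN anti-join and a NOT EXISTS query and gets the same result from both. The join version tests `b.Date IS NULL` instead of `yName`, on the assumption that the join column itself is never NULL in TableB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (xName TEXT, Date TEXT, Place TEXT, xAmount INTEGER);
    CREATE TABLE TableB (yName TEXT, Date TEXT, Place TEXT, yAmount INTEGER);
    INSERT INTO TableA VALUES
        ('ann', '2020-01-01', 'Paris', 10),
        ('bob', '2020-01-01', 'Rome', 20),
        ('cy',  '2020-01-02', 'Paris', 30);
    INSERT INTO TableB VALUES
        ('x', '2020-01-01', 'Paris', 5),
        ('y', '2020-01-02', 'Rome', 7);
""")

# Anti-join: keep TableA rows for which the join found no TableB partner.
via_left_join = conn.execute("""
    SELECT a.xName, a.Date, a.Place, a.xAmount
    FROM TableA AS a
    LEFT OUTER JOIN TableB AS b
      ON a.Date = b.Date AND a.Place = b.Place
    WHERE b.Date IS NULL
    ORDER BY a.xName
""").fetchall()

# Same question phrased with a correlated NOT EXISTS.
via_not_exists = conn.execute("""
    SELECT a.xName, a.Date, a.Place, a.xAmount
    FROM TableA AS a
    WHERE NOT EXISTS (
        SELECT 1 FROM TableB AS b
        WHERE b.Date = a.Date AND b.Place = a.Place
    )
    ORDER BY a.xName
""").fetchall()

print(via_left_join)
```

Note that bob's date and cy's place each appear in TableB on their own; only the combination is missing, which is what a single-column NOT IN cannot express.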

T-SQL - How to write query to get records that match ALL records in a many to many join

(I don't think I have titled this question correctly - but I don't know how to describe it)
Here is what I am trying to do:
Let's say I have a Person table that has a PersonID field. And let's say that a Person can belong to many Groups. So there is a Group table with a GroupID field and a GroupMembership table that is a many-to-many join between the two tables and the GroupMembership table has a PersonID field and a GroupID field. So far, it is a simple many to many join.
Given a list of GroupIDs I would like to be able to write a query that returns all of the people that are in ALL of those groups (not any one of those groups). And the query should be able to handle any number of GroupIDs. I would like to avoid dynamic SQL.
Is there some simple way of doing this that I am missing?
Thanks,
Corey
select person_id, count(*) from groupmembership
where group_id in ([your list of group ids])
group by person_id
having count(*) = [size of your list of group ids]
Edited: thank you dotjoe!
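Here is a rough sketch of the count-based query driven from client code, in Python with sqlite3 rather than SQL Server. The helper name `members_of_all` and the sample rows are made up. Only the `?` placeholders are generated from the list length; the values themselves are bound as parameters. It uses `count(DISTINCT group_id)` as a guard in case the membership table contains duplicate rows.

```python
import sqlite3

def members_of_all(conn, group_ids):
    """Return person_ids that belong to every group in group_ids."""
    wanted = sorted(set(group_ids))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT person_id
        FROM groupmembership
        WHERE group_id IN ({placeholders})
        GROUP BY person_id
        HAVING count(DISTINCT group_id) = ?
        ORDER BY person_id
    """
    # Bind the group ids first, then the required number of distinct groups.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE groupmembership (person_id INTEGER, group_id INTEGER);
    INSERT INTO groupmembership VALUES
        (1, 10), (1, 20), (1, 30),
        (2, 10), (2, 30),
        (3, 10), (3, 20);
""")

print(members_of_all(conn, [10, 20]))
print(members_of_all(conn, [10, 20, 30]))
```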
Basically, you are looking for persons for whom there is no group that they are not a member of, so:
select *
from Person p
where not exists (
select 1
from [Group] g
where not exists (
select 1
from GroupMembership gm
where gm.PersonID = p.PersonID
and gm.GroupID = g.GroupID
)
)
You're basically not going to avoid "dynamic" SQL in the sense of dynamically generating the query at query time. There's no easy way to hand a list around in SQL. Table variables can do it, but getting them into the system from C# is either impossible (2005 and below) or annoying (2008).
One way that you could do it with multiple queries is to insert your list into a work table (probably a process-keyed table) and join against that table. The only other option would be to use a dynamic query such as the ones specified by Jonathan and hongliang.
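The work-table idea can be sketched like this, again in Python with sqlite3 rather than SQL Server. The temp table `wanted_groups` and the helper name are hypothetical, and a real process-keyed table would also carry a process or session key column, which is left out here. The query text stays fixed no matter how many group ids are passed in.

```python
import sqlite3

def members_of_all_via_work_table(conn, group_ids):
    """Load the wanted ids into a work table, then join against it."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS wanted_groups "
        "(group_id INTEGER PRIMARY KEY)"
    )
    conn.execute("DELETE FROM wanted_groups")
    conn.executemany(
        "INSERT INTO wanted_groups VALUES (?)",
        [(g,) for g in set(group_ids)],
    )
    # A person qualifies when they match as many distinct wanted groups
    # as the work table holds.
    rows = conn.execute("""
        SELECT gm.PersonID
        FROM GroupMembership AS gm
        JOIN wanted_groups AS w ON w.group_id = gm.GroupID
        GROUP BY gm.PersonID
        HAVING count(DISTINCT gm.GroupID) = (SELECT count(*) FROM wanted_groups)
        ORDER BY gm.PersonID
    """).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GroupMembership (PersonID INTEGER, GroupID INTEGER);
    INSERT INTO GroupMembership VALUES
        (1, 10), (1, 20), (1, 30),
        (2, 10), (2, 30),
        (3, 10), (3, 20);
""")

print(members_of_all_via_work_table(conn, [10, 20]))
```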