Query linked Vertex in OrientDB - orientdb

I am querying a vertex (A) and want to filter on a vertex (B) that is linked to it. I tried the query below, but it returns the linked vertex (B), not the vertex (A) that I selected.
select expand(out(A)[title='xyz']) from A
This returns all vertices from B. I need to know how this fits into the where clause.

I created this structure to try your case:
I have these options to get the results you're looking for:
Query 1:
select from A where out(E)[title='xyz'].size() > 0
Output:
----+-----+------+-----+--------
#   |@RID |@CLASS|title|out_
----+-----+------+-----+--------
0   |#12:0|A     |abc  |[size=3]
----+-----+------+-----+--------
Query 2:
select from A where out(E).title contains 'xyz'
Output:
----+-----+------+-----+--------
#   |@RID |@CLASS|title|out_
----+-----+------+-----+--------
0   |#12:0|A     |abc  |[size=3]
----+-----+------+-----+--------
Hope it helps

Apart from being more direct, wouldn't the following generally be more efficient than beginning at A?
select in(E) from (select from B where title='xyz') unwind in

Using the ST_Disjoint() Function gives unexpected result

I am fiddling around with this dataset: http://s3.cleverelephant.ca/postgis-workshop-2020.zip. It is used in this workshop: http://postgis.net/workshops/postgis-intro/spatial_relationships.html.
I want to identify all the features that do not have a subway station. I thought this spatial join would be rather straightforward:
SELECT
census.boroname,
COUNT(census.boroname)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subway
ON ST_Disjoint(census.geom, subway.geom)
GROUP BY census.boroname;
However, the result set is way too large.
"Brooklyn" 4753693
"Manhattan" 1893156
"Queens" 7244123
"Staten Island" 2473146
"The Bronx" 2683246
When I run a test
SELECT COUNT(id) FROM nyc_census_blocks;
I get 38794 as a result. So there are far fewer features in nyc_census_blocks than there are in the result set of the spatial join.
Why is that? Where is the mistake I am making?
The problem is that joining on ST_Disjoint returns, for every record of nyc_census_blocks, one row for each station in nyc_subway_stations that is disjoint from it. A block that intersects no station is therefore paired with all 491 stations, which is why you're getting such a high count.
Alternatively you can count how many subways and census blocks do intersect, e.g. in a CTE or subquery, and in another query count how many of them return 0:
WITH j AS (
SELECT
gid,census.boroname,
(SELECT count(*)
FROM nyc_subway_stations subway
WHERE ST_Intersects(subway.geom,census.geom)) AS qt
FROM nyc_census_blocks AS census
)
SELECT boroname,count(*)
FROM j WHERE qt = 0
GROUP BY boroname;
boroname | count
---------------+-------
Brooklyn | 9517
Manhattan | 3724
Queens | 14667
Staten Island | 5016
The Bronx | 5396
(5 rows)
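To see why the disjoint join multiplies rows, here is a minimal sketch using Python's sqlite3 in place of PostGIS. The tables blocks and stations, the cell column, and the sample rows are all made up for illustration; "intersects" is modelled as two rows sharing the same cell, which is only a stand-in for ST_Intersects.

```python
import sqlite3

# Toy stand-in for the spatial tables: each geometry is reduced to an integer
# grid cell, and "intersects" is modelled as equal cells. This is an assumption
# made only for this illustration, not how PostGIS evaluates ST_Intersects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blocks   (gid INTEGER PRIMARY KEY, boroname TEXT, cell INTEGER);
    CREATE TABLE stations (gid INTEGER PRIMARY KEY, cell INTEGER);
    INSERT INTO blocks   VALUES (1, 'Brooklyn', 10), (2, 'Brooklyn', 11), (3, 'Queens', 12);
    INSERT INTO stations VALUES (1, 10), (2, 20), (3, 30);
""")

# Joining on "disjoint" pairs every block with every station it does not touch,
# so the row count approaches blocks x stations.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM blocks b JOIN stations s ON b.cell <> s.cell"
).fetchone()[0]

# An anti-join returns each block at most once: only blocks with no station.
blocks_without_station = conn.execute("""
    SELECT b.boroname, COUNT(*)
    FROM blocks b
    WHERE NOT EXISTS (SELECT 1 FROM stations s WHERE s.cell = b.cell)
    GROUP BY b.boroname
    ORDER BY b.boroname
""").fetchall()

print(pair_count)
print(blocks_without_station)
```

With 3 blocks and 3 stations the disjoint join already yields more rows than there are blocks, while the NOT EXISTS form counts each block once. In PostGIS the same shape would presumably use ST_Intersects inside the NOT EXISTS subquery.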

Select by id and generate column with relationships in array

Essentially, what I want to do is get a track by id from "Tracks", but I also want to get the relations it has to other tracks (found in the table "Remixes").
I can write a simple query that gets the track I want by id, e.g.
SELECT * FROM "Tracks" WHERE id IN ('track-id1');
That gives me:
id | dateModified | channels | userId
-----------+---------------------+-----------------+--------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1
But this is what I want to get:
id | dateModified | channels | userId | remixes
-----------+---------------------+-----------------+--------+---------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1 | track-id2, track-id3
So I want to generate a column called "remixes" containing an array of ids, based on the data available in the "Remixes" table, using a SELECT query.
Here is example data and database structure:
http://sqlfiddle.com/#!17/ec2e6/3
Don't hesitate to ask questions in case anything is unclear,
Thanks in advance
Left join the remixes and then GROUP BY the track ID and use array_agg() to get an array of the remix IDs.
SELECT t.*,
CASE
WHEN array_agg(r."remixTrackId") = '{NULL}'::varchar(255)[] THEN
'{}'::varchar(255)[]
ELSE
array_agg(r."remixTrackId")
END "remixes"
FROM "Tracks" t
LEFT JOIN "Remixes" r
ON r."originalTrackId" = t."id"
WHERE t."id" = 'track-id1'
GROUP BY t."id";
Note that if there are no remixes, array_agg() will return {NULL}. But I figured you would rather want an empty array in such a case. That's what the CASE is for.
BTW, providing a fiddle is a nice move of yours! But please also include the code in the original question. The fiddle site might be down (even permanently) and that renders the question useless because of the missing information.
That's a simple outer join with a string aggregation to get the comma separated list:
SELECT t.*,
string_agg(r."remixTrackId", ', ') as remixes
FROM "Tracks" t
LEFT JOIN "Remixes" r ON r."originalTrackId" = t.id
WHERE t.id = 'track-id1'
GROUP BY t.id;
The above assumes that Tracks.id is the primary key of the Tracks table.
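For anyone who wants to try the left join plus aggregation pattern without a PostgreSQL server, here is a rough sketch in Python with sqlite3. SQLite's group_concat stands in for string_agg, the table definitions are cut down to a couple of columns, and 'track-id4' is a made-up track with no remixes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of the tables from the question.
    CREATE TABLE Tracks  (id TEXT PRIMARY KEY, userId INTEGER);
    CREATE TABLE Remixes (originalTrackId TEXT, remixTrackId TEXT);
    INSERT INTO Tracks  VALUES ('track-id1', 1), ('track-id4', 1);
    INSERT INTO Remixes VALUES ('track-id1', 'track-id2'), ('track-id1', 'track-id3');
""")

# LEFT JOIN keeps tracks that have no remixes; the aggregate is NULL for them.
rows = conn.execute("""
    SELECT t.id, t.userId, group_concat(r.remixTrackId, ', ') AS remixes
    FROM Tracks t
    LEFT JOIN Remixes r ON r.originalTrackId = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

# Turn the comma-separated string into a list, using [] when there are no remixes.
remixes_by_track = {
    track_id: sorted(remixes.split(", ")) if remixes else []
    for track_id, _user_id, remixes in rows
}
print(remixes_by_track)
```

The empty-list fallback plays the same role as the CASE expression in the array_agg() answer: a track without remixes still appears in the result, just with nothing in the remixes column.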

group by in postgres sql with error must appear in the GROUP BY clause or be used in an aggregate function [duplicate]

I've been migrating some of my MySQL queries to PostgreSQL to use Heroku. Most of my queries work fine, but I keep having a similar recurring error when I use group by:
ERROR: column "XYZ" must appear in the GROUP BY clause or be used in
an aggregate function
Could someone tell me what I'm doing wrong?
MySQL which works 100%:
SELECT `availables`.*
FROM `availables`
INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id
WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
GROUP BY availables.bookdate
ORDER BY availables.updated_at
PostgreSQL error:
ActiveRecord::StatementInvalid: PGError: ERROR: column
"availables.id" must appear in the GROUP BY clause or be used in an
aggregate function:
SELECT "availables".* FROM "availables" INNER
JOIN "rooms" ON "rooms".id = "availables".room_id WHERE
(rooms.hotel_id = 5056 AND availables.bookdate BETWEEN E'2009-10-21'
AND E'2009-10-23') GROUP BY availables.bookdate ORDER BY
availables.updated_at
Ruby code generating the SQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => 'availables.bookdate',
:order => 'availables.updated_at')
Expected Output (from working MySQL query):
+-----+-------+-------+------------+---------+---------------+---------------+
| id | price | spots | bookdate | room_id | created_at | updated_at |
+-----+-------+-------+------------+---------+---------------+---------------+
| 414 | 38.0 | 1 | 2009-11-22 | 1762 | 2009-11-20... | 2009-11-20... |
| 415 | 38.0 | 1 | 2009-11-23 | 1762 | 2009-11-20... | 2009-11-20... |
| 416 | 38.0 | 2 | 2009-11-24 | 1762 | 2009-11-20... | 2009-11-20... |
+-----+-------+-------+------------+---------+---------------+---------------+
3 rows in set
MySQL's totally non standards compliant GROUP BY can be emulated by Postgres' DISTINCT ON. Consider this:
MySQL:
SELECT a,b,c,d,e FROM table GROUP BY a
This delivers 1 row per value of a (which one, you don't really know). Well actually you can guess, because MySQL doesn't know about hash aggregates, so it will probably use a sort... but it will only sort on a, so the order of the rows could be random. Unless it uses a multicolumn index instead of sorting. Well, anyway, it's not specified by the query.
Postgres:
SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c
This delivers 1 row per value of a, this row will be the first one in the sort according to the ORDER BY specified by the query. Simple.
Note that here, it's not an aggregate I'm computing. So GROUP BY actually makes no sense. DISTINCT ON makes a lot more sense.
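The "first row per group according to an explicit sort" idea can be tried out with a small sketch in Python and sqlite3. SQLite has no DISTINCT ON, so a correlated subquery with ORDER BY ... LIMIT 1 is swapped in here to pick the same row; the availables rows below are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows; only the columns needed for the illustration.
    CREATE TABLE availables (id INTEGER PRIMARY KEY, bookdate TEXT, price REAL, updated_at TEXT);
    INSERT INTO availables VALUES
        (1, '2009-11-22', 40.0, '2009-11-20 09:00'),
        (2, '2009-11-22', 38.0, '2009-11-20 08:00'),
        (3, '2009-11-23', 38.0, '2009-11-20 10:00');
""")

# For each bookdate keep the row that sorts first by (updated_at, id).
# In PostgreSQL this is the row that
#   SELECT DISTINCT ON (bookdate) ... ORDER BY bookdate, updated_at, id
# would return.
rows = conn.execute("""
    SELECT a.id, a.bookdate, a.price
    FROM availables a
    WHERE a.id = (
        SELECT b.id
        FROM availables b
        WHERE b.bookdate = a.bookdate
        ORDER BY b.updated_at, b.id
        LIMIT 1
    )
    ORDER BY a.bookdate
""").fetchall()
print(rows)
```

The point is that the chosen row is fixed by the sort order written in the query, rather than left to whatever the engine happens to return first.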
Rails is married to MySQL, so I'm not surprised that it generates SQL that doesn't work in Postgres.
PostgreSQL is more SQL compliant than MySQL. All fields - except computed field with aggregation function - in the output must be present in the GROUP BY clause.
MySQL's GROUP BY can be used without an aggregate function (which is contrary to the SQL standard), and returns the first row in the group (I don't know based on what criteria), while PostgreSQL requires an aggregate function (MAX, SUM, etc.) on every selected column that does not appear in the GROUP BY clause.
Correct, the solution to fixing this is to use :select and to select each field that you wish to decorate the resulting object with and group by them.
Nasty - but it is how group by should work as opposed to how MySQL works with it by guessing what you mean if you don't stick fields in your group by.
If I remember correctly, in PostgreSQL you have to add every column you fetch from the table where the GROUP BY clause applies to the GROUP BY clause.
Not the prettiest solution, but changing the group parameter to output every column in model works in PostgreSQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => Available.column_names.collect{|col| "availables.#{col}"},
:order => 'availables.updated_at')
According to MySQL's "Debunking GROUP BY Myths" (http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html), SQL (2003 version of the standard) doesn't require columns referenced in the SELECT list of a query to also appear in the GROUP BY clause.
For others looking for a way to order by any field, including joined field, in postgresql, use a subquery:
SELECT * FROM (
SELECT DISTINCT ON (availables.bookdate) availables.*
FROM availables INNER JOIN rooms ON rooms.id = availables.room_id
WHERE (rooms.hotel_id = 5056
AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
) AS distinct_selected
ORDER BY distinct_selected.updated_at
or arel:
subquery = SomeRecord.select("distinct on(xx.id) xx.*, jointable.order_field")
.where("").joins("")
result = SomeRecord.select("*").from("(#{subquery.to_sql}) AS distinct_selected").order(" xx.order_field ASC, jointable.order_field ASC")
I think that .uniq [1] will solve your problem.
[1] Available.select('...').uniq
Take a look at http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields

OrientDB SQL Check if multiple pairs of vertices are connected

I haven't been able to find an answer for the SQL for this.
Given pairs of vertices (record ids) and edge types between them, I want to check if all pairs exist.
V1 --E1--> V2
V3 --E2--> V4
... and so on. The answer I want is true / false or something equivalent. ALL connections must be present in order to evaluate to true, so at least one edge (of correct type) must exist for each pair.
In pseudocode, the question would be:
Does V1 have edge <E1EdgeType> to V2?
AND
Does V3 have edge <E2EdgeType> to V4?
AND
... and so on
Does anyone know what the orientDB SQL would be to achieve this?
UPDATE
I did already have one way of checking if one single edge exists between known vertices. It's perhaps not very pretty either, but it works:
SELECT FROM (
SELECT EXPAND(out('TestEdge')) FROM #12:0
) WHERE @rid=#12:1
This will return the destination record (#12:1) if an edge of type 'TestEdge' exists from #12:0 to #12:1. However, if I have two of those, how can I query for one single result for both queries? Something like:
SELECT <something with $c> LET
$a = (SELECT FROM (SELECT EXPAND(out('TestEdge')) FROM #12:0) WHERE @rid=#12:1),
$b = (SELECT FROM (SELECT EXPAND(out('AnotherTestEdge')) FROM #12:2) WHERE @rid=#12:3),
$c = <something that checks that both a and b yield results>
That's what I aim towards doing. Please tell me if I'm solving this the wrong way. I'm not even sure what the gain is to merge queries like this compared to just repeat queries.
Given a pair of vertices, say #11:0 and #12:0, the following query will effectively check whether there is an edge of type E from #11:0 to #12:0:
select from (select @this, out(E) from #11:0 unwind out) where out = #12:0
----+------+-----+-----
#   |@CLASS|this |out
----+------+-----+-----
0   |null  |#11:0|#12:0
----+------+-----+-----
This is highly inelegant and I would encourage you to think about formulating an enhancement request accordingly at https://github.com/orientechnologies/orientdb/issues
One way to incorporate the boolean tests you have in mind is illustrated by the following:
select from
(select $a.size() as a, $b.size() as b
let a=(select count(*) as e from (select out(E) from #11:0 unwind out)
where out = #12:0),
b=(select count(*) as e from (select out(E) from #11:1 unwind out)
where out = #12:2))
where a > 0 and b > 0
Yes, inelegance again :-(
The following query might be useful to you:
SELECT eval('sum($a.size(),$b.size())==2') as existing_edges
let $a = ( SELECT from TestEdge where out = #12:0 and in = #12:1 limit 1),
$b = ( SELECT from AnotherTestEdge where out = #12:2 and in = #12:3 limit 1)
Hope it helps.
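The "every pair must be connected" logic itself is independent of OrientDB. As a rough illustration only, here is a sketch in Python with sqlite3 where edges are stored as rows in a made-up edges table (cls, out_rid, in_rid) and all_connected is a hypothetical helper; it is a relational analogy, not OrientDB SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical relational stand-in for the graph: one row per edge.
    CREATE TABLE edges (cls TEXT, out_rid TEXT, in_rid TEXT);
    INSERT INTO edges VALUES
        ('TestEdge',        '#12:0', '#12:1'),
        ('AnotherTestEdge', '#12:2', '#12:3');
""")


def all_connected(conn, required):
    """Return True only if every (edge class, source, target) triple has at least one edge."""
    query = (
        "SELECT EXISTS ("
        "SELECT 1 FROM edges WHERE cls = ? AND out_rid = ? AND in_rid = ?)"
    )
    # all() stops at the first missing edge, like a chain of ANDs.
    return all(conn.execute(query, triple).fetchone()[0] for triple in required)


both_present = all_connected(conn, [
    ("TestEdge", "#12:0", "#12:1"),
    ("AnotherTestEdge", "#12:2", "#12:3"),
])
one_missing = all_connected(conn, [
    ("TestEdge", "#12:0", "#12:1"),
    ("TestEdge", "#12:2", "#12:3"),  # wrong edge class for this pair
])
print(both_present, one_missing)
```

This mirrors the eval() approach above: one existence check per pair, combined so that a single missing edge makes the whole result false.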

Selecting count of values in multiple columns using two tables

I'm still new to tsql and trying to figure out how to build this query.
I have two tables. One, called mirror, holds the official list of all campuses and is used to populate a drop-down list of campuses for users on a web form. Users then have 5 choices they can select (CampusChoice1, CampusChoice2, etc.), which populate another table (Request) when they submit the form.
I am trying to build a page to display the end results of all the collected data. After some reading I'm thinking I might need to use PIVOT to make this happen but I can't get my head to see the query.
I can make a rudimentary query for each of choices 1-5, but I wanted them all together, with nulls or zeros where some campuses were not chosen.
Something like
--Simple count on single col
SELECT CampusChoice1, COUNT(*) as '#'
FROM Request
Group By CampusChoice1
Or
--But this doesn't give the results I want, since it does not account for all the POSSIBLE choices.
SELECT CampusChoice1, COUNT(*) as '#',
CampusChoice2, COUNT(*) as '#',
CampusChoice3, COUNT(*) as '#',
CampusChoice4, COUNT(*) as '#',
CampusChoice5, COUNT(*) as '#'
FROM Operations.dbo.TransferRequest
Group By CampusChoice1, CampusChoice2, CampusChoice3, CampusChoice4, CampusChoice5
Any ideas how I could show this? Am I on the right track at least with the PIVOT table?
Not sure if I understood your question correctly, but assuming that you have this:
CampusChoice | Other data ...
------------------------------
CampusChoice1 | ...
CampusChoice2 | ...
CampusChoice1 | ...
Then for the example above with only 3 rows you want this end result:
CampusChoice1 | 2 | CampusChoice2 | 1 | CampusChoice3 | 0 | ...
The T-SQL to achieve this is:
select
'CampusChoice1',
sum( case when CampusChoice = 'CampusChoice1' then 1 else 0 end ) '#',
'CampusChoice2',
sum( case when CampusChoice = 'CampusChoice2' then 1 else 0 end ) '#',
'CampusChoice3',
sum( case when CampusChoice = 'CampusChoice3' then 1 else 0 end ) '#',
...
from
...
Use the sum combined with the case to sum 1's for each row for CampusChoice1 and 0's for each row not CampusChoice1, repeating this for each CampusChoiceN.
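Here is a small runnable sketch of the sum/case pattern, using Python's sqlite3 instead of SQL Server. It follows the same assumption as the answer (a single CampusChoice column in Request) and uses the three example rows from above; the column aliases choice1, choice2 and choice3 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same assumed layout as in the answer: one CampusChoice value per row.
    CREATE TABLE Request (CampusChoice TEXT);
    INSERT INTO Request VALUES ('CampusChoice1'), ('CampusChoice2'), ('CampusChoice1');
""")

# Each SUM adds 1 for rows matching its value and 0 otherwise, so values that
# never occur still get a column, with a count of 0.
counts = conn.execute("""
    SELECT
        SUM(CASE WHEN CampusChoice = 'CampusChoice1' THEN 1 ELSE 0 END) AS choice1,
        SUM(CASE WHEN CampusChoice = 'CampusChoice2' THEN 1 ELSE 0 END) AS choice2,
        SUM(CASE WHEN CampusChoice = 'CampusChoice3' THEN 1 ELSE 0 END) AS choice3
    FROM Request
""").fetchone()
print(counts)
```

Because the counted values are written into the query rather than read from the data, a value with no rows still shows up as 0 instead of disappearing, which a plain GROUP BY would not give you.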