IS NULL in OrientDB

My database IM_0609 (OrientDB version 2.0.8) contains a class MARKS with the following properties:
CALIBRATION_DATE:date
DEVICE_MARK:string
DEVICE_NAME:string
END_MARK_NUM:decimal
MARK_NUM:decimal
PERIOD:decimal
SERIAL_NUM:string
The class MARKS has 42898973 rows, and I have created an index as follows:
CREATE INDEX MARK_NUM_END_MARK_NUM on MARKS(MARK_NUM,END_MARK_NUM) NOTUNIQUE
The following query runs quickly:
select * from MARKS where (MARK_NUM =84278511 AND END_MARK_NUM =84278511 AND END_MARK_NUM IS NOT NULL)
1 item(s) found. Query executed in 0.097 sec(s).
But this query makes the server suggest creating an index or changing the query:
select * from MARKS where MARK_NUM =84278511 AND END_MARK_NUM IS NULL
The server displays the following message:
2015-05-12 15:46:43:129 INFO {db=IM_0609} [TIP] Query 'select * from MARKS where MARK_NUM =84278511 AND END_MARK_NUM IS NULL' fetched more than 50000 records: to speed up the execution, create an index or change the query to use an existent index [OProfiler]
Q: Why does the second query behave this way?

OrientDB does not index null values unless you explicitly tell it to. To do that, set the ignoreNullValues metadata flag when creating the index:
CREATE INDEX addresses ON Employee (address) notunique METADATA {ignoreNullValues : false}
But your first query
select * from MARKS where (MARK_NUM =84278511 AND END_MARK_NUM =84278511 AND END_MARK_NUM IS NOT NULL)
specifies a concrete value for END_MARK_NUM (84278511), so it can use the index and keep the number of record reads below 50000.
But your second query
select * from MARKS where MARK_NUM =84278511 AND END_MARK_NUM IS NULL
does not specify a concrete value for END_MARK_NUM, so it cannot use your index. The number of record reads therefore increases, and the server prints the tip suggesting that you create an index or change the query.
Edit: There seems to be a bug in the ignoreNullValues functionality (https://github.com/orientechnologies/orientdb/issues/4508).
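Applied to the index from the question, the METADATA syntax shown above might look like the sketch below. It assumes the existing index has to be dropped and recreated for the setting to take effect, and that rebuilding an index over roughly 42.9 million rows is acceptable. Given the bug report linked above, it may not behave as expected on version 2.0.8.

```sql
DROP INDEX MARK_NUM_END_MARK_NUM

CREATE INDEX MARK_NUM_END_MARK_NUM ON MARKS (MARK_NUM, END_MARK_NUM) NOTUNIQUE METADATA {ignoreNullValues: false}
```

The two commands are meant to be run one after the other in the OrientDB console; only the index name and properties come from the question, the rest follows the answer's syntax.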

It seems that you simply have too many records matching the query. Also, did you mean to have 'IS NOT NULL' in the first query but 'IS NULL' in the second?
Try it with a limit:
select * from MARKS where MARK_NUM =84278511 AND END_MARK_NUM IS NULL LIMIT 10

This query runs instantly:
select * from index:MARKS.MARK_NUM_END_MARK_NUM where key=[84278511,NULL]
Q: But how do I get the fields of my class MARKS?
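One possible approach, assuming the index query returns key and rid columns (as OrientDB index queries normally do), is to expand the rid so that the full MARKS record is returned instead of the index entry. This is a sketch and has not been checked against version 2.0.8:

```sql
select expand(rid) from index:MARKS.MARK_NUM_END_MARK_NUM where key=[84278511,NULL]
```

The index name and key are copied from the query above; expand() resolves each record ID to the document it points at.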

Related

Druid SQL: filter on result of expression

I have HTTP access log data in a Druid data source, and I want to see access patterns based on certain identifiers in the URL path. I wrote this query, and it works fine:
select regexp_extract(path, '/id/+([0-9]+)', 1) as "id",
sum("count") as "request_count"
from "access-logs"
where __time >= timestamp '2022-01-01'
group by 1
The only problem is that not all requests match that pattern, so I get one row in the result with an empty "id". I tried adding an extra condition in the where clause:
select regexp_extract(path, '/id/+([0-9]+)', 1) as "id",
sum("count") as "request_count"
from "access-logs"
where __time >= timestamp '2022-01-01' and "id" != ''
group by 1
But when I do that, I get this error message:
Error: Plan validation failed: org.apache.calcite.runtime.CalciteContextException:
From line 4, column 46 to line 4, column 49: Column 'id' not found in any table
So it doesn't let me reference the result of the expression in the where clause. I could of course just copy the entire regexp_extract expression, but is there a cleaner way of doing this?
Since id is computed in the select list and used for grouping, it is not visible to the WHERE clause, which is evaluated first. You would need a HAVING clause to filter on it.
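As an alternative to HAVING that also avoids repeating the regexp_extract expression, the id can be computed in an inner query and filtered in the outer one. This is a minimal sketch that reuses the names from the question; the alias "extracted" is made up for illustration, and whether Druid plans it as efficiently as the original query is not something verified here.

```sql
-- Compute the id once in the inner query, then filter and aggregate outside.
select "id",
       sum("count") as "request_count"
from (
  select regexp_extract(path, '/id/+([0-9]+)', 1) as "id",
         "count"
  from "access-logs"
  where __time >= timestamp '2022-01-01'
) as "extracted"
-- "id" is an ordinary column of the inner query here, so WHERE can see it.
where "id" != ''
group by 1
```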

Substring last 3 characters of an id of an object within an area

I want to grab the last three characters of an id. In my code, I use ST_WITHIN() to find the objects that lie within another object, and then select the "node_id" of every object within that area. Below is the code:
SELECT SUBSTRING ((
SELECT "node_id" from sewers.structures
WHERE(
ST_WITHIN(
ST_CENTROID((ST_SetSRID(structures.geom, 4326))),
ST_SetSRID((SELECT geom FROM sewers."Qrtr_Qrtr_Sections" WHERE "plat_page" = '510C'),4326)) )),5,3)
This portion of the code works without issue:
SELECT "node_id" from sewers.structures
WHERE(
ST_WITHIN(
ST_CENTROID((ST_SetSRID(structures.geom, 4326))),
ST_SetSRID((SELECT geom FROM sewers."Qrtr_Qrtr_Sections" WHERE "plat_page" = '510C'),4326)) )
But when I run the SELECT SUBSTRING() on the selection, I get the following error:
ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000
The substring function should be called on each element, not on the entire query:
SELECT SUBSTRING("node_id",5,3)
FROM sewers.structures
WHERE ST_WITHIN ...
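Written out in full with the filter from the question, the corrected query might look like this. It is a sketch that only rearranges the code already shown; the table alias s and the output name node_suffix are made up for readability.

```sql
-- SUBSTRING now runs once per matching row instead of once on a multi-row subquery.
SELECT SUBSTRING(s."node_id", 5, 3) AS node_suffix
FROM sewers.structures AS s
WHERE ST_Within(
    ST_Centroid(ST_SetSRID(s.geom, 4326)),
    ST_SetSRID(
        -- Scalar subquery: must still return a single geometry for plat page 510C.
        (SELECT geom FROM sewers."Qrtr_Qrtr_Sections" WHERE "plat_page" = '510C'),
        4326
    )
);
```

SUBSTRING("node_id", 5, 3) yields the last three characters only when every id is exactly seven characters long. If the length varies, PostgreSQL's right("node_id", 3) takes the last three characters regardless of length.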

Select rows in postgres table where an array field contains NULL

Our system uses postgres for its database.
We have queries that can select rows from a database table where an array field in the table contains a specific value, e.g.:
Find which employee manages the employee with ID 123.
staff_managed_ids is a postgres array field containing an array of the employees that THIS employee manages.
This query works as expected:
select *
from employees
where 123=any(staff_managed_ids)
We now need to query where an array field contains a postgres NULL. We tried the following query, but it doesn't work:
select *
from employees
where NULL=any(staff_managed_ids)
We know the staff_managed_ids array field contains NULLs from other queries.
Are we using NULL wrongly?
NULL cannot be compared using =, because the comparison itself yields NULL rather than true. Use IS NULL or IS NOT NULL instead.
To check for nulls, you need to unnest the elements:
select e.*
from employees e
where exists (select *
from unnest(e.staff_managed_ids) as x(staff_id)
where x.staff_id is null);
If all your id values are positive, you could write something like this:
select *
from employees
where (-1 < all(staff_managed_ids)) is null;
This works because -1 is less than every positive value, but a comparison with NULL makes the whole ALL expression evaluate to NULL.
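Another option, assuming PostgreSQL 9.5 or later and a one-dimensional array, is array_position(), which compares elements with IS NOT DISTINCT FROM semantics and can therefore search for NULL. The sample table and rows below are hypothetical and exist only to make the sketch self-contained.

```sql
-- Hypothetical sample data; a temporary table avoids touching a real employees table.
create temporary table employees (id int, staff_managed_ids int[]);
insert into employees values
  (1, array[123, 456]),
  (2, array[789, null]),
  (3, '{}');

-- array_position returns the index of the first NULL element, or NULL if there is none.
-- Employee 2 is the only sample row whose array contains a NULL.
select *
from employees
where array_position(staff_managed_ids, null) is not null;
```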

Postgres Update Using Select Passing In Parent Variable

I need to update a few thousand rows in my Postgres table using the result of an array_agg and spatial lookup.
The query needs to take the geometry of the parent table, and return an array of the matching row IDs in the other table. It may return no IDs or potentially 2-3 IDs.
I've tried to use an UPDATE FROM but I can't seem to pass into the subquery the parent table geom column for the SELECT. I can't see any way of doing a JOIN between the 2 tables.
Here is what I currently have:
UPDATE lrc_wales_data.records
SET lrc_array = subquery.lrc_array
FROM (
SELECT array_agg(wales_lrcs.gid) AS lrc_array
FROM layers.wales_lrcs
WHERE st_dwithin(records.geom_poly, wales_lrcs.geom, 0)
) AS subquery
WHERE records.lrc = 'nrw';
The error I get is:
ERROR: invalid reference to FROM-clause entry for table "records"
LINE 7: WHERE st_dwithin(records.geom_poly, wales_lrcs.geom, 0)
Is this even possible?
Many thanks,
Steve
I realised there was no need to use UPDATE ... FROM. I could just use a subquery directly in the SET:
UPDATE lrc_wales_data.records
SET lrc_array = (
SELECT array_agg(wales_lrcs.gid) AS lrc
FROM layers.wales_lrcs
WHERE st_dwithin(records.geom_poly, wales_lrcs.geom, 0)
)
WHERE records.lrc = 'nrw';

Fulltext Postgres

I created an index for full text search in postgresql.
CREATE INDEX pesquisa_idx
ON chamado
USING
gin(to_tsvector('portuguese', coalesce(titulo,'') || coalesce(descricao,'')));
When I run this query:
SELECT * FROM chamado WHERE to_tsvector('portuguese', titulo) @@ 'ura'
it returns some rows.
But when my argument is in all uppercase, no rows are returned. For example:
SELECT * FROM chamado WHERE to_tsvector('portuguese', titulo) @@ 'URA'
When the argument is 'ura' I get a few rows; when the argument is 'URA' I do not get any rows.
Why does this happen?
You get no matches in the second case because to_tsvector() lowercases all lexemes, while the plain string 'URA' is used as typed. Use to_tsquery() to build the query; it will take care of the case issues as well:
SELECT * FROM chamado WHERE to_tsvector('portuguese', titulo) @@ to_tsquery('portuguese', 'URA')
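To see the difference directly, the two ways of building the query value can be compared side by side. This is a small sketch; the column aliases are made up, and the exact output is omitted because it depends on the text search configuration.

```sql
-- A plain cast keeps the term exactly as typed; to_tsquery normalizes it
-- with the same configuration that to_tsvector applied to the column.
SELECT 'URA'::tsquery                  AS plain_cast,
       to_tsquery('portuguese', 'URA') AS normalized;
```

Note also that the index in the question is built on the concatenation of titulo and descricao, while the queries search to_tsvector('portuguese', titulo) alone, so the planner can only use that index when the query repeats the indexed expression.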