Is Open Distro fully supported by AWS Elasticsearch? - aws-elasticsearch

I tried the SQL plugin as explained in the official documentation.
https://opendistro.github.io/for-elasticsearch-docs/docs/sql/partiql/
I added the records and then used this query:
SELECT e.name AS employeeName,
p.name AS projectName
FROM employees_nested AS e,
e.projects AS p
The above query does not return anything. If I change the query to something like this...
SELECT e.name AS employeeName
FROM employees_nested AS e
It returns the 3 records as expected.

Related

Get all fields from a DocumentDB joined query

I have a DocumentDB database in Azure which I access through the CosmosDB API.
I'd like to get all the parent fields of a document with a simple query:
SELECT p.id
FROM parent p JOIN ch IN p.property1.child
WHERE CONTAINS(UPPER(ch.name), UPPER(@childName))
This query works, but I get only the 'id' property. I can't use p.* (a syntax error is thrown), and the list of fields will probably change in the future. With * I get this error: 'SELECT *' is only valid with a single input set.
Is there a way to get the whole JSON of the parent document without writing the complete list of fields in the select clause?
You can instead use SELECT VALUE p FROM p JOIN ch .... This is equivalent to p.*
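To make the semantics concrete, here is a rough Python emulation of what that query does. It is only a sketch: the function name, the sample documents, and the in-memory evaluation are assumptions for illustration and not how Cosmos DB runs the query. The JOIN iterates over the child array inside each document, and SELECT VALUE p emits the whole parent document for each child that matches.

```python
def select_value_parent(parents, child_name):
    """Emulates (hypothetically, in memory):
    SELECT VALUE p FROM parent p JOIN ch IN p.property1.child
    WHERE CONTAINS(UPPER(ch.name), UPPER(@childName))
    """
    needle = child_name.upper()
    results = []
    for p in parents:                       # FROM parent p
        for ch in p["property1"]["child"]:  # JOIN ch IN p.property1.child
            if needle in ch["name"].upper():
                results.append(p)           # SELECT VALUE p: the whole document
    return results


# Made-up sample documents, not taken from the question.
docs = [
    {"id": "1", "property1": {"child": [{"name": "Anna"}, {"name": "Joanna"}]}},
    {"id": "2", "property1": {"child": [{"name": "Bob"}]}},
]
matches = select_value_parent(docs, "anna")
print([d["id"] for d in matches])
```

Note that in this emulation a parent with several matching children is emitted once per matching child, so the result may need de-duplicating.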

JPA Group by with multiple field

I am getting the following error while using the named query blog.findBlogs:
"Your select and having clauses must only include aggregates or values that also appear in your grouping clause."
The select clause uses b.id, so I expected the query to run without error. I also tried the same query from MySQL Workbench and it works perfectly fine.
@NamedQueries( value = {
    @NamedQuery (name = "blog.findBlogs", query = "SELECT " +
        "NEW com.vo.Blog(b.id, b.blogId, b.createDate, b.tags, b.url, COUNT(r.emotion)) " +
        "FROM Blog b JOIN b.rates r " +
        "GROUP BY b.id")
})
regards, Amit J.
In general, a GROUP BY query in SQL must obey a rule that everything which appears in the SELECT clause must either appear in the GROUP BY clause or be inside an aggregate function, such as COUNT.
"I also tried the same query from MySQL Workbench and it works perfectly fine."
It turns out that MySQL is one of a few databases that does not enforce this rule (at least when its ONLY_FULL_GROUP_BY SQL mode is disabled). JPA, however, appears not to allow this, so the query fails when run through JPA even though the underlying MySQL database is lax about it.
Here is how you can modify your code to run without error:
@NamedQueries( value = {
    @NamedQuery (name = "blog.findBlogs", query = "SELECT " +
        "NEW com.vo.Blog(b.id, COUNT(r.emotion)) " +
        "FROM Blog b JOIN b.rates r " +
        "GROUP BY b.id")
})
If you're wondering what the logical problem with your original query was: for a given b.id value, JPA cannot figure out which blogId, createDate, tags, and url values to choose for that group.
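If the other fields are still needed in the result, one way to satisfy the rule is to list every non-aggregated column in the GROUP BY clause. The following is a minimal sketch of that rule, using Python's sqlite3 as a stand-in database; the blog and rate tables, their columns, and the sample rows are assumptions made up for illustration, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, url TEXT, tags TEXT);
    CREATE TABLE rate (blog_id INTEGER, emotion TEXT);
    INSERT INTO blog VALUES (1, 'a.example', 'sql'), (2, 'b.example', 'jpa');
    INSERT INTO rate VALUES (1, 'like'), (1, 'love'), (2, 'like');
""")

# Every selected column is either listed in GROUP BY or wrapped in COUNT().
rows = conn.execute("""
    SELECT b.id, b.url, b.tags, COUNT(r.emotion)
    FROM blog b JOIN rate r ON r.blog_id = b.id
    GROUP BY b.id, b.url, b.tags
    ORDER BY b.id
""").fetchall()
print(rows)
```

In JPQL the analogous change would presumably be GROUP BY b.id, b.blogId, b.createDate, b.tags, b.url, assuming all of these are simple (non-collection) attributes of Blog.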

2-steps query in OrientDB

I'm evaluating OrientDB and Neo4j in this simple toy example composed of:
Employees, identified by eid
Meetings, identified by mid and having start and end attributes encoding their start and end DateTime.
Both entities are represented by different classes of vertices, namely Employee and CalendarEvent, which are connected by Involves edges specifying that CalendarEvent-[Involves]->Employee.
My task is to write a query that returns, for each pair of employees, the date/time of their first meeting and the number of meetings they co-attended.
In Cypher I would write something like:
MATCH (e0: Employee)<-[:INVOLVES]-(c:CalendarEvent)-[:INVOLVES]->(e1: Employee)
WHERE e0.eid > e1.eid
RETURN e0.eid, e1.eid, min(c.start) as first_met, count(*) as frequency
I wrote the following query for OrientDB:
SELECT eid, other, count(*) AS frequency, min(start) as first_met
FROM (
    SELECT eid, event.start as start, event.out('Involves').eid as other
    FROM (
        SELECT
            eid,
            in('Involves') as event
        FROM Employee UNWIND event
    ) UNWIND other
)
GROUP BY eid, other
but it seems over-complicated to me.
Does anybody know if there is an easier way to express the same query?
Yes, your query is correct, and this is what you have to do in the current version (2.1.x).
From 2.2, with MATCH statement (https://github.com/orientechnologies/orientdb-docs/blob/master/source/SQL-Match.md), you will be able to write a query very similar to Cypher version:
select eid0, eid1, min(start) as firstMet, count(*) from (
    MATCH {class:Employee, as:e0}.in("Involves"){as: meeting}.out("Involves"){as:e1}
    return e0.eid as eid0, e1.eid as eid1, meeting.start as start
) group by eid0, eid1
This feature is still in beta; the final version will probably have more operators in the MATCH statement itself, and the query will be even shorter.
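For comparison, the computation both queries are aiming at can be sketched relationally. This is only an illustration using Python's sqlite3, not OrientDB: the involves table (one row per meeting/employee edge, with the meeting start copied onto each row for brevity) and its sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical edge table: meeting mid involves employee eid.
    CREATE TABLE involves (mid INTEGER, eid INTEGER, start TEXT);
    INSERT INTO involves VALUES
        (10, 1, '2016-01-05 09:00'), (10, 2, '2016-01-05 09:00'),
        (11, 1, '2016-01-03 14:00'), (11, 2, '2016-01-03 14:00'),
        (12, 2, '2016-01-07 10:00'), (12, 3, '2016-01-07 10:00');
""")

# Self-join on the meeting id pairs up the employees who co-attended it.
pairs = conn.execute("""
    SELECT a.eid, b.eid, MIN(a.start) AS first_met, COUNT(*) AS frequency
    FROM involves a JOIN involves b ON a.mid = b.mid
    WHERE a.eid > b.eid
    GROUP BY a.eid, b.eid
    ORDER BY a.eid, b.eid
""").fetchall()
print(pairs)
```

The a.eid > b.eid filter mirrors the WHERE clause of the Cypher version: it drops self-pairs and keeps each pair of employees only once.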

HAVING clause in PostgreSQL

I'm rewriting MySQL queries for PostgreSQL. I have a table with articles and another table with categories. I need to select all categories that have at least one article:
SELECT c.*, (
    SELECT COUNT(*)
    FROM articles a
    WHERE a."active"=TRUE AND a."category_id"=c."id") "count_articles"
FROM articles_categories c
HAVING (
    SELECT COUNT(*)
    FROM articles a
    WHERE a."active"=TRUE AND a."category_id"=c."id" ) > 0
I don't know why, but this query is causing an error:
ERROR: column "c.id" must appear in the GROUP BY clause or be used in an aggregate function at character 8
The HAVING clause is a bit tricky to understand. I'm not sure about how MySQL interprets it. But the Postgres documentation can be found here:
http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-HAVING
It essentially says:
"The presence of HAVING turns a query into a grouped query even if there is no GROUP BY clause. This is the same as what happens when the query contains aggregate functions but no GROUP BY clause. All the selected rows are considered to form a single group, and the SELECT list and HAVING clause can only reference table columns from within aggregate functions. Such a query will emit a single row if the HAVING condition is true, zero rows if it is not true."
The same is also explained in this blog post, which shows how HAVING without GROUP BY implies a SQL:1999 standard "grand total", i.e. a GROUP BY ( ) clause (which PostgreSQL 9.0 does not support).
Since you don't seem to want a single row, the HAVING clause might not be the best choice.
Considering your actual query and your requirement, just rewrite the whole thing and JOIN articles_categories to articles:
SELECT DISTINCT c.*
FROM articles_categories c
JOIN articles a
ON a.active = TRUE
AND a.category_id = c.id
Alternatively:
SELECT *
FROM articles_categories c
WHERE EXISTS (SELECT 1
FROM articles a
WHERE a.active = TRUE
AND a.category_id = c.id)
SELECT * FROM categories c
WHERE
EXISTS (SELECT 1 FROM article a WHERE c.id = a.category_id);
should be fine... perhaps simpler ;)
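As a quick check that the JOIN and EXISTS rewrites select the same categories, here is a small sketch using Python's sqlite3 as a stand-in for PostgreSQL. The sample rows and the name column are made up for illustration, and because SQLite has no boolean type, active is stored as 0/1 here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles_categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, category_id INTEGER, active INTEGER);
    INSERT INTO articles_categories VALUES (1, 'news'), (2, 'sports'), (3, 'empty');
    -- 'news' has two active articles, 'sports' only an inactive one.
    INSERT INTO articles VALUES (1, 1, 1), (2, 1, 1), (3, 2, 0);
""")

# JOIN variant: DISTINCT collapses the one-row-per-article duplicates.
join_rows = conn.execute("""
    SELECT DISTINCT c.id, c.name
    FROM articles_categories c
    JOIN articles a ON a.active = 1 AND a.category_id = c.id
    ORDER BY c.id
""").fetchall()

# EXISTS variant: each category is tested once, so no DISTINCT is needed.
exists_rows = conn.execute("""
    SELECT c.id, c.name
    FROM articles_categories c
    WHERE EXISTS (SELECT 1 FROM articles a
                  WHERE a.active = 1 AND a.category_id = c.id)
    ORDER BY c.id
""").fetchall()
print(join_rows, exists_rows)
```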

mysql subquery in single table

I have 2 columns in an "entries" table: a non-unique id, and a date stamp (YYYY-MM-DD). I want to select "all of the entry id's that were inserted today that have never been entered before."
I've been trying to use a subquery, but I don't think I'm using it right since they're both performed on the same table. Could someone help me out with the proper select statement? I can provide more details if need be.
Disclaimer: I don't have access to a MySQL database right now, but this should help:
select
e.id
from
entries e
where
e.date = curdate() and
e.id not in
(select id from entries e2 where e2.date < e.date)
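The query above can be tried out with a small sketch in Python's sqlite3, used here as a stand-in for MySQL. The sample rows are invented, a fixed date parameter replaces curdate() so the result is reproducible, and DISTINCT is an addition of this sketch because the id column is non-unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER, date TEXT);
    INSERT INTO entries VALUES
        (1, '2024-01-01'), (1, '2024-01-02'),  -- seen before "today"
        (2, '2024-01-02'), (2, '2024-01-02'),  -- first seen "today", twice
        (3, '2024-01-01');                     -- not entered "today"
""")

today = "2024-01-02"  # fixed stand-in for curdate()
new_ids = conn.execute("""
    SELECT DISTINCT e.id
    FROM entries e
    WHERE e.date = ?
      AND e.id NOT IN (SELECT e2.id FROM entries e2 WHERE e2.date < e.date)
    ORDER BY e.id
""", (today,)).fetchall()
print(new_ids)
```

The date comparison works on text here only because the dates are in YYYY-MM-DD form, which sorts chronologically.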