Cassandra Select Query for List and Frozen - select

I have a user-defined type like:
CREATE TYPE point ( pointId int, floor text);
And I have a table like:
CREATE TABLE path (
id timeuuid,
val timeuuid,
PointList list<frozen <point>>,
PRIMARY KEY(id,val)
);
And I have created an index like:
create index on path(PointList);
The problem is that I am not able to execute a select query where PointList = [floor : "abc"].
I googled for 2 hours but could not find a hint.
This is the select query I am running:
Select * from path where val = sdsdsdsdsds-dsdsdsd-dssds-sdsdsd and PointList contains {floor: 'eemiG8NbzdRCQ'};
I can see this data in my Cassandra table, but I cannot retrieve it with the query above.
I want a select query that uses only floor and val, because those are the only values we have.
I have tried many different ways, but nothing works.
I would appreciate any kind of hint or help.
Thank you,

frozen<point> means the point value is frozen, i.e. treated as a single unit. You can't provide only part of a point value; you have to provide the full value of the point, including every field.
Example query:
select * from path where pointlist CONTAINS {pointId : 1, floor : 'abc'};
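As a rough analogy in plain Python (this is not Cassandra driver code, and the `Point` class below is a hypothetical stand-in for the `point` UDT), a frozen value behaves like an immutable record that is compared as a whole. A lookup with only `floor` filled in is a different value and therefore does not match:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for the frozen `point` UDT (illustration only)."""
    point_id: Optional[int]
    floor: str


# Stand-in for the PointList column of one row.
point_list = [Point(1, "abc"), Point(2, "eemiG8NbzdRCQ")]

# Whole-value match: analogous to CONTAINS {pointId: 1, floor: 'abc'}.
full_match = Point(1, "abc") in point_list

# Only `floor` supplied: this is a different value, so nothing matches.
partial_match = Point(None, "abc") in point_list

# Matching on `floor` alone means looking inside each element instead.
by_floor = [p for p in point_list if p.floor == "abc"]
```

The last line is only meant to show that a match on a single field has to inspect the elements themselves; whether that is done in the application or by changing the data model is a separate design decision.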

How to convert an jsonb array and use stats moment

how are you?
I needed to store an array of numbers as JSONB in PostgreSQL.
Now I'm trying to calculate statistical moments from this JSON, but I'm facing some issues.
Sample of my data:
I was already able to convert the JSONB into a float array, using this function:
CREATE OR REPLACE FUNCTION jsonb_array_castdouble(jsonb) RETURNS float[] AS $f$
SELECT array_agg(x)::float[] || ARRAY[]::float[] FROM jsonb_array_elements_text($1) t(x);
$f$ LANGUAGE sql IMMUTABLE;
Using this SQL:
with data as (
select
s.id as id,
jsonb_array_castdouble(s.snx_normalized) as serie
FROM
spectra s
)
select * from data;
I found a function that can do these calculations, and I need to pass it an array: https://github.com/ellisonch/PostgreSQL-Stats-Aggregate/
But this function expects its input in a different form: unnested, one value per row.
I already tried to use unnest, but it gets only one value, not the entire array :(.
My goal is:
To be able to compute statistical moments (kurtosis, skewness) for each row, like:
index | skewness
1     | 21.2131
2     | 1.123
Bonus: Is there a way to avoid this 'with data' and do the transformation directly in the select statement?
snx_wavelengths is JSON, right? You also provided the sample as a picture and not as text :( The data looks like (id, snx_wavelengths). I believe you meant id when you said index (using a keyword as a column name is not a good idea, since it would require double-quoting the identifier):
1,[1,2,3,4]
2,[373,232,435,84]
If that is right:
select id, (stats_agg(v::float)).skewness
from myMeasures,
lateral json_array_elements_text(snx_wavelengths) v
group by id;
DBFiddle demo
BTW, you don't need "with data" in the original sample if you don't want to use it; you could replace it with a subquery, i.e.:
select (stats_agg(n)).* from (select unnest(array[16,22,33,24,15])) data(n)
union all
select (stats_agg(n)).* from (select unnest(array[416,622,833,224,215])) data(n);
EDIT: And if you needed other stats too:
select id, "count","min","max","mean","variance","skewness","kurtosis"
from myMeasures,
lateral (select (stats_agg(v::float)).* from json_array_elements_text(snx_wavelengths) v) foo
group by id,"count","min","max","mean","variance","skewness","kurtosis";
DBFiddle demo
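To sanity-check the per-row results outside the database, the same idea can be sketched in plain Python. This is only an illustration: the `skewness` helper is hypothetical and uses the population definition (third central moment divided by the variance to the power 1.5), which is an assumption; the stats_agg extension may use a different estimator, so small differences are possible. The sample rows are the ones shown above.

```python
import json
import math


def skewness(values):
    """Population skewness: m3 / m2**1.5 (assumed definition, see lead-in)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return math.nan  # skewness is undefined for a constant series
    return m3 / m2 ** 1.5


# (id, JSON array as text), mirroring the sample rows above.
rows = [
    (1, "[1,2,3,4]"),
    (2, "[373,232,435,84]"),
]

# One skewness value per row: parse the JSON array, then aggregate it.
skew_by_id = {
    row_id: skewness([float(x) for x in json.loads(raw)])
    for row_id, raw in rows
}
```

The first row is symmetric around its mean, so its skewness comes out as zero, which makes it a convenient check that the array was expanded and aggregated per id rather than across all rows.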

Postgresql function doesn't return anything when UUID argument is used in WHERE clause

I'm pretty new to PostgreSQL. The issue I'm having is that I have a function that returns a table, but when I pass a UUID which is used in the WHERE clause, it returns nothing. The funny thing is that if I take the SQL statement inside the function and run it by itself in pgAdmin, it gives me the right result.
The function looks like the following:
CREATE OR REPLACE FUNCTION get_service (
service_id uuid ) RETURNS TABLE(id uuid,title text,description text,category text,photo_url text,address text,
created_by uuid,created_on timestamp,service_rating float,rating_count bigint) AS $func$
Select
service.id,
service.title,
service.description,
service.category,
service.photo_url,
service.address,
service.created_by,
service.created_on,
CAST(AVG(rating.rating) AS float) as service_rating,
Count(rating.rating) as rating_count
from service
left join rating_service_map map
on service.id = map.service_id
left join rating
on rating.id = map.rating_id
where service.id = service_id
group by service.id,service.title,service.description,service.category,service.photo_url,service.address,service.created_by,service.created_on;
$func$ LANGUAGE SQL;
I have two records in my service table. The ID is of the type uuid and has a default value of uuid_generate_v4(). One of the records has an id of '2af3f03e-b2e5-44fd-89e8-3dc5fb641732'
If I run this I get no result:
select * from get_service('2af3f03e-b2e5-44fd-89e8-3dc5fb641732')
But if I run the following statement (the SQL portion of the function), then I get my right result:
Select
service.id,
service.title,
service.description,
service.category,
service.photo_url,
service.address,
service.created_by,
service.created_on,
CAST(AVG(rating.rating) AS float) as service_rating,
Count(rating.rating) as rating_count
from service
left join rating_service_map map
on service.id = map.service_id
left join rating
on rating.id = map.rating_id
where service.id = '2af3f03e-b2e5-44fd-89e8-3dc5fb641732'
group by service.id,service.title,service.description,service.category,service.photo_url,service.address,service.created_by,service.created_on;
I've also tried to cast the service_id (I've tried "where service.id = service_id::uuid" and "where service.id = CAST(service_id AS uuid)"), but neither of them worked.
I really appreciate it if you can tell me what I'm doing wrong. I've been at this for a couple of hours now.
Thank you.
I suspect that it's because the identifier service_id is ambiguous, being present as both a function parameter and a column in the map table.
Unlike a plain query, where such ambiguity would result in an error, conflicts in SQL functions are resolved by giving precedence to the column, so service_id in your case is actually referring to map.service_id.
You can either qualify it in your function body using the name of your function (i.e. get_service.service_id), or simply choose another name for the parameter.

How to convert self-defined types, e.g. geometry, to and from String?

Databases such as PostGIS or H2GIS (which I am using) have a geometry column type.
In the console provided by the database, I can create a geometry value with select ST_GeomFromText('POINT(12.3 12)', 4326),
or select a column of geometry type simply with select * from geom.
However, I don't know how to insert a geometry value (actually a string) into a table from Slick, or how to convert in the opposite direction.
There are also several miscellaneous questions below.
Here is the table definition in slick:
class TableSimple(tag: Tag) extends Table[(Double, String, String)](tag, "tb_simple") {
  def col_double = column[Double]("col_double", O.NotNull)
  def col_str = column[String]("col_str", O.NotNull)
  def geom = column[String]("geom", O.DBType("Geometry"))
  def * = (col_double, col_str, geom)
}
1. About select
The most simple one:
sql" select col_double,col_str, geom from tb_simple ".as[(Double,String,String)]
won't work unless casting geom to string explicitly like:
sql" select col_double,col_str, cast( geom as varchar) from tb_simple ".as[(Double,String,String)]
The first sql throws the error java.lang.ClassNotFoundException: com.vividsolutions.jts.io.ParseException
Q1: How does Slick know about com.vividsolutions.jts.io.ParseException (it belongs to a library used by H2GIS)? Is this an error on the server side or on the client side (the Slick side)?
Q2: How to convert/treat column geom as string without writing too much code(e.g. create a new column type in slick)?
2. About insert
First of all the following sql works
StaticQuery.updateNA(""" insert into tb_simple values(11,'abcd',ST_GeomFromText('POINT(5.300000 1.100000)', 4326)) """).execute
I hope code like TableQuery[TableSimple] += (10.3,"hello","ST_GeomFromText('POINT(0.300000 1.100000)'") would work but it doesn't.
It shouldn't, because Slick translates it to
insert into tb_simple values(11,'abcd','ST_GeomFromText(''POINT(5.300000 1.100000)'', 4326)')
Notice that the function call ST_GeomFromText becomes part of the string literal; that's why it doesn't work.
Q3: Can I pass a string to Slick as a raw SQL expression for a column, instead of having it wrapped in ''?
I hope I can insert a row as easy as TableQuery[TableSimple] += (10.3,"hello","ST_GeomFromText('POINT(0.300000 1.100000)'") or similar code.
Q4 What's the most convenient way in Slick to implement bidirectional conversion to and from String for a geometry or other self-defined column in the database?
Answering your main question: Slick-pg offers mapping of geometry types in the db to actual geometry types in your model.
It works for Postgis, but maybe it can also work with H2Gis.
You can find slick-pg at https://github.com/tminglei/slick-pg

Select most reviewed courses starting from courses having at least 2 reviews

I'm using Flask-SQLAlchemy with PostgreSQL. I have the following two models:
class Course(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    course_name = db.Column(db.String(120))
    course_description = db.Column(db.Text)
    course_reviews = db.relationship('Review', backref='course', lazy='dynamic')

class Review(db.Model):
    __table_args__ = (db.UniqueConstraint('course_id', 'user_id'), {})
    id = db.Column(db.Integer, primary_key=True)
    review_date = db.Column(db.DateTime)  # default=db.func.now()
    review_comment = db.Column(db.Text)
    rating = db.Column(db.SmallInteger)
    course_id = db.Column(db.Integer, db.ForeignKey('course.id'))
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
I want to select the most reviewed courses, starting with those that have at least two reviews. The following SQLAlchemy query worked fine with SQLite:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
    having(func.count(models.Review.course_id) > 1).\
    order_by(func.count(models.Review.course_id).desc()).all()
But when I switched to PostgreSQL in production it gives me the following error:
ProgrammingError: (ProgrammingError) column "review.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT review.id AS review_id, review.review_date AS review_...
^
'SELECT review.id AS review_id, review.review_date AS review_review_date, review.review_comment AS review_review_comment, review.rating AS review_rating, review.course_id AS review_course_id, review.user_id AS review_user_id, count(review.course_id) AS count_1 \nFROM review GROUP BY review.course_id \nHAVING count(review.course_id) > %(count_2)s ORDER BY count(review.course_id) DESC' {'count_2': 1}
I tried to fix the query by adding models.Review in the GROUP BY clause but it did not work:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
    having(func.count(models.Review.course_id) > 1).\
    order_by(func.count(models.Review.course_id).desc()).all()
Can anyone please help me with this issue. Thanks a lot
SQLite and MySQL both have the behavior that they allow a query that has aggregates (like count()) without applying GROUP BY to all other columns - which in terms of standard SQL is invalid, because if more than one row is present in that aggregated group, it has to pick the first one it sees for return, which is essentially random.
So your query for Review basically returns to you the first "Review" row for each distinct course id - like for course id 3, if you had seven "Review" rows, it's just choosing an essentially random "Review" row within the group of "course_id=3". I gather the answer you really want, "Course", is available here because you can take that semi-randomly selected Review object and just call ".course" on it, giving you the correct Course, but this is a backwards way to go.
But once you get on a proper database like Postgresql you need to use correct SQL. The data you need from the "review" table is just the course_id and the count, nothing else, so query just for that (first assume we don't actually need to display the counts, that's in a minute):
most_rated_course_ids = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
all()
but that's not your Course object - you want to take that list of ids and apply it to the course table. We first need to keep our list of course ids as a SQL construct, instead of loading the data - that is, turn it into a derived table by converting the query into a subquery (change the word .all() to .subquery()):
most_rated_course_id_subquery = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
subquery()
one simple way to link that to Course is to use an IN:
courses = session.query(Course).filter(
Course.id.in_(most_rated_course_id_subquery)).all()
but that's essentially going to throw away the "ORDER BY" you're looking for and also doesn't give us any nice way of actually reporting on those counts along with the course results. We need to have that count along with our Course so that we can report it and also order by it. For this we use a JOIN from the "course" table to our derived table. SQLAlchemy is smart enough to know to join on the "course_id" foreign key if we just call join():
courses = session.query(Course).join(most_rated_course_id_subquery).all()
then to get at the count, we need to add that to the columns returned by our subquery along with a label so we can refer to it:
most_rated_course_id_subquery = session.query(
Review.course_id,
func.count(Review.course_id).label("count")
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
subquery()
courses = session.query(
Course, most_rated_course_id_subquery.c.count
).join(
most_rated_course_id_subquery
).order_by(
most_rated_course_id_subquery.c.count.desc()
).all()
A great article I like to point out to people about GROUP BY and this kind of query is SQL GROUP BY techniques which points out the common need for the "select from A join to (subquery of B with aggregate/GROUP BY)" pattern.
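The "select from A join to (subquery of B with aggregate/GROUP BY)" pattern can also be seen in plain SQL. The following is a minimal sketch using Python's built-in sqlite3 module, so it runs without a PostgreSQL server; the table layout is cut down from the models above, and the course names, review rows, and the `counted` alias are made up for illustration. The query selects only grouped or aggregated columns inside the subquery, so the same shape is valid on stricter databases too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id INTEGER PRIMARY KEY, course_name TEXT);
    CREATE TABLE review (
        id INTEGER PRIMARY KEY,
        course_id INTEGER REFERENCES course(id),
        user_id INTEGER,
        rating INTEGER
    );
    -- Illustrative data: course 1 has 3 reviews, course 2 has 2, course 3 has 1.
    INSERT INTO course VALUES (1, 'Algebra'), (2, 'Biology'), (3, 'Chemistry');
    INSERT INTO review (course_id, user_id, rating) VALUES
        (1, 1, 5), (1, 2, 4), (1, 3, 3),
        (2, 1, 4), (2, 2, 2),
        (3, 1, 5);
""")

# The derived table aggregates review per course_id; the outer query joins
# it to course, so every selected column is either grouped or aggregated.
rows = conn.execute("""
    SELECT course.course_name, counted.review_count
    FROM course
    JOIN (
        SELECT course_id, COUNT(course_id) AS review_count
        FROM review
        GROUP BY course_id
        HAVING COUNT(course_id) > 1
    ) AS counted ON counted.course_id = course.id
    ORDER BY counted.review_count DESC
""").fetchall()

for name, review_count in rows:
    print(name, review_count)
```

With this data the course that has a single review is filtered out by the HAVING clause, and the remaining courses come back ordered by their review count.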

Using two different rows from the same table in an expression

I'm using PostgreSQL + PostGIS.
I have a point geometry and a line geometry in the same column of the same table, in different rows. To get the line I run:
SELECT the_geom
FROM filedata
WHERE id=3
If I want to get the point I run:
SELECT the_geom
FROM filedata
WHERE id=4
I want to get the point and the line together, as they're shown in this WITH expression, but using a real query against the table instead:
WITH data AS (
SELECT 'LINESTRING (50 40, 40 60, 50 90, 30 140)'::geometry AS road,
'POINT (60 110)'::geometry AS poi)
SELECT ST_AsText(
ST_Line_Interpolate_Point(road, ST_Line_Locate_Point(road, poi))) AS projected_poi
FROM data;
As you can see, the data in this example comes from a hand-written WITH expression. I want to take it from my filedata table. My problem is that I don't know how to work with data from two different rows of one table at the same time.
One possible way:
A subquery to retrieve another value from a different row.
SELECT ST_AsText(
ST_Line_Interpolate_Point(
the_geom
,ST_Line_Locate_Point(
the_geom
,(SELECT the_geom FROM filedata WHERE id = 4)
)
)
) AS projected_poi
FROM filedata
WHERE id = 3;
Use a self-join:
SELECT ST_AsText(
ST_Line_Interpolate_Point(fd_road.the_geom, ST_Line_Locate_Point(
fd_road.the_geom,
fd_poi.the_geom
))) AS projected_poi
FROM filedata fd_road, filedata fd_poi
WHERE fd_road.id = 3 AND fd_poi.id = 4;
Alternately use a subquery to fetch the other row, as Erwin pointed out.
The main options for using multiple rows from one table in a single expression are:
Self-join the table with two different aliases as shown above, then filter the rows;
Use a subquery expression to get a value for all but one of the rows, as Erwin's answer shows;
Use a window function like lag() and lead() to get a row relative to the current row within the query result; or
JOIN on a subquery that returns a table.
The latter two are more advanced options that solve problems that are difficult or inefficient to solve with the simpler self-join or subquery expression.
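The first two options can be compared side by side in a small sketch using Python's built-in sqlite3 module. SQLite has no PostGIS functions, so as an assumption for illustration the geometries are stored as plain WKT text and the queries only fetch the two values into one result row; in PostGIS the selected columns would be passed to the ST_ functions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Geometries kept as WKT text because SQLite has no geometry type.
    CREATE TABLE filedata (id INTEGER PRIMARY KEY, the_geom TEXT);
    INSERT INTO filedata VALUES
        (3, 'LINESTRING (50 40, 40 60, 50 90, 30 140)'),
        (4, 'POINT (60 110)');
""")

# Option 1: self-join with two aliases, one filtered to each row.
self_join = conn.execute("""
    SELECT fd_road.the_geom, fd_poi.the_geom
    FROM filedata AS fd_road, filedata AS fd_poi
    WHERE fd_road.id = 3 AND fd_poi.id = 4
""").fetchone()

# Option 2: scalar subquery that fetches the value from the other row.
subquery = conn.execute("""
    SELECT the_geom,
           (SELECT the_geom FROM filedata WHERE id = 4)
    FROM filedata
    WHERE id = 3
""").fetchone()

print(self_join)
print(subquery)
```

Both queries return a single row holding the line and the point together, which is the shape the projection expression needs.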