Postgres ORDER BY using double precision not properly sorting - postgresql

I'm trying to execute what should be a simple query using Postgres 9.3.2.
SELECT
event.id AS event_id, event.created_at AS event_created_at, event.updated_at AS event_updated_at, event.title AS event_title, event.summary AS event_summary, event.image AS event_image, event.active AS event_active, event.raw_score AS event_raw_score, event._score AS event__score, user_1.id AS user_1_id, user_1.email AS user_1_email, user_1.image AS user_1_image, user_1.name AS user_1_name, user_1.password AS user_1_password, user_1.active AS user_1_active, user_1.confirmed_at AS user_1_confirmed_at, user_1.created_at AS user_1_created_at, user_1.updated_at AS user_1_updated_at
FROM event
LEFT OUTER JOIN (
users_events AS users_events_1
JOIN "user" AS user_1 ON user_1.id = users_events_1.user_id
) ON event.id = users_events_1.event_id
ORDER BY event._score DESC
I'm expecting the results to be ordered by event._score, descending, but they never are. They don't seem to be ordered in any way relating to event._score. The datatype for that column is double precision.
The order I expect the values to be in are:
5787.0428202
5787.0427916
5787.0427628
But they are returned as:
5787.0427916
5787.0427628
5787.0428202
I've been struggling with this for a couple hours now and can't figure out why something so simple isn't working correctly.

After a lot more digging, it turns out that this was not an issue with Postgres, but with how the ORM I was using (sqlalchemy) was setup. The _score property was not properly being set in the database, though it was set within the session, so the objects in the session were returning the proper values, but objects fetched from the database were not.

Related

Conditional WHERE clause in KDB?

Full Query:
{[tier;company;ccy; startdate; enddate] select Deal_Time, Deal_Date from DEALONLINE_REMOVED where ?[company = `All; 1b; COMPANY = company], ?[tier = `All;; TIER = tier], Deal_Date within(startdate;enddate), Status = `Completed, ?[ccy = `All;1b;CCY_Pair = ccy]}
Particular Query:
where ?[company = `All; 1b; COMPANY = company], ?[tier = `All; 1b; TIER = tier],
What this query is trying to do is to get the viewstate of a dropdown.
If there dropdown selection is "All", that where clause i.e. company or tier is invalidated, and all companies or tiers are shown.
I am unsure if the query above is correct as I am getting weird charts when displaying them on KDB dashboard.
What I would recommend is to restructure your function to make use of the where clause using functional qSQL.
In your case, you need to be able to filter based on certain input, if its "All" then don't filter else filter on that input. Something like this could work.
/Define sample table
DEALONLINE_REMOVED:([]Deal_time:10#.z.p;Deal_Date:10?.z.d;Company:10?`MSFT`AAPL`GOOGL;TIER:10?`1`2`3)
/New function which joins to where clause
{[company;tier]
wc:();
if[not company=`All;wc:wc,enlist (=;`Company;enlist company)];
if[not tier=`All;wc:wc,enlist (=;`TIER;enlist tier)];
?[DEALONLINE_REMOVED;wc;0b;()]
}[`MSFT;`2]
If you replace the input with `All you will see that everything is returned.
The full functional select for your query would be as follows:
whcl:{[tier;company;ccy;startdate;enddate]
wc:(enlist (within;`Deal_Date;(enlist;startdate;enddate))),(enlist (=;`Status;enlist `Completed)),
$[tier=`All;();enlist (=;`TIER;enlist tier)],
$[company=`All;()enlist (=;`COMPANY;enlist company)],
$[ccy=`All;();enlist (=;`CCY_Pair;enlist ccy)];
?[`DEALONLINE_REMOVED;wc;0b;`Deal_Time`Deal_Date!`Deal_Time`Deal_Date]
}
The first part specifies your date range and status = `Completed in the where clause
wc:(enlist (within;`Deal_Date;(enlist;startdate;enddate))),(enlist (=;`Status;enlist `Completed)),
Next each of these conditionals checks for `All for the TIER, COMPANY and CCY_Pair column filtering. It then joins these on to the where clause when a specific TIER, COMPANY or CCY_Pair are specified. (otherwise an empty list is joined on):
$[tier=`All;();enlist (=;`TIER;enlist tier)],
$[company=`All;();enlist (=;`COMPANY;enlist company)],
$[ccy=`All;();enlist (=;`CCY_Pair;enlist ccy)];
Finally, the select statement is called in its functional form as follows, with wc as the where clause:
?[`DEALONLINE_REMOVED;wc;0b;`Deal_Time`Deal_Date!`Deal_Time`Deal_Date]

Advanced MySQLi query syntax

I have a fairly complex mysqli insert statement to do, as I need to gather data from two other tables and also use some variables available to me in php all in one statement.
It's confusing me because of the order of the variables for the prepared statements. - I am only inserting four values, but in order to get the values for t1 and t2, I am referencing these three additional variables too in the bind_param statement.
I should be populating jobRoleID,companyID,departmentID,competenceID with $newJobRoleID,$newCompanyID,t1.id,t2.id
$addJobCompetencies = $con->prepare("
INSERT INTO jobRoleCompetencies (jobRoleID,companyID,departmentID,competenceID)
WITH t1 AS (SELECT id FROM departments WHERE companyID = ? AND department = ?),
t2 AS (SELECT id FROM competencies WHERE companyID = ?)
SELECT ?,?,t1.id,t2.id FROM t1,t2");
if($addJobCompetencies) {
$addJobCompetencies->bind_param('isiii', $newCompanyID,$row['department'],$newCompanyID,$newJobRoleID,$newCompanyID);
$addJobCompetencies->execute();
} else {
echo $con->error;
}
$addJobCompetencies->free_result();
I am getting a syntax error but I'm not 100% sure why. Can anyone help?
Thanks
Dan
I haven't resolved the syntax issue with this query, but have got around it by simply combining the select statements to retrieve all the information I need in a single select statement...
SELECT ?,?,departments.id AS departmentID,competencies.id AS competencyID FROM departments,competencies... etc

Select most reviewed courses starting from courses having at least 2 reviews

I'm using Flask-SQLAlchemy with PostgreSQL. I have the following two models:
class Course(db.Model):
id = db.Column(db.Integer, primary_key = True )
course_name =db.Column(db.String(120))
course_description = db.Column(db.Text)
course_reviews = db.relationship('Review', backref ='course', lazy ='dynamic')
class Review(db.Model):
__table_args__ = ( db.UniqueConstraint('course_id', 'user_id'), { } )
id = db.Column(db.Integer, primary_key = True )
review_date = db.Column(db.DateTime)#default=db.func.now()
review_comment = db.Column(db.Text)
rating = db.Column(db.SmallInteger)
course_id = db.Column(db.Integer, db.ForeignKey('course.id') )
user_id = db.Column(db.Integer, db.ForeignKey('user.id') )
I want to select the courses that are most reviewed starting with at least two reviews. The following SQLAlchemy query worked fine with SQlite:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
having(func.count(models.Review.course_id) >1) \ .order_by(func.count(models.Review.course_id).desc()).all()
But when I switched to PostgreSQL in production it gives me the following error:
ProgrammingError: (ProgrammingError) column "review.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT review.id AS review_id, review.review_date AS review_...
^
'SELECT review.id AS review_id, review.review_date AS review_review_date, review.review_comment AS review_review_comment, review.rating AS review_rating, review.course_id AS review_course_id, review.user_id AS review_user_id, count(review.course_id) AS count_1 \nFROM review GROUP BY review.course_id \nHAVING count(review.course_id) > %(count_2)s ORDER BY count(review.course_id) DESC' {'count_2': 1}
I tried to fix the query by adding models.Review in the GROUP BY clause but it did not work:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
having(func.count(models.Review.course_id) >1) \.order_by(func.count(models.Review.course_id).desc()).all()
Can anyone please help me with this issue. Thanks a lot
SQLite and MySQL both have the behavior that they allow a query that has aggregates (like count()) without applying GROUP BY to all other columns - which in terms of standard SQL is invalid, because if more than one row is present in that aggregated group, it has to pick the first one it sees for return, which is essentially random.
So your query for Review basically returns to you the first "Review" row for each distinct course id - like for course id 3, if you had seven "Review" rows, it's just choosing an essentially random "Review" row within the group of "course_id=3". I gather the answer you really want, "Course", is available here because you can take that semi-randomly selected Review object and just call ".course" on it, giving you the correct Course, but this is a backwards way to go.
But once you get on a proper database like Postgresql you need to use correct SQL. The data you need from the "review" table is just the course_id and the count, nothing else, so query just for that (first assume we don't actually need to display the counts, that's in a minute):
most_rated_course_ids = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
all()
but that's not your Course object - you want to take that list of ids and apply it to the course table. We first need to keep our list of course ids as a SQL construct, instead of loading the data - that is, turn it into a derived table by converting the query into a subquery (change the word .all() to .subquery()):
most_rated_course_id_subquery = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
subquery()
one simple way to link that to Course is to use an IN:
courses = session.query(Course).filter(
Course.id.in_(most_rated_course_id_subquery)).all()
but that's essentially going to throw away the "ORDER BY" you're looking for and also doesn't give us any nice way of actually reporting on those counts along with the course results. We need to have that count along with our Course so that we can report it and also order by it. For this we use a JOIN from the "course" table to our derived table. SQLAlchemy is smart enough to know to join on the "course_id" foreign key if we just call join():
courses = session.query(Course).join(most_rated_course_id_subquery).all()
then to get at the count, we need to add that to the columns returned by our subquery along with a label so we can refer to it:
most_rated_course_id_subquery = session.query(
Review.course_id,
func.count(Review.course_id).label("count")
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
subquery()
courses = session.query(
Course, most_rated_course_id_subquery.c.count
).join(
most_rated_course_id_subquery
).order_by(
most_rated_course_id_subquery.c.count.desc()
).all()
A great article I like to point out to people about GROUP BY and this kind of query is SQL GROUP BY techniques which points out the common need for the "select from A join to (subquery of B with aggregate/GROUP BY)" pattern.

Postgres - Get data from each alias

In my application i have a query that do multiple joins with a table position. Just like this:
SELECT *
FROM (...) as trips
join trip as t on trips.trip_id = t.trip_id
left outer join vehicle as v on v.vehicle_id = t.trip_vehicle_id
left outer join position as start on trips.start_position_id = start.position_id and start.position_vehicle_id = v.vehicle_id
left outer join position as "end" on trips.end_position_id = "end".position_id and "end".position_vehicle_id = v.vehicle_id
left outer join position as last on trips.last_position_id = last.position_id and last.position_vehicle_id = v.vehicle_id;
My table position has 35 columns(for example position_id).
When I run the query, in result should appear the table position 3 times, start, end and last. But postgres can not distinguish between, for exemplar, start.position_id, end.position_id and last.position_id. So this 3 columns are group and appear as one, position_id.
As the data from start.position_id and end.position_id are different, the column, position_id, that appear in result, it's empty.
Without having to rename all the columns, like this: start.position_id as start_position_id.
How can i get each group of data separately, for exemple, get all columns from the table 'start'. In MYSQL i can do this operation by calling fetch_fields, and give the function an alias, like 'start'.
But i can i do this in Postgres?
Best Regards,
Nuno Oliveira
My understanding is that you can't (or find it difficult to) discern between which table each column with a shared name (such as "position_id") belongs to, but only need to see one of the sets of shared columns at any one time. If that is the case, use tablename.* in your SELECT, so SELECT trips.*, start.*... would show the columns from trips and start, but no columns from other tables involved in the join.
SELECT [...,] start.* [,...] FROM [...] atable AS start [...]

Updating records in Postgres using nested sub-selects

I have a table where I have added a new column, and I want to write a SQL statement to update that column based on existing information. Here are the two tables and the relevant columns
'leagues'
=> id
=> league_key
=> league_id (this is the new column)
'permissions'
=> id
=> league_key
Now, what I want to do, in plain English, is this
Set leagues.league_id to be permissions.id for each value of permissions.league_key
I had tried SQL like this:
UPDATE leagues
SET league_id =
(SELECT id FROM permissions WHERE league_key =
(SELECT distinct(league_key) FROM leagues))
WHERE league_key = (SELECT distinct(league_key) FROM leagues)
but I am getting an error message that says
ERROR: more than one row returned by a subquery used as an expression
Any help for this would be greatly appreciated
Based on your requirements of
Set leagues.league_id to be permissions.id for each value of permissions.league_key
This does that.
UPDATE leagues
SET league_id = permissions_id
FROM permissions
WHERE permissions.league_key = leagues.league_key;
When you do a subquery as an expression, it can't return a result set. Your subquery must evaluate to a single result. The error that you are seeing is because one of your subqueries returns more than one value.
Here is the relevant documentation for pg84: