Why do I need to group by columns that I don't need to group by? - postgresql

Say I have a query like this:
SELECT
car.id,
car.make,
car.model,
car.vin,
car.year,
car.color
FROM car GROUP BY car.make
I want to group the result by make so I can eliminate any duplicate makes. I'm essentially trying to do a SELECT DISTINCT. But I get this error:
ERROR column must appear in the GROUP BY clause or be used in an aggregate function
It seems silly to group by each column when I don't want to see any of them in a group. How do I get around this?

Instead of GROUP BY, use DISTINCT ON:
SELECT DISTINCT ON (c.make) c.*
FROM car c
ORDER BY c.make;
This returns one row for each make. Without a further sort key, which row you get is arbitrary. You can include a second key in the ORDER BY to determine the particular row you want (cheapest, oldest, etc.).
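As a sketch of that second sort key, assuming "the row you want" is the newest car per make (the question does not say which row is wanted), with car.id as a tie-breaker so the result is repeatable:

```sql
-- One row per make: the car with the highest year; id breaks ties.
-- The leftmost ORDER BY expressions must match the DISTINCT ON expressions.
SELECT DISTINCT ON (c.make) c.*
FROM car c
ORDER BY c.make, c.year DESC, c.id;
```

If year can be NULL, note that PostgreSQL sorts NULLs first under DESC, so you may want c.year DESC NULLS LAST.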

Every column in the SELECT list must appear in the GROUP BY clause unless it is used only inside an aggregate function. PostgreSQL only lets you omit from the GROUP BY clause columns that are functionally dependent on the columns that are in the GROUP BY, for example when you group by the table's primary key.
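To illustrate the functional-dependency rule, here is a minimal sketch. It assumes car.id is declared as the table's PRIMARY KEY, which the question does not state:

```sql
-- Accepted by PostgreSQL 9.1 and later (assuming id is the primary key):
-- every other column of car is functionally dependent on id,
-- so it does not have to be listed in GROUP BY.
SELECT car.id, car.make, car.model, car.year
FROM car
GROUP BY car.id;

-- Rejected: model is not functionally dependent on make.
-- SELECT car.make, car.model FROM car GROUP BY car.make;
```

Grouping by the primary key alone does not collapse any rows, so this form is mostly useful when the query also joins other tables and aggregates over them.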

Related

Create SQL Column Counting Frequency of Value in other column

I have the first three columns in SQL. I want to create a 4th column called Count which counts the number of times each unique name appears in the Name column. I want my results to appear like the dataset below, so I don't want to do a COUNT and GROUP BY.
What is the best way to achieve this?
You can use COUNT as a window function:
SELECT *,COUNT(*) OVER(PARTITION BY name ORDER BY year,month) count
FROM T
ORDER BY year,month
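One thing to watch in the query above: with an ORDER BY inside OVER, the default window frame ends at the current row (and its peers), so COUNT(*) becomes a running count within each name, not the total. The expected dataset is not shown here, so the sketch below returns both for comparison. The aliases name_total and name_running are made up for illustration:

```sql
SELECT T.*,
       -- total number of rows that share this name
       COUNT(*) OVER (PARTITION BY name) AS name_total,
       -- running count of this name up to the current year/month
       COUNT(*) OVER (PARTITION BY name ORDER BY year, month) AS name_running
FROM T
ORDER BY year, month;
```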

How to change row order using order & group in PgAdmin, sql?

I would like to re-order rows from two columns in an existing table without creating a new one. I have this script, that works, table name test.table:
SELECT value, variety
FROM test.table
group by value, variety
order by value, variety;
I have tried update and alter table, but I cannot get it to work, e.g.:
update test.table
SELECT value, variety
FROM test.table
group by value, variety
order by value, variety;
How is this done?
I think you should have a look at this question and answer on using group by and order by together.

Unexpected behavior in a postgres group by query

I am used to writing group by queries in t-sql. In a t-sql group by, this would generate a list where items with the same categorytext were grouped together, then items within a category text group that had the same type text would be grouped together. But that does not seem to be what is happening here:
Select "CategoryText", "TypeText"
from "NewOrleans911Categories"
group by "CategoryText", "TypeText";
Here is some output from postgres. Why are the NAs not getting grouped together?
CategoryText; TypeText
"BrokenWindows";"DRUG VIOLATIONS"
"NA";"BOMB SCARE"
"Weapon";"DISCHARGING FIREARMS"
"NA";"NEGLIGENT INJURY"
"In a t-sql group by, this would generate a list where items with the same categorytext were grouped together, then items within a category text group that had the same type text would be grouped together."
In SQL, the order in which rows are returned by a query is unspecified unless you add an order by clause. Typically, you'll get the rows in whatever order the executor happens to produce them, and that depends entirely on the query plan. (As far as I'm aware, t-sql does that too.)
At any rate, you'd want to add the missing order by clause to get the expected result:
Select "CategoryText", "TypeText"
from "NewOrleans911Categories"
group by "CategoryText", "TypeText"
order by "CategoryText", "TypeText";
Or (and I suspect this is what you're actually looking for) replace the group by with an order by clause:
Select "CategoryText", "TypeText"
from "NewOrleans911Categories"
order by "CategoryText", "TypeText";
You are grouping by two columns. Rows are only grouped together when they match on both columns.
In this case the two NA rows have different TypeText values, so they are not grouped together. It works much like a distinct, which in this case would accomplish the same thing.
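A small sketch of the difference, in PostgreSQL syntax. The inline rows reuse the sample output from the question plus one made-up duplicate ('NA', 'BOMB SCARE'), and the names sample, n_rows and n_types are invented for illustration:

```sql
WITH sample("CategoryText", "TypeText") AS (
    VALUES ('BrokenWindows', 'DRUG VIOLATIONS'),
           ('NA',            'BOMB SCARE'),
           ('Weapon',        'DISCHARGING FIREARMS'),
           ('NA',            'NEGLIGENT INJURY'),
           ('NA',            'BOMB SCARE')   -- made-up duplicate
)
-- Grouping by CategoryText alone collapses all NA rows into one group;
-- grouping by both columns would keep one row per distinct pair.
SELECT "CategoryText",
       count(*)                   AS n_rows,
       count(DISTINCT "TypeText") AS n_types
FROM sample
GROUP BY "CategoryText"
ORDER BY "CategoryText";
```

If you change the GROUP BY to "CategoryText", "TypeText" (and select "TypeText" instead of the distinct count), NA shows up once per distinct TypeText, which is the behavior the question describes.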
Maybe you need a query like this:
select distinct on ("CategoryText") "CategoryText", "TypeText"
from "NewOrleans911Categories"
because with group by you cannot select columns that aren't in the group by clause.

Create a query to select two columns; (Company, No. of Films) from the database

I have created a database as part of university assignment and I have hit a snag with the question in the title.
Most likely I am being asked to find out how many films each company has made, which suggests a group by query to me. But I have no idea where to begin. It is only a two mark question but the syntax is not clicking in my head.
My schema is:
CREATE TABLE Movie
(movieID CHAR(3) ,
title CHAR(36),
year NUMBER,
company CHAR(50),
totalNoms NUMBER,
awardsWon NUMBER,
DVDPrice NUMBER(5,2),
discountPrice NUMBER(5,2))
There are other tables but at first glance I don't think they are relevant to this question.
I am using sqlplus10
The answer you need comes from three basic SQL concepts, and I'll step through them with you. If you need more assistance to create an answer from these hints, let me know and I can try to keep guiding you.
Group By
As you mentioned, SQL offers a GROUP BY clause that can help you.
A SQL Query utilizing GROUP BY would look like the following.
SELECT list, fields, aggregate(value)
FROM tablename
--WHERE goes here, if you need to restrict your result set
GROUP BY list, fields
A GROUP BY query can only return fields listed in the GROUP BY clause, or aggregate functions acting on each group.
Aggregate Functions
Your homework question also needs an Aggregate function called Count. This is used to count the results returned. A simple query like the following returns the count of all records returned.
SELECT Count(*)
FROM tablename
The two can be combined, allowing you to get the Count of each group in the following way.
SELECT list, fields, count(*)
FROM tablename
GROUP BY list, fields
Column Aliases
Another answer also introduced you to SQL column aliases, using the AS keyword:
SELECT Count(*) as count
...
In Oracle the AS keyword is optional for column aliases, so the alias can also be written without it. Double quotes make the alias case-sensitive:
SELECT Count(*) "count"
...
I'm not going to provide you the SQL, but instead a way to think about it.
What you want to do is select where the company matches and count the total rows returned. That count is the number of films made by the specified company.
Hope that points you in the right direction.
Select company, count(*) AS count
from Movie
group by company
select * group by company won't work in Oracle.

hive Expression Not In Group By Key

I created a table in Hive.
It has the following columns:
id bigint, rank bigint, date string
I want to get avg(rank) per month. I can use this command. It works.
select a.lens_id, avg(a.rank)
from tableA a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
However, I also want to get date information. I use this command:
select a.lens_id, avg(a.rank), a.date_saved
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
It complains: Expression Not In Group By Key
The full error message should be in the format Expression Not In Group By Key [value].
The [value] will tell you what expression needs to be in the Group By.
Just looking at the two queries, I'd say that you need to add a.date_saved explicitly to the Group By.
A workaround is to put the additional field in a collect_set and return the first element of the set. For example:
select a.lens_id, avg(a.rank), collect_set(a.date_saved)[0]
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
This is because there is more than one ‘date_saved’ record under your group by. You can turn these ‘date_saved’ records into arrays and output them.
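If what you actually need is the month each average belongs to, another option is to select the grouping expressions themselves and, if a concrete date is still wanted, an aggregate such as min. This is only a sketch: the aliases yr, mon, avg_rank and first_date_saved are made up, and it assumes date_saved is a string in a format that year() and month() can parse, as in the queries above:

```sql
SELECT a.lens_id,
       year(a.date_saved)  AS yr,               -- grouping expression, allowed in SELECT
       month(a.date_saved) AS mon,              -- grouping expression, allowed in SELECT
       avg(a.rank)         AS avg_rank,
       min(a.date_saved)   AS first_date_saved  -- one representative date per group
FROM lensrank_archive a
GROUP BY a.lens_id, year(a.date_saved), month(a.date_saved);
```

Unlike collect_set(...)[0], min returns a well-defined value: the earliest date_saved in the group when the strings are in a sortable format such as yyyy-MM-dd.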