Condensing Left Join Result Set into one row - tsql

I have a sql query which currently, due to a couple of Left joins, returns multiple rows:-
Id
Action
Group
12345 NULL NULL
12345 ADD NULL
12345 NULL ABC Group
How do I go about condensing these 3 rows into one e.g.
12345 ADD ABC Group

The constraints of your question are a bit unclear. If for every ID, there is only one possible non-NULL value of the other columns, you could use MIN or MAX to pull those values out. For example, put the query with the LEFT JOINS into a CTE, then do
SELECT ID, MAX(col2),MAX(col3)
FROM CTE
GROUP BY ID
If there is the potential for multiple non-NULL values per column, you will need to be more specific about what you would want the output to look like.

Related

PostgreSQL how to GROUP BY single field from returned table

So I have complicated query, to simplify let it be like
SELECT
t.*,
SUM(a.hours) AS spent_hours
FROM (
SELECT
person.id,
person.name,
person.age,
SUM(contacts.id) AS contact_count
FROM
person
JOIN contacts ON contacts.person_id = person.id
) AS t
JOIN activities AS a ON a.person_id = t.id
GROUP BY t.id
Such query works fine in MySQL, but Postgres needs to know that GROUP BY field is unique, and despite it actually is, in this case I need to GROUP BY all returned fields from returned t table.
I can do that, but I don't believe that will work efficiently with big data.
I can't JOIN with activities directly in first query, as person can have several contacts which will lead query counting hours of activity several time for every joined contact.
Is there a Postgres way to make this query work? Maybe force to treat Postgres t.id as unique or some other solution that will make same in Postgres way?
This query will not work on both database system, there is an aggregate function in the inner query but you are not grouping it(unless you use window functions). Of course there is a special case for MySQL, you can use it with disabling "sql_mode=only_full_group_by". So, MySQL allows this usage because of it' s database engine parameter, but you cannot do that in PostgreSQL.
I knew MySQL allowed indeterminate grouping, but I honestly never knew how it implemented it... it always seemed imprecise to me, conceptually.
So depending on what that means (I'm too lazy to look it up), you might need one of two possible solutions, or maybe a third.
If you intent is to see all rows (perform the aggregate function but not consolidate/group rows), then you want a windowing function, invoked by partition by. Here is a really dumbed down version in your query:
.
SELECT
t.*,
SUM (a.hours) over (partition by t.id) AS spent_hours
FROM t
JOIN activities AS a ON a.person_id = t.id
This means you want all records in table t, not one record per t.id. But each row will also contain a sum of the hours for all values that value of id.
For example the sum column would look like this:
Name Hours Sum Hours
----- ----- ---------
Smith 20 120
Jones 30 30
Smith 100 120
Whereas a group by would have had Smith once and could not have displayed the hours column in detail.
If you really did only want one row per t.id, then Postgres will require you to tell it how to determine which row. In the example above for Smith, do you want to see the 20 or the 100?
There is another possibility, but I think I'll let you reply first. My gut tells me option 1 is what you're after and you want the analytic function.

Create SQL Column Counting Frequency of Value in other column

I have the first three columns in SQL. I want to create the 4th column called Count which counts the number of times each unique name appears in the Name column. I want my results to appears like the dataset below, so I don't want to do a COUNT and GROUP BY.
What is the best way to achieve this?
We can try to use COUNT window function
SELECT *,COUNT(*) OVER(PARTITION BY name ORDER BY year,month) count
FROM T
ORDER BY year,month
sqlfiddle

How to get distinct results with max of a field

I have a query :
select distinct(donorig_cdn),cerhue_num_rfa,max(cerhue_dt) from t_certif_hue
group by donorig_cdn,cerhue_num_rfa
order by donorig_cdn
it returns me some repeated ids with different cerhue_num_rfa
how do i return only one line for the repeated ids with cerhue_num_rfa that matches the max of date (cerhue_dt) .. and have at the end only 10 results instead of 15 ?
Postgres has SELECT DISTINCT ON to the rescue. It only returns the first row found for each value of the given column. So, all you need is an order that ensures the latest entry comes first. No need for grouping.
SELECT DISTINCT ON (donorig_cdn) donorig_cdn,cerhue_num_rfa,cerhue_dt
FROM t_certif_hue
ORDER BY donorig_cdn, cerhue_dt DESC;

T-SQL LEFT JOIN on bigint id return only ids lower than 101 on right table

I have two tables on a Sql Server 2008.
ownership with 3 fields and case with another 3 fields I need to join both on the ID field (bigint).
For testing purposes I'm only using one field from each table. This field is bigint and has values from 1 to 170 (for now).
My query is:
SELECT DISTINCT
ownership.fCase,
case.id
FROM
ownership LEFT JOIN case ON (case.id=ownership.fCase)
WHERE
ownership.dUser='demo'
This was expected to return 4 rows with the same values on both columns. Problem is that the last row of the right table comes as null for the fCase = 140. This is the only value above 100.
If I run the query without the WHERE clause it show all rows on the left table but the values on the right only apear if below 101 otherwise shows null.
Can someone help me, am I doing something wrong or is this a limitation or a bug?
Case is also a verb so it may be getting confused. Try your table and column names in []. E.G. [case].[id] = [ownership].[fCase]. Are you like double check sure that [case].[id] and [ownership].[fCase] are both bigint. If your current values are 1-170 then why bigint (9,223,372,036,854,775,807)? Does that column accept nulls?

Duplicate values returned with joins

I was wondering if there is a way using TSQL join statement (or any other available option) to only display certain values. I will try and explain exactly what I mean.
My database has tables called Job, consign, dechead, decitem. Job, consign, and dechead will only ever have one line per record but decitem can have multiple records all tied to the dechead with a foreign key. I am writing a query that pulls various values from each table. This is fine with all the tables except decitem. From dechead I need to pull an invoice value and from decitem I need to grab the net wieghts. When the results are returned if dechead has multiple child decitem tables it displays all values from both tables. What I need it to do is only display the dechad values once and then all the decitems values.
e.g.
1 ¦123¦£2000¦15.00¦1
2 ¦--¦------¦20.00¦2
3 ¦--¦------¦25.00¦3
Line 1 displays values from dechead and the first line/Join from decitems. Lines 2 and 3 just display values from decitem. If I then export the query to say excel I do not have duplicate values in the first two fileds of lines 2 and 3
e.g.
1 ¦123¦£2000¦15.00¦1
2 ¦123¦£2000¦20.00¦2
3 ¦123¦£2000¦25.00¦3
Thanks in advance.
Check out 'group by' for your RDBMS http://msdn.microsoft.com/en-US/library/ms177673%28v=SQL.90%29.aspx
this is a task best left for the application, but if you must do it in sql, try this:
SELECT
CASE
WHEN RowVal=1 THEN dt.col1
ELSE NULL
END as Col1
,CASE
WHEN RowVal=1 THEN dt.col2
ELSE NULL
END as Col2
,dt.Col3
,dt.Col4
FROM (SELECT
col1, col2, col3
,ROW_NUMBER OVER(PARTITION BY Col1 ORDER BY Col1,Col4) AS RowVal
FROM ...rest of your big query here...
) dt
ORDER BY dt.col1,dt.Col4