T-SQL - how can I use group by on xml objects - tsql

I've written the following query, which I expect to return the data set outlined below it.
Query
SELECT
RelatedRecordID AS [OrganisationID],
Data.value('(//OpportunityViewEvent/Title)[1]','nvarchar(255)') AS OpportunityTitle,
Data.value('(//OpportunityViewEvent/ID)[1]','int') AS OpportunityID,
Count(Data.value('(//OpportunityViewEvent/ID)[1]','int')) AS Visits
FROM [Audit].[EventData]
LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].RelatedRecordID = ORG.ID
Where EventTypeID = 4
Group BY RelatedRecordID
Order By Visits Desc
Expected Result
+----------------+------------------+---------------+--------+
| OrganisationID | OpportunityTitle | OpportunityID | Visits |
+----------------+------------------+---------------+--------+
| 23             | Plumber          | 122           | 567    |
| 65             | Accountant       | 34            | 288    |
| 12             | Developer        | 81            | 100    |
| 45             | Driver           | 22            | 96     |
+----------------+------------------+---------------+--------+
I receive an error saying
Column 'Audit.EventData.Data' is invalid in the select list because it
is not contained in either an aggregate function or the GROUP BY
clause.
If I then try to group the xml data I get a different error saying
XML methods are not allowed in a GROUP BY clause.
Is there a way to work around this?
Thanks

You can do this by moving the XML extraction into a CTE and then grouping on the extracted columns in the outer query:
;with cte as (
SELECT
RelatedRecordID AS [OrganisationID],
Data.value('(//OpportunityViewEvent/Title)[1]','nvarchar(255)') AS OpportunityTitle,
Data.value('(//OpportunityViewEvent/ID)[1]','int') AS OpportunityID,
Data.value('(//OpportunityViewEvent/ID)[1]','int') as visit
FROM [Audit].[EventData]
LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].RelatedRecordID = ORG.ID
Where EventTypeID = 4 )
select OrganisationID, opportunityTitle, opportunityId, count(visit) as Visits from cte
Group BY OrganisationID, opportunityTitle, opportunityId
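The same shape (extract the values in a CTE, group in the outer query) can be tried out without SQL Server. The sketch below is only an illustration in Python with SQLite: xml_value is a hypothetical user-defined function standing in for Data.value(...), the sample rows are made up, and the join to Employed.Organisation is left out because none of its columns are selected.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_value(data, path):
    """Hypothetical stand-in for T-SQL's Data.value(): text of the first match."""
    node = ET.fromstring(data).find(path)
    return None if node is None else node.text


def event(opp_id, title):
    # Build a small XML payload shaped like the one in the question.
    return (f"<OpportunityViewEvent><ID>{opp_id}</ID>"
            f"<Title>{title}</Title></OpportunityViewEvent>")


conn = sqlite3.connect(":memory:")
conn.create_function("xml_value", 2, xml_value)
conn.execute(
    "CREATE TABLE EventData (RelatedRecordID INTEGER, EventTypeID INTEGER, Data TEXT)"
)
conn.executemany("INSERT INTO EventData VALUES (?, ?, ?)", [
    (23, 4, event(122, "Plumber")),
    (23, 4, event(122, "Plumber")),
    (65, 4, event(34, "Accountant")),
    (65, 3, event(34, "Accountant")),  # other event type, filtered out
])

query = """
WITH cte AS (
    -- Step 1: pull the values out of the XML, no grouping yet
    SELECT RelatedRecordID AS OrganisationID,
           xml_value(Data, 'Title') AS OpportunityTitle,
           CAST(xml_value(Data, 'ID') AS INTEGER) AS OpportunityID
    FROM EventData
    WHERE EventTypeID = 4
)
-- Step 2: group on the plain columns produced by the CTE
SELECT OrganisationID, OpportunityTitle, OpportunityID, COUNT(*) AS Visits
FROM cte
GROUP BY OrganisationID, OpportunityTitle, OpportunityID
ORDER BY Visits DESC
"""
visits = conn.execute(query).fetchall()
print(visits)
```

The outer query only ever sees ordinary columns, which is why the GROUP BY restriction on XML methods no longer applies in the T-SQL version.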

Related

How to select rows based on properties of another row?

Had a question..
| a_id | name | r_id | message | date
_____________________________________________
| 1 | bob | 77 | bob here | 1-jan
| 1 | bob | 77 | bob here again | 2-jan
| 2 | jack | 77 | jack here. | 2-jan
| 1 | bob | 79 | in another room| 3-feb
| 3 | gill | 79 | gill here | 4-feb
These are basically accounts (a_id) chatting inside different rooms (r_id)
I'm trying to find the last chat message for every room that jack (a_id = 2) is chatting in.
What I've tried so far is using DISTINCT ON (r_id) ... ORDER BY r_id, date DESC.
But this incorrectly gives me the last message in every room, instead of only the last message in every room that jack belongs to.
| 2 | jack | 77 | jack here. | 2-jan
| 3 | gill | 79 | gill here | 4-feb
Is this a job for PARTITION BY rather than DISTINCT ON?
I would suggest :
to group the rows by r_id with a GROUP BY clause
to select only the groups where a_id = 2 is included with a HAVING clause which aggregates the a_id of each group : HAVING array_agg(a_id) @> array[2]
to select the latest message of each selected group by aggregating its rows in an array with ORDER BY date DESC and selecting the first element of the array : (array_agg(t.*))[1]
to convert the selected rows into a json object and then displaying the expected result by using the json_populate_record function
The full query is :
SELECT (json_populate_record(null :: my_table, (array_agg(to_json(t.*)))[1])).*
FROM my_table AS t
GROUP BY r_id
HAVING array_agg(a_id) @> array[2]
and the result is :
| a_id | name | r_id | message  | date       |
| 1    | bob  | 77   | bob here | 2022-01-01 |
see dbfiddle
For the last message in every chat room, the query would simply be:
select a_id, name, r_id, to_char(max(date),'dd-mon') from chats
where a_id =2
group by r_id, a_id,name;
Fiddle https://www.db-fiddle.com/f/keCReoaXg2eScrhFetEq1b/0
Or, to see the messages as well:
with last_message as (
select a_id, name, r_id, to_char(max(date),'dd-mon') date from chats
where a_id =1
group by r_id, a_id,name
)
select l.*, c.message
from last_message l
join chats c on (c.a_id= l.a_id and l.r_id=c.r_id and l.date=to_char(c.date,'dd-mon'));
Fiddle https://www.db-fiddle.com/f/keCReoaXg2eScrhFetEq1b/1
Though all this complication could be avoided with a primary key on your table.
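The PARTITION BY idea raised in the question can also be sketched directly. The example below is a hedged illustration in Python with SQLite, not PostgreSQL: it ranks the messages per room with ROW_NUMBER(), restricted to rooms where the given account has posted. The ISO date strings and the tie-break on SQLite's implicit rowid are assumptions made for this sketch, since the original table has no primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chats (a_id INTEGER, name TEXT, r_id INTEGER, message TEXT, date TEXT)"
)
conn.executemany("INSERT INTO chats VALUES (?, ?, ?, ?, ?)", [
    (1, "bob", 77, "bob here", "2022-01-01"),
    (1, "bob", 77, "bob here again", "2022-01-02"),
    (2, "jack", 77, "jack here.", "2022-01-02"),
    (1, "bob", 79, "in another room", "2022-02-03"),
    (3, "gill", 79, "gill here", "2022-02-04"),
])

query = """
WITH ranked AS (
    SELECT a_id, name, r_id, message, date,
           -- newest message per room first; rowid breaks same-day ties (assumption)
           ROW_NUMBER() OVER (PARTITION BY r_id
                              ORDER BY date DESC, rowid DESC) AS rn
    FROM chats
    -- keep only rooms in which the given account has posted
    WHERE r_id IN (SELECT r_id FROM chats WHERE a_id = ?)
)
SELECT a_id, name, r_id, message, date
FROM ranked
WHERE rn = 1
ORDER BY r_id
"""
last_for_jack = conn.execute(query, (2,)).fetchall()
last_for_bob = conn.execute(query, (1,)).fetchall()
print(last_for_jack)
print(last_for_bob)
```

Running it for bob (a_id = 1) shows that the last message in a room need not come from the account used as the filter.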

PostgreSQL: selecting two different values as two columns from one column

I need to select two IDs from stockcurrent as two different columns (id1, id2): the first where points.id = '244' and the second where points.id = '191'. But the result follows only the final WHERE clause, so only one column is filled.
I think I've faced a similar problem as in that case: Two SELECT statements as two columns
The only difference is that in the case above his last where clause is in range but mine is not. In my opinion, it is the reason why my statement is not working:
select
(case when po.id='244' then st.id end) id1,
(case when po.id='191' then st.id end) id2
from stockcurrent st
inner join points po on po.id = st.point
where po.id ='244';
My result:
Expected result:
So I need a solution that fills both columns with IDs, not only the one that, in this case, holds the results for '244'. Thanks in advance.
Example of stockcurrent table:
+-------+-------+
| id | point |
+-------+-------+
| 23414 | 191 |
| 12493 | 191 |
| 16121 | 170 |
| 24325 | 191 |
| 51232 | 244 |
| 11255 | 244 |
| 56572 | 244 |
| 16123 | 170 |
+-------+-------+
Example of points table:
+-----+------+------+
| id | comp | type |
+-----+------+------+
| 191 | 96 | 2 |
| 307 | 96 | 1 |
| 244 | 97 | 0 |
| 311 | 98 | 0 |
| 170 | 109 | 0 |
+-----+------+------+
Change the query to:
select
(case when po.id='244' then st.id end) id1,
(case when po.id='191' then st.id end) id2
from stockcurrent st
inner join points po on po.id = st.point
where po.id in ('244', '191');
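To see what the widened WHERE clause returns, here is a small sketch in Python with SQLite using the sample tables from the question. It is an illustration only: integer literals are used instead of the quoted '244' and '191', and an ORDER BY is added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stockcurrent (id INTEGER, point INTEGER);
CREATE TABLE points (id INTEGER, comp INTEGER, type INTEGER);
INSERT INTO stockcurrent VALUES
    (23414, 191), (12493, 191), (16121, 170), (24325, 191),
    (51232, 244), (11255, 244), (56572, 244), (16123, 170);
INSERT INTO points VALUES
    (191, 96, 2), (307, 96, 1), (244, 97, 0), (311, 98, 0), (170, 109, 0);
""")

query = """
SELECT (CASE WHEN po.id = 244 THEN st.id END) AS id1,
       (CASE WHEN po.id = 191 THEN st.id END) AS id2
FROM stockcurrent st
INNER JOIN points po ON po.id = st.point
WHERE po.id IN (244, 191)   -- both points must pass the filter
ORDER BY st.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each row fills exactly one of the two columns and leaves the other NULL, because every stockcurrent row belongs to a single point.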

Accomplishing what I need without a CROSS JOIN

I have a query that pulls from a table. With this table, I would like to build a query that allows me to make projections into the future.
SELECT
b.date,
a.id,
SUM(CASE WHEN a.date = b.date THEN a.sales ELSE 0 END) sales,
SUM(CASE WHEN a.date = b.date THEN a.revenue ELSE 0 END) revenue
FROM
table_a a
CROSS JOIN table_b b
WHERE a.date BETWEEN '2018-10-31' AND '2018-11-04'
GROUP BY 1,2
table_b is a table with literally only one column that contains dates going deep into the future. This returns results like this:
+----------+--------+-------+---------+
| date | id | sales | revenue |
+----------+--------+-------+---------+
| 11/4/18 | 113972 | 0 | 0 |
| 11/4/18 | 111218 | 0 | 0 |
| 11/3/18 | 111218 | 0 | 0 |
| 11/3/18 | 113972 | 0 | 0 |
| 11/2/18 | 111218 | 0 | 0 |
| 11/2/18 | 113972 | 0 | 0 |
| 11/1/18 | 111218 | 89 | 2405.77 |
| 11/1/18 | 113972 | 265 | 3000.39 |
| 10/31/18 | 111218 | 64 | 2957.71 |
| 10/31/18 | 113972 | 120 | 5650.91 |
+----------+--------+-------+---------+
Now there's more to the query after this where I get into the projections and what not, but for the purposes of this question, this is all you need, as it's where the CROSS JOIN exists.
How can I recreate these results without using a CROSS JOIN? In reality, this query covers a much larger date range with far more data. It takes hours and a lot of compute to run, and I know CROSS JOINs should be avoided if possible.
Use the table of all dates as the "from table" and left join the data; this still returns each date.
SELECT
d.date
, t.id
, COALESCE(SUM(t.sales),0) sales
, COALESCE(SUM(t.revenue),0) revenue
FROM all_dates d
LEFT JOIN table_data t
ON d.date = t.date
WHERE d.date BETWEEN '2018-10-31' AND '2018-11-04'
GROUP BY
d.date
, t.id
Another alternative (to avoid the cross join) could be to use generate_series, but for this, in Redshift, I suggest this former answer. I'm a fan of generate_series, but if you already have a table I would probably stay with that (this is based on what little I know about your query).
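As a rough check of the LEFT JOIN approach, the sketch below runs it in Python with SQLite on the figures from the question. The table names all_dates and table_data follow the answer; the extra dates outside the range are made-up padding for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE all_dates ("date" TEXT);
CREATE TABLE table_data ("date" TEXT, id INTEGER, sales INTEGER, revenue REAL);
INSERT INTO all_dates VALUES
    ('2018-10-30'), ('2018-10-31'), ('2018-11-01'), ('2018-11-02'),
    ('2018-11-03'), ('2018-11-04'), ('2018-11-05');
INSERT INTO table_data VALUES
    ('2018-10-31', 111218, 64, 2957.71),
    ('2018-10-31', 113972, 120, 5650.91),
    ('2018-11-01', 111218, 89, 2405.77),
    ('2018-11-01', 113972, 265, 3000.39);
""")

query = """
SELECT d."date",
       t.id,
       COALESCE(SUM(t.sales), 0)   AS sales,
       COALESCE(SUM(t.revenue), 0) AS revenue
FROM all_dates d
LEFT JOIN table_data t ON d."date" = t."date"   -- keeps dates with no data
WHERE d."date" BETWEEN '2018-10-31' AND '2018-11-04'
GROUP BY d."date", t.id
ORDER BY d."date", t.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every date in the range still appears, but a date with no data yields a single row with a NULL id rather than one zero row per id as the CROSS JOIN produced. Whether that matters depends on what the later projection steps need.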

How to join 2 tables without value duplication in PostgreSql

I am joining two tables using:
select table1.date, table1.item, table1.qty, table2.anotherQty
from table1
INNER JOIN table2
on table1.date = table2.date
table1
date | item | qty
july1 | itemA | 20
july1 | itemB | 30
july2 | itemA | 20
table2
date | anotherQty
july1 | 200
july2 | 300
Expected result should be:
date | item | qty | anotherQty
july1 | itemA | 20 | 200
july1 | itemB | 30 | null or 0
july2 | itemA | 20 | 300
So that when I sum(anotherQty) I get only 500, instead of:
date | item | qty | anotherQty
july1 | itemA | 20 | 200
july1 | itemB | 30 | 200
july2 | itemA | 20 | 300
That is 200+200+300 = 700
SQL DEMO
WITH T1 as (
SELECT *, ROW_NUMBER() OVER (PARTITION BY "date" ORDER BY "item") as rn
FROM Table1
), T2 as (
SELECT *, ROW_NUMBER() OVER (PARTITION BY "date" ORDER BY "anotherQty") as rn
FROM Table2
)
SELECT *
FROM t1
LEFT JOIN t2
ON t1."date" = t2."date"
AND t1.rn = t2.rn
OUTPUT
Filter the columns you want, and change the order if needed.
| date | item | qty | rn | date | anotherQty | rn |
|-------|-------|-----|----|--------|------------|--------|
| july1 | itemA | 20 | 1 | july1 | 200 | 1 |
| july1 | itemB | 30 | 2 | (null) | (null) | (null) |
| july2 | itemA | 20 | 1 | july2 | 300 | 1 |
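The row-number pairing can be checked against the expected total of 500 with a short sketch in Python and SQLite. It is an illustration using the sample data from the question, with the column list trimmed to the four columns of interest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 ("date" TEXT, item TEXT, qty INTEGER);
CREATE TABLE table2 ("date" TEXT, anotherQty INTEGER);
INSERT INTO table1 VALUES
    ('july1', 'itemA', 20), ('july1', 'itemB', 30), ('july2', 'itemA', 20);
INSERT INTO table2 VALUES ('july1', 200), ('july2', 300);
""")

query = """
WITH t1 AS (
    SELECT "date", item, qty,
           ROW_NUMBER() OVER (PARTITION BY "date" ORDER BY item) AS rn
    FROM table1
), t2 AS (
    SELECT "date", anotherQty,
           ROW_NUMBER() OVER (PARTITION BY "date" ORDER BY anotherQty) AS rn
    FROM table2
)
SELECT t1."date", t1.item, t1.qty, t2.anotherQty
FROM t1
LEFT JOIN t2
       ON t1."date" = t2."date"
      AND t1.rn = t2.rn        -- each table2 row pairs with one table1 row only
ORDER BY t1."date", t1.item
"""
paired = conn.execute(query).fetchall()
# Treat the unmatched (NULL) anotherQty as 0 when summing.
total_another_qty = sum(row[3] or 0 for row in paired)
print(paired)
print(total_another_qty)
```

Because each table2 row is matched to at most one table1 row per date, summing anotherQty no longer double-counts the july1 value.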
Try the following code, but note that as long as the qty values differ across rows, the 'anotherQty' field will still break out into distinct values:
select
table1.date,
table1.item,
table1.qty,
SUM(table2.anotherQty)
from table1
INNER JOIN table2
on table1.date = table2.date
GROUP BY
table1.item,
table1.qty,
table1.date
If you need it to always aggregate down to a single line per item/date, then you will need to add a SUM() to table1.qty as well. Alternately, you could run a common table expression (WITH() statement) for each quantity that you want, summing them within the common table expression, and then rejoining the expressions to your final SELECT statement.
Edit:
Based on the comment from @Juan Carlos Oropeza, I'm not sure that there is a way to get the summed value of 500 while including table1.date in your query, because you will have to group the output by date, which will cause the aggregation to split into distinct lines. The following query will get you the sum of anotherQty, at the cost of not displaying date:
select
table1.item,
SUM(table1.qty),
SUM(table2.anotherQty)
from table1
INNER JOIN table2
on table1.date = table2.date
GROUP BY
table1.item
If you need date to persist, you can get the sum to show up by using a WINDOW function, but note that this is essentially doing a running sum, and may throw off any subsequent summation you're doing on this query's output in terms of post-processing:
select
table1.item,
table1.date,
SUM(table1.qty),
SUM(table2.anotherQty) OVER (Partition By table1.item)
from table1
INNER JOIN table2
on table1.date = table2.date
GROUP BY
table1.item,
table1.date,
table2.anotherQty

T-SQL 2012 - How can I pare down a query and check the returned values

I have a table which captures various events and returns data as below
+----+-----------+--------------+-----------------+
| ID | EventType | SystemUserID | OrganisationID |
+----+-----------+--------------+-----------------+
| 23 | 8 | 11 | 1 |
| 44 | 7 | 456 | 304 |
| 80 | 1 | 48 | 4042 |
| 89 | 2 | 5673 | 183 |
+----+-----------+--------------+-----------------+
EventType 8 is captured each time a user visits an organisation.
I have another table named OrganisationHasContact as such
+----+----------------+--------------+
| ID | OrganisationID | SystemUserID |
+----+----------------+--------------+
| 23 | 1 | 11 |
| 44 | 304 | 456 |
| 80 | 4042 | 48 |
| 89 | 183 | 5673 |
+----+----------------+--------------+
I have written a query to find the total number of visits each organisation has had, excluding SystemUsers who are associated with that organisation:
SELECT
OrganisationID AS [OrganisationID],
ORG.Name AS [Company Name],
Count(OrganisationID) AS [Visits]
FROM [Audit].[EventData]
LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].OrganisationID = ORG.ID
Where Not Exists(
Select * From [Audit].[EventData] AS A2
Inner join Employed.OrganisationHasContact AS OHC On [EventData].OrganisationID = OHC.OrganisationID
Where OHC.SystemUserID = [EventData].SystemUserID
)
AND
EventTypeID = 8
Group BY [EventData].OrganisationID,
ORG.Name
Order By Visits Desc
To test that the above query was returning correct values (449 visits), I did the following.
First, I ran this query to get all SystemUserIDs associated with Org 1741:
Select * from Employed.OrganisationHasContact
Where OrganisationID = 1741
Then, with the returned SystemUserIDs, I ran this query:
SELECT Distinct
OrganisationID AS [OrganisationID],
ORG.Name AS [Company Name],
Count(OrganisationID) AS [Visits]
FROM [Audit].[EventData]
LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].OrganisationID = ORG.ID
Where
EventTypeID = 8
AND SystemUserID NOT IN (35, 4602, 48, 4603, 7704)
Group BY OrganisationID,
ORG.Name
Order By Visits Desc
This returned 446 visits for Org 1741, whereas my original query returns 449 visits.
What could be causing the difference?
Can I improve my original query somehow?
Is there a better way to test this?
Thanks