TSQL: FOR XML PATH('') Failing To Group

I'm trying to group column values by a specific column using FOR XML PATH('') in T-SQL. I get the same result with and without the XML code (i.e., a plain SELECT * FROM @xml):
Class | Animals
=================================
Asteroidea | Starfish
Mammalia | Dog
Mammalia | Cat
Mammalia | Coyote
Reptilia | Crocodile
Reptilia | Lizard
According to this article and this article, the syntax should be as shown below. The second article leaves out the GROUP BY; I'm not sure how the author made that work, because when I tried it the query simply returned every value.
DECLARE @xml TABLE(
    Animal VARCHAR(50),
    Class VARCHAR(50)
)
INSERT INTO @xml
VALUES ('Dog','Mammalia')
    , ('Cat','Mammalia')
    , ('Coyote','Mammalia')
    , ('Starfish','Asteroidea')
    , ('Crocodile','Reptilia')
    , ('Lizard','Reptilia')
SELECT x1.Class
    , STUFF((SELECT ',' + x2.Animal AS [text()]
        FROM @xml x2
        WHERE x1.Animal = x2.Animal
        ORDER BY x2.Animal
        FOR XML PATH('')),1,1,'' ) AS "Animals"
FROM @xml x1
GROUP BY Class
After a few hours comparing these examples with the above code, I fail to see where my syntax is wrong, but I'm receiving the error "Column '@xml.Animal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." Note that if I leave off the GROUP BY clause, it still doesn't produce the values in the appropriate manner. Another set of eyes would be useful.

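I think your WHERE clause is using the wrong column; you want to correlate on Class, not Animal. The subquery references x1.Animal, which is not part of the outer GROUP BY, and that is what triggers the error: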
SELECT x1.Class
    , STUFF((SELECT ',' + x2.Animal AS [text()]
        FROM @xml x2
        WHERE x1.Class = x2.Class
        ORDER BY x2.Animal
        FOR XML PATH('')),1,1,'' ) AS "Animals"
FROM @xml x1
GROUP BY Class
See SQL Fiddle with Demo. The result is:
| CLASS | ANIMALS |
---------------------------------
| Asteroidea | Starfish |
| Mammalia | Cat,Coyote,Dog |
| Reptilia | Crocodile,Lizard |
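One caveat worth knowing with this pattern: plain FOR XML PATH('') entity-encodes characters such as & and <, so a value containing & would come back as &amp;. The sketch below shows the commonly used TYPE / .value() variant, which returns the text unencoded. The row 'Fish & Chips' in class 'Hypothetical' is made up purely to illustrate the point and is not part of the question's data.

```sql
-- Illustrative data: the last row is a made-up value containing '&'.
DECLARE @xml TABLE (Animal VARCHAR(50), Class VARCHAR(50));

INSERT INTO @xml
VALUES ('Dog', 'Mammalia')
     , ('Cat', 'Mammalia')
     , ('Fish & Chips', 'Hypothetical');

-- TYPE makes the subquery return the xml data type;
-- .value('.', ...) then reads its text content without entity encoding.
SELECT x1.Class
     , STUFF((SELECT ',' + x2.Animal
              FROM @xml x2
              WHERE x1.Class = x2.Class
              ORDER BY x2.Animal
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, '') AS Animals
FROM @xml x1
GROUP BY x1.Class;
```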

Related

How to join two tables with nested field?

I have a table like this:
id | ciaps
1 | a|b|c
And I have a second table like:
cod | desc
a | item a
b | item b
c | item c
I need a query to join these tables like:
id | ciaps
1 | item a|item b|item c
Use array_agg to collect the values, then array_to_string to join them with '|' into the expected format.
-- PostgreSQL (v11)
SELECT t1.id, t2.descr ciaps
FROM test1 t1
INNER JOIN (SELECT array_to_string(array_agg(cod), '|') cod
, array_to_string(array_agg(descr), '|') descr
FROM test2) t2
ON t1.ciaps = t2.cod;
See the demo at https://dbfiddle.uk/?rdbms=postgres_11&fiddle=6fffc7f1da6a02a48018b3691c99ad17
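Note that the query above aggregates all of test2 into one string and compares it to ciaps, so it only matches when ciaps lists every code in the same order. A possible alternative, sketched here with the same test1/test2/descr names as the answer, is to split ciaps per row, join each code to its description, and re-aggregate in the original order. This assumes PostgreSQL 9.4 or later (for WITH ORDINALITY) and has not been run against the asker's real data.

```sql
-- PostgreSQL: split each ciaps value on '|', look up each code, then rebuild the string.
SELECT t1.id,
       string_agg(t2.descr, '|' ORDER BY u.ord) AS ciaps
FROM test1 t1
-- unnest ... WITH ORDINALITY yields one row per code plus its position in the list
CROSS JOIN LATERAL unnest(string_to_array(t1.ciaps, '|')) WITH ORDINALITY AS u(cod, ord)
INNER JOIN test2 t2
        ON t2.cod = u.cod
GROUP BY t1.id;
```

Codes with no match in test2 are silently dropped by the inner join; a LEFT JOIN with COALESCE would keep a placeholder instead.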

DB2: How to transpose multidimensional table from row to column to find data changes across rows

I am trying the following with Db2:
Problem
So I've got a table with 80+ columns and two rows.
What I need to accomplish is checking which columns have changed value between the two rows, and returning a table of the column names that have changed, their initial value from row 1, and their new value from row 2.
Approach so far
My initial idea was to perform a pivot of the two rows into two columns, row 1 as column 1, row 2 as column 2, then join a column of column names (likely taken from syscat.columns) to the table as column 3, at which point I can then do a select where column1 != column2, hence returning the rows with all the data needed. But alas, it was not long after coming up with this that I discover DB2 doesn't support pivot / unpivot...
Question
So is there any idea for how to accomplish this in DB2, taking a table with 80+ columns and two rows like so:
| Col A | Col B | Col C | ... | Col Z|
| ----- | ----- | ----- | --- | ---- |
| Val A | Val B | 123 | ... | 01/01/2021 |
| Val C | Val B | 124 | ... | 02/01/2021 |
And returning a table with the columns changed, their initial value, and their new value:
| Initial | New | ColName|
| ----- | ----- | ----- |
| Val A | Val C | Col A |
| 123 | 124 | Col C |
| 01/01/2021 | 02/01/2021 | Col Z |
Also note the column data types also vary, so will need to be converted to varchar
DB2 version is 11.1
EDIT: Also for reference as per comment request, this is code I attempted to use to achieve this goal:
WITH
INIT AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MIN(SOMEDATE) FROM TABLE),
LATE AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MAX(SOMEDATE) FROM TABLE),
COLS AS (SELECT COLNAME FROM SYSCAT.COLUMNS WHERE TABNAME='TABLE' ORDER BY COLNO)
SELECT * FROM (
SELECT
COLNAME AS ATTRIBUTE,
(SELECT COLNAME AS INITIAL FROM INIT),
(SELECT COLNAME AS NEW FROM LATE)
FROM
COLS
WHERE
(INITIAL != NEW) OR (INITIAL IS NULL AND NEW IS NOT NULL) OR (INITIAL IS NOT NULL AND NEW IS NULL));
Only issue with this one is that I couldn't figure how to use the values from the COLS table as the columns to be selected
If you don't want to type the needed expressions manually, you can easily generate their text.
Suppose you want to print only the column values that differ between 2 rows of the same, quite wide, table SYSCAT.TABLES. The following query generates the expressions:
SELECT
'DECODE(I.I, '
|| LISTAGG(COLNO || ', A.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS INITIAL' AS EXPR_INITIAL
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', B.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS NEW' AS EXPR_NEW
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', ''' || COLNAME || '''', ', ')
|| ') AS COLNAME' AS EXPR_COLNAME
FROM SYSCAT.COLUMNS C
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB';
It doesn't matter how many columns the table contains. We just filter out the columns of *LOB types as an example. If you want them as well, you should change the ::VARCHAR(128) casting to some ::CLOB(XXX).
Put these 3 generated expressions into the corresponding places in the query below:
WITH MYTAB AS
(
-- We enumerate the rows to reference them later
SELECT ROWNUMBER() OVER () RN_, T.*
FROM SYSCAT.TABLES T
WHERE TABSCHEMA = 'SYSCAT'
FETCH FIRST 2 ROWS ONLY
)
SELECT *
FROM
(
SELECT
-- Place here the result got in the EXPR_INITIAL column
-- , Place here the result got in the EXPR_NEW column
-- , Place here the result got in the EXPR_COLNAME column
FROM MYTAB A, MYTAB B
,
(
SELECT COLNO AS I
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB'
) I
WHERE A.RN_ = 1 AND B.RN_ = 2
)
WHERE INITIAL IS DISTINCT FROM NEW;
The result I got in my database:
|INITIAL |NEW |COLNAME |
|--------------------------|--------------------------|---------------|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|ALTER_TIME |
|26 |15 |COLCOUNT |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|CREATE_TIME |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|INVALIDATE_TIME|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|LAST_REGEN_TIME|
|ATTRIBUTES |AUDITPOLICIES |TABNAME |
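To make the template concrete, here is a sketch of what the final query might look like once the three generated expressions are pasted in. It assumes a hypothetical table T(COL_A VARCHAR(20), COL_B VARCHAR(20), COL_C INTEGER) holding exactly two rows; the table and column names are illustrative only. COLNO in SYSCAT.COLUMNS starts at 0, so the DECODE keys are 0, 1, 2, written here as a literal VALUES list instead of a catalog lookup.

```sql
-- Hypothetical table T with columns COL_A, COL_B (character) and COL_C (integer).
WITH MYTAB AS
(
  -- Enumerate the two rows so they can be referenced as A and B below
  SELECT ROWNUMBER() OVER () RN_, T.*
  FROM T
)
SELECT *
FROM
(
  SELECT
    -- Generated EXPR_INITIAL: pick column number I from row A
    DECODE(I.I, 0, A.COL_A, 1, A.COL_B, 2, A.COL_C::VARCHAR(128)) AS INITIAL
    -- Generated EXPR_NEW: the same column from row B
  , DECODE(I.I, 0, B.COL_A, 1, B.COL_B, 2, B.COL_C::VARCHAR(128)) AS NEW
    -- Generated EXPR_COLNAME: the column's name as a literal
  , DECODE(I.I, 0, 'COL_A', 1, 'COL_B', 2, 'COL_C') AS COLNAME
  FROM MYTAB A, MYTAB B
  , (VALUES 0, 1, 2) I (I)
  WHERE A.RN_ = 1 AND B.RN_ = 2
)
WHERE INITIAL IS DISTINCT FROM NEW;
```

Each value of I.I produces one output row, which is what turns the two wide rows into one row per column.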

How to return a comma separated values of column without having to loop through the result set

Let's say I have these 2 tables:
+----+---------+ +----+-----------+----------------+
| Id | Country | | Id | CountryId | City |
+----+---------+ +----+-----------+----------------+
| 1 | USA | | 1 | 1 | Washington, DC |
+----+---------+ +----+-----------+----------------+
| 2 | Canada | | 2 | 2 | Ottawa |
+----+---------+ +----+-----------+----------------+
| 3 | 1 | New York |
+----+-----------+----------------+
| 4 | 1 | Baltimore |
+----+-----------+----------------+
I need to produce a result like:
Id | Country | Cities
---+---------+--------------------------------------
1 | USA | Washington, DC, New York, Baltimore
---+------------------------------------------------
2 | Canada | Ottawa
So far, I am looping through the left side table result like this:
DECLARE @table TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    City VARCHAR(50)
)
DECLARE @tableString TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    Cities VARCHAR(100)
)
INSERT INTO @table
SELECT Id, City
FROM tblCountries
DECLARE @city VARCHAR(50)
DECLARE @id INT
DECLARE @count INT
DECLARE @i INT = 1
SELECT @count = COUNT(*) FROM @table
WHILE (@i <= @count)
BEGIN
    SELECT @id = Id, @city = City FROM @table WHERE Id = @i
    IF(EXISTS(SELECT * FROM @tableString WHERE CountryId = @id))
    BEGIN
        UPDATE @tableString SET Cities = Cities + ', ' + @city WHERE Id = @id
    END
    ELSE
    BEGIN
        INSERT INTO @tableString (CountryId, city) VALUES (@id, @city)
    END
    SET @i = @i + 1
END
SELECT tc.Id, tc.Country, ts.Cities
FROM tblCountries tc
LEFT JOIN @tableString ts
    ON tc.Id = ts.CountryId
My concern is that all this looping in T-SQL may be a performance killer. Even with fewer rows, it appears to be slow. Is there a better way to concatenate those strings without having to loop through the data set as if I were working in C#?
Thanks for helping
This was answered many times, but I've got the feeling, that some explanation might help you...
... am I missing something? It seems like this is related to XML
The needed functionality STRING_AGG() was introduced with SQL-Server 2017. The other direction STRING_SPLIT() came with v2016.
But many people still use older versions (and will do this for years), so we need workarounds. There were approaches with loops, bad and slow... And you might use recursive CTEs. And - that's the point here! - we can use some abilities of XML to solve this.
Try this out:
DECLARE @xml XML=
N'<root>
<element>text1</element>
<element>text2</element>
<element>text3</element>
</root>';
--The query will return the first <element> below <root> and return text1.
SELECT @xml.value(N'(/root/element)[1]','nvarchar(max)');
--But now try this:
SELECT @xml.value(N'(/root)[1]','nvarchar(max)');
The result is text1text2text3.
The reason for this: If you call .value() on an element without a detailed specification of what you want to read, you'll get the whole element back. Find details here.
Now imagine an XML like this
DECLARE @xml2 XML=
N'<root>
<element>, text1</element>
<element>, text2</element>
<element>, text3</element>
</root>';
With the same query as above you'd get , text1, text2, text3. The only thing left is to cut off the leading comma and the space. This is done - in most examples - with STUFF().
So the challenge is to create this XML. And this is what you find in the linked examples.
A general example is this: Read all tables and list their columns as a CSV-list:
SELECT TOP 10
TABLE_NAME
,STUFF(
(SELECT ',' + c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME
ORDER BY c.COLUMN_NAME
FOR XML PATH('')
),1,1,'') AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
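Applied to the countries and cities from the question, both approaches mentioned above might look like the sketch below. The table variables @countries and @cities are illustrative stand-ins (the question does not name its cities table), loaded with the sample rows from the question. The first query needs SQL Server 2017 or later; the second is the FOR XML PATH('') workaround for older versions.

```sql
-- Illustrative stand-ins for the question's two tables.
DECLARE @countries TABLE (Id INT, Country VARCHAR(50));
DECLARE @cities TABLE (Id INT, CountryId INT, City VARCHAR(50));

INSERT INTO @countries VALUES (1, 'USA'), (2, 'Canada');
INSERT INTO @cities
VALUES (1, 1, 'Washington, DC'), (2, 2, 'Ottawa')
     , (3, 1, 'New York'), (4, 1, 'Baltimore');

-- SQL Server 2017+: STRING_AGG concatenates per group directly.
SELECT c.Id, c.Country
     , STRING_AGG(ci.City, ', ') WITHIN GROUP (ORDER BY ci.Id) AS Cities
FROM @countries c
LEFT JOIN @cities ci ON ci.CountryId = c.Id
GROUP BY c.Id, c.Country;

-- Older versions: build ', city' fragments as XML text, then strip the leading ', '.
SELECT c.Id, c.Country
     , STUFF((SELECT ', ' + ci.City
              FROM @cities ci
              WHERE ci.CountryId = c.Id
              ORDER BY ci.Id
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 2, '') AS Cities
FROM @countries c;
```

Neither query loops; the concatenation happens inside a single set-based statement.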

Table Valued Function [XML Reader] Very Slow - Alternatives?

I have the following query that really kills performance and want to know what alternatives there are to an XML reader subquery. The purpose of this query is to export data with some HTML code.
An example of the table data is as follows.
p_s_id | p_c_id | notes
-----------------------
1 | 1 | this note is really long.
2 | 1 | This is fun.
3 | null | long note here
4 | 2 | this is not fun
5 | 2 | this is not fun
6 | 3 | long note here
I want to take all distinct notes that have the same p_c_id and join them together as shown below.
Any additional information can be provided so feel free to comment.
select distinct
p_c_id
,'<br/><br/>'+(select distinct '• ' +cast(note as nvarchar(max)) + ' <br/> '
from dbo.spec_notes_join m2
where m.p_c_id = m2.p_c_id
and isnull(note,'') <> ''
for xml path(''), type).value('.[1]', 'nvarchar(max)') as notes_spec
from dbo.spec_notes_join m
so the export would look as follows:
p_c_id | notes
--------------
1 | <br/><br/> • this note is really long. <br/> • This is fun <br/>
2 | <br/><br/> • This is not fun. <br/>
3 | <br/><br/> • long note here. <br/>
I think you will get slightly better performance if you skip the distinct in the outer query and do a group by p_c_id instead.
select p_c_id,
'<br/><br/>'+(select distinct '• ' +cast(note as nvarchar(max)) + ' <br/> '
from dbo.spec_notes_join m2
where m.p_c_id = m2.p_c_id and
isnull(note,'') <> ''
for xml path(''), type).value('.', 'nvarchar(max)') as notes_spec
from dbo.spec_notes_join m
group by p_c_id
You could also try concatenating with a CLR User-Defined Aggregate Function.
Other alternatives can be found here Concatenating Row Values in Transact-SQL.
While this alternative skips the XML, I don't know if it improves performance; if you could test and post results as a comment, I'd appreciate it. (It worked on my quick mock-up; you may need to do some minor debugging on your own structures.)
Start with this function:
CREATE FUNCTION dbo.Testing
(
    @p_c_id int
)
RETURNS varchar(max)
AS
BEGIN
    DECLARE @ReturnString varchar(max)
    -- Append each distinct note; the first note gets the leading '<br/><br/>• ' prefix
    SELECT @ReturnString = isnull(@ReturnString + ' <br/> , <br/><br/>• ', '<br/><br/>• ') + note
    from (select distinct note
        from spec_notes_join
        where p_c_id = @p_c_id
        and isnull(note, '') <> '') xx
    SET @ReturnString = @ReturnString + ' <br/> '
    RETURN @ReturnString
END
GO
and then embed it in your query:
SELECT p_c_id, dbo.Testing(p_c_id)
from (select distinct p_c_id
from dbo.spec_notes_join) xx
This may perform poorly because the function has to be called once for each row. A possibly quicker variant would be to write the function as a table-valued function, and reference it by a CROSS APPLY in the join clause.
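A rough sketch of that table-valued variant follows. An inline table-valued function must be a single SELECT, so the variable-concatenation trick above cannot be reused; this version assumes SQL Server 2017 or later and uses STRING_AGG instead. The function name dbo.TestingTvf is hypothetical, and the sketch has not been benchmarked against the XML version.

```sql
-- Hypothetical inline table-valued function; requires SQL Server 2017+ for STRING_AGG.
CREATE FUNCTION dbo.TestingTvf
(
    @p_c_id int
)
RETURNS TABLE
AS
RETURN
(
    -- Build one '• note <br/> ' fragment per distinct note and concatenate them
    SELECT N'<br/><br/>' + STRING_AGG(N'• ' + xx.note + N' <br/> ', N'') AS notes_spec
    FROM (SELECT DISTINCT CAST(note AS nvarchar(max)) AS note
          FROM dbo.spec_notes_join
          WHERE p_c_id = @p_c_id
            AND ISNULL(note, '') <> '') xx
);
GO

-- CROSS APPLY invokes the function once per distinct p_c_id
SELECT xx.p_c_id, f.notes_spec
FROM (SELECT DISTINCT p_c_id
      FROM dbo.spec_notes_join) xx
CROSS APPLY dbo.TestingTvf(xx.p_c_id) f;
```

Because the function is inline, the optimizer can expand it into the calling query rather than executing it as a black box per row, which is the usual reason this form is suggested.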

Converting Access Pivot Table to SQL Server

I'm having trouble converting an MS Access pivot table over to SQL Server and was hoping someone might help.
TRANSFORM First(contacts.value) AS FirstOfvalue
SELECT contacts.contactid
FROM contacts RIGHT JOIN contactrecord ON contacts.[detailid] = contactrecord.[detailid]
GROUP BY contacts.contactid
PIVOT contactrecord.wellknownname
;
Edit: Responding to some of the comments
Contacts table has three fields
contactid | detailid | value |
1 1 Scott
contactrecord has something like
detailid | wellknownname
1 | FirstName
2 | Address1
3 | foobar
contactrecord is dynamic in that the user can at any time create a field to be added to contacts
the access query pulls out
contactid | FirstName | Address1 | foobar
1 | Scott | null | null
which is the pivot on the wellknownname. The key here is that the number of columns is dynamic since the user can, at anytime, create another field for the contact. Being new to pivot tables altogether, I'm wondering how I can recreate this access query in sql server.
As for transform... that's a built in access function. More information is found about it here. First() will just take the first result on that matching row.
I hope this helps and appreciate all the help.
A quick search for dynamic pivot tables comes up with this article.
After renaming things in his last query on the page I came up with this:
DECLARE @PivotColumnHeaders VARCHAR(max);
SELECT @PivotColumnHeaders = COALESCE(@PivotColumnHeaders + ',['+ CAST(wellknownname as varchar) + ']','['+ CAST(wellknownname as varchar) + ']')
FROM contactrecord;
DECLARE @PivotTableSQL NVARCHAR(max);
SET @PivotTableSQL = N'
SELECT *
FROM (
    SELECT
        c.contactid,
        cr.wellknownname,
        c.value
    FROM contacts c
    RIGHT JOIN contactrecord cr
        on c.detailid = cr.detailid
) as pivotData
pivot(
    min(value)
    for wellknownname in (' + @PivotColumnHeaders +')
) as pivotTable
'
;
execute(@PivotTableSQL);
which, despite its ugliness, does the job.
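A slightly hardened sketch of the same idea is shown below. It swaps the hand-built '[' + ... + ']' brackets for QUOTENAME, which also escapes any ']' inside a field name, and it avoids CAST(... AS varchar), which defaults to 30 characters and could truncate longer names. It assumes the wellknownname values are unique and at most 128 characters long; the variable names @cols and @sql are illustrative.

```sql
-- Build the bracketed column list; QUOTENAME handles escaping of ']' in names.
DECLARE @cols NVARCHAR(MAX);
SELECT @cols = COALESCE(@cols + N',', N'') + QUOTENAME(wellknownname)
FROM contactrecord;

-- Assemble the dynamic PIVOT statement around the generated column list.
DECLARE @sql NVARCHAR(MAX) = N'
SELECT *
FROM (
    SELECT c.contactid, cr.wellknownname, c.value
    FROM contacts c
    RIGHT JOIN contactrecord cr
        ON c.detailid = cr.detailid
) AS pivotData
PIVOT (
    MIN(value)
    FOR wellknownname IN (' + @cols + N')
) AS pivotTable;';

EXEC sys.sp_executesql @sql;
```

MIN(value) stands in for Access's First(); with one value per contact and field, as in the question's data, the two return the same thing.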