How can I combine CTEs with a FOR XML clause? - tsql

I'm trying to generate some XML with various levels of nesting, and at the risk of over-simplifying, the output XML will be loosely of the format:
<invoice number="1">
<charge code="foo" rate="123.00">
<surcharge amount="10%" />
</charge>
<charge code="bar" />
</invoice>
The database schema I have inherited stores charges in different tables, which means that surcharges are stored differently depending on which table the charge came from.
Given that you cannot use UNIONs with FOR XML, I've done some UNIONing in a CTE, so something along the lines of:
WITH Charges ( [@code], [@rate], surcharge, InvoiceId ) AS (
SELECT code AS [@Code], amount AS [@rate], NULL as surcharge, InvoiceId
FROM item.charges
UNION ALL
SELECT
code AS [@Code],
amount AS [@rate],
(
SELECT amount AS [@amount]
FROM order.surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
),
InvoiceId
FROM order.charges oc
)
SELECT
Number AS [@number],
(
SELECT
[@code],
[@rate],
surcharge
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
)
FROM Invoices i
FOR XML PATH( 'invoice' ), TYPE
Now, that is incredibly close, giving (Note the nested <surcharge>):
<invoice number="1">
<charge code="foo" rate="123.00">
<surcharge>
<surcharge amount="10%" />
</surcharge>
</charge>
<charge code="bar" />
</invoice>
But I need the final query to treat the value of an XML column as the content of the element, rather than as a new element. Is this possible, or do I need to take a new approach?

You have a column subquery which returns multiple columns (@code, @rate, and an XML column). I would expect the query you posted to give the error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
However, that's easily fixed by moving the query to an outer apply. To remove the double surcharge element, you could assign the XML column names as late as possible, in the outermost select, like:
;WITH Charges (code, rate, surcharge, InvoiceId) AS
(
SELECT code, amount, NULL, InvoiceId
FROM #charges
UNION ALL
SELECT code
, amount
, (
SELECT amount AS [@amount]
FROM #surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
)
, InvoiceId
FROM #charges oc
)
SELECT Number AS [@number]
, c.code as [charge/@code]
, c.rate as [charge/@rate]
, c.surcharge as [charge]
FROM #Invoices i
outer apply
(
SELECT code
, rate
, surcharge
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
) c
WHERE i.InvoiceID = 1
FOR XML PATH( 'invoice' ), TYPE
This would print, for example:
<invoice number="1">
<charge code="1" rate="1" />
</invoice>
<invoice number="1">
<charge code="1" rate="1">
<surcharge amount="1" />
</charge>
</invoice>
The first element comes from the top part of the union, where surcharge = null.

It appears that naming the (fake) column "*" makes the content of that column the content of the element, so changing the SQL as below makes it work:
WITH Charges ( [@code], [@rate], surcharge, InvoiceId ) AS (
SELECT code AS [@Code], amount AS [@rate], NULL as surcharge, InvoiceId
FROM item.charges
UNION ALL
SELECT
code AS [@Code],
amount AS [@rate],
(
SELECT amount AS [@amount]
FROM order.surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
),
InvoiceId
FROM order.charges oc
)
SELECT
Number AS [@number],
(
SELECT
[@code],
[@rate],
surcharge AS [*] -- This will embed the contents of the previously generated XML in here.
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
FOR XML PATH('charge'), TYPE -- needed so the subquery returns a single XML value
)
FROM Invoices i
FOR XML PATH( 'invoice' ), TYPE
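For anyone who wants to see the effect in isolation, here is a minimal sketch. The @surcharge variable is a hypothetical stand-in for the XML produced by the surcharge subquery; it is not part of the original schema.

```sql
-- Hypothetical stand-in for the XML returned by the surcharge subquery.
DECLARE @surcharge xml = N'<surcharge amount="10%" />';

-- Named column: the XML value is wrapped in an extra <surcharge> element.
SELECT 'foo' AS [@code], @surcharge AS surcharge
FOR XML PATH('charge'), TYPE;

-- Wildcard name: the XML value is inserted directly as the content of <charge>.
SELECT 'foo' AS [@code], @surcharge AS [*]
FOR XML PATH('charge'), TYPE;
```

The first select should reproduce the doubled <surcharge> nesting from the question, and the second should give a single <surcharge> directly inside <charge>.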

I think you can do this by omitting the root node type in your "for xml path('surcharge')" statement. That is, use "for xml path('')" instead.

Related

How to work with data values formatted [{}, {}, {}]

I apologize if this is a simple question - I had some trouble even formatting the question when I was trying to Google for help!
In one of the tables I am working with, there's a data value that looks like this:
Invoice ID | Status | Product List
1234 | Processed | [{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]
I want to count how many products each invoice contains. What is the proper syntax to count the values in the "Product List" column? I'm aware that count() is wrong, and that I may need to extract the data from the string value.
select invoice_id, count(Product_list)
from quote_table
where status = 'processed'
group by invoice_id
You can use the PostgreSQL JSON function json_array_length and cast the column to the JSON data type (provided the value is valid JSON), for example:
select invoice_id, json_array_length(Product_list::json) as count
from quote_table
where status = 'processed';
invoice_id | count
------------+-------
1234 | 4
(1 row)
If you need to count a specific property of the JSON column, you can use the query below.
It solves the problem by using create type, json_populate_recordset and a subquery to count the product_id entries inside the JSON data.
drop type if exists count_product;
create type count_product as (product_id int);
select
t.invoice_id,
t.status,
(
select count(*) from json_populate_recordset(
null::count_product,
t.Product_list
)
where product_id is not null
) as count_produto_id
from (
-- symbolic data to use in query
select
1234 as invoice_id,
'processed' as status,
'[{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]'::json as Product_list
) as t

SQL NOT LIKE comparison against dynamic list

Working on a new TSQL Stored Procedure, I want to get all rows where the values in a specific column don't start with any of a specific set of 2-character substrings.
The general idea is:
SELECT * FROM table WHERE value NOT LIKE 's1%' AND value NOT LIKE 's2%' AND value NOT LIKE 's3%'.
The catch is that I am trying to make it dynamic so that the specific substrings can be pulled from another table in the database, which can have more values added to it.
I have never used the IN operator before, but I think something along these lines should do what I am looking for. However, I don't think it is possible to use wildcards with IN, so I might not be able to compare just the substrings.
SELECT * FROM table WHERE value NOT IN (SELECT substrings FROM subTable)
To get around that limitation, I am trying to do something like this:
SELECT * FROM table WHERE SUBSTRING(value, 1, 2) NOT IN (SELECT Prefix FROM subTable WHERE Prefix IS NOT NULL)
but I'm not sure this is right, or if it is the most efficient way to do this. My preference is to do this in a Stored Procedure, but if that isn't feasible or efficient I'm also open to building the query dynamically in C#.
Here's an option: load the values you want to filter on into a table, then use a left outer join with PATINDEX() and keep only the rows that found no match.
DECLARE @FilterValues TABLE
(
[FilterValue] NVARCHAR(10)
);
--Table with values we want filter on.
INSERT INTO @FilterValues (
[FilterValue]
)
VALUES ( N's1' )
, ( N's2' )
, ( N's3' );
DECLARE @TestData TABLE
(
[TestValues] NVARCHAR(100)
);
--Load some test data
INSERT INTO @TestData (
[TestValues]
)
VALUES ( N's1 Test Data' )
, ( N's2 Test Data' )
, ( N's3 Test Data' )
, ( N'test data not filtered out' )
, ( N'test data not filtered out 1' );
SELECT a.*
FROM @TestData [a]
LEFT OUTER JOIN @FilterValues [b]
ON PATINDEX([b].[FilterValue] + '%', [a].[TestValues]) > 0
WHERE [b].[FilterValue] IS NULL;
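An alternative that may read closer to the original NOT LIKE idea is NOT EXISTS with LIKE. This is only a sketch: it is a different technique from the PATINDEX join above, and the table variables are hypothetical stand-ins for the real tables.

```sql
-- Hypothetical table variables mirroring the test data above.
DECLARE @FilterValues TABLE ([FilterValue] NVARCHAR(10));
INSERT INTO @FilterValues ([FilterValue])
VALUES (N's1'), (N's2'), (N's3');

DECLARE @TestData TABLE ([TestValues] NVARCHAR(100));
INSERT INTO @TestData ([TestValues])
VALUES (N's1 Test Data'), (N'test data not filtered out');

-- Keep only the rows for which no prefix matches the start of the value.
SELECT [a].*
FROM @TestData AS [a]
WHERE NOT EXISTS (
    SELECT 1
    FROM @FilterValues AS [b]
    WHERE [a].[TestValues] LIKE [b].[FilterValue] + N'%'
);
```

Unlike NOT IN, NOT EXISTS is not thrown off by a NULL prefix in the filter table. Note that a prefix containing %, _ or [ would be treated as a LIKE wildcard here.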

Select into a table with a CTE [duplicate]

I have a very complex CTE and I would like to insert the result into a physical table.
Is the following valid?
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos
(
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
WITH tab (
-- some query
)
SELECT * FROM tab
I am thinking of using a function to create this CTE, which would allow me to reuse it. Any thoughts?
You need to put the CTE first and then combine the INSERT INTO with your select statement. Also, the "AS" keyword following the CTE's name is not optional:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos (
BatchID,
AccountNo,
APartyNo,
SourceRowID
)
SELECT * FROM tab
Please note that the code assumes that the CTE will return exactly four fields and that those fields are matching in order and type with those specified in the INSERT statement.
If that is not the case, just replace the "SELECT *" with a specific select of the fields that you require.
As for your question on using a function, I would say "it depends". If you are putting the data in a table just for performance reasons, and the speed is acceptable when using it through a function, then I'd consider a function to be an option.
On the other hand, if you need to use the result of the CTE in several different queries, and speed is already an issue, I'd go for a table (either regular, or temp).
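To illustrate the explicit column list suggested above, here is a minimal sketch. The @Target table variable and the literal values are hypothetical; only the column names are borrowed from the question.

```sql
-- Hypothetical target table; column names borrowed from the question.
DECLARE @Target TABLE (BatchID int, AccountNo int, APartyNo int, SourceRowID int);

WITH tab AS (
    -- Stand-in for the complex query.
    SELECT 1 AS BatchID, 100 AS AccountNo, 200 AS APartyNo, 1 AS SourceRowID
)
INSERT INTO @Target (BatchID, AccountNo, APartyNo, SourceRowID)
SELECT BatchID, AccountNo, APartyNo, SourceRowID
FROM tab;

SELECT BatchID, AccountNo, APartyNo, SourceRowID
FROM @Target;
```

The statement before WITH has to end with a semicolon, which is why the DECLARE is terminated explicitly.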
WITH common_table_expression (Transact-SQL)
The WITH clause for Common Table Expressions goes at the top.
Wrapping every insert in a CTE has the benefit of visually segregating the query logic from the column mapping.
Spot the mistake:
WITH _INSERT_ AS (
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
)
INSERT Table2
([BatchID], [SourceRowID], [APartyNo])
SELECT [BatchID], [APartyNo], [SourceRowID]
FROM _INSERT_
Same mistake:
INSERT Table2 (
[BatchID]
,[SourceRowID]
,[APartyNo]
)
SELECT
[BatchID] = blah
,[APartyNo] = blahblah
,[SourceRowID] = blahblahblah
FROM Table1 AS t1
A few lines of boilerplate make it extremely easy to verify the code inserts the right number of columns in the right order, even with a very large number of columns. Your future self will thank you later.
Yep:
WITH tab AS (
bla bla
)
INSERT INTO dbo.prf_BatchItemAdditionalAPartyNos ( BatchID, AccountNo,
APartyNo,
SourceRowID)
SELECT * FROM tab
Note that this is for SQL Server, which supports multiple CTEs:
WITH x AS (), y AS () INSERT INTO z (a, b, c) SELECT a, b, c FROM y
Teradata allows only one CTE and the syntax is as your example.
Late to the party here, but for my purposes I wanted to be able to run code that the user had entered and store the result in a temp table. With Oracle there are no such issues: the INSERT goes at the start of the statement, before the WITH clause.
For this to work in SQL Server, the following worked:
INSERT into #stagetable execute (@InputSql)
(so the select statement in @InputSql can start with a WITH clause).
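A slightly fuller sketch of that approach, assuming a hypothetical one-column staging table and a made-up input query:

```sql
-- Hypothetical user-supplied query that begins with a WITH clause.
DECLARE @InputSql nvarchar(max) =
    N'WITH nums AS (SELECT 1 AS n UNION ALL SELECT 2) SELECT n FROM nums;';

-- The staging table's columns must match the result set of the query.
CREATE TABLE #stagetable (n int);

INSERT INTO #stagetable (n)
EXECUTE (@InputSql);

SELECT n FROM #stagetable;

DROP TABLE #stagetable;
```

Bear in mind that executing SQL entered by a user is an injection risk, so this only makes sense where the input is trusted.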

Add an attribute to the XML Column from another column in the same/another table

Here's my scenario:
--ORDER table
OrderID OrderCode DateShipped ShipmentXML
1 ABC 08/06/2013 <Order><Item CustomerName="BF" City="Philadelphia" State="PA"></Item></Order>
2 XYZ 08/05/2013 <Order><Item CustomerName="TJ" City="Richmond" State="VA"></Item></Order>
At some point in the process, I will know the respective TrackingNumber for these Orders. The tracking numbers are available in another table like this:
--TRACKING table
TrackingID OrderCode TrackingNumber
98 ABC 1Z1
99 XYZ 1Z2
The output I'm expecting is as below:
OrderID OrderCode ShipmentXML
1 ABC <Order><Item CustomerName="BF" City="Philadelphia" State="PA" DateShipped="08/06/2013" TrackingNumber="1Z1"></Item></Order>
2 XYZ <Order><Item CustomerName="TJ" City="Richmond" State="VA" DateShipped="08/05/2013" TrackingNumber="1Z2"></Item></Order>
As you can see, I'm trying to get the TrackingNumber and the DateShipped for each OrderCode and have them as an attribute. The intent is a SELECT, not UPDATE.
All the examples I've seen demonstrate how to update the XML with a Constant value or a variable. I couldn't find one that demonstrates XML updates with a JOIN. Please help with how this can be accomplished.
UPDATE:
By 'Select not Update', I meant no updates to the permanent table; an UPDATE on temp tables is perfectly fine, as Mikael commented below the first answer.
A version using a temp table to add the attributes to the XML.
select OrderID,
OrderCode,
DateShipped,
ShipmentXML
into #Order
from [Order]
update #Order
set ShipmentXML.modify
('insert attribute DateShipped {sql:column("DateShipped")}
into (/Order/Item)[1]')
update O
set ShipmentXML.modify
('insert attribute TrackingNumber {sql:column("T.TrackingNumber")}
into (/Order/Item)[1]')
from #Order as O
inner join Tracking as T
on O.OrderCode = T.OrderCode
select OrderID,
OrderCode,
ShipmentXML
from #Order
drop table #Order
The previous answer is good, but you have to explicitly specify the columns and cast them to varchar, and that's not good for future support (if you add attributes to ShipmentXML you'll have to modify the query).
Instead, you could use XQuery:
select
O.OrderID, O.OrderCode,
(
select
(select O.DateShipped, T.TrackingNumber for xml raw('Item'), type),
O.ShipmentXML.query('Order/*')
for xml path(''), type
).query('<Order><Item>{for $i in Item/@* return $i}</Item></Order>')
from [ORDER] as O
left outer join [TRACKING] as T on T.OrderCode = O.OrderCode
or even like this:
select
O.OrderID, O.OrderCode,
O.ShipmentXML.query('
element Order {
element Item {
attribute DateShipped {sql:column("O.DateShipped")},
attribute TrackingNumber {sql:column("T.TrackingNumber")},
for $i in Order/Item/@* return $i
}
}')
from [ORDER] as O
left outer join [TRACKING] as T on T.OrderCode = O.OrderCode
see sqlfiddle with examples
The only way I know of to partially modify data in columns of xml type is the modify() method, but as stated in the documentation:
The modify() method of the xml data type can only be used in the SET clause of an UPDATE statement.
Since an UPDATE is not desired, the workaround I see is to shred the XML and reassemble it manually:
select
o.OrderID,
o.OrderCode,
(
cast((select
t.c.value('@CustomerName', 'varchar(50)') as '@CustomerName',
t.c.value('@City', 'varchar(50)') as '@City',
t.c.value('@State', 'varchar(50)') as '@State',
o.DateShipped as '@DateShipped',
tr.TrackingNumber as '@TrackingNumber'
for xml path('Item'), root('Order')) as xml)
) as ShipmentXML
from
[ORDER] o
join [TRACKING] tr on tr.OrderCode = o.OrderCode
cross apply o.ShipmentXML.nodes('Order/Item') t(c)
You may have to apply formatting to o.DateShipped.
