I have an XML column in a table.
The table has two columns:
ID
DepartmentXML
The DepartmentXML typically looks like this:
<Root>
<Department>
<dID>100</dID>
<DName>Engineering</DName>
</Department>
<Employee>
<EmployeeID>999</EmployeeID>
<EName>AAA BBB</EName>
</Employee>
<Employee>
<EmployeeID>888</EmployeeID>
<EName>XXX YYY</EName>
</Employee>
</Root>
How can I query this XML to get a result like this?
+-----+----------------+------------+--------------+
| dID | DepartmentName | EmployeeID | EmployeeName |
+-----+----------------+------------+--------------+
| 100 | Engineering    | 999        | AAA BBB      |
| 100 | Engineering    | 888        | XXX YYY      |
+-----+----------------+------------+--------------+
I know CROSS APPLY may have to be used, but the syntax for this particular scenario is very difficult for me to understand.
Thank you.
Try it like this:
First a mockup table to simulate your issue:
DECLARE @tbl TABLE(ID INT IDENTITY, DepartmentXml XML);
INSERT INTO @tbl VALUES
(N'<Root>
<Department>
<dID>100</dID>
<DName>Engineering</DName>
</Department>
<Employee>
<EmployeeID>999</EmployeeID>
<EName>AAA BBB</EName>
</Employee>
<Employee>
<EmployeeID>888</EmployeeID>
<EName>XXX YYY</EName>
</Employee>
</Root>');
--The query
SELECT t.ID
,t.DepartmentXml.value('(/Root/Department/dID/text())[1]','int') AS DepartmentId
,t.DepartmentXml.value('(/Root/Department/DName/text())[1]','nvarchar(max)') AS DepartmentName
,A.e.value('(EmployeeID/text())[1]','int') AS EmployeeId
,A.e.value('(EName/text())[1]','nvarchar(max)') AS EmployeeName
FROM @tbl t
OUTER APPLY t.DepartmentXml.nodes('/Root/Employee') A(e);
The idea in short:
We can pick the row's ID directly
We can read the department's information directly from the XML (non-repeating)
We can retrieve repeating nodes using APPLY with .nodes().
We can use a relative XPath against A.e to get the employee's data
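The choice between OUTER APPLY and CROSS APPLY only matters when a document has no Employee elements. Here is a minimal sketch of the difference, assuming a hypothetical second row (department 200, "Sales") that is not part of the original data:

```sql
-- Hypothetical mockup: the second row has a department but no employees.
DECLARE @demo TABLE(ID INT IDENTITY, DepartmentXml XML);
INSERT INTO @demo VALUES
 (N'<Root><Department><dID>100</dID><DName>Engineering</DName></Department>
    <Employee><EmployeeID>999</EmployeeID><EName>AAA BBB</EName></Employee></Root>')
,(N'<Root><Department><dID>200</dID><DName>Sales</DName></Department></Root>');

-- OUTER APPLY keeps the employee-less row; its EmployeeId is NULL.
SELECT d.ID
      ,A.e.value('(EmployeeID/text())[1]','int') AS EmployeeId
FROM @demo d
OUTER APPLY d.DepartmentXml.nodes('/Root/Employee') A(e);

-- CROSS APPLY drops every row for which .nodes() returns nothing.
SELECT d.ID
      ,A.e.value('(EmployeeID/text())[1]','int') AS EmployeeId
FROM @demo d
CROSS APPLY d.DepartmentXml.nodes('/Root/Employee') A(e);
```

If every department is guaranteed to have at least one employee, both forms should return the same rows.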
I apologize if this is a simple question - I had some trouble even formatting the question when I was trying to Google for help!
In one of the tables I am working with, there's a row whose data looks like this:
Invoice ID | Status    | Product List
-----------+-----------+-------------
1234       | Processed | [{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]
I want to count how many products each invoice contains. What is the proper syntax for counting the values in the "Product List" column? I'm aware that count() is wrong, and that I probably need to extract the data from the string value.
select invoice_id, count(Product_list)
from quote_table
where status = 'processed'
group by invoice_id
You can use the JSON function json_array_length and cast the column to the json data type (this works as long as the column holds valid JSON), for example:
select invoice_id, json_array_length(Product_list::json) as count
from quote_table
where status = 'processed';
invoice_id | count
------------+-------
1234 | 4
(1 row)
If you need to count a specific property of the json column, you can use the query below.
This query solves the problem by using create type, json_populate_recordset and subquery to count product_id inside json data.
drop type if exists count_product;
create type count_product as (product_id int);
select
t.invoice_id,
t.status,
(
select count(*) from json_populate_recordset(
null::count_product,
t.Product_list
)
where product_id is not null
) as count_produto_id
from (
-- symbolic data to use in query
select
1234 as invoice_id,
'processed' as status,
'[{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]'::json as Product_list
) as t
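If creating a type is not an option, a similar count can be sketched with json_array_elements and the -> operator. This is only an illustration under the same symbolic data; the alias product_id_count is made up for the example:

```sql
-- Sketch: count only the array elements that carry a "product_id" key.
SELECT t.invoice_id,
       (
         SELECT count(*)
         FROM json_array_elements(t.product_list) AS elem(value)
         -- -> yields SQL NULL when the key is absent from the object
         WHERE (elem.value -> 'product_id') IS NOT NULL
       ) AS product_id_count
FROM (
  -- same symbolic data as in the query above
  SELECT 1234 AS invoice_id,
         '[{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]'::json AS product_list
) AS t;
```

For the sample row this should count 3, because the last element uses the key "pid" instead of "product_id".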
Assume Table A with an XML field. Table A has two rows, with the following XML data in the table.
Row 1
<fullname>
<firstName>John</firstName>
<lastName>Smith</lastName>
</fullname>
and Row 2
<fullname>
<firstName>Jane</firstName>
</fullname>
This query:
SELECT * FROM A,
XMLTABLE(('/fullname'::text) PASSING (A.xml)
COLUMNS firstName text PATH ('firstName'::text), lastName text PATH ('lastName'::text)) x
will only return data on John and not Jane. Is there a work around for this?
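One workaround worth trying is the DEFAULT clause that XMLTABLE column definitions accept. Below is a rough sketch, assuming inline sample rows in place of table A and the placeholder value 'X' as the fallback; the names xml_data, first_name and last_name are chosen for the example only:

```sql
-- Sketch: supply a fallback for a column whose PATH matches nothing.
SELECT x.*
FROM (VALUES
        ('<fullname><firstName>John</firstName><lastName>Smith</lastName></fullname>'::xml),
        ('<fullname><firstName>Jane</firstName></fullname>'::xml)
     ) AS a(xml_data),
     XMLTABLE('/fullname' PASSING a.xml_data
              COLUMNS first_name text PATH 'firstName',
                      last_name  text PATH 'lastName' DEFAULT 'X') AS x;
```

With the default in place, the row for Jane should come back with 'X' in last_name.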
https://www.postgresql.org/docs/13/functions-xml.html
Adding a DEFAULT clause to the column definition (DEFAULT X, where X is the fallback value) resolved the issue.
It's all in the title. Documentation has something like this:
SELECT *
FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 text);
I have two tables, lab_tests and lab_tests_results. All of the lab_tests_results rows are tied to the primary key id integer in the lab_tests table. I'm trying to make a pivot table where the lab tests (identified by an integer) are row headers and the respective results are in the table. I can't get around a syntax error at or around the integer.
Is this possible with the current set up? Am I missing something in the documentation? Or do I need to perform an inner join of sorts to make the categories strings? Or modify the lab_tests_results table to use a text identifier for the lab tests?
Thanks for the help, all. Much appreciated.
Edit: Got it figured out with the help of Dmitry. He had the data layout figured out, but I was unclear on what kind of output I needed. I was trying to get the pivot table to be based on batch_id numbers in the lab_tests_results table. Had to hammer out the base query and casting data types.
SELECT *
FROM crosstab('SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id=lab_tests_results.lab_test AND (lab_tests.test_name LIKE ''Test Name 1'' OR lab_tests.test_name LIKE ''Test Name 2'')
ORDER BY 1,2'
) AS final_result(batch_id VARCHAR, test_name_1 FLOAT, test_name_2 FLOAT);
This provides a pivot table from the lab_tests_results table like below:
batch_id |test_name_1 |test_name_2
---------------------------------------
batch1 | result1 | <null>
batch2 | result2 | result3
If I understand correctly your tables look something like this:
CREATE TABLE lab_tests (
id INTEGER PRIMARY KEY,
name VARCHAR(500)
);
CREATE TABLE lab_tests_results (
id INTEGER PRIMARY KEY,
lab_tests_id INTEGER REFERENCES lab_tests (id),
result TEXT
);
And your data looks something like this:
INSERT INTO lab_tests (id, name)
VALUES (1, 'test1'),
(2, 'test2');
INSERT INTO lab_tests_results (id, lab_tests_id, result)
VALUES (1,1,'result1'),
(2,1,'result2'),
(3,2,'result3'),
(4,2,'result4'),
(5,2,'result5');
First of all crosstab is part of tablefunc, you need to enable it:
CREATE EXTENSION tablefunc;
You need to run it once per database, as per this answer.
The final query will look like this:
SELECT *
FROM crosstab(
'SELECT lt.name::TEXT, lt.id, ltr.result
FROM lab_tests AS lt
JOIN lab_tests_results ltr ON ltr.lab_tests_id = lt.id
ORDER BY 1, ltr.id'
) AS ct(test_name text, result_1 text, result_2 text, result_3 text);
Explanation:
The crosstab() function takes the text of a query that should return three columns: (1) the name of a group (the row name), (2) a category column, and (3) the value. The wrapping query selects everything crosstab() returns and defines the output columns in the list after AS: first the row name (test_name), then the values (result_1, result_2, result_3). This query returns up to three results per test. If a test has more than three results, the extra ones are not shown; if it has fewer than three, the remaining columns are null.
The result for this query is:
test_name |result_1 |result_2 |result_3
---------------------------------------
test1 |result1 |result2 |<null>
test2 |result3 |result4 |result5
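One caveat with the single-argument form of crosstab(): it fills the value columns from left to right, so a group that lacks a value for the first category gets its other values shifted into the wrong column. The two-argument form takes an explicit category list and avoids that. A sketch, reusing the table and column names from the asker's edited query (adjust them to your schema):

```sql
-- Sketch: crosstab(source_sql, category_sql) pins each test name to a fixed column.
SELECT *
FROM crosstab(
    $$SELECT r.batch_id, t.test_name, r.test_result::float
      FROM lab_tests_results AS r
      JOIN lab_tests AS t ON t.id = r.lab_test
      WHERE t.test_name IN ('Test Name 1', 'Test Name 2')
      ORDER BY 1$$,
    -- one row per output value column, in column order
    $$VALUES ('Test Name 1'), ('Test Name 2')$$
) AS final_result(batch_id varchar, test_name_1 float, test_name_2 float);
```

A batch that only has a result for 'Test Name 2' should then show null in test_name_1 rather than having its value placed there.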
Here's my scenario:
--ORDER table
OrderID OrderCode DateShipped ShipmentXML
1 ABC 08/06/2013 <Order><Item CustomerName="BF" City="Philadelphia" State="PA"></Item></Order>
2 XYZ 08/05/2013 <Order><Item CustomerName="TJ" City="Richmond" State="VA"></Item></Order>
At some point in the process, I will know the respective TrackingNumber for these Orders. The tracking numbers are available in another table like this:
--TRACKING table
TrackingID OrderCode TrackingNumber
98 ABC 1Z1
99 XYZ 1Z2
The output I'm expecting is as below:
OrderID OrderCode ShipmentXML
1 ABC <Order><Item CustomerName="BF" City="Philadelphia" State="PA" DateShipped="08/06/2013" TrackingNumber="1Z1"></Item></Order>
2 XYZ <Order><Item CustomerName="TJ" City="Richmond" State="VA" DateShipped="08/05/2013" TrackingNumber="1Z2"></Item></Order>
As you can see, I'm trying to get the TrackingNumber and the DateShipped for each OrderCode and have them as an attribute. The intent is a SELECT, not UPDATE.
All the examples I've seen demonstrate how to update the XML with a Constant value or a variable. I couldn't find one that demonstrates XML updates with a JOIN. Please help with how this can be accomplished.
UPDATE:
By 'SELECT, not UPDATE' I meant that the permanent table must not be updated; UPDATEs on temp tables are perfectly fine, as Mikael commented below the first answer.
A version using a temp table to add the attributes to the XML.
select OrderID,
OrderCode,
DateShipped,
ShipmentXML
into #Order
from [Order]
update #Order
set ShipmentXML.modify
('insert attribute DateShipped {sql:column("DateShipped")}
into (/Order/Item)[1]')
update O
set ShipmentXML.modify
('insert attribute TrackingNumber {sql:column("T.TrackingNumber")}
into (/Order/Item)[1]')
from #Order as O
inner join Tracking as T
on O.OrderCode = T.OrderCode
select OrderID,
OrderCode,
ShipmentXML
from #Order
drop table #Order
The previous answer is good, but you have to explicitly list the attributes and cast them to varchar, which is hard to maintain (if you add attributes to ShipmentXML, you'll have to modify the query).
Instead, you could use XQuery:
select
O.OrderID, O.OrderCode,
(
select
(select O.DateShipped, T.TrackingNumber for xml raw('Item'), type),
O.ShipmentXML.query('Order/*')
for xml path(''), type
).query('<Order><Item>{for $i in Item/@* return $i}</Item></Order>')
from [ORDER] as O
left outer join [TRACKING] as T on T.OrderCode = O.OrderCode
or even like this:
select
O.OrderID, O.OrderCode,
O.ShipmentXML.query('
element Order {
element Item {
attribute DateShipped {sql:column("O.DateShipped")},
attribute TrackingNumber {sql:column("T.TrackingNumber")},
for $i in Order/Item/@* return $i
}
}')
from [ORDER] as O
left outer join [TRACKING] as T on T.OrderCode = O.OrderCode
see sqlfiddle with examples
The only way I know to partially modify data in an xml column is the modify() method, but as the documentation states:
The modify() method of the xml data type can only be used in the SET clause of an UPDATE statement.
Since an UPDATE is not desired, the workaround I see is to shred the XML and reassemble it manually:
select
o.OrderID,
o.OrderCode,
(
cast((select
t.c.value('@CustomerName', 'varchar(50)') as '@CustomerName',
t.c.value('@City', 'varchar(50)') as '@City',
t.c.value('@State', 'varchar(50)') as '@State',
o.DateShipped as '@DateShipped',
tr.TrackingNumber as '@TrackingNumber'
for xml path('Item'), root('Order')) as xml)
) as ShipmentXML
from
[ORDER] o
join [TRACKING] tr on tr.OrderCode = o.OrderCode
cross apply o.ShipmentXML.nodes('Order/Item') t(c)
You may have to apply formatting to o.DateShipped.
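For the mm/dd/yyyy format shown in the expected output, CONVERT with style 101 is one option. A small sketch, assuming DateShipped is a date or datetime column (a variable stands in for it here):

```sql
-- Assumption: DateShipped is stored as date/datetime, not as text.
DECLARE @DateShipped date = '20130806';

-- Style 101 formats as mm/dd/yyyy, giving 08/06/2013 for this value.
SELECT CONVERT(varchar(10), @DateShipped, 101) AS DateShipped;
```

In the query above, the same expression would replace o.DateShipped in the select list.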
I'm trying to generate some XML with various levels of nesting, and at the risk of over-simplifying, the output XML will be loosely of the format:
<invoice number="1">
<charge code="foo" rate="123.00">
<surcharge amount="10%" />
</charge>
<charge code="bar" />
</invoice>
The database schema I have inherited happens to store charges in different tables, which means that surcharges are stored differently depending on which table the charge came from.
Given that you cannot use UNIONs with FOR XML, I've done some UNIONing in a CTE, so something along the lines of:
WITH Charges ( [@code], [@rate], surcharge, InvoiceId ) AS (
SELECT code AS [@Code], amount AS [@rate], NULL as surcharge, InvoiceId
FROM item.charges
UNION ALL
SELECT
code AS [@Code],
amount AS [@rate],
(
SELECT amount AS [@amount]
FROM order.surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
),
InvoiceId
FROM order.charges oc
)
SELECT
Number AS [@number],
(
SELECT
[@code],
[@rate],
surcharge
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
)
FROM Invoices i
FOR XML PATH( 'invoice' ), TYPE
Now, that is incredibly close, giving (Note the nested <surcharge>):
<invoice number="1">
<charge code="foo" rate="123.00">
<surcharge>
<surcharge amount="10%" />
</surcharge>
</charge>
<charge code="bar" />
</invoice>
But I need to find a way of getting the end query to include the value of an XML column to be treated as the content of the element, rather than as a new element. Is this possible, or do I need to take a new approach?
Your column subquery returns multiple columns (code, rate, and an XML column). I would expect the query you posted to give the error:
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
However, that's easily fixed by moving the query to an outer apply. To remove the double surcharge element, you could move the XML column names as far to the bottom as possible, like:
;WITH Charges (code, rate, surcharge, InvoiceId) AS
(
SELECT code, amount, NULL, InvoiceId
FROM #charges
UNION ALL
SELECT code
, amount
, (
SELECT amount AS [@amount]
FROM #surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
)
, InvoiceId
FROM #charges oc
)
SELECT Number AS [@number]
, c.code as [charge/@code]
, c.rate as [charge/@rate]
, c.surcharge as [charge]
FROM #Invoices i
outer apply
(
SELECT code
, rate
, surcharge
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
) c
WHERE i.InvoiceID = 1
FOR XML PATH( 'invoice' ), TYPE
This would print, for example:
<invoice number="1">
<charge code="1" rate="1" />
</invoice>
<invoice number="1">
<charge code="1" rate="1">
<surcharge amount="1" />
</charge>
</invoice>
The first element comes from the top part of the union, where surcharge = null.
It appears that naming the column "*" makes FOR XML use the content of that column as the content of the element, so changing the SQL as below makes it work:
WITH Charges ( [@code], [@rate], surcharge, InvoiceId ) AS (
SELECT code AS [@Code], amount AS [@rate], NULL as surcharge, InvoiceId
FROM item.charges
UNION ALL
SELECT
code AS [@Code],
amount AS [@rate],
(
SELECT amount AS [@amount]
FROM order.surcharges os
WHERE oc.ChargeId = os.ChargeId
FOR XML PATH('surcharge'), TYPE
),
InvoiceId
FROM order.charges oc
)
SELECT
Number AS [@number],
(
SELECT
[@code],
[@rate],
surcharge AS [*] -- This embeds the previously generated XML directly as the element content.
FROM Charges
WHERE Charges.InvoiceId = i.InvoiceId
FOR XML PATH('charge'), TYPE
)
FROM Invoices i
FOR XML PATH( 'invoice' ), TYPE
I think you can do this by omitting the root node type in your "for xml path('surcharge')" statement. That is, use "for xml path('')" instead.