I have a SOAP response and need to import almost all fields of the soapenv:Body into SQL Server.
After googling and testing, I built this query, which seems to work
but is not exactly what I was looking for:
declare
@Root varchar(50)='/soap:Envelope/soap:Body/GetFeedbackResponse/',
@xDoc XML = '<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Header>
<ebl:RequesterCredentials soapenv:mustUnderstand="0" xmlns:ns="urn:ebay:apis:eBLBaseComponents" xmlns:ebl="urn:ebay:apis:eBLBaseComponents">
<ebl:NotificationSignature xmlns:ebl="urn:ebay:apis:eBLBaseComponents">r6revSiTXCP9SBBFUtpDAQ==</ebl:NotificationSignature>
</ebl:RequesterCredentials>
</soapenv:Header>
<soapenv:Body>
<GetFeedbackResponse xmlns="urn:ebay:apis:eBLBaseComponents">
<Timestamp>2015-09-06T11:20:48.528Z</Timestamp>
<Ack>Success</Ack>
<CorrelationID>428163922470</CorrelationID>
<Version>899</Version>
<Build>E899_INTL_APIFEEDBACK_17278558_R1</Build>
<NotificationEventName>Feedback</NotificationEventName>
<RecipientUserID>ebay_bestseller</RecipientUserID>
<EIASToken>nY+sHZ2PrBmdj6wVnY+sEWDETj2dj6AFlIajDpaEpAydj6x9nY+seQ==</EIASToken>
<FeedbackDetailArray>
<FeedbackDetail>
<CommentingUser>ebay_bestseller</CommentingUser>
<CommentingUserScore>42425</CommentingUserScore>
<CommentText>Great buyer - We Would Appreciate 5 STARS for Our Feedback!</CommentText>
<CommentTime>2015-09-06T11:20:45.000Z</CommentTime>
<CommentType>Positive</CommentType>
<ItemID>310541589307</ItemID>
<Role>Buyer</Role>
<FeedbackID>1064451206013</FeedbackID>
<TransactionID>549674542021</TransactionID>
<OrderLineItemID>310541589307-549674542021</OrderLineItemID>
</FeedbackDetail>
</FeedbackDetailArray>
<FeedbackDetailItemTotal>1</FeedbackDetailItemTotal>
<FeedbackScore>126</FeedbackScore>
<PaginationResult>
<TotalNumberOfPages>1</TotalNumberOfPages>
<TotalNumberOfEntries>1</TotalNumberOfEntries>
</PaginationResult>
<EntriesPerPage>25</EntriesPerPage>
<PageNumber>1</PageNumber>
</GetFeedbackResponse>
</soapenv:Body>
</soapenv:Envelope>'
;with xmlnamespaces('http://schemas.xmlsoap.org/soap/envelope/' as [soap], default 'urn:ebay:apis:eBLBaseComponents')
insert into Test (TS,Comment)
select
@xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/Timestamp)[1]', 'nvarchar(max)'),
@xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/FeedbackDetailArray/FeedbackDetail/CommentText)[1]', 'nvarchar(max)'),
........
First question: I have some 50 different SOAP responses, and some have up to 120 fields, so I am looking for a solution that flattens all fields and imports them automatically into the corresponding table
without requiring me to list every field. I'm using SQL Server 2012/2014: is there such a solution, or, if not, is there a better/quicker way than what I proposed?
Second: (if there's no better solution)
Since I have a lot of fields to import, up to 120 for some SOAP responses, and since the root is always the same, I would like to shorten the expressions by putting the first part of the path in a variable, so as to have
for example
@xDoc.value('('+@Root+'Timestamp)[1]', 'nvarchar(max)')
instead of
@xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/Timestamp)[1]', 'nvarchar(max)')
but I receive an error saying that the first argument must be a string literal. Is there any workaround?
Third: according to the documentation, the query
@xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/Timestamp)[1]', 'nvarchar(max)')
must return a value, otherwise it raises an error,
but in some SOAP responses some fields are optional and therefore not always present.
I tried with
;with xmlnamespaces('http://schemas.xmlsoap.org/soap/envelope/' as [soap], default 'urn:ebay:apis:eBLBaseComponents')
insert into Test (TS,Comment)
select if (exists @xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/Timestamp)')) then @xDoc.value('(/soap:Envelope/soap:Body/GetFeedbackResponse/Timestamp)[1]', 'nvarchar(max)')
....
but it raises an error. What is the correct form of the query?
Thanks a Lot
Joe
Another way would be to get a name-value result set instead of columns and then you can work with that as you wish. If you want the result as columns you can for instance store the name-value result set in a temp table and do a dynamic pivot of the data.
declare
@Root varchar(50)='GetFeedbackResponse',
@xDoc XML = '<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Header>
<ebl:RequesterCredentials soapenv:mustUnderstand="0" xmlns:ns="urn:ebay:apis:eBLBaseComponents" xmlns:ebl="urn:ebay:apis:eBLBaseComponents">
<ebl:NotificationSignature xmlns:ebl="urn:ebay:apis:eBLBaseComponents">r6revSiTXCP9SBBFUtpDAQ==</ebl:NotificationSignature>
</ebl:RequesterCredentials>
</soapenv:Header>
<soapenv:Body>
<GetFeedbackResponse xmlns="urn:ebay:apis:eBLBaseComponents">
<Timestamp>2015-09-06T11:20:48.528Z</Timestamp>
<Ack>Success</Ack>
<CorrelationID>428163922470</CorrelationID>
<Version>899</Version>
<Build>E899_INTL_APIFEEDBACK_17278558_R1</Build>
<NotificationEventName>Feedback</NotificationEventName>
<RecipientUserID>ebay_bestseller</RecipientUserID>
<EIASToken>nY+sHZ2PrBmdj6wVnY+sEWDETj2dj6AFlIajDpaEpAydj6x9nY+seQ==</EIASToken>
<FeedbackDetailArray>
<FeedbackDetail>
<CommentingUser>ebay_bestseller</CommentingUser>
<CommentingUserScore>42425</CommentingUserScore>
<CommentText>Great buyer - We Would Appreciate 5 STARS for Our Feedback!</CommentText>
<CommentTime>2015-09-06T11:20:45.000Z</CommentTime>
<CommentType>Positive</CommentType>
<ItemID>310541589307</ItemID>
<Role>Buyer</Role>
<FeedbackID>1064451206013</FeedbackID>
<TransactionID>549674542021</TransactionID>
<OrderLineItemID>310541589307-549674542021</OrderLineItemID>
</FeedbackDetail>
</FeedbackDetailArray>
<FeedbackDetailItemTotal>1</FeedbackDetailItemTotal>
<FeedbackScore>126</FeedbackScore>
<PaginationResult>
<TotalNumberOfPages>1</TotalNumberOfPages>
<TotalNumberOfEntries>1</TotalNumberOfEntries>
</PaginationResult>
<EntriesPerPage>25</EntriesPerPage>
<PageNumber>1</PageNumber>
</GetFeedbackResponse>
</soapenv:Body>
</soapenv:Envelope>'
select T.X.value('local-name(.)', 'nvarchar(100)') as Name,
T.X.value('text()[1]', 'nvarchar(100)') as Value
from @xDoc.nodes('//*[local-name(.) = sql:variable("@Root")]//*') as T(X)
local-name() returns the current node name and text() returns the node value.
nodes() shreds the XML and returns one row for each matched node in the XML.
// does a deep search of the XML
* Is a wildcard to match any nodes.
[] is used for a predicate in a xQuery expression.
sql:variable() is a function that fetches the values of variables into the xQuery. You can not use sql:variable() to build the xQuery expression.
The expression in the nodes() function will return one row for each element below GetFeedbackResponse with a deep search.
Result:
Name Value
------------------------- -----------------------------------------------
Timestamp 2015-09-06T11:20:48.528Z
Ack Success
CorrelationID 428163922470
Version 899
Build E899_INTL_APIFEEDBACK_17278558_R1
NotificationEventName Feedback
RecipientUserID ebay_bestseller
EIASToken nY+sHZ2PrBmdj6wVnY+sEWDETj2dj6AFlIajDpaEpAydj6x9nY+seQ==
FeedbackDetailArray NULL
FeedbackDetail NULL
CommentingUser ebay_bestseller
CommentingUserScore 42425
CommentText Great buyer - We Would Appreciate 5 STARS for Our Feedback!
CommentTime 2015-09-06T11:20:45.000Z
CommentType Positive
ItemID 310541589307
Role Buyer
FeedbackID 1064451206013
TransactionID 549674542021
OrderLineItemID 310541589307-549674542021
FeedbackDetailItemTotal 1
FeedbackScore 126
PaginationResult NULL
TotalNumberOfPages 1
TotalNumberOfEntries 1
EntriesPerPage 25
PageNumber 1
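For comparison, the same name-value flattening can be sketched outside the database. The snippet below is only an illustration in Python, assuming the standard library's xml.etree.ElementTree and a shortened, hypothetical copy of the response; the helper names local_name and flatten_children are made up for this example. It walks every element below GetFeedbackResponse, strips the namespace from the tag, and yields None for container elements, which mirrors the NULL rows in the result above.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical copy of the SOAP response used above.
SOAP = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <GetFeedbackResponse xmlns="urn:ebay:apis:eBLBaseComponents">
      <Ack>Success</Ack>
      <FeedbackDetailArray>
        <FeedbackDetail>
          <Role>Buyer</Role>
        </FeedbackDetail>
      </FeedbackDetailArray>
      <PageNumber>1</PageNumber>
    </GetFeedbackResponse>
  </soapenv:Body>
</soapenv:Envelope>"""


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def flatten_children(xml_text, root_name):
    """Return (name, value) pairs for every element below the named root."""
    doc = ET.fromstring(xml_text)
    pairs = []
    for node in doc.iter():
        if local_name(node.tag) != root_name:
            continue
        for child in node.iter():
            if child is node:
                continue  # skip the root element itself
            text = (child.text or "").strip()
            # Container elements only hold whitespace, so they map to None.
            pairs.append((local_name(child.tag), text or None))
    return pairs


pairs = flatten_children(SOAP, "GetFeedbackResponse")
for name, value in pairs:
    print(name, value)
```

A list of pairs like this can then be pivoted or bulk-inserted by whatever loader is in use; that part is left out here.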
Third:
You should read the docs again.
The XQuery must return at most one value.
It is perfectly fine to return no value and in that case the value returned is NULL.
Update:
If the end result you are looking for really is one row with values in columns, you are better off using what you have already figured out. It can be simplified a bit by putting the path common to all values in a nodes() clause. Something like this.
with xmlnamespaces('http://schemas.xmlsoap.org/soap/envelope/' as [soap], default 'urn:ebay:apis:eBLBaseComponents')
select T.X.value('(Timestamp/text())[1]', 'datetime') as Timestamp,
T.X.value('(Ack/text())[1]', 'varchar(10)') as Ack,
T.X.value('(CorrelationID/text())[1]', 'varchar(20)') as CorrelationID
from @xDoc.nodes('/soap:Envelope/soap:Body/GetFeedbackResponse') as T(X)
Just add the extra columns you actually need. I have also added /text() where you extract a value from a node. For your XML it gives exactly the same result as you already have, but the optimizer can exclude a couple of operators from the query plan, so shredding this way is potentially faster.
Assume Table A with an XML field. Table A has two rows, with the following XML data in the table.
Row 1
<fullname>
<firstName>John</firstName>
<lastName>Smith</lastName>
</fullname>
and Row 2
<fullname>
<firstName>Jane</firstName>
</fullname>
This query:
SELECT * FROM A,
XMLTABLE(('/fullname'::text) PASSING (A.xml)
COLUMNS firstName text PATH ('firstName'::text), lastName text PATH ('lastName'::text)) x
will only return data on John and not Jane. Is there a work around for this?
https://www.postgresql.org/docs/13/functions-xml.html
Adding a DEFAULT clause to the column definition ("default X") resolved the issue.
I'm trying to convert each row in a jsonb column to a type that I've defined, and I can't quite seem to get there.
I have an app that scrapes articles from The Guardian Open Platform and dumps the responses (as jsonb) in an ingestion table, into a column called 'body'. Other columns are a sequential ID, and a timestamp extracted from the response payload that helps my app only scrape new data.
I'd like to move the response dump data into a properly-defined table, and as I know the schema of the response, I've defined a type (my_type).
I've been referring to the 9.16. JSON Functions and Operators in the Postgres docs. I can get a single record as my type:
select * from jsonb_populate_record(null::my_type, (select body from data_ingestion limit 1));
produces
id          type          sectionId           ...
----------  ------------  ------------------  ---
example_id  example_type  example_section_id  ...
(abbreviated for concision)
If I remove the limit, I get an error, which makes sense: the subquery would be providing multiple rows to jsonb_populate_record which only expects one.
I can get it to do multiple rows, but the result isn't broken into columns:
select jsonb_populate_record(null::my_type, body) from reviews_ingestion limit 3;
produces:
jsonb_populate_record
(example_id_1,example_type_1,example_section_id_1,...)
(example_id_2,example_type_2,example_section_id_2,...)
(example_id_3,example_type_3,example_section_id_3,...)
This is a bit odd, I would have expected to see column names; this after all is the point of providing the type.
I'm aware I can do this by using Postgres JSON querying functionality, e.g.
select
body -> 'id' as id,
body -> 'type' as type,
body -> 'sectionId' as section_id,
...
from reviews_ingestion;
This works but it seems quite inelegant. Plus I lose datatypes.
I've also considered aggregating all rows in the body column into a JSON array, so as to be able to supply this to jsonb_populate_recordset but this seems a bit of a silly approach, and unlikely to be performant.
Is there a way to achieve what I want, using Postgres functions?
Maybe you need this - to break my_type record into columns:
select (jsonb_populate_record(null::my_type, body)).*
from reviews_ingestion
limit 3;
-- or whatever other query clauses here
i.e. select all from these my_type records. All column names and types are in place.
Here is an illustration. My custom type is delmet and the CTE t remotely mimics data_ingestion.
create type delmet as (x integer, y text, z boolean);
with t(i, j, k) as
(
values
(1, '{"x":10, "y":"Nope", "z":true}'::jsonb, 'cats'),
(2, '{"x":11, "y":"Yep", "z":false}', 'dogs'),
(3, '{"x":12, "y":null, "z":true}', 'parrots')
)
select i, (jsonb_populate_record(null::delmet, j)).*, k
from t;
Result:
i  x   y     z      k
-  --  ----  -----  -------
1  10  Nope  true   cats
2  11  Yep   false  dogs
3  12        true   parrots
I am trying to manipulate string data in a column such as if the given string is '20591;#e123456;#17507;#c567890;#15518;#e135791' or '26169;#c785643', then the
result should be like 'e123456;c567890;e135791' or 'c785643'. The number of digits in between can be of any length.
Some of the things I have tried so far are:
select replace('20591;#e123456;#17507;#c567890;#15518;#e135791','#','');
This leaves me with '20591;e123456;17507;c567890;15518;e135791', which still includes the numbers that are not prefixed with 'e' or 'c'. I want to get rid of 20591, 17507 and 15518.
Creating a function that keeps the pattern '%[#][ec][0-9][;]%' and gets rid of the rest.
The most important advice is: do not store data in a delimited string. This violates the most basic principle of relational database design (1NF).
The second hint is SO-related: Please always add / tag your questions with the appropriate tool. The tag [tsql] points to SQL-Server, but this might be wrong (which would invalidate both answers). Please tag the full product with its version (e.g. [sql-server-2012]). Especially with string splitting there are very important product related changes from version to version.
Now to your question.
Working with (almost) any version of SQL-Server
My suggestion uses a trick with XML:
(credits to Alan Burstein for the mockup)
DECLARE @table TABLE (someid INT IDENTITY, somestring VARCHAR(50));
INSERT @table VALUES ('20591;#e123456;#17507;#c567890;#15518;#e135791'),('26169;#c785643')
--the query
SELECT t.someid,t.somestring,A.CastedToXml
,STUFF(REPLACE(A.CastedToXml.query('/x[contains(text()[1],"#") and empty(substring(text()[1],2,100) cast as xs:int?)]')
.value('.','nvarchar(max)'),'#',';'),1,1,'') TheNewList
FROM @table t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.somestring,';','</x><x>') + '</x>' AS XML)) A(CastedToXml);
The idea in short:
By replacing the ; with XML tags </x><x> we can transform your delimited list to XML. I included the intermediate XML into the result set. Just click it to see how this works.
In the next query I use a XQuery predicate first to find entries, which contain a # and second, which do NOT cast to an integer without the #.
The third step is specific to XML again. The XPath . in .value() will return all content as one string.
Finally we have to replace the # with ; and cut away the leading ; using STUFF().
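The filtering logic in these steps can be checked outside SQL Server. The following Python sketch is only an illustration of the idea, not the T-SQL answer itself; the function name keep_prefixed_ids is hypothetical, and it simplifies the XQuery predicate by testing for a leading # (instead of contains) and using str.isdigit() in place of the xs:int cast.

```python
def keep_prefixed_ids(raw):
    """Keep only the '#'-marked entries that are not plain integers."""
    kept = []
    for token in raw.split(";"):
        if not token.startswith("#"):
            continue  # entries without '#' (e.g. '20591') are dropped
        body = token[1:]
        if body.isdigit():
            continue  # '#17507' style entries are plain numbers
        kept.append(body)
    # Re-join the survivors with ';' and no leading separator.
    return ";".join(kept)


first = keep_prefixed_ids("20591;#e123456;#17507;#c567890;#15518;#e135791")
second = keep_prefixed_ids("26169;#c785643")
print(first)
print(second)
```

For the two sample strings from the question this prints e123456;c567890;e135791 and c785643.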
UPDATE The same idea, but a bit shorter:
You can try this as well
SELECT t.someid,t.somestring,A.CastedToXml
,REPLACE(A.CastedToXml.query('data(/x[empty(. cast as xs:int?)])')
.value('.','nvarchar(max)'),' ',';') TheNewList
FROM @table t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.somestring,';#','</x><x>') + '</x>' AS XML)) A(CastedToXml);
Here I use ;# to split your string and data() to implicitly concatenate your values (blank-separated).
UPDATE 2 for v2017
If you have v2017+ I'd suggest a combination of a JSON splitter and STRING_AGG():
SELECT t.someid,STRING_AGG(A.[value],';') AS TheNewList
FROM @table t
CROSS APPLY OPENJSON(CONCAT('["',REPLACE(t.somestring,';#','","'),'"]')) A
WHERE TRY_CAST(A.[value] AS INT) IS NULL
GROUP BY t.someid;
You did not include the version of SQL Server you are on. If you are using 2016+ you can use STRING_SPLIT, otherwise a good T-SQL splitter will do.
Against a single variable:
DECLARE @somestring VARCHAR(1000) = '20591;#e123456;#17507;#c567890;#15518;#e135791';
SELECT NewString = STUFF((
SELECT ','+split.item
FROM STRING_SPLIT(@somestring,';') AS s
CROSS APPLY (VALUES(REPLACE(s.[value],'#',''))) AS split(item)
WHERE split.item LIKE '[a-z][0-9]%'
FOR XML PATH('')),1,1,'');
Returns:
NewString
----------------------
e123456,c567890,e135791
Against a table:
DECLARE @table TABLE (someid INT IDENTITY, somestring VARCHAR(50));
INSERT @table VALUES ('20591;#e123456;#17507;#c567890;#15518;#e135791'),('26169;#c785643')
SELECT t.*, fn.NewString
FROM @table AS t
CROSS APPLY
(
SELECT NewString = STUFF((
SELECT ','+split.item
FROM STRING_SPLIT(t.somestring,';') AS s
CROSS APPLY (VALUES(REPLACE(s.[value],'#',''))) AS split(item)
WHERE split.item LIKE '[a-z][0-9]%'
FOR XML PATH('')),1,1,'')
) AS fn;
Returns:
someid somestring NewString
----------- -------------------------------------------------- -----------------------------
1 20591;#e123456;#17507;#c567890;#15518;#e135791 e123456,c567890,e135791
2 26169;#c785643 c785643
I have a simple update statement:
-- name: add-response!
UPDATE survey
SET :question = :response
WHERE caseid = :caseid
And I invoke it like this:
(add-response! db-spec "q1" 2 1001)
However, yesql doesn't like using a string as a parameter for the column - it translates "q1" to 'q1', which isn't valid postgres syntax.
"BatchUpdateException Batch entry 0 UPDATE survey SET 'q1' = 2
WHERE caseid = 1001 was aborted."
Is there a way to make this work? I've tried using the question name as a symbol: 'q1. That doesn't work because:
"PSQLException Can't infer the SQL type to use for an instance of
clojure.lang.Symbol."
I had the same problem with yesql some time ago, so I investigated its source code. It turns out that yesql converts a query like
UPDATE survey SET :question = :response WHERE caseid = :caseid
to
["UPDATE survey SET ? = ? WHERE caseid = ?" question response caseid]
and feeds it to clojure.java.jdbc/query. So this is just a prepared statement. According to this StackOverflow question there is no way to pass column names as parameters to a DB query. That actually makes sense, because one of the purposes of prepared statements is to force values to always be treated as values and thus protect you from SQL injection and similar issues.
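A common way to live with this restriction is to validate the column name against a fixed list and splice it into the SQL text yourself, while the values still travel as parameters. The sketch below shows the pattern in Python with the standard sqlite3 module rather than Clojure and PostgreSQL; the ALLOWED_COLUMNS set, the q2 column and the add_response function are assumptions made for the example.

```python
import sqlite3

ALLOWED_COLUMNS = {"q1", "q2"}  # hypothetical survey columns


def add_response(conn, question, response, caseid):
    """Update one whitelisted column; values still go through placeholders."""
    if question not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {question!r}")
    # The column name is spliced into the SQL text only after the whitelist check.
    sql = f'UPDATE survey SET "{question}" = ? WHERE caseid = ?'
    conn.execute(sql, (response, caseid))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (caseid INTEGER PRIMARY KEY, q1 INTEGER, q2 INTEGER)")
conn.execute("INSERT INTO survey (caseid) VALUES (1001)")
add_response(conn, "q1", 2, 1001)
row = conn.execute("SELECT q1, q2 FROM survey WHERE caseid = 1001").fetchone()
print(row)
```

Because only names from the fixed set ever reach the SQL string, the injection protection that prepared statements give for values is not undermined.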
In your case, you could use clojure.java.jdbc/update!, as it clearly allows parameterized column names:
(require '[clojure.java.jdbc :as j])
(j/update! db-spec :survey
{"q1" 2}
["caseid = ?" 1001])
Hope that helps. Cheers!
Here is an example table used for trying some basic operations on an xml column in a PostgreSQL table.
DROP TABLE IF EXISTS temp1;
CREATE TABLE temp1(myindex serial PRIMARY KEY, description xml);
INSERT INTO temp1(description)
VALUES
('<?xml version="1.0" encoding="utf-8"?>
<setup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<DATABASE>herdatabase</DATABASE>
<DBSERVER>127.0.0.1</DBSERVER>
<DBUSER>saly</DBUSER>
<DBPORT>5432</DBPORT>
</setup>'),
('<?xml version="1.0" encoding="utf-8"?>
<setup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<DATABASE>mydatabase</DATABASE>
<DBSERVER>127.0.0.1</DBSERVER>
<DBUSER>john</DBUSER>
<DBPORT>4424</DBPORT>
</setup>');
I decided to use XML instead of hstore or JSON since I'm working in .NET, where XML functions and serialization are well supported, and I don't have much of this data, so speed is not very important.
From here I try some basic queries to get data.
--1) This one works
SELECT xpath('/setup/DBPORT/text()', description) FROM temp1;
--2) This works but gives two arrays with a single value each
-- How to get one array with 2 values like "{5432, 127.0.0.1}"
SELECT xpath('/setup/DBPORT/text()', description), xpath('/setup/DBSERVER/text()', description) FROM temp1;
--3) How to get description when condition is met?
-- Here I get ERROR: could not identify an equality operator for type xml
SELECT description FROM temp1 WHERE xpath('/setup/DBSERVER/text()', description) = '{127.0.0.1}';
--4) How to get all values when condition is met?
SELECT allvalues FROM temp1 WHERE xpath('/setup/DBUSER/text()', description) = 'john';
How can I get the queries that don't work to run?
2 - Use the XPath union operator, |, to select both the DBPORT and the DBSERVER values into one array:
SELECT xpath('/setup/DBPORT/text()|/setup/DBSERVER/text()', description)
FROM temp1;
3 - The xpath() function returns an XML array which can be cast to a TEXT array for easier matching to other values:
SELECT description
FROM temp1
WHERE xpath('/setup/DBSERVER/text()', description)::TEXT[] = '{127.0.0.1}'::TEXT[];
4 - Similar to the previous, cast the XML array to a Text array to match to a value:
SELECT xpath('/setup/node()/text()', description)
FROM temp1
WHERE xpath('/setup/DBUSER/text()', description)::TEXT[] = '{john}'::TEXT[];
For the second, you have two arrays, so you can use array_cat():
SELECT array_cat(xpath('/setup/DBPORT/text()', description),
xpath('/setup/DBSERVER/text()', description))
FROM temp1;
For the third, you have one array of values (your xpath could match multiple /setup/DBSERVER elements, hence the array type). This takes the first element from the array and casts it to text so that you can compare it to the string:
SELECT description
FROM temp1
WHERE (xpath('/setup/DBSERVER/text()', description))[1]::text = '127.0.0.1';
Finally, you can use an xpath to generate an array of your elements, then unnest() them (so you get one row per element), then use another xpath to get at the element content. This gives the element content, but not the element name - I don't know the xpath to get the tag name off the top of my head.
SELECT xpath('/', unnest(xpath('/setup/*', description)))
FROM temp1
WHERE (xpath('/setup/DBUSER/text()', description))[1]::text = 'john';
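On the missing tag name: XPath 1.0 has the name() and local-name() functions for that. To show the intended result (element name together with element content for the matching row), here is a small sketch in Python using the standard library's xml.etree.ElementTree instead of PostgreSQL. The ROWS list is a shortened stand-in for temp1.description, and settings_for_user is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Two rows standing in for temp1.description (shortened from the example table).
ROWS = [
    "<setup><DATABASE>herdatabase</DATABASE><DBUSER>saly</DBUSER><DBPORT>5432</DBPORT></setup>",
    "<setup><DATABASE>mydatabase</DATABASE><DBUSER>john</DBUSER><DBPORT>4424</DBPORT></setup>",
]


def settings_for_user(rows, user):
    """Return (tag, text) pairs of every /setup child for rows whose DBUSER matches."""
    result = []
    for xml_text in rows:
        setup = ET.fromstring(xml_text)
        if setup.findtext("DBUSER") != user:
            continue  # same role as the WHERE clause above
        # One pair per child element, like one row per unnest()ed element.
        result.extend((child.tag, child.text) for child in setup)
    return result


john = settings_for_user(ROWS, "john")
print(john)
```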