How to transpose an XML table with T-SQL XQuery?

Let's say we have the following XML:
<root>
<row>
<column>row 1 col 1</column>
<column>row 1 col 2</column>
<column>row 1 col 3</column>
</row>
<row>
<column>row 2 col 1</column>
<column>row 2 col 2</column>
<column>row 2 col 3</column>
</row>
<row>
<column>row 3 col 1</column>
<column>row 3 col 2</column>
<column>row 3 col 3</column>
</row>
</root>
How do I transpose this, using T-SQL XQuery to:
<root>
<column>
<row>row 1 col 1</row>
<row>row 2 col 1</row>
<row>row 3 col 1</row>
</column>
<column>
<row>row 1 col 2</row>
<row>row 2 col 2</row>
<row>row 3 col 2</row>
</column>
<column>
<row>row 1 col 3</row>
<row>row 2 col 3</row>
<row>row 3 col 3</row>
</column>
</root>

I suspect there might be a really nice approach using PIVOT, but I don't know it well enough to be able to say for sure. What I offer here works. I have split it up into chunks for better formatting and to provide commentary:
To start with let's capture the example data
-- Sample data
DECLARE @x3 xml
SET @x3 = '
<root>
<row>
<column>row 1 col 1</column>
<column>row 1 col 2</column>
<column>row 1 col 3</column>
</row>
<row>
<column>row 2 col 1</column>
<column>row 2 col 2</column>
<column>row 2 col 3</column>
</row>
<row>
<column>row 3 col 1</column>
<column>row 3 col 2</column>
<column>row 3 col 3</column>
</row>
</root>
'
DECLARE @x xml
SET @x = @x3;
-- @x is now our input
Now the actual transposing code:
Establish the size of the matrix:
WITH Size(Size) AS
(
SELECT CAST(SQRT(COUNT(*)) AS int)
FROM @x.nodes('/root/row/column') T(C)
)
Shred the data, use ROW_NUMBER to capture the index (the -1 is to make it zero based), and use modulo and integer divide on the index to work out the new row and column numbers:
,Flattened(NewRow, NewCol, Value) AS
(
SELECT
-- i/@size as old_r, i % @size as old_c,
i % (SELECT TOP 1 Size FROM Size) AS NewRow,
i / (SELECT TOP 1 Size FROM Size) AS NewCol,
Value
FROM (
SELECT
(ROW_NUMBER() OVER (ORDER BY C)) - 1 AS i,
C.value('.', 'nvarchar(100)') AS Value
FROM @x.nodes('/root/row/column') T(C)
) ShreddedInput
)
With the Flattened CTE available, all we now need to do is get the FOR XML options and query structure right and we're done:
SELECT
(
SELECT Value 'column'
FROM
Flattened t_inner
WHERE
t_inner.NewRow = t_outer.NewRow
FOR XML PATH(''), TYPE
) row
FROM
Flattened t_outer
GROUP BY NewRow
FOR XML PATH(''), ROOT('root')
Sample output:
<root>
<row>
<column>row 1 col 1</column>
<column>row 2 col 1</column>
<column>row 3 col 1</column>
</row>
<row>
<column>row 1 col 2</column>
<column>row 2 col 2</column>
<column>row 3 col 2</column>
</row>
<row>
<column>row 1 col 3</column>
<column>row 2 col 3</column>
<column>row 3 col 3</column>
</row>
</root>
Works on any size 'square' data. Note the lack of sanity checking / error handling.
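To make the index arithmetic easier to follow, here is a minimal Python sketch of the same idea, outside SQL Server. The function name transpose_flat is made up for illustration, and the explicit square check is an addition that the T-SQL above does not perform.

```python
import math

def transpose_flat(values):
    """Transpose a square matrix given as a flat, row-major list.

    Mirrors the CTE logic: for zero-based index i and size n,
    the new row is i % n and the new column is i // n.
    """
    n = math.isqrt(len(values))
    if n * n != len(values):
        # Sanity check that the T-SQL version leaves out.
        raise ValueError("input does not describe a square matrix")
    grid = [[None] * n for _ in range(n)]
    for i, value in enumerate(values):
        new_row, new_col = i % n, i // n
        grid[new_row][new_col] = value
    return grid

cells = ["row 1 col 1", "row 1 col 2", "row 2 col 1", "row 2 col 2"]
for row in transpose_flat(cells):
    print(row)
```

The old column position (i % n) becomes the new row and the old row position (i // n) becomes the new column, which is all a transpose is.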

-- REPLACE works on strings, so the variable is declared as nvarchar(max) here rather than xml
DECLARE @x3 nvarchar(max)
SET @x3 = '
<root>
<row>
<column>row 1 col 1</column>
<column>row 1 col 2</column>
<column>row 1 col 3</column>
</row>
<row>
<column>row 2 col 1</column>
<column>row 2 col 2</column>
<column>row 2 col 3</column>
</row>
<row>
<column>row 3 col 1</column>
<column>row 3 col 2</column>
<column>row 3 col 3</column>
</row>
</root>
'
select @x3 = replace(@x3,'<row>','<rowtemp>')
select @x3 = replace(@x3,'<column>','<row>')
select @x3 = replace(@x3,'<rowtemp>','<column>')
select @x3

Related

how to rank the column values of each group

I have a table
I want to set the ranks based on the maximum volume within each group, i.e. each date.
If volume is null, then don't rank it; keep the rank column empty for a null volume (for example, see lines 11 and 12 in the expected output snapshot).
rank=1 is our front contract; if sym has flipped, then it cannot be rank 1 again after the flip (for example, see lines 9, 13 and 15 in the output snapshot).
expected output is
To generate the sample table, use the code below.
tab:([]date:`date$();sym:`symbol$();name:`symbol$();volume:`float$();roll_rank:`int$());
`tab insert (2010.01.01;`ESH22;`ES;100.1;0Ni);
`tab insert (2010.01.01;`ESH23;`ES;500.1;0Ni);
`tab insert (2010.01.02;`ESH22;`ES;100.1;0Ni);
`tab insert (2010.01.02;`ESH23;`ES;800.1;0Ni);
`tab insert (2010.01.02;`ESH24;`ES;600.1;0Ni);
`tab insert (2010.01.02;`ESH25;`ES;550.1;0Ni);
`tab insert (2010.01.02;`ESH26;`ES;200.1;0Ni);
`tab insert (2010.01.03;`ESH23;`ES;600.1;0Ni);
`tab insert (2010.01.03;`ESH24;`ES;700.1;0Ni);
`tab insert (2010.01.03;`ESH26;`ES;0n;0Ni);
`tab insert (2010.01.03;`ESH25;`ES;500.1;0Ni);
`tab insert (2010.01.03;`ESH26;`ES;0n;0Ni);
`tab insert (2010.01.04;`ESH23;`ES;50.1;0Ni);
`tab insert (2010.01.05;`ESH23;`ES;300.1;0Ni);
`tab insert (2010.01.05;`ESH24;`ES;800.1;0Ni);
`tab insert (2010.01.05;`ESH25;`ES;100.1;0Ni);
The following sorts volume in descending order within each date, with the rank number in a separate column:
q)ungroup select volume:desc volume,ranknumber:1+til count volume by date from tab
Code output with the provided table data:
date volume ranknumber
----------------------------
2010.01.01 500.1 1
2010.01.01 100.1 2
2010.01.02 800.1 1
2010.01.02 600.1 2
2010.01.02 550.1 3
2010.01.02 200.1 4
2010.01.02 100.1 5
2010.01.03 700.1 1
2010.01.03 600.1 2
2010.01.03 500.1 3
2010.01.03 4
2010.01.03 5
2010.01.04 50.1 1
2010.01.05 800.1 1
2010.01.05 300.1 2
2010.01.05 100.1 3
Haven't thought of an elegant way of not including the null values in the rank order yet.
Edit: You could use "update" on the sorted table to remove the ranked null values - something like this would work (where tab2 is the previous output):
q)update ranknumber:0N from tab2 where null volume
date volume ranknumber
----------------------------
2010.01.01 500.1 1
2010.01.01 100.1 2
2010.01.02 800.1 1
2010.01.02 600.1 2
2010.01.02 550.1 3
2010.01.02 200.1 4
2010.01.02 100.1 5
2010.01.03 700.1 1
2010.01.03 600.1 2
2010.01.03 500.1 3
2010.01.03
2010.01.03
2010.01.04 50.1 1
2010.01.05 800.1 1
2010.01.05 300.1 2
2010.01.05 100.1 3
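For comparison, here is a rough Python sketch of the same per-date ranking, with null volumes left unranked. It is not q, the function name rank_by_volume is invented for illustration, and it does not attempt the "cannot be rank 1 again after a flip" rule from the question.

```python
from collections import defaultdict

def rank_by_volume(rows):
    """rows: iterable of (date, sym, volume) with volume possibly None.

    Returns (date, sym, volume, rank) tuples. Within each date the
    non-null volumes are ranked 1, 2, ... in descending order;
    rows with a null volume come last and get rank None.
    """
    by_date = defaultdict(list)
    for date, sym, volume in rows:
        by_date[date].append((sym, volume))

    out = []
    for date in sorted(by_date):
        group = by_date[date]
        present = sorted(
            (g for g in group if g[1] is not None),
            key=lambda g: g[1],
            reverse=True,
        )
        for rank, (sym, volume) in enumerate(present, start=1):
            out.append((date, sym, volume, rank))
        # Null volumes stay in the result but are not ranked.
        for sym, volume in group:
            if volume is None:
                out.append((date, sym, None, None))
    return out

sample = [
    ("2010.01.03", "ESH23", 600.1),
    ("2010.01.03", "ESH24", 700.1),
    ("2010.01.03", "ESH26", None),
    ("2010.01.03", "ESH25", 500.1),
]
for row in rank_by_volume(sample):
    print(row)
```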

TSQL - nested case

I would like to know whether nested CASE expressions can be used as follows:
SELECT
CASE
WHEN Col1 < 2 THEN
CASE Col2
WHEN 'X' THEN 10
ELSE 11
END
WHEN Col1 = 2 THEN 2
.....
ELSE 0
END as Qty,
......,
FROM ....
Explanation: if Col1 < 2, return 10 when Col2 is 'X' and 11 otherwise; if Col1 = 2, return 2; otherwise return 0. The result goes in the column named Qty.
Is this reasoning correct?
Thanks in advance
It should return what you say you need, but it's easier to read this way:
SELECT
CASE
WHEN Col1 < 2 AND Col2 = 'X' THEN 10
WHEN Col1 < 2 THEN 11
WHEN Col1 = 2 THEN 2
--.....
ELSE 0
END AS Qty
FROM
-- ...
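If you want to convince yourself that the two forms agree, here is a small Python sketch that mirrors both versions and compares them over a few inputs. The function names are made up, the elided branches are left out, and Col1 is assumed to be non-null.

```python
def qty_nested(col1, col2):
    # Mirrors the nested CASE from the question.
    if col1 < 2:
        return 10 if col2 == "X" else 11
    if col1 == 2:
        return 2
    return 0

def qty_flat(col1, col2):
    # Mirrors the flattened CASE: branches are tested top to bottom,
    # so the more specific condition has to come first.
    if col1 < 2 and col2 == "X":
        return 10
    if col1 < 2:
        return 11
    if col1 == 2:
        return 2
    return 0

for col1 in (0, 1, 2, 3):
    for col2 in ("X", "Y", None):
        assert qty_nested(col1, col2) == qty_flat(col1, col2)
print("both versions agree")
```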

How to insert row data between consecutive dates in HIVE?

Sample Data:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 4-Jan-17 1
A 5-Jan-17 0
B 3-Jan-17 1
B 5-Jan-17 0
I need to fill in every missing txn_date within the date range (1-Jan-17 to 5-Jan-17), with tag 0 for the inserted rows, like below:
Output should be:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 3-Jan-17 0 (inserted)
A 4-Jan-17 1
A 5-Jan-17 0
B 1-Jan-17 0 (inserted)
B 2-Jan-17 0 (inserted)
B 3-Jan-17 1
B 4-Jan-17 0 (inserted)
B 5-Jan-17 0
select c.customer
,d.txn_date
,coalesce(t.tag,0) as tag
from (select date_add (from_date,i) as txn_date
from (select date '2017-01-01' as from_date
,date '2017-01-05' as to_date
) p
lateral view
posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
) d
cross join (select distinct
customer
from t
) c
left join t
on t.customer = c.customer
and t.txn_date = d.txn_date
;
c.customer d.txn_date tag
A 2017-01-01 1
A 2017-01-02 1
A 2017-01-03 0
A 2017-01-04 1
A 2017-01-05 0
B 2017-01-01 0
B 2017-01-02 0
B 2017-01-03 1
B 2017-01-04 0
B 2017-01-05 0
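The query builds a list of every date in the range, cross joins it with the distinct customers, and left joins the original rows back, defaulting tag to 0. A minimal Python sketch of that same shape, in case it helps to see it outside Hive (the function name fill_missing_dates is only for illustration):

```python
from datetime import date, timedelta

def fill_missing_dates(rows, from_date, to_date):
    """rows: iterable of (customer, txn_date, tag).

    Returns one row per customer per day in [from_date, to_date],
    with tag 0 wherever the input has no row for that pair.
    """
    rows = list(rows)
    known = {(customer, day): tag for customer, day, tag in rows}
    customers = sorted({customer for customer, _, _ in rows})
    # The "date spine": every day in the range, inclusive.
    span = (to_date - from_date).days
    days = [from_date + timedelta(days=i) for i in range(span + 1)]
    # Cross join customers x days, then look up the tag (left join + coalesce).
    return [(c, d, known.get((c, d), 0)) for c in customers for d in days]

data = [
    ("B", date(2017, 1, 3), 1),
    ("B", date(2017, 1, 5), 0),
]
for row in fill_missing_dates(data, date(2017, 1, 1), date(2017, 1, 5)):
    print(row)
```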
Put just the delta content, i.e. the missing rows, in a file (input.txt), delimited with the same delimiter you specified when you created the table.
Then use the load data command to insert these records into the table.
load data local inpath '/tmp/input.txt' into table tablename;
Your data won't be in the order you have shown; the new rows are appended at the end. You can get that order back by adding order by txn_date to the select query.

Overlapping condition for case-when

I have the following query:
SELECT case
when tbl.id % 2 = 0 then 'mod-2'
when tbl.id % 3 = 0 then 'mod-3'
when tbl.id % 5 = 0 then 'mod-5'
else 'mod-x'
end as odds, tbl.id from some_xyz_table tbl;
If the table has ids 5, 6 and 7, it returns the following output (copied from pgAdmin):
"mod-5";5
"mod-2";6
"mod-x";7
But, here I can see 6 is divisible by both 2 and 3. And my expected output is:
"mod-5";5
"mod-2";6
"mod-3";6 <-- this
"mod-x";7
Is there any way to modify this query to obtain such output? Any alternate solution will do for me.
You could do this with UNION queries [EDIT changed it to use UNION ALL]:
SELECT 'mod-5', id FROM tbl -- divisible by 5
WHERE id %5 = 0
UNION ALL
SELECT 'mod-2', id FROM tbl -- divisible by 2
WHERE id %2 = 0
UNION ALL
SELECT 'mod-3', id FROM tbl -- divisible by 3
WHERE id %3 = 0
UNION ALL
SELECT 'mod-x',id FROM tbl -- not divisible by 5,3 or 2
WHERE id %5 <> 0 AND id%2 <> 0 AND id % 3 <> 0
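The point is that a CASE returns only its first matching branch, whereas each UNION ALL branch tests its own condition independently, so an id can show up more than once. A short Python sketch of the same "emit every match" logic (label_ids is an invented name, and the rows come out grouped by id rather than by label as they would from the UNION ALL):

```python
def label_ids(ids, divisors=(2, 3, 5)):
    """Return (label, id) pairs, one per divisor that divides the id.

    Ids that match no divisor get a single 'mod-x' row.
    """
    out = []
    for i in ids:
        matches = [f"mod-{d}" for d in divisors if i % d == 0]
        for label in matches or ["mod-x"]:
            out.append((label, i))
    return out

for label, i in label_ids([5, 6, 7]):
    print(label, i)
```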

Crystal Reports: How do I repeat a constant number of rows / headers on each new page in a cross-tab?

I have some data that I've staged in my database as such:
RowHeader ColumnHeader Value
Row1 Col1 (1,1)
Row1 Col2 (1,2)
Row1 Col3 (1,3)
Row1 Col4 (1,4)
Row1 Col5 (1,5)
Row2 Col1 (2,1)
Row2 Col2 (2,2)
... ... ...
RowN ColM (N,M)
And, as you might guess, I'm putting this in a cross tab in the following manner:
Columns:
ColumnHeader
Rows: Summarized Fields:
RowHeader Max of Value
And this generates the following report:
Col1 Col2 Col3 ... ColM
Row1 (1,1) (1,2) (1,3) ... (1,M)
Row2 (2,1) (2,2) (2,3) ... (2,M)
... ... ... ... ...
RowN (N,1) (N,2) (N,3) ... (N,M)
Now, this report spans multiple pages and on each page, I'd like to always display the data from the first couple of rows and columns (a little like freezing panes in Excel). The number of rows and columns that need to always be displayed is constant. E.g. Let's say, on each page, I want columns 1 to 3 and row 1 to appear:
-- Page 1 --
Col1 Col2 Col3 Col4 Col5
Row1 (1,1) (1,2) (1,3) (1,4) (1,5)
Row2 (2,1) (2,2) (2,3) (2,4) (2,5)
Row3 (3,1) (3,2) (3,3) (3,4) (3,5)
Row4 (4,1) (4,2) (4,3) (4,4) (4,5)
Row5 (5,1) (5,2) (5,3) (5,4) (5,5)
-- Page 2 --
Col1 Col2 Col3 Col6 Col7
Row1 (1,1) (1,2) (1,3) (1,6) (1,7)
Row6 (6,1) (6,2) (6,3) (6,6) (6,7)
Row7 (7,1) (7,2) (7,3) (7,6) (7,7)
Row8 (8,1) (8,2) (8,3) (8,6) (8,7)
Row9 (9,1) (9,2) (9,3) (9,6) (9,7)
-- etc. ---
How can I do this?
Ok ok... you caught me... I'm totally new to using Crystal Reports (what gave it away?). I have a feeling that this cannot be done with the way the data is currently staged, but I am totally open to staging the data in another fashion to make this work. Thanks in advance.
You can achieve that by creating a group that splits up your columns.
For example, if your columns are month/year and you want only 6 per sheet, create a group with a formula that returns 'start year' if the date falls in the first 6 months of the year, and 'end year' otherwise.
Insert that group into the report, then place your cross-tab in each group, and you're done.
You cannot achieve this with cross-tabs. You can achieve this by staging the data differently (i.e. in the manner it needs to be displayed) and creating a normal report.
Morning,
As I said, you need to find a link between the columns... I don't know how to repeat the first 3 columns, given that they're not labels...