I have Multiple logical records in one db row, how do I split them into separate rows?

I have Multiple logical records in one db row, how do I split them into separate rows? - tsql

I have a table that has data like:
Name
Item_1
Qty_1
Price_1
Item_2
Qty_2
Price_2
...
Item_50
Qty_50
Price_50
Bob
Apples
10
0.50
Pears
5
0.65
...
Lemons
12
0.25
Alice
Cherries
20
1.00
NULL
NULL
NULL
...
NULL
NULL
NULL
I need to process the data per-item, so the ideal form of the data would be:
Name
ItemNo
Item
Qty
Price
Bob
1
Apples
10
0.50
Bob
2
Pears
5
0.65
...
...
...
...
...
Bob
50
Lemons
12
0.25
Alice
1
Cherries
20
1.00
How can I convert between the two forms?
I have looked at the pivot command, but it seems to convert column names into data in a field, not split groups of columns into separate rows. It doesn't look like it will work for this application.
The current code looks something like:
( SELECT t1.Name, 1 AS ItemNo, t1.Item_1 AS Item, t1.Qty_1 AS Qty, t1.Price_1 AS Price FROM table t1
UNION ALL
SELECT t2.Name, 2 AS ItemNo, t2.Item_2 AS Item, t2.Qty_2 AS Qty, t2.Price_2 AS Price FROM table t2
UNION ALL
...
SELECT t50.Name, 50 AS ItemNo, t50.Item_50 AS Item, t50.Qty_50 AS Qty, t50.Price_50 AS Price FROM table t50
)
It works, but it seems hard to maintain. Is there a better way?

Hopefully the reason you want to do this is to fix your design. If not, then make the reason you're asking is to fix your design.
Anyway, one method is to use a VALUES table construct to unpivot the data:
SELECT YT.Name,
V.ItemNo,
V.Item,
V.Qty,
V.Price
FROM dbo.YourTable YT
CROSS APPLY (VALUES(1,YT.Item_1, YT.Qty_1, YT.Price1),
(2,YT.Item_2, YT.Qty_2, YT.Price2),
(3,YT.Item_3, YT.Qty_3, YT.Price3),
... --You get the idea
(49,YT.Item_49, YT.Qty_49, YT.Price49),
(50,YT.Item_50, YT.Qty_50, YT.Price50))V(ItemNo,Item,Qty,Price)
WHERE V.Item IS NOT NULL;

Related

T-SQL vlookup with fake calendar table?

I am rather new in T-SQL and I have to create a view, where the output will be as shown below:
enter image description here
But my sales table doesn't have any data about sales in February and May for customer ABC and no data in January for customer XYZ, but I really want to have 0 for these months. How to do it in T-SQL?

This is great question about a very important topic that, even many experienced developers need to touch up on. Being "relatively new at SQL" I wont just offer a solution, I'll explain the key concepts involved.
The Auxiliary Table Numbers
First lets learn about what a tally table, aka numbers table is all about.
What does this do?
SELECT N = 1 ;
It returns the number 1.
N
-----
1
How about this?
SELECT N = 1 FROM (VALUES(0)) AS e(N);
Same thing:
N
-----
1
What does this return?
SELECT N = 1 FROM (VALUES(0),(0),(0),(0),(0),(0)) AS e(n);
Here I'm leveraging the VALUES table constructer which allows for a list of values to be treated like a view. This returns:
N
-------
1
1
1
1
1
We don't need the ones, we need the rows. This will make more sense in a moment. Now, what does this do?
WITH e(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = 1 FROM e e1;
It returns the same thing, five 1's, but I've wrapped the code into a CTE named e. Think of CTEs as inline unnamed views that you can reference multiple times. Now lets CROSS JOIN e to itself. This returns for 25 dummy rows (5*5).
WITH e(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = 1 FROM e e1, e e2;
Next we leverage ROW_NUMBER() over our set of dummy values.
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a;
Returns (truncated for brevity):
N
--------------------
1
2
3
...
24
25
Using as an auxiliary numbers table
#OneToTen is a table with random numbers 1 to 10. I need to count how many there are, returning 0 when there aren't any. NOTE MY COMMENTS:
;--== 2. Simple Use Case - Counting all numbers, including missing ones (missing = 0)
DECLARE #OneToTen TABLE (N INT);
INSERT #OneToTen VALUES(1),(2),(2),(2),(4),(8),(8),(10),(10),(10);
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a)
SELECT
N = i.N,
Wrong = COUNT(*), -- WRONG!!! Don't do THIS, this counts ALL rows returned
Correct = COUNT(t.N) -- Correct, this counts numbers from #OneToTen AKA "t.N"
FROM iTally AS i -- Aux Table of numbers
LEFT JOIN #OneToTen AS t -- Table to evaluate
ON i.N = t.N -- LEFT JOIN #OneToTen numbers to our Aux table of numbers
WHERE i.N <= 10 -- We only need the numbers 1 to 10
GROUP BY i.N; -- Group by with no Sort!!!
This returns:
N Wrong Correct
----- ----------- -----------
1 1 1
2 3 3
3 1 0
4 1 1
5 1 0
6 1 0
7 1 0
8 2 2
9 1 0
10 3 3
Note that I show you the wrong and right way to do this. Note how COUNT(*) is wrong for this, you need COUNT(whatever you are counting).
Auxiliary table of Dates (AKA calendar table)
My we use our numbers table to create a calendar table.
;--== 3. Auxilliary Month/Year Calendar Table
DECLARE #Start DATE = '20191001',
#End DATE = '20200301';
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a)
SELECT TOP(DATEDIFF(MONTH,#Start,#End)+1)
TheDate = f.Dt,
TheYear = YEAR(f.Dt),
TheMonth = MONTH(f.Dt),
TheWeekday = DATEPART(WEEKDAY,f.Dt),
DayOfTheYear = DATEPART(DAYOFYEAR,f.Dt),
LastDayOfMonth = EOMONTH(f.Dt)
FROM iTally AS i
CROSS APPLY (VALUES(DATEADD(MONTH, i.N-1, #Start))) AS f(Dt)
This returns:
TheDate TheYear TheMonth TheWeekday DayOfTheYear LastDayOfMonth
---------- ----------- ----------- ----------- ------------ --------------
2019-10-01 2019 10 3 274 2019-10-31
2019-11-01 2019 11 6 305 2019-11-30
2019-12-01 2019 12 1 335 2019-12-31
2020-01-01 2020 1 4 1 2020-01-31
2020-02-01 2020 2 7 32 2020-02-29
2020-03-01 2020 3 1 61 2020-03-31
You will only need the YEAR and MONTH.
The Auxiliary Customer table
Because you are performing aggregations (SUM,COUNT,etc.) against multiple customers we will also need an Auxiliary table of customers, more commonly known as a lookup or dimension.
SAMPLE DATA:
;--== Sample Data
DECLARE #sale TABLE
(
Customer VARCHAR(10),
SaleYear INT,
SaleMonth TINYINT,
SaleAmt DECIMAL(19,2),
INDEX idx_cust(Customer)
);
INSERT #sale
VALUES('ABC',2019,12,410),('ABC',2020,1,668),('ABC',2020,1,50), ('ABC',2020,3,250),
('CDF',2019,10,200),('CDF',2019,11,198),('CDF',2020,1,333),('CDF',2020,2,5000),
('CDF',2020,2,325),('CDF',2020,3,1105),('FRED',2018,11,1105);
Distinct list of customers for an "Auxilliary Table of Customers"
SELECT DISTINCT s.Customer FROM #sale AS s;
For my sample data we get:
Customer
----------
ABC
CDF
FRED
Putting it all together
Here I'm going to:
Create a numbers table
Use my numbers table to create a calendar table
Create an auxiliary Customer table from #sale
CROSS JOIN (combine) both tables for a "junk dimension"
LEFT JOIN our sales data to our calendar/customer auxiliary tables/junk dimension
Group by the auxiliary table values
SOLUTION:
;--==== SAMPLE DATA
DECLARE #sale TABLE
(
Customer VARCHAR(10),
SaleYear INT,
SaleMonth TINYINT,
SaleAmt DECIMAL(19,2),
INDEX idx_cust(Customer)
);
INSERT #sale
VALUES('ABC',2019,12,410),('ABC',2020,1,668),('ABC',2020,1,50), ('ABC',2020,3,250),
('CDF',2019,10,200),('CDF',2019,11,198),('CDF',2020,1,333),('CDF',2020,2,5000),
('CDF',2020,2,325),('CDF',2020,3,1105),('FRED',2018,11,1105);
;--==== START/END DATEs
DECLARE #Start DATE = '20191001',
#End DATE = '20200301';
;--==== FINAL SOLUTION
WITH -- 6.1. Auxilliary Table of numbers:
E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a),
-- 6.2. Use numbers table to create an "Auxilliary Date Table" (Calendar Table):
MonthYear(SaleYear,SaleMonth) AS
(
SELECT TOP(DATEDIFF(MONTH,#Start,#End)+1) YEAR(f.Dt), MONTH(f.Dt)
FROM iTally AS i
CROSS APPLY (VALUES(DATEADD(MONTH, i.N-1, #Start))) AS f(Dt)
)
SELECT
Customer = cust.Customer,
MonthYear = CONCAT(cal.SaleYear,'-',cal.SaleMonth),
Sales = ISNULL(SUM(s.SaleAmt),0)
-- Auxilliary Table of Customers
FROM (SELECT DISTINCT s.Customer FROM #sale AS s) AS cust -- 6.3. Aux Customer Table
CROSS JOIN MonthYear AS cal -- 6.4. Cross join to create Calendar/Customer Junk Dimension
LEFT JOIN #sale AS s -- 6.5. Join #sale to Junk Dimension on Year,Month and Customer
ON s.SaleYear = cal.SaleYear
AND s.SaleMonth = cal.SaleMonth
AND s.Customer = cust.Customer
GROUP BY cust.Customer, cal.SaleYear, cal.SaleMonth -- 6.6. Group by Junk Dim values
ORDER BY cust.Customer, cal.SaleYear, cal.SaleMonth; -- Order by not required
RESULTS:
Customer MonthYear Sales
---------- ------------ ------------
ABC 2019-10 0.00
ABC 2019-11 0.00
ABC 2019-12 410.00
ABC 2020-1 718.00
ABC 2020-2 0.00
ABC 2020-3 250.00
CDF 2019-10 200.00
CDF 2019-11 198.00
CDF 2019-12 0.00
CDF 2020-1 333.00
CDF 2020-2 5325.00
CDF 2020-3 1105.00
FRED 2019-10 0.00
FRED 2019-11 0.00
FRED 2019-12 0.00
FRED 2020-1 0.00
FRED 2020-2 0.00
FRED 2020-3 0.00

Randomly select observations separately for each column in SQL

I am interested in generating a completely (damaged) randomized data where observations are selected randomly (with replacement) for each field and then combined. I will need to generate a new dummy id to represent the old id as I don't want to reconstruct the data. My goal is to create a simulated column-wise random dataset.
Here is a sample data:
Id Col1 Col2 Col3
11 A 0.01 David
12 B 0.04 Max
13 C 0.05 Tom
14 E 0.06 West
15 C 0.02 Mike
What I am interested in is something like this:
Id2 Col1 Col2 Col3
1 E 0.04 Mike
2 C 0.06 David
3 B 0.02 West
4 A 0.04 Tom
5 C 0.05 Max
I am looking for an organized way of doing this. Here is what I attempted so far but am not interested in doing many times over since I have a lot of columns in the real data.
proc sql;
create table newtable1 as
select monotonic() as id2, col1 from
(select col1 from Table1 order by ranuni(0));
quit;
Using the above code you generate separate random columns and then combine them using the new monotonic key.

TSQL, Pivot rows into single columns

Before, I had to solve something similar:
Here was my pivot and flatten for another solution:
I want to do the same thing on the example below but it is slightly different because there are no ranks.
In my previous example, the table looked like this:
LocationID Code Rank
1 123 1
1 124 2
1 138 3
2 999 1
2 888 2
2 938 3
And I was able to use this function to properly get my rows in a single column.
-- Check if tables exist, delete if they do so that you can start fresh.
IF OBJECT_ID('tempdb.dbo.#tbl_Location_Taxonomy_Pivot_Table', 'U') IS NOT NULL
DROP TABLE #tbl_Location_Taxonomy_Pivot_Table;
IF OBJECT_ID('tbl_Location_Taxonomy_NPPES_Flattened', 'U') IS NOT NULL
DROP TABLE tbl_Location_Taxonomy_NPPES_Flattened;
-- Pivot the original table so that you have
SELECT *
INTO #tbl_Location_Taxonomy_Pivot_Table
FROM [MOAD].[dbo].[tbl_Location_Taxonomy_NPPES] tax
PIVOT (MAX(tax.tbl_lkp_Taxonomy_Seq)
FOR tax.Taxonomy_Rank in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15])) AS pvt
-- ORDER BY Location_ID
-- Flatten the tables.
SELECT Location_ID
,max(piv.[1]) as Tax_Seq_1
,max(piv.[2]) as Tax_Seq_2
,max(piv.[3]) as Tax_Seq_3
,max(piv.[4]) as Tax_Seq_4
,max(piv.[5]) as Tax_Seq_5
,max(piv.[6]) as Tax_Seq_6
,max(piv.[7]) as Tax_Seq_7
,max(piv.[8]) as Tax_Seq_8
,max(piv.[9]) as Tax_Seq_9
,max(piv.[10]) as Tax_Seq_10
,max(piv.[11]) as Tax_Seq_11
,max(piv.[12]) as Tax_Seq_12
,max(piv.[13]) as Tax_Seq_13
,max(piv.[14]) as Tax_Seq_14
,max(piv.[15]) as Tax_Seq_15
-- JOIN HERE
INTO tbl_Location_Taxonomy_NPPES_Flattened
FROM #tbl_Location_Taxonomy_Pivot_Table piv
GROUP BY Location_ID
So, then here is the data I would like to work with in this example.
LocationID Foreign Key
2 2
2 670
2 2902
2 5389
3 3
3 722
3 2905
3 5561
So I have some data that is formatted like this:
I have used pivot on data like this before--But the difference was it had a rank also. Is there a way to get my foreign keys to show up in this format using a pivot?
locationID FK1 FK2 FK3 FK4
2 2 670 2902 5389
3 3 722 2905 5561
Another way I'm looking to solve this is like this:
Another way I could look at doing this is I have the values in:
this form as well:
LocationID Address_Seq
2 670, 5389, 2902, 2,
3 722, 5561, 2905, 3
etc
is there anyway I can get this to be the same?
ID Col1 Col2 Col3 Col4
2 670 5389, 2902, 2

This, adding a rank column and reversing the orders, should gives you what you require:
SELECT locationid, [4] col1, [3] col2, [2] col3, [1] col4
FROM
(
SELECT locationid, foreignkey,rank from #Pivot_Table ----- temp table with a rank column
) x
PIVOT (MAX(x.foreignkey)
FOR x.rank in ([4],[3],[2],[1]) ) pvt

Kdb q query data from one table based on the data from another table without join

I'm new in kdb/q. And the following is my question. Really hope someone who experts in kdb can help me out.
I have two tables. Table t1 has two attributes: tp_time and id, which looks like:
tp_time id
------------------------------
2018.06.25T00:07:15.822 1
2018.06.25T00:07:45.823 3
2018.06.25T00:09:01.963 8
...
...
Table t2 has three attributes: tp_time, id, and price.
For each id, it has lots of price at different tp_time. So the table t2 is really large, which looks like the following:
tp_time id price
----------------------------------------
2018.06.25T00:05:99.999 1 10.87
2018.06.25T00:06:05.823 1 10.88
2018.06.25T00:06:18.999 1 10.88
...
...
2018.06.25T17:39:20.999 1 10.99
2018.06.25T17:39:23.999 1 10.99
2018.06.25T17:39:24.999 1 10.99
...
...
2018.06.25T01:39:39.999 2 10.99
2018.06.25T01:39:41.999 2 10.99
2018.06.25T01:39:45.999 2 10.99
...
...
What I try to do is for each row in Table t1, find its price at the nearest time and its price at approximately 5 seconds later. For example, for the first row in table t1:
2018.06.25T00:07:15.822 1
The price at nearest time is 10.87 and the price at around 5 seconds later is 10.88. And my expected output table looks like the following:
tp_time id price_1 price_2
----------------------------------------------------
2018.06.25T00:07:15.822 1 10.87 10.88
2018.06.25T00:07:45.823 3 SOME_PRICE SOME_PRICE
2018.06.25T00:09:01.963 8 SOME_PRICE SOME_PRICE
...
...
The thing is I cannot join t1 and t2 because table t2 is so large and I will kill the server. I've try something like ...where tp_time within(time1, time2). But I'm not sure how to deal with the time1 and time2 varibles.
Could someone gives me some helps on this questions? Thanks so much!

I'll recommend organizing the table t1 by applying the proper attributes so that when you join the tables, it will generate the results quickly.
Since you are looking for the prevailing price and price after 5 seconds, You will need wj for this.
the general syntax is :
wj[w;c;t;(q;(f0;c0);(f1;c1))]
w - begin and end time
t & q - unkeyed tables; q should be sorted by `id`time with `p# on id
c- names of the columns to be joined
f0,f1 - aggregation functions
In your case t2 should be sorted by `id`time with `p# on id
q)t2:update `g#id from `id`tp_time xasc ([] tp_time:`time$10:20:30 + asc -10?10 ; id:10?3 ;price:10?10.)
q)t1:([] tp_time:`time$10:20:30 + asc -3?5 ; id:1 1 1 )
q)select from t2 where id=1
tp_time id price
10:20:31.000 1 4.410662
10:20:32.000 1 5.473385
10:20:38.000 1 1.247049
q)wj[(`second$0 5)+\:t1.tp_time;`id`tp_time;t1;(t2;(first;`price);(last;`price))]
tp_time id price price
10:20:30.000 1 4.410662 5.473385
10:20:31.000 1 4.410662 5.473385
10:20:34.000 1 5.473385 1.247049 //price at 32nd second & 38th second

SELECT record based upon dates

Assuming data such as the following:
ID EffDate Rate
1 12/12/2011 100
1 01/01/2012 110
1 02/01/2012 120
2 01/01/2012 40
2 02/01/2012 50
3 01/01/2012 25
3 03/01/2012 30
3 05/01/2012 35
How would I find the rate for ID 2 as of 1/15/2012?
Or, the rate for ID 1 for 1/15/2012?
In other words, how do I do a query that finds the correct rate when the date falls between the EffDate for two records? (Rate should be for the date prior to the selected date).
Thanks,
John

How about this:
SELECT Rate
FROM Table1
WHERE ID = 1 AND EffDate = (
SELECT MAX(EffDate)
FROM Table1
WHERE ID = 1 AND EffDate <= '2012-15-01');
Here's an SQL Fiddle to play with. I assume here that 'ID/EffDate' pair is unique for all table (at least the opposite doesn't make sense).

SELECT TOP 1 Rate FROM the_table
WHERE ID=whatever AND EffDate <='whatever'
ORDER BY EffDate DESC
if I read you right.
(edited to suit my idea of ms-sql which I have no idea about).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

I have Multiple logical records in one db row, how do I split them into separate rows? - tsql

Related

T-SQL vlookup with fake calendar table?

Randomly select observations separately for each column in SQL

TSQL, Pivot rows into single columns

Kdb q query data from one table based on the data from another table without join

SELECT record based upon dates

Categories

Resources