KDB/Q add two table by rows


I have two tables with the same columns.
I want to create one table containing the rows of both tables, like
ask ask_qty exchange_name_ask bid bid_qty exchange_name_bid
0 19166.73 0.0260 b'Gate' 19164.61 0.1042 b'Gate'
1 19167.21 0.0521 b'Gate' 19164.16 0.0103 b'Gate'
2 19167.63 0.1200 b'Gate' 19163.92 0.0296 b'Gate'
3 19168.27 0.1304 b'Gate' 19162.39 0.1304 b'Gate'
AND
ask ask_qty exchange_name_ask bid bid_qty exchange_name_bid
0 19169.13 0.1200 b'CoinBase' 19159.90 0.1200 b'CoinBase'
1 19171.36 0.2608 b'CoinBase' 19158.95 0.0291 b'CoinBase'
2 19172.18 0.5215 b'CoinBase' 19158.69 0.0106 b'CoinBase'
3 19173.59 0.0102 b'CoinBase' 19157.86 0.2609 b'CoinBase'
GET
ask ask_qty exchange_name_ask bid bid_qty exchange_name_bid
0 19166.73 0.0260 b'Gate' 19164.61 0.1042 b'Gate'
1 19167.21 0.0521 b'Gate' 19164.16 0.0103 b'Gate'
2 19167.63 0.1200 b'Gate' 19163.92 0.0296 b'Gate'
3 19168.27 0.1304 b'Gate' 19162.39 0.1304 b'Gate'
4 19169.13 0.1200 b'CoinBase' 19159.90 0.1200 b'CoinBase'
5 19171.36 0.2608 b'CoinBase' 19158.95 0.0291 b'CoinBase'
6 19172.18 0.5215 b'CoinBase' 19158.69 0.0106 b'CoinBase'
7 19173.59 0.0102 b'CoinBase' 19157.86 0.2609 b'CoinBase'
Thanks

Given the tables have matching schema, just join (,) should suffice:
gateTbl,coinbaseTbl

Assume tables a and b:
a uj b
or
a,b

You can use a union join (uj) here. Because your tables are not keyed, the rows of the second table are simply appended to the first.
https://code.kx.com/q/ref/uj/ has more information on uj.
As others have suggested, you can also use table1,table2 or ,[table1;table2], since your columns and schema are the same.
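
For readers who do not have q to hand, the difference between the two operators can be sketched in Python. This is only a rough analogue, not q itself: tables are modelled as lists of row dicts, the helper names join_rows and union_join are made up for illustration, and the sample rows are a subset of the question's data.

```python
# Tables modelled as lists of row dicts (illustrative subset of the question's data).
gate_tbl = [
    {"ask": 19166.73, "ask_qty": 0.0260, "exchange_name_ask": "Gate"},
    {"ask": 19167.21, "ask_qty": 0.0521, "exchange_name_ask": "Gate"},
]
coinbase_tbl = [
    {"ask": 19169.13, "ask_qty": 0.1200, "exchange_name_ask": "CoinBase"},
    {"ask": 19171.36, "ask_qty": 0.2608, "exchange_name_ask": "CoinBase"},
]


def join_rows(a, b):
    """Rough analogue of `a,b`: append rows, insisting on the same columns."""
    if a and b and set(a[0]) != set(b[0]):
        raise ValueError("column mismatch")
    return a + b


def union_join(a, b):
    """Rough analogue of `a uj b`: union of columns, missing cells become None."""
    cols = []
    for row in a + b:
        for name in row:
            if name not in cols:
                cols.append(name)
    return [{name: row.get(name) for name in cols} for row in a + b]


combined = join_rows(gate_tbl, coinbase_tbl)
# A table with a different column set still combines under the union join.
widened = union_join(gate_tbl, [{"ask": 1.0, "venue": "X"}])
```

With identical columns both helpers return the same rows; they differ only when the column sets differ, which is the case uj is designed for.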

Related

Query to Get all Row Combinations

I want a query to retrieve all row combinations from the data set below.
This is my original data set.
SId Sequence RId
2976 1 100
4576 1 100
19472 1 100
80591 1 100
58811 1 100
70859 1 100
170941 2 100
167578 2 100
131885 2 100
117608 2 100
78117 1 101
69481 1 101
70987 2 101
46857 2 101
28396 2 101
From this data set I want a result based on RId and every combination of a Sequence 1 row with a Sequence 2 row.
So for the above case, for RId 100 there should be 24 combinations, like
the data below:
RSId Sid Sequence RId
1 2976 1 100
1 170941 2 100
2 2976 1 100
2 167578 2 100
3 2976 1 100
3 131885 2 100
The input table format is below:
CREATE TABLE #temp ( SId INT,Sequence INT,Rid INT)
INSERT into #temp values (2976,1,100)
insert into #temp values (4576,1,100)
insert into #temp values (19472,1,100)
insert into #temp values (80591,1,100)
insert into #temp values (58811,1,100)
insert into #temp values (70859,1,100)
insert into #temp values (170941,2,100)
insert into #temp values (167578,2,100)
insert into #temp values (131885,2,100)
insert into #temp values (117608,2,100)
insert into #temp values (78117,1,101)
insert into #temp values (69481,1,101)
insert into #temp values (70987,2,101)
insert into #temp values (46857,2,101)
insert into #temp values (28396,2,101)
SELECT * FROM #Temp
The result should be in the table format below:
RSId Sid Sequence RId
1 2976 1 100
1 170941 2 100
2 2976 1 100
2 167578 2 100
3 2976 1 100
3 131885 2 100
4 2976 1 100
4 117608 2 100
5 4576 1 100
5 170941 2 100
6 4576 1 100
6 167578 2 100
7 4576 1 100
7 131885 2 100
8 4576 1 100
8 117608 2 100
9 19472 1 100
9 170941 2 100
10 19472 1 100
10 167578 2 100
11 19472 1 100
11 131885 2 100
12 19472 1 100
12 117608 2 100
13 80591 1 100
13 170941 2 100
14 80591 1 100
14 167578 2 100
15 80591 1 100
15 131885 2 100
16 80591 1 100
16 117608 2 100
17 58811 1 100
17 170941 2 100
18 58811 1 100
18 167578 2 100
19 58811 1 100
19 131885 2 100
20 58811 1 100
20 117608 2 100
21 70859 1 100
21 117608 2 100
22 70859 1 100
22 170941 2 100
23 70859 1 100
23 167578 2 100
24 70859 1 100
24 131885 2 100

One way to do it is to use common table expressions, cross join and union.
It might be a bit cumbersome but it should have pretty good performance:
DECLARE @Rid int = 100;
With cte1 As
(
    -- Sequence 1 rows for the requested Rid
    SELECT SID, Sequence, Rid
    FROM #Temp
    WHERE Sequence = 1
    AND Rid = @Rid
), cte2 AS
(
    -- Sequence 2 rows for the requested Rid
    SELECT SID, Sequence, Rid
    FROM #Temp
    WHERE Sequence = 2
    AND Rid = @Rid
), cteCJ AS
(
    -- Every pairing of a Sequence 1 row with a Sequence 2 row, numbered
    SELECT Cte1.Sid As Sid1, Cte1.Sequence As Seq1, Cte1.Rid As Rid,
           Cte2.Sid As Sid2, Cte2.Sequence As Seq2,
           ROW_NUMBER() OVER(ORDER BY Cte1.Sid) As RSId
    FROM Cte1
    CROSS JOIN Cte2
)
-- Unpivot each pairing into two rows that share the same RSId
SELECT RSId, Sid1 As Sid, Seq1 As Sequence, Rid
FROM cteCJ
UNION
SELECT RSId, sid2, Seq2, Rid
FROM cteCJ
ORDER BY RSId, Seq1
Results:
RSId Sid Sequence Rid
1 2976 1 100
1 170941 2 100
2 2976 1 100
2 167578 2 100
3 2976 1 100
3 131885 2 100
4 2976 1 100
4 117608 2 100
5 4576 1 100
5 170941 2 100
6 4576 1 100
6 167578 2 100
7 4576 1 100
7 131885 2 100
8 4576 1 100
8 117608 2 100
9 19472 1 100
9 170941 2 100
10 19472 1 100
10 167578 2 100
11 19472 1 100
11 131885 2 100
12 19472 1 100
12 117608 2 100
13 58811 1 100
13 170941 2 100
14 58811 1 100
14 167578 2 100
15 58811 1 100
15 131885 2 100
16 58811 1 100
16 117608 2 100
17 70859 1 100
17 170941 2 100
18 70859 1 100
18 167578 2 100
19 70859 1 100
19 131885 2 100
20 70859 1 100
20 117608 2 100
21 80591 1 100
21 170941 2 100
22 80591 1 100
22 167578 2 100
23 80591 1 100
23 131885 2 100
24 80591 1 100
24 117608 2 100

TSQL Order BY on occasion doesn't order correctly

TSQL MSSQL 2008r2
I'm re-writing the question to try and make it clear what the issue is that I'm trying to explain.
I've got a stored proc that takes 3 parameters. VehicleKey, StartDate and EndDateTime. I'm querying a Data Warehouse db. So the data shouldn't change.
When the proc is called with the same parameters then most of the time the results will be as expected but on some random occasions, with those same parameters, the results differ. I'm querying a Data WH so the data doesn't change.
The problem is with the dynamic derived column "Island".
It's completely random. The proc can be executed 20 times and give the expected results and then the next 2 will give incorrect results.
There can be 1 or more VehicleKey/DriverKey combinations in a given date range.
This is the problem query
SELECT
A.VehicleKey
,A.NodeId
,A.DriverKey
,MIN(A.StartTrip) 'StartTrip'
,MAX(A.EndTrip) 'EndTrip'
,SUM(A.PrivOdo) 'Private'
,SUM(A.BusOdo) 'Business'
,SUM(A.TravOdo) 'Travel'
,SUM(A.PrivOdo + A.BusOdo + A.TravOdo )'Total'
FROM
(
SELECT
Island = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,NodeId
,VehicleKey
,DriverKey
,StartTrip
,EndTrip
,BusOdo
,PrivOdo
,TravOdo
FROM
#xYTD_BPTotals T
) AS A
GROUP BY
A.Island
,A.VehicleKey
,A.NodeId
,A.DriverKey
ORDER BY
A.VehicleKey
,MIN(A.StartTrip);
I am of the understanding that the ORDER BY should be on the outside of the derived table for it to take effect.
I think I've narrowed it down to the issue presenting itself only when a Vehicle has 2 or more DriverKey combinations.
For example, with parameters VehicleKey = 4865, StartDateTime = '2016-01-01', EndDateTime = '2016-10-31':
This is the correct result - including Island column
VehicleKey NodeId DriverKey Island StartTrip EndTrip Private Business Travel Total_
4865 458 0 0 2016-09-06 14:06:08 2016-09-28 17:02:08 54.75 737.83 0 792.58
4865 458 1202 134 2016-09-29 11:10:04 2016-09-30 17:25:51 0 211.32 0 211.32
4865 458 0 27 2016-10-03 07:39:25 2016-10-14 17:00:15 0 579.81 0 579.81
and this is when it's wrong. Parameters VehicleKey 4865, StartDateTime = '2016-01-01', EndDateTime = '2016-10-31'
- including Island column
The first two rows here should be combined.
VehicleKey NodeId DriverKey Island StartTrip EndTrip Private Business Travel Total_
4865 458 0 98 2016-09-06 14:06:08 2016-09-21 09:15:49 0 313.87 0 313.87
4865 458 0 -63 2016-09-21 09:21:10 2016-09-28 17:02:08 54.75 423.96 0 478.71
4865 458 1202 71 2016-09-29 11:10:04 2016-09-30 17:25:51 0 211.32 0 211.32
4865 458 0 27 2016-10-03 07:39:25 2016-10-14 17:00:15 0 579.81 0 579.81
If I show the first few rows from the derived table, I've broken down the "Island" column
SELECT
Island = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,Island_x =( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) )
,Island_y = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,NodeId
,VehicleKey
,DriverKey
,StartTrip
,EndTrip
,BusOdo
,PrivOdo
,TravOdo
FROM
#xYTD_BPTotals T
The correct result should be
Island Island_x Island_y NodeId VehicleKey DriverKey StartTrip EndTrip BusOdo PrivOdo TravOdo
0 1 1 24901 4865 0 2016-09-06 14:06:08 2016-09-06 14:08:50 0 0 0
0 2 2 24901 4865 0 2016-09-06 15:39:14 2016-09-06 15:40:53 114 0 0
0 3 3 24901 4865 0 2016-09-08 11:06:43 2016-09-08 11:07:23 0 0 0
0 4 4 24901 4865 0 2016-09-08 11:12:03 2016-09-08 11:12:26 20 0 0
0 5 5 24901 4865 0 2016-09-08 11:19:20 2016-09-08 11:19:52 1 0 0
0 6 6 24901 4865 0 2016-09-08 11:26:58 2016-09-08 11:27:56 88 0 0
0 7 7 24901 4865 0 2016-09-08 11:33:40 2016-09-08 11:35:02 1 0 0
0 8 8 24901 4865 0 2016-09-12 09:08:53 2016-09-12 09:10:42 34 0 0
but I sometimes get this with the same input parameters.
Island Island_x Island_y NodeId VehicleKey DriverKey StartTrip EndTrip BusOdo PrivOdo TravOdo
98 1 1 24901 4865 0 2016-09-06 14:06:08 2016-09-06 14:08:50 0 0 0
98 2 2 24901 4865 0 2016-09-06 15:39:14 2016-09-06 15:40:53 114 0 0
98 3 3 24901 4865 0 2016-09-08 11:06:43 2016-09-08 11:07:23 0 0 0
98 4 4 24901 4865 0 2016-09-08 11:12:03 2016-09-08 11:12:26 20 0 0
98 5 5 24901 4865 0 2016-09-08 11:19:20 2016-09-08 11:19:52 1 0 0
98 6 6 24901 4865 0 2016-09-08 11:26:58 2016-09-08 11:27:56 88 0 0
98 7 7 24901 4865 0 2016-09-08 11:33:40 2016-09-08 11:35:02 1 0 0
98 8 8 24901 4865 0 2016-09-12 09:08:53 2016-09-12 09:10:42 34 0 0
Why is the "Island" calculated column wrong? 1-1 = 0 not 98.
Where am I going wrong?

EDIT - #YourData now looks like your raw table
Declare @YourTable table (VehicleKey int,NodeId int,DriverKey int,StartTrip datetime,EndTrip datetime,PrivOdo decimal(10,2),BusOdo decimal(10,2), TravOdo decimal(10,2))
Insert Into @YourTable values
(4865,458,0 ,'2016-09-06 14:06:08','2016-09-21 09:15:49',0 ,313.87,0),
(4865,458,0 ,'2016-09-21 09:21:10','2016-09-28 17:02:08',54.75,423.96,0),
(4865,458,1202,'2016-09-29 11:10:04','2016-09-30 17:25:51',0 ,211.32,0),
(4865,458,0 ,'2016-10-03 07:39:25','2016-10-14 17:00:15',0 ,579.81,0)
Select VehicleKey
,NodeID
,VehicleKey
,DriverKey
,StartTrip = min(StartTrip)
,EndTrip = max(EndTrip)
,Private = sum(PrivOdo)
,Business = sum(BusOdo)
,Travel = sum(TravOdo)
,Total = sum(PrivOdo + BusOdo + TravOdo )
From (
Select Island = ( ROW_NUMBER() OVER (PARTITION BY VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY VehicleKey, DriverKey ORDER BY StartTrip) )
,*
From @YourTable
) A
Group By Island,VehicleKey,NodeID,VehicleKey,DriverKey
Order By min(StartTrip)
Returns
FYI - The sub-query produces
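
A note on the Island expression itself: it is the usual gaps-and-islands trick, where a row number over the whole vehicle minus a row number per driver stays constant across a run of consecutive trips by the same driver. That only holds when both row numbers follow the same unambiguous order. ORDER BY MONTH(StartTrip) ties every trip within a month, and ROW_NUMBER may break such ties differently from one execution to the next, which is a plausible explanation for the intermittent results. The Python below is a minimal sketch of the technique with both counters ordered by the full StartTrip; the function name islands is hypothetical and the trip list is illustrative, not the poster's real data.

```python
from collections import defaultdict

# Illustrative trips for a single vehicle: (DriverKey, StartTrip).
# Some timestamps are borrowed from the question, the rest are made up.
trips = [
    (0, "2016-09-06 14:06:08"),
    (0, "2016-09-08 11:06:43"),
    (0, "2016-09-21 09:21:10"),
    (1202, "2016-09-29 11:10:04"),
    (1202, "2016-09-30 08:00:00"),
    (0, "2016-10-03 07:39:25"),
    (0, "2016-10-05 09:00:00"),
]


def islands(trips):
    """Group consecutive trips by the same driver (gaps-and-islands).

    island = (row number over all trips) - (row number within the driver),
    with both counters following the same order: the full StartTrip.
    """
    ordered = sorted(trips, key=lambda t: t[1])  # ISO strings sort by time
    per_driver = defaultdict(int)
    groups = defaultdict(list)
    for rn_all, (driver, start) in enumerate(ordered, start=1):
        per_driver[driver] += 1
        island = rn_all - per_driver[driver]
        groups[(driver, island)].append(start)
    # One output row per island: first start, last start, driver.
    return sorted((min(s), max(s), d) for (d, _), s in groups.items())


result = islands(trips)
```

In T-SQL terms this corresponds to using ORDER BY StartTrip in both ROW_NUMBER() calls instead of MONTH(StartTrip) in the first.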

Postgresql: Select unique rows from two tables

I have these two tables with values. I need to combine all unique values into one table. So the result must be:
reffnum leftb rightb desc date
tes1 1 0 Tes 1 14/10/2016
tes 1 10 0 Tes siji 14/10/2016
tes2 0 12 Tes nomor 2 14/10/2016
tes 3 0 1002 Data baru 15/10/2016
tes1 0 11 Tes 1 baru 15/10/2016
tes1 0 123 Tes 123 15/10/2016
Please help, thanks in advance
Table t1:
reffnum leftb rightb desc timestamp
tes1 1 0 Tes 1 2016-10-12 13:47:06.945581
tes1 1 0 Tes siji 2016-10-12 13:47:06.921685
tes 1 10 0 Tes siji 2016-10-03 14:55:32.126814
tes2 0 12 Tes nomor 2 2016-10-03 14:55:32.11081
tes 3 0 1002 Data baru 2016-10-03 14:55:32.094884
tes1 0 11 Tes 1 baru 2016-10-03 14:55:32.078833
And this t2:
reffnum leftb righb desc date
tes1 1 0 Tes 1 2016-10-03 14:49:15.817506
tes1 1 0 Tes siji 2016-10-03 14:33:40.285849
tes 1 10 0 Tes siji 2016-10-03 14:33:40.269887
tes2 0 12 Tes nomor 2 2016-10-03 14:30:57.376459
tes1 0 123 Tes 123 2016-10-03 14:33:40.285849
tes2 0 12 Tes no2 2016-10-03 14:33:40.269887
Edited:
This is the closest I can get:
1. Find the unique values in t2 that are not in t1: select * from t2 except select * from t1
2. Insert the values from step 1 into t1
But now the problem is that the query in step 1 throws an error:
[Err] ERROR: EXCEPT types smallint and timestamp without time zone cannot be matched

The union operator removes duplicates, so you can use a pretty straight-forward query:
SELECT * FROM table1
UNION
SELECT * FROM table2
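
Two details in the question can get in the way of SELECT *: the type error suggests the columns of the two tables do not line up positionally, and the timestamps differ between otherwise identical rows, so they would never count as duplicates. A minimal sketch of the workaround is to name the columns explicitly and leave the timestamp out. It runs on SQLite through Python's sqlite3 module rather than on PostgreSQL; the column order of t2, the column name descr (standing in for desc, a reserved word) and the abbreviated rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# t2 deliberately declares its columns in a different order (an assumption).
conn.executescript("""
CREATE TABLE t1 (reffnum TEXT, leftb INTEGER, rightb INTEGER, descr TEXT, ts TEXT);
CREATE TABLE t2 (reffnum TEXT, descr TEXT, leftb INTEGER, rightb INTEGER, ts TEXT);
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?)", [
    ("tes1", 1, 0, "Tes 1", "2016-10-12 13:47:06"),
    ("tes2", 0, 12, "Tes nomor 2", "2016-10-03 14:55:32"),
    ("tes1", 0, 11, "Tes 1 baru", "2016-10-03 14:55:32"),
])
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?, ?, ?)", [
    ("tes1", "Tes 1", 1, 0, "2016-10-03 14:49:15"),
    ("tes2", "Tes nomor 2", 0, 12, "2016-10-03 14:30:57"),
    ("tes1", "Tes 123", 0, 123, "2016-10-03 14:33:40"),
])

# Name the columns so both sides line up, and omit the timestamp so that
# rows differing only in time are treated as duplicates by UNION.
unique_rows = conn.execute("""
SELECT reffnum, leftb, rightb, descr FROM t1
UNION
SELECT reffnum, leftb, rightb, descr FROM t2
ORDER BY reffnum, leftb, rightb, descr
""").fetchall()
```

The same SELECT ... UNION SELECT ... statement with explicit column lists should carry over to PostgreSQL, with desc quoted if that is the real column name.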

What's wrong with this Partition By

I have a query that uses a partition by over a time column, however the result is a bit unexpected, what's wrong here ? Why do I get more than one 1 on RN ? (one for 21:00:02:100 and the other for 21:00:02:600)
SELECT TOP 500
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST([Time] AS Time(0))
ORDER BY [DATE] ASC, CAST([Time] AS Time(0)) ASC
) RN,
[DATE],
[Time]
FROM [DB]..[TABLE]
ORDER BY [Date] ASC,
[Time] ASC,
[RN] ASC
Results:
**1 2010-10-03 21:00:02.100**
2 2010-10-03 21:00:02.100
3 2010-10-03 21:00:02.200
4 2010-10-03 21:00:02.200
5 2010-10-03 21:00:02.200
4 2010-10-03 21:00:02.500
**1 2010-10-03 21:00:02.600**
2 2010-10-03 21:00:02.600
3 2010-10-03 21:00:02.600
5 2010-10-03 21:00:02.700
6 2010-10-03 21:00:02.700
7 2010-10-03 21:00:02.700
8 2010-10-03 21:00:02.700
9 2010-10-03 21:00:02.700
10 2010-10-03 21:00:02.700
11 2010-10-03 21:00:02.700
12 2010-10-03 21:00:02.700
13 2010-10-03 21:00:02.700
14 2010-10-03 21:00:02.700
15 2010-10-03 21:00:02.700
16 2010-10-03 21:00:02.700
17 2010-10-03 21:00:02.700
18 2010-10-03 21:00:02.700
19 2010-10-03 21:00:02.700
20 2010-10-03 21:00:02.700
21 2010-10-03 21:00:02.700
22 2010-10-03 21:00:02.700

You are casting to time(0) for your ordering, which rounds (truncates?) the time to second precision. It is working exactly as advertised...
Edit:
It makes no sense to have the same PARTITION BY and ORDER BY...
My guess is that you are trying to partition by the second, and want rows numbers in that interval
Try this:
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST([Time] AS Time(0))
ORDER BY [DATE], [Time]
) RN
If you get duplicate row numbers crossing the 0.5 second boundary, use this to force truncation rather than rounding:
ROW_NUMBER() OVER(
PARTITION BY [Date], CAST([Time] - '00:00:00.5000' AS Time(0))
ORDER BY [DATE], [Time]
) RN

Thanks a lot for your feedback. It turns out that the cast rounds the value, which is why it was not working (giving me the 1 twice). Subtracting from [Time] didn't work for me; I got an error. In the end I used this code to get it working as wanted:
ROW_NUMBER() OVER(
PARTITION BY CONVERT(nvarchar(8), [Time], 8)
ORDER BY [Date], [Time]) RN
FROM [DB]..[TABLE]
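
The effect of rounding versus truncating the partition key can be reproduced with a few of the values from the question. This Python sketch is only an illustration of the arithmetic, not of SQL Server's cast: times are held as integer milliseconds within the minute, and the helper names are made up.

```python
from collections import defaultdict

# Milliseconds within the minute, taken from the question's output
# (21:00:02.100 becomes 2100, and so on).
times_ms = [2100, 2100, 2200, 2500, 2600, 2600, 2700]


def round_to_second(ms):
    """Round half up to whole seconds, as the cast appeared to do."""
    return (ms + 500) // 1000


def truncate_to_second(ms):
    """Drop the fractional part, which is what the poster wanted."""
    return ms // 1000


def row_numbers(times, key):
    """Number rows within each partition given by key, ordered by time."""
    counters = defaultdict(int)
    numbered = []
    for t in sorted(times):
        partition = key(t)
        counters[partition] += 1
        numbered.append((counters[partition], t))
    return numbered


rounded = row_numbers(times_ms, round_to_second)
truncated = row_numbers(times_ms, truncate_to_second)
```

With rounding, the values from 2500 onward fall into the next second and the numbering restarts at 1; with truncation, all seven rows share one partition.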

SQL Server "Group By" for an interesting case

I have a table like this:
ID ORDER TEAM TIME
IL-1 1 A_Team 11
IL-1 2 A_Team 3
IL-1 3 B_Team 2
IL-1 4 A_Team 1
IL-1 5 A_Team 1
IL-2 1 A_Team 5
IL-2 2 C_Team 3
What I want is to group rows that have the same team name and are also consecutive (according to the ORDER column).
So the result table should look like
IL-1 1 A_Team 14
IL-1 2 B_Team 2
IL-1 3 A_Team 2
IL-2 1 A_Team 5
IL-2 2 C_Team 3
Thanks
Edit: Based on nang's answer, I added the ID column to my table.

There is a problem in your example. Why should rows #6 and #2 not be "sequential teams"?
1 A_Team 5
2 A_Team 3
However, maybe the following is useful for you:
select neworder, name, sum([time]) from (
select min(n1.[order]) neworder, n2.[order], n2.name, n2.[time]
from mytable n1, mytable n2
where n1.Name = n2.Name
and n2.[order] >= n1.[order]
and not exists(select 1 from mytable n3 where n3.name != n1.name and n3.[order] > n1.[order] and n3.[order] < n2.[order])
group by n2.[order], n2.name, n2.[time]) x
group by neworder, name
Result:
neworder name (No column name)
1 A_Team 19
4 A_Team 2
3 B_Team 2
2 C_Team 3
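
Since the question was later edited to add the ID column, here is a minimal Python sketch of the grouping it asks for: consecutive rows with the same team are collapsed within each ID and renumbered. It illustrates the desired logic only and is not a rewrite of the query above; the function name collapse_runs is hypothetical.

```python
from itertools import groupby

# (ID, ORDER, TEAM, TIME) rows copied from the question.
rows = [
    ("IL-1", 1, "A_Team", 11),
    ("IL-1", 2, "A_Team", 3),
    ("IL-1", 3, "B_Team", 2),
    ("IL-1", 4, "A_Team", 1),
    ("IL-1", 5, "A_Team", 1),
    ("IL-2", 1, "A_Team", 5),
    ("IL-2", 2, "C_Team", 3),
]


def collapse_runs(rows):
    """Sum TIME over runs of consecutive rows with the same TEAM, per ID."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = []
    for id_, id_rows in groupby(ordered, key=lambda r: r[0]):
        # groupby only merges adjacent equal keys, so a team that reappears
        # later in the same ID starts a new run.
        runs = groupby(id_rows, key=lambda r: r[2])
        for new_order, (team, run) in enumerate(runs, start=1):
            result.append((id_, new_order, team, sum(r[3] for r in run)))
    return result


collapsed = collapse_runs(rows)
```

The output matches the result table in the question, including the two separate A_Team runs for IL-1.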
