SUM in Postgres not working as expected - postgresql

With the query below I would like to obtain the SUM of account_move_line.balance AS amounteur
whenever account_id, partner_id, invoice_id and account_account.code are equal.
SELECT
account_move_line.name, account_move_line.account_id,
account_move_line.partner_id, account_move_line.invoice_id,
account_move_line.journal_id,
CASE
WHEN account_account.code LIKE '40%%'
THEN '400000'
WHEN account_account.code LIKE '44%%'
THEN '440000'
ELSE account_account.code
END AS ACCOUNTGL,
CASE
WHEN account_account.code = '702000'
THEN SUM(account_move_line.balance)
ELSE (round(account_move_line.balance, 2))
END AS AMOUNTEUR
FROM
public.account_move_line
JOIN
account_account ON (account_account.id = account_move_line.account_id)
WHERE
(account_move_line.date BETWEEN '2020-03-01' AND '2020-03-31')
GROUP BY
account_move_line.account_id, account_move_line.partner_id,
account_move_line.invoice_id, account_move_line.journal_id,
account_account.code, account_move_line.balance, account_move_line.name
ORDER BY
account_move_line.account_id, account_move_line.invoice_id;
The result I get:
NAME account_id Partner_id Invoice_id J_id accountgl amounteur
"Taxe led" 186 2476 1883 1 "702000" -0.83
"Taxe eclairage" 186 2476 1883 1 "702000" -0.11
"Taxe gros et petit blanc" 186 3090 1884 1 "702000" -0.83
"Taxe eclairage" 186 2077 1885 1 "702000" 0.25
"Taxe eclairage" 186 2077 1887 1 "702000" -0.25
"Taxe eclairage" 186 2077 1888 1 "702000" -0.02
"Taxe led" 186 2481 1916 1 "702000" -0.83
"Taxe eclairage" 186 2481 1916 1 "702000" -0.52
I expected
NAME account_id Partner_id Invoice_id J_id accountgl amounteur
186 2476 1883 1 "702000" -0.94
"Taxe gros et petit blanc" 186 3090 1884 1 "702000" -0.83
"Taxe eclairage" 186 2077 1885 1 "702000" 0.25
"Taxe eclairage" 186 2077 1887 1 "702000" -0.25
"Taxe eclairage" 186 2077 1888 1 "702000" -0.02
186 2481 1916 1 "702000" -1.35
Thanks

I'm guessing, but it seems you expect the results to be grouped by account_id, partner_id, invoice_id, and perhaps journal_id. But you've told it to group by many more columns:
account_move_line.account_id,
account_move_line.partner_id,
account_move_line.invoice_id,
account_move_line.journal_id,
account_account.code,
account_move_line.balance,
account_move_line.name
For rows to be grouped together, they would have to have the same account, partner, invoice, and journal IDs, plus the same code, balance, and name.
Cut your GROUP BY back to just the four IDs.
This will mean you cannot select some columns, because a group may have several values for them. The name, for example: a group may contain several names, so no single name can be selected.
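A minimal sketch of that reduced grouping, assuming the balance should simply be summed per (account, partner, invoice, journal) group; account_account.code stays in the GROUP BY only because it feeds the CASE, and name is dropped from the select list entirely:
SELECT
    account_move_line.account_id,
    account_move_line.partner_id,
    account_move_line.invoice_id,
    account_move_line.journal_id,
    CASE
        WHEN account_account.code LIKE '40%%' THEN '400000'
        WHEN account_account.code LIKE '44%%' THEN '440000'
        ELSE account_account.code
    END AS accountgl,
    round(SUM(account_move_line.balance), 2) AS amounteur
FROM
    public.account_move_line
JOIN
    account_account ON account_account.id = account_move_line.account_id
WHERE
    account_move_line.date BETWEEN '2020-03-01' AND '2020-03-31'
GROUP BY
    account_move_line.account_id, account_move_line.partner_id,
    account_move_line.invoice_id, account_move_line.journal_id,
    account_account.code
ORDER BY
    account_move_line.account_id, account_move_line.invoice_id;
Groups with a single line come out unchanged (the sum of one balance is that balance), while multi-line groups such as invoices 1883 and 1916 collapse to one summed row, matching the expected -0.94 and -1.35.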


Problem with using two different sum()'s on same column

I'm trying to get two different counts on the same column. The first count works fine with the constraints given, but the second count is not counting correctly. I have two tables, DailyFieldRecord and AB953. DailyFieldRecord contains: DailyFieldRecordID and ActivityCodeID. The AB953 table contains: DailyFieldRecordID, ItemID, and GroupID. Count1 should return the count of the DailyFieldRecordIDs that have ActivityCodeID = 387 and GroupID = 260 and that DON'T have an ItemID in (1302, 1303, 1305, 1306). Count2 should return the count of the DailyFieldRecordIDs that have ActivityCodeID = 387 and GroupID = 260 and that DO have an ItemID in (1302, 1303, 1305, 1306). I'm trying to count only the GroupID = 260 rows for each DailyFieldRecordID that meets the above constraints.
DailyFieldRecord:
DailyFieldRecordID ActivityCodeID
657                387
888                420
672                387
AB953:
DailyFieldRecordID ItemID GroupID
657                1305   210
657                1333   260
657                1335   260
657                1302   210
657                1334   260
657                1111   111
888                1302   210
888                1336   260
672                1327   260
672                1334   260
672                1335   260
672                1322   260
672                1222   420
Expected Output:
Count1: 4
Count2: 3
Count1 is supposed to count:
672 1327 260
672 1334 260
672 1335 260
672 1322 260
Count2 is supposed to count:
657 1333 260
657 1335 260
657 1334 260
Current Count:
Count1: 4
Count2: 6
SELECT sum(CASE WHEN ex=0 THEN 1 ELSE 0 END) AS COUNT1,sum(EX) AS COUNT2
FROM AB953 ab
JOIN DailyFieldRecord dfr
ON dfr.DailyFieldRecordID = ab.DailyFieldRecordID
JOIN ( SELECT AB1.DailyFieldRecordID,sum(CASE WHEN AB1.ItemID IN
(1302,1303,1305,1306) THEN 1 ELSE 0 END) AS EX
FROM AB953 AB1
GROUP BY AB1.DailyFieldRecordID) T
ON dfr.DailyFieldRecordID = T.DailyFieldRecordID
WHERE dfr.ActivityCodeID = 387
AND ab.GroupID = 260
First, you need to identify all of the DailyFieldRecordIDs that have any of the ItemIDs specified, which is what the subquery below does. Then you can determine whether the record in the outer query belongs to Count1 or Count2 based on whether or not it exists in the result set of the subquery.
select sum(case when i.DailyFieldRecordID is null then 1 else 0 end) as Count1
, sum(case when i.DailyFieldRecordID is null then 0 else 1 end) as Count2
from AB953 as ab
inner join DailyFieldRecord as dfr on ab.DailyFieldRecordID = dfr.DailyFieldRecordID
left join (
select distinct a.DailyFieldRecordID
from AB953 as a
where a.ItemID in (1302, 1303, 1305, 1306)
) as i on ab.DailyFieldRecordID = i.DailyFieldRecordID
where dfr.ActivityCodeID = 387
and ab.GroupID = 260
Final Output:
+--------+--------+
| Count1 | Count2 |
+--------+--------+
| 4 | 3 |
+--------+--------+

Add unique rows for each group when similar group repeats after certain rows

Hi, can anyone help me get a unique group number?
I need to assign a unique number to each group of rows, even when the same group value repeats after some other groups.
I have following data:
id version product startdate enddate
123 0 2443 2010/09/01 2011/01/02
123 1 131 2011/01/03 2011/03/09
123 2 131 2011/08/10 2012/09/10
123 3 3009 2012/09/11 2014/03/31
123 4 668 2014/04/01 2014/04/30
123 5 668 2014/05/01 2016/01/01
123 6 668 2016/01/02 2017/09/08
123 7 131 2017/09/09 2017/10/10
123 8 131 2018/10/11 2019/01/01
123 9 550 2019/01/02 2099/01/01
select *,
dense_rank()over(partition by id order by id,product)
from table
Expected results:
id version product startdate enddate count
123 0 2443 2010/09/01 2011/01/02 1
123 1 131 2011/01/03 2011/03/09 2
123 2 131 2011/08/10 2012/09/10 2
123 3 3009 2012/09/11 2014/03/31 3
123 4 668 2014/04/01 2014/04/30 4
123 5 668 2014/05/01 2016/01/01 4
123 6 668 2016/01/02 2017/09/08 4
123 7 131 2017/09/09 2017/10/10 5
123 8 131 2018/10/11 2019/01/01 5
123 9 550 2019/01/02 2099/01/01 6
Try the following. It flags each row whose product differs from the previous row's product (per id, ordered by version); a running SUM of those flags, plus 1, gives the group number. A CASE-based variant for dialects without IIF follows the query below.
SELECT
id,version,product,startdate,enddate,
1+SUM(v)OVER(PARTITION BY id ORDER BY version) n
FROM
(
SELECT
*,
IIF(LAG(product)OVER(PARTITION BY id ORDER BY version)<>product,1,0) v
FROM TestTable
) q
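The same idea written with CASE instead of IIF, a minimal sketch for dialects such as PostgreSQL that do not have IIF, reusing TestTable and the column names from the answer above:
SELECT
    id, version, product, startdate, enddate,
    1 + SUM(v) OVER (PARTITION BY id ORDER BY version) AS n
FROM
(
    SELECT
        *,
        -- flag rows where product changed from the previous version within the same id
        CASE WHEN LAG(product) OVER (PARTITION BY id ORDER BY version) <> product
             THEN 1 ELSE 0 END AS v
    FROM TestTable
) q;
LAG returns NULL for the first row of each id, so that row's flag stays 0 and the leading 1 + ... makes its group number 1.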

How to match date and string from 2 lists (KDB)?

I have two lists:
data:
dt sym bid ask
2017.01.01D05:00:09.140745000 AAPL 101.20 101.30
2017.01.01D05:00:09.284281800 GOOG 801.00 802.00
2017.01.02D05:00:09.824847299 AAPL 101.30 101.40
info:
date sym shares divisor
2017.01.01 AAPL 500 2
2017.01.01 GOOG 100 1
2017.01.02 AAPL 200 2
I need to append from "info" the shares and divisor values for each ticker based on the date. How can I achieve this? Below is an example:
result:
dt sym bid ask shares divisor
2017.01.01D05:00:09.140745000 AAPL 101.20 101.30 500 2
2017.01.01D05:00:09.284281800 GOOG 801.00 802.00 100 1
2017.01.02D05:00:09.824847299 AAPL 101.30 101.40 200 2
If you are matching on an exact date then you can use lj. For this to work you will need to create a date column in the data table and key info by date and sym, like so:
(update date:`date$dt from data)lj 2!info
dt sym price date shares divisor
---------------------------------------------------------------------
2018.02.04D17:25:06.658216000 AAPL 103.9275 2018.02.04 500 2
2018.02.04D17:25:06.658216000 GOOG 105.1709 2018.02.04 100 1
2018.02.05D17:25:06.658217000 AAPL 105.1598 2018.02.05 200 2
2018.02.05D17:25:06.658217000 GOOG 104.0666 2018.02.05
You can then delete the date column from this output.
It might be useful for you to use the stepped attribute [ http://code.kx.com/q/cookbook/temporal-data/#stepped-attribute ]
This allows the info table to have missing dates and falls back to the most recent earlier date instead (so you don't need data for every sym every day). For example, without the stepped attribute:
q)data:([] dt:(10?2017.01.01+til 2)+10?.z.t;sym:10?`AAPL`GOOG;bid:100+10?5;ask:105+10?5)
q)info:([] date:2017.01.01 2017.01.01 2017.01.02;sym:`AAPL`GOOG`AAPL;shares:500 100 200;divisor:2 1 2)
q)(update date:`date$dt from data) lj 2!info
dt sym bid ask date shares divisor
--------------------------------------------------------------------
2017.01.01D04:04:03.440000000 GOOG 104 105 2017.01.01 100 1
2017.01.01D14:00:02.748000000 GOOG 104 105 2017.01.01 100 1
2017.01.02D09:34:52.869000000 GOOG 102 106 2017.01.02
2017.01.02D16:44:16.648000000 AAPL 100 107 2017.01.02 200 2
2017.01.01D08:48:23.285000000 AAPL 102 108 2017.01.01 500 2
2017.01.02D02:31:11.038000000 AAPL 104 109 2017.01.02 200 2
2017.01.01D05:50:50.463000000 GOOG 104 109 2017.01.01 100 1
2017.01.02D02:13:45.275000000 AAPL 101 107 2017.01.02 200 2
2017.01.01D10:25:30.322000000 AAPL 104 109 2017.01.01 500 2
2017.01.01D14:51:12.687000000 AAPL 103 109 2017.01.01 500 2
Note the nulls for GOOG on 2017.01.02. With stepped attribute:
q)(update date:`date$dt from data) lj `s#2!`sym xasc `sym`date xcols info
dt sym bid ask date shares divisor
--------------------------------------------------------------------
2017.01.01D04:04:03.440000000 GOOG 104 105 2017.01.01 100 1
2017.01.01D14:00:02.748000000 GOOG 104 105 2017.01.01 100 1
2017.01.02D09:34:52.869000000 GOOG 102 106 2017.01.02 100 1
2017.01.02D16:44:16.648000000 AAPL 100 107 2017.01.02 200 2
2017.01.01D08:48:23.285000000 AAPL 102 108 2017.01.01 500 2
2017.01.02D02:31:11.038000000 AAPL 104 109 2017.01.02 200 2
2017.01.01D05:50:50.463000000 GOOG 104 109 2017.01.01 100 1
2017.01.02D02:13:45.275000000 AAPL 101 107 2017.01.02 200 2
2017.01.01D10:25:30.322000000 AAPL 104 109 2017.01.01 500 2
2017.01.01D14:51:12.687000000 AAPL 103 109 2017.01.01 500 2
Here, GOOG gets the values for 2017.01.01, as there is no new value on 2017.01.02.
You could possibly use an aj as well:
q)aj[`date`sym;update date:`date$dt from data;info]
dt sym bid ask date shares divisor
--------------------------------------------------------------------
2017.01.02D07:57:14.764000000 GOOG 101 109 2017.01.02 200 2
2017.01.02D02:31:39.330000000 AAPL 100 105 2017.01.02 200 2
2017.01.02D04:25:17.604000000 AAPL 102 107 2017.01.02 200 2
2017.01.01D01:47:51.333000000 GOOG 104 106 2017.01.01 100 1
2017.01.02D15:50:12.140000000 AAPL 101 107 2017.01.02 200 2
2017.01.01D02:59:16.636000000 GOOG 102 106 2017.01.01 100 1
2017.01.01D14:35:31.860000000 AAPL 100 107 2017.01.01 500 2
2017.01.01D16:36:29.214000000 GOOG 101 108 2017.01.01 100 1
2017.01.01D14:01:18.498000000 GOOG 101 107 2017.01.01 100 1
2017.01.02D08:31:52.958000000 AAPL 102 109 2017.01.02 200 2

kdb getting float from integer division

I have a table
id, turnover, qty
and I want to query
select sum turnover, sum qty, (sum turnover) div (sum qty) by id from Table
However, the resulting value from the division seems to be an int and shows 0 (as the unit price is much smaller than 1). I tried to cast the results to a float, but that doesn't help:
select sum turnover, sum qty, `float$(`float$(sum turnover) div `float$(sum qty)) by id from Table
How can I get a float in return?
Also, as a side question, how can I name the columns (the equivalent of SQL's select sum(x) as my_column_name ...)?
That's the expected output from div; you should use % to divide numbers, which always returns a float.
q)200 div 8.5
22
q)200%8.5
23.52941
q)
References:
div: http://code.kx.com/q/ref/arith-integer/#div
%: http://code.kx.com/q/ref/arith-float/#divide
Edit: apologies, I forgot to address the rest of your question. In your example you are calculating sum turnover and sum qty twice; you will want to avoid that if you're dealing with a lot of records.
How about this? Note that the sumT:, sumQ: and toverq: prefixes in the query below are how you name result columns in q, the equivalent of SQL's AS:
q)show trade:([] id:(`$"A",'string[til 10]);turnover:10?til 10; qty:10?100+til 200)
id turnover qty
---------------
A0 4 152
A1 4 238
A2 2 298
A3 2 268
A4 7 246
A5 2 252
A6 0 279
A7 5 286
A8 7 245
A9 5 191
q)update toverq:sumT%sumQ from select sumT:sum turnover,sumQ:sum qty by id from trade
id| sumT sumQ toverq
--| ---------------------
A0| 4 152 0.02631579
A1| 4 238 0.01680672
A2| 2 298 0.006711409
A3| 2 268 0.007462687
A4| 7 246 0.02845528
A5| 2 252 0.007936508
A6| 0 279 0
A7| 5 286 0.01748252
A8| 7 245 0.02857143
A9| 5 191 0.02617801

Adding a number to a field according to some specific condition

I have the following data:
CompId PersonelNo EduId RecordsDay DateEs
1 1000 1 2 1370
1 1000 2 10 1370
1 1002 2 5 1380
1 1003 1 4 1391
1 1003 2 7 1391
I want to add (1392 - 1390 = 2) to RecordsDay for the record with the maximum EduId per PersonelNo where DateEs is less than or equal to 1390, and add (DateEs - 1390) to RecordsDay for the record with the maximum EduId per PersonelNo where DateEs is greater than 1390.
So the data would look like this:
CompId PersonelNo EduId RecordsDay DateEs
1 1000 1 2 1370 // record is unchanged because EduId is not the max for this PersonelNo
1 1000 2 12 1370 // this is the max EduId for this PersonelNo and DateEs is less than 1390, so (1392-1390) + 10 = 12
1 1002 2 7 1380 // this is the only record for this PersonelNo and DateEs is less than 1390, so (1392-1390) + 5 = 7
1 1003 1 4 1391 // record is unchanged because EduId is not the max for this PersonelNo
1 1003 2 8 1391 // this is the max EduId for this PersonelNo and DateEs is greater than 1390, so (1391-1390) + 7 = 8
I want T-SQL for this. I am working on it, but have not been able to write it so far.
You can try the following (YourTable stands in for your table's name):
SELECT CompId, PersonelNo, EduId, DateEs,
       CASE WHEN EduId = MAX(EduId) OVER (PARTITION BY PersonelNo) AND DateEs <= 1390 THEN RecordsDay + 2
            WHEN EduId = MAX(EduId) OVER (PARTITION BY PersonelNo) AND DateEs > 1390 THEN RecordsDay + (DateEs - 1390)
            ELSE RecordsDay END AS RecordsDay
FROM YourTable;
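If the intent is to persist the new values rather than only select them, a common T-SQL pattern is to update through a CTE that exposes the windowed maximum; a sketch under the same YourTable assumption:
WITH x AS (
    SELECT RecordsDay, DateEs, EduId,
           MAX(EduId) OVER (PARTITION BY PersonelNo) AS MaxEduId
    FROM YourTable
)
UPDATE x
-- only the row holding the max EduId per PersonelNo is adjusted
SET RecordsDay = RecordsDay + CASE WHEN DateEs <= 1390 THEN 2 ELSE DateEs - 1390 END
WHERE EduId = MaxEduId;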