T-SQL: counting particular values in one row with multiple columns

I have a small problem counting the cells that hold a particular value within one row in SQL Server (SSMS).
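The table looks like this:
ID | Month | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | ... | 31
5000 | 1 | null | null | 1 | 1 | null | 1 | 1 | null | null | 2 | 2 | 2 | 2 | 2 | null | null | 3 | 3 | 3 | 3 | 3 | null | ... | 1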
I need to count how many cells in one row hold a given value, for example 1. In this case it would be 5.
The data represents worker shifts in a month. Be aware that there is a column named Month (an FK with values 1-12); I don't want to count it in the result.
The ID column is ALWAYS a 4-digit number.
One possibility is COUNT(CASE WHEN ...), but the examples I found cover only two or three columns, not 31, so the statement would be very long. Is there any other way to count this?
Thanks for any advice.

I'm going to strongly suggest that you abandon your current table design and instead store each day of the month in its own record rather than its own column. That is, use this design:
ID | Date | Value
5000 | 2021-01-01 | NULL
5000 | 2021-01-02 | NULL
5000 | 2021-01-03 | 1
5000 | 2021-01-04 | 1
5000 | 2021-01-05 | NULL
...
5000 | 2021-01-31 | 5
Then use this query:
SELECT
    ID,
    CONVERT(varchar(7), Date, 120) AS year_month,  -- style 120 truncated to yyyy-mm
    COUNT(CASE WHEN Value = 1 THEN 1 END) AS one_cnt
FROM yourTable
GROUP BY
    ID,
    CONVERT(varchar(7), Date, 120);
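
To try the one-row-per-day design quickly, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server (an assumption), so strftime('%Y-%m', ...) takes the place of CONVERT(varchar(7), Date, 120); the table name shifts and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical normalized table: one row per worker per day.
conn.execute("CREATE TABLE shifts (id INTEGER, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO shifts VALUES (?, ?, ?)",
    [
        (5000, "2021-01-01", None),
        (5000, "2021-01-02", None),
        (5000, "2021-01-03", 1),
        (5000, "2021-01-04", 1),
        (5000, "2021-01-05", None),
        (5000, "2021-01-06", 2),
        (5000, "2021-02-01", 1),
    ],
)

# COUNT ignores NULLs, so the CASE expression counts only rows where value = 1.
one_counts = conn.execute(
    """
    SELECT id,
           strftime('%Y-%m', date) AS year_month,
           COUNT(CASE WHEN value = 1 THEN 1 END) AS one_cnt
    FROM shifts
    GROUP BY id, strftime('%Y-%m', date)
    ORDER BY id, year_month
    """
).fetchall()

for row in one_counts:
    print(row)
```

With this layout the query stays the same length no matter how many days a month has, which is the main reason for preferring it over 31 columns.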

Related

Select previous different value PostgreSQL

I have a table:
id | date | value
1 | 2022-01-01 | 1
1 | 2022-01-02 | 1
1 | 2022-01-03 | 2
1 | 2022-01-04 | 2
1 | 2022-01-05 | 3
1 | 2022-01-06 | 3
I want to detect changes in the value column, ordered by date, so that each row shows the previous different value:
id | date | value | diff
1 | 2022-01-01 | 1 | null
1 | 2022-01-02 | 1 | null
1 | 2022-01-03 | 2 | 1
1 | 2022-01-04 | 2 | 1
1 | 2022-01-05 | 3 | 2
1 | 2022-01-06 | 3 | 2
I tried the window function lag(), but all I got was:
id | date | value | diff
1 | 2022-01-01 | 1 | null
1 | 2022-01-02 | 1 | 1
1 | 2022-01-03 | 2 | 1
1 | 2022-01-04 | 2 | 2
1 | 2022-01-05 | 3 | 2
1 | 2022-01-06 | 3 | 3
I am pretty sure you need a gaps-and-islands approach to "group" your changes.
There may be a more concise way to get the result you want, but this is how I would solve this:
with changes as ( -- mark the changes and lag values
    select id, date, value,
           coalesce((value != lag(value) over w)::int, 1) as changed_flag,
           lag(value) over w as last_value
    from a_table
    window w as (partition by id order by date)
), groupnums as ( -- number the groups, carrying the lag values forward
    select id, date, value,
           sum(changed_flag) over w as group_num,
           last_value
    from changes
    window w as (partition by id order by date)
) -- final query that uses group numbering to return the correct lag value
select id, date, value,
       first_value(last_value) over (partition by id, group_num
                                     order by date) as diff
from groupnums;
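
For anyone who wants to step through the gaps-and-islands idea without a PostgreSQL server, this is a rough sketch of the same three stages run on in-memory SQLite from Python. SQLite is an assumption here, not the asker's database: it has no ::int cast, but its comparisons already yield 0, 1 or NULL, so the cast is simply dropped. The sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_table (id INTEGER, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO a_table VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", 1),
        (1, "2022-01-02", 1),
        (1, "2022-01-03", 2),
        (1, "2022-01-04", 2),
        (1, "2022-01-05", 3),
        (1, "2022-01-06", 3),
    ],
)

query = """
WITH changes AS (      -- flag rows whose value differs from the previous row
    SELECT id, date, value,
           COALESCE(value != LAG(value) OVER w, 1) AS changed_flag,
           LAG(value) OVER w AS last_value
    FROM a_table
    WINDOW w AS (PARTITION BY id ORDER BY date)
), groupnums AS (      -- a running sum of the flags numbers each island
    SELECT id, date, value, last_value,
           SUM(changed_flag) OVER (PARTITION BY id ORDER BY date) AS group_num
    FROM changes
)
-- the first row of each island carries the previous different value
SELECT id, date, value,
       FIRST_VALUE(last_value) OVER (PARTITION BY id, group_num
                                     ORDER BY date) AS diff
FROM groupnums
ORDER BY id, date
"""
diffs = conn.execute(query).fetchall()
for row in diffs:
    print(row)
```

The diff column comes back as NULL, NULL, 1, 1, 2, 2 for the six sample rows, which is the output the question asks for.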

Pandas's `pct_change()` equivalent in postgres

Let's assume I have a table like this:
id | date | value
1 | 2021-04-05 | 100
1 | 2021-04-04 | 50
1 | 2021-04-03 | 25
1 | 2021-04-02 | 5
2 | 2021-04-05 | 80
2 | 2021-04-04 | 20
2 | 2021-04-03 | 15
2 | 2021-04-02 | 10
I need to add another column that groups by id and calculates a day-over-day percent change from the value with the date before it. So for this example it would look like this:
id | date | value | pct_change
1 | 2021-04-05 | 100 | 100
1 | 2021-04-04 | 50 | 100
1 | 2021-04-03 | 25 | 400
1 | 2021-04-02 | 5 | NaN
2 | 2021-04-05 | 80 | 300
2 | 2021-04-04 | 20 | 33.33
2 | 2021-04-03 | 15 | 50
2 | 2021-04-02 | 10 | NaN
In Python this would be easy; I could do something like this:
df['pct_change'] = df.groupby('id').value.pct_change() * 100
But if I wanted to do this in the Postgres database call, I'd suddenly implode with stupidity... does anybody know how to do this?
Maybe something like this?
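SELECT
    id,
    date,
    value,
    -- 100.0 turns the ratio into a percentage and avoids integer division;
    -- the first row per id has no previous value, so the result is NULL
    100.0 * (value - prev_value) / prev_value AS pct_change
FROM
(
    SELECT
        id,
        date,
        value,
        -- LAG ignores the window frame, so no ROWS clause is needed
        LAG(value) OVER (PARTITION BY id ORDER BY date) AS prev_value
    FROM
        your_table
) AS t  -- the subquery alias is required before PostgreSQL 16
ORDER BY date, id;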
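
As a quick sanity check of the LAG idea against the numbers in the question, here is a small sketch on in-memory SQLite from Python (SQLite is an assumption; the asker uses Postgres). It multiplies by 100.0 to get a percentage and to force non-integer division, and rounds to two decimals only for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, "2021-04-05", 100), (1, "2021-04-04", 50),
        (1, "2021-04-03", 25), (1, "2021-04-02", 5),
        (2, "2021-04-05", 80), (2, "2021-04-04", 20),
        (2, "2021-04-03", 15), (2, "2021-04-02", 10),
    ],
)

pct_rows = conn.execute(
    """
    SELECT id, date, value,
           ROUND(100.0 * (value - prev_value) / prev_value, 2) AS pct_change
    FROM (
        SELECT id, date, value,
               -- previous day's value within the same id
               LAG(value) OVER (PARTITION BY id ORDER BY date) AS prev_value
        FROM your_table
    ) AS t
    ORDER BY id, date DESC
    """
).fetchall()

for row in pct_rows:
    print(row)
```

The earliest row of each id has no predecessor, so its pct_change is NULL (None in Python) where pandas would show NaN.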

Select the first rows closest to a date

Given this table x:
pk - uid | post_id | date - timestamp | likes
1 | 1 | 01.01.2020 | 1
2 | 1 | 01.01.2021 | 5
3 | 2 | 01.01.2020 | 1
4 | 4 | 01.01.2021 | 3
5 | 4 | 01.01.2022 | 5
6 | 4 | 01.01.2023 | 10
Using two dates (range):
const [start_date, end_date] = ['01.01.2021', '01.01.2022']
Get rows in time range (one for each post_id), closest to start_date
pk - uid | post_id | date - timestamp | likes
2 | 1 | 01.01.2021 | 5
4 | 4 | 01.01.2021 | 3
and end_date (separate query):
pk - uid | post_id | date - timestamp | likes
2 | 1 | 01.01.2021 | 5
3 | 2 | 01.01.2020 | 1
5 | 4 | 01.01.2022 | 5
I was trying to do it like this, but got duplicate post_ids:
SELECT uid, post_id, like
FROM x
WHERE date <= ${end_date}
GROUP BY uid, post_uid
ORDER BY date DESC
Bonus question: I can do this in JS with the two result arrays above, but maybe I can get a result that is the difference between the likes of the end_date rows and those of the start_date rows:
pk - uid | post_id | likes
2 | 1 | 0
4 | 4 | 2
You can use window functions first_value() and last_value() to accomplish this:
select distinct first_value(uid) over w as uid, post_id,
       first_value(likes) over w as oldest_likes,
       last_value(likes) over w as newest_likes,
       last_value(likes) over w - first_value(likes) over w as likes
from x
where ddate between '2021-01-01' and '2022-01-01'
window w as (partition by post_id
             order by ddate
             rows between unbounded preceding and unbounded following)
;
uid | post_id | oldest_likes | newest_likes | likes
-----+---------+--------------+--------------+-------
2 | 1 | 5 | 5 | 0
4 | 4 | 3 | 5 | 2
(2 rows)
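
The same first_value()/last_value() pattern can be tried from Python on in-memory SQLite, as sketched below. This is an illustration under assumptions: SQLite replaces the asker's database, the date column is named ddate and stored as ISO text as in the query above, and the range bounds are passed as bound parameters rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x (uid INTEGER PRIMARY KEY, post_id INTEGER, ddate TEXT, likes INTEGER)"
)
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2020-01-01", 1),
        (2, 1, "2021-01-01", 5),
        (3, 2, "2020-01-01", 1),
        (4, 4, "2021-01-01", 3),
        (5, 4, "2022-01-01", 5),
        (6, 4, "2023-01-01", 10),
    ],
)

start_date, end_date = "2021-01-01", "2022-01-01"

# The explicit frame makes last_value() see the whole partition,
# not just the rows up to the current one.
likes_delta = conn.execute(
    """
    SELECT DISTINCT first_value(uid) OVER w AS uid, post_id,
           first_value(likes) OVER w AS oldest_likes,
           last_value(likes) OVER w AS newest_likes,
           last_value(likes) OVER w - first_value(likes) OVER w AS likes
    FROM x
    WHERE ddate BETWEEN ? AND ?
    WINDOW w AS (PARTITION BY post_id
                 ORDER BY ddate
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    ORDER BY post_id
    """,
    (start_date, end_date),
).fetchall()

for row in likes_delta:
    print(row)
```

Binding the dates as parameters also avoids the SQL injection risk that comes with building the query from a template string.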

Getting data from alternate dates of same ID column

I have table data as below. Within the same code, I need to fetch the records where (Value2-Value1)*2 of one row >= (Value2-Value1) of the row for the consecutive date. (All dates are uniform within all codes.)
---------------------------------------
code Date Value1 Value2
---------------------------------------
1 1-1-2018 13 14
1 2-1-2018 14 16
1 4-1-2018 15 18
2 1-1-2019 1 3
2 2-1-2018 2 3
2 4-1-2018 3 7
Example: the output needs to be
1 1-1-2018 13 14
As I am a beginner at SQL coding, I tried my best, but I cannot get the comparison to apply only to consecutive dates.
Use a self join.
You can specify all the conditions you've listed in the ON clause:
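SELECT T0.code, T0.Date, T0.Value1, T0.Value2
FROM [Table] AS T0  -- placeholder name; TABLE is a reserved word, hence the brackets
JOIN [Table] AS T1
    ON T0.code = T1.code
    AND T1.Date = DateAdd(Day, 1, T0.Date)  -- T1 is the row for the following day
    AND (T0.Value2 - T0.Value1) * 2 >= T1.Value2 - T1.Value1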
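
A small sketch of the self join, run on in-memory SQLite from Python, may help check it against the sample data. Several things here are assumptions: SQLite stands in for SQL Server, so date(..., '+1 day') replaces DateAdd; the table name readings is hypothetical; the sample dates are read as day-month-year and stored as ISO text; and T1 is taken to be the row one day after T0, which is the reading that reproduces the expected output row in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (code INTEGER, date TEXT, value1 INTEGER, value2 INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, "2018-01-01", 13, 14),
        (1, "2018-01-02", 14, 16),
        (1, "2018-01-04", 15, 18),
        (2, "2019-01-01", 1, 3),
        (2, "2018-01-02", 2, 3),
        (2, "2018-01-04", 3, 7),
    ],
)

# T0 is the candidate row, T1 the row for the next calendar day in the same code.
matches = conn.execute(
    """
    SELECT T0.code, T0.date, T0.value1, T0.value2
    FROM readings AS T0
    JOIN readings AS T1
      ON T0.code = T1.code
     AND T1.date = date(T0.date, '+1 day')
     AND (T0.value2 - T0.value1) * 2 >= T1.value2 - T1.value1
    ORDER BY T0.code, T0.date
    """
).fetchall()

for row in matches:
    print(row)
```

Rows without a neighbour on the following day never find a join partner, so they drop out without any extra filtering.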

T-SQL Determine Status Changes in History Table

I have an application which logs changes to records in the "production" table to a "history" table. The history table is basically a field for field copy of the production table, with a few extra columns like last modified date, last modified by user, etc.
This works well because we get a snapshot of the record anytime the record changes. However, it makes it hard to determine unique status changes to a record. An example is below.
BoxID StatusID SubStatusID ModifiedTime
1 4 27 2011-08-11 15:31
1 4 11 2011-08-11 15:28
1 4 11 2011-08-10 09:07
1 5 14 2011-08-09 08:53
1 5 14 2011-08-09 08:19
1 4 11 2011-08-08 14:15
1 4 9 2011-07-27 15:52
1 4 9 2011-07-27 15:49
1 2 8 2011-07-26 12:00
As you can see in the above table (data comes from the real system with other fields removed for brevity and security) BoxID 1 has had 9 changes to the production record. Some of those updates resulted in statuses being changed and some did not, which means other fields (those not shown) have changed.
I need to be able, in TSQL, to extract from this data the unique status changes. The output I am looking for, given the above input table, is below.
BoxID StatusID SubStatusID ModifiedTime
1 4 27 2011-08-11 15:31
1 4 11 2011-08-10 09:07
1 5 14 2011-08-09 08:19
1 4 11 2011-08-08 14:15
1 4 9 2011-07-27 15:49
1 2 8 2011-07-26 12:00
This is not as easy as grouping by StatusID and SubStatusID and taking the min(ModifiedTime) then joining back into the history table since statuses can go backwards as well (see StatusID 4, SubStatusID 11 gets set twice).
Any help would be greatly appreciated!
Does this work for you?
;WITH Boxes_CTE AS
(
    SELECT Boxid, StatusID, SubStatusID, ModifiedTime,
           ROW_NUMBER() OVER (PARTITION BY Boxid ORDER BY ModifiedTime) AS SEQUENCE
    FROM Boxes
)
SELECT b1.Boxid, b1.StatusID, b1.SubStatusID, b1.ModifiedTime
FROM Boxes_CTE b1
LEFT OUTER JOIN Boxes_CTE b2 ON b1.Boxid = b2.Boxid
    AND b1.Sequence = b2.Sequence + 1
WHERE b1.StatusID <> b2.StatusID
    OR b1.SubStatusID <> b2.SubStatusID
    OR b2.StatusID IS NULL
ORDER BY b1.ModifiedTime DESC
;
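
To see what the ROW_NUMBER self join returns for the sample history, here is a sketch on in-memory SQLite from Python. SQLite is an assumption standing in for SQL Server, the alias seq replaces SEQUENCE, and the rows are the nine shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Boxes (BoxID INTEGER, StatusID INTEGER, SubStatusID INTEGER, ModifiedTime TEXT)"
)
conn.executemany(
    "INSERT INTO Boxes VALUES (?, ?, ?, ?)",
    [
        (1, 4, 27, "2011-08-11 15:31"),
        (1, 4, 11, "2011-08-11 15:28"),
        (1, 4, 11, "2011-08-10 09:07"),
        (1, 5, 14, "2011-08-09 08:53"),
        (1, 5, 14, "2011-08-09 08:19"),
        (1, 4, 11, "2011-08-08 14:15"),
        (1, 4, 9, "2011-07-27 15:52"),
        (1, 4, 9, "2011-07-27 15:49"),
        (1, 2, 8, "2011-07-26 12:00"),
    ],
)

# Number the rows per box in time order, then compare each row with its predecessor.
status_changes = conn.execute(
    """
    WITH Boxes_CTE AS (
        SELECT BoxID, StatusID, SubStatusID, ModifiedTime,
               ROW_NUMBER() OVER (PARTITION BY BoxID ORDER BY ModifiedTime) AS seq
        FROM Boxes
    )
    SELECT b1.BoxID, b1.StatusID, b1.SubStatusID, b1.ModifiedTime
    FROM Boxes_CTE AS b1
    LEFT OUTER JOIN Boxes_CTE AS b2
      ON b1.BoxID = b2.BoxID AND b1.seq = b2.seq + 1
    WHERE b1.StatusID <> b2.StatusID
       OR b1.SubStatusID <> b2.SubStatusID
       OR b2.StatusID IS NULL      -- first row of a box has no predecessor
    ORDER BY b1.ModifiedTime DESC
    """
).fetchall()

for row in status_changes:
    print(row)
```

Each run of identical statuses is reduced to its earliest row, so the repeated 4/11 status appears twice, once per run, as the question requires.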
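SELECT CurrentStaty.BoxID, CurrentStaty.StatusID, CurrentStaty.SubStatusID, CurrentStaty.ModifiedTime
FROM Staty AS CurrentStaty
INNER JOIN Staty AS PriorStaty
    ON PriorStaty.BoxID = CurrentStaty.BoxID
    -- PriorStaty is the most recent earlier row for the same box
    AND PriorStaty.ModifiedTime =
        (SELECT MAX(p.ModifiedTime) FROM Staty AS p
         WHERE p.BoxID = CurrentStaty.BoxID
           AND p.ModifiedTime < CurrentStaty.ModifiedTime)
-- the earliest row per box has no prior row, so the INNER JOIN leaves it out
WHERE NOT (
    CurrentStaty.StatusID = PriorStaty.StatusID
    AND
    CurrentStaty.SubStatusID = PriorStaty.SubStatusID
);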