I have a function in my PostgreSQL database which, when called
select * from schema.function_name($1,$2,$3);
returns a table similar to the following
calculation_date | value | increment
2020-01-01 | 1 | 0.5
2020-01-02 | NULL | NULL
2020-01-03 | NULL | NULL
2020-01-04 | NULL | NULL
2020-01-05 | 4 | 2
2020-01-06 | NULL | NULL
2020-01-07 | NULL | NULL
2020-01-08 | 8.5 | 1
As you can see, the data returned from this function can be sparse. What I would like to do is query this function so that, wherever the value column is NULL, it is filled in by increasing incrementally from the most recently populated row, using that row's increment column. So in this example, the table above would be transformed into the one below:
calculation_date | value | increment
2020-01-01 | 1 | 0.5
2020-01-02 | 1.5 | NULL
2020-01-03 | 2.0 | NULL
2020-01-04 | 2.5 | NULL
2020-01-05 | 4 | 2
2020-01-06 | 6 | NULL
2020-01-07 | 8 | NULL
2020-01-08 | 8.5 | 1
If anybody has any suggestions as to how I might go about achieving this output, I'd be grateful. I'm using v10. If any more detail is required, don't hesitate to ask.
I solved the issue in the end by grabbing the last increment value before any NULLs and multiplying it by the difference in days between that row's date and each subsequent row's date, then adding that on to the last populated value. This works in this case because I'm guaranteed a daily time series.
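For anyone who wants the query itself, here is a rough sketch of that approach (assuming calculation_date is a date column so date subtraction yields whole days, and that value and increment are numeric; the grouping trick is just one way to carry the last populated row forward):

SELECT calculation_date,
       COALESCE(value,
                anchor_value + anchor_increment * (calculation_date - anchor_date)) AS value,
       increment
FROM (
    SELECT calculation_date, value, increment,
           -- within each group, MAX()/MIN() ignore NULLs, so these pick up the
           -- single populated row that starts the group
           MAX(value)            OVER w AS anchor_value,
           MAX(increment)        OVER w AS anchor_increment,
           MIN(calculation_date) OVER w AS anchor_date
    FROM (
        -- COUNT(value) only advances on non-NULL rows, so each populated row
        -- starts a new group that its trailing NULL rows belong to
        SELECT f.*, COUNT(value) OVER (ORDER BY calculation_date) AS grp
        FROM schema.function_name($1, $2, $3) f
    ) numbered
    WINDOW w AS (PARTITION BY grp)
) anchored
ORDER BY calculation_date;

The COALESCE keeps the originally populated rows untouched, and any rows before the first populated row simply stay NULL.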
Related
At the moment I have a stream with data from several sensors, which send their status code once whenever they update themselves.
This is a one-time value; the sensor value is then null again until something changes. So in my table the last value should replace the null values until a new value is delivered. Currently I create my table like this:
CREATE TABLE LRS WITH
(KAFKA_TOPIC='lrs', KEY_FORMAT='DELIMITED', PARTITIONS=6, REPLICAS=3)
AS SELECT
Device,
LATEST_BY_OFFSET(CAST(Sensor1 AS DOUBLE)) AS Sensor1,
LATEST_BY_OFFSET(CAST(Sensor2 AS DOUBLE)) AS Sensor2
FROM RELEVANT_VALUES
WINDOW TUMBLING ( SIZE 10 SECONDS )
GROUP BY Device
So instead of behaving like this:
Device | Sensor1 | Sensor2 | Timestamp
1 | null | null | 05:00am
1 | 3 | 2 | 05:01am
1 | null | null | 05:02am
1 | null | null | 05:03am
1 | 2 | 1 | 05:04am
1 | null | null | 05:05am
it should look like this while updating the values:
Device | Sensor1 | Sensor2 | window
1 | null | null | 05:00-01
1 | 3 | 2 | 05:01-02
1 | 3 | 2 | 05:02-03
1 | 3 | 2 | 05:03-04
1 | 2 | 1 | 05:04-05
1 | 2 | 1 | 05:05-06
I basically want to create a table that always shows the latest sent value that is not null.
Is there a way to achieve this using KSQL?
You can always add a filter beforehand if you are using streams, or with KSQL you can do something like WHERE Sensor1 IS NOT NULL.
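For example, one way to apply that filter in KSQL is to derive a filtered stream first and build the table from it. This is only a sketch: the stream name RELEVANT_VALUES_NONNULL is made up here, and whether to require both sensors to be non-null at once depends on your data.

CREATE STREAM RELEVANT_VALUES_NONNULL AS
  SELECT Device, Sensor1, Sensor2
  FROM RELEVANT_VALUES
  WHERE Sensor1 IS NOT NULL AND Sensor2 IS NOT NULL;

-- then aggregate from the filtered stream instead of RELEVANT_VALUES
CREATE TABLE LRS WITH
  (KAFKA_TOPIC='lrs', KEY_FORMAT='DELIMITED', PARTITIONS=6, REPLICAS=3)
AS SELECT
  Device,
  LATEST_BY_OFFSET(CAST(Sensor1 AS DOUBLE)) AS Sensor1,
  LATEST_BY_OFFSET(CAST(Sensor2 AS DOUBLE)) AS Sensor2
FROM RELEVANT_VALUES_NONNULL
WINDOW TUMBLING ( SIZE 10 SECONDS )
GROUP BY Device;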
Take the following data and query:
create table if not exists my_example(a_group varchar(1)
,the_date date
,metric numeric(4,3)
);
INSERT INTO my_example
VALUES ('1','2018-12-14',0.514)
,('1','2018-12-15',0.532)
,('2','2018-12-15',0.252)
,('3','2018-12-14',0.562)
,('3','2018-12-15',0.361);
select
t1.the_date
,t1.a_group
,t1.metric AS current_metric
,lag(t1.metric, 1) OVER (ORDER BY t1.a_group, t1.the_date) AS previous_metric
from
my_example t1;
Which yields the following results:
+------------+---------+----------------+-----------------+
| the_date | a_group | current_metric | previous_metric |
+------------+---------+----------------+-----------------+
| 2018-12-14 | 1 | 0.514 | NULL |
| 2018-12-15 | 1 | 0.532 | 0.514 |
| 2018-12-15 | 2 | 0.252 | 0.532 |
| 2018-12-14 | 3 | 0.562 | 0.252 |
| 2018-12-15 | 3 | 0.361 | 0.562 |
+------------+---------+----------------+-----------------+
I expected the value of previous_metric for the lone a_group==2 row to be NULL. However, as you can see, the value is showing as 0.532, which is being picked up from the previous row. How can I modify this query to yield a value of NULL as I expected?
You need to use LAG with a partition on a_group, since you want the lagged value to come from within the same a_group:
SELECT
t1.the_date,
t1.a_group,
t1.metric AS current_metric,
LAG(t1.metric, 1) OVER (PARTITION BY t1.a_group ORDER BY t1.the_date)
AS previous_metric
FROM my_example t1;
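With the partition in place, the window restarts for each a_group, so the lone a_group = 2 row (and the first row of every group) gets a NULL previous_metric. Given the sample data, the result looks like this (row order may differ without an outer ORDER BY):
+------------+---------+----------------+-----------------+
| the_date   | a_group | current_metric | previous_metric |
+------------+---------+----------------+-----------------+
| 2018-12-14 | 1       | 0.514          | NULL            |
| 2018-12-15 | 1       | 0.532          | 0.514           |
| 2018-12-15 | 2       | 0.252          | NULL            |
| 2018-12-14 | 3       | 0.562          | NULL            |
| 2018-12-15 | 3       | 0.361          | 0.562           |
+------------+---------+----------------+-----------------+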
I need to use the Lead function in DataStage on a timestamp column: for each row I want a new column containing the next row's timestamp minus 10 minutes. Has anyone done this before? I have tried it, but the problem is that I can't put strings into the timestamp.
How can I solve this problem?
Input
Id | date                |
1  | 01/04/2016 13:45:25 |
2  | 10/04/2016 01:25:36 |
3  | 26/10/2017 22:35:13 |
Output
Id | date                | Before              |
1  | 01/04/2016 13:45:25 | 10/04/2016 01:15:36 |
2  | 10/04/2016 01:25:36 | 26/10/2017 22:35:13 |
3  | 26/10/2017 22:35:13 | null                |
From a table of "time entries" I'm trying to create a report of weekly totals for each user.
Sample of the table:
+-----+---------+-------------------------+--------------+
| id | user_id | start_time | hours_worked |
+-----+---------+-------------------------+--------------+
| 997 | 6 | 2018-01-01 03:05:00 UTC | 1.0 |
| 996 | 6 | 2017-12-01 05:05:00 UTC | 1.0 |
| 998 | 6 | 2017-12-01 05:05:00 UTC | 1.5 |
| 999 | 20 | 2017-11-15 19:00:00 UTC | 1.0 |
| 995 | 6 | 2017-11-11 20:47:42 UTC | 0.04 |
+-----+---------+-------------------------+--------------+
Right now I can run the following and basically get what I need
SELECT COALESCE(SUM(time_entries.hours_worked),0) AS total,
time_entries.user_id,
week::date
--Using generate_series here to account for weeks with no time entries when
--doing the join
FROM generate_series( (DATE_TRUNC('week', '2017-11-01 00:00:00'::date)),
(DATE_TRUNC('week', '2017-12-31 23:59:59.999999'::date)),
interval '7 day') as week LEFT JOIN time_entries
ON DATE_TRUNC('week', time_entries.start_time) = week
GROUP BY week, time_entries.user_id
ORDER BY week
This will return
+-------+---------+------------+
| total | user_id | week |
+-------+---------+------------+
| 14.08 | 5 | 2017-10-30 |
| 21.92 | 6 | 2017-10-30 |
| 10.92 | 7 | 2017-10-30 |
| 14.26 | 8 | 2017-10-30 |
| 14.78 | 10 | 2017-10-30 |
| 14.08 | 13 | 2017-10-30 |
| 15.83 | 15 | 2017-10-30 |
| 8.75 | 5 | 2017-11-06 |
| 10.53 | 6 | 2017-11-06 |
| 13.73 | 7 | 2017-11-06 |
| 14.26 | 8 | 2017-11-06 |
| 19.45 | 10 | 2017-11-06 |
| 15.95 | 13 | 2017-11-06 |
| 14.16 | 15 | 2017-11-06 |
| 1.00 | 20 | 2017-11-13 |
| 0 | | 2017-11-20 |
| 2.50 | 6 | 2017-11-27 |
| 0 | | 2017-12-04 |
| 0 | | 2017-12-11 |
| 0 | | 2017-12-18 |
| 0 | | 2017-12-25 |
+-------+---------+------------+
However, this is difficult to parse, particularly when there's no data for a week. What I would like is a pivot or crosstab table where the weeks are the columns and the rows are the users, including nulls in both directions (for instance, when a user had no entries in a week, or when a week had no entries from any user).
Something like this
+---------+---------------+--------------+--------------+
| user_id | 2017-10-30 | 2017-11-06 | 2017-11-13 |
+---------+---------------+--------------+--------------+
| 6 | 4.0 | 1.0 | 0 |
| 7 | 4.0 | 1.0 | 0 |
| 8 | 4.0 | 0 | 0 |
| 9 | 0 | 1.0 | 0 |
| 10 | 4.0 | 0.04 | 0 |
+---------+---------------+--------------+--------------+
I've been looking around online and it seems that "dynamically" generating a list of columns for crosstab is difficult. I'd rather not hard-code them, which seems odd to do anyway for dates, or use something like a CASE expression per week number.
Should I look for another solution besides crosstab? If I could get the series of weeks for each user including all nulls I think that would be good enough. It just seems that right now my join strategy isn't returning that.
Personally I would use a Date Dimension table and use it as the basis for the query. I find it far easier to use tabular data for these types of calculations, as it leads to SQL that's easier to read and maintain. There's a great article on creating a Date Dimension table in PostgreSQL at https://medium.com/@duffn/creating-a-date-dimension-table-in-postgresql-af3f8e2941ac, though you could get away with a much simpler version of this table.
Ultimately what you would do is use the Date table as the base for the SELECT cols FROM table section and then join against that, or probably use Common Table Expressions, to create the calculations.
If you would like, I'll write up a solution demonstrating how you could create such a query.
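In the meantime, here is a rough sketch of the join side of that idea, with generate_series standing in for a proper date-dimension table (table and column names are taken from the question; adjust the date range as needed). Cross-joining every week with every user before the left join is what makes the empty weeks show up:

WITH weeks AS (
    SELECT week::date
    FROM generate_series(DATE_TRUNC('week', DATE '2017-11-01'),
                         DATE_TRUNC('week', DATE '2017-12-31'),
                         INTERVAL '7 day') AS week
),
users AS (
    SELECT DISTINCT user_id FROM time_entries
)
SELECT u.user_id,
       w.week,
       COALESCE(SUM(te.hours_worked), 0) AS total
FROM weeks w
CROSS JOIN users u
LEFT JOIN time_entries te
       ON te.user_id = u.user_id
      AND DATE_TRUNC('week', te.start_time) = w.week
GROUP BY u.user_id, w.week
ORDER BY u.user_id, w.week;

From there, pivoting into one column per week can be done in the application layer, or with crosstab if you really need it in SQL.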
I want to exchange the values of two fields in a Crystal Report
(if Columnd5 is null, I want to put the value of Columnd6 into Columnd5).
I use this formula
if isnull ({DataTable1.Columnd5}) then
tonumber ({DataTable1.Columnd6})
else if isnull({DataTable1.Columnd6}) then
0.00
else
tonumber ({DataTable1.Columnd6})
but this one isn't working
I understand that the truth table of your formula's result is as follows:
| Column5 value | Column6 value | result |
| 5* | 6* | 6 |
| 5* | null | 0 |
| null | 6* | 6 |
| null | null | error |
*means a supposed value as an example, could be any numeric value
But I understand your desired result would be:
| Column5 value | Column6 value | result |
| 5* | 6* | 5 |
| 5* | null | 5 |
| null | 6* | 6 |
| null | null | 0 |
*means a supposed value as an example, could be any numeric value
Did I really understand the problem? If so, I would suggest the following formula:
if not isnull ({DataTable1.Columnd5}) then
{DataTable1.Columnd5}
else if not isnull({DataTable1.Columnd6}) then
{DataTable1.Columnd6}
else
0
Check whether it is necessary to call the ToNumber function; I don't believe so, but it depends on your schema.