Dynamic groups in PostgreSQL data - postgresql

I have a PostgreSQL 9.1 database with a table of measurement data that includes setpoint information, for example temperature setpoints. Measurements are taken once a setpoint is reached, after which the next setpoint is set. A setpoint can be reached multiple times, e.g. -25, 25, 75, 125, 75, 25 degrees Celsius. In this case 25 and 75 degrees Celsius are each reached twice.
Now I want to group the data per setpoint, without merging it with data from a later visit to a setpoint that happens to have the same value.
Example data:
| id | setpoint | value | <dyn.group> |
| 1 | -25 | 5.324 | 1 |
| 2 | -25 | 6.343 | 1 |
| 3 | -25 | 6.432 | 1 |
| 4 | 25 | 3.432 | 2 |
| 5 | 25 | 4.472 | 2 |
| 6 | 25 | 6.221 | 2 |
| 7 | 75 | 5.142 | 3 |
| 8 | 75 | 7.922 | 3 |
| 9 | 75 | 3.832 | 3 |
| 10 | 125 | 8.882 | 4 |
| 11 | 125 | 9.742 | 4 |
| 12 | 125 | 7.632 | 4 |
| 13 | 75 | 5.542 | 5 |
| 14 | 75 | 2.452 | 5 |
| 15 | 75 | 1.332 | 5 |
| 16 | 25 | 3.232 | 6 |
| 17 | 25 | 4.132 | 6 |
| 18 | 25 | 5.432 | 6 |
A plain GROUP BY fails here, because the same setpoint value can occur in several separate runs that should not be combined.
Looking at the previous/next values with LEAD and LAG is also not desired, because the transitions themselves are likely to repeat (e.g. if setpoint 75 is repeated, then most likely the step from 25->75 will also be repeated).
The expected outcome is the 4th column (<dyn.group>). With that column I can, for example, average over these groups.

It can be done with a custom aggregate function that generates the "group index" value, followed by a "group by" clause on that value.
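To make the "group index" idea concrete, here is a minimal sketch of the logic in Python rather than as a PostgreSQL aggregate. It assumes the rows are processed in id order and that a new group starts whenever the setpoint differs from the one on the previous row. The function name `assign_group_index` is made up for illustration.

```python
from itertools import groupby
from statistics import mean

# Sample rows (id, setpoint, value) from the question, ordered by id.
rows = [
    (1, -25, 5.324), (2, -25, 6.343), (3, -25, 6.432),
    (4, 25, 3.432), (5, 25, 4.472), (6, 25, 6.221),
    (7, 75, 5.142), (8, 75, 7.922), (9, 75, 3.832),
    (10, 125, 8.882), (11, 125, 9.742), (12, 125, 7.632),
    (13, 75, 5.542), (14, 75, 2.452), (15, 75, 1.332),
    (16, 25, 3.232), (17, 25, 4.132), (18, 25, 5.432),
]


def assign_group_index(ordered_rows):
    """Yield (id, setpoint, value, group_index).

    The index grows by one each time the setpoint differs from the
    setpoint of the previous row (hypothetical helper, not a library call).
    """
    group_index = 0
    previous = object()  # sentinel that never equals a real setpoint
    for row_id, setpoint, value in ordered_rows:
        if setpoint != previous:
            group_index += 1
            previous = setpoint
        yield row_id, setpoint, value, group_index


grouped = list(assign_group_index(rows))

# Average the values per dynamic group, the use case named in the question.
averages = {
    index: mean(member[2] for member in members)
    for index, members in groupby(grouped, key=lambda r: r[3])
}
```

The two visits to setpoint 75 end up in groups 3 and 5, so their averages stay separate.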

Related

PostgreSQL question: get data by the last date of each record and subtract a number of days from the last date

Please help me write a query. I'm at a dead end.
There are 2 tables:
“Trains”:
+----+---------+
| id | numbers |
+----+---------+
| 1 | 101 |
| 2 | 102 |
| 3 | 103 |
| 4 | 104 |
| 5 | 105 |
+----+---------+
“Passages”:
+----+--------------+-------+---------------------+
| id | train_number | speed | date_time |
+----+--------------+-------+---------------------+
| 1 | 101 | 26 | 2021-11-10 16:26:30 |
| 2 | 101 | 28 | 2021-11-12 16:26:30 |
| 3 | 102 | 24 | 2021-11-14 16:26:30 |
| 4 | 103 | 27 | 2021-11-15 16:26:30 |
| 5 | 101 | 29 | 2021-11-16 16:26:30 |
+----+--------------+-------+---------------------+
The goal is to go through the train numbers from the Trains table and, for each one that exists in the Passages table, take the latest passage date (date_time) together with the number of passages between that latest date minus N days and the latest date itself. As I understand it, that is date_time - interval 'N days'. The result should look something like:
+----+--------+---------------------+----------------+
| id | train | last_passage | count_passages |
+----+--------+---------------------+----------------+
| 1 | 101 | 2021-11-16 16:26:30 | 2 |
| 2 | 102 | 2021-11-14 16:26:30 | 1 |
| 3 | 103 | 2021-11-15 16:26:30 | 1 |
| 4 | 104 | null | 0 |
| 5 | 105 | null | 0 |
+----+--------+---------------------+----------------+
PS: "count_passages" is, for example, the number of passages from the last passage date minus 4 days up to the last passage date.
I tried using "where in", but I can't come up with a correct query.
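A rough sketch of the intended logic, written in Python instead of SQL so the steps are easy to follow. It assumes the window is inclusive on both ends (from last passage minus N days up to the last passage). The function name `summarize` is made up for illustration.

```python
from datetime import datetime, timedelta

# (id, numbers) from the Trains table.
trains = [(1, 101), (2, 102), (3, 103), (4, 104), (5, 105)]

# (id, train_number, speed, date_time) from the Passages table.
passages = [
    (1, 101, 26, "2021-11-10 16:26:30"),
    (2, 101, 28, "2021-11-12 16:26:30"),
    (3, 102, 24, "2021-11-14 16:26:30"),
    (4, 103, 27, "2021-11-15 16:26:30"),
    (5, 101, 29, "2021-11-16 16:26:30"),
]


def summarize(trains, passages, n_days):
    """Return (id, train, last_passage, count_passages) for every train."""
    # Collect the passage timestamps per train number.
    stamps_by_train = {}
    for _id, number, _speed, stamp in passages:
        parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        stamps_by_train.setdefault(number, []).append(parsed)

    result = []
    for train_id, number in trains:
        stamps = stamps_by_train.get(number, [])
        if not stamps:
            # Trains without passages are kept, like a LEFT JOIN would.
            result.append((train_id, number, None, 0))
            continue
        last = max(stamps)
        window_start = last - timedelta(days=n_days)
        # Assumption: both ends of the window are inclusive.
        count = sum(1 for t in stamps if window_start <= t <= last)
        result.append((train_id, number, last, count))
    return result


summary = summarize(trains, passages, n_days=4)
```

With the sample data and N = 4, train 101 has its latest passage on 2021-11-16 and two passages inside the window, while trains 104 and 105 keep a null date and a count of 0.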

Create line plot from table in Tableau

I've got data that look similar to this
+------------+--------------+---------+---------+---------+---------+
| funding_id | amountOnHand | rate_1d | rate_1w | rate_1m | rate_1y |
+------------+--------------+---------+---------+---------+---------+
| USDOIS | 100 | 18 | 9 | 12 | 2 |
| USDOIS | 106 | 3 | 6 | 16 | 2 |
| USDOIS | 103 | 1 | 7 | 5 | 15 |
| USDOIS | 108 | 1 | 11 | 11 | 13 |
| JPYOIS | 100 | 0 | 19 | 16 | 15 |
| JPYOIS | 106 | 9 | 10 | 10 | 5 |
| JPYOIS | 103 | 4 | 9 | 11 | 6 |
| JPYOIS | 109 | 9 | 18 | 14 | 2 |
| EUROIS | 104 | 3 | 6 | 19 | 6 |
| EUROIS | 103 | 3 | 11 | 19 | 3 |
| EUROIS | 104 | 9 | 1 | 8 | 15 |
| EUROIS | 107 | 18 | 4 | 1 | 5 |
+------------+--------------+---------+---------+---------+---------+
I create weighted rates per funding id using the aggregation: SUM([rate_1d]*[initial])/SUM([initial])
I then use Tableau to create a text table and get something similar to the following table (note that sometimes an entire row is null; that's OK):
+------------+------------------+------------------+------------------+------------------+
| funding_id | weighted_rate_1d | weighted_rate_1w | weighted_rate_1m | weighted_rate_1y |
+------------+------------------+------------------+------------------+------------------+
| AUDOIS | 3.0 | 8.0 | 6.0 | 3.0 |
| CADOIS | 20.0 | 3.0 | 17.0 | 0.0 |
| EUROIS | 9.0 | 0.0 | 19.0 | 7.0 |
| GBP CORP | | | | |
| GBPOIS | 12.0 | 19.0 | 14.0 | 16.0 |
| JPYOIS | 10.0 | 7.0 | 18.0 | 3.0 |
| USDOIS | 19.0 | 7.0 | 5.0 | 7.0 |
+------------+------------------+------------------+------------------+------------------+
What I'd like to do is create a line plot showing time on the x axis (so 1d/1w/1m/1y) and rate on the y axis, with each line colored by funding_id
Is there any way to do this?
Go to the data source pane -> select the measures weighted rate 1d, 1w, 1m, etc.
-> Right-click and select Pivot. This converts the column data to row data, i.e. Pivot Field Names and Pivot Field Values.
-> Go back to your worksheet and drag Pivot Field Names to the Columns shelf and Pivot Field Values to the Rows shelf. In the Marks card, change the chart type from Automatic to Line, and you're done.
Add more aesthetics to your chart as required.
Hope this helps!
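To show what the pivot does to the data, here is a small Python sketch that computes the weighted rates and then turns the four rate columns into (funding_id, tenor, rate) rows, which is the shape a line plot needs. It weights by amountOnHand, which is an assumption: the question's formula refers to a field called [initial] that is not in the sample. The names `weighted_rates` and `long_rows` are made up for illustration.

```python
from collections import defaultdict

TENORS = ["1d", "1w", "1m", "1y"]  # desired x-axis order

# (funding_id, amountOnHand, rate_1d, rate_1w, rate_1m, rate_1y)
rows = [
    ("USDOIS", 100, 18, 9, 12, 2), ("USDOIS", 106, 3, 6, 16, 2),
    ("USDOIS", 103, 1, 7, 5, 15), ("USDOIS", 108, 1, 11, 11, 13),
    ("JPYOIS", 100, 0, 19, 16, 15), ("JPYOIS", 106, 9, 10, 10, 5),
    ("JPYOIS", 103, 4, 9, 11, 6), ("JPYOIS", 109, 9, 18, 14, 2),
    ("EUROIS", 104, 3, 6, 19, 6), ("EUROIS", 103, 3, 11, 19, 3),
    ("EUROIS", 104, 9, 1, 8, 15), ("EUROIS", 107, 18, 4, 1, 5),
]


def weighted_rates(rows):
    """SUM(rate * weight) / SUM(weight) per funding_id and tenor."""
    numerators = defaultdict(lambda: [0.0] * len(TENORS))
    weights = defaultdict(float)
    for funding_id, amount, *rates in rows:
        weights[funding_id] += amount  # assumption: amountOnHand is the weight
        for i, rate in enumerate(rates):
            numerators[funding_id][i] += rate * amount
    return {
        fid: {t: numerators[fid][i] / weights[fid] for i, t in enumerate(TENORS)}
        for fid in weights
        if weights[fid]
    }


weighted = weighted_rates(rows)

# Wide -> long: one row per (funding_id, tenor), like Pivot Field Names/Values.
long_rows = [
    (fid, tenor, rate)
    for fid, rates in weighted.items()
    for tenor, rate in rates.items()
]
```

Each funding_id then contributes one line with four points, ordered 1d, 1w, 1m, 1y.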
The solution is to use the "Measure Names" and "Measure Values" fields at the bottom of the "Dimensions" and "Measures" panels in the data pane (no need to create a table at all).
The steps are:
1) create 4 aggregations (weighted_rate_1d, etc)
2) create a new worksheet
3) drag Measure Names (found under Dimensions) to the Columns shelf
4) right click it, and filter out everything except the aggregations
5) drag "Measure Values" to the rows shelf
6) in the "Marks" area just to the left of the plot (where you can change color, shape, etc.) use the drop-down menu to change the bar plot to a line plot
7) just below this, you'll see the measure values listed in green boxes; drag them around to reorder so it goes 1d, 1w, 1m, 1y (by default, 1w and 1m are switched because they're in alphabetical order)
8) drag funding_id to the color panel

Architecture Design for Bus Routing with Time

I would like to confirm whether my design is good enough, or get better ideas for solving the bus routing problem with time. My solution consists of the following primary steps:
Have one edges table which represents all the edges (the source and target represent vertices, i.e. bus stops):
postgres=# select id, source, target, cost from busedges;
id | source | target | cost
----+--------+--------+------
1 | 1 | 2 | 1
2 | 2 | 3 | 1
3 | 3 | 4 | 1
4 | 4 | 5 | 1
5 | 1 | 7 | 1
6 | 7 | 8 | 1
7 | 1 | 6 | 1
8 | 6 | 8 | 1
9 | 9 | 10 | 1
10 | 10 | 11 | 1
11 | 11 | 12 | 1
12 | 12 | 13 | 1
13 | 9 | 15 | 1
14 | 15 | 16 | 1
15 | 9 | 14 | 1
16 | 14 | 16 | 1
Have a table which represents bus details such as from time, to time, edge, etc.
NOTE: I have used an integer format for the "from" and "to" columns for faster results, since I can run integer comparisons, but I can replace it with a better format if one is available.
postgres=# select id, "busedgeId", "busId", "from", "to" from busedgetimes;
id | busedgeId | busId | from | to
----+-----------+-------+-------+-------
18 | 1 | 1 | 33000 | 33300
19 | 2 | 1 | 33300 | 33600
20 | 3 | 2 | 33900 | 34200
21 | 4 | 2 | 34200 | 34800
22 | 1 | 3 | 36000 | 36300
23 | 2 | 3 | 36600 | 37200
24 | 3 | 4 | 38400 | 38700
25 | 4 | 4 | 38700 | 39540
Use Dijkstra's algorithm to find the shortest path.
Get the upcoming buses from the busedgetimes table, earliest first, for the path found by Dijkstra's algorithm. => This leads to a somewhat complex query, though.
Can I make any improvements to this, or are there better designs?
Links to related docs and articles would be really helpful.
This is totally normal and the regular way to do it. See also:
pgRouting Example
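For illustration, the two steps from the question (shortest path first, then the earliest connecting buses along it) could be sketched in Python like this. It is a simplified sketch, not pgRouting itself: edges are treated as directed from source to target, only the scheduled edges from the sample are listed in `edge_times`, and the function names are made up.

```python
import heapq

# (id, source, target, cost) from the busedges table.
bus_edges = [
    (1, 1, 2, 1), (2, 2, 3, 1), (3, 3, 4, 1), (4, 4, 5, 1),
    (5, 1, 7, 1), (6, 7, 8, 1), (7, 1, 6, 1), (8, 6, 8, 1),
    (9, 9, 10, 1), (10, 10, 11, 1), (11, 11, 12, 1), (12, 12, 13, 1),
    (13, 9, 15, 1), (14, 15, 16, 1), (15, 9, 14, 1), (16, 14, 16, 1),
]

# (busedgeId, busId, from, to) from the busedgetimes table.
edge_times = [
    (1, 1, 33000, 33300), (2, 1, 33300, 33600),
    (3, 2, 33900, 34200), (4, 2, 34200, 34800),
    (1, 3, 36000, 36300), (2, 3, 36600, 37200),
    (3, 4, 38400, 38700), (4, 4, 38700, 39540),
]


def shortest_path(edges, start, goal):
    """Dijkstra's algorithm; returns the edge ids from start to goal, or None."""
    graph = {}
    for edge_id, source, target, cost in edges:
        graph.setdefault(source, []).append((target, cost, edge_id))
    best = {start: 0}
    queue = [(0, start, [])]
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for target, cost, edge_id in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(target, float("inf")):
                best[target] = new_dist
                heapq.heappush(queue, (new_dist, target, path + [edge_id]))
    return None


def upcoming_buses(path, times, depart_after):
    """Pick, per edge on the path, the earliest run leaving at or after 'now'."""
    itinerary = []
    now = depart_after
    for edge_id in path:
        runs = [t for t in times if t[0] == edge_id and t[2] >= now]
        if not runs:
            return None  # no connecting bus for this edge
        run = min(runs, key=lambda t: t[2])
        itinerary.append(run)
        now = run[3]  # arrival time becomes the earliest next departure
    return itinerary


path = shortest_path(bus_edges, start=1, goal=5)
itinerary = upcoming_buses(path, edge_times, depart_after=32000)
```

Note that this picks the path on static cost alone and only then looks at the timetable, exactly as in the steps above, so it does not guarantee the earliest possible arrival.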

Spotfire - Calculate average only if there are minimum 3 values

I want to create a cross table in Spotfire in which the average is calculated only when there are at least 3 values. If there are no values or fewer than 3 values, the average should be blank.
+-------+-----+---------+
| Month | Age | Average |
+-------+-----+---------+
| 1 | 10 | |
| 2 | 11 | |
| 3 | 2 | 7.7 |
| 4 | | |
| 5 | 13 | |
| 6 | 14 | |
| 7 | | |
| 8 | 19 | |
| 9 | 20 | |
| 10 | 21 | 20 |
+-------+-----+---------+
If I'm understanding you correctly, you want to group by Month, and then have something like this as your aggregation:
If(Count([Age])>2,Avg([Age]),null) as [AverageAge_3Min]
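The rule in that expression (average only when enough non-empty values exist) can be sketched in Python as follows. The three-month trailing window is an assumption inferred from the sample table, not something the question states, and `average_if_enough` is a made-up helper name.

```python
def average_if_enough(values, min_count=3):
    """Return the mean of the non-empty values, or None if there are too few."""
    present = [v for v in values if v is not None]
    if len(present) < min_count:
        return None
    return sum(present) / len(present)


# Age per month (months 1..10) from the question; None marks an empty cell.
ages = [10, 11, 2, None, 13, 14, None, 19, 20, 21]

WINDOW = 3  # assumption: each row averages itself and the two months before it

averages = [
    average_if_enough(ages[max(0, i - WINDOW + 1): i + 1])
    for i in range(len(ages))
]
```

Under that assumption only months 3 and 10 have three values in their window, which matches the two filled cells in the sample table.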

Displaying 2 metrics on a tableau map

I am new to Tableau and I have requirements as below:
I need to create a dashboard with a filter on Paywave or EMV and show count of Confirmed and Probable on a geo map.
When I select EMV from the quick filter, it should show a count of confirm & probable for that city. I should be able to drill down and see a count of confirm and probable for zip codes as well.
I am not sure how to achieve the above requirements.
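As shown below, I have fields like:
+----------------+--------------+-------------+--------------+-----------------+------------------+
| mrchchant_city | mrch_zipcode | EMV confirm | EMV probable | Paywave confirm | Paywave probable |
+----------------+--------------+-------------+--------------+-----------------+------------------+
| A | 1001 | 10 | 15 | 20 | 18 |
| B | 1005 | 34 | 67 | 78 | 12 |
| C | 2001 | 24 | 56 | 76 | 45 |
| C | 2001 | 46 | 19 | 63 | 25 |
+----------------+--------------+-------------+--------------+-----------------+------------------+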
Please let me know if any further information is required from my side.
This will be a lot easier on you if you restructure your data a bit. More often than not, the goal in Tableau is to provide an aggregated summary of the data, rather than showing each individual row. We'll want to group by dimensions (categorical data like "EMV"/"Paywave" or "Confirm"/"Probable"), so this data will be much easier to work with if we get those dimensions into their own columns.
Here's how I personally would go about structuring your table:
+----------------+--------------+---------+----------+-------+-----+
| mrchchant_city | mrch_zipcode | dim1 | dim2 | count | ... |
+----------------+--------------+---------+----------+-------+-----+
| A | 1001 | Paywave | confirm | 20 | ... |
| A | 1001 | Paywave | probable | 18 | ... |
| A | 1001 | EMV | confirm | 10 | ... |
| A | 1001 | EMV | probable | 15 | ... |
| B | 1005 | Paywave | confirm | 78 | ... |
| B | 1005 | Paywave | probable | 12 | ... |
| B | 1005 | EMV | confirm | 34 | ... |
| B | 1005 | EMV | probable | 67 | ... |
| C | 2001 | Paywave | confirm | 76 | ... |
| C | 2001 | Paywave | probable | 45 | ... |
| C | 2001 | EMV | confirm | 24 | ... |
| C | 2001 | EMV | probable | 56 | ... |
| C | 2001 | Paywave | confirm | 63 | ... |
| C | 2001 | Paywave | probable | 25 | ... |
| C | 2001 | EMV | confirm | 46 | ... |
| C | 2001 | EMV | probable | 19 | ... |
| ... | ... | ... | ... | ... | ... |
+----------------+--------------+---------+----------+-------+-----+
(Sorry about the dim1 and dim2, I don't really know what those dimensions represent. You can/should obviously pick a more intuitive nomenclature.)
Once you have a table with columns for your categorical data, it will be simple to filter and group by those dimensions.
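If it helps, the restructuring can be sketched in a few lines of Python. This assumes the four count columns are ordered EMV confirm, EMV probable, Paywave confirm, Paywave probable, as in the question. The names `to_long` and `totals_by_city` are made up for illustration.

```python
# (mrchchant_city, mrch_zipcode, EMV confirm, EMV probable,
#  Paywave confirm, Paywave probable) from the question.
wide_rows = [
    ("A", 1001, 10, 15, 20, 18),
    ("B", 1005, 34, 67, 78, 12),
    ("C", 2001, 24, 56, 76, 45),
    ("C", 2001, 46, 19, 63, 25),
]

# Assumed meaning of the four count columns, in order.
COLUMNS = [
    ("EMV", "confirm"), ("EMV", "probable"),
    ("Paywave", "confirm"), ("Paywave", "probable"),
]


def to_long(rows):
    """Turn each wide row into four (city, zip, dim1, dim2, count) rows."""
    long_rows = []
    for city, zipcode, *counts in rows:
        for (dim1, dim2), count in zip(COLUMNS, counts):
            long_rows.append((city, zipcode, dim1, dim2, count))
    return long_rows


def totals_by_city(long_rows, dim1):
    """Filter on dim1 (like the quick filter) and sum counts per city and dim2."""
    totals = {}
    for city, _zipcode, row_dim1, dim2, count in long_rows:
        if row_dim1 == dim1:
            key = (city, dim2)
            totals[key] = totals.get(key, 0) + count
    return totals


long_rows = to_long(wide_rows)
emv_totals = totals_by_city(long_rows, "EMV")
```

Once the data is in this shape, filtering on EMV or Paywave is a filter on dim1, and drilling down from city to zip code just means grouping by one more column.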