Whoosh grouping by time interval - whoosh

I have the following structure, which is indexed using Whoosh.
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:32:22 John 2
2010-11-16 10:35:12 John 7
2010-11-16 10:36:34 John 1
2010-11-16 10:37:45 John 2
2010-11-16 10:48:26 John 8
2010-11-16 10:55:00 John 9
2010-11-16 10:58:08 John 2
I want to write a query that returns the following structure, showing the name frequency for every 5-minute interval:
timestamp name count(b.name)
------------------- ---- -------------
2010-11-16 10:30:00 John 2
2010-11-16 10:35:00 John 10
2010-11-16 10:40:00 John 0
2010-11-16 10:45:00 John 8
2010-11-16 10:50:00 John 0
2010-11-16 10:55:00 John 11

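One possible solution is to add an extra field to the index, e.g. timestamp_trimmed: round each timestamp down to the start of its 5-minute interval, store the result in timestamp_trimmed, and run the search grouped by that field. Intervals that contain no documents will not show up as groups, so the rows with a count of 0 have to be filled in afterwards.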
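The trimming and grouping step can be sketched in plain Python, independent of Whoosh. This is only an illustration under assumptions: the names `trim_to_bucket`, `group_counts`, and `ROWS` are made up for the example, and the rows are the sample data from the question. In a real index the trimmed value would be computed at indexing time and stored in the extra field.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=5)

# Sample rows from the question: (timestamp, name, count).
ROWS = [
    (datetime(2010, 11, 16, 10, 32, 22), "John", 2),
    (datetime(2010, 11, 16, 10, 35, 12), "John", 7),
    (datetime(2010, 11, 16, 10, 36, 34), "John", 1),
    (datetime(2010, 11, 16, 10, 37, 45), "John", 2),
    (datetime(2010, 11, 16, 10, 48, 26), "John", 8),
    (datetime(2010, 11, 16, 10, 55, 0), "John", 9),
    (datetime(2010, 11, 16, 10, 58, 8), "John", 2),
]


def trim_to_bucket(ts, bucket=BUCKET):
    """Round ts down to the start of its bucket, counted from midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // bucket) * bucket


def group_counts(rows, bucket=BUCKET):
    """Sum counts per (name, bucket) and emit empty buckets with a count of 0."""
    totals = defaultdict(int)
    for ts, name, count in rows:
        totals[(name, trim_to_bucket(ts, bucket))] += count

    result = []
    for name in sorted({n for n, _ in totals}):
        stamps = [t for (n, t) in totals if n == name]
        current, last = min(stamps), max(stamps)
        while current <= last:
            # .get() avoids inserting new keys into the defaultdict.
            result.append((current, name, totals.get((name, current), 0)))
            current += bucket
    return result


result = group_counts(ROWS)
for ts, name, count in result:
    print(ts, name, count)
```

For the sample data this prints the six rows of the desired table, including the two zero rows for 10:40 and 10:50.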

Related

How to calculate the amount spent in SQL?

I have a table transaction_details:
|transaction_id|customer_id|item_id|item_number|transaction_dttm|
| - | - | - | - | ---------- |
| 7765 | 1 | 23 | 1 | 2022-01-15 |
| 1254 | 2 | 12 | 4 | 2022-02-03 |
| 3332 | 3 | 56 | 2 | 2022-02-15 |
| 7658 | 1 | 43 | 1 | 2022-03-01 |
| 7231 | 4 | 56 | 1 | 2022-01-15 |
| 7231 | 2 | 23 | 2 | 2022-01-29 |
I need to calculate the amount spent by the client in the last month and find out the item (item_name) on which the client spent the most in the last month.
Example result:
|customer_id|amount_spent_lm|top_item_lm|
| - | ---------- | ----- |
| 1 | 700 | glasses |
| 2 | 20000 | notebook |
| 3 | 100 | cup |
When calculating, it is necessary to take into account the price that was valid at the time of the transaction (dict_item_prices). Customers who have not made purchases in the last month are not included in the final table. The last month is defined as the last 30 days at the time of the report creation.
There is also a table dict_item_prices:
|item_id|item_name|item_price|valid_from_dt|valid_to_dt|
| - | ----- | ----- | ---------- | ---------- |
| 23 | phone 1 | 1000 | 2022-01-01 | 2022-12-31 |
| 12 | notebook | 5000 | 2022-01-02 | 2022-12-31 |
| 56 | cup | 50 | 2022-01-02 | 2022-12-31 |
| 43 | glasses | 700 | 2022-01-01 | 2022-12-31 |
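The calculation rule described above can be sketched in plain Python to make the logic concrete before translating it to SQL. This is an illustration under assumptions: the report date of 2022-03-01 and the inclusive 30-day window are guesses chosen for the example, and `price_at` and `last_month_report` are hypothetical helper names.

```python
from collections import defaultdict
from datetime import date, timedelta

# (transaction_id, customer_id, item_id, item_number, transaction_dttm)
TRANSACTIONS = [
    (7765, 1, 23, 1, date(2022, 1, 15)),
    (1254, 2, 12, 4, date(2022, 2, 3)),
    (3332, 3, 56, 2, date(2022, 2, 15)),
    (7658, 1, 43, 1, date(2022, 3, 1)),
    (7231, 4, 56, 1, date(2022, 1, 15)),
    (7231, 2, 23, 2, date(2022, 1, 29)),
]

# (item_id, item_name, item_price, valid_from_dt, valid_to_dt)
PRICES = [
    (23, "phone 1", 1000, date(2022, 1, 1), date(2022, 12, 31)),
    (12, "notebook", 5000, date(2022, 1, 2), date(2022, 12, 31)),
    (56, "cup", 50, date(2022, 1, 2), date(2022, 12, 31)),
    (43, "glasses", 700, date(2022, 1, 1), date(2022, 12, 31)),
]


def price_at(item_id, day):
    """Return (item_name, item_price) for the price row valid on the given day."""
    for pid, name, price, valid_from, valid_to in PRICES:
        if pid == item_id and valid_from <= day <= valid_to:
            return name, price
    raise LookupError(f"no price for item {item_id} on {day}")


def last_month_report(report_date, window_days=30):
    """Per customer: total spent and top item within the last window_days days."""
    cutoff = report_date - timedelta(days=window_days)  # assumed inclusive
    spent = defaultdict(lambda: defaultdict(int))
    for _, customer, item_id, quantity, day in TRANSACTIONS:
        if cutoff <= day <= report_date:
            name, price = price_at(item_id, day)
            spent[customer][name] += quantity * price

    report = []
    for customer in sorted(spent):
        items = spent[customer]
        report.append((customer, sum(items.values()), max(items, key=items.get)))
    return report


report = last_month_report(date(2022, 3, 1))
for row in report:
    print(row)
```

With the assumed report date, the output matches the example result in the question: customers 1, 2, and 3 appear, and customer 4 is left out because the only purchase is older than 30 days. In SQL the same steps would be a join on item_id with a date-range condition against valid_from_dt and valid_to_dt, followed by aggregation per customer.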

How to count rows after the occurrence of a value by group (postgresql)

I have for example the following table:
Name | Day | Healthy
:--- | --: | :------
Jon | 1 | No
Jon | 2 | Yes
Jon | 3 | Yes
Jon | 4 | Yes
Jon | 5 | No
Mary | 1 | Yes
Mary | 2 | No
Mary | 3 | Yes
Mary | 4 | No
Mary | 5 | Yes
I want to add a column which counts the number of following days after day X a person was healthy:
Name | Day | Healthy | Number of days the person was healthy after day X (incl.)
:--- | --: | :------ | --:
Jon | 1 | No | 3
Jon | 2 | Yes | 3
Jon | 3 | Yes | 2
Jon | 4 | Yes | 1
Jon | 5 | No | 0
Mary | 1 | Yes | 3
Mary | 2 | No | 2
Mary | 3 | Yes | 2
Mary | 4 | No | 1
Mary | 5 | Yes | 1
Is it possible to use some sort of window function to create such a column? Thanks a lot for the help!
There are a couple of ways to do this with a window function. One is to order by day descending and use the default window. The other is to specify the window from the current row to the end of the partition.
This example casts the boolean healthy as an int so that it can be summed. If your table has literal Yes and No strings, then you can use sum((healthy = 'Yes')::int) over (...) to achieve the same thing (the string comparison is case-sensitive).
select name, day,
       sum(healthy::int)
         over (partition by name
                   order by day
                    rows between current row
                             and unbounded following) as num_subsequent_health_days
  from my_table;
name | day | num_subsequent_health_days
:--- | --: | -------------------------:
Jon | 1 | 3
Jon | 2 | 3
Jon | 3 | 2
Jon | 4 | 1
Jon | 5 | 0
Mary | 1 | 3
Mary | 2 | 2
Mary | 3 | 2
Mary | 4 | 1
Mary | 5 | 1
I assume your relation has the following schema:
CREATE TABLE test(name text, day int, healthy boolean);
Then this should produce the desired result:
SELECT name, day,
       sum(mapped) OVER (PARTITION BY name
                         ORDER BY day DESC
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM (SELECT name, day,
             CASE WHEN healthy THEN 1 ELSE 0 END AS mapped
      FROM test) sub
ORDER BY name, day;
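Both queries compute a per-name suffix sum: for each row, the number of healthy days from that day to the end of the partition. A minimal Python sketch of the same computation, using the sample data from the question, may make the window semantics easier to see. The names `healthy_days_from` and `TABLE` are invented for this illustration.

```python
from itertools import groupby
from operator import itemgetter

# Sample data from the question: (name, day, healthy).
TABLE = [
    ("Jon", 1, False), ("Jon", 2, True), ("Jon", 3, True),
    ("Jon", 4, True), ("Jon", 5, False),
    ("Mary", 1, True), ("Mary", 2, False), ("Mary", 3, True),
    ("Mary", 4, False), ("Mary", 5, True),
]


def healthy_days_from(rows):
    """For each (name, day), count healthy days on or after that day."""
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    # groupby plays the role of PARTITION BY name.
    for name, group in groupby(ordered, key=itemgetter(0)):
        running = 0
        partial = []
        # Walk the partition from the last day backwards, keeping a running sum.
        for _, day, healthy in reversed(list(group)):
            running += int(healthy)
            partial.append((name, day, running))
        result.extend(reversed(partial))
    return result


counts = healthy_days_from(TABLE)
for row in counts:
    print(row)
```

Walking the partition backwards with a running total corresponds to the ORDER BY day DESC variant. The "current row to unbounded following" variant describes the same set of rows from the other direction.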

How to get the runtime value of a column and use it in the next row

I have the dataframe below. Based on the visited date, I need to create a new column, allowed. If the customer visited within a week of the last allowed visit, I have to mark allowed as No (4th row: 2020-01-10 minus 2020-01-09 is less than 7 days). If more than a week has passed since the last allowed visit, allowed is Yes (3rd row: 2020-01-09 minus 2020-01-01 is more than 7 days).
Input DF
Customar visited_date
John 2020-01-01
John 2020-01-05
John 2020-01-09
John 2020-01-10
John 2020-01-17
output DF
Customar visited_date allowed
John 2020-01-01 Yes
John 2020-01-05 No
John 2020-01-09 Yes
John 2020-01-10 No
John 2020-01-17 Yes
I don't know how to calculate the column value at runtime and use it in the calculation for the subsequent rows.
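Because each row depends on the most recent row that was marked Yes, the rule is stateful and cannot be expressed as a simple difference to the previous row. A minimal sketch in plain Python follows. It does not use any dataframe API, since the question does not say which library is in use, and `mark_allowed` and `VISITS` are hypothetical names. It assumes a visit is allowed only when strictly more than 7 days have passed since the last allowed visit.

```python
from datetime import date, timedelta

# Sample data from the question: (Customar, visited_date).
VISITS = [
    ("John", date(2020, 1, 1)),
    ("John", date(2020, 1, 5)),
    ("John", date(2020, 1, 9)),
    ("John", date(2020, 1, 10)),
    ("John", date(2020, 1, 17)),
]


def mark_allowed(visits, min_gap=timedelta(days=7)):
    """Return (customer, visited_date, allowed) rows in customer/date order."""
    last_allowed = {}  # customer -> date of the most recent allowed visit
    marked = []
    for customer, visited in sorted(visits):
        previous = last_allowed.get(customer)
        # Assumption: exactly 7 days still counts as "within a week".
        if previous is None or visited - previous > min_gap:
            last_allowed[customer] = visited
            marked.append((customer, visited, "Yes"))
        else:
            marked.append((customer, visited, "No"))
    return marked


marked = mark_allowed(VISITS)
for row in marked:
    print(row)
```

For the sample input this yields Yes, No, Yes, No, Yes, which matches the expected output DF. With a dataframe library the same loop could be applied per customer group, at the cost of not being vectorized.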

Tableau Pivot Rows into Columns

I have a table structure like this:
Department Employee Class Period Qty1 Qty2 Qty3
----------------------------------------------------
Dept1 John 1 1st 1 2 3
Dept1 John 1 2nd 11 22 33
Dept1 Mary 1 1st 2 3 4
Dept1 Mary 1 2nd 22 33 44
Dept2 Joe 1 1st 3 4 5
Dept2 Joe 1 2nd 33 44 55
Dept2 Paul 1 1st 4 5 6
Dept2 Paul 1 2nd 44 55 66
In a view I'd like to display the format as such:
                      Class / Period
                      1
Department  Employee  1st        2nd
----------------------------------------------
Dept1       John      1  2  3    11 22 33
Dept1       Mary      2  3  4    22 33 44
Dept2       Joe       3  4  5    33 44 55
Dept2       Paul      4  5  6    44 55 66
I can't seem to find a way to do this. I have Class, Period as Columns and Department, Employee as Rows then drag Qty1, Qty2, Qty3 to the Text Mark but the format becomes:
                      Class / Period
                      1
Department  Employee  1st        2nd
----------------------------------------------
Dept1       John      1          11
                      2          22
                      3          33
Dept1       Mary      2          22
                      3          33
                      4          44
Dept2       Joe       3          33
                      4          44
                      5          55
Dept2       Paul      4          44
                      5          55
                      6          66
How do I turn those rows under each employee to sub-columns under Period?
I think this is what you're trying to achieve.
A lot of times when you see a repeating column in a database table (Qty1, Qty2, Qty3), it is a sign that you really want multiple rows, each with a single Qty and the other information repeated, at least when you are building reports. That way you can have rows with any number of instances of Qty, and you can also easily aggregate all the Qty values together when needed.
There are situations where you may want to stick with a repeating field design. But if you do want to reshape the data, you can do that in Tableau's data connection window by selecting the columns you want to pull out into a single field and selecting the pivot command.
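The reshaping described above, turning the repeating Qty columns into one row per quantity, can be illustrated outside Tableau with a short Python sketch. This only shows the shape of the data before and after. The function name `unpivot` and the output column names `Measure` and `Qty` are made up for the example and are not what Tableau calls the pivoted fields.

```python
# Two of the wide rows from the question's table.
WIDE = [
    {"Department": "Dept1", "Employee": "John", "Class": 1, "Period": "1st",
     "Qty1": 1, "Qty2": 2, "Qty3": 3},
    {"Department": "Dept1", "Employee": "John", "Class": 1, "Period": "2nd",
     "Qty1": 11, "Qty2": 22, "Qty3": 33},
]


def unpivot(rows, id_fields, value_fields, name_col="Measure", value_col="Qty"):
    """Turn each wide row into one long row per value field."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            new_row = {key: row[key] for key in id_fields}
            new_row[name_col] = field       # which Qty column the value came from
            new_row[value_col] = row[field]  # the quantity itself
            long_rows.append(new_row)
    return long_rows


LONG = unpivot(
    WIDE,
    id_fields=["Department", "Employee", "Class", "Period"],
    value_fields=["Qty1", "Qty2", "Qty3"],
)
for row in LONG:
    print(row)
```

Each wide row becomes three long rows. With the data in this shape, the field holding the original column name can be placed on Columns next to Period, which gives the sub-columns under each period.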

Select from table removing similar rows - PostgreSQL

There is a table with document revisions and authors. It looks like this:
doc_id rev_id rev_date editor title,content so on....
123 1 2016-01-01 03:20 Bill ......
123 2 2016-01-01 03:40 Bill
123 3 2016-01-01 03:50 Bill
123 4 2016-01-01 04:10 Bill
123 5 2016-01-01 08:40 Alice
123 6 2016-01-01 08:41 Alice
123 7 2016-01-01 09:00 Bill
123 8 2016-01-01 10:40 Cate
942 9 2016-01-01 11:10 Alice
942 10 2016-01-01 11:15 Bill
942 15 2016-01-01 11:17 Bill
I need to find the moments when a document was transferred to another editor, i.e. only the first row of every series of edits by the same editor.
Like so:
doc_id rev_id rev_date editor title,content so on....
123 1 2016-01-01 03:20 Bill ......
123 5 2016-01-01 08:40 Alice
123 7 2016-01-01 09:00 Bill
123 8 2016-01-01 10:40 Cate
942 9 2016-01-01 11:10 Alice
942 10 2016-01-01 11:15 Bill
If I use DISTINCT ON (doc_id, editor), it re-sorts the table and I see only one row per document and editor, which is incorrect.
Of course I can dump everything and filter with shell tools like awk | sort | uniq, but that does not work well for big tables.
Window functions like FIRST_ROW do not help much, because partitioning by doc_id, editor would mix separate series by the same editor together.
Is there a better way?
Thank you.
You can use lag() to get the previous value, and then a simple comparison:
select t.*
from (select t.*,
             lag(editor) over (partition by doc_id order by rev_date) as prev_editor
      from t
     ) t
where prev_editor is null or prev_editor <> editor;
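The logic of the lag() query, keeping a row only when its editor differs from the previous row's editor within the same document, can be sketched in plain Python for readers who want to check it against the sample data. The names `first_of_each_series` and `REVISIONS` are invented for this illustration, and the content columns are left out.

```python
from itertools import groupby
from operator import itemgetter

# Sample data from the question: (doc_id, rev_id, rev_date, editor).
REVISIONS = [
    (123, 1, "2016-01-01 03:20", "Bill"),
    (123, 2, "2016-01-01 03:40", "Bill"),
    (123, 3, "2016-01-01 03:50", "Bill"),
    (123, 4, "2016-01-01 04:10", "Bill"),
    (123, 5, "2016-01-01 08:40", "Alice"),
    (123, 6, "2016-01-01 08:41", "Alice"),
    (123, 7, "2016-01-01 09:00", "Bill"),
    (123, 8, "2016-01-01 10:40", "Cate"),
    (942, 9, "2016-01-01 11:10", "Alice"),
    (942, 10, "2016-01-01 11:15", "Bill"),
    (942, 15, "2016-01-01 11:17", "Bill"),
]


def first_of_each_series(revisions):
    """Keep the first revision of every run of consecutive edits by one editor."""
    # Sort by doc_id, then rev_date (ISO-style strings sort chronologically).
    ordered = sorted(revisions, key=itemgetter(0, 2))
    kept = []
    # groupby plays the role of PARTITION BY doc_id.
    for _, group in groupby(ordered, key=itemgetter(0)):
        prev_editor = None  # corresponds to lag(editor) being NULL on the first row
        for revision in group:
            editor = revision[3]
            if prev_editor is None or editor != prev_editor:
                kept.append(revision)
            prev_editor = editor
    return kept


kept = first_of_each_series(REVISIONS)
for row in kept:
    print(row)
```

For the sample data the kept rev_id values are 1, 5, 7, 8, 9, and 10, the same rows as in the desired result.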