Query that references itself - postgresql

I want to create a self-referencing, shifting basket of stocks (similar to the methodologies S&P uses for the S&P 500). The goal is to create an index whose composition changes every month. It guarantees a spot for the first stock by market cap (rank 1). The 2nd spot goes to the stock that was in the previous month's lineup, provided it ranks between 2 and 3 this month. If that stock ranks lower than 3, it gets excluded and a new stock slots into its place. Otherwise, the next closest stock gets chosen.
Given the table below, the index would include the following stocks:
2020-01-01 – AAPL + MSFT
2020-02-01 – AAPL + MSFT
2020-03-01 – AAPL + GOOG
In my real data, I obviously have many more stocks and many more months. I am having a very hard time modeling the second case in Postgres, since it requires a continuously updated "previous month's lineup" table that I need to reference when checking the current month. Any idea how to do this in PostgreSQL? I tried recursive CTEs and those didn't work (due to inner join requirements).
Table with structure below.

| date | stock | rank |
| --- | --- | --- |
| 2020-01-01 | AAPL | 1 |
| 2020-01-01 | MSFT | 2 |
| 2020-01-01 | META | 3 |
| 2020-01-01 | GOOG | 4 |
| 2020-02-01 | AAPL | 1 |
| 2020-02-01 | MSFT | 3 |
| 2020-02-01 | META | 2 |
| 2020-02-01 | GOOG | 4 |
| 2020-03-01 | AAPL | 1 |
| 2020-03-01 | MSFT | 4 |
| 2020-03-01 | META | 3 |
| 2020-03-01 | GOOG | 2 |
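
The rule in the question is stateful: each month's second slot depends on the previous month's result, which is what makes it awkward to express as a single set-based query. A minimal procedural sketch in Python (not PostgreSQL) may help pin down the logic before translating it. The names `build_index` and `buffer_rank` are hypothetical, and the data assumes every month lists all four stocks, i.e. that the GOOG rows belong to 2020-02-01 and 2020-03-01.

```python
# Sample data from the question; assumes each month lists all four stocks.
ROWS = [
    ("2020-01-01", "AAPL", 1), ("2020-01-01", "MSFT", 2),
    ("2020-01-01", "META", 3), ("2020-01-01", "GOOG", 4),
    ("2020-02-01", "AAPL", 1), ("2020-02-01", "MSFT", 3),
    ("2020-02-01", "META", 2), ("2020-02-01", "GOOG", 4),
    ("2020-03-01", "AAPL", 1), ("2020-03-01", "MSFT", 4),
    ("2020-03-01", "META", 3), ("2020-03-01", "GOOG", 2),
]


def build_index(rows, buffer_rank=3):
    """Return {date: [first_slot, second_slot]} for each month, in date order.

    buffer_rank is a hypothetical parameter: the worst rank at which last
    month's second-slot stock is still allowed to keep its place.
    """
    by_month = {}
    for date, stock, rank in rows:
        by_month.setdefault(date, {})[stock] = rank

    lineup = {}
    prev_second = None  # state carried from one month to the next
    for date in sorted(by_month):  # ISO date strings sort chronologically
        ranks = by_month[date]
        ordered = sorted(ranks, key=ranks.get)  # best rank first
        first = ordered[0]
        challengers = [s for s in ordered if s != first]
        keeps_spot = (
            prev_second in ranks
            and prev_second != first
            and ranks[prev_second] <= buffer_rank
        )
        # Incumbent stays inside the buffer; otherwise take the best-ranked
        # stock that is not already in the first slot.
        second = prev_second if keeps_spot else challengers[0]
        lineup[date] = [first, second]
        prev_second = second
    return lineup


lineup = build_index(ROWS)
for date, members in lineup.items():
    print(date, " + ".join(members))
```

On the sample data this reproduces the three lineups listed in the question. Any SQL version has to carry the same `prev_second` state from month to month.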

Related

Tableau : Average Not working as Dimensions not match between Filtered values

Issue: When more than one state is selected in Filters the average is not calculated as expected. For a single state, it is working fine.
The data source contains the following Dimension and Measures -
Dimension: SKU, Country, State, Run Date
Measure: Profit
Dataset is as follows -
| Country | State | Run Date | SKU | Profit |
| --- | --- | --- | --- | --- |
| USA | Texas | 2022-06-01 | Table | 20 |
| USA | Texas | 2022-06-01 | Chair | 30 |
| USA | Ohio | 2022-06-01 | Table | 5 |
| USA | Ohio | 2022-06-02 | Table | 10 |
Expected Result of Average (Country and State are used as Filters. Profit is used in values)-

| SKU | 2022-06-01 | 2022-06-02 |
| --- | --- | --- |
| Table | 12.5 | 5 |
| Chair | 15 | |
Current Result of Average (Country and State are used as Filters. Profit is used in values)-

| SKU | 2022-06-01 | 2022-06-02 |
| --- | --- | --- |
| Table | 12.5 | 10 |
| Chair | 30 | |
I have tried Window_AVG and used SUM and COUNT in the Calculated field but the results are not as expected.
Kindly help with a solution to solve the Averaging issue. Thanks a lot for the support.
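
One reading of the expected numbers is that each cell should be the summed Profit divided by the number of states selected in the filter, rather than by the number of rows that happen to exist (12.5 = 25/2, 15 = 30/2, 5 = 10/2). The Python sketch below only illustrates that arithmetic on the sample rows; it is not a Tableau calculation, and `average_over_selected_states` is a hypothetical name.

```python
# Sample rows from the question: (country, state, run_date, sku, profit)
ROWS = [
    ("USA", "Texas", "2022-06-01", "Table", 20),
    ("USA", "Texas", "2022-06-01", "Chair", 30),
    ("USA", "Ohio", "2022-06-01", "Table", 5),
    ("USA", "Ohio", "2022-06-02", "Table", 10),
]


def average_over_selected_states(rows, selected_states):
    """Sum profit per (sku, run_date), then divide by the number of
    selected states, so a state with no row counts as zero profit."""
    totals = {}
    for _country, state, run_date, sku, profit in rows:
        if state in selected_states:
            key = (sku, run_date)
            totals[key] = totals.get(key, 0) + profit
    n_states = len(selected_states)
    return {key: total / n_states for key, total in totals.items()}


both = average_over_selected_states(ROWS, {"Texas", "Ohio"})
for (sku, run_date), value in sorted(both.items()):
    print(sku, run_date, value)
```

A plain row-level average divides by the rows present instead, which would explain the 30 and 10 in the current result.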

Split and merge columns in Tableau

I know how to split the Country column into multiple columns (using a custom split on the commas), but I am not sure how to merge them as shown below (what I want). Any help is greatly appreciated.
**What I have**
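| ID | Type | Country |
| --- | --- | --- |
| 1 | TV Show | USA, UK, Spain, Sweden |
| 2 | Movie | USA, India |
| 3 | Movie | USA |
| 4 | TV Show | Bulgaria, USA |
| 5 | Movie | Sweden, Norway |
| 6 | Movie | UK, USA |
| 7 | TV Show | Germany |
| 8 | TV Show | India |
| 9 | Movie | USA, India |
| 10 | Movie | USA |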
**What I want**
USA
UK
Spain
Sweden
USA
India
USA
Bulgaria
USA
Sweden
Norway
UK
USA
Germany
India
USA
India
USA
It's actually something you can do in the datasource pane, even though it may not make much sense "inside" Tableau for "data handling".
As you said, the first step is splitting your Country column. After the split, each row has N columns, where N is the number of countries in the row that has the most countries (4 in your example).
Rows with just 1 country will have a value only in the first column; the remaining columns will be null.
Once you have done the first step, you need to pivot your data: select those N columns, right-click, and select Pivot.
Doing that you will get N*M rows, where N is the maximum number of countries per original row (4) and M is the number of your original rows (10).
Since, as said before, there can be null values, the final step should be filtering out all "new rows" that have null values.
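
The three steps above (split, pivot, filter nulls) can be sketched outside Tableau to check the row counts. This is a plain-Python illustration on the sample data, not Tableau's implementation; `split_pivot_filter` is a hypothetical name.

```python
# Sample rows from the question: (id, type, country list as one string)
ROWS = [
    (1, "TV Show", "USA, UK, Spain, Sweden"),
    (2, "Movie", "USA, India"),
    (3, "Movie", "USA"),
    (4, "TV Show", "Bulgaria, USA"),
    (5, "Movie", "Sweden, Norway"),
    (6, "Movie", "UK, USA"),
    (7, "TV Show", "Germany"),
    (8, "TV Show", "India"),
    (9, "Movie", "USA, India"),
    (10, "Movie", "USA"),
]


def split_pivot_filter(rows):
    # Step 1: split on commas, then pad every row to N columns with None.
    parts = [[c.strip() for c in country.split(",")] for _, _, country in rows]
    n_cols = max(len(p) for p in parts)
    padded = [p + [None] * (n_cols - len(p)) for p in parts]

    # Step 2: pivot the N columns into rows, giving N * M (id, country) pairs.
    pivoted = [
        (row[0], value)
        for row, columns in zip(rows, padded)
        for value in columns
    ]

    # Step 3: drop the pairs created by null padding.
    kept = [(row_id, value) for row_id, value in pivoted if value is not None]
    return n_cols, pivoted, kept


n_cols, pivoted, kept = split_pivot_filter(ROWS)
print(n_cols, len(pivoted), len(kept))
print([country for _, country in kept])
```

On this data the pivot yields 4 * 10 = 40 rows, of which 18 remain after the null filter, matching the 18 values in the desired output.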

calculation to determine average per event by year

I have a very large table of data containing cricket information. At the moment I am trying to get the average number of runs per match for Australia (and other countries) in the years 2013, 2014, and 2015. I was able to get the total runs per year into a graph, and currently I have a bar chart that looks like this:
year 2013 | 2014 | 2015
total runs 1037 | 1835 | 177
but I would like one that divides that total by the number of matches per year (6, 13, and 1 respectively) and looks like this:
year 2013 | 2014 | 2015
avg runs per match 173 | 141 | 177
but I don't know how to write a calculation that divides that total by the number of games played. There is a column in my set called 'MID' for Match ID. Counting the distinct MID values for 2013 would give me the needed number, 6.
Ideally, I would divide the total number of runs by the number of unique items in the MID column, but I do not know how to do this. If this makes any sense at all, would anyone have a simple way of doing this? I would really appreciate it, as I'm essentially experimenting with this on my own and falling way behind on my other projects.
Assuming you have a column named "Runs" and a column named "MID", then a calculation for Runs per Match would be as follows:
SUM(Runs) / COUNTD(MID)
This gives total runs divided by distinct count of Match ID.
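
To see what `SUM(Runs) / COUNTD(MID)` computes, here is a small Python sketch of the same arithmetic grouped by year. The rows are hypothetical (the question does not show row-level data), and `runs_per_match` is an illustrative name.

```python
# Hypothetical rows: (year, match_id, runs). A match appears on several
# rows, which is why a distinct count of MID is needed.
ROWS = [
    (2014, "m2", 150),
    (2014, "m3", 50),
    (2014, "m3", 82),
    (2015, "m1", 100),
    (2015, "m1", 77),
]


def runs_per_match(rows):
    """Total runs per year divided by the distinct match count per year."""
    totals = {}
    matches = {}
    for year, mid, runs in rows:
        totals[year] = totals.get(year, 0) + runs
        matches.setdefault(year, set()).add(mid)  # set gives COUNTD behaviour
    return {year: totals[year] / len(matches[year]) for year in totals}


averages = runs_per_match(ROWS)
print(averages)
```

Applied to the totals in the question, the same division gives 1037 / 6 ≈ 173 for 2013 and 1835 / 13 ≈ 141 for 2014.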

Calculating percentage of survivors per cohort over time in Tableau

In my dataset, I have three columns of data:
CustomerID, BoxCount, MonthCreated
1001 1 Aug 2015
1001 2 Aug 2015
1001 3 Aug 2015
1002 1 Sep 2015
1002 2 Sep 2015
In the screenshot below, I have built a table that displays the count of unique CustomerIDs at each BoxCount level, by cohort (MonthCreated, which is when the customer signed up).
BoxCount level 1 is the full number of people who signed up in MonthCreated X, because everyone who has signed up receives at least 1 box. Then people start cancelling. The number of people who reached BoxCount level 2 for May 2015 (according to the screenshot), is 156,823, or 86.87% of total people who signed up in May (180,525).
I need to create a second column next to the count of customers that displays the % of customers remaining at each BoxCount level, per cohort (people who signed up in the same month).
I have tried using the Quick Table calculation Percentage of Total, with the computation method being "Table (Down)" but it only seems to work for the first month of MonthCreated. I would like for each subsequent month to have 100% for BoxCount level 1, and the following % to be a portion of the number at each month's level 1. I can't figure out why for July, the % starts at 83.89% and not 100%.
Can anyone help me figure out how to calculate this percentage and also to add it as a new column instead of replacing the column of raw counts?
Thanks!
Looks like you're almost there. Did you try editing the table calculation and changing which values it summarizes from, or the level it computes at?
And to keep it from replacing the column of raw counts, you can just add the raw counts column to your view again and you'll have both.
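
The percentage being asked for is each BoxCount level's customer count divided by that same cohort's level-1 count, so every cohort restarts at 100%. A minimal Python sketch of that arithmetic, using only the May 2015 figures quoted in the question (`retention_by_cohort` is a hypothetical name):

```python
# {cohort: {box_count_level: unique customers}}; figures from the question.
COUNTS = {
    "May 2015": {1: 180525, 2: 156823},
}


def retention_by_cohort(counts):
    """Percent of each cohort's level-1 customers remaining at each level."""
    result = {}
    for cohort, levels in counts.items():
        base = levels[1]  # everyone who signed up receives at least 1 box
        result[cohort] = {
            level: 100 * customers / base for level, customers in levels.items()
        }
    return result


pct = retention_by_cohort(COUNTS)
print(round(pct["May 2015"][2], 2))
```

Because the denominator is taken per cohort, a table calculation that produces this must restart for each MonthCreated instead of running down the whole table.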

How to sample from KDB table to reduce data before querying?

I have a table of tick data representing prices of various financial instruments with up to millisecond precision. The problem is that there are over 5 billion entries, and even the most basic queries take several minutes.
I only need data with a precision of up to 1 second - is there an efficient way to sample the table so that the precision is reduced to roughly 1 second prior to querying? This should dramatically cut the amount of data and hence execution time.
So far, as a quick hack I've added the condition where i mod 2 = 0 to my query, but is there a better way?
The best way to bucket time data is with xbar. The example below uses 10-minute buckets:
q)select last price, sum size by 10 xbar time.minute from trade where sym=`IBM
minute| price size
------| -----------
09:30 | 55.32 90094
09:40 | 54.99 48726
09:50 | 54.93 36511
10:00 | 55.23 35768
...
more info http://code.kx.com/q/ref/arith-integer/#xbar
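
For readers less familiar with q, the idea behind `xbar` is to floor each timestamp to a multiple of the bucket width and then aggregate per bucket. The Python sketch below shows that idea for one-second buckets on hypothetical millisecond ticks; it illustrates the bucketing logic only and says nothing about kdb+ performance. `sample_last_per_second` is an illustrative name.

```python
# Hypothetical ticks: (milliseconds since midnight, price), sorted by time.
TICKS = [
    (34200000, 55.30),  # 09:30:00.000
    (34200250, 55.31),
    (34200999, 55.32),
    (34201001, 55.29),  # 09:30:01.001
    (34203500, 55.35),  # 09:30:03.500
]


def sample_last_per_second(ticks, width_ms=1000):
    """Keep the last price seen in each width_ms bucket."""
    buckets = {}
    for time_ms, price in ticks:
        bucket = time_ms - time_ms % width_ms  # floor to the bucket start
        buckets[bucket] = price  # later ticks overwrite earlier ones
    return sorted(buckets.items())


sampled = sample_last_per_second(TICKS)
print(sampled)
```

Unlike `i mod 2 = 0`, which keeps every other row regardless of its timestamp, bucketing keeps exactly one value per time interval that contains data.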