My Postgres DB has a Price table where I store price data for a number of products. For each Price row I store when it was created (Price.timestamp), and whenever there is a new price for a product I create a new Price row and record when the old one ended (Price.time_ended). Both times are datetime columns.
Now, I want to count how many Prices are active on each day over a time period. Easy, I thought, so I wrote the query below:
trunc_date = db.func.date_trunc('day', Price.timestamp)
query = db.session.query(trunc_date, db.func.count(Price.id))
query = query.order_by(trunc_date.desc())
query = query.group_by(trunc_date)
prices_count = query.all()
That works, but it only counts how many prices were created on each day. So I thought I could filter for prices where trunc_date falls between the beginning and the end of the Price, like below:
query = query.filter(Price.timestamp < trunc_date < Price.time_ended)
But apparently you are not allowed to use trunc_date this way (Python evaluates the chained comparison as two separate comparisons joined by and, which doesn't translate to a SQL expression). Can anyone help me with how I am supposed to write my query?
Data example:
Price.id   Price.timestamp   Price.time_ended
1          2022-09-18        2022-09-26
2          2022-09-13        2022-09-20
The query result I would like to get is:
2022-09-27; 0
2022-09-26; 1
2022-09-25; 1
...
2022-09-20; 2
2022-09-19; 2
2022-09-18; 2
2022-09-17; 1
...
2022-09-12; 0
Have you tried separating the conditions inside the filter?
query = (
    db.session.query(trunc_date, db.func.count(Price.id))
    .filter(
        Price.timestamp < trunc_date,
        trunc_date < Price.time_ended,
    )
    .group_by(trunc_date)
    .order_by(trunc_date.desc())
    .all()
)
You can use
trunc_date.between(Price.timestamp, Price.time_ended)
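For example, dropped into the query from the question (a minimal sketch, reusing the trunc_date expression defined there):

query = (
    db.session.query(trunc_date, db.func.count(Price.id))
    .filter(trunc_date.between(Price.timestamp, Price.time_ended))
    .group_by(trunc_date)
    .order_by(trunc_date.desc())
)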
I figured it out.
First I created a date range by using a subquery.
from datetime import datetime, timedelta

todays_date = datetime.today() - timedelta(days=1)
numdays = 360
min_date = todays_date - timedelta(days=numdays)
date_series = db.func.generate_series(min_date, todays_date, timedelta(days=1))
trunc_date = db.func.date_trunc('day', date_series)  # date_trunc expects the singular 'day'
subquery = db.session.query(trunc_date.label('day')).subquery()
Then I used the subquery as input in my original query, and I was finally able to filter on the dates from the subquery.
query = db.session.query(subquery.c.day, db.func.count(Price.id))
query = query.order_by(subquery.c.day.desc())
query = query.group_by(subquery.c.day)
query = query.filter(Price.timestamp < subquery.c.day)
query = query.filter(Price.time_ended > subquery.c.day)
Now, query.all() will give you a nice list that counts the prices for each day specified in the date_series.
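For example, a small usage sketch (assuming the query object built above):

prices_count = query.all()
for day, count in prices_count:
    # day is a datetime produced by date_trunc, count is the number of active prices
    print(day.date(), count)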
I would like to join my two tables on [Group] and [YearMonth] dates, where [YRMO_NB] from Table 2 falls between [ENR_START] and [ENR_END] from Table 1, then repeat the value of column [PHASE] for each related row (just like the last column, [PHASE], in the second picture) and leave unmatched rows blank.
I did this, which only gives me exact matches:
ON A.GROUP = PHASE.GROUP
AND A.YRMO_NB = PHASE.ENR_START
Table 1
Table 2
Is there an easy way to do this?
Thank you!
I figured it out:
ON A.GROUP = PHASE.GROUP
AND A.YRMO_NB >= PHASE.ENR_START
AND A.YRMO_NB <= PHASE.ENR_END
The ask is to create a measure (not a calculated column using EARLIER) that fetches the minimum unit price of a product, grouped by year and "UnitofMeasure", and shows it as "UnitPriceMin" against the complete list of data in a separate column, as shown below:
You can try the measure below.
I have used the columns "year", "description", and "unitmeasure" for the grouping. You can add or remove columns as per your needs.
I have assumed your table name is "product_details"; change it to match your own table name.
group_wise_min =
VAR current_row_year = MIN(product_details[year])
VAR current_row_product = MIN(product_details[description])
VAR current_row_unit_measure = MIN(product_details[unitmeasure])
RETURN
CALCULATE(
    MIN(product_details[unitprice]),
    FILTER(
        ALL(product_details),
        product_details[year] = current_row_year
            && product_details[description] = current_row_product
            && product_details[unitmeasure] = current_row_unit_measure
    )
)
The output will be as below:
I have a portal on my "Clients" table. The related table contains the results of surveys that are updated over time. For each combination of client and category (a field in the related table), I only want the portal to display the most recently collected row.
Here is a link to a trivial example that illustrates the issue I'm trying to address. I have two tables in this example (Related on ClientID):
Clients
Table 1 Get Summary Method
The Table 1 Get Summary Method table looks like this:
Where:
MaxDate is a summary field = Maximum of Date
MaxDateGroup is a calculated field = GetSummary ( MaxDate ; ClientIDCategory )
ShowInPortal = If ( Date = MaxDateGroup ; 1 ; 0 )
The table is sorted on ClientIDCategory
Issue 1 that I'm stumped on:
ShowInPortal should equal 1 in row 3 (PKTable01 = 5), row 4 (PKTable01 = 6), and row 6 (PKTable01 = 4) in the table above. I'm not sure why FM is interpreting 1Red and 1Blue as the same category, or perhaps I'm just misunderstanding what the GetSummary function does.
The Clients table looks like this:
Where:
The portal records are sorted on ClientIDCategory
Issue 2 that I'm stumped on:
I only want rows with a ShowInPortal value equal to 1 to appear in the portal. I tried creating a portal filter with the following formula: Table 1 Get Summary Method::ShowInPortal = 1. However, using that filter removes all rows from the portal.
Any help is greatly appreciated.
One solution is to use ExecuteSQL to grab the max date. This removes the need for summary functions and sorts, and works as expected. I propose returning it as a number to avoid any issues with date formats.
GetAsTimestamp (
    ExecuteSQL (
        "SELECT DISTINCT COALESCE(MaxDate,'')
         FROM Survey
         WHERE ClientIDCategory = ?"
        ; "" ; "" ; ClientIDCategory
    )
)
Also, you need to change the ShowInPortal field to an unstored calc field with:
If ( GetAsNumber(Date) = MaxDateGroupSQL ; 1 ; 0 )
Then filter the portal on this field.
I can send you the sample file if you want.
I have a PostgreSQL database with one particular table that has many rows. One column in this table, called data, is a float array (REAL[]) that gets filled with an array of ~4500 elements. I want to access this table through some query via SQLAlchemy and the ORM.
How do I select all rows in the table where a subset of this column satisfies some condition, e.g. contains a range of values? For example, I want to select all rows where data contains values >= 10, or values between 10 and 20 inclusive.
Can I do this with a straight session query like
rows = session.query(Table).filter(Table.data.(some conditional)).all()
where my conditional is something like "VALUES >= 10 and VALUES <= 20"?
Or do I need to define some special methods, or setup, when I'm defining my SQLAlchemy table class? For example, I have my table set up as
class Table(Base):
    __tablename__ = 'table'
    __table_args__ = {'autoload': True, 'schema': 'testdb', 'extend_existing': True}

    data = deferred(Column(ARRAY(Float)))

    def __repr__(self):
        return '<Table (pk={0})>'.format(self.pk)
Ideally I'd like to set it up so I can just do simple filtering in my session.query calls. Is this possible? I'm not super familiar with the ORM, so maybe it is?
I've had a look at the ARRAY Comparator sqlalchemy docs but those only seem to work on exact values. My data is precise to 6 sigfigs, and I don't know the exact values ahead of time.
What's the best way to do this? Thanks.
EDIT:
Based on the comment below, here is the code I'm using in attempting to select all rows (out of 1000) whose data column contains values >= 1.0. There should be 537 rows.
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This gives the correct subset number: len(rows) = 537. However, I don't understand the logic of this operator: to select data >= 1.0, why do I use the le operator? Along the same lines, there should be 234 rows that have data with values between >= 1.0 and <= 1.2, but this statement fails to give the correct subset:
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
EDIT 2:
Here's an example of my database Table with a few rows. pk is an integer, and data is a real[].
db: datadb
schema: Table
pk data
0 [0.0,0.0,0.5,0.3,1.3,1.9,0.3,0.0,0.0]
1 [0.1,0.0,1.0,0.7,1.1,1.5,1.2,0.3,1.4]
2 [0.0,0.6,0.4,0.3,1.6,1.7,0.4,1.3,0.0]
3 [0.0,0.1,0.2,0.4,1.0,1.1,1.2,0.9,0.0]
4 [0.0,0.0,0.5,0.3,0.2,0.1,0.7,0.3,0.1]
I have 5 rows; 4 of them have data with values >= 1.0, while just 2 have values in the range >= 1.0 and <= 1.2. In the first case, the query I would use to grab the rows is
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This should return the 4 rows, at pk=0,1,2,3, and this query does what I expect. The second case is
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
and should return the 2 rows at pk=1,3. However, this query just returns the 4 rows from the first query. For the second query, I also tried
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le),datadb.Table.data.any(1.2,operator=operators.ge)).all()
which also didn't work.
Please read the documentation on ARRAY.Comparator, according to which you should be able to do the following:
from sqlalchemy.sql import operators

rows = (session.query(Table)
        .filter(Table.data.any(10, operator=operators.le))
        .filter(Table.data.any(20, operator=operators.ge))
        .all())
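If the choice of le looks backwards: any(value, operator=op) renders the bound value on the left-hand side of the operator. A small sketch to illustrate (assuming the Table class from the question):

from sqlalchemy.sql import operators

# Renders roughly as "10 <= ANY (table.data)" (with 10 as a bound parameter),
# i.e. "at least one element of data is >= 10"; the value sits on the left,
# which is why le, not ge, expresses "element >= 10".
print(Table.data.any(10, operator=operators.le))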
EDIT:
# The combined filter does not work, because each .any() can be satisfied by a
# different array element; but applying one or the other is still useful as it
# reduces the result set.
q = (session.query(MyTable)
     .filter(MyTable.data.any(1.0, operator=operators.le))
     # .filter(MyTable.data.any(1.2, operator=operators.ge))
     )

# filter in memory: keep rows where a single element falls in [1.0, 1.2]
items = [_row for _row in q.all()
         if any(1.0 <= item <= 1.2 for item in _row.data)]
for item in items:
    print(item)