Query function to contain date operations [DATE + X number of months] - date

I am querying a Google spreadsheet, using a relatively simple expression:
=QUERY(Sheet1!A1:J200, "Select A, J", 1)
This query produces a list of Offices and Last N dates in columns L and M - see picture below.
What I do next is
add 6 months to each of the Last N dates;
=IF(M2="","",DATE(YEAR(M2)+0,MONTH(M2)+6,DAY(M2)+0))
See if any of the resultant dates are equal to or greater than TODAY();
If YES, place "ALARM" into column O which is then used as a marker elsewhere, by filtering the rows with this value as an identifier.
=IF(today()>=X2,"ALARM","")
I was wondering if it is possible to create a query where 6 months would already be added to the values in column J and, possibly, the resulting list filtered so that only rows where the value in column J is greater than or equal to TODAY() remain. That way, column J would contain only Last N dates + 6 months that are >= TODAY().
All the examples I have checked seem to use dates only as filters.

=QUERY({Sheet1!A1:A,
ARRAYFORMULA(DATE(YEAR(Sheet1!J1:J), MONTH(Sheet1!J1:J)+6, DAY(Sheet1!J1:J)))},
"select Col1,Col2,'ALARM'
where Col1 is not null
and Col2 >=date '"&TEXT(TODAY(), "yyyy-mm-dd")&"'
label Col2'ABCD', 'ALARM''alarm'
format Col2 'dd-mmm-yyyy'", 1)
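For readers who want to check the logic outside the spreadsheet, here is a minimal Python sketch of what the formula above computes. It assumes that DATE() rolls an out-of-range month or day forward (so 31 August + 6 months lands in early March rather than on the last day of February); the function names and the sample rows are hypothetical.

```python
from datetime import date, timedelta


def add_months_rollover(d: date, months: int) -> date:
    """Add months the way DATE(YEAR(d), MONTH(d)+months, DAY(d)) is assumed to:
    overflowing months carry into the year, overflowing days roll forward."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    # Start at the 1st of the target month, then step forward DAY(d)-1 days.
    return date(year, month_index + 1, 1) + timedelta(days=d.day - 1)


def alarm_rows(rows, today, months=6):
    """Keep rows whose shifted date is on or after `today` and tag them."""
    result = []
    for office, last_date in rows:
        if last_date is None:  # skip rows without a date
            continue
        shifted = add_months_rollover(last_date, months)
        if shifted >= today:
            result.append((office, shifted, "ALARM"))
    return result


# Hypothetical sample data: (office, last date)
rows = [
    ("Office A", date(2023, 1, 15)),
    ("Office B", date(2023, 8, 31)),
    ("Office C", None),
]
print(alarm_rows(rows, today=date(2023, 9, 1)))
```

Flipping the comparison to `shifted <= today` gives the "already due" variant instead.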

=QUERY({FleetStatus!A1:D, ARRAYFORMULA(
DATE(YEAR(FleetStatus!J1:J), MONTH(FleetStatus!J1:J)+6, DAY(FleetStatus!J1:J)))},
"select Col1,Col5,'ALARM'
where Col1 is not null
and Col1 !='IVAN GUBKIN'
and Col1 !='VYACHESLAV TIKHONOV'
and Col4 != 'L'
and Col5 <=date '"&TEXT(TODAY(), "yyyy-mm-dd")&"'
label Col5'+6M', 'ALARM''Alarm'
format Col5 'dd-mmm-yyyy'", 1)

Related

Show first and last value in table

I have an Excel file with customers' purchasing details (sorted by date).
For example:
customer_id date $_Total_purchase
A 1/2/23 5
A 1/3/23 20
A 1/4/23 10
I want to show one row per customer in a table, so the final table will be:
customer_id date purchase_counter amount_of_last_purchase amount_of_first_purchase
A 1/4/23 3 10 5
In my table, customer_id is a dimension.
To extract the date, I use max(date) as a measure.
For purchase_counter I use count(customer_id).
To extract 'amount_of_first_purchase', I use firstSortedValue('$_Total_purchase', date).
How can I extract 'amount_of_last_purchase'? Is there an aggregation function I can use?
Thanks in advance :)
The simple answer is that you can use -date in your expression, which will return the last record:
FirstSortedValue('$_Total_purchase', -date)
The above will work for the provided data example. When there is more than one customer, the Aggr function can help:
First: FirstSortedValue(aggr(sum($_Total_purchase), customer_id, date), date)
Last: FirstSortedValue(aggr(sum($_Total_purchase), customer_id, date), -date)
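To make the semantics concrete, the following Python sketch mimics what these expressions do per customer: pick the value that belongs to the smallest sort weight, and negate the weight (here: take the maximum date) to get the last one. It is an illustration only, not Qlik code; the helper name first_sorted_value and the sample rows are assumptions, and ties between equal dates are not handled.

```python
from datetime import date


def first_sorted_value(rows, customer, newest=False):
    """Return the purchase amount with the earliest (or latest) date for one customer."""
    own = [(d, amount) for cid, d, amount in rows if cid == customer]
    if not own:
        return None
    # Sorting by -date in the expression corresponds to taking the maximum date here.
    pick = max if newest else min
    return pick(own, key=lambda pair: pair[0])[1]


# Hypothetical rows: (customer_id, date, $_Total_purchase)
rows = [
    ("A", date(2023, 1, 2), 5),
    ("A", date(2023, 1, 3), 20),
    ("A", date(2023, 1, 4), 10),
    ("B", date(2023, 1, 5), 35),
    ("B", date(2023, 1, 6), 40),
    ("B", date(2023, 1, 7), 50),
]

for customer in ("A", "B"):
    first = first_sorted_value(rows, customer)
    last = first_sorted_value(rows, customer, newest=True)
    print(customer, first, last)
```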
Another approach (if applied to your case/data) is to flag the first and last records during the data load and use the flags in the measures.
An example script:
RawData:
Load * Inline [
customer_id, date, $_Total_purchase
A, 2/1/23, 5
A, 3/1/23, 20
A, 4/1/23, 10
B, 5/1/23, 35
B, 6/1/23, 40
B, 7/1/23, 50
];
Temp0:
Load
customer_id,
date,
// flag the first record
// if the current row is the beginning of the table then flag as isFirst = 1
// if the customer_id for the current row is different from the previously loaded
// customer_id then flag as isFirst = 1
if(RowNo() = 1 or customer_id <> peek(customer_id), 1, null()) as isFirst,
// getting the last is a bit more tricky
// similar logic - if the current and previous customer_id are different
// or it is the end of the table then get the current customer_id and date
// and combine their values. Values are separated with | ELSE write 0.
// for example: A|4/1/23 or B|7/1/23
if(customer_id <> peek(customer_id) and RowNo() <> 1, peek(customer_id) & '|' & peek(date),
if(RowNo() = NoOfRows('RawData'), customer_id & '|' & date, 0
)) as isLastTemp
Resident
RawData
;
// Get all the data from Temp0 for which isLastTemp is not equal to 0
// split isLastTemp by | -> first value is customer_id and second is date
// join the result back to the original table
join (RawData)
Load
SubField(isLastTemp, '|', 1) as customer_id,
SubField(isLastTemp, '|', 2) as date,
1 as isLast
Resident
Temp0
Where
isLastTemp <> 0
;
// join Temp0 to the original table
// but only grab the isFirst flag
join(RawData)
Load
customer_id,
date,
isFirst
Resident
Temp0
;
// this table is no longer needed
Drop Table Temp0;
Once the above script is reloaded, the RawData table will have two more columns: isFirst and isLast.
The expressions then become simpler:
First: sum( {< isFirst = {1} >} $_Total_purchase)
Last: sum( {< isLast = {1} >} $_Total_purchase)
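The flagging idea is independent of Qlik. As a rough Python sketch (assuming the rows are already sorted by customer and then by date, as in the inline table above; the function and key names are made up for illustration), each row is compared with its neighbours and the flagged amounts are then summed:

```python
def flag_first_last(rows):
    """Add is_first / is_last flags to rows sorted by customer_id, then date."""
    flagged = []
    for i, row in enumerate(rows):
        prev_id = rows[i - 1]["customer_id"] if i > 0 else None
        next_id = rows[i + 1]["customer_id"] if i + 1 < len(rows) else None
        flagged.append({
            **row,
            "is_first": row["customer_id"] != prev_id,  # customer changed, or first row
            "is_last": row["customer_id"] != next_id,   # customer changes next, or last row
        })
    return flagged


def sum_where(rows, flag):
    """Sum the amount per customer over rows where the given flag is set."""
    totals = {}
    for row in rows:
        if row[flag]:
            key = row["customer_id"]
            totals[key] = totals.get(key, 0) + row["amount"]
    return totals


rows = [
    {"customer_id": "A", "date": "2/1/23", "amount": 5},
    {"customer_id": "A", "date": "3/1/23", "amount": 20},
    {"customer_id": "A", "date": "4/1/23", "amount": 10},
    {"customer_id": "B", "date": "5/1/23", "amount": 35},
    {"customer_id": "B", "date": "6/1/23", "amount": 40},
    {"customer_id": "B", "date": "7/1/23", "amount": 50},
]
flagged = flag_first_last(rows)
print(sum_where(flagged, "is_first"))
print(sum_where(flagged, "is_last"))
```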
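You can also do this with pandas. Note that head(1) and tail(1) return the first and last rows of the whole table; grouping by customer_id gives the first and last purchase per customer:
import pandas as pd
# read the Excel file (assumed to be sorted by date, as in the question)
df = pd.read_excel('customer_purchases.xlsx')
# first and last rows of the whole table
first_value = df.head(1)
last_value = df.tail(1)
# first and last purchase amount per customer
per_customer = df.groupby('customer_id')['$_Total_purchase'].agg(['first', 'last'])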

is there a query formula to count the latest value in a specific date for a specific id along with all the unique ids's values?

For example, post id 18158935492035927 has a value of 42 in Col D on 08/24, and the same id has a value of 44 on 08/25.
For 08/25, is there a query formula that scans all the values in Col D and counts 44 rather than 42 for this post id, along with the values of all the other unique post ids?
From 08/24 to 08/25 there is a total of 3 unique post ids. For 08/25, is there a formula that scans all rows in Col D and sums (34+0+44) rather than (34+42)?
First, it would help you if you were to use a cell (like E2) where you can store the latest date in column A
=max(A3:A9)
Following that, you can use this formula
=QUERY(A3:D9,"select sum(D)
where A=DATE '"&TEXT(E2, "yyyy-mm-dd")&"'
label sum(D) '' ")
(Please adjust ranges to your needs)
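In plain terms, the two formulas find the most recent date in column A and add up column D for the rows on that date. A small Python sketch of the same two steps, with made-up post ids and a made-up year standing in for the sheet's data:

```python
from datetime import date


def sum_on_latest_date(rows):
    """Return (latest date, sum of values on that date) for (date, post_id, value) rows."""
    if not rows:
        return None, 0
    latest = max(d for d, _post_id, _value in rows)                  # =max(A3:A9)
    total = sum(value for d, _post_id, value in rows if d == latest)  # sum(D) where A = latest
    return latest, total


# Hypothetical data shaped like the example in the question
rows = [
    (date(2023, 8, 24), "post-1", 34),
    (date(2023, 8, 24), "post-3", 42),
    (date(2023, 8, 25), "post-1", 34),
    (date(2023, 8, 25), "post-2", 0),
    (date(2023, 8, 25), "post-3", 44),
]
print(sum_on_latest_date(rows))
```

Like the formula, this only counts post ids that actually have a row on the latest date.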

Need help in parsing column value based on value in other column

I have two columns, COL1 and COL2. COL1 has a value like 'Birds sitting on $1 and enjoying' and COL2 has a value like 'the.location_value[/tree,\building]'.
I need to update a third column, COL3, with values like 'Birds sitting on /tree and enjoying',
i.e. $1 in the first column is replaced with /tree,
which is the first word in the comma-separated list within the square brackets [] in COL2, i.e. [/tree,\building].
I would like to know the most suitable combination of string functions in PostgreSQL to achieve this.
You need to extract the first element of the comma-separated list. split_part() can do that, but before that you need to extract the actual list of values, which can be done using substring() with a regular expression:
substring(col2 from '\[(.*)\]')
will return /tree,\building
So the complete query would be:
select replace(col1, '$1', split_part(substring(col2 from '\[(.*)\]'), ',', 1))
from the_table;
Online example: http://rextester.com/CMFZMP1728
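The same three steps (capture the bracketed list, take its first element, substitute it for $1) can be sketched in Python for anyone who wants to test the logic on sample strings first. The function name is hypothetical, and returning col1 unchanged when no brackets are found is a choice made for this sketch, not something the SQL above does.

```python
import re


def fill_first_placeholder(col1: str, col2: str) -> str:
    """Replace $1 in col1 with the first item of the [...] list found in col2."""
    match = re.search(r"\[(.*)\]", col2)       # same pattern as substring(col2 from ...)
    if match is None:
        return col1                            # assumption: leave col1 untouched
    first_item = match.group(1).split(",")[0]  # like split_part(..., ',', 1)
    return col1.replace("$1", first_item)      # like replace(col1, '$1', ...)


col1 = "Birds sitting on $1 and enjoying"
col2 = r"the.location_value[/tree,\building]"
print(fill_first_placeholder(col1, col2))
```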
This one should work with any (int) number after $:
select t.*, c.col3
from t,
lateral (select string_agg(case
when o = 1 then s
else (string_to_array((select regexp_matches(t.col2, '\[(.*)\]'))[1], ','))[(select regexp_matches(s, '^\$(\d+)'))[1]::int] || substring(s from '^\$\d+(.*)')
end, '' order by o) col3
from regexp_split_to_table(t.col1, '(?=\$\d+)') with ordinality s(s, o)) c
http://rextester.com/OKZAG54145
Note: it is not the most efficient, though. It splits col2's values (in the square brackets) each time it replaces a $N.
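As a point of comparison, the $N substitution can be sketched in Python with the list split only once. This is an illustration of the idea rather than a translation of the query: the function name is made up, and leaving an out-of-range $N in place is an assumption of this sketch.

```python
import re


def fill_placeholders(col1: str, col2: str) -> str:
    """Replace every $N in col1 with the N-th item of the [...] list in col2."""
    match = re.search(r"\[(.*)\]", col2)
    items = match.group(1).split(",") if match else []  # split once, reuse for every $N

    def substitute(placeholder):
        index = int(placeholder.group(1))
        if 1 <= index <= len(items):
            return items[index - 1]
        return placeholder.group(0)  # assumption: keep unknown placeholders as-is

    return re.sub(r"\$(\d+)", substitute, col1)


col2 = r"the.location_value[/tree,\building]"
print(fill_placeholders("Birds sitting on $1 and $2", col2))
```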
Update: LATERAL and WITH ORDINALITY are not supported in older versions, but you could try a correlated subquery instead:
select t.*, (select array_to_string(array_agg(case
when s ~ E'^\\$(\\d+)'
then (string_to_array((select regexp_matches(t.col2, E'\\[(.*)\\]'))[1], ','))[(select regexp_matches(s, E'^\\$(\\d+)'))[1]::int] || substring(s from E'^\\$\\d+(.*)')
else s
end), '') col3
from regexp_split_to_table(t.col1, E'(?=\\$\\d+)') s) col3
from t

Finding if values in two columns exist

I have two columns of dates and I want to run a query that returns TRUE if there is a date in existence in the first column and in existence in the second column.
I know how to do it when I'm looking for a match (the entry in column A is the SAME as the entry in column B), but I don't know how to check whether an entry exists in both column A and column B.
Does anyone know how to do this? Thanks!
If data is present in a column, it IS NOT NULL. You can test for that on both columns, combined with an AND, to get your result:
SELECT (date1 IS NOT NULL AND date2 IS NOT NULL) AS both_dates
FROM mytable;
So, rephrasing:
For any two entries in table x with date columns a and b, is there some pair of rows x1 and x2 where x1.a = x2.b?
If that's what you're trying to do, you want a self-join, e.g., presuming the presence of a single key column named id:
SELECT x1.id, x2.id, x1.a AS x1_a_x2_b
FROM mytable x1
INNER JOIN mytable x2 ON (x1.a = x2.b);
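The two readings of the question give different results, which a small Python sketch on made-up rows shows. The first function mirrors the IS NOT NULL check per row, the second mirrors the self-join on x1.a = x2.b; the function names and the sample data are assumptions for illustration.

```python
from datetime import date


def both_present(rows):
    """Per row: are both date columns filled in? (the IS NOT NULL reading)"""
    return [(r["id"], r["a"] is not None and r["b"] is not None) for r in rows]


def matching_pairs(rows):
    """Pairs (x1.id, x2.id, date) where x1.a equals x2.b (the self-join reading)."""
    ids_by_b = {}
    for r in rows:
        if r["b"] is not None:
            ids_by_b.setdefault(r["b"], []).append(r["id"])
    return [
        (r["id"], other_id, r["a"])
        for r in rows
        if r["a"] is not None
        for other_id in ids_by_b.get(r["a"], [])
    ]


rows = [
    {"id": 1, "a": date(2023, 1, 1), "b": date(2023, 1, 2)},
    {"id": 2, "a": date(2023, 1, 2), "b": None},
    {"id": 3, "a": None, "b": date(2023, 1, 1)},
]
print(both_present(rows))
print(matching_pairs(rows))
```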

Perl + PostgreSQL-- Selective Column to Row Transpose

I'm trying to find a way to use Perl to further process PostgreSQL output. If there's a better way to do this in PostgreSQL itself, please let me know. I basically need to concatenate the values of certain columns (Realtime, Value) into a single row per group while keeping ID and CAT.
First time posting, so please let me know if I missed anything.
Input:
ID CAT Realtime Value
A 1 time1 55
A 1 time2 57
B 1 time3 75
C 2 time4 60
C 3 time5 66
C 3 time6 67
Output:
ID CAT Time Values
A 1 time1,time2 55,57
B 1 time3 75
C 2 time4 60
C 3 time5,time6 66,67
You could do this most simply in Postgres like so (using array columns):
CREATE TEMP TABLE output AS SELECT
id, cat, ARRAY_AGG(realtime) as time, ARRAY_AGG(value) as values
FROM input GROUP BY id, cat;
Then select whatever you want out of the output table.
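If the post-processing really has to happen outside the database, the grouping itself is short. Here is a sketch in Python (rather than Perl, purely for illustration), using the rows from the question; the function name is hypothetical and the values are joined as comma-separated strings to match the desired output.

```python
def group_rows(rows):
    """Collapse (id, cat, realtime, value) rows into one row per (id, cat)."""
    groups = {}
    for id_, cat, realtime, value in rows:
        times, values = groups.setdefault((id_, cat), ([], []))
        times.append(realtime)
        values.append(str(value))
    # One output row per group, ordered by id and cat
    return [
        (id_, cat, ",".join(times), ",".join(values))
        for (id_, cat), (times, values) in sorted(groups.items())
    ]


rows = [
    ("A", 1, "time1", 55),
    ("A", 1, "time2", 57),
    ("B", 1, "time3", 75),
    ("C", 2, "time4", 60),
    ("C", 3, "time5", 66),
    ("C", 3, "time6", 67),
]
for row in group_rows(rows):
    print(*row)
```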
SELECT id
, cat
, string_agg(realtime, ',') AS realtimes
, string_agg(value, ',') AS values
FROM input
GROUP BY 1, 2
ORDER BY 1, 2;
string_agg() requires PostgreSQL 9.0 or later and concatenates all values into a delimiter-separated string, while array_agg() (v8.4+) creates an array out of the input values.
About 1, 2 - I quote the manual on the SELECT command:
GROUP BY clause
expression can be an input column name, or the name or ordinal number
of an output column (SELECT list item), or ...
ORDER BY clause
Each expression can be the name or ordinal number of an output column
(SELECT list item), or
Emphasis mine. So that's just notational convenience. Especially handy with complex expressions in the SELECT list.