Populate cells defined by a date period (start date - end date)

Sorry if this post is in fact a duplicate; I just could not google anything similar and I am a bit stuck on the approach.
I am trying to populate cells in one sheet depending on the dates in rows of a different sheet, like this:
Sheet1 - entry sheet
ID | Name | Start date | End date
10 | Mike | 1.06.2016 | 2.06.2016
13 | Dido | 1.06.2016 | 5.06.2016
8 | Rene | 2.06.2016 | 20.06.2016
Sheet2 - report sheet
ids/dates | 1.06.2016 | 2.06.2016 | 3.06.2016 | date+1
8 | | Rene | Rene | Rene
10 | Mike | Mike | |
13 | Dido | Dido | Dido | Dido
The Name cells in Sheet2 are to be populated based on Sheet1's ID, Start date, and End date columns. The position of each populated cell in Sheet2 is defined by the ID column and the date row, which should match the corresponding values in Sheet1.

This report can be built with a single formula. Please check this Example File.
Assumptions
Suppose, you have Sheet1 with data:
Col A: ID
Col B: Name
Col C: Start date
Col D: End Date
Case 1. IDs are unique.
Go to Sheet2 and paste this formula in it:
={{"ids/dates";filter(Sheet1!A2:A,Sheet1!A2:A<>"")},{ArrayFormula(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1));ArrayFormula(if(--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)>=filter(Sheet1!C2:C,Sheet1!C2:C<>0))*--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)<=filter(Sheet1!D2:D,Sheet1!C2:C<>0))=1,VLOOKUP(FILTER(Sheet1!A2:A,Sheet1!A2:A<>""),Sheet1!A:B,2,0),""))}}
That's all. The report will expand automatically when new data arrives on Sheet1. The report will return an error if the data on Sheet1 is incomplete (missing names or dates).
Case 2. IDs are NOT unique.
This solution works when IDs are not unique; rows with the same ID are grouped together. In this case, one ID belongs to one person.
The formula will be a bit longer:
={{"ids/dates";sort(UNIQUE(filter(Sheet1!A2:A,Sheet1!A2:A<>"")))},{ArrayFormula(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1));ArrayFormula(if(QUERY(QUERY({filter(Sheet1!A2:A,Sheet1!A2:A<>""),ArrayFormula((--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)>=filter(Sheet1!C2:C,Sheet1!C2:C<>0))*--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)<=filter(Sheet1!D2:D,Sheet1!C2:C<>0))))},"select Col1, sum(Col"&JOIN("), sum(Col",ArrayFormula(COLUMN(OFFSET(B2,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))))&") group by Col1"),"Select Col"&JOIN(", Col",ArrayFormula(COLUMN(OFFSET(B2,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))))&" where Col1>0",0)=1,VLOOKUP(sort(UNIQUE(filter(Sheet1!A2:A,Sheet1!A2:A<>""))),Sheet1!A:B,2,0),""))}}
See example here.
Case 3. IDs are NOT unique. One ID <> one name
Here's a working example; please check it. This case is the hardest one: we can have multiple IDs referring to multiple names. The final formula:
={{"ids/dates",ArrayFormula(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1))};{sort(UNIQUE(FILTER(Sheet1!A2:A,Sheet1!A2:A<>""))),ArrayFormula(IFERROR(VLOOKUP(QUERY(QUERY({FILTER(Sheet1!A2:B,Sheet1!A2:A<>""),ArrayFormula(--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)>=filter(Sheet1!C2:C,Sheet1!C2:C<>0))*--(add(MIN(Sheet1!C:D),COLUMN(OFFSET(A1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))-1)<=filter(Sheet1!D2:D,Sheet1!C2:C<>0))*row(OFFSET(A1,,,rows(FILTER(Sheet1!A2:B,Sheet1!A2:A<>"")))))},"select Col1, sum(Col"&JOIN("), sum(Col",ArrayFormula(COLUMN(OFFSET(C1,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))))&") group by Col1"),"Select Col"&JOIN(", Col",ArrayFormula(COLUMN(OFFSET(B2,,,1,MAX(Sheet1!C:D)-MIN(Sheet1!C:D)))))&" where Col1>0",0),{ArrayFormula(row(OFFSET(A1,,,rows(FILTER(Sheet1!A2:B,Sheet1!A2:A<>""))))),FILTER(Sheet1!A2:B,Sheet1!A2:A<>"")},3,0)))}}
The formula will work incorrectly if two date ranges for the same ID intersect:
102 Mike 6/21/2016 6/27/2016
102 Mike 6/11/2016 6/22/2016


How to convert nested json to data frame with kdb+

I am trying to get the data from cryptostats as shown below; it gives me back nested JSON. I want it to be in a table format. How do I do that?
query:"https://api.cryptostats.community/api/v1/fees/oneDayTotalFees/2023-02-07";
raw:.Q.hg query;
res:.j.k raw;
To get the JSON, use https://api.cryptostats.community/api/v1/fees/oneDayTotalFees/2023-02-07
To view the JSON in a table format, use https://jsongrid.com/json-grid
The final result should be a kdb+ table that has all the columns from the nested JSON output.
They are all dictionaries
q)distinct type each res[`data]
,99h
But they do not collapse to a table because they do not all have matching keys
q)distinct key each res[`data]
`id`bundle`results`metadata`errors
`id`bundle`results`metadata
Looking at a row where errors is populated we can see it is a dictionary
q)res[`data;0;`errors]
oneDayTotalFees| "Error executing oneDayTotalFees on compound: Date incomplete"
You can create a prototype dictionary with a blank errors key in it and join (,) each piece of data onto it. This will result in uniform dictionaries, which will be promoted to a table (type 98h):
q)table:(enlist[`errors]!enlist (`$())!()),/:res`data
q)type table
98h
Row which already had errors is unaffected:
q)table 0
errors | (,`oneDayTotalFees)!,"Error executing oneDayTotalFees..
id | "compound"
bundle | 0n
results | (,`oneDayTotalFees)!,0n
metadata| `source`icon`name`category`description`feeDescription;..
Row which previously did not have errors now has a valid empty dictionary
q)table 1
errors | (`symbol$())!()
id | "swapr-ethereum"
bundle | "swapr"
results | (,`oneDayTotalFees)!,24.78725
metadata| `category`name`icon`bundle`blockchain`description`feeDescription..
https://kx.com/blog/kdb-q-insights-parsing-json-files/
https://code.kx.com/q/ref/join/
https://code.kx.com/q/kb/faq/#construction
https://code.kx.com/q/basics/datatypes/
https://code.kx.com/q/ref/maps/#each-left-and-each-right
If you want to explore nested objects you can index at depth (see blog post linked above). If you have many sparse keys leaving it like this is efficient for storage:
q)select tokenSymbol:metadata[::;`tokenSymbol] from table where not ""~/:metadata[::;`tokenSymbol]
tokenSymbol
-----------
"HNY"
If you do wish to explode a nested field you can run similar to:
q)table:table,'{flip c!flip table[`metadata]#\:(c:distinct raze key each table[`metadata])}[]
q)meta table
c | t f a
----------------| -----
errors |
id | C
bundle | C
results |
metadata |
source | C
icon | C
name | C
category | C
description | C
feeDescription | C
blockchain | C
website | C
tokenTicker | C
tokenCoingecko | C
protocolLaunch | C
tokenLaunch | C
adapter | C
subtitle | C
events | C
shortName | C
protocolShutdown| C
tokenSymbol | C
subcategory | C
tokenticker | C
tokencoingecko | C
Care needs to be taken when filling in nulls and keeping consistent data types in each column. In this dataset the events tag inside metadata is tabular data:
q)select distinct type each events from table
events
------
10
98
0
This would need to be cleaned similar to:
q)table:update events:count[i]#enlist ([] date:();description:()) from table where not 98h=type each events
The data returned from the API contains dictionaries with two distinct sets of keys:
q)distinct key each res`data
`id`bundle`results`metadata`errors
`id`bundle`results`metadata
One simple way to convert this to a table is to enlist each dictionary first, converting them to tables, then joining with uj:
q)(uj/)enlist each res`data
id bundle results metadata ..
-----------------------------------------------------------------------------..
"compound" 0n (,`oneDayTotalFees)!,0n `source`i..
"swapr-ethereum" "swapr" (,`oneDayTotalFees)!,24.78725 `category..
...
This works because uj generalises the join operator (,), allowing tables with different schemas that share common columns to be combined.

Aggregate function to extract all fields based on maximum date

In one table I have duplicate values that I would like to group, exporting only those fields where the value in the "published_at" field is the most up-to-date (the latest date possible). Do I understand correctly that if I use the MAX aggregate function, the corresponding fields I extract will refer to the row with the max found, or will it take the first one found in the table?
Let me demonstrate this with a simple example (in the real-world case I am also joining two different tables). I would like to group by id and extract all fields, but only those relating to the max published_at. My query would be:
SELECT "t1"."id", "t1"."field", MAX("t1"."published_at") as "published_at"
FROM "t1"
GROUP By "t1"."id"
| id | field | published_at |
---------------------------------
| 1 | document1 | 2022-01-10 |
| 1 | document2 | 2022-01-11 |
| 1 | document3 | 2022-01-12 |
The result I want is:
1 - document3 - 2022-01-12
Also, one more question: why am I getting this error "ERROR: column "t1"."field" must appear in the GROUP BY clause or be used in an aggregate function"? Can I use the MAX function on a string-type column?
If you want the latest row for each id, you can use DISTINCT ON. For example:
select distinct on (id) *
from t
order by id, published_at desc
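For example, applied to the question's own table and column names (t1, field, published_at), a minimal sketch:
select distinct on (id) id, field, published_at
from t1
order by id, published_at desc;
-- for the sample data above this returns: 1 | document3 | 2022-01-12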
If you just want the latest row in the whole result set you can use LIMIT. For example:
select *
from t
order by published_at desc
limit 1
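As for the two side questions: MAX does work on a text column in Postgres (it returns the greatest value in sort order), but each aggregate is evaluated independently, so MAX(field) is not guaranteed to come from the same row as MAX(published_at); and the error occurs because field is selected without being grouped or aggregated. A sketch using the question's table name t1:
-- runs without error because every selected column is grouped or aggregated,
-- but "field" here is just the alphabetically greatest value per id,
-- not necessarily the value from the row with the latest published_at
select id, max(field) as field, max(published_at) as published_at
from t1
group by id;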

Drools - Finding a single matching condition for a table of products ranked by consumers

I have a table displaying information for the top four ratings of produce in a store. I want to be able to find specific products in this rating table. Here is the structure of the table:
----------------------------------------------------------------------------
sectId | product_code | product_category | consumer_ranking
10444  | 11222        | PRODUCE          | RATING_1
10444  | 45555        | PRODUCE          | RATING_1
10444  | 10005        | PRODUCE          | RATING_1
20555  | 11344        | PRODUCE          | RATING_2
20555  | 94003        | PRODUCE          | RATING_2
... and so on.
I wrote a rule to find inserted products which is not working the way I want, i.e. to find the targeted fact that was inserted into the table. Here is the rule I put together:
rule "find by product codes rating_1"
when
$product_table: ProductRanking( $rank1: this.getProductCodesRankFirst())
$product1 : Product( this.product_code memberOf $rank1, $product_code: product_code )
$product2 : Product( this.product_code == 10444,this.product_code != $product_code ,$product_code2: product_code)
then
System.out.println("Found Products for product_codes "+$product_code+ " "+$product_code2 ) ;
end
Unfortunately, this returns 3 rows. I inserted into the session the product in row 2, i.e. the product with code 45555, and it does find row 2. However, it also brings in row 1 and row 3.
I can see why it's doing that: it's because those SKUs all share sectId 10444. However, I want to bring in only the row that I inserted, which is sectId 10444, product_code 45555. How can I achieve that?
I solved it by using a global to filter out the extra products. In the first line that brings the rankings, I eliminate the extra-matching products this way:
global ProductHelper productHelper
$product_table : ProductRanking( $rank1 : productHelper.getProductCodesRankFirst(),
    productCode != productHelper.getProductCodeFruitCategory() &&
    productCode != productHelper.productCodeVegetableCategory() )
The ProductHelper identifies the product codes I want to eliminate and hence the extra 2 products brought in are ignored, creating a single match. I'm sure there is a better way, but since I'm no expert, this is what I was able to come up with.

Postgres Query for Beginners

OK, I deleted my previous post and will try this again. I am sure I don't know the topic well, and I'm not sure whether this needs a loop or a stored function, or how to get what I'm looking for. Here's the sample data and expected output.
I have a single table A. The table has the following fields: date created, unique person key, type, location.
I need a Postgres query that, for any given month (a parameter, based on date created) and a given location (a parameter based on the location field), returns the fields below where the unique person key appears again within plus or minus 30 days of the date created in the given month, for the same type but across all locations.
Example Data
Date Created | Unique Person | Type | Location
---------------------------------------------------
2/5/2017 | 1 | Admit | Hospital1
2/6/2017 | 2 | Admit | Hospital2
2/15/2017 | 1 | Admit | Hospital2
2/28/2017 | 3 | Admit | Hospital2
3/3/2017 | 2 | Admit | Hospital1
3/15/2017 | 3 | Admit | Hospital3
3/20/2017 | 4 | Admit | Hospital1
4/1/2017 | 1 | Admit | Hospital2
Output for the month of March for Hospital1:
DateCreated| UniquePerson | Type | Location | +-30days | OtherLoc.
------------------------------------------------------------------------
3/3/2017 | 2 | Admit| Hospital1 | 2/6/2017 | Hospital2
Output for the month of March for Hospital2:
None, because no one was seen at Hospital2 in March
Output for the month of March for Hospital3:
DateCreated| UniquePerson | Type | Location | +-30days | otherLoc.
------------------------------------------------------------------------
3/15/2017 | 3 | Admit| Hospital3 | 2/28/2017 | Hospital2
Version 1
I would use a WITH clause. Please notice that I've added a column id that is a primary key, to simplify the query; it's just to prevent rows from being matched with themselves.
WITH x AS (
    SELECT
        id,
        date_created,
        unique_person_id,
        type,
        location
    FROM
        a
    WHERE
        location = 'Hospital1' AND
        date_trunc('month', date_created) = date_trunc('month', '2017-03-01'::date)
)
SELECT
    x.date_created,
    x.unique_person_id,
    x.type,
    x.location,
    a.date_created AS "+-30days",
    a.location AS other_location
FROM
    x
    JOIN a
        USING (unique_person_id, type)
WHERE
    x.id != a.id AND
    abs(x.date_created - a.date_created) <= 30;
Now a little bit of explanation:
First we select, let's say, the reference data with a WITH clause. Think of it as a temporary table that we can reference in the main query. In the given example it would be a "main visit" at the given hospital.
Then we join the "main visits" with other visits of the same person and type (the JOIN condition) that happen within 30 days of each other (the WHERE condition).
Notice that the WITH query contains the limits you want to check (location and date). I use the date_trunc function, which truncates the date to the specified precision (a month in this case).
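For example, a quick illustration of date_trunc on its own, separate from the report query:
select date_trunc('month', date '2017-03-15');
-- 2017-03-01 00:00:00
select date_trunc('month', date '2017-03-15') = date_trunc('month', date '2017-03-01');
-- true, so the comparison matches every date_created that falls in March 2017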
Version 2
As Laurenz Albe suggested, there is no special need to use a WITH clause. Right, so here is a second version.
SELECT
    x.date_created,
    x.unique_person_id,
    x.type,
    x.location,
    a.date_created AS "+-30days",
    a.location AS other_location
FROM
    a AS x
    JOIN a
        USING (unique_person_id, type)
WHERE
    x.location = 'Hospital1' AND
    date_trunc('month', x.date_created) = date_trunc('month', '2017-03-01'::date) AND
    x.id != a.id AND
    abs(x.date_created - a.date_created) <= 30;
This version is shorter than the first one but, in my opinion, the first is easier to understand. I don't have a big enough data set to test with, and I wonder which one runs faster (the query planner shows similar values for both).
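If you want to check this on your own data (assuming the table a is populated), one way is to prefix each version with EXPLAIN ANALYZE and compare the reported plans and execution times, e.g. for Version 2:
EXPLAIN ANALYZE
SELECT
    x.date_created,
    x.unique_person_id,
    x.type,
    x.location,
    a.date_created AS "+-30days",
    a.location AS other_location
FROM
    a AS x
    JOIN a
        USING (unique_person_id, type)
WHERE
    x.location = 'Hospital1' AND
    date_trunc('month', x.date_created) = date_trunc('month', '2017-03-01'::date) AND
    x.id != a.id AND
    abs(x.date_created - a.date_created) <= 30;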

Check if field value is in a list of strings in SSRS report

I'm using SSRS (VS2008) and creating a report of work orders. In the detail line of the report table, I have the following columns (with some fake data)
WONUM | A | B | Hours
ABC123 | 3 | 0 | 3
SPECIAL| 0 | 6 | 6
DEF456 | 5 | 0 | 5
GHI789 | 4 | 0 | 4
OTHER | 0 | 2 | 2
As you can kind of see, all work orders have a work order number (WONUM) as well as a total # of hours (HOURS). I need to put the hours into either column A or column B based on WONUM. I have a list of specifically named work orders (in the example, they would be "SPECIAL" and "OTHER") which would cause the HOURS value to be put in column B. If the WONUM is NOT a special named one, then it goes in column A. Here's what I WANTED to put as the expression for column A and column B:
Column A: =IIF(Fields!WONUM.Value IN ("SPECIAL","OTHER"), 0, Fields!Hours.Value)
Column B: =IIF(Fields!WONUM.Value IN ("SPECIAL","OTHER"), Fields!Hours.Value, 0)
But as you're probably aware, Fields!WONUM.Value IN ("SPECIAL","OTHER") is not a valid method of doing this! What is the best way to make this work? I cannot flag it in the SQL query in any other way for other reasons so it must be done in the table.
Thanks in advance for any and all help!
Try this (using the InStr() function):
IIF(InStr(Fields!WONUM.Value,"SPECIAL")>0 OR InStr(Fields!WONUM.Value,"OTHER")>0, 0, Fields!Hours.Value)
IIF(InStr(Fields!WONUM.Value,"SPECIAL")>0 OR InStr(Fields!WONUM.Value,"OTHER")>0, Fields!Hours.Value,0)
If it's just the two WONUMs then you can do this:
Column A:
=IIF((Fields!WONUM.Value <> "SPECIAL") AND (Fields!WONUM.Value <> "OTHER"), Fields!Hours.Value, 0)
Column B:
=IIF((Fields!WONUM.Value = "SPECIAL") OR (Fields!WONUM.Value = "OTHER"), Fields!Hours.Value, 0)
Or use the same formula in each column for consistency and just swap the field/0 at the end.