Filter by all parts of an LTREE field - postgresql

Let's say I have a table people with the following columns:
name/string, mothers_hierarchy/ltree
"josef", "maria.jenny.lisa"
How do I find all mothers of Josef in the people table?
I'm looking for an expression like this one (that actually works):
SELECT * FROM people where name IN (
    SELECT mothers_hierarchy from people where name = 'josef'
)

You can cast the names to ltree and then use index() to see if they are contained:
# select * from people;
┌───────┬───────────────────────┐
│ name  │ mothers_hierarchy     │
├───────┼───────────────────────┤
│ josef │ maria.jenny.lisa      │
│ maria │ maria                 │
│ jenny │ maria.jenny           │
│ lisa  │ maria.jenny.lisa      │
│ kate  │ maria.jenny.lisa.kate │
└───────┴───────────────────────┘
(5 rows)
# select *
from people j
join people m
on index(j.mothers_hierarchy, m.name::ltree) >= 0
where j.name = 'josef';
┌───────┬───────────────────┬───────┬───────────────────┐
│ name  │ mothers_hierarchy │ name  │ mothers_hierarchy │
├───────┼───────────────────┼───────┼───────────────────┤
│ josef │ maria.jenny.lisa  │ maria │ maria             │
│ josef │ maria.jenny.lisa  │ jenny │ maria.jenny       │
│ josef │ maria.jenny.lisa  │ lisa  │ maria.jenny.lisa  │
└───────┴───────────────────┴───────┴───────────────────┘
(3 rows)
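An alternative sketch, assuming (as in the sample data above) that each mother's own row stores her full path: ltree's ancestor operator @> expresses the same containment check directly, and it can use a GiST index on mothers_hierarchy:
SELECT m.*
FROM people j
JOIN people m
  ON m.mothers_hierarchy @> j.mothers_hierarchy  -- m's path is an ancestor of (or equal to) josef's path
WHERE j.name = 'josef'
  AND m.name <> j.name;  -- exclude josef's own row, which the index() version already skips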


PostgreSQL: Merge two rows and add the difference to a new column

We have an app which displays a table. This is what it looks like in the database:
┌──────────┬──────────────┬─────────────┬────────────┬──────────┬──────────────────┐
│ BatchId  │ ProductCode  │ StageValue  │ StageUnit  │ StageId  │ StageLineNumber  │
├──────────┼──────────────┼─────────────┼────────────┼──────────┼──────────────────┤
│ 0B001    │ 150701       │ LEDI2B4015  │            │ 37222    │ 1                │
│ 0B001    │ 150701       │ 16.21       │ KG         │ 37222    │ 1                │
│ 0B001    │ 150701       │ 73.5        │            │ 37222    │ 2                │
│ 0B001    │ 150701       │ LEDI2B6002  │ KG         │ 37222    │ 2                │
└──────────┴──────────────┴─────────────┴────────────┴──────────┴──────────────────┘
I would like to query the database so that the output looks like this:
┌──────────┬──────────────┬────────────────────┬─────────────┬────────────┬──────────┬──────────────────┐
│ BatchId  │ ProductCode  │ LoadedProductCode  │ StageValue  │ StageUnit  │ StageId  │ StageLineNumber  │
├──────────┼──────────────┼────────────────────┼─────────────┼────────────┼──────────┼──────────────────┤
│ 0B001    │ 150701       │ LEDI2B4015         │ 16.21       │ KG         │ 37222    │ 1                │
│ 0B001    │ 150701       │ LEDI2B6002         │ 73.5        │ KG         │ 37222    │ 2                │
└──────────┴──────────────┴────────────────────┴─────────────┴────────────┴──────────┴──────────────────┘
Is that even possible?
My PostgreSQL server version is 14.x.
I have looked at many threads about "merge two columns and add new one", but none of them seem to be what I want.
DB Fiddle link
SQL Fiddle (in case) link
It's possible to get your output, but it's going to be prone to errors. You should seriously rethink your data model, if at all possible. Storing floats as text and trying to parse them is going to lead to many problems.
That said, here's a query that will work, at least for your sample data:
SELECT batchid,
       productcode,
       max(stagevalue) FILTER (WHERE stagevalue ~ '^[a-zA-Z].*') as loadedproductcode,
       max(stagevalue::float) FILTER (WHERE stagevalue !~ '^[a-zA-Z].*') as stagevalue,
       max(stageunit) as stageunit,
       stageid,
       stagelinenumber
FROM datas
GROUP BY batchid, productcode, stageid, stagelinenumber;
Note that max is just used because you need an aggregate function to combine with the filter. You could replace it with min and get the same result, at least for these data.
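For a quick self-contained check, here is the same query run against the sample rows inlined as a VALUES list (a sketch; only the table name datas from the answer and the data from the question are assumed):
WITH datas (batchid, productcode, stagevalue, stageunit, stageid, stagelinenumber) AS (
    VALUES ('0B001', '150701', 'LEDI2B4015', NULL, 37222, 1),
           ('0B001', '150701', '16.21',      'KG', 37222, 1),
           ('0B001', '150701', '73.5',       NULL, 37222, 2),
           ('0B001', '150701', 'LEDI2B6002', 'KG', 37222, 2)
)
SELECT batchid,
       productcode,
       max(stagevalue) FILTER (WHERE stagevalue ~ '^[a-zA-Z].*') as loadedproductcode,
       max(stagevalue::float) FILTER (WHERE stagevalue !~ '^[a-zA-Z].*') as stagevalue,
       max(stageunit) as stageunit,
       stageid,
       stagelinenumber
FROM datas
GROUP BY batchid, productcode, stageid, stagelinenumber
ORDER BY stagelinenumber;
It should return one row per stagelinenumber, matching the desired output in the question.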

Where should the size values be placed?

I am computing width and height values with the help of MediaQuery to get a responsive design. Where should I put these values? In core/constants/? Is there an example project or documentation I can look at to figure out this kind of thing?
├───core
│   ├───constants
│   │   ├───app
│   │   ├───color
│   │   └───textstyle
│   ├───extension
│   └───init
│       └───translations
├───product
│   ├───error
│   ├───navigator
│   │   └───guard
│   └───widget
│       ├───appbar
│       ├───button
│       └───textfield
├───providers
└───view
    ├───authenticate
    │   ├───login
    │   │   ├───model
    │   │   ├───service
    │   │   ├───view
    │   │   └───viewmodel
    │   ├───onboard
    │   │   ├───model
    │   │   ├───view
    │   │   └───widget
    │   ├───register
    │   │   ├───model
    │   │   ├───service
    │   │   └───view
    │   └───reset_password_view.dart
    │       └───view
    ├───home
    │   ├───home
    │   │   └───view
    │   ├───menu
    │   │   └───view
    │   ├───models
    │   ├───more
    │   │   └───view
    │   ├───offers
    │   │   └───view
    │   └───profile
    │       └───view
    ├───welcome
    │   └───view
    └───_product
        └───_widgets
            ├───card
            ├───listtile
            └───safearea
extension MediaQueryExtension on BuildContext {
  // Fractions of the screen size, e.g. dynamicWidth(0.5) is half the screen width.
  double dynamicWidth(double val) => MediaQuery.of(this).size.width * val;
  double dynamicHeight(double val) => MediaQuery.of(this).size.height * val;
}
There is no written rule about the question you ask.
It usually depends on developer/company preferences.
But since the extension can be used project-wide, the most reasonable place for it would be core/extensions.
I would not place it inside constants, since these are not constant values.

Combine columns of text type into a jsonb column in PostgreSQL

I have a table with the below structure in Postgres, where id is the primary key.
┌──────────────────────────────────┬────────┬───────────┬──────────┬─────────┬──────────┬──────────────┬─────────────┐
│ Column                           │ Type   │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description │
├──────────────────────────────────┼────────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────┤
│ id                               │ bigint │           │          │         │ plain    │              │             │
│ requested_external_total_taxable │ bigint │           │          │         │ plain    │              │             │
│ requested_external_total_tax     │ bigint │           │          │         │ plain    │              │             │
│ store_address.country            │ text   │           │          │         │ extended │              │             │
│ store_address.city               │ text   │           │          │         │ extended │              │             │
│ store_address.postal_code        │ text   │           │          │         │ extended │              │             │
└──────────────────────────────────┴────────┴───────────┴──────────┴─────────┴──────────┴──────────────┴─────────────┘
I want to convert the store_address fields to a jsonb column.
┌──────────────────────────────────┬────────┬───────────┬──────────┬─────────┬──────────┬──────────────┬─────────────┐
│ Column                           │ Type   │ Collation │ Nullable │ Default │ Storage  │ Stats target │ Description │
├──────────────────────────────────┼────────┼───────────┼──────────┼─────────┼──────────┼──────────────┼─────────────┤
│ id                               │ bigint │           │          │         │ plain    │              │             │
│ requested_external_total_taxable │ bigint │           │          │         │ plain    │              │             │
│ requested_external_total_tax     │ bigint │           │          │         │ plain    │              │             │
│ store_address                    │ jsonb  │           │          │         │ extended │              │             │
└──────────────────────────────────┴────────┴───────────┴──────────┴─────────┴──────────┴──────────────┴─────────────┘
Is there an efficient way of doing this?
You will need to add a new column, UPDATE the table to populate the new jsonb column, and after that drop the old columns:
alter table the_table
    add store_address jsonb;

update the_table
   set store_address = jsonb_build_object('country', "store_address.country",
                                          'city', "store_address.city",
                                          'postal_code', "store_address.postal_code");

alter table the_table
    drop "store_address.country",
    drop "store_address.city",
    drop "store_address.postal_code";

Apache Druid: Issue while updating the data in a datasource

I am currently using the druid-incubating-0.16.0 version. As mentioned in the tutorial at https://druid.apache.org/docs/latest/tutorials/tutorial-update-data.html, we can use the combining firehose to update and merge the data for a datasource.
Step 1:
I am using the same sample data, with this initial structure:
┌──────────────────────────┬──────────┬───────┬────────┐
│ __time                   │ animal   │ count │ number │
├──────────────────────────┼──────────┼───────┼────────┤
│ 2018-01-01T01:01:00.000Z │ tiger    │ 1     │ 100    │
│ 2018-01-01T03:01:00.000Z │ aardvark │ 1     │ 42     │
│ 2018-01-01T03:01:00.000Z │ giraffe  │ 1     │ 14124  │
└──────────────────────────┴──────────┴───────┴────────┘
Step 2:
I updated the data for tiger with {"timestamp":"2018-01-01T01:01:35Z","animal":"tiger", "number":30}, with appendToExisting = false and rollup = true, and got this result:
┌──────────────────────────┬──────────┬───────┬────────┐
│ __time                   │ animal   │ count │ number │
├──────────────────────────┼──────────┼───────┼────────┤
│ 2018-01-01T01:01:00.000Z │ tiger    │ 2     │ 130    │
│ 2018-01-01T03:01:00.000Z │ aardvark │ 1     │ 42     │
│ 2018-01-01T03:01:00.000Z │ giraffe  │ 1     │ 14124  │
└──────────────────────────┴──────────┴───────┴────────┘
Step 3:
Now I am updating giraffe with {"timestamp":"2018-01-01T03:01:35Z","animal":"giraffe", "number":30}, with appendToExisting = false and rollup = true, and getting the following result:
┌──────────────────────────┬──────────┬───────┬────────┐
│ __time                   │ animal   │ count │ number │
├──────────────────────────┼──────────┼───────┼────────┤
│ 2018-01-01T01:01:00.000Z │ tiger    │ 1     │ 130    │
│ 2018-01-01T03:01:00.000Z │ aardvark │ 1     │ 42     │
│ 2018-01-01T03:01:00.000Z │ giraffe  │ 2     │ 14154  │
└──────────────────────────┴──────────┴───────┴────────┘
My question is: in step 3, the count for tiger decreases by 1, but I think it should not change, since step 3 makes no changes to tiger and its number stays the same as well.
FYI, count and number are in the metricsSpec; they are count and longSum aggregators respectively.
Please clarify.
When using the ingestSegment firehose with initial data like
┌──────────────────────────┬──────────┬───────┬────────┐
│ __time                   │ animal   │ count │ number │
├──────────────────────────┼──────────┼───────┼────────┤
│ 2018-01-01T00:00:00.000Z │ aardvark │ 1     │ 9999   │
│ 2018-01-01T00:00:00.000Z │ bear     │ 1     │ 111    │
│ 2018-01-01T00:00:00.000Z │ lion     │ 2     │ 200    │
└──────────────────────────┴──────────┴───────┴────────┘
and then adding new data {"timestamp":"2018-01-01T03:01:35Z","animal":"giraffe", "number":30} with appendToExisting = true, I am getting
┌──────────────────────────┬──────────┬───────┬────────┐
│ __time                   │ animal   │ count │ number │
├──────────────────────────┼──────────┼───────┼────────┤
│ 2018-01-01T00:00:00.000Z │ aardvark │ 1     │ 9999   │
│ 2018-01-01T00:00:00.000Z │ bear     │ 1     │ 111    │
│ 2018-01-01T00:00:00.000Z │ lion     │ 2     │ 200    │
│ 2018-01-01T00:00:00.000Z │ aardvark │ 1     │ 9999   │
│ 2018-01-01T00:00:00.000Z │ bear     │ 1     │ 111    │
│ 2018-01-01T00:00:00.000Z │ giraffe  │ 1     │ 30     │
│ 2018-01-01T00:00:00.000Z │ lion     │ 1     │ 200    │
└──────────────────────────┴──────────┴───────┴────────┘
Is this the correct and expected output? Why didn't the rollup happen?
Druid actually has only two modes: overwrite or append.
With appendToExisting = true, your data is appended to the existing data, which causes the "number" field to increase (and the count as well).
With appendToExisting = false, all your data in the segment is overwritten. I think this is what is happening.
This is different from "normal" databases, where you can update specific rows.
In Druid you can update only certain rows, but this is done by re-indexing your data. It is not a very easy process.
This re-indexing is done with an ingestSegment firehose, which reads your data from a segment and then writes it to a segment again (which can be the same one). During this process you can apply a transform filter, which performs a specific action, like updating certain field values.
We have built a PHP library to make these processes easier to work with. See this example of how to re-index a segment and apply a transformation during the re-indexing:
https://github.com/level23/druid-client#reindex

Find clusters of values using PostgreSQL

Consider the following example table:
CREATE TABLE rndtbl AS
SELECT
generate_series(1, 10) AS id,
random() AS val;
and I want to find, for each id, a cluster_id such that the clusters are at least 0.1 apart from each other. How would I calculate such a cluster assignment?
A specific example would be:
select * from rndtbl ;
 id |        val
----+-------------------
  1 | 0.485714662820101
  2 | 0.185201027430594
  3 | 0.368477711919695
  4 | 0.687312887981534
  5 | 0.978742253035307
  6 | 0.961830694694072
  7 |  0.10397826647386
  8 | 0.644958863966167
  9 | 0.912827260326594
 10 | 0.196085536852479
(10 rows)
The result would be: ids (2,7,10) in one cluster, (5,6,9) in another, (4,8) in another, and (1) and (3) as singleton clusters.
Starting from this data:
SELECT * FROM rndtbl;
┌────┬────────────────────┐
│ id │ val                │
├────┼────────────────────┤
│ 1  │ 0.153776332736015  │
│ 2  │ 0.572575284633785  │
│ 3  │ 0.998213059268892  │
│ 4  │ 0.654628816060722  │
│ 5  │ 0.692200613208115  │
│ 6  │ 0.572836415842175  │
│ 7  │ 0.0788379465229809 │
│ 8  │ 0.390280921943486  │
│ 9  │ 0.611408909317106  │
│ 10 │ 0.555164183024317  │
└────┴────────────────────┘
(10 rows)
Use the LAG window function to know whether the current row is in a new cluster or not:
SELECT *, val - LAG(val) OVER (ORDER BY val) > 0.1 AS new_cluster
FROM rndtbl ;
┌────┬────────────────────┬─────────────┐
│ id │ val                │ new_cluster │
├────┼────────────────────┼─────────────┤
│ 7  │ 0.0788379465229809 │ (null)      │
│ 1  │ 0.153776332736015  │ f           │
│ 8  │ 0.390280921943486  │ t           │
│ 10 │ 0.555164183024317  │ t           │
│ 2  │ 0.572575284633785  │ f           │
│ 6  │ 0.572836415842175  │ f           │
│ 9  │ 0.611408909317106  │ f           │
│ 4  │ 0.654628816060722  │ f           │
│ 5  │ 0.692200613208115  │ f           │
│ 3  │ 0.998213059268892  │ t           │
└────┴────────────────────┴─────────────┘
Finally, you can SUM the number of true values (still ordering by val) to get each row's cluster number (counting from 0):
SELECT *, SUM(COALESCE(new_cluster::int, 0)) OVER (ORDER BY val) AS nb_cluster
FROM (
SELECT *, val - LAG(val) OVER (ORDER BY val) > 0.1 AS new_cluster
FROM rndtbl
) t
;
┌────┬────────────────────┬─────────────┬────────────┐
│ id │ val                │ new_cluster │ nb_cluster │
├────┼────────────────────┼─────────────┼────────────┤
│ 7  │ 0.0788379465229809 │ (null)      │ 0          │
│ 1  │ 0.153776332736015  │ f           │ 0          │
│ 8  │ 0.390280921943486  │ t           │ 1          │
│ 10 │ 0.555164183024317  │ t           │ 2          │
│ 2  │ 0.572575284633785  │ f           │ 2          │
│ 6  │ 0.572836415842175  │ f           │ 2          │
│ 9  │ 0.611408909317106  │ f           │ 2          │
│ 4  │ 0.654628816060722  │ f           │ 2          │
│ 5  │ 0.692200613208115  │ f           │ 2          │
│ 3  │ 0.998213059268892  │ t           │ 3          │
└────┴────────────────────┴─────────────┴────────────┘
(10 rows)
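To get the cluster assignment the question asks for, one id and cluster_id per row, the same logic can simply be wrapped once more and ordered by id (nb_cluster renamed to cluster_id):
SELECT id,
       SUM(COALESCE(new_cluster::int, 0)) OVER (ORDER BY val) AS cluster_id
FROM (
    SELECT id, val,
           val - LAG(val) OVER (ORDER BY val) > 0.1 AS new_cluster
    FROM rndtbl
) t
ORDER BY id;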