Filter postgresql query where value is not a substring of any other value in the same column

I have a table with a column called values:
 values | other_columns...
--------+-----------------
 f      | ...
 foo    | ...
 fo     | ...
 bar    | ...
 ba     | ...
 baz    | ...
 foobar | ...
When querying this table, I want to filter the results so that the only remaining rows are those whose value is not a substring of any other value in the column:
 prime_values | other_result_columns...
--------------+------------------------
 baz          | ...
 foobar       | ...
How can I do this?

With NOT EXISTS:
select t.*
from tablename t
where not exists (
  select 1
  from tablename
  where "values" <> t."values"
    and "values" like concat('%', t."values", '%')
)
See the demo.
Results:
| values |
| :----- |
| baz    |
| foobar |
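For reference, a minimal setup to try this locally (assuming a table named tablename as in the query above; the column is quoted because values is a reserved word in PostgreSQL):

create table tablename ("values" text);

insert into tablename ("values") values
  ('f'), ('foo'), ('fo'), ('bar'), ('ba'), ('baz'), ('foobar');

-- the NOT EXISTS query above then returns only the 'baz' and 'foobar' rows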

Related

POSTGRESQL Subquery with an Order by Function does not return values

I have the following code that does not return values to the outer select because of my ORDER BY, which uses a function:
Select sub.enrollmentseqnumemp, sub.membercodedep, a.Subscriber_ID
From elan_staging.check_register_1 a
Left join (
    Select enrollmentseqnumemp, membercodedep
    from elan.elig
    ORDER BY public.idx(array['e','s','1','2','3','4','5'], membercodedep)
    Limit 1
) sub
On sub.enrollmentseqnumemp = a.Subscriber_ID
| enrollmentseqnumemp | membercodedep | Subscriber_ID |
|:----------------------|:----------------|:----------------|
| [null] | [null] | "462852" |
| [null] | [null] | "462852" |
| [null] | [null] | "407742" |
If I run it without the Order By function, it works correctly
Select sub.enrollmentseqnumemp, sub.membercodedep, a.Subscriber_ID
From elan_staging.check_register_1 a
Left join (
    Select enrollmentseqnumemp, membercodedep
    from elan.elig
    ORDER BY 1
) sub
On sub.enrollmentseqnumemp = a.Subscriber_ID
Limit 1
| enrollmentseqnumemp | membercodedep | Subscriber_ID |
|:----------------------|:----------------|:----------------|
| 111111 | e | "462852" |
| 222222 | 3 | "462852" |
| 333333 | s | "407742" |
Code for the function from the Postgres snippets repository:
CREATE FUNCTION idx(anyarray varchar(1) ARRAY[4], anyelement varchar(1))
RETURNS int AS
$$
  SELECT i FROM (
    SELECT generate_series(array_lower($1,1), array_upper($1,1))
  ) g(i)
  WHERE $1[i] = $2
  LIMIT 1;
$$ LANGUAGE sql IMMUTABLE;
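For reference, the function returns the 1-based position of an element in the array, or NULL when the element is not present:

select public.idx(array['e','s','1','2','3','4','5'], 's');  -- 2
select public.idx(array['e','s','1','2','3','4','5'], 'x');  -- NULL (no match)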
Is there a way to fix it so that it returns the values?
The first query, as you have written it, can return non-NULL values from elig only for the single row with the globally smallest value of public.idx(...). If you want the row with the smallest public.idx(...) within each enrollmentseqnumemp, you could use DISTINCT ON, like this:
Select sub.enrollmentseqnumemp, sub.membercodedep, a.Subscriber_ID
From check_register_1 a
Left join (
    Select distinct on (enrollmentseqnumemp)
        enrollmentseqnumemp, membercodedep,
        public.idx(array['e','s','1','2','3','4','5'], membercodedep)
    from elig
    ORDER BY enrollmentseqnumemp, public.idx(array['e','s','1','2','3','4','5'], membercodedep)
) sub
On sub.enrollmentseqnumemp = a.Subscriber_ID;
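For context, DISTINCT ON (expr) keeps only the first row for each value of expr according to the ORDER BY, which is what selects the smallest public.idx(...) per enrollmentseqnumemp above. A minimal sketch with made-up data:

select distinct on (grp) grp, val
from (values ('a', 2), ('a', 1), ('b', 3)) as t(grp, val)
order by grp, val;
-- returns ('a', 1) and ('b', 3)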

How to expand columns into individual timesteps in PostgreSQL

I have a table whose columns represent a time series. The data types are not important, but anything after timestep2 could potentially be NULL.
| id | timestep1 | timestep2 | timestep3 | timestep4 |
|----|-----------|-----------|-----------|-----------|
| a | foo1 | bar1 | baz1 | qux1 |
| b | foo2 | bar2 | baz2 | NULL |
I am attempting to retrieve a view of the data more suitable for modeling. My modeling use case requires that I break each time series (row) into rows representing its individual "states" at each step. That is:
| id | timestep1 | timestep2 | timestep3 | timestep4 |
|----|-----------|-----------|-----------|-----------|
| a | foo1 | NULL | NULL | NULL |
| a | foo1 | bar1 | NULL | NULL |
| a | foo1 | bar1 | baz1 | NULL |
| a | foo1 | bar1 | baz1 | qux1 |
| b | foo2 | NULL | NULL | NULL |
| b | foo2 | bar2 | NULL | NULL |
| b | foo2 | bar2 | baz2 | NULL |
How can I accomplish this in PostgreSQL?
Use UNION.
select id, timestep1, timestep2, timestep3, timestep4
from my_table
union
select id, timestep1, timestep2, timestep3, null
from my_table
union
select id, timestep1, timestep2, null, null
from my_table
union
select id, timestep1, null, null, null
from my_table
order by
id,
timestep2 nulls first,
timestep3 nulls first,
timestep4 nulls first
There is a more compact solution, maybe more convenient when dealing with a greater number of timesteps:
select distinct
id,
timestep1,
case when i > 1 then timestep2 end as timestep2,
case when i > 2 then timestep3 end as timestep3,
case when i > 3 then timestep4 end as timestep4
from my_table
cross join generate_series(1, 4) as i
order by
id,
timestep2 nulls first,
timestep3 nulls first,
timestep4 nulls first
Test it in Db<>fiddle.
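A possible variant of the compact solution, assuming (as in the sample data) that NULLs only occur as a trailing suffix of a row: bound the generated series by the number of non-NULL steps, so no DISTINCT is needed to collapse duplicate rows.

select id,
       timestep1,
       case when i > 1 then timestep2 end as timestep2,
       case when i > 2 then timestep3 end as timestep3,
       case when i > 3 then timestep4 end as timestep4
from my_table
cross join lateral generate_series(
  1,
  1 + (timestep2 is not null)::int
    + (timestep3 is not null)::int
    + (timestep4 is not null)::int
) as i
order by
  id,
  timestep2 nulls first,
  timestep3 nulls first,
  timestep4 nulls first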

postgresql | batch update with insert in single query, 1:n to 1:1

I need to turn a 1:n relationship into a 1:1 relationship with the data remaining the same.
I want to know whether it is possible to achieve this with a single pure SQL statement (no PL/pgSQL, no external language).
Below are more details, an MWE, and some extra context.
To illustrate, if I have
+------+--------+     +------+----------+--------+
| id   | name   |     | id   | foo_id   | name   |
|------+--------|     |------+----------+--------|
| 1    | foo1   |     | 1    | 1        | baz1   |
| 2    | foo2   |     | 2    | 1        | baz2   |
| 3    | foo3   |     | 3    | 2        | baz3   |
+------+--------+     | 4    | 2        | baz4   |
                      | 5    | 3        | baz5   |
                      +------+----------+--------+
I want to get to
+------+--------+     +------+----------+--------+
| id   | name   |     | id   | foo_id   | name   |
|------+--------|     |------+----------+--------|
| 4    | foo1   |     | 1    | 4        | baz1   |
| 5    | foo1   |     | 2    | 5        | baz2   |
| 6    | foo2   |     | 3    | 6        | baz3   |
| 7    | foo2   |     | 4    | 7        | baz4   |
| 8    | foo3   |     | 5    | 8        | baz5   |
+------+--------+     +------+----------+--------+
Here is some code to set up the tables if needed:
drop table if exists baz;
drop table if exists foo;
create table foo(
  id serial primary key,
  name varchar
);
insert into foo (name) values
  ('foo1'),
  ('foo2'),
  ('foo3');
create table baz(
  id serial primary key,
  foo_id integer references foo (id),
  name varchar
);
insert into baz (foo_id, name) values
  (1, 'baz1'),
  (1, 'baz2'),
  (2, 'baz3'),
  (2, 'baz4'),
  (3, 'baz5');
I managed to work out the following query, which updates only one entry (i.e. the pair <baz id, foo id> has to be provided):
with
existing_foo_values as (
  select name from foo where id = 1
),
new_id as (
  insert into foo(name)
  select name from existing_foo_values
  returning id
)
update baz
set foo_id = (select id from new_id)
where id = 1;
The real-case scenario (a DB migration in a Node.js environment) was solved using something similar to:
const existingPairs = await runQuery(`
  select id, foo_id from baz
`);
await Promise.all(existingPairs.map(({ id, foo_id }) => runQuery(`
  with
  existing_foo_values as (
    select name from foo where id = ${foo_id}
  ),
  new_id as (
    insert into foo(name)
    select name from existing_foo_values
    returning id
  )
  update baz
  set foo_id = (select id from new_id)
  where id = ${id};
`)));
// Then delete all the orphan entries from `foo`
Here's a solution that works by first putting together what we want foo to look like (using values from the sequence), and then making the necessary changes to the two tables based on that.
WITH new_ids AS (
  SELECT nextval('foo_id_seq') as foo_id, baz.id as baz_id, foo.name as foo_name
  FROM foo
  JOIN baz ON (foo.id = baz.foo_id)
),
inserts AS (
  INSERT INTO foo (id, name)
  SELECT foo_id, foo_name
  FROM new_ids
),
updates AS (
  UPDATE baz
  SET foo_id = new_ids.foo_id
  FROM new_ids
  WHERE new_ids.baz_id = baz.id
)
DELETE FROM foo
WHERE id < (SELECT min(foo_id) FROM new_ids);
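A variation on the final DELETE, in case you would rather not rely on every pre-existing foo id being smaller than the newly generated ones: carry the old foo id through the CTE and delete exactly those rows (foo rows that never had a baz child would then need separate cleanup).

WITH new_ids AS (
  SELECT nextval('foo_id_seq') AS foo_id,
         foo.id   AS old_foo_id,
         baz.id   AS baz_id,
         foo.name AS foo_name
  FROM foo
  JOIN baz ON (foo.id = baz.foo_id)
),
inserts AS (
  INSERT INTO foo (id, name)
  SELECT foo_id, foo_name
  FROM new_ids
),
updates AS (
  UPDATE baz
  SET foo_id = new_ids.foo_id
  FROM new_ids
  WHERE new_ids.baz_id = baz.id
)
DELETE FROM foo
WHERE id IN (SELECT old_foo_id FROM new_ids);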

TSQL Populate Boolean values in Table A based on whether specific records exist in Table B

I have two tables:
| ID | HasA | HasB | HasC | Foo | Bar |
|----|------|------|------|-----|-----|
| 12 | | | | X | Y |
| 43 | | | | Y | X |
| ID | Type |
|----|------|
| 12 | A |
| 43 | B |
| 12 | C |
| 43 | A |
I want to populate the results into Table A so that it looks like:
| ID | HasA | HasB | HasC | Foo | Bar |
|----|------|------|------|-----|-----|
| 12 | true | | true | X | Y |
| 43 | true | true | | Y | X |
How?
Breaking it up into two parts:
First, derive the "HasX" flags using a CASE WHEN. This is the inner subqueryA SELECT statement.
Second, join it to the FooBar table, SUM the "HasX" columns to consolidate them per ID, and use another CASE WHEN to return 'true'/'false'. This is the outer SELECT statement.
SELECT subqueryA.ID
     , CASE WHEN sum(subqueryA.HasA) > 0 THEN 'true' ELSE 'false' END as HasA
     , CASE WHEN sum(subqueryA.HasB) > 0 THEN 'true' ELSE 'false' END as HasB
     , CASE WHEN sum(subqueryA.HasC) > 0 THEN 'true' ELSE 'false' END as HasC
     , foobar.foo
     , foobar.bar
FROM [dbo].[ABC_FooBar] foobar
LEFT JOIN (
    SELECT a.ID as ID
         , CASE ABCType WHEN 'A' THEN 1 ELSE 0 END as HasA
         , CASE ABCType WHEN 'B' THEN 1 ELSE 0 END as HasB
         , CASE ABCType WHEN 'C' THEN 1 ELSE 0 END as HasC
    FROM [dbo].[ABC] a
) subqueryA on foobar.ID = subqueryA.ID
GROUP BY subqueryA.ID, foobar.foo, foobar.bar
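For comparison, the same result can usually be obtained in one pass with conditional aggregation, without the subquery (a sketch using the same table and column names as the query above):

SELECT foobar.ID
     , CASE WHEN SUM(CASE a.ABCType WHEN 'A' THEN 1 ELSE 0 END) > 0 THEN 'true' ELSE 'false' END AS HasA
     , CASE WHEN SUM(CASE a.ABCType WHEN 'B' THEN 1 ELSE 0 END) > 0 THEN 'true' ELSE 'false' END AS HasB
     , CASE WHEN SUM(CASE a.ABCType WHEN 'C' THEN 1 ELSE 0 END) > 0 THEN 'true' ELSE 'false' END AS HasC
     , foobar.foo
     , foobar.bar
FROM [dbo].[ABC_FooBar] foobar
LEFT JOIN [dbo].[ABC] a ON a.ID = foobar.ID
GROUP BY foobar.ID, foobar.foo, foobar.bar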

Postgres 10 lateral unnest missing null values

I have a Postgres table where the content of a text column is delimited with '|'.
ID | ... | my_column
-----------------------
1 | ... | text|concatenated|as|such
2 | ... | NULL
3 | ... | NULL
I tried to unnest(string_to_array()) this column into separate rows, which works fine, except that my NULL values (>90% of all entries) are excluded. I have tried several approaches:
SELECT * from "my_table", lateral unnest(CASE WHEN "this_column" is NULL
THEN NULL else string_to_array("this_column", '|') END);
or, as suggested here: PostgreSQL unnest with empty array.
What I get:
ID | ... | my_column
-----------------------
1 | ... | text
1 | ... | concatenated
1 | ... | as
1 | ... | such
But this is what I need:
ID | ... | my_column
-----------------------
1 | ... | text
1 | ... | concatenated
1 | ... | as
1 | ... | such
2 | ... | NULL
3 | ... | NULL
Use a LEFT JOIN instead:
SELECT m.id, t.*
from my_table m
left join lateral unnest(string_to_array(my_column, '|')) as t(w) on true;
There is no need for the CASE expression to handle NULL values: string_to_array handles them correctly, and the LEFT JOIN ... ON true keeps the rows for which unnest produces no output.
Online example: http://rextester.com/XIGXP80374
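In case the rextester link goes away, a minimal sketch to reproduce locally (table and data shaped like the question):

create table my_table (id int, my_column text);

insert into my_table values
  (1, 'text|concatenated|as|such'),
  (2, null),
  (3, null);

select m.id, t.w as my_column
from my_table m
left join lateral unnest(string_to_array(m.my_column, '|')) as t(w) on true
order by m.id;
-- ids 2 and 3 come back with a NULL my_column, as desired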