How to select all rows that satisfy a condition, including rows where the column is NULL? - postgresql

I have this table
CREATE TABLE fruits
(
id SERIAL,
name VARCHAR
);
with these entries
INSERT INTO fruits(name)
VALUES('Orange');
INSERT INTO fruits(name)
VALUES('Ananas');
INSERT INTO fruits(name)
VALUES(null);
When I try to select all rows whose name is not equal to 'Ananas' by querying
select *
from fruits
where name <> 'Ananas'
I get these rows:
id name
-----------
1 Orange
What I would have expected was this
id name
-----------
1 Orange
3 null
How do I ensure that all rows that fulfill the condition get selected?
Example in dbfiddle:
https://dbfiddle.uk/?rdbms=postgres_11&fiddle=a963d39df0466701b0a96b20db8461e6

Any "normal" comparison with null yields "unknown", which is treated as false in the context of the WHERE clause.
You need to use the null-safe operator IS DISTINCT FROM:
select *
from fruits
where name is distinct from 'Ananas';
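Alternatively, you could convert NULL values to something different (this only works if the replacement value, here the empty string, is never the value you compare against):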
select *
from fruits
where coalesce(name, '') <> 'Ananas';
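The difference between the filters can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption, so that it runs without a server), and the helper name `ids` is made up for illustration. The NULL handling of `<>` and `coalesce` shown here follows standard SQL three-valued logic, and the explicit `IS NULL OR ...` form is a portable spelling of what `IS DISTINCT FROM` does for this case.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (assumption: chosen so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO fruits (name) VALUES (?)",
    [("Orange",), ("Ananas",), (None,)],
)


def ids(where_clause):
    """Return the ids of the rows kept by the given WHERE clause (hypothetical helper)."""
    sql = "SELECT id FROM fruits WHERE " + where_clause + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


# NULL <> 'Ananas' evaluates to unknown, so the NULL row is filtered out.
plain = ids("name <> 'Ananas'")
# Spelling out the NULL case keeps the row.
explicit = ids("name IS NULL OR name <> 'Ananas'")
# Replacing NULL with '' before comparing keeps it as well.
coalesced = ids("coalesce(name, '') <> 'Ananas'")

print(plain, explicit, coalesced)
```

Only the first filter loses the row whose name is NULL; the other two return it alongside 'Orange'.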

Related

Does String Value Exists in a List of Strings | Redshift Query

I have some interesting data that I'm trying to query, but I cannot get the syntax correct. I have a temporary table (temp_id), which I've filled with the id values I care about. In this example it is only two ids.
CREATE TEMPORARY TABLE temp_id (id bigint PRIMARY KEY);
INSERT INTO temp_id (id) VALUES ( 1 ), ( 2 );
I have another table in production (let's call it foo) which holds multiple ids in a single cell. The ids column looks like this (below), with the ids stored as a single string separated by "|":
ids
-----------
1|9|3|4|5
6|5|6|9|7
NULL
2|5|6|9|7
9|11|12|99
I want to evaluate each cell in foo.ids and see if any of its ids match the ones in my temp_id table.
Expected output
ids |does_match
-----------------------
1|9|3|4|5 |true
6|5|6|9|7 |false
NULL |false
2|5|6|9|7 |true
9|11|12|99 |false
So far I've come up with this, but I can't seem to return anything. Instead of trying to create a new column does_match, I tried to filter within the WHERE clause. However, the issue is that I cannot figure out how to compare all the id values in my temp table against the string blob full of ids in foo.
SELECT
ids,
FROM foo
WHERE ids = ANY(SELECT LISTAGG(id, ' | ') FROM temp_ids)
Any suggestions would be helpful.
Cheers,
This would work, though I'm not sure about performance:
SELECT
ids
FROM foo
JOIN temp_id
ON '|'||foo.ids||'|' LIKE '%|'||temp_id.id::varchar||'|%'
You wrap the ids list in a pair of additional separators, so you can always search for |id|, including for the first and the last number in the list.
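To see why the extra separators matter, here is a small sketch of the same LIKE pattern that produces the does_match column from the question. It assumes Python's built-in sqlite3 module as a stand-in for Redshift (so `CAST(... AS TEXT)` replaces `::varchar`), and it uses EXISTS instead of a JOIN so that every row of foo appears exactly once; both choices are illustrative assumptions, not part of the answer above.

```python
import sqlite3

# SQLite stands in for Redshift here (assumption: chosen so the sketch is runnable locally).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_id (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE foo (ids TEXT)")
conn.executemany("INSERT INTO temp_id (id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO foo (ids) VALUES (?)",
    [("1|9|3|4|5",), ("6|5|6|9|7",), (None,), ("2|5|6|9|7",), ("9|11|12|99",)],
)

# Wrapping both sides in '|' means id 1 matches '|1|' but not '|11|' or '|12|'.
query = """
SELECT foo.ids,
       EXISTS (
           SELECT 1
           FROM temp_id
           WHERE '|' || foo.ids || '|' LIKE '%|' || CAST(temp_id.id AS TEXT) || '|%'
       ) AS does_match
FROM foo
ORDER BY foo.rowid
"""
result = [(ids, bool(flag)) for ids, flag in conn.execute(query)]

for ids, flag in result:
    print(ids, flag)
```

The row 9|11|12|99 is the interesting one: a plain substring search for 1 or 2 would match it, while the delimiter-wrapped pattern does not. The NULL row yields false because concatenating with NULL gives NULL, which never satisfies the LIKE.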
The following SQL (I know it's a bit of a hack) returns exactly the output you expect. I tested it with your sample data, but I don't know how it would behave on your real data, so try it and let me know.
with seq AS ( -- create a sequence CTE to emulate postgres' unnest
select 1 as i union all -- assuming you have at most 10 ids in the ids field;
-- feel free to modify this part
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10)
select distinct ids,
case -- since I can't do a max on a boolean field, I used two cases
-- for 1s and 0s and converted them to boolean
when max(case
when t.id in (
select split_part(ids,'|',seq.i) as tt
from seq
join foo f on seq.i <= REGEXP_COUNT(ids, '|') + 1
where tt != '' and foo.ids = f.ids)
then 1
else 0
end) = 1
then true
else false
end as does_match
from temp_id t, foo
group by 1
Please let me know if this works for you!

Postgresql: Get records having similar column values

Table A
id name keywords
1 Obj1 a,b,c,austin black
2 Obj2 e,f,austin black,h
3 Obj3 k,l,m,n
4 Obj4 austin black,t,u,s
5 Obj5 z,r,q,w
I need to get the records that share keywords with other records. Hence the result for the table needs to be:
Records:
1,2,4
Records 1, 2 and 4 qualify because each of them has at least one keyword that also appears in another record.
You can convert the "csv" to an array and then use Postgres' array functions:
select *
from the_table t1
where exists (select *
from the_table t2
where string_to_array(t1.keywords, ',') && string_to_array(t2.keywords, ',')
and t1.id <> t2.id);
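The logic behind the overlap test can be sketched outside the database. The snippet below is a plain-Python illustration (not PostgreSQL code) of what `string_to_array(...) && string_to_array(...)` checks: split each keyword string on commas and keep a row if its keyword set intersects the set of any other row. The function name `overlapping_ids` is made up for this example.

```python
# Sample rows from the question: id -> comma-separated keywords.
rows = {
    1: "a,b,c,austin black",
    2: "e,f,austin black,h",
    3: "k,l,m,n",
    4: "austin black,t,u,s",
    5: "z,r,q,w",
}


def overlapping_ids(rows):
    """Return ids whose keywords share at least one element with another row."""
    keyword_sets = {rid: set(kw.split(",")) for rid, kw in rows.items()}
    return sorted(
        rid
        for rid, kws in keyword_sets.items()
        # Mirrors "t1.id <> t2.id": a row must not be compared with itself.
        if any(kws & other for oid, other in keyword_sets.items() if oid != rid)
    )


matches = overlapping_ids(rows)
print(matches)
```

Like the SQL version, this compares whole keywords, so "austin black" only matches another "austin black" and not a keyword that merely contains it.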

How can I SUM distinct records in a Postgres database where there are duplicate records?

Imagine a table that looks like this:
The SQL to get this data was just SELECT *
The first column is "row_id" the second is "id" - which is the order ID and the third is "total" - which is the revenue.
I'm not sure why there are duplicate rows in the database, but when I do a SUM(total), it includes the duplicate entries even though the order ID is the same. As a result, my numbers are larger than if I select distinct(id), total, export to Excel, and then sum the values manually.
So my question is: how can I SUM on just the distinct order IDs so that I get the same revenue as if I exported every distinct order ID row to Excel?
Thanks in advance!
Easy - just divide by the count:
select id, sum(total) / count(id)
from orders
group by id
See live demo.
Also handles any level of duplication, eg triplicates etc.
You can try something like this (with your example):
Table
create table test (
row_id int,
id int,
total decimal(15,2)
);
insert into test values
(6395, 1509, 112), (22986, 1509, 112),
(1393, 3284, 40.37), (24360, 3284, 40.37);
Query
with distinct_records as (
select distinct id, total from test
)
select a.id, b.actual_total, array_agg(a.row_id) as row_ids
from test a
inner join (select id, sum(total) as actual_total from distinct_records group by id) b
on a.id = b.id
group by a.id, b.actual_total
Result
| id | actual_total | row_ids |
|------|--------------|------------|
| 1509 | 112 | 6395,22986 |
| 3284 | 40.37 | 1393,24360 |
Explanation
We do not know why orders and totals appear more than once with different row_id values. So, using a common table expression (CTE) introduced with the with ... clause, we get the distinct id and total pairs.
Under the CTE, we use this distinct data to do totaling. We join ID in the original table with the aggregation over distinct values. Then we comma-separate row_ids so that the information looks cleaner.
SQLFiddle example
http://sqlfiddle.com/#!15/72639/3
Create custom aggregate:
CREATE OR REPLACE FUNCTION sum_func (
double precision, pg_catalog.anyelement, double precision
)
RETURNS double precision AS
$body$
SELECT case when $3 is not null then COALESCE($1, 0) + $3 else $1 end
$body$
LANGUAGE 'sql';
CREATE AGGREGATE dist_sum (
pg_catalog."any",
double precision)
(
SFUNC = sum_func,
STYPE = float8
);
And then calc distinct sum like:
select dist_sum(distinct id, total)
from orders
SQLFiddle
You can use DISTINCT in your aggregate functions:
SELECT id, SUM(DISTINCT total) FROM orders GROUP BY id
Documentation here: https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SYNTAX-AGGREGATES
If we can trust that the total for one order is actually contained in a single row, we could eliminate the duplicates in a sub-query by selecting the MAX of the PK id column. An example:
CREATE TABLE test2 (id int, order_id int, total int);
insert into test2 values (1,1,50);
insert into test2 values (2,1,50);
insert into test2 values (5,1,50);
insert into test2 values (3,2,100);
insert into test2 values (4,2,100);
select order_id, sum(total)
from test2 t
join (
select max(id) as id
from test2
group by order_id) as sq
on t.id = sq.id
group by order_id
sql fiddle
In difficult cases:
select
id,
(
SELECT SUM(value::numeric)
FROM jsonb_each_text(jsonb_object_agg(row_id, total))
) as total
from orders
group by id
I would suggest just using a subquery:
SELECT "a"."id", SUM("a"."total")
FROM (SELECT DISTINCT ON ("id") * FROM "Database"."Schema"."Table") AS "a"
GROUP BY "a"."id"
The above will give you the total for each id.
Use the query below if you want the overall total with the duplicates removed:
SELECT SUM("a"."total")
FROM (SELECT DISTINCT ON ("id") * FROM "Database"."Schema"."Table") AS "a"
Using subselect (http://sqlfiddle.com/#!7/cef1c/51):
select sum(total) from (
select distinct id, total
from orders
) as distinct_orders
Using CTE (http://sqlfiddle.com/#!7/cef1c/53):
with distinct_records as (
select distinct id, total from orders
)
select sum(total) from distinct_records;
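As a quick check of the effect of de-duplicating before summing, here is a sketch using Python's built-in sqlite3 module (the linked fiddles also appear to run on SQLite; using it here is still an assumption made for convenience). The rows are the sample values from the earlier answer's test table, loaded into a table named orders; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (row_id INTEGER, id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (6395, 1509, 112),
        (22986, 1509, 112),
        (1393, 3284, 40.37),
        (24360, 3284, 40.37),
    ],
)

# Summing the raw table counts every duplicated order twice.
naive_total = conn.execute("SELECT sum(total) FROM orders").fetchone()[0]

# Collapsing to distinct (id, total) pairs first counts each order once.
deduped_total = conn.execute(
    "SELECT sum(total) FROM (SELECT DISTINCT id, total FROM orders) AS distinct_orders"
).fetchone()[0]

print(naive_total, deduped_total)
```

With every order duplicated exactly once, the naive sum is twice the de-duplicated one. Note that this approach relies on the duplicates carrying identical totals; two rows for the same id with different totals would both survive the DISTINCT.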

Exclude rows that return NULL for a column when using a Case statement

SELECT ir.objectid,ir.objecttype,ir.name,ir.email,ir.createdate,
CASE objecttype
WHEN 1 THEN (select friendlyurl
from locations
where id = ir.objectid)
END as objecturl
FROM inforequests ir
WHERE createdate > '1/1/2014'
order by CreateDate asc
This query returns 10 rows for me, but 1 row shows NULL for column objecturl, which happens if no record is found in the [locations] table.
How can I alter my query so that rows where objecturl IS NULL are not returned? In my case the query would then return only 9 rows.
Add it to the WHERE clause:
where createdate > '1/1/2014' and objecttype = 1
Since your CASE does not handle any other values, it will result in a NULL when objecttype <> 1.
Alternatively, you could nest SELECTs:
select *
from ( SELECT ir.objectid,ir.objecttype,ir.name,ir.email,ir.createdate,
CASE objecttype
WHEN 1 THEN (select friendlyurl
from locations
where id = ir.objectid)
END as objecturl
FROM inforequests ir
WHERE createdate > '1/1/2014' ) as Temp
where objecturl is not NULL
order by CreateDate asc
Note that this is somewhat different as it will also exclude rows for which the correlated subquery returns NULL.
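The difference between the two approaches shows up with a few rows. The sketch below uses Python's built-in sqlite3 module as a stand-in database and a cut-down schema; the sample rows, the trimmed column list and the variable names are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced versions of the tables from the question.
conn.executescript("""
CREATE TABLE locations (id INTEGER PRIMARY KEY, friendlyurl TEXT);
CREATE TABLE inforequests (objectid INTEGER, objecttype INTEGER, name TEXT);
INSERT INTO locations VALUES (10, '/downtown');
INSERT INTO inforequests VALUES (10, 1, 'has location');
INSERT INTO inforequests VALUES (11, 1, 'missing location');
INSERT INTO inforequests VALUES (10, 2, 'other type');
""")

base = """
SELECT ir.name,
       CASE ir.objecttype
           WHEN 1 THEN (SELECT friendlyurl FROM locations WHERE id = ir.objectid)
       END AS objecturl
FROM inforequests ir
"""

# Filtering on objecttype keeps type-1 rows even when no location was found.
by_type = conn.execute(base + " WHERE ir.objecttype = 1 ORDER BY ir.name").fetchall()

# Filtering the nested result on objecturl drops every row whose URL is NULL.
by_url = conn.execute(
    "SELECT * FROM (" + base + ") AS t WHERE objecturl IS NOT NULL ORDER BY name"
).fetchall()

print(by_type)
print(by_url)
```

In this sample, the objecttype filter still returns the 'missing location' row with a NULL objecturl, while the nested form returns only the row that actually has a URL.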

Postgresql. select SUM value from arrays

Condition:
There are two tables with arrays.
Note that food.id and price.food_id are array columns.
CREATE TABLE food (
id integer[] NOT NULL,
name character varying(255)
);
INSERT INTO food VALUES ('{1}', 'Apple');
INSERT INTO food VALUES ('{1,1}', 'Orange');
INSERT INTO food VALUES ('{1,2}', 'banana');
and
CREATE TABLE price (
id bigint NOT NULL,
food_id integer[],
value double precision DEFAULT 0
);
INSERT INTO price VALUES (44, '{1}', 500);
INSERT INTO price VALUES (55, '{1,1}', 100);
INSERT INTO price VALUES (66, '{1,2}', 200);
I need to get the sum of value over all the products from table food.
Please help me write the SQL query.
Expected result:
{1} - Apple - 800 (500+100+200)
What about this:
select
name,
sum(value)
from
(select unnest(id) as food_id, name from food) food_cte
join (select distinct id, unnest(food_id) as food_id, value from price) price_cte using (food_id)
group by
name
It is difficult to understand your question, but this query at least returns 800 for Apple.
Try the following command:
SELECT F.ID, F.NAME, SUM(P.VALUE) FROM FOOD F, PRICE P WHERE F.ID = P.FOOD_ID GROUP BY F.ID, F.NAME;