Selecting records in SQL Server

I have a table with two columns and this sample data:
Column1 Column2
------------------------
A B
A C
A D
R B
R D
S E
If I pass the input value of Column2='B', it will display these records:
Column1 Column2
-------------------------
A B
A C
A D
R B
R D
because Column2 'B' appears with 'A' and with 'R' in Column1, so it fetches
all the records that contain 'A' or 'R' in Column1.
If I pass the input Column2='C', it will display these records:
Column1 Column2
-------------------------
A B
A C
A D
because Column2 'C' appears with 'A' in Column1, so it fetches all the records that contain 'A' in Column1.
The original table contains more than 100k records, and the input for Column2 can be any value.
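No answer is attached to this question here. One way to express the requirement is a subquery that collects the Column1 values paired with the input, then returns every row sharing one of those values. The sketch below runs the query through Python's built-in sqlite3 module so that it is self-contained; the table name t and the helper rows_for are placeholders invented for illustration. The SELECT itself is plain SQL and should carry over to SQL Server with that driver's parameter syntax.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("A", "D"), ("R", "B"), ("R", "D"), ("S", "E")],
)

# Inner query: Column1 values that occur together with the input.
# Outer query: every row whose Column1 is one of those values.
QUERY = """
SELECT Column1, Column2
FROM t
WHERE Column1 IN (SELECT Column1 FROM t WHERE Column2 = ?)
ORDER BY Column1, Column2
"""


def rows_for(column2_value):
    """Hypothetical helper: run the query for one Column2 input."""
    return conn.execute(QUERY, (column2_value,)).fetchall()


print(rows_for("B"))
print(rows_for("C"))
```

With more than 100k rows, an index on Column2 (and one on Column1) is likely to matter more than the exact form of the query, although that depends on the data distribution.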

Related

DB2: SQL to return all rows in a group having a particular value of a column in two latest records of this group

I have a DB2 table having one of the columns (A) which has either value PQR or XYZ.
I need output where the latest two records based on col C date have value A = PQR.
Sample Table
A B C
--- ----- ----------
PQR Mark 08/08/2019
PQR Mark 08/01/2019
XYZ Mark 07/01/2019
PQR Joe 10/11/2019
XYZ Joe 10/01/2019
PQR Craig 06/06/2019
PQR Craig 06/20/2019
In this sample table, my output would be the Mark and Craig records.
Since 11.1
You may use the nth_value OLAP function.
Refer to OLAP specification.
SELECT A, B, C
FROM
(
SELECT
A, B, C
, NTH_VALUE (A, 1) OVER (PARTITION BY B ORDER BY C DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) C1
, NTH_VALUE (A, 2) OVER (PARTITION BY B ORDER BY C DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) C2
FROM TAB
)
WHERE C1 = 'PQR' AND C2 = 'PQR'
Older versions
SELECT T.*
FROM TAB T
JOIN
(
SELECT B
FROM
(
SELECT
A, B
, ROWNUMBER() OVER (PARTITION BY B ORDER BY C DESC) RN
FROM TAB
)
WHERE RN IN (1, 2)
GROUP BY B
HAVING MIN(A) = MAX(A) AND COUNT(1) = 2 AND MIN(A) = 'PQR'
) G ON G.B = T.B;
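To check the row-numbering approach against the sample data without a DB2 instance, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. This is a stand-in, not DB2: it assumes SQLite 3.25 or later (for window functions), spells the function ROW_NUMBER, and stores the dates as ISO strings so that ORDER BY sorts them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (A TEXT, B TEXT, C TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("PQR", "Mark", "2019-08-08"),
        ("PQR", "Mark", "2019-08-01"),
        ("XYZ", "Mark", "2019-07-01"),
        ("PQR", "Joe", "2019-10-11"),
        ("XYZ", "Joe", "2019-10-01"),
        ("PQR", "Craig", "2019-06-06"),
        ("PQR", "Craig", "2019-06-20"),
    ],
)

# Number the rows of each B group from newest to oldest, keep the two
# newest, and accept a group only if both of them have A = 'PQR'.
QUERY = """
SELECT T.A, T.B, T.C
FROM tab T
JOIN (
    SELECT B
    FROM (
        SELECT A, B,
               ROW_NUMBER() OVER (PARTITION BY B ORDER BY C DESC) AS RN
        FROM tab
    )
    WHERE RN IN (1, 2)
    GROUP BY B
    HAVING COUNT(*) = 2 AND MIN(A) = 'PQR' AND MAX(A) = 'PQR'
) G ON G.B = T.B
ORDER BY T.B, T.C DESC
"""

rows = conn.execute(QUERY).fetchall()
groups = sorted({b for _, b, _ in rows})
print(groups)
for row in rows:
    print(row)
```

The join back to tab returns every row of a qualifying group, so Mark's older XYZ row is part of the output as well.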
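A simple solution could be the following, although note that it returns only the two most recent rows with A = 'PQR' across the whole table, not per value of B: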
SELECT A,B,C
FROM tab
WHERE A = 'PQR'
ORDER BY C DESC FETCH FIRST 2 ROWS only

Postgres: how to find rows having duplicate values in fields

How can I find if any value exists more than once in one row? An example:
id | c1 | c2 | c3
----+----+----+----
1 | a | b | c
2 | a | a | b
3 | b | b | b
The query should return rows 2 and 3 since they have the same value more than once. The solution I'm looking for is not 'where c1 = c2 or c1 = c3 or c2 = c3' since there can be any number of columns in tables I need to test. All values are text but can be any length.
One way to do that is to convert the columns to rows:
select *
from the_table tt
where exists (select 1
from ( values (c1), (c2), (c3) ) as t(v)
group by v
having count(*) > 1)
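If you want a dynamic solution where you don't have to list each column, you can do that by converting the row to a JSON value. Note that to_jsonb(tt) includes every column, id among them; use to_jsonb(tt) - 'id' to leave a key out of the comparison: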
select *
from the_table tt
where exists (select 1
from jsonb_each_text(to_jsonb(tt)) as j(k,v)
group by v
having count(*) > 1)
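The same columns-to-rows idea can be tried locally without a PostgreSQL server. The sketch below is a stand-in using SQLite through Python's sqlite3 module: json_each(json_array(...)) plays the role of the VALUES list, since SQLite has no jsonb_each_text. It assumes a SQLite build with the JSON functions available, which is the default in recent versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER, c1 TEXT, c2 TEXT, c3 TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "c"), (2, "a", "a", "b"), (3, "b", "b", "b")],
)

# json_array packs the row's columns into one array and json_each turns
# that array back into one row per element, so the values of a single
# table row can be grouped and counted.
QUERY = """
SELECT id
FROM the_table AS tt
WHERE EXISTS (
    SELECT 1
    FROM json_each(json_array(tt.c1, tt.c2, tt.c3))
    GROUP BY value
    HAVING COUNT(*) > 1
)
ORDER BY id
"""

dup_ids = [row[0] for row in conn.execute(QUERY)]
print(dup_ids)
```

The columns are still listed by hand here; only the PostgreSQL to_jsonb variant avoids naming them.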

How to merge tables in PostgreSQL?

I want to merge two tables from different schemas in the same PostgreSQL database, but I could not get a query to work.
Both tables have many columns and rows. I want to select A and B from table 1 and C, D, E from table 2. B and C hold the same kind of item, but the values they contain do not fully overlap. So I want to merge them and get A, (B/C), D, E.
I tried to use UNION but I got an error:
[42601]: ERROR: each UNION query must have the same number of columns.
And when I used LEFT JOIN, I got a syntax error near '.'.
In the last try my code looked like:
select A from table1 left join
table2.D, table2.E using B=C
You can use this kind of query:
Table
create table table1 (
A text,
B int
);
insert into table1 values ('test-a', 123);
create table table2 (
C int,
D text,
E text
);
insert into table2 values (3456, 'test-d', 'test-e');
Query
select A::text, B::text as BC, '' as D, '' as E from table1
union all
select '' as A, C::text as BC, D::text, E::text from table2
Result
a bc d e
test-a 123
3456 test-d test-e
That takes all records from table1 (columns A and B, plus dummy columns D and E) and appends the records from table2 (dummy column A, plus columns C, D and E).
Example: https://rextester.com/NWSEP53051
If you are using SQLite
Tables
create table table1 (A, B);
insert into table1 values ('test-a', 123);
create table table2 (C, D, E);
insert into table2 values (3456, 'test-d', 'test-e');
Query
select A, B as BC, '' as D, '' as E from table1
union all
select '' as A, C as BC, D, E from table2
Result
| A | BC | D | E |
| ------ | ---- | ------ | ------ |
| test-a | 123 | | |
| | 3456 | test-d | test-e |
Example: https://www.db-fiddle.com/f/rE1MeJQpjGH4FZVwWmTpEX/0
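If the goal is instead to match rows where B equals C, which is what the LEFT JOIN attempt in the question seems to aim at, the join condition belongs in an ON clause and the wanted columns go in the SELECT list. A minimal sketch using Python's sqlite3 module follows; the second row of table1 ('test-a2', 3456) is invented here so that one pair actually matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (A TEXT, B INTEGER);
    CREATE TABLE table2 (C INTEGER, D TEXT, E TEXT);
    INSERT INTO table1 VALUES ('test-a', 123), ('test-a2', 3456);
    INSERT INTO table2 VALUES (3456, 'test-d', 'test-e');
    """
)

# Every row of table1 is kept; D and E are filled in only where
# table2 has a row whose C equals table1's B.
QUERY = """
SELECT t1.A, t1.B AS BC, t2.D, t2.E
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t1.B = t2.C
ORDER BY t1.B
"""

joined = conn.execute(QUERY).fetchall()
for row in joined:
    print(row)
```

In PostgreSQL, tables in different schemas are referenced as schema_name.table_name in the FROM and JOIN clauses. Rows that exist only in table2 are dropped by a LEFT JOIN; keeping them too would need a FULL OUTER JOIN.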
You can also implement a merge using a temporary table: lock the target table (lock table test_tbl in exclusive mode;), then run the delete, update and insert steps against it.
https://parksuseong.blogspot.com/2019/07/postgresql-insert-merge-olap.html

SQL: Data Cleaning

I am facing a problem which I do not know how to categorize. So, pardon me for the generic title. I have a dataset like:
Table1: Column1, Column2, Column3.
According to my business logic, each 'Column1 Column2' pair can have only one value in Column3. So the table below is problematic because of the second entry:
Table1
Column1 Column2 Column3
A1 B1 R
A1 B1 O << ERROR! for A1-B1 pair only one value on column3 is accepted
A2 B2 R
A2 B3 J
A3 B3 K
A4 B5 K
From above table I would like to find the problematic entries:
A1 B1 R
A1 B1 O
Thanks in advance for your help !
Using your example column names, you can run the following query to see just the Column1/Column2 pairs that have more than one value in Column3.
SELECT Column1, Column2, COUNT(DISTINCT Column3) as Column3
FROM Table1
GROUP BY Column1, Column2
HAVING COUNT(DISTINCT Column3) > 1
You can omit the HAVING line to see the complete list of Column1/Column2 pairs.
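The grouped query returns one line per offending pair, while the question asks for the offending entries themselves. Joining the grouped result back to the table is one way to list them. The sketch below runs that in SQLite through Python's sqlite3 module with the sample data from the question; the alias bad is a name chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Column1 TEXT, Column2 TEXT, Column3 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("A1", "B1", "R"),
        ("A1", "B1", "O"),
        ("A2", "B2", "R"),
        ("A2", "B3", "J"),
        ("A3", "B3", "K"),
        ("A4", "B5", "K"),
    ],
)

# Inner query: pairs with more than one distinct Column3 value.
# Outer query: all original rows that belong to such a pair.
QUERY = """
SELECT t.Column1, t.Column2, t.Column3
FROM Table1 AS t
JOIN (
    SELECT Column1, Column2
    FROM Table1
    GROUP BY Column1, Column2
    HAVING COUNT(DISTINCT Column3) > 1
) AS bad
  ON bad.Column1 = t.Column1 AND bad.Column2 = t.Column2
ORDER BY t.Column1, t.Column2, t.Column3
"""

problem_rows = conn.execute(QUERY).fetchall()
for row in problem_rows:
    print(row)
```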

Postgresql table name or alias in SELECT and WHERE clauses without specifying column name

I have two tables:
CREATE TABLE a (id INT NOT NULL);
CREATE TABLE b (id INT NOT NULL);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (1);
If I try to get records from a for which there are records in b (query 1):
SELECT a.id, b FROM a LEFT JOIN b on a.id = b.id WHERE b is NOT NULL;
I get:
id | b
----+-----
1 | (1)
If I try to get records from a for which there are NO records in b (query 2):
SELECT a.id, b FROM a LEFT JOIN b on a.id = b.id WHERE b IS NULL;
I get:
id | b
----+---
2 |
It seems OK.
Then I alter b:
ALTER TABLE b ADD COLUMN s TEXT NULL;
then query 1 does not return any rows, query 2 returns the same rows and
SELECT a.id, b FROM a LEFT JOIN b on a.id = b.id;
returns
id | b
----+------
1 | (1,)
2 |
My questions are:
1. Why does PostgreSQL allow a table name or alias to be used in the WHERE clause without specifying a column name?
2. What is (1,) in column b of the resulting rows?
3. Why does (1,) satisfy neither IS NULL nor IS NOT NULL in query 1 and query 2?
P.S. If I alter table b as ALTER TABLE b ADD COLUMN s TEXT NOT NULL DEFAULT '' instead then queries 1 and 2 return the same rows.
Answering your questions in order:
1. A table name (or alias) used on its own refers to the whole row as a row value (composite value), whose member fields are the values of that row's columns.
2. (1,) is a row value whose first member is 1 and whose second member (your text column s) is NULL, so nothing is shown for it.
3. You are comparing the entire row value, and (1,) satisfies neither test: a row IS NULL only when all of its fields are NULL, and IS NOT NULL only when all of its fields are non-NULL.
More on point 3:
select *, b is not null as b_not_null, b is null as b_null from b;
Result:
id | b_not_null | b_null
----+------------+--------
1 | t | f
A row IS NULL when all of its members are NULL, and IS NOT NULL only when all of its members are non-NULL; a row that mixes NULL and non-NULL members satisfies neither test. Reproduce:
create table rowtest ( col1 int, col2 int);
insert into rowtest values (null,null), (1,1), (null,1);
select
col1, col2, rowtest,
case when rowtest is null then true else false end as rowtest_null
from rowtest;
Result:
col1 | col2 | rowtest | rowtest_null
------+------+---------+--------------
| | (,) | t
1 | 1 | (1,1) | f
| 1 | (,1) | f
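The rule for row values can be restated outside SQL. The sketch below is a plain Python model of it, not PostgreSQL code: the function names are made up, a row is modelled as a tuple, SQL NULL as None, and a missing row (the unmatched side of a LEFT JOIN) as None in place of the tuple.

```python
def row_is_null(row):
    """Model of `row IS NULL`: the row is missing, or every field is NULL."""
    return row is None or all(field is None for field in row)


def row_is_not_null(row):
    """Model of `row IS NOT NULL`: the row exists and no field is NULL."""
    return row is not None and all(field is not None for field in row)


examples = {
    "(1)": (1,),                # table b before the ALTER TABLE
    "(1,)": (1, None),          # table b after adding the nullable column s
    "(,)": (None, None),        # every field NULL
    "no matching row": None,    # LEFT JOIN found nothing in b
}

for label, row in examples.items():
    print(label, row_is_null(row), row_is_not_null(row))
```

Under this model (1,) fails both tests, which matches query 1 returning no rows after the ALTER TABLE while query 2 still returns only the unmatched row.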
Both of your queries can be rewritten so that they do not depend on row comparisons:
Query1: Get records from a with matching records from b
Using INNER JOIN which actually is the same as JOIN:
SELECT a.id, b FROM a JOIN b on a.id = b.id;
Query2: Get records from a with no matching records from b
Using NOT EXISTS instead of LEFT JOIN:
SELECT a.id
FROM a
WHERE NOT EXISTS (
SELECT 1
FROM b
WHERE a.id = b.id
);
For the last query if you really need the second empty column you can add a static value to select list like that:
SELECT a.id, null as b
The table name can be used in the SELECT or WHERE to refer to a record value containing the entire row of the table. In the output of psql a record will appear like (1) (if it has one field), or (1,2) (if it has two fields), etc. The (1,) that you see is a record with two fields that contain the values 1 and NULL. A value of record type can be null, e.g. in a left join if there is no matching row for the second table.