PGSQL - Joining two tables on complicated condition - postgresql

I got stuck during database migration on PostgreSQL and need your help.
I have two tables that I need to join: drzewa_mateczne.migracja (data I need to migrate) and ibl_as.t_adres_lesny (dictionary I need to join with migracja).
I need to join them on replace(drzewa_mateczne.migracja.adresy_lesne, ' ', '') = replace(ibl_as.t_adres_lesny.adres, ' ', ''). However, my data is not very regular, so I want to join each row on the first good match in the dictionary.
I've created the following query:
select
count(*)
from
drzewa_mateczne.migracja a
where
length(a.adresy_lesne) > 0
and replace(a.adresy_lesne, ' ', '') = (select substr(replace(al.adres, ' ', ''), 1, length(replace(a.adresy_lesne, ' ', ''))) from ibl_as.t_adres_lesny al limit 1)
The query doesn't return any rows.
It does successfully join empty rows if run without
length(a.adresy_lesne) > 0
The two following queries return rows (as expected):
select replace(adres, ' ', '')
from ibl_as.t_adres_lesny
where substr(replace(adres, ' ', ''), 1, 16) = '16-15-1-13-180-c'
limit 1
select replace(adresy_lesne, ' ', ''), length(replace(adresy_lesne, ' ', ''))
from drzewa_mateczne.migracja
where replace(adresy_lesne, ' ', '') = '16-15-1-13-180-c'
I suspect the problem is in the sub-query inside the 'where' clause of my query. If you could help me resolve this issue, or at least point me in the right direction, I'd be very grateful.
Thanks in advance,
Jan

You can largely simplify to:
SELECT count(*)
FROM drzewa_mateczne.migracja a
WHERE a.adresy_lesne <> ''
AND EXISTS (
SELECT 1 FROM ibl_as.t_adres_lesny al
WHERE replace(al.adres, ' ', '')
LIKE (replace(a.adresy_lesne, ' ', '') || '%')
)
a.adresy_lesne <> '' does the same as length(a.adresy_lesne) > 0, just faster.
Replace the correlated subquery with an EXISTS semi-join (to get only one match per row).
Replace the complex string construction with a simple LIKE expression.
More information on pattern matching and index support in these related answers:
PostgreSQL LIKE query performance variations
Difference between LIKE and ~ in Postgres
speeding up wildcard text lookups
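A minimal sketch of the EXISTS semi-join above, run against an in-memory SQLite database as a stand-in for PostgreSQL. The sample addresses are made up for illustration, and the schema prefixes are dropped because SQLite has no schemas here. It contrasts a plain join, which returns one row per dictionary match, with EXISTS, which counts each migracja row at most once.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the data below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE migracja (adresy_lesne TEXT);
    CREATE TABLE t_adres_lesny (adres TEXT);
    INSERT INTO t_adres_lesny VALUES
        ('01-02-3-04-100-a-00'),
        ('16-15-1-13-180-c-00'),
        ('16-15-1-13-180-c-01');
    INSERT INTO migracja VALUES ('16-15-1-13 -180-c'), (''), ('99-99-9');
""")

# Plain join: one result row per matching dictionary entry.
join_count = conn.execute("""
    SELECT count(*)
    FROM migracja a
    JOIN t_adres_lesny al
      ON replace(al.adres, ' ', '') LIKE (replace(a.adresy_lesne, ' ', '') || '%')
    WHERE a.adresy_lesne <> ''
""").fetchone()[0]

# EXISTS semi-join: each migracja row is counted once, however many matches exist.
exists_count = conn.execute("""
    SELECT count(*)
    FROM migracja a
    WHERE a.adresy_lesne <> ''
      AND EXISTS (
          SELECT 1 FROM t_adres_lesny al
          WHERE replace(al.adres, ' ', '')
                LIKE (replace(a.adresy_lesne, ' ', '') || '%')
      )
""").fetchone()[0]

print(join_count, exists_count)
```

Note that LIKE treats any % or _ inside adresy_lesne as a wildcard, so data containing those characters would need escaping first.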

What you're basically telling the database to do is to get you the count of rows from drzewa_mateczne.migracja that have a non-empty adresy_lesne field that is a prefix of the adres field of a semi-random ibl_as.t_adres_lesny row...
Lose the "limit 1" in the subquery and substitute the "=" with "in" and see if that is what you wanted...
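To see why the original query finds nothing, here is a small sketch using in-memory SQLite as a stand-in for PostgreSQL (sample data is hypothetical, schema prefixes omitted). With LIMIT 1 each row is compared against a prefix of one arbitrary dictionary row only; an empty address still matches because a zero-length prefix is an empty string; switching to IN compares against every dictionary row.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the data below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE migracja (adresy_lesne TEXT);
    CREATE TABLE t_adres_lesny (adres TEXT);
    INSERT INTO t_adres_lesny VALUES ('01-02-3-04-100-a-00'), ('16-15-1-13-180-c-00');
    INSERT INTO migracja VALUES ('16-15-1-13 -180-c'), ('');
""")

# Prefix of each dictionary address, cut to the length of the migrated address.
SUBQUERY = """
    SELECT substr(replace(al.adres, ' ', ''), 1, length(replace(a.adresy_lesne, ' ', '')))
    FROM t_adres_lesny al
"""
LHS = "replace(a.adresy_lesne, ' ', '')"


def count(condition):
    """Count migracja rows satisfying the given WHERE condition."""
    return conn.execute(f"SELECT count(*) FROM migracja a WHERE {condition}").fetchone()[0]


# Original shape: only one dictionary row is ever consulted.
limit1_count = count(f"length(a.adresy_lesne) > 0 AND {LHS} = ({SUBQUERY} LIMIT 1)")
# Without the length filter, the empty address matches the zero-length prefix.
unfiltered_count = count(f"{LHS} = ({SUBQUERY} LIMIT 1)")
# Suggested change: compare against the prefixes of all dictionary rows.
in_count = count(f"length(a.adresy_lesne) > 0 AND {LHS} IN ({SUBQUERY})")

print(limit1_count, unfiltered_count, in_count)
```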

Related

Postgres SQL - different results from LIKE query using OR vs ||

I have a table with an integer column. It has 12 records numbered 1000 to 1012. Remember, these are ints.
This query returns, as expected, 12 results:
select count(*) from proposals where qd_number::text like '%10%'
as does this:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) LIKE '%10%' OR qd_number::text LIKE '%10%' )
but this query returns 2 records:
SELECT COUNT(*) FROM "proposals" WHERE (lower(first_name) || ' ' || qd_number::text LIKE '%10%' )
which implies using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?
You probably have nulls in first_name. For these records, lower(first_name) || ' ' || qd_number::text evaluates to null, so the LIKE no longer finds the numbers.
"using || in concatenated where expressions is not equivalent to using OR. Is that correct or am I missing something else here?"
That is correct.
|| is the string concatenation operator in SQL, not the OR operator.
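The NULL behaviour described above can be reproduced with a short sketch. It uses in-memory SQLite, where || also yields NULL when either operand is NULL; CAST(... AS TEXT) replaces the PostgreSQL ::text shorthand, and the four sample rows are made up. The COALESCE variant is one possible way to keep the concatenated form, not something from the original thread.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE proposals (first_name TEXT, qd_number INTEGER);
    INSERT INTO proposals VALUES ('Ann', 1000), (NULL, 1001), (NULL, 1002), ('Bob', 1003);
""")


def count(where):
    """Count proposals rows satisfying the given WHERE condition."""
    return conn.execute(f"SELECT count(*) FROM proposals WHERE {where}").fetchone()[0]


# OR: a NULL on the left side does not stop the right side from matching.
or_count = count("lower(first_name) LIKE '%10%' OR CAST(qd_number AS TEXT) LIKE '%10%'")

# ||: a NULL first_name makes the whole concatenated string NULL, so LIKE never matches.
concat_count = count("lower(first_name) || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'")

# COALESCE replaces the NULL with '' before concatenating.
coalesce_count = count(
    "coalesce(lower(first_name), '') || ' ' || CAST(qd_number AS TEXT) LIKE '%10%'"
)

print(or_count, concat_count, coalesce_count)
```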

Correct statement for UNION ALL with three or more selects

I have the following situation:
I have a script consisting of 6 selects joined by "UNION ALL".
From the DB2 CLP console, this script fails. Curiously, each query works on its own, and they even work when grouped in pairs. However, when I try three or more, it fails.
So, my question is: is there a limit on using more than one UNION ALL?
My environment is:
Client. DB2 Connect server 10.1
zOS 390 (no idea what is the DB2 version on that side)
AIX 7.1
The query looks like this (repeated three times):
SELECT
,'GG'
,varchar(right( '000000000000000' || rtrim(ltrim(eeee.zzzz)), 15), 15)
,substr(char(right('**********'||char(left(replace(eeee.yyyy,' ','*')||'**********',10),10),10),10),1,7)
,eeee.kkkkk
,eeee.hhhhhh
,CASE WHEN hhhhhh='A5 ' THEN 'ARS' WHEN hhhhhh='A6 ' THEN 'AUD' WHEN hhhhhh='B5 ' THEN 'BRL' WHEN hhhhhh='U1 ' THEN 'GBP' WHEN hhhhhh='B9 ' THEN 'BND' WHEN hhhhhh='B6 ' THEN 'BNG' WHEN hhhhhh='C1 ' THEN 'CAD' WHEN hhhhhh='C3 ' THEN 'CLP' WHEN hhhhhh='C4 ' THEN 'CNY' WHEN hhhhhh='C5 ' THEN 'COP' WHEN hhhhhh='C7 ' THEN 'CRC' WHEN hhhhhh='L5 ' THEN 'HRK' WHEN hhhhhh='C9 ' THEN 'CYP' WHEN hhhhhh='X0 ' THEN 'CZK' WHEN hhhhhh='D0 ' THEN 'DKK' WHEN hhhhhh='D1 ' THEN 'DOP' WHEN hhhhhh='U0 ' THEN 'EGP' WHEN hhhhhh='E3 ' THEN 'EUR' WHEN hhhhhh='G5 ' THEN 'GTQ' WHEN hhhhhh='H0 ' THEN 'HTG' WHEN hhhhhh='H3 ' THEN 'HUF' WHEN hhhhhh='I1 ' THEN 'INR' WHEN hhhhhh='I2 ' THEN 'IDR' WHEN hhhhhh='K2 ' THEN 'WON' WHEN hhhhhh='L6 ' THEN 'LVL' WHEN hhhhhh='L7 ' THEN 'LTL' WHEN hhhhhh='M2 ' THEN 'MYR' WHEN hhhhhh='M6 ' THEN 'MXN' WHEN hhhhhh='I8 ' THEN 'ILS' WHEN hhhhhh='N2 ' THEN 'NZD' WHEN hhhhhh='N4 ' THEN 'NIO' WHEN hhhhhh='N6 ' THEN 'NOK' WHEN hhhhhh='T4 ' THEN 'XPF' WHEN hhhhhh='P0 ' THEN 'PKR' WHEN hhhhhh='P1 ' THEN 'PAB' WHEN hhhhhh='P3 ' THEN 'PEN' WHEN hhhhhh='P4 ' THEN 'PHP' WHEN hhhhhh='P5 ' THEN 'PLN' WHEN hhhhhh='R2 ' THEN 'RON' WHEN hhhhhh='U3 ' THEN 'RUB' WHEN hhhhhh='S0 ' THEN 'SAR' WHEN hhhhhh='R6 ' THEN 'RSD' WHEN hhhhhh='S2 ' THEN 'SGD' WHEN hhhhhh='K5 ' THEN 'SKK' WHEN hhhhhh='S4 ' THEN 'ZAR' WHEN hhhhhh='C2 ' THEN 'LKR' WHEN hhhhhh='S8 ' THEN 'SEK' WHEN hhhhhh='S9 ' THEN 'CHF' WHEN hhhhhh='T2 ' THEN 'THB' WHEN hhhhhh='T6 ' THEN 'TRL' WHEN hhhhhh='U4 ' THEN 'USD' WHEN hhhhhh='U6 ' THEN 'UAH' WHEN hhhhhh='U5 ' THEN 'AED' WHEN hhhhhh='U2 ' THEN 'UYU' WHEN hhhhhh='V0 ' THEN 'VEB' WHEN hhhhhh='V1 ' THEN 'VND' WHEN hhhhhh='J1 ' THEN 'JPY' ELSE '###' END
, case when eeee.FCRCIDF='Y' then 1 else 0 end
,
CASE
WHEN SUBSTR(eeee.yyyy,7,1) = 'X' THEN 'X'
WHEN SUBSTR(eeee.yyyy,4,2) = 'O' THEN 'O'
WHEN SUBSTR(eeee.yyyy,4,2) = 'C' THEN 'C'
WHEN SUBSTR(eeee.yyyy,4,2) = 'R' THEN 'R'
WHEN eeee.lll = 'F' THEN 'F'
WHEN eeee.ppp <> '' THEN 'D'
WHEN eeee.rrr = 0 THEN '0'
WHEN eeee.rrr <> eeee.ACINTOT THEN 'P'
WHEN eeee.rrr = eeee.ACINTOT THEN '1'
ELSE '*'
END
,1
,eeee.DCINISS
,0
from (SELECT ori.*,oric.tttt FROM www.SK1V01_CUSTOMER ori left OUTER JOIN www.SK1V01A_CUSTCUF oric
ON ori.bbbb=oric.bbbb and ori.ICUSCNO=oric.ICUSCNO ) as aaaa
,www.SK1V02_OPENBILL eeee,www.SK1V41_OPENBILL kkkk
where aaaa.bbbb=eeee.bbbb
and aaaa.cagllic=eeee.cagllic
and aaaa.icuscno=eeee.icuscno
Without the entire statement it's pretty hard to determine the exact reasons. And given that just one portion is so long and poorly formatted, I'm not sure we'd want to dig through it all. But I can suggest a few approaches that may help resolve your problem.
Simplest part first. In practically any computer language, well-formatted code helps you see the structure of what's going on. It may also help you spot the differences between your queries. (Perhaps you know this, and your code merely lost its formatting when you tried to post it.)
When trying to UNION multiple complex queries, it's not uncommon to have column inconsistencies among the queries. You might have missing or extra columns, or columns out of order. But it's possible some of your column expressions are evaluating to different types. You might want to cast() those expressions, or use type conversion functions, just to be sure.
There's so much going on here. Try testing with a version where you comment out large chunks of code, same on each major subquery, until you find which part is causing the problem.
You have a ridiculously long CASE expression on hhhhhh. Why don't you put these value pairs into a lookup table that you can join to?
Try using a modular approach, just as you should when writing a large program. You could create a view for each of the major queries, then UNION them together. (Some developers use layers of views like layers of modular code.)
Metadata about your views is available in the database catalog views. This means you could write a query to compare the attributes of the columns in your set of union views.
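As an illustration of the lookup-table suggestion, here is a minimal sketch in in-memory SQLite rather than DB2. The table names openbill and currency_lookup, the bill_id column, and the sample rows are hypothetical; only the hhhhhh column and a few code pairs are taken from the question. A LEFT JOIN plus COALESCE plays the role of the CASE expression's ELSE '###' branch.

```python
import sqlite3

# In-memory SQLite stands in for DB2; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE openbill (bill_id INTEGER, hhhhhh TEXT);
    CREATE TABLE currency_lookup (code TEXT PRIMARY KEY, iso TEXT);
    -- A few of the value pairs from the CASE expression, stored as data.
    INSERT INTO currency_lookup VALUES ('A5 ', 'ARS'), ('E3 ', 'EUR'), ('U4 ', 'USD');
    INSERT INTO openbill VALUES (1, 'U4 '), (2, 'E3 '), (3, 'ZZ ');
""")

# Unknown codes have no lookup row, so COALESCE supplies the '###' fallback.
rows = conn.execute("""
    SELECT b.bill_id, coalesce(c.iso, '###') AS currency
    FROM openbill b
    LEFT JOIN currency_lookup c ON c.code = b.hhhhhh
    ORDER BY b.bill_id
""").fetchall()

print(rows)
```

With the pairs stored once in a table, every branch of the UNION ALL can join to the same lookup instead of repeating the CASE expression.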

Newbie needing help in PostgreSQL

My current SQL statement is:
SELECT *
FROM names
WHERE UPPER(first_name) LIKE UPPER('John Smith%')
OR UPPER(last_name) LIKE UPPER('John Smith%')
OR UPPER(first_name || ' ' || last_name) LIKE UPPER('John Smith%')
I want to search my table for "John Smith", this SQL statement is okay.
But what if I have an entry with the first name as 'John Kevin' and last name 'Smith', this wouldn't include that entry. What do I need to add? Thanks all! :)
You can use the Similar To operator to cover all possible combinations.
SELECT * FROM names
WHERE UPPER(first_name || ' ' || last_name) SIMILAR TO '%(JOHN|SMITH)%';
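An alternative sketch, not from the answer above: split the search text into words and require every word to appear somewhere in the combined name. This matches 'John Kevin' / 'Smith' without also matching rows that contain only one of the words. It runs on in-memory SQLite with made-up rows, and the search helper is hypothetical; the same WHERE clause shape should carry over to PostgreSQL.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (first_name TEXT, last_name TEXT);
    INSERT INTO names VALUES
        ('John', 'Smith'), ('John Kevin', 'Smith'), ('Jane', 'Smith'), ('John', 'Doe');
""")


def search(full_name):
    """Return rows whose combined name contains every word of full_name."""
    terms = full_name.upper().split()
    if not terms:
        return []
    # One LIKE condition per word, all of which must hold.
    where = " AND ".join(
        "upper(first_name || ' ' || last_name) LIKE ?" for _ in terms
    )
    params = [f"%{term}%" for term in terms]
    sql = f"SELECT first_name, last_name FROM names WHERE {where} ORDER BY first_name"
    return conn.execute(sql, params).fetchall()


matches = search("John Smith")
print(matches)
```

This is substring matching, so 'John' would also match 'Johnson', and % or _ in the search text act as wildcards unless escaped.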

TSQL - CONCAT_NULL_YIELDS_NULL ON Not Returning Null?

I have had a look around and seem to have come across a strange issue with SQL Server 2008 R2.
I understand that CONCAT_NULL_YIELDS_NULL = ON means that the following will always resolve to NULL
SELECT NULL + 'My String'
I'm happy with that, however when using this in conjunction with COALESCE() it doesn’t appear to be working on my database.
Consider the following query where MyString is VARCHAR(2000)
SELECT COALESCE(MyString + ', ', '') FROM MyTableOfValues
Now in my query, when MyString IS NULL it returns an empty (NOT NULL) string. I can see this in the query results window.
However, strangely enough, when running this in conjunction with an INSERT it fails to honour CONCAT_NULL_YIELDS_NULL and instead inserts a ', ' with nothing in front of it.
Query is as follows for insert.
SET CONCAT_NULL_YIELDS_NULL ON
INSERT INTO Mytable(StringValue)
SELECT COALESCE(MyString + ', ', '')
FROM MyTableOfValues
Further to this I have also checked the database and CONCAT_NULL_YIELDS_NULL = TRUE…
Use NULLIF(MyString, '') instead of just MyString:
SELECT COALESCE(NULLIF(MyString, '') + ', ', '') FROM MyTableOfValues
Coalesce returns the first nonnull expression among its arguments.
You're getting a ', ' because it's the first nonnull expression in your coalesce call.
http://msdn.microsoft.com/en-us/library/ms190349.aspx
From some of the answers provided I was able to ascertain a more in-depth understanding of COALESCE().
The reason the above query did not fully work was that, although I was checking for nulls, an empty string ('') is not considered null. Therefore, although the above query worked, I should have checked for empty strings in my table first.
e.g.
SELECT COALESCE(FirstName + ', ', '') + Surname AS FullName
FROM
(
SELECT 'Joe' AS Firstname, 'Bloggs' AS Surname UNION ALL
SELECT NULL, 'Jones' UNION ALL
SELECT '', 'Jones' UNION ALL
SELECT 'Bob', 'Tielly'
) AS [MyTable]
Will return
FullName
-----------
Joe, Bloggs
Jones
, Jones
Bob, Tielly
Now row 3 has returned a "," character which I was not originally expecting due to a Blank but NOT NULL value.
The following code now works as expected as it checks for blank values. It works, but it looks like I took the long way around. There may be a better way.
-- Amended Query
SELECT COALESCE(REPLACE(FirstName, Firstname , Firstname + ', '), '') + Surname AS FullName0
FROM
(
SELECT 'Joe' AS Firstname, 'Bloggs' AS Surname UNION ALL
SELECT NULL, 'Jones' UNION ALL
SELECT '', 'Jones' UNION ALL
SELECT 'Bob', 'Tielly'
) AS [MyTable]
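For comparison, here is a sketch of the NULLIF approach from the earlier answer applied to the same four rows. It runs on in-memory SQLite, so || replaces the T-SQL + concatenation operator; like + under CONCAT_NULL_YIELDS_NULL ON, || returns NULL when either side is NULL. NULLIF(FirstName, '') turns the blank name into NULL, so COALESCE falls back to '' for both the NULL and the blank row.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; || is used where T-SQL would use +.
conn = sqlite3.connect(":memory:")

rows = conn.execute("""
    SELECT coalesce(FirstName || ', ', '') || Surname              AS plain,
           coalesce(nullif(FirstName, '') || ', ', '') || Surname  AS with_nullif
    FROM (
        SELECT 'Joe' AS FirstName, 'Bloggs' AS Surname UNION ALL
        SELECT NULL, 'Jones' UNION ALL
        SELECT '', 'Jones' UNION ALL
        SELECT 'Bob', 'Tielly'
    ) AS t
""").fetchall()

# Sort for a stable comparison; row order of the derived table is not relied on.
plain = sorted(row[0] for row in rows)
with_nullif = sorted(row[1] for row in rows)

print(plain)        # still contains ', Jones' for the blank first name
print(with_nullif)  # the blank first name is treated like NULL
```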

Sum like operation for strings in t-sql

I already have a query to concatenate the values:
DECLARE @ids VARCHAR(8000)
SELECT @ids = COALESCE(@ids + ', ', '') + concatenatedid
FROM #HH
but if I have to do it inline, how can I do that? Any help please.
SELECT sum(quantity), COALESCE(@ids + ', ', '') + concatenatedid from #HH
Thanks.
Use the XML PATH trick. You may need a CAST if concatenatedid is not a string type.
SELECT
SUBSTRING(
(
SELECT
',' + concatenatedid
FROM
#HH
FOR XML PATH ('')
)
, 2, 7999)
Also:
Join characters using SET BASED APPROACH (Sql Server 2005)
Subquery returned more than 1 value
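The FOR XML PATH query only runs on SQL Server, so the sketch below shows the intended result shape using SQLite's group_concat aggregate instead: one row holding both the sum and the comma-separated list. The table hh and its rows are hypothetical stand-ins for #HH. On SQL Server 2017 and later, STRING_AGG fills the same role as group_concat; on 2005 to 2016 the XML PATH form above is the usual route.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; hh is a hypothetical copy of #HH.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hh (concatenatedid TEXT, quantity INTEGER);
    INSERT INTO hh VALUES ('A1', 2), ('B2', 3), ('C3', 5);
""")

# Both aggregates are computed inline in a single SELECT.
total, ids = conn.execute("""
    SELECT sum(quantity), group_concat(concatenatedid, ', ')
    FROM hh
""").fetchone()

print(total, ids)
```

The order of the concatenated values is not guaranteed here, so sort them afterwards if the order matters.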