How to use queried table name in subquery - firebird

I'm trying to query field names as well as their maximum length in their corresponding table with a single query - is it at all possible? I've read about correlated subqueries, but I couldn't get the desired result.
Here is the query I have so far:
select T1.RDB$FIELD_NAME, T2.RDB$FIELD_NAME, T2.RDB$RELATION_NAME as tabName, T1.RDB$CHARACTER_SET_ID, T1.RDB$FIELD_LENGTH,
(select max(char_length(T2.RDB$FIELD_NAME))
FROM tabName as MaxLength)
from RDB$FIELDS T1, RDB$RELATION_FIELDS T2
The above doesn't work because, of course, here the subquery tries to find "tabName" table. My guess is that I should use some kind of joins, but my SQL skills are very limited in this matter.
The origin of the request is that I want to apply this script to convert all my non-UTF8 fields to UTF8, but I run into "string truncation" errors because I have a few VARCHAR(8192) fields. Usually, none of the fields would actually use these 8192 chars, but I'd rather make sure before truncating.

What you're trying to do cannot be done this way. It looks like you want to obtain the actual maximum length of fields in tables, but you cannot dynamically reference table and column names like this; being able to do that would be a SQL injection heaven. In addition, your use of a SQL-89 cross join instead of an inner join (preferably in SQL-92 style) causes other problems, as you will combine fields incorrectly (as a Cartesian product).
Instead you need to write PSQL to dynamically build and execute the statement to obtain the lengths (using EXECUTE BLOCK (or a stored procedure) and EXECUTE STATEMENT).
For example, something like this:
execute block
returns (
table_name varchar(63) character set unicode_fss,
column_name varchar(63) character set unicode_fss,
type varchar(10),
length smallint,
charset_name varchar(63) character set unicode_fss,
collation_name varchar(63) character set unicode_fss,
max_length smallint)
as
begin
for select
trim(rrf.RDB$RELATION_NAME) as table_name,
trim(rrf.RDB$FIELD_NAME) as column_name,
case rf.RDB$FIELD_TYPE when 14 then 'CHAR' when 37 then 'VARCHAR' end as type,
coalesce(rf.RDB$CHARACTER_LENGTH, rf.RDB$FIELD_LENGTH / rcs.RDB$BYTES_PER_CHARACTER) as length,
trim(rcs.RDB$CHARACTER_SET_NAME) as charset_name,
trim(rc.RDB$COLLATION_NAME) as collation_name
from RDB$RELATIONS rr
inner join RDB$RELATION_FIELDS rrf
on rrf.RDB$RELATION_NAME = rr.RDB$RELATION_NAME
inner join RDB$FIELDS rf
on rf.RDB$FIELD_NAME = rrf.RDB$FIELD_SOURCE
inner join RDB$CHARACTER_SETS rcs
on rcs.RDB$CHARACTER_SET_ID = rf.RDB$CHARACTER_SET_ID
left join RDB$COLLATIONS rc
on rc.RDB$CHARACTER_SET_ID = rf.RDB$CHARACTER_SET_ID
and rc.RDB$COLLATION_ID = rf.RDB$COLLATION_ID
and rc.RDB$COLLATION_NAME <> rcs.RDB$DEFAULT_COLLATE_NAME
where coalesce(rr.RDB$RELATION_TYPE, 0) = 0 and coalesce(rr.RDB$SYSTEM_FLAG, 0) = 0
and rf.RDB$FIELD_TYPE in (14 /* char */, 37 /* varchar */)
into table_name, column_name, type, length, charset_name, collation_name
do
begin
execute statement 'select max(character_length("' || replace(column_name, '"', '""') || '")) from "' || replace(table_name, '"', '""') || '"'
into max_length;
suspend;
end
end
As an aside, the maximum length of a VARCHAR of character set UTF8 is 8191, not 8192.
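To see what text the loop above hands to EXECUTE STATEMENT, the string concatenation can be reproduced outside the database. The following is only a Python sketch of the same quoting rule (wrap the identifier in double quotes and double any embedded double quote); the helper names quote_identifier and build_max_length_sql are made up for illustration and are not part of Firebird.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def build_max_length_sql(table_name: str, column_name: str) -> str:
    """Return the statement text built inside the EXECUTE BLOCK loop."""
    return (
        f"select max(character_length({quote_identifier(column_name)})) "
        f"from {quote_identifier(table_name)}"
    )


# Hypothetical table and column names, used only to show the generated text.
print(build_max_length_sql("CUSTOMER", "NAME"))
print(build_max_length_sql('ODD"TABLE', "NOTES"))
```

Because the names come from the system tables and are quoted this way, a table or column name containing a double quote cannot break out of the identifier.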

Concatenate string instead of just replacing it

I have a table with standard columns where I want to perform regular INSERTs.
But one of the columns is of type varchar with special semantics. It's a string that's supposed to behave as a set of strings, where the elements of the set are separated by commas.
E.g. if one row has in that varchar column the value fish,sheep,dove, and I insert the string ,fish,eagle, I want the result to be fish,sheep,dove,eagle (i.e. eagle gets added to the set, but fish doesn't because it's already in the set).
I have here this Postgres code that does the "set concatenation" that I want:
SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array('fish,sheep,dove' || ',fish,eagle', ','))) AS x;
But I can't figure out how to apply this logic to insertions.
What I want is something like:
CREATE TABLE IF NOT EXISTS t00(
userid int8 PRIMARY KEY,
a int8,
b varchar);
INSERT INTO t00 (userid,a,b) VALUES (0,1,'fish,sheep,dove');
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x;
How can I achieve something like that?
Storing comma separated values is a huge mistake to begin with. But if you really want to make your life harder than it needs to be, you might want to create a function that merges two comma separated lists:
create function merge_lists(p_one text, p_two text)
returns text
as
$$
select string_agg(item, ',')
from (
select e.item
from unnest(string_to_array(p_one, ',')) as e(item)
where e.item <> '' --< necessary because of the leading , in your data
union
select t.item
from unnest(string_to_array(p_two, ',')) t(item)
where t.item <> ''
) t;
$$
language sql;
If you are using Postgres 14 or later, unnest(string_to_array(..., ',')) can be replaced with string_to_table(..., ',').
Then your INSERT statement gets a bit simpler:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = merge_lists(excluded.b, t00.b);
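The set semantics of merge_lists can be checked with a few lines of Python. This is only a sketch of the same idea (split on commas, drop empty items, remove duplicates), not a replacement for the SQL function. Two differences are assumptions of the sketch: it keeps items in first-seen order, whereas the SQL union gives no ordering guarantee, and it does not model NULL inputs.

```python
def merge_lists(one: str, two: str) -> str:
    """Merge two comma-separated lists, dropping empty items and duplicates."""
    items = [item for item in (one + "," + two).split(",") if item != ""]
    # dict.fromkeys removes duplicates while keeping first-seen order
    return ",".join(dict.fromkeys(items))


print(merge_lists("fish,sheep,dove", ",fish,eagle"))  # fish,sheep,dove,eagle
```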
I think I was only missing parentheses around the SELECT statement:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = (SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x);

Why does atttypmod differ from character_maximum_length?

I'm converting some information_schema queries to system catalog queries and I'm getting different results for character maximum length.
SELECT column_name,
data_type ,
character_maximum_length AS "maxlen"
FROM information_schema.columns
WHERE table_name = 'x'
returns the results I expect, e.g.:
city character varying 255
company character varying 1000
The equivalent catalog query
SELECT attname,
atttypid::regtype AS datatype,
NULLIF(atttypmod, -1) AS maxlen
FROM pg_attribute
WHERE CAST(attrelid::regclass AS varchar) = 'x'
AND attnum > 0
AND NOT attisdropped
Seems to return every length + 4:
city character varying 259
company character varying 1004
Why the difference? Is it safe to always simply subtract 4 from the result?
You could say it's safe to subtract 4 from the result for types char and varchar. Under the hood, the information_schema.columns view calls the function information_schema._pg_char_max_length (this is your difference, since you don't), whose body is:
CREATE OR REPLACE FUNCTION information_schema._pg_char_max_length(typid oid, typmod integer)
RETURNS integer
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
AS $function$SELECT
CASE WHEN $2 = -1 /* default typmod */
THEN null
WHEN $1 IN (1042, 1043) /* char, varchar */
THEN $2 - 4
WHEN $1 IN (1560, 1562) /* bit, varbit */
THEN $2
ELSE null
END$function$
That said, for chars and varchars it always subtracts 4.
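The CASE expression above is small enough to restate in Python, which makes the numbers from the question easy to check. This is a sketch that mirrors the function body shown above; the name char_max_length is made up for illustration.

```python
from typing import Optional

CHAR_TYPES = {1042, 1043}  # char, varchar (type OIDs from the function body)
BIT_TYPES = {1560, 1562}   # bit, varbit


def char_max_length(typid: int, typmod: int) -> Optional[int]:
    """Mirror of the CASE expression in _pg_char_max_length."""
    if typmod == -1:           # default typmod: no declared length
        return None
    if typid in CHAR_TYPES:
        return typmod - 4
    if typid in BIT_TYPES:
        return typmod
    return None


print(char_max_length(1043, 259))   # 255, the "city" column
print(char_max_length(1043, 1004))  # 1000, the "company" column
```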
Your query is therefore not fully equivalent: to match the view it would also need a join to pg_type to resolve the underlying type of the column (for example, when the column is a domain), and it would have to wrap the value in this function to return proper values. If you wish to simplify, you can do it without the join (it won't be bulletproof though):
SELECT attname,
atttypid::regtype AS datatype,
NULLIF(information_schema._pg_char_max_length(atttypid, atttypmod), -1) AS maxlen
FROM pg_attribute
WHERE CAST(attrelid::regclass AS varchar) = 'x'
AND attnum > 0
AND NOT attisdropped
This should do it for you. Should you wish to investigate the matter further, refer to view definition of information_schema.columns.

Removing all the Alphabets from a string using a single SQL Query [duplicate]

I'm currently doing a data conversion project and need to strip all alphabetical characters from a string. Unfortunately I can't create or use a function as we don't own the source machine making the methods I've found from searching for previous posts unusable.
What would be the best way to do this in a select statement? Speed isn't too much of an issue as this will only be running over 30,000 records or so and is a once off statement.
You can do this in a single statement. You're not really creating a statement with 200+ REPLACEs are you?!
update tbl
set S = U.clean
from tbl
cross apply
(
select Substring(tbl.S,v.number,1)
-- this table will cater for strings up to length 2047
from master..spt_values v
where v.type='P' and v.number between 1 and len(tbl.S)
and Substring(tbl.S,v.number,1) like '[0-9]'
order by v.number
for xml path ('')
) U(clean)
Working SQL Fiddle showing this query with sample data
Replicated below for posterity:
create table tbl (ID int identity, S varchar(500))
insert tbl select 'asdlfj;390312hr9fasd9uhf012 3or h239ur ' + char(13) + 'asdfasf'
insert tbl select '123'
insert tbl select ''
insert tbl select null
insert tbl select '123 a 124'
Results
ID S
1 390312990123239
2 123
3 (null)
4 (null)
5 123124
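As a cross-check of the results above, the same per-character filter can be sketched in Python. This is not part of the SQL solution; keep_digits is a made-up name, and returning None when nothing is kept is an assumption chosen to match the (null) rows in the result table.

```python
from typing import Optional


def keep_digits(value: Optional[str]) -> Optional[str]:
    """Keep only the characters 0-9, in their original order."""
    if value is None:
        return None
    digits = "".join(ch for ch in value if ch in "0123456789")
    # An empty result is reported as None, like the (null) rows above.
    return digits or None


samples = [
    "asdlfj;390312hr9fasd9uhf012 3or h239ur " + "\r" + "asdfasf",
    "123",
    "",
    None,
    "123 a 124",
]
for row_id, sample in enumerate(samples, start=1):
    print(row_id, keep_digits(sample))
```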
A recursive CTE can help here.
;WITH CTE AS
(
SELECT
[ProductNumber] AS OrigProductNumber
,CAST([ProductNumber] AS VARCHAR(100)) AS [ProductNumber]
FROM [AdventureWorks].[Production].[Product]
UNION ALL
SELECT OrigProductNumber
,CAST(STUFF([ProductNumber], PATINDEX('%[^0-9]%', [ProductNumber]), 1, '') AS VARCHAR(100) ) AS [ProductNumber]
FROM CTE WHERE PATINDEX('%[^0-9]%', [ProductNumber]) > 0
)
SELECT * FROM CTE
WHERE PATINDEX('%[^0-9]%', [ProductNumber]) = 0
OPTION (MAXRECURSION 0)
output:
OrigProductNumber ProductNumber
WB-H098 098
VE-C304-S 304
VE-C304-M 304
VE-C304-L 304
TT-T092 092
Here is RichardTheKiwi's script wrapped in a function, for use in selects without cross apply.
I also added the dot, because in my case I use it for double and money values stored in a varchar field.
CREATE FUNCTION dbo.ReplaceNonNumericChars (@string VARCHAR(5000))
RETURNS VARCHAR(1000)
AS
BEGIN
-- treat a comma as a decimal separator
SET @string = REPLACE(@string, ',', '.')
-- keep only digits and dots, in their original order
SET @string = (SELECT SUBSTRING(@string, v.number, 1)
FROM master..spt_values v
WHERE v.type = 'P'
AND v.number BETWEEN 1 AND LEN(@string)
AND (SUBSTRING(@string, v.number, 1) LIKE '[0-9]'
OR SUBSTRING(@string, v.number, 1) LIKE '[.]')
ORDER BY v.number
FOR XML PATH('')
)
RETURN @string
END
GO
Thanks RichardTheKiwi +1
Well if you really can't use a function, I suppose you could do something like this:
SELECT REPLACE(REPLACE(REPLACE(LOWER(col),'a',''),'b',''),'c','')
FROM dbo.table...
Obviously it would be a lot uglier than that, since I only handled the first three letters, but it should give the idea.

Get all instances of primary keys of a table

This is a simple example of what I need: for any given table, I need to get all the instances of the primary key. The example is small, but I need a generic way to do it.
create table foo
(
a numeric
,b text
,c numeric
,constraint pk_foo primary key (a,b)
);
insert into foo(a,b,c) values (1,'a',1),(2,'b',2),(3,'c',3);
select <the magical thing>
result
  | a | b
1 | 1 | a
2 | 2 | b
3 | 3 | c
.. ...
I need to check whether the user changes the values of the primary keys, but I don't want to repeat code in too many tables. I need a generic way to do it: I will put <the magical thing> in a function and call it from a before-update trigger, and so on.
In PostgreSQL you must always provide a resulting type for a query. However, you can obtain the code of the query you need, and then execute the query from the client:
create or replace function get_key_only_sql(regclass) returns text as $$
select 'select '|| (
select string_agg(quote_ident(att.attname), ', ' order by col)
from pg_index i
join lateral unnest(indkey) col on (true)
join pg_attribute att on (att.attrelid = i.indrelid and att.attnum = col)
where i.indrelid = $1 and i.indisprimary
group by i.indexrelid
limit 1) || ' from '||$1::text
$$ language sql;
Here's some client pseudocode using the function above:
sql = pgexecscalar("select get_key_only_sql('mytable'::regclass)");
rs = pgopen(sql);

Get columns that differ between 2 rows

I have a table company with 60 columns. The goal is to create a tool to find, compare and eliminate duplicates in this table.
Example: I find 2 companies that potentially are the same, but I need to know which values (columns) differ between these 2 rows in order to continue.
I could compare the two rows column by column, 60 times, but I am looking for a simpler and more generic solution.
Something like:
SELECT * FROM company where co_id=22
SHOW DIFFERENCE
SELECT * FROM company where co_id=33
The result should be the column names that differ.
For this you may use an intermediate key/value representation of the rows, with JSON functions or alternatively with the hstore extension (now only of historical interest). JSON comes built-in with every reasonably recent version of PostgreSQL, whereas hstore must be installed in the database with CREATE EXTENSION.
Demo:
CREATE TABLE table1 (id int primary key, t1 text, t2 text, t3 text);
Let's insert two rows that differ by the primary key and one other column (t3).
INSERT INTO table1 VALUES
(1,'foo','bar','baz'),
(2,'foo','bar','biz');
Solution with json
First we get a key/value representation of the rows with the original row number, then we pair the rows based on their original row number and filter out those with the same "value" column:
WITH rowcols AS (
select rn, key, value
from (select row_number() over () as rn,
row_to_json(table1.*) as r from table1) AS s
cross join lateral json_each_text(s.r)
)
select r1.key from rowcols r1 join rowcols r2
on (r1.rn=r2.rn-1 and r1.key = r2.key)
where r1.value <> r2.value;
Sample result:
key
-----
id
t3
Solution with hstore
SELECT skeys(h1-h2) from
(select hstore(t.*) as h1 from table1 t where id=1) h1
CROSS JOIN
(select hstore(t.*) as h2 from table1 t where id=2) h2;
h1-h2 computes the difference key by key and skeys() outputs the result as a set.
Result:
skeys
-------
id
t3
The select-list might be refined with skeys((h1-h2)-'id'::text) to always remove id which, as the primary key, will obviously always differ between rows.
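Both solutions above reduce each row to key/value pairs and keep the keys whose values do not match. As a rough illustration of that idea only, here is a Python sketch with rows modelled as dictionaries of text values; the function name differing_columns and its ignore parameter are made up, and NULL values are not modelled.

```python
from typing import Dict, Iterable, List


def differing_columns(
    row1: Dict[str, str], row2: Dict[str, str], ignore: Iterable[str] = ()
) -> List[str]:
    """Return the keys of row1 whose value has no identical match in row2."""
    skipped = set(ignore)
    return [
        key
        for key, value in row1.items()
        if key not in skipped and row2.get(key) != value
    ]


row1 = {"id": "1", "t1": "foo", "t2": "bar", "t3": "baz"}
row2 = {"id": "2", "t1": "foo", "t2": "bar", "t3": "biz"}

print(differing_columns(row1, row2))                 # ['id', 't3']
print(differing_columns(row1, row2, ignore=["id"]))  # ['t3']
```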
Here's a stored procedure that should get you most of the way...
While this should work "as is", it has no error checking, which you should add.
It gets all the columns in the table, and loops over them. A difference is when the count of the distinct items is more than one.
Also, the output is:
The count of the number of differences
Messages for each column where there is a difference
It might be more useful to return a rowset of the columns with the differences. Anyway, good luck!
Usage:
SELECT showdifference('public','company','co_id',22,33)
CREATE OR REPLACE FUNCTION showdifference(p_schema text, p_tablename text,p_idcolumn text,p_firstid integer, p_secondid integer)
RETURNS INTEGER AS
$BODY$
DECLARE
l_diffcount INTEGER;
l_column text;
l_dupcount integer;
column_cursor CURSOR FOR select column_name from information_schema.columns where table_name = p_tablename and table_schema = p_schema and column_name <> p_idcolumn;
BEGIN
-- need error checking here, to ensure the table and schema exist and the columns exist
-- Should also check that the records ids exist.
-- Should also check that the column type of the id field is integer
-- Set the number of differences to zero.
l_diffcount := 0;
-- use a cursor to iterate over the columns found in information_schema.columns
-- open the cursor
OPEN column_cursor;
LOOP
FETCH column_cursor INTO l_column;
EXIT WHEN NOT FOUND;
-- build a query to see if there is a difference between the columns. If there is raise a notice
EXECUTE 'select count(distinct ' || quote_ident(l_column) || ' ) from ' || quote_ident(p_schema) || '.' || quote_ident(p_tablename) || ' where ' || quote_ident(p_idcolumn) || ' in ('|| p_firstid || ',' || p_secondid ||')'
INTO l_dupcount;
IF l_dupcount > 1 THEN
-- increment the counter
l_diffcount := l_diffcount +1;
RAISE NOTICE '% has % differences', l_column, l_dupcount ; -- for "real" you might want to return a rowset and could do something here
END IF;
END LOOP;
-- close the cursor
CLOSE column_cursor;
RETURN l_diffcount;
END;
$BODY$
LANGUAGE plpgsql VOLATILE STRICT
COST 100;
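One thing to keep in mind with the count(distinct ...) test used in the function above: in SQL, count(distinct col) does not count NULLs, so a column that is NULL in one row and filled in the other is not reported as a difference. The Python sketch below only illustrates that behaviour, with None standing in for NULL; the helper names are made up.

```python
from typing import Optional, Sequence


def count_distinct(values: Sequence[Optional[str]]) -> int:
    """Mimic SQL count(distinct col): NULLs (None here) are not counted."""
    return len({value for value in values if value is not None})


def differs(first: Optional[str], second: Optional[str]) -> bool:
    """The function's test: more than one distinct non-NULL value."""
    return count_distinct([first, second]) > 1


print(differs("Berlin", "Paris"))   # True
print(differs("Berlin", "Berlin"))  # False
print(differs("Berlin", None))      # False: the NULL is silently skipped
```

If that case matters for the duplicate check, a NULL-safe comparison such as IS DISTINCT FROM would be one way to catch it.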