I have a JSON column that I can select normally for almost every record, but a few records just fail. If I select any other column for a problematic record it works fine, but if I select the JSON column, the connection simply times out.
Example:
SELECT id, my_json_column from "table" where name = 'regular record'; -- works just fine
SELECT id, my_json_column from "table" where name = 'problematic record'; -- connection timeout
SELECT id, other_json_column from "table" where name = 'problematic record'; -- works just fine
I'm using AWS RDS with PostgreSQL version 12.11.
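One way to narrow this down without pulling the value across the connection is to measure it server-side; a diagnostic sketch, assuming the column type is json or jsonb:
-- Check the stored (possibly compressed/TOASTed) size of the value
-- without shipping the JSON itself to the client.
SELECT id, pg_column_size(my_json_column) AS stored_bytes
FROM "table"
WHERE name = 'problematic record';
If this returns quickly while selecting the column itself hangs, the problem is more likely in detoasting or transferring a very large value than in finding the row.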
I have an issue with ksql: when I try to run a pull query, all I get is "Query terminated" with no results. If I run it as a push query it works, but what I want is the current "state" of the TABLE.
The table is created off a stream:
create table MY_TABLE as
select
    metadata->"DATA1",
    metadata->"DATA2",
    count(metadata->"DATA2") as DATA2s
from My_STREAM
where metadata->"DATA2" = 'key'
group by metadata->"DATA1";
If I run
select * from MY_TABLE;
the result is
query terminated
If I run
select * from MY_TABLE EMIT CHANGES;
I get the desired output, except I only want the current "state". What am I missing?
The versions of ksql and cli are:
CLI v0.25.1, Server v0.25.1
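For reference, a pull query against a table normally needs a key lookup in the WHERE clause; a sketch, assuming the grouped column comes out named DATA1 and is the table's primary key:
-- Pull query with a key lookup (DATA1 is assumed to be the key column)
SELECT * FROM MY_TABLE WHERE DATA1 = 'some-key';
Whole-table pull scans are controlled by a server setting (ksql.query.pull.table.scan.enabled), so behavior without a key predicate can differ between setups.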
I have this query that inserts rows into a MySQL database, and it works perfectly.
insert into test(id,user)
select null,user from table2
union
select null,user from table3
But when I run the above query in PostgreSQL it does not work; I get this error: column "id" is of type integer but expression is of type text. However, each of the queries below works.
When I run the query below in PostgreSQL, it works properly:
insert into test(id,user)
select null,user from table2
This query also works properly:
insert into test(id,user)
select null,user from table3
And this one works properly as well:
select null,user from table2
union
select null,user from table3
null is not a real value and thus has no data type. In a UNION, PostgreSQL has to settle on one type per output column; with nothing else to go on, the assumed default type is text, and that's where the error message comes from. Just cast the value to int in the first SELECT:
insert into test(id, "user")
select null::int, "user" from table2
union
select null, "user" from table3
Or, even better, leave out the id completely so that any default defined for the id column is used. It seems strange to try to insert null into a column named id anyway:
insert into test("user")
select "user" from table2
union
select "user" from table3
Note that user is a reserved keyword and a built-in function, so you have to quote it to avoid problems. In the long run, I recommend finding a different name for that column.
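For illustration, a hypothetical definition of test where leaving out id works because the column has a default:
-- Hypothetical table definition: id is filled in by its identity default
create table test (
    id     integer generated by default as identity,
    "user" text
);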
I'm trying to get a list of COPY commands run on a particular date and the tables that were updated by each COPY command.
I'm working with this query:
select
    slc.query as query_id,
    trim(slc.filename) as file,
    slc.curtime as updated,
    slc.lines_scanned as rows,
    sq.querytxt as querytxt
from stl_load_commits slc
join stl_query sq on sq.query = slc.query
where trunc(slc.curtime) = '2020-05-07';
How can we get the table that was updated for each COPY command? Maybe using a Redshift RegEx function on querytxt? Or joining to another system table to find the table id or name?
This regex extracts the table (or schema.table) from stl_query.querytxt:
select
    slc.query as query_id,
    trim(slc.filename) as file,
    slc.curtime as updated,
    slc.lines_scanned as rows,
    sq.querytxt as querytxt,
    REGEXP_REPLACE(LOWER(sq.querytxt), '^copy (analyze )?(\\S+).*$', '$2') AS t
from stl_load_commits slc
join stl_query sq on sq.query = slc.query
where trunc(slc.curtime) = '2020-05-07';  -- filter on the column, not the alias
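To illustrate, with a made-up querytxt the expression behaves like this:
-- querytxt:   COPY myschema.mytable FROM 's3://bucket/prefix' IAM_ROLE '...'
-- lowercased and matched against '^copy (analyze )?(\\S+).*$',
-- the $2 capture group yields:   myschema.mytable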
I have two queries: one using union all, and one inserting into a temp table.
Query 1
select *
from (
    select a.id as id, a.name as name from a
    union all
    select b.id as id, b.name as name from b
) t  -- the derived table needs an alias
Query 2
drop table if exists temporary;
create temp table temporary as
select id as id, name as name
from a;
insert into temporary
select id as id, name as name
from b;
select * from temporary;
Which one is better for performance?
I would expect the second option to have better performance, at least at the database level. Both versions require a full table scan of both the a and b tables. But the first version would create an unnecessary intermediate table, used only for the purpose of the insert.
The only potential issue with doing two separate inserts is latency, i.e. the time it might take some process to get to and from the database. If you are worried about this, then you can limit it to a single INSERT statement:
INSERT INTO temporary (id, name)
SELECT id, name FROM a
UNION ALL
SELECT id, name FROM b;
This would just require one trip to the database.
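Following the same idea, the whole temp table can be built in a single statement; a sketch using the tables from the question:
drop table if exists temporary;
create temp table temporary as
select id, name from a
union all
select id, name from b;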
I think union all is the better-performing option, but I'm not sure; you can try it yourself. The run tab of most SQL applications shows the execution time. I took a snapshot in Oracle; MySQL and SQL Server have the same kind of tool.
I am trying to select all data out of the same specific table partition for 100+ tables using the DB2 EXPORT utility. The partition name is the same across all of my partitioned tables, which makes this approach more convenient than the alternatives.
I cannot detach the partitions as they are in a production environment.
In order to script this for semi-automation, I need to be able to run a query along these lines:
SELECT * FROM MYTABLE
WHERE PARTITION_NAME = MYPARTITION;
I am not able to find the correct syntax for utilizing this type of logic in my SELECT statement passed to the EXPORT utility.
You can do something like this by looking up the partition number first:
SELECT SEQNO
FROM SYSCAT.DATAPARTITIONS
WHERE TABNAME = 'YOURTABLE' AND DATAPARTITIONNAME = 'WHATEVER'
then using the SEQNO value in the query:
SELECT * FROM MYTABLE
WHERE DATAPARTITIONNUM(anycolumn) = <SEQNO value>
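The resulting query can then be handed to the EXPORT utility; a sketch with a hypothetical output path, assuming the lookup returned a SEQNO of 3:
-- Export the partition's rows as a delimited file (path and SEQNO are examples)
EXPORT TO /tmp/mytable_part3.del OF DEL
SELECT * FROM MYTABLE WHERE DATAPARTITIONNUM(ID) = 3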
Edit:
Since it does not matter what column you reference in DATAPARTITIONNUM(), and since each table is guaranteed to have at least one column, you can automatically generate queries by joining SYSCAT.DATAPARTITIONS and SYSCAT.COLUMNS:
select
'select * from', p.tabname,
'where datapartitionnum(', colname, ') = ', seqno
from syscat.datapartitions p
inner join syscat.columns c
on p.tabschema = c.tabschema and p.tabname = c.tabname
where colno = 1
and datapartitionname = '<your partition name>'
and p.tabname in (<your table list>)
However, building a dependency on database metadata into your application is, in my view, not very reliable. You could simply specify the appropriate partitioning key range to extract the data, which will be just as efficient.
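A sketch of that alternative, with a hypothetical partitioning column and range (suppose the table is range-partitioned by quarter on SALE_DATE):
SELECT * FROM MYTABLE
WHERE SALE_DATE >= '2020-01-01' AND SALE_DATE < '2020-04-01'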