Redshift: tables info query not working via spark - scala

I am trying to run this query from spark code using databricks:
select * from svv_table_info
but I am getting this error message:
Exception in thread "main" java.sql.SQLException: Amazon Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
Any idea why I am getting this?

That view returns table_id, which has the Postgres system type OID.
psql=# \d+ svv_table_info
Column | Type | Modifiers | Storage | Description
---------------+---------------+-----------+----------+-------------
database | text | | extended |
schema | text | | extended |
table_id | oid | | plain |
table | text | | extended |
encoded | text | | extended |
diststyle | text | | extended |
sortkey1 | text | | extended |
max_varchar | integer | | plain |
sortkey1_enc | character(32) | | extended |
sortkey_num | integer | | plain |
size | bigint | | plain |
pct_used | numeric(10,4) | | main |
empty | bigint | | plain |
unsorted | numeric(5,2) | | main |
stats_off | numeric(5,2) | | main |
tbl_rows | numeric(38,0) | | main |
skew_sortkey1 | numeric(19,2) | | main |
skew_rows | numeric(19,2) | | main |
You can cast it to INTEGER and Spark should be able to handle it.
SELECT database,schema,table_id::INT
,"table",encoded,diststyle,sortkey1
,max_varchar,sortkey1_enc,sortkey_num
,size,pct_used,empty,unsorted,stats_off
,tbl_rows,skew_sortkey1,skew_rows
FROM svv_table_info;
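If the column list has to be produced programmatically, the same idea can be sketched in a few lines of Python. This is only an illustration: the helper name select_with_oid_casts and the QUOTED set are made up for this example, and the column types are copied from the \d+ output above.

```python
# Column names and types copied from the \d+ output above.
SVV_TABLE_INFO_COLUMNS = [
    ("database", "text"), ("schema", "text"), ("table_id", "oid"),
    ("table", "text"), ("encoded", "text"), ("diststyle", "text"),
    ("sortkey1", "text"), ("max_varchar", "integer"),
    ("sortkey1_enc", "character(32)"), ("sortkey_num", "integer"),
    ("size", "bigint"), ("pct_used", "numeric(10,4)"), ("empty", "bigint"),
    ("unsorted", "numeric(5,2)"), ("stats_off", "numeric(5,2)"),
    ("tbl_rows", "numeric(38,0)"), ("skew_sortkey1", "numeric(19,2)"),
    ("skew_rows", "numeric(19,2)"),
]

# Identifiers that need double quotes; only "table" is quoted in the query above.
QUOTED = {"table"}


def select_with_oid_casts(relation, columns):
    """Hypothetical helper: build a SELECT that casts each oid column to INT."""
    parts = []
    for name, col_type in columns:
        ident = f'"{name}"' if name in QUOTED else name
        parts.append(f"{ident}::INT" if col_type == "oid" else ident)
    return f"SELECT {', '.join(parts)} FROM {relation}"


query = select_with_oid_casts("svv_table_info", SVV_TABLE_INFO_COLUMNS)
print(query)
```

The resulting string can be passed to whatever Spark read call you already use in place of the select * query.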

Related

Postgres - Add new column to existing table

I want to alter a table and add a new column, and I also want to set the Storage value for the column.
I tried the following and I get an error. Any idea how to fix this?
ALTER TABLE main_workflowjobtemplate
ADD COLUMN "ask_credential_on_launch" BOOLEAN NOT NULL STORAGE plain;
ERROR: syntax error at or near "STORAGE"
LINE 2: ...OLUMN "ask_credential_on_launch" BOOLEAN NOT NULL STORAGE pl...
Here is the table schema.
awx=# \d+ main_workflowjobtemplate;
Table "public.main_workflowjobtemplate"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------------------------+-----------------------+-----------+----------+---------+----------+--------------+-------------
unifiedjobtemplate_ptr_id | integer | | not null | | plain | |
extra_vars | text | | not null | | extended | |
admin_role_id | integer | | | | plain | |
execute_role_id | integer | | | | plain | |
read_role_id | integer | | | | plain | |
survey_enabled | boolean | | not null | | plain | |
survey_spec | text | | not null | | extended | |
allow_simultaneous | boolean | | not null | | plain | |
ask_variables_on_launch | boolean | | not null | | plain | |
ask_inventory_on_launch | boolean | | not null | | plain | |
inventory_id | integer | | | | plain | |
approval_role_id | integer | | | | plain | |
ask_limit_on_launch | boolean | | not null | | plain | |
ask_scm_branch_on_launch | boolean | | not null | | plain | |
char_prompts | text | | not null | | extended | |
webhook_credential_id | integer | | | | plain | |
webhook_key | character varying(64) | | not null | | extended | |
webhook_service | character varying(16) | | not null | | extended | |

Postgres Changing column from TEXT to INTEGER increases table size

I have a postgres table that has a schema like this
Table "am.old_product"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
p_config_sku | text | | | | extended | |
p_simple_sku | text | | | | extended | |
p_merchant_id | text | | | | extended | |
p_country | character varying(2) | | | | extended | |
p_discount_rate | numeric(10,2) | | | | main | |
p_black_price | numeric(10,2) | | | | main | |
p_red_price | numeric(10,2) | | | | main | |
p_received_at | timestamp with time zone | | | | plain | |
p_event_id | uuid | | | | plain | |
p_is_deleted | boolean | | | | plain | |
Indexes:
"product_p_simple_sku_p_country_p_merchant_id_idx" UNIQUE, btree (p_simple_sku, p_country, p_merchant_id)
"config_sku_country_idx" btree (p_config_sku, p_country)
We decided that it would be a better idea to remove the TEXT field merchant_id, move it to another table, and reference it from the product table using a foreign key. The new schema looks like this:
Table "am.product"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
p_config_sku | text | | not null | | extended | |
p_simple_sku | text | | not null | | extended | |
p_country | character varying(2) | | not null | | extended | |
p_discount_rate | numeric(10,2) | | | | main | |
p_black_price | numeric(10,2) | | | | main | |
p_red_price | numeric(10,2) | | | | main | |
p_received_at | timestamp with time zone | | not null | | plain | |
p_event_id | uuid | | not null | | plain | |
p_is_deleted | boolean | | | false | plain | |
p_merchant_id_new | integer | | not null | | plain | |
Indexes:
"new_product_p_simple_sku_p_country_p_merchant_id_new_idx" UNIQUE, btree (p_simple_sku, p_country, p_merchant_id_new)
"p_config_sku_country_idx" btree (p_config_sku, p_country)
Foreign-key constraints:
"fk_merchant_id" FOREIGN KEY (p_merchant_id_new) REFERENCES am.merchant(m_id)
This should make the product table smaller, right? We are using a 4-byte integer instead of TEXT. Well, not really: the two tables have exactly the same number of rows, yet the product table (the one with the integer field) is 34.3 GB, while the old table (with TEXT) is 19.7 GB.
Does anyone have an explanation for that?
At a wild guess, you have done this with various ALTER TABLE commands, forcing at least one rewrite of the entire table.
The unused space will gradually be re-used. For a more immediate change, try CLUSTER or VACUUM FULL on the table.
Look at the VACUUM command.
A table file is an organized collection of tuples. A tuple is one stored version of a row, so a single row can be represented by more than one tuple. If you populated the new column with an UPDATE, every updated row got a new tuple in the table file, and the old versions became dead tuples. When you dropped the old column, the space its values occupy also stayed in the file, because removing it means rewriting the file, which is a costly operation.
VACUUM FULL am.product;
Unfortunately, this takes an exclusive lock on the table, so you won't be able to query it while it runs.
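To make the dead-tuple explanation concrete, here is a toy model in Python. It is not how Postgres is implemented: ToyHeap and its methods are invented for this sketch, and "size" is just a count of stored tuple versions. It only shows why updating every row can roughly double the stored data until the table is rewritten.

```python
class ToyHeap:
    """Toy stand-in for a table file: a list of [row_id, value, live] versions."""

    def __init__(self):
        self.tuples = []

    def insert(self, row_id, value):
        self.tuples.append([row_id, value, True])

    def update(self, row_id, value):
        # An update does not overwrite in place: the old version is marked
        # dead and a new version is appended to the file.
        for version in self.tuples:
            if version[0] == row_id and version[2]:
                version[2] = False
        self.tuples.append([row_id, value, True])

    def vacuum_full(self):
        # Rewrite the file, keeping only live versions.
        self.tuples = [version for version in self.tuples if version[2]]

    def size(self):
        return len(self.tuples)


heap = ToyHeap()
for row_id in range(100):
    heap.insert(row_id, "merchant-as-text")
size_before = heap.size()

for row_id in range(100):
    heap.update(row_id, 42)  # e.g. fill in an integer merchant id
size_after_update = heap.size()

heap.vacuum_full()
size_after_vacuum = heap.size()

print(size_before, size_after_update, size_after_vacuum)
```

In this model the row count never changes, only the number of stored versions does, which matches the situation in the question: same number of rows, larger table.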

Weird ghost records in PostgreSQL - what are they?

I have a very weird issue on our postgresql DB. I have a table called "statement" which has some strange records in it.
Using the command line console psql, I query select * from customer.statement where type in ('QUOTE'); and get 12 rows back. 7 rows look normal, 5 are missing all data except a single column which is a nullable column but seems to hold real values entered by the user. psql tells me that 7 rows were returned even though there are 12. Most of the other columns are not nullable. The weird records look like this:
select * from customer.statement where type = 'QUOTE';
id | issuer_id | recipient_id | recipient_name | recipient_reference | source_statement_id | catalogue_id | reference | issue_date | due_date | description | total | currency | type | tax_level | rounding_mode | status | recall_requested | time_created | time_updated | time_paid
------------------+------------------+------------------+----------------+---------------------+---------------------+--------------+-----------+------------+------------+------------------------------------------------------------------+-----------+----------+-------+-----------+---------------+-----------+------------------+----------------------------+----------------------------+-----------
... 7 valid records removed ...
| | | | | | | | | | Build bulkheads and sheet with plasterboard. +| | | | | | | | | |
| | | | | | | | | | Patch all patches. +| | | | | | | | | |
| | | | | | | | | | Set and sand all joints ready for painting. +| | | | | | | | | |
| | | | | | | | | | Use wall angle on bulkhead in main bedroom. +| | | | | | | | | |
| | | | | | | | | | Build nib and sheet and set in entrance | | | | | | | | | |
(7 rows)
If I run the same query using pgAdmin, I don't see those weird records.
Anyone know what these are?
The plus sign before the separator (+|) indicates a newline character in the displayed string value in psql. So no additional rows, just the same row continued with line breaks. The final line of output in your quote confirms as much: (7 rows).
In pgAdmin you don't see the extra lines unless you increase the height of the field (or copy and paste the content somewhere), but the value contains multiple lines there as well.
Try in psql and in pgAdmin:
test=# SELECT E'This\nis\na\ntest.' AS multi_line, 'foo' AS single_line;
multi_line | single_line
--------------+-------------
This +| foo
is +|
a +|
test. |
(1 row)
The manual about psql:
linestyle
Sets the border line drawing style to one of ascii, old-ascii, or unicode. [...] The default setting is ascii. [...]
ascii style uses plain ASCII characters. Newlines in data are shown using a + symbol in the right-hand margin. [...]
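As a rough illustration of that rule, the following Python sketch mimics how a single cell with embedded newlines ends up on several display lines. It is a simplification written for this answer, not psql's actual code; render_cell and its width parameter are hypothetical.

```python
def render_cell(value, width):
    """Simplified imitation of psql's ascii linestyle for one cell.

    Every display line except the last gets a "+" in the right-hand margin,
    signalling that the same value continues on the next line.
    """
    lines = value.split("\n")
    rendered = []
    for index, line in enumerate(lines):
        marker = "+" if index < len(lines) - 1 else " "
        rendered.append(f" {line.ljust(width)}{marker}|")
    return rendered


display = render_cell("This\nis\na\ntest.", 10)
for line in display:
    print(line)
```

Four display lines come out of one value, so counting lines on screen overstates the number of rows; the row count in the footer is the number to trust.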

Error in Insert query : syntax error at or near ","

My insert query is:
insert into app_library_reports
(app_id,adp_id,reportname,description,searchstr,command,templatename,usereporttemplate,reporttype,sentbothfiles,useprevioustime,usescheduler,cronstr,option,displaysettings,isanalyticsreport,report_columns,chart_config)
values
(25,18,"Report_Barracuda_SpamDomain_summary","Report On Domains Sending Spam Emails","tl_tag:Barracuda_spam AND action:2","BarracudaSpam/Report_Barracuda_SpamDomain_summary.py",,,,,,,,,,,,);
Schema for the table 'app_library_reports' is:
Table "public.app_library_reports"
Column | Type | Modifiers | Storage | Stats target | Description
-------------------+---------+------------------------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('app_library_reports_id_seq'::regclass) | plain | |
app_id | integer | | plain | |
adp_id | integer | | plain | |
reportname | text | | extended | |
description | text | | extended | |
searchstr | text | | extended | |
command | text | | extended | |
templatename | text | | extended | |
usereporttemplate | boolean | | plain | |
reporttype | text | | extended | |
sentbothfiles | text | | extended | |
useprevioustime | text | | extended | |
usescheduler | text | | extended | |
cronstr | text | | extended | |
option | text | | extended | |
displaysettings | text | | extended | |
isanalyticsreport | boolean | | plain | |
report_columns | json | | extended | |
chart_config | json | | extended | |
Indexes:
"app_library_reports_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"app_library_reports_adp_id_fkey" FOREIGN KEY (adp_id) REFERENCES app_library_adapter(id)
"app_library_reports_app_id_fkey" FOREIGN KEY (app_id) REFERENCES app_library_definition(id)
When I execute the insert query, it gives this error: ERROR: syntax error at or near ","
Please help me find the cause of this error. Thank you.
I'm fairly certain your immediate error is coming from the run of bare commas (i.e. ,,,,,,,) at the end of the VALUES list. If you don't want to specify a value for a particular column, you can pass NULL for it. But in your case, since you only specify values for the first 6 columns, another way is to name just those 6 columns when you insert. Note also that string literals need single quotes; in Postgres, double quotes denote identifiers:
INSERT INTO app_library_reports
(app_id, adp_id, reportname, description, searchstr, command)
VALUES
(25, 18, 'Report_Barracuda_SpamDomain_summary',
'Report On Domains Sending Spam Emails', 'tl_tag:Barracuda_spam AND action:2',
'BarracudaSpam/Report_Barracuda_SpamDomain_summary.py')
This insert would only work if the columns not specified accept NULL. If some of the other columns are not nullable, then you would have to pass in values for them.
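If the statement is generated from application code, the same approach can be sketched as follows. This is a minimal example under stated assumptions: build_insert is a hypothetical helper, SQLite (from the Python standard library) stands in for Postgres so the snippet runs on its own, and the table is a shortened version of the one in the question.

```python
import sqlite3


def build_insert(table, values):
    """Hypothetical helper: return (sql, params) naming only columns with a value.

    Table and column names are interpolated as-is, so they must come from
    trusted code, never from user input.
    """
    present = {col: val for col, val in values.items() if val is not None}
    columns = ", ".join(present)
    # "?" is sqlite3's placeholder style; other drivers may use a different one.
    placeholders = ", ".join("?" for _ in present)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, list(present.values())


# SQLite stands in for Postgres here, with a shortened version of the table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_library_reports ("
    "id INTEGER PRIMARY KEY, app_id INTEGER, adp_id INTEGER, "
    "reportname TEXT, command TEXT, templatename TEXT, cronstr TEXT)"
)
sql, params = build_insert("app_library_reports", {
    "app_id": 25,
    "adp_id": 18,
    "reportname": "Report_Barracuda_SpamDomain_summary",
    "command": "BarracudaSpam/Report_Barracuda_SpamDomain_summary.py",
    "templatename": None,  # no value: left out of the statement
    "cronstr": None,
})
conn.execute(sql, params)
row = conn.execute(
    "SELECT reportname, templatename, cronstr FROM app_library_reports"
).fetchone()
print(sql)
print(row)
```

The columns left out of the statement come back as NULL (None in Python), which is why this only works when those columns are nullable or have defaults.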

Postgresql materialized view is refreshed by itself

I have this materialized view in my Postgres 9.4 database:
Materialized view "public.v_videolist"
Column | Type | Modifiers | Storage | Stats target | Description
----------+---------+-----------+----------+--------------+-------------
id | integer | | plain | |
title | text | | extended | |
embed | text | | extended | |
img | text | | extended | |
imgs | text | | extended | |
tags | text | | extended | |
category | text | | extended | |
vid | bigint | | plain | |
views | bigint | | plain | |
likes | bigint | | plain | |
unlikes | bigint | | plain | |
duration | integer | | plain | |
site | integer | | plain | |
Indexes:
"i_vl_id" UNIQUE, btree (id)
View definition:
SELECT videolist.id,
videolist.title,
videolist.embed,
videolist.img,
videolist.imgs,
videolist.tags,
videolist.category,
videolist.vid,
videolist.views,
videolist.likes,
videolist.unlikes,
videolist.duration,
videolist.site
FROM videolist
ORDER BY random();
From time to time this view is refreshed by itself. There is no cron job or anything like that to refresh it, and I can't find what does it. I log all queries in full, and there is no REFRESH MATERIALIZED VIEW in the log.
Why is my view renewed? Any suggestions?
A job could be scheduled to refresh the view's contents using this SQL statement:
REFRESH MATERIALIZED VIEW public.v_videolist;
You can use pg_cron to schedule the job.