Postgres and bytea columns appearing weird - postgresql

I dumped a database and imported it into a different server. One of the tables has a bytea column and holds a single row of data. On the original server, SELECT * FROM users; shows the value as @. On the second server, however, the same select statement returns \x402e for that field. I have tried to wrap my head around this column type, but it is over my head. Why would it appear as an escaped string on one server but not the other? Both servers are running Pg11, and I am accessing both via psql.
Original Server:
=# \d+ users
Table "public.users"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------+------------------------+-----------+----------+-----------------------------------+----------+--------------+-------------
id | integer | | not null | nextval('users_id_seq'::regclass) | plain | |
priority | integer | | not null | 7 | plain | |
policy_id | integer | | not null | 1 | plain | |
email | bytea | | not null | | extended | |
fullname | character varying(255) | | | NULL::character varying | extended | |
=# SELECT * FROM users;
id | priority | policy_id | email | fullname
----+----------+-----------+-------+----------
1 | 0 | 1 | @. |
(1 row)
Secondary Server:
=> \d+ users
Table "public.users"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------+------------------------+-----------+----------+-----------------------------------+----------+--------------+-------------
id | integer | | not null | nextval('users_id_seq'::regclass) | plain | |
priority | integer | | not null | 7 | plain | |
policy_id | integer | | not null | 1 | plain | |
email | bytea | | not null | | extended | |
fullname | character varying(255) | | | NULL::character varying | extended | |
=> SELECT * FROM users;
id | priority | policy_id | email | fullname
----+----------+-----------+--------+----------
4 | 0 | 1 | \x402e |
(1 row)

set bytea_output to hex;
select '@.'::bytea;
┌────────┐
│ bytea │
├────────┤
│ \x402e │
└────────┘
set bytea_output to escape;
select '@.'::bytea;
┌───────┐
│ bytea │
├───────┤
│ @.    │
└───────┘
It seems that you have different bytea_output settings on your two servers: the original is using escape output, while the secondary uses the default hex output.
See the documentation for bytea_output.
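To confirm which setting each server is using, and optionally make them match, you can check and change it directly (the ALTER DATABASE line is a sketch that assumes your database is named mydb):
show bytea_output;                                -- 'hex' (the default) or 'escape'
set bytea_output = 'escape';                      -- current session only
alter database mydb set bytea_output = 'escape';  -- persists for one database
Either way, this only changes how psql renders the value; the stored bytes (\x40 \x2e, i.e. @.) are identical on both servers.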

Related

Why aren't columns with the citext datatype processed by Presto?

I'm running pgsql queries on the SQL console provided by presto-client, connected to a presto-server running on top of Postgres. The resultset of the queries contains only the columns that aren't of citext type.
DataDetails Table Description:
Table "public.datadetails"
Column | Type | Modifiers | Storage | Stats target | Description
------------------+----------+------------------------------+----------+--------------+-------------
data_sequence_id | bigint | not null | plain | |
key | citext | not null | extended | |
uploaded_by | bigint | not null | plain | |
uploaded_time | bigint | not null | plain | |
modified_by | bigint | | plain | |
modified_time | bigint | | plain | |
retrieved_by | bigint | | plain | |
retrieved_time | bigint | | plain | |
file_name | citext | not null | extended | |
file_type | citext | not null | extended | |
file_size | bigint | not null default 0::bigint | plain | |
Indexes:
"datadetails_pk1" PRIMARY KEY, btree (data_sequence_id)
"datadetails_uk0" UNIQUE CONSTRAINT, btree (key)
Check constraints:
"datadetails_file_name_c" CHECK (length(file_name::text) <= 32)
"datadetails_file_type_c" CHECK (length(file_type::text) <= 2048)
"datadetails_key_c" CHECK (length(key::text) <= 64)
Query Result in Presto-Client:
presto:public> select * from datadetails;
data_sequence_id | uploaded_by | uploaded_time | modified_by | modified_time | retrieved_by | retrieved_time | file_size |
------------------+-------------+---------------+-------------+---------------+--------------+----------------+-----------+
2000000000007 | 15062270 | 1586416286363 | 0 | 0 | 0 | 0 | 61 |
2000000000011 | 15062270 | 1586416299159 | 0 | 0 | 15062270 | 1586417517045 | 36 |
(2 rows)
Query 20200410_130419_00017_gmjgh, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:00 [2 rows, 0B] [10 rows/s, 0B/s]
In the above resultset it is evident that the columns of citext type are missing.
Does Presto support the citext datatype, or is there any configuration to process the citext datatype using Presto?
Postgres: PostgreSQL 9.4.0-relocatable (Red Hat 4.4.7-11), 64-bit
Presto-Server: presto-server-0.230
Presto-Client: presto-cli-332
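Since the bigint columns come through fine, one workaround worth trying (a sketch, not from this thread; datadetails_text is a hypothetical view name, and it assumes Presto's PostgreSQL connector exposes views) is to cast the citext columns to text on the Postgres side and query the view from Presto:
create view datadetails_text as
select data_sequence_id,
       key::text as key,          -- citext comes from an extension; text maps to Presto varchar
       uploaded_by, uploaded_time,
       modified_by, modified_time,
       retrieved_by, retrieved_time,
       file_name::text as file_name,
       file_type::text as file_type,
       file_size
from datadetails;
Presto's PostgreSQL connector simply skips column types it cannot map, which is exactly the behaviour in the resultset above.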

Postgres Changing column from TEXT to INTEGER increases table size

I have a Postgres table with a schema like this:
Table "am.old_product"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-----------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
p_config_sku | text | | | | extended | |
p_simple_sku | text | | | | extended | |
p_merchant_id | text | | | | extended | |
p_country | character varying(2) | | | | extended | |
p_discount_rate | numeric(10,2) | | | | main | |
p_black_price | numeric(10,2) | | | | main | |
p_red_price | numeric(10,2) | | | | main | |
p_received_at | timestamp with time zone | | | | plain | |
p_event_id | uuid | | | | plain | |
p_is_deleted | boolean | | | | plain | |
Indexes:
"product_p_simple_sku_p_country_p_merchant_id_idx" UNIQUE, btree (p_simple_sku, p_country, p_merchant_id)
"config_sku_country_idx" btree (p_config_sku, p_country)
We decided that it would be a better idea to remove the TEXT field merchant_id, move it to another table, and reference it in the product table using a foreign key. So the new schema looks like this:
Table "am.product"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
p_config_sku | text | | not null | | extended | |
p_simple_sku | text | | not null | | extended | |
p_country | character varying(2) | | not null | | extended | |
p_discount_rate | numeric(10,2) | | | | main | |
p_black_price | numeric(10,2) | | | | main | |
p_red_price | numeric(10,2) | | | | main | |
p_received_at | timestamp with time zone | | not null | | plain | |
p_event_id | uuid | | not null | | plain | |
p_is_deleted | boolean | | | false | plain | |
p_merchant_id_new | integer | | not null | | plain | |
Indexes:
"new_product_p_simple_sku_p_country_p_merchant_id_new_idx" UNIQUE, btree (p_simple_sku, p_country, p_merchant_id_new)
"p_config_sku_country_idx" btree (p_config_sku, p_country)
Foreign-key constraints:
"fk_merchant_id" FOREIGN KEY (p_merchant_id_new) REFERENCES am.merchant(m_id)
Now this should make the product table size drop, right? We are using a 4-byte integer instead of TEXT. Well, not really. The two tables have exactly the same number of rows, yet the new product table (the one with the integer field) is 34.3 GB, while the old table (with TEXT) is only 19.7 GB.
Does anyone have an explanation for that?
At a wild guess, you have done this with various ALTER TABLE commands, forcing at least one rewrite of the entire table.
The unused space will gradually be re-used, or, for a more prompt change, try a CLUSTER or VACUUM FULL on the table.
Look at the VACUUM command.
A database file is an organized collection of tuples; a row can be made up of one or more tuples. When you added the new column and filled it in, new tuples were written to the table file. And when you dropped the old column, the space it occupied remained, because physically removing data from the file is a costly operation. Those obsolete tuples are dead tuples.
VACUUM FULL am.product;
Unfortunately, this will take an exclusive lock on the table, and you won't be able to query it while it runs.
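To check whether leftover dead tuples really account for the difference, compare the table's size with its live and dead tuple counts before and after the VACUUM FULL (a sketch using the standard statistics view; names as in the question):
select pg_size_pretty(pg_total_relation_size('am.product')) as total_size,
       n_live_tup,   -- current row versions
       n_dead_tup    -- obsolete row versions awaiting vacuum
from pg_stat_user_tables
where schemaname = 'am' and relname = 'product';
A large n_dead_tup relative to n_live_tup means most of the 34.3 GB is reclaimable bloat.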

Redshift: table info query not working via Spark

I am trying to run this query from Spark code on Databricks:
select * from svv_table_info
but I am getting this error msg:
Exception in thread "main" java.sql.SQLException: Amazon Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
Any idea why I am getting this?
That view returns table_id, which is of the Postgres system type OID.
psql=# \d+ svv_table_info
Column | Type | Modifiers | Storage | Description
---------------+---------------+-----------+----------+-------------
database | text | | extended |
schema | text | | extended |
table_id | oid | | plain |
table | text | | extended |
encoded | text | | extended |
diststyle | text | | extended |
sortkey1 | text | | extended |
max_varchar | integer | | plain |
sortkey1_enc | character(32) | | extended |
sortkey_num | integer | | plain |
size | bigint | | plain |
pct_used | numeric(10,4) | | main |
empty | bigint | | plain |
unsorted | numeric(5,2) | | main |
stats_off | numeric(5,2) | | main |
tbl_rows | numeric(38,0) | | main |
skew_sortkey1 | numeric(19,2) | | main |
skew_rows | numeric(19,2) | | main |
You can cast it to INTEGER and Spark should be able to handle it:
SELECT database,schema,table_id::INT
,"table",encoded,diststyle,sortkey1
,max_varchar,sortkey1_enc,sortkey_num
,size,pct_used,empty,unsorted,stats_off
,tbl_rows,skew_sortkey1,skew_rows
FROM svv_table_info;
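If the Spark side needs to keep running unchanged, the same cast can instead be pushed down through the Databricks Redshift connector's query option, which accepts arbitrary SQL in place of a table name; the rewritten SELECT then executes inside Redshift, and Spark only ever sees the already-cast INTEGER column.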

Using variables in select (apostrophes needed)

psql (9.6.1, server 9.5.5)
employees
Column | Type | Modifiers | Storage | Stats target | Description
----------------+-----------------------------+-----------------------------------------------------------------+----------+--------------+-------------
employee_id | integer | not null default nextval('employees_employee_id_seq'::regclass) | plain | |
first_name | character varying(20) | | extended | |
last_name | character varying(25) | not null | extended | |
email | character varying(25) | not null | extended | |
phone_number | character varying(20) | | extended | |
hire_date | timestamp without time zone | not null | plain | |
job_id | character varying(10) | not null | extended | |
salary | numeric(8,2) | | main | |
commission_pct | numeric(2,2) | | main | |
manager_id | integer | | plain | |
department_id | integer | | plain | |
For self-education I'd like to use a variable.
The result of this query would suit me:
hr=> select last_name, char_length(last_name) as Length from employees where substring(last_name from 1 for 1) = 'H' order by last_name;
last_name | length
-----------+--------
Hartstein | 9
Higgins | 7
Hunold | 6
(3 rows)
But when I try it with a variable:
\set chosen_letter 'H'
hr=> select last_name, char_length(last_name) as Length from employees where substring(last_name from 1 for 1) = :chosen_letter order by last_name;
ERROR: column "h" does not exist
LINE 1: ...ployees where substring(last_name from 1 for 1) = H order by...
^
Those apostrophes seem to ruin everything, and I can't cope with the problem.
Could you help me understand how to use a variable to get the result above?
Try using:
\set chosen_letter '''H'''
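The tripled quotes work because, inside the quoted argument of \set, a doubled '' yields a literal single quote, so the variable's value becomes 'H' with the quotes included. Alternatively, psql (including your 9.6) can add the quoting at interpolation time with the :'variable' syntax, in which case the variable holds just the bare letter:
\set chosen_letter H
hr=> select last_name, char_length(last_name) as Length from employees where substring(last_name from 1 for 1) = :'chosen_letter' order by last_name;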

Error in Insert query: syntax error at or near ","

My insert query is:
insert into app_library_reports
(app_id,adp_id,reportname,description,searchstr,command,templatename,usereporttemplate,reporttype,sentbothfiles,useprevioustime,usescheduler,cronstr,option,displaysettings,isanalyticsreport,report_columns,chart_config)
values
(25,18,"Report_Barracuda_SpamDomain_summary","Report On Domains Sending Spam Emails","tl_tag:Barracuda_spam AND action:2","BarracudaSpam/Report_Barracuda_SpamDomain_summary.py",,,,,,,,,,,,);
Schema for the table 'app_library_reports' is:
Table "public.app_library_reports"
Column | Type | Modifiers | Storage | Stats target | Description
-------------------+---------+------------------------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('app_library_reports_id_seq'::regclass) | plain | |
app_id | integer | | plain | |
adp_id | integer | | plain | |
reportname | text | | extended | |
description | text | | extended | |
searchstr | text | | extended | |
command | text | | extended | |
templatename | text | | extended | |
usereporttemplate | boolean | | plain | |
reporttype | text | | extended | |
sentbothfiles | text | | extended | |
useprevioustime | text | | extended | |
usescheduler | text | | extended | |
cronstr | text | | extended | |
option | text | | extended | |
displaysettings | text | | extended | |
isanalyticsreport | boolean | | plain | |
report_columns | json | | extended | |
chart_config | json | | extended | |
Indexes:
"app_library_reports_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"app_library_reports_adp_id_fkey" FOREIGN KEY (adp_id) REFERENCES app_library_adapter(id)
"app_library_reports_app_id_fkey" FOREIGN KEY (app_id) REFERENCES app_library_definition(id)
When I execute the insert query it gives the error: ERROR: syntax error at or near ","
Please help me find the cause of this error. Thank you.
I'm fairly certain your immediate error is coming from the run of empty values (i.e. ,,,,,,,) at the end of the INSERT. If you don't want to specify a value for a particular column, you can pass NULL instead. Note that the double-quoted strings are a problem as well: PostgreSQL string literals use single quotes, while double quotes denote identifiers, so "Report_Barracuda_SpamDomain_summary" would be parsed as a column name. But in your case, since you only specify values for the first 6 columns, another way is to just name those 6 columns when you insert:
INSERT INTO app_library_reports
(app_id, adp_id, reportname, description, searchstr, command)
VALUES
(25, 18, 'Report_Barracuda_SpamDomain_summary',
'Report On Domains Sending Spam Emails', 'tl_tag:Barracuda_spam AND action:2',
'BarracudaSpam/Report_Barracuda_SpamDomain_summary.py')
This insert would only work if the columns not specified accept NULL. If some of the other columns are not nullable, then you would have to pass in values for them.
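For completeness, the single-statement form that names every column and passes NULL explicitly would look like this (a sketch; as noted above, it assumes every omitted-value column is nullable):
INSERT INTO app_library_reports
(app_id, adp_id, reportname, description, searchstr, command,
 templatename, usereporttemplate, reporttype, sentbothfiles,
 useprevioustime, usescheduler, cronstr, option, displaysettings,
 isanalyticsreport, report_columns, chart_config)
VALUES
(25, 18, 'Report_Barracuda_SpamDomain_summary',
 'Report On Domains Sending Spam Emails',
 'tl_tag:Barracuda_spam AND action:2',
 'BarracudaSpam/Report_Barracuda_SpamDomain_summary.py',
 NULL, NULL, NULL, NULL, NULL, NULL,   -- templatename .. usescheduler
 NULL, NULL, NULL, NULL, NULL, NULL);  -- cronstr .. chart_config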