From Postgres to Greenplum via PXF: UUID and INET types aren't supported? - postgresql-9.4

So I am creating an external table in Greenplum based on a table I have in Postgres.
When I create a field
field_name uuid
I get the error "Field type UNSUPPORTED_TYPE". The same goes for the inet type.
What exactly doesn't support the uuid type? I've read the documentation, and as far as I understand both Postgres and Greenplum support uuid, so I take it the problem is in the PXF connection itself?
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import')
Maybe I need another formatter or something?

The full set of types supported by Greenplum PXF depends on the PXF profile you are reading from or writing to. For example, Apache ORC does not have a native type for UUID (Apache ORC - Types), but PXF will write UUIDs as strings into ORC files:
CREATE WRITABLE EXTERNAL TABLE pxf_uuid_test_w (col1 int, col2 uuid)
LOCATION ('pxf://orc-data/uuid-test?PROFILE=file:orc')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');
-- CREATE EXTERNAL TABLE
INSERT INTO pxf_uuid_test_w VALUES (1, 'ceb6817b-0ef1-4167-971a-857f10d4afde');
-- INSERT 0 1
However, PXF does not perform the implicit cast of UUID to string when using any of the Parquet profiles (*:parquet):
CREATE WRITABLE EXTERNAL TABLE pxf_uuid_test_w (col1 int, col2 uuid)
LOCATION ('pxf://parquet-data/uuid-test?PROFILE=file:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');
-- CREATE EXTERNAL TABLE
INSERT INTO pxf_uuid_test_w VALUES (1, 'ceb6817b-0ef1-4167-971a-857f10d4afde');
-- ERROR: PXF server error : Type 2950 is not supported (seg0 127.0.1.1:6000 pid=191734)
-- HINT: Check the PXF logs located in the '/pxf-base/logs' directory on host 'localhost' or 'set client_min_messages=LOG' for additional details.
As a workaround, you can create your external table using text, varchar, or char column types and include explicit casts where needed:
CREATE TABLE uuid_test(col1 int, col2 uuid) DISTRIBUTED BY (col1);
INSERT INTO uuid_test values (1, 'ceb6817b-0ef1-4167-971a-857f10d4afde');
-- INSERT 0 1
CREATE WRITABLE EXTERNAL TABLE pxf_uuid_test_w (col1 int, col2 text)
LOCATION ('pxf://parquet-data/uuid-test?PROFILE=file:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');
-- CREATE EXTERNAL TABLE
INSERT INTO pxf_uuid_test_w SELECT col1, col2::text FROM uuid_test;
-- INSERT 0 1
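The same workaround should cover the inet type from the question; a minimal, untested sketch along the same lines (the inet_test and pxf_inet_test_w names here are hypothetical):
CREATE TABLE inet_test (col1 int, col2 inet) DISTRIBUTED BY (col1);
INSERT INTO inet_test VALUES (1, '192.168.0.1');
CREATE WRITABLE EXTERNAL TABLE pxf_inet_test_w (col1 int, col2 text)
LOCATION ('pxf://parquet-data/inet-test?PROFILE=file:parquet')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');
-- the explicit cast from inet to text happens on the way out
INSERT INTO pxf_inet_test_w SELECT col1, col2::text FROM inet_test;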

Related

Need an equivalent of SQL Server's table type in Postgres, because range and enum do not work directly as a table

CREATE TYPE Employees AS TABLE (name VARCHAR(100));
This is the query in SQL Server, but when creating the same thing in Postgres:
CREATE TYPE Employees AS (name character varying);
the type above does not work the same way as a table.
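A Postgres composite type describes a single row, not a set of rows, which is why it does not behave like a SQL Server table type. A common workaround, sketched here with illustrative names, is to pass an array of the composite type and unnest it:
CREATE TYPE Employees AS (name varchar(100));
-- a function that takes "a table" of employees as an array of the composite type
CREATE FUNCTION add_employees(emps Employees[]) RETURNS int
LANGUAGE sql AS $$
    SELECT count(*)::int FROM unnest(emps);
$$;
SELECT add_employees(ARRAY[ROW('Alice')::Employees, ROW('Bob')::Employees]);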

Create a table from a topic using a where clause using ksql

I'm using the latest version of the ksqlDB server, 0.29.2 I guess. I'm trying to create a table that reads from a specific topic which receives lots of events, but I'm only interested in specific ones. The JSON event has a property named "evenType", so I want to continually filter the events and create a specific table to store client data, like phone number, email etc., in order to keep the client info updated.
I created a stream called orders_inputs for testing purposes only, and then I tried to create this table, but I got the following error:
create table orders(orderid varchar PRIMARY KEY, itemid varchar) WITH (KAFKA_TOPIC='ORDERS', PARTITIONS=1, REPLICAS=1) as select orderid, itemid from orders_inputs where type='t1';
line 1:120: mismatched input 'as' expecting ';'
Statement: create table orders(orderid varchar PRIMARY KEY, itemid varchar) WITH (KAFKA_TOPIC='ORDERS', PARTITIONS=1, REPLICAS=1) as select orderid, itemid from orders_inputs where type='t1';
Caused by: line 1:120: mismatched input 'as' expecting ';'
Caused by: org.antlr.v4.runtime.InputMismatchException
If you want to create a table that contains the results of a SELECT query on a stream, you can use CREATE TABLE AS SELECT:
https://docs.confluent.io/5.2.1/ksql/docs/developer-guide/create-a-table.html#create-a-ksql-table-with-streaming-query-results
e.g.
CREATE TABLE orders AS
SELECT orderid, itemid FROM orders_inputs
WHERE type='t1';
You can specify the primary key when creating the stream orders_inputs: https://docs.confluent.io/5.4.4/ksql/docs/developer-guide/syntax-reference.html#message-keys
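In the ksql 5.x syntax those docs describe, that means setting the KEY property in the WITH clause when declaring the stream; a sketch with the stream and topic names assumed from the question (in recent ksqlDB versions, including the 0.29.x mentioned in the question, you would instead mark the key column inline, e.g. orderid VARCHAR KEY):
CREATE STREAM orders_inputs (orderid VARCHAR, itemid VARCHAR, type VARCHAR)
WITH (KAFKA_TOPIC='orders_inputs', VALUE_FORMAT='JSON', KEY='orderid');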
Otherwise, you can specify the primary key when creating a table from a topic:
https://docs.confluent.io/5.2.1/ksql/docs/developer-guide/create-a-table.html#create-a-table-with-selected-columns
e.g.
CREATE TABLE orders
(orderid VARCHAR PRIMARY KEY,
itemid VARCHAR)
WITH (KAFKA_TOPIC = 'orders',
VALUE_FORMAT='JSON');
However, you would then have to query the table and filter on type='t1' yourself.
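For example, assuming type is also declared as a column of the table, the filtering would then happen at query time (EMIT CHANGES is needed for a continuous query in recent ksqlDB versions):
SELECT orderid, itemid
FROM orders
WHERE type = 't1'
EMIT CHANGES;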

How to correctly associate an id generator sequence with a table

I'm using Grails 3.0.7 and Postgres 9.2. I'm very new to Postgres, so this may be a dumb question. How do I correctly associate an id generator sequence with a table? I read somewhere that if you create a table with an id column that has a serial datatype, then it will automatically create a sequence for that table.
However, the column seems to be created with a type of bigint. How do I get Grails to create the column with a bigserial datatype, and will this even solve my problem? What if I want one sequence per table? I'm just not sure how to go about setting this up because I've never really used Postgres in the past.
You can define a generator in a domain class like this:
static mapping = {
    id generator: 'sequence', params: [sequence: 'domain_sq']
}
If the sequence is already present in the database then you'll need to name it in the params.
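If you manage the schema yourself, the sequence can be created directly in Postgres beforehand; a one-line sketch using the name from the mapping above:
CREATE SEQUENCE domain_sq;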
There are other properties also available as outlined in the documentation, for example:
static mapping = {
    id column: 'book_id', type: 'integer'
}
In Postgres 10 or later consider an IDENTITY column instead. See:
Auto increment table column
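A minimal sketch of that approach (the table and column names are just illustrative); an IDENTITY column attaches its sequence implicitly, much like serial, but follows the SQL standard:
CREATE TABLE book (
    book_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    title   text
);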
However, the column seems to be created with a type of bigint. How do
I get Grails to create the column with a bigserial datatype, and will
this even solve my problem?
That's expected behavior. Define the column as bigserial; that's all you have to do. The Postgres pseudo data types smallserial, serial, and bigserial create a smallint, int, or bigint column respectively, and attach a dedicated sequence. The manual:
The data types smallserial, serial and bigserial are not true types,
but merely a notational convenience for creating unique identifier
columns (similar to the AUTO_INCREMENT property supported by some
other databases). In the current implementation, specifying:
CREATE TABLE tablename (
    colname SERIAL
);
is equivalent to specifying:
CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
Big quote, I couldn't describe it any better than the manual.
Related:
Get table and column "owning" a sequence
Safely rename tables using serial primary key columns

PostgreSQL bigserial & nextval

I've got a PostgreSQL 9.4.3 server set up. Previously I was only using the public schema, and, for example, I created a table like this:
CREATE TABLE ma_accessed_by_members_tracking (
    reference bigserial NOT NULL,
    ma_reference bigint NOT NULL,
    membership_reference bigint NOT NULL,
    date_accessed timestamp without time zone,
    points_awarded bigint NOT NULL
);
Using the Windows program pgAdmin III, I can see it created the proper information and sequence.
However, I've recently added another schema called "test" to the same database and created the exact same table, just like before.
This time, however, I see:
CREATE TABLE test.ma_accessed_by_members_tracking (
    reference bigint NOT NULL DEFAULT nextval('ma_accessed_by_members_tracking_reference_seq'::regclass),
    ma_reference bigint NOT NULL,
    membership_reference bigint NOT NULL,
    date_accessed timestamp without time zone,
    points_awarded bigint NOT NULL
);
My question / curiosity is: why, in the public schema, does the reference column show bigserial, but in the test schema it shows bigint with a nextval() default?
Both work as expected. I just don't understand why the difference in schemas would show different table creation code. I realize that bigint and bigserial allow the same range of values.
Merely A Notational Convenience
According to the documentation on serial types, smallserial, serial, and bigserial are not true data types. Rather, they are a notational convenience: they create, at once, both a sequence and a column whose default value points to that sequence.
I created a test table in the public schema. The psql command \d shows a bigint column type, so maybe it's pgAdmin behavior?
Update
I checked the pgAdmin source code. In the function pgColumn::GetDefinition() it scans the pg_depend table for an auto dependency, and when it finds one, it replaces bigint with bigserial to reconstruct the original table creation code.
When you create a serial column in the standard way:
CREATE TABLE new_table (
    new_id serial
);
Postgres creates a sequence with commands:
CREATE SEQUENCE new_table_new_id_seq ...
ALTER SEQUENCE new_table_new_id_seq OWNED BY new_table.new_id;
From the documentation: the OWNED BY option causes the sequence to be associated with a specific table column, such that if that column (or its whole table) is dropped, the sequence will be automatically dropped as well.
The standard name of the sequence is built from the table name, the column name, and the suffix _seq.
If a serial column was created this way, pgAdmin shows its type as serial.
If the sequence has a non-standard name or is not associated with the column, pgAdmin shows nextval() as the default value.
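You can verify the association yourself with the built-in pg_get_serial_sequence() function, using the schema and table from the question; it returns the sequence OWNED BY the column, or NULL if there is none (which would match pgAdmin falling back to bigint with a nextval() default):
SELECT pg_get_serial_sequence('test.ma_accessed_by_members_tracking', 'reference');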

Postgresql: inserting value of a column from a file

For example, there is a table named 'testtable' that has the following columns: testint (integer) and testtext (varchar(30)).
What I want to do is pretty much something like this:
INSERT INTO testtable VALUES(15, CONTENT_OF_FILE('file'));
While reading the PostgreSQL documentation, all I could find was the COPY TO/FROM command, but that applies to whole tables, not single columns.
So, what shall I do?
If this SQL code is executed dynamically from your programming language, use the means of that language to read the file, and execute a plain INSERT statement.
However, if this SQL code is meant to be executed via the psql command line tool, you can use the following construct:
\set content `cat file`
INSERT INTO testtable VALUES(15, :'content');
Note that this syntax is specific to psql and makes use of the cat shell command.
It is explained in detail in the PostgreSQL manual:
psql / SQL Interpolation
psql / Meta-Commands
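A server-side alternative is the built-in pg_read_file() function, which reads a file as seen by the server; note that it is restricted to superusers by default and subject to server-side path restrictions, so treat this as a sketch:
INSERT INTO testtable VALUES(15, pg_read_file('/tmp/file'));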
If I understand your question correctly, you could read the single string(s) into a temp table and use that for the insert:
DROP SCHEMA str CASCADE;
CREATE SCHEMA str;
SET search_path='str';
CREATE TABLE strings
( string_id INTEGER PRIMARY KEY
, the_string varchar
);
CREATE TEMP TABLE string_only
( the_string varchar
);
COPY string_only(the_string)
FROM '/tmp/string'
;
INSERT INTO strings(string_id,the_string)
SELECT 5, t.the_string
FROM string_only t
;
SELECT * FROM strings;
Result:
NOTICE: drop cascades to table str.strings
DROP SCHEMA
CREATE SCHEMA
SET
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "strings_pkey" for table "strings"
CREATE TABLE
CREATE TABLE
COPY 1
INSERT 0 1
string_id | the_string
-----------+---------------------
5 | this is the content
(1 row)
Please note that the file is "seen" by the server as the server sees its filesystem. The "current directory" from that point of view is probably $PGDATA, but you should assume nothing and specify the complete pathname, which must be reachable and readable by the server. That is why I used '/tmp', which is unsafe (but an excellent rendez-vous point ;-)
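If the file lives on the client machine rather than the server, psql's \copy meta-command is the usual answer: it reads the file client-side and streams it to the server, so the same example would become:
\copy string_only(the_string) FROM '/tmp/string'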