How do I set a key field when creating a table in KSQL? - apache-kafka

Can I set the key field when creating a table?
I have created a table by aggregation as below:
CREATE TABLE withdrawal_less_than_5min AS
SELECT executedate, status, count(*) as count
FROM TB3_WITHDRAW_RECORD_EXCLUDE_INTERNAL_USERS
GROUP BY executedate,status;
When I run DESCRIBE EXTENDED withdrawal_less_than_5min, the key field of the table is set as below, although I believe it should be executedate and status.
Key field : KSQL_INTERNAL_COL_0|+|KSQL_INTERNAL_COL_1
However, when I try to join it with another table built with the same aggregation, it returns this error:
Source table (A) key column (KSQL_INTERNAL_COL_0|+|KSQL_INTERNAL_COL_1)
is not the column used in the join criteria (EXECUTEDATE).
How do I set the key field? Thank you.

You can set the key as follows:
CREATE TABLE withdrawal_less_than_5min with (key='EXECUTEDATE') AS
SELECT executedate, status, count(*) as count
FROM TB3_WITHDRAW_RECORD_EXCLUDE_INTERNAL_USERS
GROUP BY executedate,status partition by 'EXECUTEDATE';
You can also follow Robin's blog: https://www.confluent.io/stream-processing-cookbook/ksql-recipes/inspecting-changing-topic-keys.
For any errors or questions about KSQL, search for Robin Moffatt; he has already answered many such queries :)

Related

Why am I getting PostgreSQL error "Key (id)=(357) already exists"? [duplicate]

I know this has been asked many times, but I didn't find an answer to my problem. I have a table with a column "id" that should hold a unique number, as usual. The column is of type serial, so the next value after each insert comes from a sequence. Everything seems to be all right, but the error still shows up sometimes, and I don't know why. The documentation says the sequence is foolproof and always works. Will adding a UNIQUE constraint to that column help? I have worked with Postgres many times before, but this is the first time I have seen this error. I did everything as usual and never had this problem before. Can you help me find an answer that I can use in the future for all tables I create? Let's say we have something simple like this:
CREATE TABLE comments
(
id serial NOT NULL,
some_column text NOT NULL,
CONSTRAINT id_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE interesting.comments OWNER TO postgres;
If I add:
ALTER TABLE comments ADD CONSTRAINT id_id_key UNIQUE(id)
Will it be enough, or is there something else that should be done?

This article explains that your sequence might be out of sync and that you have to manually bring it back in sync.
An excerpt from the article in case the URL changes:
If you get this message when trying to insert data into a PostgreSQL
database:
ERROR: duplicate key violates unique constraint
That likely means that the primary key sequence in the table you're
working with has somehow become out of sync, likely because of a mass
import process (or something along those lines). Call it a "bug by
design", but it seems that you have to manually reset the a primary
key index after restoring from a dump file. At any rate, to see if
your values are out of sync, run these two commands:
SELECT MAX(the_primary_key) FROM the_table;
SELECT nextval('the_primary_key_sequence');
If the first value is higher than the second value, your sequence is
out of sync. Back up your PG database (just in case), then run this command:
SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);
That will set the sequence to the next available value that's higher
than any existing primary key in the sequence.
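The mechanism described in the excerpt can be sketched with a toy Python model. This is not PostgreSQL code: the `Sequence` and `Table` classes are hypothetical and exist only for illustration. Rows inserted with explicit ids bypass the sequence, so the next generated id collides until the sequence is moved past the current maximum.

```python
class Sequence:
    """Toy stand-in for a PostgreSQL sequence (illustration only)."""

    def __init__(self):
        self.last_value = 0

    def nextval(self):
        self.last_value += 1
        return self.last_value

    def setval(self, value):
        # Mirrors two-argument setval(seq, value): the next nextval() returns value + 1.
        self.last_value = value


class Table:
    """Toy table whose primary key is normally fed by a sequence."""

    def __init__(self):
        self.seq = Sequence()
        self.rows = {}

    def insert(self, text, explicit_id=None):
        # An explicit id bypasses the sequence, as in a bulk import or a manual INSERT.
        new_id = explicit_id if explicit_id is not None else self.seq.nextval()
        if new_id in self.rows:
            raise KeyError(f"Key (id)=({new_id}) already exists")
        self.rows[new_id] = text
        return new_id


comments = Table()
comments.insert("first")                      # id 1 comes from the sequence
comments.insert("imported", explicit_id=2)    # the sequence is still at 1
comments.insert("imported", explicit_id=3)

try:
    comments.insert("second")                 # the sequence hands out 2 again
    collided = False
except KeyError:
    collided = True

# Resynchronise: move the sequence to the current maximum id.
comments.seq.setval(max(comments.rows))
new_id = comments.insert("second")
print(collided, new_id)
```

In this model, setting the sequence to the maximum is enough for the next insert to succeed. Setting it to the maximum plus one, as the excerpt does, also works and simply skips one id.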

Intro
I also encountered this problem, and the solution proposed by @adamo was basically the right one. However, I had to spend a lot of time on the details, so I am writing a new answer to save others that time.
Case
My case was as follows: There was a table that was filled with data using an app. Now a new entry had to be inserted manually via SQL. After that the sequence was out of sync and no more records could be inserted via the app.
Solution
As mentioned in the answer from @adamo, the sequence must be synchronized manually. To do that, you need the name of the sequence. In Postgres, the name of the sequence can be determined with the function PG_GET_SERIAL_SEQUENCE. Most examples use lower-case table names. In my case the tables were created by an ORM middleware (like Hibernate or Entity Framework Core) and their names all started with a capital letter.
In an e-mail from 2004 (link) I got the right hint.
(Let's assume for all examples, that Foo is the table's name and Foo_id the related column.)
Command to get the sequence name:
SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id');
So, the table name must be in double quotes, surrounded by single quotes.
1. Validate that the sequence is out of sync
SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')) AS "Current Value", MAX("Foo_id") AS "Max Value" FROM "Foo";
When the Current Value is less than the Max Value, your sequence is out of sync. Note that CURRVAL raises an error if NEXTVAL has not yet been called on that sequence in the current session.
2. Correction
SELECT SETVAL((SELECT PG_GET_SERIAL_SEQUENCE('"Foo"', 'Foo_id')), (SELECT (MAX("Foo_id") + 1) FROM "Foo"), FALSE);

Replace table_name with the actual name of your table.
1. Get the current highest id in the table. Note it for the next step.
SELECT MAX(id) FROM table_name;
2. Get the next id according to the PostgreSQL sequence. Make sure this id is higher than the max id from step 1.
SELECT nextval('"table_name_id_seq"');
3. If it is not higher, update the sequence:
SELECT setval('"table_name_id_seq"', (SELECT MAX(id) FROM table_name)+1);

The primary key is already protecting you from inserting duplicate values, as you're experiencing when you get that error. Adding another unique constraint isn't necessary to do that.
The "duplicate key" error is telling you that the work was not done because it would produce a duplicate key, not that it discovered a duplicate key already committed to the table.

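For future searchers: use ON CONFLICT DO NOTHING if you want an insert that would violate the constraint to be skipped instead of raising an error.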
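A minimal sketch of what that clause does, using Python's built-in sqlite3 module as a stand-in because it runs without a server and accepts the same clause (this assumes a bundled SQLite of 3.24 or newer). The table mirrors the comments table from the question. Note that this only suppresses the error for the conflicting row. It does not bring an out-of-sync sequence back in sync.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, some_column TEXT NOT NULL)"
)
conn.execute("INSERT INTO comments (id, some_column) VALUES (357, 'original')")

# Without the ON CONFLICT clause this insert would raise sqlite3.IntegrityError.
conn.execute(
    "INSERT INTO comments (id, some_column) VALUES (357, 'duplicate') "
    "ON CONFLICT (id) DO NOTHING"
)

rows = conn.execute("SELECT id, some_column FROM comments").fetchall()
print(rows)
```

The conflicting row is silently dropped and the existing row is left untouched, so this fits cases where losing the new row is acceptable.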

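Reference: https://www.calazan.com/how-to-reset-the-primary-key-sequence-in-postgresql-with-django/
I had the same problem. Try this:
python manage.py sqlsequencereset app_label
E.g.:
python manage.py sqlsequencereset auth
This command takes app labels and only prints the SQL statements that reset the sequences; run the printed SQL against the database, for example by piping it into python manage.py dbshell.
You need to run this with your production settings (if you have them), and Postgres must be installed on the server where you run it.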

From http://www.postgresql.org/docs/current/interactive/datatype.html
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic. If you wish a serial column to be in a unique constraint or a primary key, it must now be specified, same as with any other data type.

In my case the create table script is:
CREATE TABLE public."Survey_symptom_binds"
(
id integer NOT NULL DEFAULT nextval('"Survey_symptom_binds_id_seq"'::regclass),
survey_id integer,
"order" smallint,
symptom_id integer,
CONSTRAINT "Survey_symptom_binds_pkey" PRIMARY KEY (id)
)
So:
SELECT nextval('"Survey_symptom_binds_id_seq"'::regclass),
MAX(id)
FROM public."Survey_symptom_binds";
The value returned by nextval('"Survey_symptom_binds_id_seq"'::regclass) was less than MAX(id)!
To fix the problem:
SELECT setval('"Survey_symptom_binds_id_seq"', (SELECT MAX(id) FROM public."Survey_symptom_binds")+1);
Good Luck every one!

I had the same problem. It was caused by the type of my relations. I had a table property that was related to both states and cities. At first I had defined the relation from property to states as OneToOne, and the same for cities, and I got the same error, "duplicate key violates unique constraint". OneToOne means that only one property can be related to a given state or city. That doesn't make sense, because a city can have multiple properties. So the problem was the relation: it should be ManyToOne, many properties to one city.

Table names start with a capital letter if the tables were created by an ORM middleware (like Hibernate or Entity Framework Core), so they must be double-quoted:
SELECT setval('"Table_name_Id_seq"', (SELECT MAX("Id") FROM "Table_name") + 1)
WHERE
NOT EXISTS (
SELECT *
FROM (SELECT CURRVAL(PG_GET_SERIAL_SEQUENCE('"Table_name"', 'Id')) AS seq, MAX("Id") AS max_id
FROM "Table_name") AS seq_table
WHERE seq > max_id
)

Try this.
It's just a suggestion to improve on adamo's code (thanks a lot, adamo):
SELECT setval('tableName_columnName_seq', (SELECT MAX(columnName) FROM tableName));

For a programmatic solution in Django: based on Paolo Melchiorre's answer, I wrote a function to be called before any .save()
from django.db import connection

def setSqlCursor(db_table):
    # Reset the table's id sequence to the current MAX(id).
    # db_table is concatenated into the SQL, so pass only trusted table names.
    sql = """SELECT pg_catalog.setval(pg_get_serial_sequence('""" + db_table + """', 'id'), MAX(id)) FROM """ + db_table + """;"""
    with connection.cursor() as cursor:
        cursor.execute(sql)

I had a similar problem, but I solved it by removing all the foreign keys in my PostgreSQL database.

Compute shared hstore key names in PostgreSQL

If I have a table with an HSTORE column:
CREATE TABLE thing (properties hstore);
How could I query that table to find the hstore key names that exist in every row?
For example, if the table above had the following data:
properties
-------------------------------------------------
"width"=>"b", "height"=>"a"
"width"=>"b", "height"=>"a", "surface"=>"black"
"width"=>"c"
How would I write a query that returned 'width', as that is the only key that occurs in each row?
skeys() will give me all the property keys, but I'm not sure how to aggregate them so I only have the ones that occur in each row.

The manual gets us most of the way there, but not all the way... way down at the bottom of http://www.postgresql.org/docs/8.3/static/hstore.html under the heading "Statistics", they describe a way to count keys in an hstore.
If we adapt that to your sample table above, you can compare the counts to the # of rows in the table.
SELECT key
FROM (SELECT (each(properties)).key FROM thing) AS stat
GROUP BY key
HAVING count(*) = (SELECT count(*) FROM thing)
ORDER BY key;
If you want to find the opposite (all those keys that are not in every row of your table), just change the = to < and you're in business!
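The same counting idea can be sketched in Python, with each row's hstore modelled as a plain dict. The helper name `keys_in_every_row` is made up for this illustration. A key occurs in every row exactly when its occurrence count equals the number of rows.

```python
from collections import Counter


def keys_in_every_row(rows):
    """Return the sorted keys that are present in every dict of `rows`."""
    # Each dict holds a key at most once, so the count is the number of rows with it.
    counts = Counter(key for row in rows for key in row)
    return sorted(key for key, n in counts.items() if n == len(rows))


things = [
    {"width": "b", "height": "a"},
    {"width": "b", "height": "a", "surface": "black"},
    {"width": "c"},
]

common = keys_in_every_row(things)
print(common)
```

With the sample data from the question, only width survives the filter. Changing the comparison to `n < len(rows)` gives the keys that are missing from at least one row.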

nosql cassandra - how to create update query based on select

I tried to create a query to update the number of rows of a specific table:
UPDATE Albums (NumOfphotos)
VALUES (
SELECT COUNT(*)
FROM photos
WHERE AlbumName='nature'
)
WHERE AlbumName='nature';
This does not seem to be the right syntax,
what is the right way to do it (maybe by autokey)?

This type of query is deliberately not supported by Cassandra or CQL.
If you review the DataStax CQL documentation for UPDATE, you will see that the "where specification" only takes either a primary key or an IN (keys) clause:
primary key name = key_value
primary key name IN (key_value ,...)
So your best bet is to do a
SELECT * FROM photos WHERE AlbumName='nature';
followed by updates to set the correct values.
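A rough sketch of that read-then-write pattern in Python. It assumes a session object with the same shape as the DataStax Python driver's `Session.execute(query, parameters)` with `%s` placeholders and a result offering `.one()`, and it assumes AlbumName is the partition key of both tables. `FakeSession` is a hypothetical stand-in so the sketch runs without a cluster.

```python
def refresh_photo_count(session, album_name):
    """Read the photo count, then write it to the album row (two statements)."""
    row = session.execute(
        "SELECT COUNT(*) FROM photos WHERE AlbumName = %s", (album_name,)
    ).one()
    count = row[0]
    session.execute(
        "UPDATE Albums SET NumOfphotos = %s WHERE AlbumName = %s",
        (count, album_name),
    )
    return count


class FakeSession:
    """Hypothetical stand-in for a driver session; records the statements it gets."""

    def __init__(self, photo_count):
        self.photo_count = photo_count
        self.statements = []

    def execute(self, query, parameters=None):
        self.statements.append((query, parameters))
        return self

    def one(self):
        return (self.photo_count,)


session = FakeSession(photo_count=12)
result = refresh_photo_count(session, "nature")
print(result, len(session.statements))
```

The two statements are not atomic, so a photo added between the SELECT and the UPDATE would leave the stored count stale until the next refresh.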

Adding a primary key with an "insert into" statement

I have the following query and need to add a Primary Key to the Column of Employeenumber:
SELECT [Exceptions].Employeenumber,[Exceptions].exceptiondate, [Exceptions].starttime, [exceptions].endtime, [Exceptions].code, datediff(minute, starttime, endtime) as minutes INTO scratchpad3,
FROM Employees INNER JOIN Exceptions ON [Exceptions].EmployeeNumber = [Exceptions].Employeenumber
where [Exceptions].exceptiondate between '5/1/2011' and '5/8/2011'
GROUP BY [Exceptions].Employeenumber, [Exceptions].Exceptiondate, [Exceptions].starttime, [exceptions].endtime,
[Exceptions].code, [Exceptions].exceptiondate
but I don't know the proper syntax when you're doing a "create" this way. What's the proper syntax to add a primary key this way?
Thank you.

You can't add a Primary Key to a SELECT statement. Primary Keys are identifying columns of tables. You'd need to ALTER TABLE and ADD PRIMARY KEY. The syntax is different, but it looks like you're using SQL Server. Statements can be found Here.
If you're looking to just add a number for each record, try using ROW_NUMBER (might be different depending on the database you're using).
Hope that helps,
Jason
