I am in the process of upgrading a Rails 2.3.4 project to Rails 3.1.1. The old version used ar-extensions to handle a data import. I pulled out ar-extensions and replaced it with activerecord-import, which I understand has exactly the same interfaces.
The call in my code looks like this:
Student.import(columns, values)
Both args are valid arrays holding the correct data, but I get a big fat error!
The error stack looks like this:
NoMethodError (You have a nil object when you didn't expect it!
You might have expected an instance of Array.
The error occurred while evaluating nil.split):
activerecord (3.1.1) lib/active_record/connection_adapters/postgresql_adapter.rb:828:in 'default_sequence_name'
activerecord (3.1.1) lib/active_record/base.rb:647:in `reset_sequence_name'
activerecord (3.1.1) lib/active_record/base.rb:643:in `sequence_name'
activerecord-import (0.2.9) lib/activerecord-import/import.rb:203:in `import'
Looking through the code, it seems as though activerecord-import calls ActiveRecord, which in turn looks for the name and next value of the Postgres sequence.
So activerecord-import looks for the sequence_name
lib/activerecord-import/import.rb:203
# Force the primary key col into the insert if it's not
# on the list and we are using a sequence and stuff a nil
# value for it into each row so the sequencer will fire later
->  if !column_names.include?(primary_key) && sequence_name && connection.prefetch_primary_key?
      column_names << primary_key
      array_of_attributes.each { |a| a << nil }
    end
It calls ActiveRecord ...
lib/active_record/base.rb:647:in `reset_sequence_name'
# Lazy-set the sequence name to the connection's default. This method
# is only ever called once since set_sequence_name overrides it.
def sequence_name #:nodoc:
->  reset_sequence_name
end

def reset_sequence_name #:nodoc:
->  default = connection.default_sequence_name(table_name, primary_key)
    set_sequence_name(default)
    default
end
The code errors when serial_sequence returns nil and default_sequence_name tries to split it.
lib/active_record/connection_adapters/postgresql_adapter.rb
# Returns the sequence name for a table's primary key or some other specified key.
def default_sequence_name(table_name, pk = nil) #:nodoc:
->  serial_sequence(table_name, pk || 'id').split('.').last
rescue ActiveRecord::StatementInvalid
  "#{table_name}_#{pk || 'id'}_seq"
end

def serial_sequence(table, column)
  result = exec_query(<<-eosql, 'SCHEMA', [[nil, table], [nil, column]])
    SELECT pg_get_serial_sequence($1, $2)
  eosql
  result.rows.first.first
end
When I execute pg_get_serial_sequence() directly against the database I get no value returned:
SELECT pg_get_serial_sequence('student', 'id')
But I can see that in the database there is a sequence called student_id_seq
I am using the following versions of Ruby, Rails, PG, etc.:
Rails 3.1.1
Ruby 1.9.2
Activerecord-import 0.2.9
pg 0.12.2
psql (9.0.5, server 9.1.3)
I have migrated the database from MySQL to PostgreSQL. I don't think this has any bearing on the problem, but I thought I'd better add it for completeness.
I can't work out why this isn't working!
Summary of your description:
The table student exists.
The column id exists.
The sequence student_id_seq exists.
pg_get_serial_sequence('student', 'id') still returns NULL.
Two possible explanations:
1) The sequence is not linked to the column.
Column default and the tie between column and sequence are independent features. The mere existence of a fitting sequence does not mean it does what you presume. If you create a column as serial you get the whole package, though. Read the details in the manual.
To fix this (and if you are sure that's how it should be), you can mark the sequence as "owned by" student.id:
ALTER SEQUENCE student_id_seq OWNED BY student.id;
Also check if the column default is set as expected:
SELECT column_name, column_default
FROM information_schema.columns
WHERE table_name = 'student'
-- AND table_schema = 'your_schema' -- if needed
If not, repair:
ALTER TABLE student ALTER COLUMN id SET DEFAULT nextval('student_id_seq');
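Once the sequence is marked as owned by the column, pg_get_serial_sequence should stop returning NULL. A quick sanity check (the schema name in the expected result is an assumption):

-- After the ALTER statements above, this should return the sequence name
-- (e.g. 'public.student_id_seq') instead of NULL:
SELECT pg_get_serial_sequence('student', 'id');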
2) A mixup of host address / port / database / schema / capitalization of the table name.
It happens all the time. Make sure you check the same database that your app connects to, with the same user or at least the same search_path. Make sure the objects are in the schema where you expect them and that there isn't, for instance, another student table in another schema that got mixed up.
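A few quick checks along these lines (a sketch; object names are taken from the question, your schema may differ):

-- Which database and schemas is this session actually looking at?
SELECT current_database(), current_schemas(true);

-- Where do the table and the sequence really live? (relkind 'r' = table, 'S' = sequence)
SELECT n.nspname AS schema, c.relname AS name, c.relkind
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname IN ('student', 'student_id_seq');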
For context, this issue occurred in a Go program I am writing using the default postgres database driver.
I have been building a service to talk to a postgres database which has a table similar to the one listed below:
CREATE TABLE object (
id SERIAL PRIMARY KEY NOT NULL,
name VARCHAR(255) UNIQUE,
some_other_id BIGINT UNIQUE
...
);
I have created some endpoints for this item including an "Install" endpoint which effectively acts as an upsert function like so:
INSERT INTO object (name, some_other_id)
VALUES ($1, $2)
ON CONFLICT (name) DO UPDATE SET
some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
I also have an "Update" endpoint with an underlying query like so:
UPDATE object
SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
WHERE name = $1
The problem:
Whenever I run the update query I always run into the error, referencing the field "some_other_id":
pq: value "1010101010144" is out of range for type integer
However, this error never occurs with the "upsert" version of the query, even when the row already exists in the database (when it has to evaluate the COALESCE expression). I have been able to prevent this error by updating the COALESCE expression to be as follows:
COALESCE(NULLIF($2, CAST(0 AS BIGINT)), object.some_other_id)
But as it never occurs with the first query, I wondered if this inconsistency comes from me doing something wrong or something that I don't understand? And also, what is the best practice here, should I be casting all values?
I am definitely passing in a 64 bit integer to the query for "some_other_id", and the first query works with the Go implementation even without the explicit type cast.
If any more information (or Go implementation) is required then please let me know, many thanks in advance! (:
Edit:
To eliminate confusion, the queries are being executed directly in Go code like so:
res, err := s.db.ExecContext(ctx, `UPDATE object SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id) WHERE name = $1`,
    "a name",
    1010101010144,
)
Both queries are executed in exactly the same way.
Edit: Also corrected parameter (from $51 to $2) in my current workaround.
I would also like to take this opportunity to note that the query does work with my proposed fix, which suggests that the issue is me confusing Postgres with types in the NULLIF expression? There is no stored procedure asking for an INTEGER arg in between my code and the database, at least none that I have written.
This has to do with how the postgres parser resolves types for the parameters. I don't know how exactly it's implemented, but given the observed behaviour, I would assume that the INSERT query doesn't fail because it is clear from (name,some_other_id) VALUES ($1,$2) that the $2 parameter should have the same type as the target some_other_id column, which is of type int8. This type information is then also used in the NULLIF expression of the DO UPDATE SET part of the query.
You can also test this assumption by using (name) VALUES ($1) in the INSERT and you'll see that the NULLIF expression in DO UPDATE SET will then fail the same way as it does in the UPDATE query.
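Per that reasoning, a variant like the following (a sketch reusing the question's statement) would be expected to fail with the same out-of-range error, because $2 now only appears inside the NULLIF call:

INSERT INTO object (name)
VALUES ($1)
ON CONFLICT (name) DO UPDATE SET
    some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)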
So the UPDATE query fails because there is not enough context for the parser to infer the accurate type of the $2 parameter. The "closest" thing that the parser can use to infer the type of $2 is the NULLIF call expression, specifically it uses the type of the second argument of the call expression, i.e. 0, which is of type int4, and it then uses that type information for the first argument, i.e. $2.
To avoid this issue, you should use an explicit type cast on any parameter whose type cannot be inferred accurately, e.g. NULLIF($2::int8, 0).
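Applied to the failing statement from the question, that looks like this:

UPDATE object
SET some_other_id = COALESCE(NULLIF($2::int8, 0), object.some_other_id)
WHERE name = $1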
COALESCE(NULLIF($51, CAST(0 AS BIGINT)), object.some_other_id)
Fifty-one? Really?
pq: value "1010101010144" is out of range for type integer
Pay attention: the data type in the error message is integer, not bigint.
I think the reason for the error lies outside the code shown. So I take out a magic crystal ball and make a pass with my hands.
an "Install" endpoint which effectively acts as an upsert function like so
I also have an "Update" endpoint
Do you mean a PostgreSQL function (stored procedure) when you say "endpoint"? I think yes.
Also, $1 and $2 look like PostgreSQL function arguments.
The magic crystal ball says: you have two PostgreSQL functions with different argument data types:
The "Install" endpoint has its $2 argument as a bigint data type. It looks like CREATE FUNCTION Install(VARCHAR(255), bigint).
The "Update" endpoint has its $2 argument as an integer data type, not bigint. It looks like CREATE FUNCTION Update(VARCHAR(255), integer).
Finally, I would rewrite your condition in a more understandable way:
UPDATE object
SET some_other_id =
CASE
WHEN $2 = 0 THEN object.some_other_id
ELSE $2
END
WHERE name = $1
I am developing a stored procedure (SP) in DB2 which will return some data in the form of an output cursor, but the field lengths for different fields may vary. I am facing issues because I am not able to get the SP to compile for this requirement. Below is the code for reference:
create table employee(id bigint,first_name varchar(128),last_name varchar(128));
create table employee_designation(id bigint, emp_id bigint,
designation varchar(128));
create type empRowType as row(first_name varchar(128),last_name varchar(128),
designation varchar(128));
create type empCursorType as empRowType cursor;
create procedure emp_designation_lookup(in p_last_name varchar(128), out emp_rec empCursorType)
  result sets 0
  language SQL
begin
  set emp_rec = cursor for select a.first_name, a.last_name, b.designation
    from employee a, employee_designation b
    where a.id = b.EMP_ID
    and a.LAST_NAME = p_last_name;
end;
The above SP compiles and returns the result as intended. However, if I change the row definition as below:
create type empRowType as row(first_name varchar(120),last_name varchar(128),
designation varchar(128));
On recompiling the SP, I get the following error
BMS sample -- [IBM][CLI Driver][DB2/NT64] SQL0408N A value is not compatible with the
data type of its assignment target. Target name is "EMP_REC". LINE NUMBER=5. SQLSTATE=42821
The error occurs because first_name as defined in the cursor is not the same length as in the employee table (the cursor has 120 whereas the table has 128).
However, for my actual work I need the return values computed based on some logic, and hence the lengths specified in the cursor will differ from what is in the table. I also have some new columns in the cursor which are not related to the table's columns (for example, determining the bonus amount or whether the employee should be promoted, etc.).
I want to know if there is a solution to this scenario specific to DB2. I am new to DB2 and am using version 10.5.7. I have also explored multiple articles in the IBM docs but was not able to find an exact resolution. Any help or pointers would be greatly appreciated.
When you use a strongly typed cursor, any assignment involving that cursor must exactly match the relevant type. Otherwise the compiler will throw an exception, which is your symptom.
Db2 SQL PL also supports weak cursors, and an SQL PL procedure output parameter type can be a weak cursor. This means that a stored procedure declaration can use ...OUT p_cur CURSOR (so there is no preassigned user-defined type linked to that cursor), and then assign that output parameter from different queries (set p_cur = CURSOR FOR SELECT ...). In my case the caller is always SQL (not jdbc), but you might experiment with jdbc, as IBM gives an example in the Db2-LUW v11.5 documentation.
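A minimal sketch of the question's procedure rewritten with a weakly typed OUT cursor parameter (untested; the procedure name is made up for illustration):

create procedure emp_designation_lookup_weak(in p_last_name varchar(128), out emp_rec cursor)
  result sets 0
  language SQL
begin
  -- no user-defined row/cursor type is involved, so computed columns or
  -- differing lengths no longer have to match a declared row type
  set emp_rec = cursor for select a.first_name, a.last_name, b.designation
    from employee a, employee_designation b
    where a.id = b.emp_id
    and a.last_name = p_last_name;
end;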
Most people use simple result-sets (instead of returned cursors) to harvest the output from queries in stored procedures. These result-sets are consumable by all kinds of client applications (jdbc, odbc, cli) and languages that use those interfaces (java, .net, python, php, perl, javascript, command-line/scripting etc.). So simple result sets offer more general-purpose usability than returned cursor parameters.
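For comparison, a sketch of the same lookup returning a simple result set instead of a cursor parameter (untested; procedure name again made up for illustration):

create procedure emp_designation_lookup_rs(in p_last_name varchar(128))
  dynamic result sets 1
  language SQL
begin
  -- the open cursor is returned to the caller as an ordinary result set
  declare c1 cursor with return to caller for
    select a.first_name, a.last_name, b.designation
    from employee a, employee_designation b
    where a.id = b.emp_id
    and a.last_name = p_last_name;
  open c1;
end;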
IBM publishes various Db2 samples in different places (on github, in the samples directory-tree of your Db2 server instance directory, in the Knowledge Center etc.).
PostgreSQL DB: v 9.4.24
create table my_a_b_data ... // with a_uuid, b_uuid, and c columns
NOTE: my_a_b_data keeps references to the a and b tables, so it stores the uuids of a and b.
where the primary key is (a_uuid, b_uuid)
there is also an index:
create unique index my_a_b_data_pkey
on my_a_b_data (a_uuid, b_uuid);
In the Java JDBC-like code, within the scope of one single transaction (start() -> [code (delete, insert)] -> commit()), using the org.postgresql:postgresql:42.2.5 driver:
delete from my_a_b_data where b_uuid = 'bbb';
insert into my_a_b_data (a_uuid, b_uuid, c) values ('aaa', 'bbb', null);
I found that the insert fails because the deleted row is apparently not yet gone, so the insert fails with a duplicate key error.
Q: Is it some kind of limitation in PostgreSQL that the DB can't do a delete and insert in one transaction because PostgreSQL doesn't update its indexes until the commit for the delete is executed, so the insert fails since the id or key (whatever we may be using) already exists in the index?
What would be a possible solution? Splitting it into two transactions?
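For reference, the two statements back to back in a single explicit transaction look like this (as the update below notes, this works fine when run directly in an SQL console):

BEGIN;
DELETE FROM my_a_b_data WHERE b_uuid = 'bbb';
INSERT INTO my_a_b_data (a_uuid, b_uuid, c) VALUES ('aaa', 'bbb', null);
COMMIT;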
UPDATE: the order is exactly the same. When I test the SQL alone in the SQL console, it works fine. We use the JDBI library v 5.29.
There it looks like this:
@Transaction
@SqlUpdate("insert into my_a_b_data (...; // similar for the delete
public abstract void addB() ..
So in the code:
this.begin();
this.deleteByB(b_id);
this.addB(a_id, b_id);
this.commit();
I had a similar problem with inserting duplicate values, and I resolved it by using insert-and-update instead of delete. I created this process in Python, but you might be able to reproduce it:
First, you create a temporary table like the target table where you want to insert values; the difference is that this table is dropped after commit.
CREATE TEMP TABLE temp_my_a_b_data
(LIKE public.my_a_b_data INCLUDING DEFAULTS)
ON COMMIT DROP;
I created a CSV (I had to merge different data as input) with the values that I want to insert into my table, and I used the COPY function to insert them into the temp table (temp_my_a_b_data).
I found this code in a post about Java and the PostgreSQL COPY / \copy command:
String query ="COPY tmp from 'E://load.csv' delimiter ','";
Use INSERT INTO with the ON CONFLICT clause, which lets you specify an action to take when the insert cannot be done because of the specified constraints; in the case below we do an update:
INSERT INTO public.my_a_b_data
SELECT *
FROM temp_my_a_b_data
ON CONFLICT (a_uuid, b_uuid) DO UPDATE
SET c = EXCLUDED.c;
Considerations:
I am not sure, but you might be able to perform the third step without the previous steps (temp table or COPY FROM). You can just loop over the values:
INSERT INTO public.my_a_b_data VALUES(value1, value2, null)
ON CONFLICT (a_uuid, b_uuid) DO UPDATE
SET c = EXCLUDED.c;
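Note that the ON CONFLICT target has to match an existing unique constraint; for the question's table that is the primary key (a_uuid, b_uuid). Applied to the original single-row insert, a sketch looks like this:

INSERT INTO my_a_b_data (a_uuid, b_uuid, c)
VALUES ('aaa', 'bbb', null)
ON CONFLICT (a_uuid, b_uuid) DO UPDATE
SET c = EXCLUDED.c;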
I have an Alembic migration which creates a few DB indexes that were missing in a database. Example:
op.create_index(op.f('ix_some_index'), 'table_1', ['column_1'], unique=False)
However, the migration fails in other environments that already have the index:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "ix_some_index" already exists
PostgreSQL supports an IF NOT EXISTS option for cases like this, but I don't see any way of invoking it using either Alembic or SQLAlchemy options. Is there a canonical way of checking for an existing index?
Here's a somewhat blunt solution that works for PostgreSQL. It simply checks whether there's an index with the same name before creating the new index.
Beware that it doesn't verify that the index is in the correct Postgres namespace or any other info that could be relevant. It works in my case because I know there's no other chance of name collision:
def index_exists(name):
    connection = op.get_bind()
    result = connection.execute(
        "SELECT exists(SELECT 1 from pg_indexes where indexname = '{}') as ix_exists;"
        .format(name)
    ).first()
    return result.ix_exists

def upgrade():
    if not index_exists('ix_some_index'):
        op.create_index(op.f('ix_some_index'), 'table_1', ['column_1'], unique=False)
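Alternatively, since PostgreSQL (9.5+) supports IF NOT EXISTS directly, as the question notes, the raw statement could be issued instead of op.create_index, for example via op.execute; a sketch for the index above:

CREATE INDEX IF NOT EXISTS ix_some_index ON table_1 (column_1);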
I am writing an experimental script to do a SQL comparison (collated as case-sensitive) and I am having issues with SET IDENTITY_INSERT <Table> ON.
I have switched on this option and disabled foreign key checks, but it still seems to be complaining about the identity insert.
Here are the steps I followed:
1 - I created a linked server
EXEC sp_addlinkedserver @server=N'xxx.xxx.xxx.xxx', @srvproduct=N'SQL Server'
2 - I added the login credentials
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname = N'xxx.xxx.xxxx.xxx',
    @locallogin = NULL,
    @useself = N'False',
    @rmtuser = N'xxxxxxxxxxx',
    @rmtpassword = N'xxxxxxxxxxx'
3 - In the same batch, I set IDENTITY_INSERT, disabled foreign key checks, and ran the following merge script. Note: the deferred query returns an XML field, which is disallowed over distributed servers, so I cast it to NVARCHAR(MAX):
SET IDENTITY_INSERT [DATABASE1].[dbo].[TABLE1] ON
ALTER TABLE [DATABASE1].[dbo].[TABLE1] NOCHECK CONSTRAINT ALL
MERGE [DATABASE1].[dbo].[TABLE1]
USING OPENQUERY([xxx.xxx.xxx.xxx], 'SELECT S.ID, S.EventId, S.SnapshotTypeID, CAST(S.Content AS NVARCHAR(MAX)) AS Content FROM [DATABASE1].[dbo].[TABLE1] AS S') AS S
ON (CAST([DATABASE1].[dbo].[TABLE1].Content AS NVARCHAR(MAX)) = S.Content)
WHEN NOT MATCHED BY TARGET
THEN INSERT VALUES (S.ID, S.EventId, S.SnapshotTypeID, CAST(S.Content AS XML))
WHEN MATCHED
THEN UPDATE SET [DATABASE1].[dbo].[TABLE1].EventId = S.EventId,
[DATABASE1].[dbo].[TABLE1].SnapshotTypeID = S.SnapshotTypeID,
[DATABASE1].[dbo].[TABLE1].Content = S.Content
COLLATE Latin1_General_CS_AS;
GO
The error message I am getting reads as follows:
Msg 8101, Level 16, State 1, Line 4
An explicit value for the identity column in table 'Database1.dbo.Table' can only be specified when a column list is used and IDENTITY_INSERT is ON.
How can I fix this? As I mentioned, this script is only an experiment for one of the systems I am writing. I am probably reinventing the wheel somewhere, but it's all about learning in this exercise.
An explicit value for the identity column in table 'Database1.dbo.Table' can only be specified when a column list is used and IDENTITY_INSERT is ON.
You have no column list in the INSERT clause of your MERGE statement.
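In other words, the insert branch of the MERGE needs an explicit column list for IDENTITY_INSERT to apply; a sketch using the column names from the question:

WHEN NOT MATCHED BY TARGET
    THEN INSERT (ID, EventId, SnapshotTypeID, Content)
         VALUES (S.ID, S.EventId, S.SnapshotTypeID, CAST(S.Content AS XML))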