psycopg2 Ignore inserting row with null value - postgresql

I'm populating a database from a .csv file and somewhere in my code I'm referring to a foreign key that might be invalid ( so when I'm going to select that id it returns null). At this point I want to log the row that returns null value and ignore insertion of that row but transaction don't stop and continue. How do I achieve this?
for index, row in df.iterrows():
insert_query = f"""insert into {table_name}
(travel_date, train_id, delay)
values ({row[0]},
(select id from {train_table} where {train_table}.train_no={row[1]} limit 1),
{row[2]});"""
cursor.execute(insert_query)

Related

Fast new row insertion if a value of a column depends on previous value in existing row

I have a table cusers with a primary key:
primary key(uid, lid, cnt)
And I try to insert some values into the table:
insert into cusers (uid, lid, cnt, dyn, ts)
values
(A, B, C, (
select C - cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 1
), now())
on conflict do nothing
Quite often (with the possibility of 98%) a row cannot be inserted to cusers because it violates the primary key constraint, so hard select queries do not need to be executed at all. But as I can see PostgreSQL first counts the select query as a result of dyn column and only then rejects row because of uid, lid, cnt violation.
What is the best way to insert rows quickly in such situation?
Another explanation
I have a system where one row depends on another. Here is an example:
(x, x, 2, 2, <timestamp>)
(x, x, 5, 3, <timestamp>)
Two columns contain an absolute value (2 and 5) and relative value (2, 5 - 2). Each time I insert new row it should:
avoid same rows (see primary key constraint)
if new row differs, it should count a difference and put it into the dyn column (so I take the last inserted row for the user according to the timestamp and subtract values).
Another solution I've found is to use returning uid, lid, ts for inserts and get user ids which were really inserted - this is how I know they have differences from existing rows. Then I update inserted values:
update cusers
set dyn = (
select max(cnt) - min(cnt)
from (
select cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 2) Table
)
where uid = A and lid = B and ts = TS
But it is not a fast approach either, as it seeks all over the ts column to find the two last inserted rows for each user. I need a fast insert query as I insert millions of rows at a time (but I do not write duplicates).
What the solution can be? May be I need a new index for this? Thanks in advance.

Mybatis Insert PK manually

I am trying to single insert data into table with assigned PK. Manually assiging PK.
XML file
<insert id = "insertStd" parameterType = "com.org.springboot.dao.StudentEntity" useGeneratedKeys = "false" keyProperty = "insertStd.id", keyColumn = "id">
INSERT INTO STUDENT (ID, NAME, BRANCH, PERCENTAGE, PHONE, EMAIL )
VALUES (ID=#{insertStd.id}, NAME=#{insertStd.name}, BRANCH=#{insertStd.branch}, PERCENTAGE=#{insertStd.percentage}, PHONE=#{insertStd.phone}, EMAIL =#{insertStd.email});
</insert>
Service call method
public boolean saveStudent(Student student){
LOGGER.info("Student object save");
int savedId= studentMapper.insertStd(student);
}
Log file
org.springframework.jdbc.badsqlgrammarexception
### Error updating database Causes: cause org.postgresql.util.psqlexception error column id does not exist
HINT: There is a column named "id" in the table "student" but it can't be referenced from this part of the query.
Position 200
### Error may exist in file [c:\.....\StudentMapper.xml]
### Error may involve in com.org.springboot.dao.StudentMapper.insertStd-InLine
### The error occurred while setting parameters
### SQL INSERT INTO STUDENT (ID, NAME, BRANCH, PERCENTAGE, PHONE, EMAIL )
VALUES (ID=?, NAME=?,BRANCH=?, PERCENTAGE=?, PHONE=?, EMAIL=?);
### cause org.postgresql.util.psqlexception ERROR column "id" doesn't exist. //It did worked with JPA id assigned manually.
### There is a column named "ID" in the table "STUDENT", Bbut it cannot be referenced from the part of the query.
The INSERT statement of malformed. The VALUES clause should not include the column names.
Also, since there's no primary auto-generation, you can remove all the other attributes. Just leave the mapper id.
Note: if you want to manually assign the PK value, you need to make sure the table does not have a GENERATED ALWAYS clause for the column. If this is the case, the table will ignore the value you are providing and will use its own rules to generate the PK.
Use:
<insert id="insertStd">
INSERT INTO STUDENT (ID, NAME, BRANCH, PERCENTAGE, PHONE, EMAIL)
VALUES (
#{insertStd.id}, #{insertStd.name}, #{insertStd.branch},
#{insertStd.percentage}, #{insertStd.phone}, #{insertStd.email}
);
</insert>
Your error is easily reproduceable:
create table t (a int, b varchar(10));
insert into t (a, b) values (123, 'ABC'); -- succeeds
insert into t (a, b) values (a=123, b='ABC'); -- fails!
error: column "a" does not exist
See the Fiddle.

unique date field postgresql default value

I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PosgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
If I add an additional unique constraint only on the number field, it checks only number and not date and so I cannot have two numbers with different dates.
I could select a default date that is a 'valid date' (but outside working scope) to get around this, and could (in fact) get away with that for the current project, but there are actually cases I might be encountering in the next few years where it will not in fact be evident that the date is a non-real date just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be an unique constraint for not null dates and another one for null dates effectively turning the (number, date) tuples into distinct values.
Check partial index
It's not a best practice, but you can do it such way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;

POSTGRES COALESCE(NULLIF(c.name,''), 'unassigned') not working

No version of a character varying field will return anything but NULL if there isn't a value... it's really frustrating
CASE WHEN COALESCE(NULLIF(e.name,''),'unassigned') IS NULL THEN 'unassigned' ELSE a.name END
was my final test and it still simply returns NULL unless the field has a value
it's character varying(255)
COALESCE(a.name,'unassigned') // won't work
NULLIF(a.name,'') // won't work
NULLIF(a.name,NULL) // won't work
COALESCE(NULLIF(a.name,''),'unassigned') // won't work
however the instant i use 0 it works..
what's up with that?
it's a character varying(255) field and it is set to default to null
as a matter of point the build of the table column is
name varying character(255) DEFAULT(NULL)
so I know it's entering NULL
and I've already done a
SELECT * FROM <tbl> WHERE name IS NULL; and of course, I return all the NULL rows with a.name... so what's the deal with this?
ok... to everyone deciding to answer me with:
COALESCE(NULLIF(e.name,''),'unassigned') IS NULL...
This method will never work on a return of "no records" and as this is a stored procedure which creates a materialized view, where I'm polling via nested queries - where it is possible for a column to have 0 (as the default id = other_id) the nested query would simply return no rows. When no row is returned, the functions of COALESCE or NULLIF would never execute. A row would have to be returned in order for those functions to act upon the row values... As I've never heard of a table with a PK auto-incremented field starting at 0 and generally starting at 1 the resultset of "no records returned" will always return a NULL value into the materialized view column.
The query I run afterward to poll rows from that materialized view will however function properly as COALESCE(etc etc) because now there is an actual NULL value in that column.

How to insert DB Default values under EF?

when adding a new record like this
ContentContacts c2 = new ContentContacts();
c2.updated_user = c2.created_user = loggedUserId;
c2.created_date = c2.updated_date = DateTime.UtcNow;
db.ContentContacts.AddObject(c2);
I'm getting
Cannot insert the value NULL into column 'main_email_support', table 'SQL2008R2.dbo.ContentContacts'; column does not allow nulls. INSERT fails. The statement has been terminated.
but the default value in the database is an empty string like:
why am I getting such error? shouldn't the EF says something like:
"ohh, it's a nullvalue, so let's add the column default value instead"
I did a small test, created a table with a column that had a default value but did not allow nulls.
Then this SQL Statement:
INSERT INTO [Test].[dbo].[Table_1]
([TestText])
VALUES
(null)
Gives this error:
Msg 515, Level 16, State 2, Line 1
Cannot insert the value NULL into column 'TestText', table
'Test.dbo.Table_1'; column does not allow nulls. INSERT fails.
The problem here is that the insert specifies all the columns, also those with default values. The the Insert tries to update the columns with null values.
You have 2 options:
Update the table through a view, which does not contain the default columns
Set the default values in your C# code
A default value is business logic, so there is a case for it being set in the business layer of your application.