Insert null values into a PostgreSQL timestamp data type using Python

I am trying to insert a null value into a Postgres timestamp column using Python psycopg2.
The problem is that other data types such as char or int accept None, whereas the timestamp column does not recognize None.
I tried inserting NULL and null as strings, because I am using a dictionary to build the values for the insert statement.
Below is the code:
queryDictOrdered[column] = queryDictOrdered[column] if isNull(queryDictOrdered[column]) is False else 'NULL'
and the function is
def isNull(key):
    if str(key).lower() in ('null', 'n.a', 'none'):
        return True
    else:
        return False
I get the below error messages:
DataError: invalid input syntax for type timestamp: "NULL"
DataError: invalid input syntax for type timestamp: "None"

Empty timestamps in pandas DataFrames come through as NaT (not a time), which is not compatible with Postgres NULL. A quick workaround is to send the column as varchar and then run these two queries:
update <<schema.table_name>> set <<column_name>> = Null where <<column_name>> = 'NULL';
or (depending on what you hard coded empty values as)
update <<schema.table_name>> set <<column_name>> = Null where <<column_name>> = 'NaT';
Finally run:
alter table <<schema.table_name>>
alter COLUMN <<column_name>> TYPE timestamp USING <<column_name>>::timestamp without time zone;

Surely you are adding quotes around the placeholder. Read the psycopg documentation about passing parameters to queries.
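For what it's worth, a minimal psycopg2 sketch (the table and column names are invented for illustration): leave the %s placeholders unquoted and pass Python None, which psycopg2 adapts to SQL NULL, so a timestamp column accepts it.
import psycopg2

conn = psycopg2.connect("dbname=test")  # connection string is a placeholder
cur = conn.cursor()

joined = None  # Python None is adapted to SQL NULL
cur.execute(
    "INSERT INTO users (name, joined) VALUES (%s, %s)",  # note: no quotes around %s
    ("bob", joined),
)
conn.commit()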

Dropping this here in case it's helpful for anyone.
Using psycopg2 and the cursor object's copy_from method, you can copy missing or NaT datetime values from a pandas DataFrame to a Postgres timestamp field.
The copy_from method has a null parameter that is a "textual representation of NULL in the file. The default is the two characters string \N". See the psycopg2 documentation for more information.
Using pandas' fillna method, you can replace any missing datetime values with \N via data["my_datetime_field"].fillna("\\N"). Notice the double backslash here, where the first backslash is necessary to escape the second backslash.
Using the select_columns method from the pyjanitor module (or .loc[] and some subsetting with the column names of your DataFrame), you can coerce multiple columns at once via something akin to this, where all of your datetime fields end with an _at suffix.
data_datetime_fields = \
    (data
     .select_columns("*_at")
     .apply(lambda x: x.fillna("\\N")))
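Putting it together, a rough sketch (the DataFrame, table name, and connection string are assumptions) that fills NaT with \N, writes the frame to an in-memory buffer, and copies it with null="\\N":
import io

import pandas as pd
import psycopg2

# Hypothetical frame and table; the NaT marks the missing timestamp.
data = pd.DataFrame({
    "name": ["bob", "jane"],
    "created_at": [pd.Timestamp("2013-10-02 15:27:44"), pd.NaT],
})

# Replace NaT with the literal \N that copy_from treats as NULL.
# (Casting to object first keeps newer pandas versions from complaining
# about filling a datetime column with a string.)
data["created_at"] = data["created_at"].astype(object).fillna("\\N")

buf = io.StringIO()
data.to_csv(buf, sep="\t", header=False, index=False)
buf.seek(0)

conn = psycopg2.connect("dbname=test")  # connection string is a placeholder
cur = conn.cursor()
cur.copy_from(buf, "my_table", columns=("name", "created_at"), null="\\N")
conn.commit()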

Related

PostgreSQL - jsonb - How to get the datatype for value in query with jsonpath

In PostgreSQL, using a jsonb column, is there a way to select / convert an attribute to its actual datatype instead of getting it as a string object when using jsonpath? I would like to avoid casts as well as the -> and ->> style of constructs, since I have to select many attributes with very deep paths; I am trying to do it using jsonpath and * or ** in the path.
Is it possible to do it this way, or must I use -> and ->> for each node in the path? That would make the query look complicated, as I have to select about 35+ attributes with quite deep paths.
Also, how do we remove quotes from the selected value?
This is what I was trying, but it doesn't remove the quotes from the text value and it gives an error on the numeric one:
Select
    PolicyNumber AS "POLICYNUMBER",
    jsonb_path_query(payload, '$.**.ProdModelID')::text AS "PRODMODELID",
    jsonb_path_query(payload, '$.**.CashOnHand')::float AS "CASHONHAND"
from policy_json_table
The PRODMODELID still shows the quotes around the value, and when I add ::float to the second column, it gives an error:
SQL Error [22023]: ERROR: cannot cast jsonb string to type double precision
Thank you
When you try to directly cast the jsonb value to another datatype, postgres will attempt to first convert it to a json text and then parse that. See
How to convert Postgres json(b) to integer?
How to convert Postgres json(b) to float?
How to convert Postgres json(b) to text?
How to convert Postgres json(b) to boolean?
When you have strings in your JSON values, to avoid the quotes you'll need to extract them by using one of the json functions/operators returning text. In your case:
SELECT
    PolicyNumber AS "POLICYNUMBER",
    jsonb_path_query(payload, '$.**.ProdModelID') #>> '{}' AS "PRODMODELID",
    (jsonb_path_query(payload, '$.**.CashOnHand') #>> '{}')::float AS "CASHONHAND"
FROM policy_json_table
The jsonb_path_query function returns data with quotes (""), so you cannot cast this to integer or float. To cast the value to a number, you need the value without quotes.
You can use this SQL to get the value without quotes:
Select
    PolicyNumber AS "POLICYNUMBER",
    (payload->>'ProdModelID')::text AS "PRODMODELID",
    (payload->>'CashOnHand')::float AS "CASHONHAND"
from policy_json_table
If you need to use jsonb_path_query specifically, then you can trim these quotes:
Select
    PolicyNumber AS "POLICYNUMBER",
    jsonb_path_query(payload, '$.**.ProdModelID')::text AS "PRODMODELID",
    trim(jsonb_path_query(payload, '$.**.CashOnHand')::text, '"')::float AS "CASHONHAND"
from policy_json_table

Postgres: getting "... is out of range for type integer" when using NULLIF

For context, this issue occurred in a Go program I am writing using the default postgres database driver.
I have been building a service to talk to a postgres database which has a table similar to the one listed below:
CREATE TABLE object (
    id SERIAL PRIMARY KEY NOT NULL,
    name VARCHAR(255) UNIQUE,
    some_other_id BIGINT UNIQUE
    ...
);
I have created some endpoints for this item including an "Install" endpoint which effectively acts as an upsert function like so:
INSERT INTO object (name, some_other_id)
VALUES ($1, $2)
ON CONFLICT (name) DO UPDATE SET
    some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
I also have an "Update" endpoint with an underlying query like so:
UPDATE object
SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
WHERE name = $1
The problem:
Whenever I run the update query I always run into the error, referencing the field "some_other_id":
pq: value "1010101010144" is out of range for type integer
However, this error never occurs with the "upsert" version of the query, even when the row already exists in the database (when it has to evaluate the COALESCE statement). I have been able to prevent this error by updating the COALESCE statement to be as follows:
COALESCE(NULLIF($2, CAST(0 AS BIGINT)), object.some_other_id)
But as it never occurs with the first query, I wondered whether this inconsistency comes from me doing something wrong or something that I don't understand. Also, what is the best practice here: should I be casting all values?
I am definitely passing a 64-bit integer to the query for "some_other_id", and the first query works with the Go implementation even without the explicit type cast.
If any more information (or Go implementation) is required then please let me know, many thanks in advance! (:
Edit:
To eliminate confusion, the queries are being executed directly in Go code like so:
res, err := s.db.ExecContext(ctx,
    `UPDATE object SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id) WHERE name = $1`,
    "a name",
    1010101010144,
)
Both queries are executed in exactly the same way.
Edit: Also corrected parameter (from $51 to $2) in my current workaround.
I would also like to take this opportunity to note that the query does work with my proposed fix, which suggests that the issue is me confusing Postgres about types in the NULLIF statement. There is no stored procedure asking for an INTEGER argument in between my code and the database, at least none that I have written.
This has to do with how the postgres parser resolves types for the parameters. I don't know how exactly it's implemented, but given the observed behaviour, I would assume that the INSERT query doesn't fail because it is clear from (name,some_other_id) VALUES ($1,$2) that the $2 parameter should have the same type as the target some_other_id column, which is of type int8. This type information is then also used in the NULLIF expression of the DO UPDATE SET part of the query.
You can also test this assumption by using (name) VALUES ($1) in the INSERT and you'll see that the NULLIF expression in DO UPDATE SET will then fail the same way as it does in the UPDATE query.
So the UPDATE query fails because there is not enough context for the parser to infer the accurate type of the $2 parameter. The "closest" thing that the parser can use to infer the type of $2 is the NULLIF call expression; specifically, it uses the type of the second argument of the call expression, i.e. 0, which is of type int4, and it then uses that type information for the first argument, i.e. $2.
To avoid this issue, you should use an explicit type cast with any parameter where the type cannot be inferred accurately. i.e. use NULLIF($2::int8, 0).
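If you want to verify this reasoning outside the Go program, here is a rough sketch using PREPARE/EXECUTE so that the server, rather than the client, resolves the parameter types; it assumes the object table above, and the connection details are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=test")  # connection string is a placeholder
cur = conn.cursor()

# Without declared parameter types, $2 is inferred as integer from the
# literal 0 inside NULLIF, so a bigint value overflows at EXECUTE time.
cur.execute("""
    PREPARE upd_plain AS
    UPDATE object
    SET some_other_id = COALESCE(NULLIF($2, 0), object.some_other_id)
    WHERE name = $1
""")
# cur.execute("EXECUTE upd_plain('a name', 1010101010144)")  # fails: integer out of range

# With the explicit cast, $2 resolves to bigint and the large value is accepted.
cur.execute("""
    PREPARE upd_cast AS
    UPDATE object
    SET some_other_id = COALESCE(NULLIF($2::int8, 0), object.some_other_id)
    WHERE name = $1
""")
cur.execute("EXECUTE upd_cast('a name', 1010101010144)")
conn.commit()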
COALESCE(NULLIF($51, CAST(0 AS BIGINT)), object.some_other_id)
Fifty-one? Really?
pq: value "1010101010144" is out of range for type integer
Pay attention: the data type in the error message is integer, not bigint.
I think the reason for the error lies outside the code shown. So I take out a magic crystal ball and make a pass with my hands.
an "Install" endpoint which effectively acts as an upsert function like so
I also have an "Update" endpoint
By "endpoint", do you mean a PostgreSQL function (stored procedure)? I think yes.
Also, $1 and $2 look like PostgreSQL function arguments.
The magic crystal ball says: you have two PostgreSQL functions with different argument data types:
The "Install" endpoint has its $2 argument as a bigint data type. It looks like CREATE FUNCTION Install(VARCHAR(255), bigint).
The "Update" endpoint has its $2 argument as an integer data type, not bigint. It looks like CREATE FUNCTION Update(VARCHAR(255), integer).
Finally, I would rewrite your condition in a more understandable way:
UPDATE object
SET some_other_id =
    CASE
        WHEN $2 = 0 THEN object.some_other_id
        ELSE $2
    END
WHERE name = $1

psycopg2.DataError: invalid input syntax for integer: ""

I am trying to insert data into a table. When I try to insert an empty string into a Textfield, I get the invalid input syntax for integer error message.
Other Textfields work fine with an empty string.
My code:
cur_p.execute("""
    INSERT INTO a_recipient (created, mod, agreed, address, honor)
    VALUES (current_timestamp, current_timestamp, current_timestamp, %s, %s)""", (None, None))
psycopg2.DataError: invalid input syntax for integer: ""
LINE 35: ... '', ''..
The code works fine if I remove the last current_timestamp from the values, as well as the agreed column, but if I put it back, the error message reappears.
I checked other threads opened here on SO; I found a similar one, but that problem was about the values in an array (invalid input error: integer).
Any advice?
So there appear to be a few issues here.
First, in your INSERT INTO you name five columns (created, mod, etc.), but in your VALUES clause you only have two parameter placeholders (%s, %s).
I don't know what the data types of your columns are, but the error may be because you're trying to insert empty strings '' into an integer field. Try using None instead of the empty strings. Psycopg2 converts Python None objects to SQL NULL.
I also don't think you need the trailing comma after "honor".
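As a rough illustration (guessing that honor is the integer column; adjust to your actual schema, the other columns are omitted for brevity), the difference between '' and None looks like this:
import psycopg2

conn = psycopg2.connect("dbname=test")  # connection string is a placeholder
cur = conn.cursor()

# An empty string is rejected by an integer column, because '' is not a
# valid integer literal:
#   cur.execute("INSERT INTO a_recipient (address, honor) VALUES (%s, %s)", ("", ""))
#   -> DataError: invalid input syntax for integer: ""
# None is adapted to SQL NULL and is accepted by any nullable column:
cur.execute("INSERT INTO a_recipient (address, honor) VALUES (%s, %s)", ("", None))
conn.commit()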

Load NULL TIMESTAMP with TIME ZONE using COPY FROM in PostgreSQL

I have a CSV file that I'm trying to load into a PostgreSQL 9.2.4 database using the COPY FROM command. In particular, there is a timestamp field that is allowed to be null; however, when I load "null values" (actually just "") I get the following error:
ERROR: invalid input syntax for type timestamp with time zone: ""
An example CSV file looks as follows:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",""
The SQL looks as follows:
CREATE TABLE "users"
(
"id" BIGSERIAL NOT NULL PRIMARY KEY,
"name" VARCHAR(255),
"joined" TIMESTAMP WITH TIME ZONE,
);
COPY "users" ("id", "name", "joined")
FROM '/path/to/data.csv'
WITH (
ENCODING 'utf-8',
HEADER 1,
FORMAT 'csv'
);
According to the documentation, null values should be represented by an empty string that cannot contain the quote character, which is double quote (") in this case:
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
Note: When using COPY FROM, any data item that matches this string will be stored as a null value, so you should make sure that you use the same string as you used with COPY TO.
I've tried the option NULL '' but that seems to have no effect. Advice, please!
An empty string without quotes works normally:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",
select * from users;
id | name | joined
----+------+------------------------
1 | bob | 2013-10-03 03:27:44+07
2 | jane |
Maybe it would be simpler to replace "" with an empty string using sed.
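If sed isn't handy, a small Python sketch with the csv module does the same thing (the file names are placeholders): rewriting with the default minimal quoting drops the quotes around empty fields, so COPY in CSV format loads them as NULL.
import csv

# Rewrite the CSV so empty fields come out unquoted; COPY ... FORMAT csv
# then treats them as NULL. File names are placeholders.
with open("data.csv", newline="") as src, \
     open("data_fixed.csv", "w", newline="") as dst:
    writer = csv.writer(dst)  # default QUOTE_MINIMAL leaves empty fields bare
    for row in csv.reader(src):
        writer.writerow(row)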
The FORCE_NULL option for COPY FROM in Postgres 9.4+ would be the most elegant way to solve your problem. Per documentation:
FORCE_NULL
Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to NULL. In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.
Of course, it converts all matching values in all columns.
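For the table in the question, the COPY could look roughly like this; here it is driven from psycopg2 with copy_expert, though the equivalent COPY ... FROM '/path/to/data.csv' works server-side as well (connection details are placeholders).
import psycopg2

conn = psycopg2.connect("dbname=test")  # connection string is a placeholder
cur = conn.cursor()

# FORCE_NULL ("joined") turns the quoted empty string "" in that column into NULL.
with open("/path/to/data.csv") as f:
    cur.copy_expert(
        """COPY "users" ("id", "name", "joined")
           FROM STDIN
           WITH (FORMAT csv, HEADER true, FORCE_NULL ("joined"))""",
        f,
    )
conn.commit()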
In older versions, you can COPY to a temporary table with the same table layout, except with data type text for the problem column. Then fix offending values and INSERT from there:
single quotes appear arround value after running copy in postgres 9.2
Could not get it to work. Ended up using this program:
http://neilb.bitbucket.org/csvfix/
With that you can replace empty fields with other values.
So, for example, in your case column 3 needs to have a timestamp value, so I give it a fake one, in this case '1900-01-01 00:00:00'. If needed, you can delete or filter these rows out once the data is imported.
$CSVFIXHOME/csvfix map -f 3 -fv '' -tv '1900-01-01 00:00:00' -rsep ',' $YOURFILE > $FILEWITHDATES
After that you can import the newly created file.

PostgreSQL: Select where timestamp is empty

I have a query that looks like the following:
SELECT * FROM table WHERE timestamp = NULL;
The timestamp column is a timestamp with time zone data type (second type in this table). This is in PostgreSQL 8.4.
What I'm trying to accomplish is to select only the rows that have not had a timestamp inserted. When I look at the data in pgAdmin the field is empty and shows no value. I've tried WHERE timestamp = NULL, 'EPOCH' (which you would think would be the default value), a valid timestamp of zeros (0000-00-00 00:00:00-00, which results in an out of range error), the lowest date possible according to the docs (January 1, 4713 BC), and a blank string ('', which just gets a data type mismatch error). There also appears to be no is_timestamp() function that I can use to check whether the result is a valid timestamp.
So, the question is, what value is in that empty field that I can check for?
Thanks.
EDIT: The field does not have a default value.
null in SQL means 'unknown'.
This means that the result of using any comparison operator, like =, with a null is also 'unknown'.
To check if a column is NULL (or not NULL), use the special syntax of IS NULL (or IS NOT NULL) instead of using =.
Applying that to your statement,
SELECT * FROM table WHERE timestamp IS NULL;
should work.