Find records where a timestamp is present - postgresql

I'm migrating from SQLite to PostgreSQL, and the following query is not working anymore:
where("my_timestamp is NOT NULL and my_timestamp != ''")
How can I find all records that have a certain (datetime) attribute present?

Assuming that your my_timestamp column is a real timestamp (i.e. t.datetime in ActiveRecord parlance), a simple NOT NULL test is sufficient:
where('my_timestamp is not null')
If this is the case then your query should be giving you an error like:
invalid input syntax for type timestamp: ""
pointing at your my_timestamp != '' test. Your comparison with an empty string worked fine in SQLite because SQLite doesn't have a real timestamp type; it just stores ISO 8601 formatted strings in text columns. That same data type looseness is also why you ended up with '' in your timestamp columns in SQLite in the first place.
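For illustration, here is a minimal sketch of the difference (the events table and my_timestamp column are hypothetical stand-ins for your model's table):

-- In PostgreSQL the '' literal has to be cast to timestamp, which fails:
-- ERROR: invalid input syntax for type timestamp: ""
SELECT * FROM events WHERE my_timestamp != '';

-- The NULL check on its own is all that is needed:
SELECT * FROM events WHERE my_timestamp IS NOT NULL;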

Related

Casting String type to Unix Date Amazon Athena

I'm looking to get a result in Amazon Athena where I can count the quantity of users created by day (or maybe by month).
But before that I have to convert the unix timestamp to another date format, and this is where I fail.
My last goal is to convert this type of timestamp:
1531888605109
In something like:
2018-07-18
According to Epoch Converter
But when I try to apply the solution I saw in this question: Casting unix time to date in Presto
I got the error:
[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 1:13: Unexpected parameters (varchar) for function from_unixtime. Expected: from_unixtime(double) , from_unixtime(double, bigint, bigint) , from_unixtime(double, varchar(x)) [SQL State=HY000, DB Errorcode=100071]
This is my query:
select cast(from_unixtime(created)as date) as date_creation,
count(created)
from datalake.test
group by date_creation
Maybe I have to cast over the string because the data type of the field is not a date.
My table description: Link to the table description
line 1:13: Unexpected parameters (varchar) for function from_unixtime. Expected: from_unixtime(double)
This means that your timestamps -- even though they appear numeric -- are varchars.
You need to add a CAST to cast(from_unixtime(created)as date), like:
CAST(from_unixtime(CAST(created AS bigint)) AS date)
Note: When dealing with time-related data, please keep in mind that https://github.com/prestosql/presto/issues/37 is not resolved yet in Presto.
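Applied to the query from the question (table and column names taken from there), the corrected statement would look roughly like this. Note that from_unixtime expects seconds, and the sample value 1531888605109 looks like milliseconds, hence the division by 1000; group by 1 groups by the first output column:

select cast(from_unixtime(cast(created as bigint) / 1000) as date) as date_creation,
count(created)
from datalake.test
group by 1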

Insert null values to postgresql timestamp data type using python

I am trying to insert a null value into a Postgres timestamp column using Python psycopg2.
The problem is that other data types such as char or int take None, whereas the timestamp column does not recognize None.
I tried to insert Null and null as strings, because I am using a dictionary to append the values for the insert statement.
Below is the code.
queryDictOrdered[column] = queryDictOrdered[column] if isNull(queryDictOrdered[column]) is False else NULL
and the function is
def isNull(key):
    if str(key).lower() in ('null', 'n.a', 'none'):
        return True
    else:
        return False
I get the below error messages:
DataError: invalid input syntax for type timestamp: "NULL"
DataError: invalid input syntax for type timestamp: "None"
Empty timestamps in pandas DataFrames come through as NaT (not a time), which PostgreSQL does not accept in place of NULL. A quick workaround is to send the column as a varchar and then run the following queries:
update <<schema.table_name>> set <<column_name>> = Null where
<<column_name>> = 'NULL';
or (depending on what you hard coded empty values as)
update <<schema.table_name>> set <<column_name>> = Null where <<column_name>> = 'NaT';
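If both placeholder spellings might be present, the two updates can also be collapsed into one (same placeholders as above):

update <<schema.table_name>> set <<column_name>> = Null where
<<column_name>> in ('NULL', 'NaT');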
Finally run:
alter table <<schema.table_name>>
alter COLUMN <<column_name>> TYPE timestamp USING <<column_name>>::timestamp without time zone;
Surely you are adding quotes around the placeholder. Read the psycopg documentation about passing parameters to queries.
Dropping this here in case it's helpful for anyone.
Using psycopg2 and the cursor object's copy_from method, you can copy missing or NaT datetime values from a pandas DataFrame to a Postgres timestamp field.
The copy_from method has a null parameter that is a "textual representation of NULL in the file. The default is the two characters string \N". See this link for more information.
Using pandas' fillna method, you can replace any missing datetime values with \N via data["my_datetime_field"].fillna("\\N"). Notice the double backslash here, where the first backslash is necessary to escape the second backslash.
Using the select_columns method from the pyjanitor module (or .loc[] and some subsetting with the column names of your DataFrame), you can coerce multiple columns at once via something akin to this, where all of your datetime fields end with an _at suffix.
data_datetime_fields = \
    (data
     .select_columns("*_at")
     .apply(lambda x: x.fillna("\\N")))

Load NULL TIMESTAMP with TIME ZONE using COPY FROM in PostgreSQL

I have a CSV file that I'm trying to load into a PostgreSQL 9.2.4 database using the COPY FROM command. In particular there is a timestamp field that is allowed to be null, however when I load "null values" (actually just "") I get the following error:
ERROR: invalid input syntax for type timestamp with time zone: ""
An example CSV file looks as follows:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",""
The SQL looks as follows:
CREATE TABLE "users"
(
  "id" BIGSERIAL NOT NULL PRIMARY KEY,
  "name" VARCHAR(255),
  "joined" TIMESTAMP WITH TIME ZONE
);
COPY "users" ("id", "name", "joined")
FROM '/path/to/data.csv'
WITH (
ENCODING 'utf-8',
HEADER 1,
FORMAT 'csv'
);
According to the documentation, null values should be represented by an empty string that cannot contain the quote character, which is double quote (") in this case:
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
Note: When using COPY FROM, any data item that matches this string will be stored as a null value, so you should make sure that you use the same string as you used with COPY TO.
I've tried the option NULL '' but that seems to have no effect. Advice, please!
An empty string without quotes works normally:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",
select * from users;
id | name | joined
----+------+------------------------
1 | bob | 2013-10-03 03:27:44+07
2 | jane |
Maybe it would be simpler to replace "" with an empty string using sed.
The FORCE_NULL option for COPY FROM in Postgres 9.4+ would be the most elegant way to solve your problem. Per documentation:
FORCE_NULL
Match the specified columns' values against the null string, even if
it has been quoted, and if a match is found set the value to NULL. In
the default case where the null string is empty, this converts a
quoted empty string into NULL. This option is allowed only in COPY
FROM, and only when using CSV format.
Of course, it converts all matching values in every column you list.
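For the table and file from the question, that would look something like this (assuming PostgreSQL 9.4 or later):

COPY "users" ("id", "name", "joined")
FROM '/path/to/data.csv'
WITH (
  ENCODING 'utf-8',
  HEADER 1,
  FORMAT 'csv',
  FORCE_NULL ("joined")
);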
In older versions, you can COPY to a temporary table with the same table layout - except data type text for the problem column. Then fix offending values and INSERT from there:
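A sketch of that approach, reusing the table and file from the question (users_tmp is a made-up name for the staging table):

CREATE TEMP TABLE users_tmp (id bigint, name varchar(255), joined text);

COPY users_tmp (id, name, joined)
FROM '/path/to/data.csv'
WITH (
  ENCODING 'utf-8',
  HEADER 1,
  FORMAT 'csv'
);

-- Empty strings become NULL on the way into the real table:
INSERT INTO users (id, name, joined)
SELECT id, name, NULLIF(joined, '')::timestamptz
FROM users_tmp;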
Related: single quotes appear around value after running copy in postgres 9.2
Could not get it to work. Ended up using this program:
http://neilb.bitbucket.org/csvfix/
With that you can replace empty fields with other values.
So, for example, in your case column 3 needs to have a timestamp value, so I give it a fake one, in this case '1900-01-01 00:00:00'. If needed you can delete or filter them out once the data is imported.
$CSVFIXHOME/csvfix map -f 3 -fv '' -tv '1900-01-01 00:00:00' -rsep ',' $YOURFILE > $FILEWITHDATES
After that you can import the newly created file.

I am getting an error "not valid month"

create table Department
(Dname varchar(255) NOT NULL, Dnumber int NOT NULL PRIMARY KEY, Mgr_SSN char(9) NOT NULL, Mgr_start_Date DATE);
insert into Department values('HR', '1', '11001', '2012-04-05 10:15:00');
I am getting the error "not valid month".
Should we define date format when we create the table?
I am using Oracle11g.
When you have a DATE column, you should always insert a DATE, not a VARCHAR2. Relying on implicit casting to correctly convert the string is a bad idea: it is very easy for different sessions to have different NLS settings and, thus, to do the implicit conversion differently (either resulting in a different DATE or an error). The easiest way to do that is to use the to_date function.
insert into Department( dname,
                        dnumber,
                        mgr_ssn,
                        mgr_start_date )
values( 'HR',
        1,
        '11001',
        to_date( '2012-04-05 10:15:00', 'yyyy-mm-dd hh24:mi:ss') );
I also modified the statement to list the columns, which is generally a good practice since it ensures that you don't have to look up the physical order of columns in the table every time and since it allows the INSERT statement to work in the future if you add new columns to the table. Since dnumber is a NUMBER, I also changed the INSERT statement to insert a number rather than inserting a string (again, don't rely on implicit conversion if there is no need to do so). I did not correct the apparent bug that you have a CHAR(9) column representing a social security number for which you are inserting a 5 character string.

PostgreSQL: Select where timestamp is empty

I have a query that looks like the following:
SELECT * FROM table WHERE timestamp = NULL;
The timestamp column is a timestamp with time zone data type (second type in this table). This is in PostgreSQL 8.4.
What I'm trying to accomplish is to only select rows that have not had a timestamp inserted. When I look at the data in pgAdmin the field is empty and shows no value. I've tried where timestamp = NULL, 'EPOCH' (which you would think would be the default value), a valid timestamp of zeros (0000-00-00 00:00:00-00, which results in an out of range error), the lowest date possible according to the docs (January 1, 4713 BC), and a blank string ('', which just gets a data type mismatch error). There also appears to be no is_timestamp() function that I can use to check whether the value is a valid timestamp.
So, the question is, what value is in that empty field that I can check for?
Thanks.
EDIT: The field does not have a default value.
NULL in SQL means 'unknown'.
This means that the result of using any comparison operator, like =, with a null is also 'unknown'.
To check if a column is NULL (or not NULL), use the special syntax of IS NULL (or IS NOT NULL) instead of using =.
Applying that to your statement,
SELECT * FROM table WHERE timestamp IS NULL;
should work.
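And, conversely, to find the rows where the timestamp is present (the situation in the first question above), negate the test:

SELECT * FROM table WHERE timestamp IS NOT NULL;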