How to extract a timestamp from a MongoDB ObjectId in Postgres

In MongoDB you can retrieve the date from an ObjectId using the getTimestamp() function. How can I retrieve the date from a MongoDB ObjectId using PostgreSQL (e.g., in the case where such an ObjectId is stored in a Postgres database)?
Example input:
507c7f79bcf86cd7994f6c0e
Wanted output:
2012-10-15T21:26:17Z

According to the MongoDB documentation, the first 4 bytes of an ObjectId are a timestamp, and the ObjectId is normally written as a hexadecimal string. Assuming that hexadecimal value is stored as a string in PostgreSQL, the following query takes the first 8 characters of the ObjectId, converts them to an integer (seconds since 1970-01-01), and then converts that integer to a timestamp. For example:
SELECT TO_TIMESTAMP(int_val) ts_val
FROM (
SELECT ('x' || lpad(left(objectid,8), 8, '0'))::bit(32)::int AS int_val
FROM (
VALUES ('507c7f79bcf86cd7994f6c0e')
) AS t1(objectid)
) AS t2
;
Converting a hexadecimal string to integer is discussed here:
Convert hex in text representation to decimal number
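As a cross-check outside the database, the same arithmetic can be sketched in Python. This is only an illustration of the steps described above (take 8 hex characters, parse as an integer, treat as seconds since the Unix epoch); the function name objectid_timestamp is made up for this example.

```python
from datetime import datetime, timezone

def objectid_timestamp(objectid: str) -> datetime:
    """Return the creation time encoded in a 24-character hex ObjectId."""
    # The first 8 hex characters (4 bytes) hold seconds since 1970-01-01 UTC.
    seconds = int(objectid[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

ts = objectid_timestamp("507c7f79bcf86cd7994f6c0e")
print(ts.strftime("%Y-%m-%dT%H:%M:%SZ"))  # 2012-10-15T21:26:17Z
```

The printed value matches the wanted output from the question, which suggests the SQL query and this sketch agree on the byte layout.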

The first answer is quite excellent. This one expands the answer by making a reusable function out of it.
create function extractMongoTimestamp(text) RETURNS TIMESTAMP WITH TIME ZONE
as
'SELECT TO_TIMESTAMP(int_val) ts_val
FROM (
SELECT (''x'' || lpad(left(objectid,8), 8, ''0''))::bit(32)::int AS int_val
FROM (
VALUES ($1)
) AS t1(objectid)
) AS t2'
language sql
immutable
RETURNS null on null input;
Use it in your query:
select extractMongoTimestamp('507c7f79bcf86cd7994f6c0e');

Related

In Amazon Redshift, I have defined the service_date column with the date data type, but dates in the IN operator work only when they are quoted

select *
from nsclc_thought_spot
where patientid = 11000001
and service_date in ('2019-07-08', '2019-07-10')
order by patientid, service_date
is returning the results properly
But this is not working as expected:
select *
from nsclc_thought_spot
where patientid = 11000001
and service_date in (2019-07-08, 2019-07-10)
order by patientid, service_date
This query is not returning results.
If I have defined service_date column as date, then why do I have to pass the values in quotes inside IN operator in redshift?
Because 2019-07-08 means the integer 2019 minus the integer 7 minus the integer 8, which equals the integer 2004. Without quotes, SQL treats numbers as numeric values. To have them interpreted as something else, you need to quote them (which makes them text values), and then they need to be cast to the required data type. In this case '2019-07-08' is a text value, but Redshift implicitly casts it to a date in order to compare it with the service_date column.
If you want to do this explicitly you can add the casting to the values - ... service_date IN ('2019-07-08'::date,'2019-07-10'::date) ... - which might make things clearer for you.
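The difference between the unquoted and the quoted form can be illustrated with a small Python sketch. It is not Redshift code, only the same arithmetic and the same text-to-date conversion; the variable names are chosen for this example.

```python
from datetime import date

# Unquoted, 2019-07-08 is arithmetic on integers: 2019 minus 7 minus 8.
unquoted = 2019 - 7 - 8

# Quoted, the value is text that can be converted to a date.
quoted = date.fromisoformat("2019-07-08")

print(unquoted)  # 2004
print(quoted)    # 2019-07-08
```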

How to convert a substring to an integer?

I have varchar data that I want to convert to an integer so that I can use the number to order my data. This is my varchar:
No.SKF.4-04/2021/CBO-ODSP
No.SKF.5-04/2021/CBO-ODSP
No.SKF.6-04/2021/CBO-ODSP
I want to extract the number so I can order the data by it:
SELECT varchar from account_information order by CAST(substring(left("NO_SURAT", "length"("NO_SURAT")-17),8)as integer)
but it shows an error:
SELECT CAST(substring(left("NO_SURAT", "length"("NO_SURAT")-17),8)as integer) from account_information
ERROR: invalid input syntax for type integer: ""
How do I convert the substring result to int?
Avoiding an expensive regular expression, you could use:
SELECT CAST(
substr(
'No.SKF.5-04/2021/CBO-ODSP',
8,
position('-' IN 'No.SKF.5-04/2021/CBO-ODSP') - 8
) AS integer
);
substr
════════
5
(1 row)
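The same slicing logic can be sketched in Python to see why it also copes with numbers of more than one digit. The function name surat_number and the sample value with the number 10 are made up for this illustration; only the first three values come from the question.

```python
def surat_number(no_surat: str) -> int:
    """Return the number between the 'No.SKF.' prefix and the first '-'."""
    # SQL positions are 1-based, so substr(..., 8, n) starts at Python index 7.
    dash = no_surat.index("-")  # 0-based index of the first '-'
    return int(no_surat[7:dash])

values = [
    "No.SKF.10-04/2021/CBO-ODSP",  # hypothetical two-digit example
    "No.SKF.6-04/2021/CBO-ODSP",
    "No.SKF.4-04/2021/CBO-ODSP",
    "No.SKF.5-04/2021/CBO-ODSP",
]
for value in sorted(values, key=surat_number):
    print(value)
```

Sorting by the extracted integer puts 10 after 6, whereas sorting the raw strings would put it first.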

Convert a bytea into a binary string

I need to decode a base64 string and take a chunk of binary.
Is there a SQL function in Postgres to simply convert a bytea into a binary string representation?
(Like "00010001010101010".)
If your Postgres installation runs with the default setting bytea_output = 'hex', there is a very simple hack:
SELECT right(bytea_col::text, -1)::varbit;
Example:
SELECT right((bytea '\xDEADBEEF')::text, -1)::varbit;
Result:
'11011110101011011011111011101111'
right(text, -1) is just the cheapest way to remove the leading backslash from the text representation.
varbit (standard SQL name bit varying) is for bit strings of arbitrary length. Cast the result to text or varchar if you like.
Related, with explanation:
Convert hex in text representation to decimal number
demo:db<>fiddle
You could put the following code into a function:
WITH byte AS ( -- 1
SELECT E'\\xDEADBEEF'::bytea as value
)
SELECT
string_agg( -- 5
get_byte(value, gs)::bit(8)::text -- 4
, ''
)
FROM
byte,
generate_series( -- 3
0,
length(value) - 1 -- 2
) gs
I demonstrated the development of the query within the fiddle.
1. The WITH clause encapsulates the bytea value so it can be used twice in the rest of the query.
2. length() calculates the binary length of the bytea value.
3. generate_series() creates a list from 0 to length - 1 (0 to 3 in my example).
4. get_byte() takes the bytea value a second time and returns the byte at position gs (the previously calculated values 0 to 3). This gives an integer representation of the byte. The cast to bit(8) then converts that integer to its binary representation (1 byte = 8 bits).
5. string_agg() finally aggregates all the binary strings into one (using their text representations instead of the bit type, with no delimiter).
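The steps above can be mirrored in a few lines of Python, which may help when checking the expected result of the query. This is a sketch of the same idea (one 8-bit group per byte, concatenated), not part of the original answer; the name bytes_to_bits is made up here.

```python
def bytes_to_bits(value: bytes) -> str:
    """Concatenate the 8-bit binary form of every byte, most significant bit first."""
    # format(byte, "08b") plays the role of get_byte(...)::bit(8)::text.
    return "".join(format(byte, "08b") for byte in value)

print(bytes_to_bits(bytes.fromhex("DEADBEEF")))  # 11011110101011011011111011101111
```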
A function could look like this:
CREATE OR REPLACE FUNCTION to_bit(value bytea) RETURNS SETOF text AS
$$
BEGIN
RETURN QUERY
SELECT
string_agg(get_byte(value, gs)::bit(8)::text, '')
FROM
generate_series(0, length(value) - 1) gs;
END;
$$ LANGUAGE plpgsql;
After that you could call it:
SELECT to_bit(E'\\xDEADBEEF'::bytea)
You could also try get_bit() instead of get_byte(). This saves you the ::bit(8) cast, but you need to multiply the length by 8.
The resulting bit string has a different bit order, because get_bit() numbers the bits within each byte starting from the least significant bit, but maybe it fits your use case better:
WITH byte AS (
SELECT E'\\xDEADBEEF'::bytea as value
)
SELECT
string_agg(get_bit(value, gs)::text, '')
FROM
byte,
generate_series(0, length(value) * 8 - 1) gs
demo:db<>fiddle
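To make the different bit order concrete, here is a Python sketch of the get_bit() variant. It assumes, as I understand the PostgreSQL documentation, that bit 0 is the least significant bit of the first byte; the function name bits_lsb_first is made up for this example.

```python
def bits_lsb_first(value: bytes) -> str:
    """Emit the bits of each byte starting from its least significant bit."""
    # Bit n lives in byte n // 8, at position n % 8 counted from the least significant bit.
    return "".join(
        str((value[n // 8] >> (n % 8)) & 1) for n in range(len(value) * 8)
    )

print(bits_lsb_first(bytes.fromhex("DEADBEEF")))
```

Under that assumption each group of 8 bits is the mirror image of the corresponding group produced by the get_byte() version.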

PostgreSQL - sort by UUID version 1 timestamp

I am using UUID version 1 as the primary key. I would like to sort on UUID v1 timestamp. Right now if I do something like this:
SELECT id, title
FROM table
ORDER BY id DESC;
PostgreSQL does not sort records by UUID timestamp, but by UUID string representation, which ends up with unexpected sorting result in my case.
Am I missing something, or is there no built-in way to do this in PostgreSQL?
The timestamp is one of the parts of a v1 UUID. It is stored in hex format as the number of 100-nanosecond intervals since 1582-10-15 00:00. This function extracts the timestamp:
create or replace function uuid_v1_timestamp (_uuid uuid)
returns timestamp with time zone as $$
select
to_timestamp(
(
('x' || lpad(h, 16, '0'))::bit(64)::bigint::double precision -
122192928000000000
) / 10000000
)
from (
select
substring (u from 16 for 3) ||
substring (u from 10 for 4) ||
substring (u from 1 for 8) as h
from (values (_uuid::text)) s (u)
) s
;
$$ language sql immutable;
select uuid_v1_timestamp(uuid_generate_v1());
uuid_v1_timestamp
-------------------------------
2016-06-16 12:17:39.261338+00
122192928000000000 is the number of 100-nanosecond intervals between the start of the Gregorian calendar (1582-10-15) and the Unix epoch (1970-01-01).
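The field reordering and the constant can be cross-checked with a short Python sketch. It follows the same steps as the SQL function (time_hi, then time_mid, then time_low, minus the offset), but it is only an illustration; the Python function and the constant name GREGORIAN_TO_UNIX_TICKS are assumptions of this example.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond intervals between 1582-10-15 and 1970-01-01.
GREGORIAN_TO_UNIX_TICKS = 122192928000000000

def uuid_v1_timestamp(u: uuid.UUID) -> datetime:
    """Return the UTC time encoded in a version 1 UUID (microsecond precision)."""
    text = str(u)
    # Same reordering as the SQL function: time_hi (3 digits), time_mid, time_low.
    ticks = int(text[15:18] + text[9:13] + text[0:8], 16)
    micros = (ticks - GREGORIAN_TO_UNIX_TICKS) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=micros)

# The constant can be re-derived from the two calendar dates.
days = (datetime(1970, 1, 1) - datetime(1582, 10, 15)).days
assert days * 86400 * 10**7 == GREGORIAN_TO_UNIX_TICKS

print(uuid_v1_timestamp(uuid.uuid1()))
```

Running it on a freshly generated uuid1() should print a time close to the current UTC time.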
In your query:
select id, title
from t
order by uuid_v1_timestamp(id) desc
To improve performance an index can be created on that:
create index uuid_timestamp_ndx on t (uuid_v1_timestamp(id));

Extract year from date within WHERE clause

I need to use the EXTRACT() function within a WHERE clause as follows:
SELECT * FROM my_table WHERE EXTRACT(YEAR FROM date) = '2014';
I get a message like this:
pg_catalog.date_part(unknown, text) doesn't exist
SQL State 42883
Here is my_table content (gid INTEGER, date DATE):
gid | date
-------+-------------
1 | 2014-12-12
2 | 2014-12-08
3 | 2013-17-15
I have to do it this way because the query is sent from a form on a website that includes a 'Year' field where users enter the year as 4 digits.
The problem is that your column is of data type text, while EXTRACT() only works for date / time types.
You should convert your column to the appropriate data type.
ALTER TABLE my_table ALTER COLUMN date TYPE date USING "date"::date;
That's smaller (4 bytes instead of 11 for the text), faster and cleaner (disallows illegal dates and most typos).
There is no automatic cast from text to date, so the USING clause is needed. If you have a non-standard format, use a conversion expression such as to_date() in the USING clause. Example:
Alter character field to date
Also, for your queries to be fast with a plain index on date you should rather use sargable predicates. Like:
SELECT * FROM my_table
WHERE date >= '2014-01-01'
AND date < '2015-01-01';
Or, to go with your 4-digit input for the year:
SELECT * FROM my_table
WHERE date >= to_date('2014', 'YYYY')
AND date < to_date('2015', 'YYYY');
You could also be more explicit:
to_date('2014' || '0101', 'YYYYMMDD')
Both produce the same date '2014-01-01'.
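The half-open range behind those predicates can be sketched in Python. The helper name year_bounds and the sample dates are made up for this illustration; the point is only that a 4-digit year from the form turns into a start date (inclusive) and an end date (exclusive).

```python
from datetime import date

def year_bounds(year_text: str) -> tuple:
    """Turn a 4-digit year string into (first day of year, first day of next year)."""
    year = int(year_text)
    return date(year, 1, 1), date(year + 1, 1, 1)

start, end = year_bounds("2014")

# Hypothetical rows; the comparison mirrors date >= start AND date < end.
rows = [date(2014, 12, 12), date(2014, 12, 8), date(2013, 7, 15)]
matches = [d for d in rows if start <= d < end]
print(start, end, len(matches))  # 2014-01-01 2015-01-01 2
```

Because the comparison is made on the bare column value, a plain index on the column can be used, which is what makes the SQL predicates sargable.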
Aside: date is a reserved word in standard SQL and a basic type name in Postgres. Don't use it as identifier.
This happens because the column has a text or varchar type, as opposed to date or timestamp. This is easily reproducible:
SELECT 1 WHERE extract(year from '2014-01-01'::text)='2014';
yields this error:
ERROR: function pg_catalog.date_part(unknown, text) does not exist
LINE 1: SELECT 1 WHERE extract(year from '2014-01-01'::text)='2014';
                       ^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
extract, or its underlying function date_part, does not exist for text-like data types, but it's not needed anyway. Extracting the year from this date format is equivalent to taking the first 4 characters, so your query would be:
SELECT * FROM my_table WHERE left(date,4)='2014';