How to insert raw bytes from file data using a plain text script - postgresql

Database: Postgres 9.1
I have a table called logos defined like this:
create type image_type as enum ('png');
create table logos (
    id      UUID primary key,
    bytes   bytea not null,
    type    image_type not null,
    created timestamp with time zone default current_timestamp not null
);
create index logo_id_idx on logos(id);
I want to be able to insert records into this table in two ways.
The first (and most common) way rows will be inserted is that a user will provide a PNG image file via an HTML file upload form. The server code processing the request will receive a byte array containing the PNG image data and insert a record into the table. There are plenty of examples on the internet of how to insert byte arrays into a PostgreSQL field of type bytea; this is an easy exercise. The insert code would look like this:
insert into logos (id, bytes, type, created) values (?, ?, ?, now())
And the bytes would be set with something like:
...
byte[] bytes = ... // read PNG file into a byte array.
...
ps.setBytes(2, bytes);
...
The second way rows will be inserted in the table will be from a plain text file script. This is needed only to populate test data for automated tests, or to initialize the database with a few records for a remote development environment.
Regardless of how the data is entered in the table, the application will obviously need to be able to select the bytea data from the table and convert it back into a PNG image.
Question
How does one properly encode a byte array, to be able to insert the data from within a script, in such a way that only the original bytes contained in the file are stored in the database?
I can write code to read the file and spit out insert statements to populate the script. But I don't know how to encode the byte array for the plain text script such that, when the script is run from psql, the image data stored is identical to what the setBytes JDBC code would have stored.
I would like to run the script with something like this:
psql -U username -d dataBase -a -f test_data.sql

The easiest way, IMO, to represent bytea data in an SQL file is to use the hex format:
8.4.1. bytea Hex Format
The "hex" format encodes binary data as 2 hexadecimal digits per byte, most significant nibble first. The entire string is preceded by the sequence \x (to distinguish it from the escape format). In some contexts, the initial backslash may need to be escaped by doubling it, in the same cases in which backslashes have to be doubled in escape format; details appear below. The hexadecimal digits can be either upper or lower case, and whitespace is permitted between digit pairs (but not within a digit pair nor in the starting \x sequence). The hex format is compatible with a wide range of external applications and protocols, and it tends to be faster to convert than the escape format, so its use is preferred.
Example:
SELECT E'\\xDEADBEEF';
Converting an array of bytes to hex should be trivial in any language that a sane person (such as yourself) would use to write the SQL file generator.
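For example, a generated line in test_data.sql for the logos table above might look like this sketch (the UUID is a made-up placeholder, and the payload is shortened to just the 8-byte PNG file signature):

insert into logos (id, bytes, type, created)
values (
    'b0f6a48c-2d61-4692-9a4e-14f1624e2a4a',  -- placeholder UUID
    '\x89504e470d0a1a0a',                    -- hex bytea literal (here: only the PNG signature)
    'png',
    now()
);

With standard_conforming_strings enabled (the default since PostgreSQL 9.1), the plain '\x...' literal works as-is; otherwise write it in the E'\\x...' form shown in the quoted example.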

Related

How to insert and select images in postgresql

Given a JSON file of data to be inserted into a table in PostgreSQL. One of the fields in the JSON structure is a base64 string. I know there is a data type in Postgres called BYTEA, but how would one insert the base64 string into that field?
I also need to then be able to select the base64 string from the table and display the image in the end.
Note: Bonus points for Golang solutions, because the whole app is in Golang.
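Whatever the client language, the decoding can be pushed into PostgreSQL itself: decode(..., 'base64') turns a base64 string into bytea, and encode(..., 'base64') converts it back for display. A minimal sketch, assuming a hypothetical images table:

create table images (id int primary key, data bytea);
-- 'iVBORw0KGgo=' is the base64 encoding of the 8-byte PNG file signature.
insert into images (id, data) values (1, decode('iVBORw0KGgo=', 'base64'));
select encode(data, 'base64') from images where id = 1;

So a Go program can bind the base64 string as an ordinary text parameter and let decode() do the work server-side.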

Convert a BLOB to VARCHAR instead of VARCHAR FOR BIT

I have a BLOB field in a table that I am selecting. This field data consists only of JSON data.
If I do the following:
Select CAST(JSONBLOB as VARCHAR(2000)) from MyTable
--> this returns the value in VARCHAR FOR BIT DATA format.
I just want it as a standard string or varchar - not in bit format.
That is because I need to use JSON2BSON function to convert the JSON to BSON. JSON2BSON accepts a string but it will not accept a VarChar for BIT DATA type...
This conversion should be easy.
I am able to select it as VARCHAR FOR BIT DATA, manually copy it using the UI, paste it into a select literal, and convert that to BSON. But I need to migrate a bunch of data in this BLOB from JSON to BSON, and doing it manually won't be fast. I just want to illustrate how simple a use case this should be.
What is the select command to essentially get this to work:
Select JSON2BSON(CAST(JSONBLOB as VARCHAR(2000))) from MyTable
--> Currently this fails because the CAST converts this (even though it's only text characters) to VARCHAR FOR BIT DATA and not standard VARCHAR.
What is the suggestion to fix this?
DB2 11 on Windows.
If the data is JSON, then the table column should be CLOB in the first place...
Having the table column a BLOB might make sense if the data is actually already BSON.
You could change the BLOB into a CLOB using the CONVERTTOCLOB procedure; then you should be OK.
https://www.ibm.com/support/knowledgecenter/SSEPGG_11.5.0/com.ibm.db2.luw.apdv.sqlpl.doc/doc/r0055119.html
You can use this function to remove the "FOR BIT DATA" flag on a column
CREATE OR REPLACE FUNCTION DB_BINARY_TO_CHARACTER(A VARCHAR(32672 OCTETS) FOR BIT DATA)
    RETURNS VARCHAR(32672 OCTETS)
    NO EXTERNAL ACTION
    DETERMINISTIC
BEGIN ATOMIC
    RETURN A;
END
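With that function in place, the failing statement from the question could be rewritten along these lines (a sketch, assuming CAST still yields VARCHAR FOR BIT DATA as described above):

Select JSON2BSON(DB_BINARY_TO_CHARACTER(CAST(JSONBLOB as VARCHAR(2000)))) from MyTable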
Or, if you are on Db2 11.5, the function SYSIBMADM.UTL_RAW.CAST_TO_VARCHAR2 will also work.

Bytea to actual text value in postgresql

I have a table to store file information in postgresql.
select id,filestream,name from Table_file_info
Here filestream is of type bytea. How do I get the bytea data back as the actual text (the content of my file) in PostgreSQL?
I tried with below query:
select encode(filestream, 'escape')::text as name from Table_file_info
but I am getting this:
ICAgICAgICAgc2FkZnNhZGZhZCBzZGRkZGRkZGRkIFRlc3R0dA==
actual content of my file is: sadfsadfad sddddddddd Testtt
It looks like base64, meaning your file was first converted to base64 and then converted to bytea (which is kind of pointless, since base64 is already text). So it needs to be decoded twice:
select encode(decode(encode(filestream,'escape'),'base64'),'escape') from Table_file_info;

Converting MS SQL Hash Text to Postgres Hash Text

I am working on migrating all our databases from MS SQL server to Postgres. In this process, I am working on writing equivalent code in Postgres to yield the same hashed texts obtained in MS SQL.
Following is my code in MS SQL:
DECLARE @HashedText nvarchar(50)
DECLARE @InputText nvarchar(50) = 'password'
DECLARE @HashedBytes varbinary(20) -- maximum size of SHA1 output
SELECT @HashedBytes = HASHBYTES('SHA1', @InputText)
SET @HashedText = CONVERT(nvarchar(50), @HashedBytes, 2)
SELECT @HashedText
This is yielding the value E8F97FBA9104D1EA5047948E6DFB67FACD9F5B73
Following is equivalent code written in Postgres:
DO
$$
DECLARE
    v_InputText VARCHAR = 'password';
    v_HashedText VARCHAR;
    v_HashedBytes BYTEA;
BEGIN
    SELECT ENCODE(DIGEST(v_InputText, 'SHA1'), 'hex')
    INTO v_HashedBytes;
    v_HashedText := CAST(v_HashedBytes AS VARCHAR);
    RAISE INFO 'Hashed Text: %', v_HashedText;
END;
$$;
This yields the value 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8.
After spending some time I understood that replacing the datatype 'NVARCHAR' with 'VARCHAR' in MS SQL yields the same result as Postgres.
Now the problem is that in MS SQL we already have passwords hashed and stored in the database as shown above. I am unable to convert the hashed text from MS SQL to the Postgres form, and also unable to generate the same hashed text in Postgres, as Postgres doesn't support UTF-16.
So, I just want to know if there is any possibility of either of the following solutions:
1. Convert the hexadecimal value generated in MS SQL to the hex value equivalent to the one generated using the VARCHAR datatype (which is the same value as in Postgres).
2. Convert UTF-8 text to UTF-16 text in Postgres (even by means of some extension) and generate hex values equivalent to those generated in MS SQL.
Let's look at your suggestions in turn:
Convert the hexadecimal value generated in MS SQL to the hex value equivalent to the one generated using the VARCHAR datatype (which is the same value as in Postgres)
This comes down to converting the user's password from UTF-16 to UTF-8 (or some other encoding) and re-hashing it. To do this, you need to know the user's password, which, theoretically, you don't - that's the point of hashing it in the first place.
In practice, you're using unsalted SHA1 hashes, for which large pre-computed tables exist, and for which brute force is feasible with a GPU-optimised algorithm. So a "grey hat" option would be to crack all your users' passwords and re-hash them.
If you do so, it would probably be sensible to re-hash them using a salt and better hash function, as well as converting them to UTF-8.
Convert UTF-8 text to UTF-16 text in Postgres (even by means of some extension) and generate hex values equivalent to those generated in MS SQL
This, theoretically, is simpler: you just need a routine to do the string conversion. However, as you've found, there is no built-in support for this in Postgres.
For any string composed entirely of ASCII characters, the conversion is trivial: append a NULL byte (hex 00) after every byte of the string, since SQL Server's nvarchar is UTF-16 little-endian. But this will break any password that used a character outside this range.
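As a sketch of that trivial ASCII-only case, a hypothetical helper for Postgres (assuming the pgcrypto extension for digest()) that widens the string to UTF-16LE and hashes it:

CREATE OR REPLACE FUNCTION sha1_utf16le_ascii(pwd text) RETURNS text AS
$$
    -- Widen each ASCII character to its two-byte UTF-16LE form
    -- (the character's byte first, then 00), then SHA1-hash the result.
    SELECT encode(
               digest(
                   decode(string_agg(
                              lpad(to_hex(ascii(substr(pwd, i, 1))), 2, '0') || '00',
                              '' ORDER BY i),
                          'hex'),
                   'sha1'),
               'hex')
    FROM generate_series(1, length(pwd)) AS i;
$$ LANGUAGE sql IMMUTABLE;

-- SELECT upper(sha1_utf16le_ascii('password'));
-- should reproduce the MS SQL value E8F97FBA9104D1EA5047948E6DFB67FACD9F5B73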
An alternative would be to move the responsibility for generating the hash out of the database:
1. Retrieve the stored hash from the DB.
2. Ensure the user's input is represented as UTF-16 in your application, calculate its hash, and compare it against the stored hash.
3. If they match, you can now generate a new hash (because you know the password the user just typed), using a better algorithm, and store that in the DB instead of the old hash.
Once all active users have logged in at least once, you will have no SHA1 hashes left, and can remove the support for them completely.

Import excel file into teradata using tpt

I am required to load an Excel file into a Teradata table which already has data in it. I have used the TPT Inserter operator to load data from CSV files. I am not sure how to directly load an Excel file using TPT Inserter.
When I tried providing the excel file with TextDelimiter='TAB', the parser threw an error:
data_connector: TPT19134 !ERROR! Fatal data error processing file 'd:\sample_data.csv'. Delimited Data Parsing error: Too few columns in row 1.
1) Could someone explain what options are required when directly importing an Excel file into Teradata?
2) How do I load a TAB-delimited file into Teradata using tptLoad / tptInserter?
The script that I have used is:
define job insert_data
description 'Load from Excel to TD table'
(
    define operator insert_operator
    type inserter
    schema *
    attributes
    (
        varchar logonmech='LDAP',
        varchar username='username',
        varchar userpassword='password',
        varchar tdpid='tdpid',
        varchar targettable='excel_to_table'
    );

    define schema upload_schema
    (
        quarter varchar(20),
        cust_type varchar(20)
    );

    define operator data_connector
    type dataconnector producer
    schema upload_schema
    attributes
    (
        varchar filename='d:\sample_data.xlsx',
        varchar format='delimited',
        varchar textdelimiter='TAB',
        varchar openmode='Read'
    );

    apply ('insert into excel_to_table(quarter, cust_type) values(:quarter, :cust_type);')
    to operator (insert_operator[1])
    select quarter, cust_type
    from operator (data_connector[1]);
);
Thanks!!
The script actually seems fine at first glance, apart from the fact that the error relates to delimited data while a .xlsx file is specified in the script. Are you sure that the specified file is tab-delimited?
Formats supported by TPT Dataconnector operator are:
Binary - Binary data fitting exactly in the defined Schema plus indicator bytes
Delimited - Easier for multiple column human readable files, limited to all varchar schema
Formatted - For working with data exported by Teradata TTUs
Text - For text files containing fixed width columns, also human readable, limited to all varchar schema
Unformatted - For working with data exported by Teradata TTUs
The original Excel data (in true xls or xlsx format) is not directly supported by native TPT operators. But if your data is really tab-delimited then this shouldn't be a problem; you should be able to load it.

An obvious point to consider when loading a delimited file is that Char or Varchar fields must not contain the delimiter within the data; you can escape delimiter characters in the data with a '\'. A more subtle point is that you cannot specify the TAB delimiter in lower case, i.e. varchar textdelimiter='TAB' works but varchar textdelimiter='tab' doesn't. Also, no control characters other than TAB can be specified as delimiters.
If you truly need to load Excel files then you may need to pre-process them into a loadable format such as delimited, binary, or text data; you can write separate code in any language to achieve this. Once that is done, the existing script needs only minor changes, as sketched below.
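For example, once the workbook is saved as a tab-separated text file (the name d:\sample_data.txt here is a placeholder), only the dataconnector attributes in the script above would need to change:

        varchar filename='d:\sample_data.txt',
        varchar format='delimited',
        varchar textdelimiter='TAB',
        varchar openmode='Read'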