How to use pgcrypto with Prisma - PostgreSQL

Looking at how to implement a hash+salt password-storage strategy in Node.js using bcrypt, I found this article, which suggests using the native PostgreSQL extension pgcrypto.
The Prisma docs have an example of using pgcrypto only for generating a random id, as a @default value in the Prisma schema.
I'm curious whether pgcrypto can be used with Prisma in this use case, since it's not a default value but a transformation applied to the value handed to the DB at the moment the record is created.

pgcrypto contains a lot of functions that are related in some way to cryptography. Your second link, about using gen_random_uuid, is a completely different topic (although still touching on cryptography) and has nothing useful to say about the subject of your question. Just forget that article and focus on the first one, the docs, and first principles of security.
I don't think there are any special 'gotchas' about using pgcrypto from Prisma. You just need to do it. (Or look for Prisma libraries that already do it for you.)
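For example, storing and verifying a bcrypt hash entirely in Postgres might look like the sketch below (the users table and its columns are made up for illustration; from Prisma you would typically send these statements through its raw-query API, $executeRaw / $queryRaw):

-- hash the incoming plaintext at insert time; gen_salt('bf') produces a bcrypt salt
INSERT INTO users (email, password_hash)
VALUES ('alice@example.com', crypt('plaintext-password', gen_salt('bf')));

-- verify a login attempt: crypt() re-hashes using the salt stored in password_hash
SELECT id FROM users
WHERE email = 'alice@example.com'
  AND password_hash = crypt('login-attempt', password_hash);

Because the hashing happens inside the SQL statement rather than as a column default, it is exactly the "transformation at record creation" you describe.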

Related

Postgres TDE capability only for specific schema

As part of a GDPR requirement we need to encrypt data at rest.
We are planning to use Postgres, and from the links below it looks like TDE can be achieved in Postgres as well.
https://www.enterprisedb.com/blog/postgres-and-transparent-data-encryption-tde
https://www.cybertec-postgresql.com/en/products/postgresql-transparent-data-encryption/
When we have multiple schemas in Postgres, is it possible to apply TDE to only a particular schema?
Unfortunately it is not possible to encrypt just one schema, because when you install PostgreSQL TDE you initialize the whole instance with the encryption key.
As one of the Cybertec developers explains:
There is a reason for this: if we allowed encryption on a per-table level (or per schema or per database, it doesn't matter), we would have to manage an unbounded number of keys. This is especially true during point-in-time recovery and the like. That is why we decided to do the encryption at the instance level: one key. The core advantage is that we can easily encrypt all parts of the instance, including the WAL, temp files, and so on (basically everything but the clog).
Don't expect this to change - go for full encryption. We can help you with that.
Cheers from Cybertec :) I hope you like the feature :)
Hans

How to use JPA to query encrypted data?

I am supposed to come up with a plan to encrypt our existing personal information and be able to retrieve it and show it back to the user through the application if required. This is neither PCI data nor passwords, so it requires a two-way journey, that is, both encryption and decryption (and, to keep things simple for the moment, no tokenization and no public/private-key encryption either), as well as the ability to query the information through JPA if required.
Here is what I have: application server JBoss 7, database PostgreSQL, Java EE 7, and JPA 2.
What have I thought of? Converting the existing data in Postgres is doable with pgcrypto, using
ALTER TABLE table_name ALTER column_name TYPE bytea USING pgp_sym_encrypt(column_name, key, [other options])
If there are any dependencies they can be handled, but there are not many. Decryption for searching or displaying would then be
SELECT pgp_sym_decrypt(column_name, key) AS column_name FROM table_name WHERE ...
The entity files can be handled by just changing their data type and so on.
Where am I stuck? The system currently uses JPA 2 (Hibernate implementation) for its queries, which take advantage of the fields being in plain text. If I update the database, the existing queries would fail; they would need to be rewritten anyway to handle encrypted data. But I would have to use native queries in JPA instead of JPQL, which could lead to problems in the future if we ever change our database.
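For instance, such a native query might look like the following sketch (the customers table, the ssn column, and the named parameters are hypothetical; the decryption call in the WHERE clause is exactly the part JPQL cannot express):

-- decrypt inside the query so the comparison runs on plaintext
SELECT id, pgp_sym_decrypt(ssn, :key) AS ssn
FROM customers
WHERE pgp_sym_decrypt(ssn, :key) LIKE :pattern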
So the question: is there any way in JPA or JPQL, other than native calls, to query the data? I did have a look at Jasypt, but the documentation says it is for Hibernate, and it appears to pertain specifically to encryption and decryption only.
When I insert new data into a table via JPA, where do I encrypt it? Should I encrypt the data in the Java world using some cipher algorithm and then insert the bytes into the table column, or is there a more elegant JPA way of doing this? Also note that even if I make sure the encryption algorithm used in pgcrypto and the one from the Java library are the same, will they cause any inconsistency problems when we try to compare the data?
Are there better approaches to the problem? I do not mean in terms of security for now, but in terms of ease of implementation and future robustness. We have a lot of old code that we have recently updated to the JSR 299 specification, and I would like to keep changes to a minimum.
I only seek an answer for the first question; the other two are additional details in case someone experienced wants to chip in.

Dynamic table partitioning in Postgres

I was looking for ways to have Postgres partition data into tables based on, for example, a timestamp, but without having to add the relevant child tables manually. I saw this blog post that does just that:
https://blog.engineyard.com/2013/scaling-postgresql-performance-table-partitioning
but I'm dubious about the idea of creating tables based on string concatenation and checking the pg_catalog. Is this a reasonable idea?
pg_partman is an extension created specifically to manage the complexity of partition management. I haven't used this extension, but I've used others by the same author and they are generally of excellent quality.
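To give a flavour of what it automates, a minimal setup might look like this (table and column names are invented, and the exact create_parent arguments vary between pg_partman versions, so check the docs for the one you install):

CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

-- parent table declared with native range partitioning on the timestamp
CREATE TABLE public.events (
    id bigserial,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- pg_partman creates the daily child partitions and keeps premaking new ones
SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := 'daily'
);

A scheduled call to its maintenance function (run_maintenance) then takes the place of the hand-rolled string concatenation from the blog post.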

Accessing non-public schema in PostgreSQL with Pentaho

Let me start by saying that what I know about Pentaho wouldn't fill a single paragraph. I'm more knowledgeable about PostgreSQL. I'm working with some contractors who are building a set of monthly reports in Pentaho (v. 4.5) for my company. Some of the data needs to go through an ETL process and get rolled up for reporting purposes. From a DBA(ish) point of view, I would like to move these tables into a separate PostgreSQL schema.
I know that Pentaho is oftentimes used with MySQL (which doesn't have schemas), and I'm concerned this might cause problems. I've done some "googlin'" and didn't turn up a lot of hits on the topic, but I did find a closed bug from a few years ago - thus implying that the functionality should be supported.
Before I do this, I would like to see if anyone knows of a reason this will fail or be a bad idea. (Or if you've done it and it works great, please let me know that, too.)
Final notes: I'm using PostgreSQL 9.1.5, and I don't have access to a Pentaho instance to even test this myself. I'm hoping the good folks in the Stack Overflow community will share their expertise and save me from having to install one and spend hours of playing/testing to figure out whether this is a bad idea.
EDIT:
I sort of knew this question was a bit vague, but I was hoping that someone would read it and share any experience they have. So, let me spell it out more clearly and ask more explicit questions.
I have not done anything yet. I don't know Pentaho. I don't want to learn Pentaho (not that there is anything wrong with Pentaho... it's just not where my interests are right now). My company hired contractors (I did not hire them). They have experience with Pentaho, but with MySQL. They don't really know anything about PostgreSQL. There are some important differences between PostgreSQL and MySQL, including the fact that PostgreSQL supports schemas (whereas MySQL uses separate databases... similar in concept but behaving differently in some ways). Some ORMs (and tools) don't really like this... for example, the Django framework still doesn't fully support schemas in PostgreSQL (I know this because I use Python and Django often, and my life is much better when I keep things in the "public" schema). Because of my experience with Django and PostgreSQL schemas, I'm a bit leery of moving this data to a new schema.
I do understand that wherever the tables are, users will need permissions to be able to access the data.
My explicit questions:
Do you use Pentaho against a PostgreSQL database to access tables in schemas other than "public" (the default)?
If so, does it just work (no problems)?
If you had problems, would you be willing to share with me (and the Stack Overflow community) any online resources that helped you? Or would you be willing to detail here what you remember?
Do you know of anything that just won't work correctly? For example, an open bug in Pentaho related to this topic.
Again, it's not your standard kind of question. I'm hoping that someone out there has experience and is willing to share it here, saving me from having to spend time setting up a new Pentaho instance and learning Pentaho well enough to test this myself.
Thanks.
Two paths you can take:
1) What the previous post said ("Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.")
2) In the database connection, Advanced tab, "The preferred schema name".
If you're working with different schemas, you can create one database connection per schema. With this approach you can leave the schema field empty in the input/output steps.
We use MS SQL Server, and I can tell you that Pentaho does struggle with the idea of a schema. Many of their apps allow you to select a schema, but Pentaho, like you said, is built to use something like MySQL.
Make your Pentaho database user work like it would in MySQL.
We made the database user default to dbo, then we structured our tables like dbo.dimDimension, dbo.factFactTable, etc. Basically, only use dbo for Pentaho purposes. (Or whatever schema you want to default to.)
I use PDI and PgSQL extensively every day with a bunch of different schemas. It works fine. The only trouble you might run into is Pg's troublesome practice of forcing unquoted identifiers to lower instead of upper case. I soon realized everything was easier when I set the Advanced connection property to "Quote all in database".
Yes, you have to quote everything when you type SQL if PDI doesn't do it for you, but it works quite well. Haven't experimented with forcing all identifiers to lower case, but I expect that would work as well.
And yes, use the "Preferred schema name" as well, but be aware that some steps use that option and others don't. You can't, for example, expect it to add schema names to SQL you type into a Table Input step.
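As a concrete illustration (the schema and table names here are made up), SQL typed into a Table Input step has to carry the schema qualification and any quoting itself:

-- schema-qualify and quote identifiers yourself in a Table Input step
SELECT "id", "report_month", "total"
FROM "reporting"."monthly_rollup"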
The only other issues you might run into are the limits of Pg's JDBC driver. It's not as good as SQL Server's or DB2's, but the only thing I've ever had trouble with was sending error rows from a Table Output step to another step when the Table Output step was in batch mode.
Have fun learning PDI. It makes a great complement to your DBA skills.
Brian
Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.
I did a quick test using PDI and our 8.4 Postgres instance and was able to explore, read from and write to tables in different schemas.
So, I think this is a reasonable direction. Hope this helps.

Experiences with PostgreSQL Java/JDBC Copy API for bulk inserts

With version 8.4, PostgreSQL finally integrated a proprietary API into its JDBC driver which allows stream-based inserts and selects. The so-called Copy API grants access to the COPY TO/COPY FROM SQL commands, which read text data from a stream/reader into one table at a time, or write text data to a stream/writer from one table. Constraints and triggers are respected for insert operations. Basic transformations (delimiter, quotation, null values, etc.) are available. The performance gain is quite impressive, probably because of less object instantiation and a much simpler protocol between the client and the server backend.
Does anyone have experience with this API, good or bad? Is it production-ready? Are there any pitfalls one has to be aware of? BTW: the fact that it is a proprietary API is a non-issue for me.
The COPY API has been present in the PostgreSQL C library for at least six years. It is very stable.
See: http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
and http://www.postgresql.org/docs/9.0/interactive/sql-copy.html
The JDBC implementation should have the same properties, but I haven't used it.
PS. I think there is a misunderstanding when you call this "proprietary". Both the protocol specification and the server/client/driver source code are free (as in freedom).
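For reference, the SQL commands that the Copy API drives look like this (the orders table and its columns are invented for illustration):

-- stream rows from the client into a table; constraints and triggers still fire
COPY orders (id, customer_id, amount) FROM STDIN WITH (FORMAT csv);

-- stream a table back out to the client
COPY orders TO STDOUT WITH (FORMAT csv);

The JDBC Copy API (CopyManager in the PostgreSQL JDBC driver) issues exactly these commands and attaches your Reader/Writer to the STDIN/STDOUT side of the protocol.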