How to use JPA to query encrypted data? - postgresql

I am supposed to come up with a plan to encrypt our existing personal information and be able to retrieve it and show it back to the user through the application if required. This is not PCI data, nor passwords, so it requires a two-way journey, that is, both encrypt and decrypt (and, to keep things simple for the moment, no tokenization and no public/private key encryption either), as well as the ability to query the information through JPA if required.
Here is what I have: application server JBoss 7, database PostgreSQL, Java EE 7, and JPA 2.
What have I thought of? Converting the existing data in Postgres is doable with pgcrypto using
ALTER TABLE table_name ALTER column_name TYPE bytea USING pgp_sym_encrypt(column_name, key, [other options])
If there are any dependencies, they can be handled; there are not a lot of them. Decrypting for searching or display would be
SELECT pgp_sym_decrypt(column_name, key) AS column_name FROM table_name WHERE ...
The entity classes can be handled by just changing the field's data type, and so on.
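For illustration, a minimal sketch of what the changed entity field could look like (the Customer entity and ssn column are made-up names, assuming the column was converted to bytea as above):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Customer {

    @Id
    private Long id;

    // After the ALTER TABLE above, the database column type is bytea,
    // so the entity field becomes a byte array.
    @Column(name = "ssn", columnDefinition = "bytea")
    private byte[] ssn;

    // getters and setters omitted
}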
Where am I stuck? The system presently uses JPA 2 (Hibernate implementation) for queries that take advantage of the fields being in plain text. If I update the database, the existing queries would fail and would need to be rewritten anyway to handle encrypted data. But then I would have to use native queries in JPA instead of JPQL, which could lead to problems in the future in case we change our database.
So the question: is there any way in JPA or JPQL, other than native calls, to query the data? I did have a look at Jasypt, but the documentation says it is for Hibernate, and it appears to pertain specifically to encryption and decryption only.
When I insert new data into the table via JPA, where do I encrypt it? Should I encrypt the data on the Java side using some cipher algorithm and then insert the bytes into the table column, or is there a more elegant JPA way of doing this? Also note that even if I make sure the encryption algorithm used in pgcrypto and the one from the Java library are the same, will they cause any inconsistency problems when we try to compare the data?
Are there better approaches to the problem? I do not mean in terms of security for now, but in terms of ease of implementation and future robustness. We have a lot of old code that we have recently updated to the JSR 299 specification, and I would like to keep changes to a minimum.
I only seek an answer for the first bullet point; the other two are additional details in case someone experienced wants to chip in.
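For what it's worth, here is a minimal sketch of one JPA-level approach: since Java EE 7 bundles JPA 2.1, an AttributeConverter can encrypt on write and decrypt on read without touching JPQL. The class below is illustrative only; the hard-coded key and the deterministic AES/ECB mode are insecure choices, used here solely because equal plaintexts must produce equal ciphertexts for JPQL equality comparisons to keep working. Note also that data encrypted this way is not interchangeable with pgp_sym_encrypt output, so the migration and the application would have to agree on a single scheme.

import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;

// Illustrative only: a fixed key and deterministic ECB mode are NOT secure;
// they are used so that equal plaintexts yield equal ciphertexts, which is
// what JPQL equality comparisons on the encrypted column would require.
@Converter
public class EncryptedStringConverter implements AttributeConverter<String, byte[]> {

    private static final SecretKeySpec KEY =
            new SecretKeySpec("0123456789abcdef".getBytes(StandardCharsets.UTF_8), "AES");

    @Override
    public byte[] convertToDatabaseColumn(String attribute) {
        if (attribute == null) return null;
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, KEY);
            return cipher.doFinal(attribute.getBytes(StandardCharsets.UTF_8));
        } catch (Exception e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    @Override
    public String convertToEntityAttribute(byte[] dbData) {
        if (dbData == null) return null;
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, KEY);
            return new String(cipher.doFinal(dbData), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}

Unlike the raw byte[] mapping sketched earlier, the entity attribute here stays a String (the column is still bytea); the field would simply be annotated with @Convert(converter = EncryptedStringConverter.class).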

Related

Get nextvalue of sequence in postgres using JPA and not native query

I have a sequence created using Flyway in Postgres which should start from 10000.
I want to get the next value of the sequence using JPA and not a native query, since I have different DB platforms running at different cloud providers.
I'm not able to find a JPA query to get the next value of a sequence; please redirect me to the right page if I am missing something.
Thanks for any help in that area!
P.S.: I found this link, which helps me do the same with a native query:
postgresql sequence nextval in schema
I don't think this is possible in a direct way.
JPA doesn't know about sequences.
Only the implementation knows about those and utilizes them to create ids.
I see the following options to get it to work anyway:
1. Create a view in the database with a single row and a single column containing the next value. You can query that with native SQL, which should be the same for all databases since it is a trivial SELECT.
2. Create a dummy entity using the sequence for ID generation, save a new instance, and let JPA populate the ID. A horrible workaround, but pure JPA; a rough sketch follows below.
3. Bite the bullet and create a simple class that provides the correct native SQL statement for the current environment, and execute it via JdbcTemplate.
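For the second option, a rough sketch in plain JPA (the entity and sequence names are placeholders; note that each call inserts a throwaway row into the dummy table):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

// Dummy entity whose only purpose is to draw a value from the sequence.
// allocationSize = 1 prevents the provider from caching a block of values.
@Entity
public class SequenceFetcher {

    @Id
    @SequenceGenerator(name = "seqGen", sequenceName = "my_sequence", allocationSize = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "seqGen")
    private Long id;

    public Long getId() {
        return id;
    }
}

Usage: persist a new instance and read back the populated ID.

SequenceFetcher fetcher = new SequenceFetcher();
entityManager.persist(fetcher);
Long nextValue = fetcher.getId();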

Challenges in migrating from IBM DB2 to Netezza

Due to the added advantages of high performance and reduced turnaround time, I am trying to migrate all the data from IBM DB2 to Netezza in my organization.
But what I realized is that there is no concept of a primary key in Netezza. Is that true? If so, I can try to take care of this issue by using a duplicate-removal stage in DataStage.
Also, could you please help me understand whether there are any other constraints I should consider or challenges I could face in a DB2-to-Netezza migration?
Netezza does allow you to specify primary key and foreign key constraints, but they are not enforced, which is to say that they are purely informational (for both the user and the optimizer). A well-formed upsert process in ETL is a good way to manage this.
On the topic of other issues you may face, here are a few thoughts:
Surrogate Keys
Be sure that you generate your surrogate keys either with Netezza's SEQUENCE object or with a surrogate key generator in your ETL tool. Avoid using ROW_NUMBER for this purpose, as doing so will most often prevent you from exploiting the parallel nature of the system.
Stored Procedures
Stored procedures should avoid row-by-row/cursor-based processing when possible, as this is another case where you may prevent yourself from exploiting the parallel nature of the system.
SQL Extension Functions
If you find that you rely on functions that exist in DB2 but are not natively available in Netezza, be sure to check what is available in the SQL Extensions Toolkit, which is included with Netezza but not automatically installed/configured.
MERGE
If you rely on MERGE in your current environment, be aware that you must be on v7.2.1 to use MERGE in Netezza. Otherwise you will have to break it out into an INSERT/UPDATE operation.
Once you have loaded the data into Netezza, one method we have used is to create a view to access the data and expose only the view. The view contains the logic to remove the duplicates.
Good luck!
Delan

How to encrypt entire tables using pgcrypto in PostgreSQL

I am looking to store all of my tables in PostgreSQL as AES-256 encrypted (due to client requirements).
I will look at decrypting a few columns for my analysis later.
But apparently the encryption process is a drag, as I have loads of tables. I am using UPDATE statements to pgp_sym_encrypt each column individually.
Is there a way to update the entire table easily, or is there a better process instead of writing manual column-update queries for each table?
Many thanks
Is there a way to update the entire table easily, or is there a better process instead of writing manual column-update queries for each table?
No, there isn't.
PostgreSQL doesn't support encrypted tables. It's not something an extension can really add; it would have to be built into the core database engine, and nobody has done the work required to add the feature yet.
Most people who need this do the encryption application-side and store bytea fields in the table.
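As a rough sketch of what that can look like, assuming AES-256-GCM and made-up table/column names (key management is deliberately glossed over):

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EncryptedInsert {
    public static void main(String[] args) throws Exception {
        // Placeholder key material; a real system would load a 256-bit key from a key store.
        SecretKeySpec key = new SecretKeySpec(new byte[32], "AES");

        // AES-256-GCM with a random 12-byte IV, prepended to the ciphertext for storage.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal("sensitive value".getBytes(StandardCharsets.UTF_8));
        byte[] stored = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, stored, 0, iv.length);
        System.arraycopy(ciphertext, 0, stored, iv.length, ciphertext.length);

        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/mydb", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO my_table (secret_col) VALUES (?)")) {
            ps.setBytes(1, stored); // secret_col is a bytea column
            ps.executeUpdate();
        }
    }
}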

Foreign Key mapping in Core Data

I understand that Core Data is not a relational database, but I need to understand how it can be used to support a client/server model where the server uses a Rails, ActiveRecord, MySQL setup.
My app is pulling records from the server using JSON and I am mapping the relationships using Core Data.
The foreign key in the SQLite database is showing the PK field of the related table even though I have set the User Info key/value primaryAttributeKey => id. (I can't remember where I saw this mentioned.)
Is there any way to set up the models so they will use my id as the PK, so that it will clean up the export of related data back to the server?
Edward,
The PK is just a field in your object. If you want to maintain PKs in Core Data, they are just numbers. As you build your object graph, you have to maintain them in parallel with your relations. Of course, exporting records created on the device back to your server will be difficult: FKs and PKs are unique to each table, and that uniqueness is determined on the server. Hence, tracking these numbers is not that useful.
May I suggest that your JSON be structured so that it is redundant, carrying both the data and the various PKs and FKs, if any?
Finally, you appear to be building a CRUD-focused API. Generally, those are low-performance APIs for remote devices. There are other problems with CRUD APIs as well, such as inconsistent business logic between servers and clients. I would suggest you rethink your APIs.
Andrew

Experiences with PostgreSQL Java/JDBC Copy API for bulk inserts

With version 8.4, PostgreSQL finally integrated a proprietary API into its JDBC driver that allows stream-based inserts and selects. The so-called Copy API grants access to the COPY TO/COPY FROM SQL commands, which read text data from a stream/reader into one table at a time, or write text data to a stream/writer from one table. Constraints and triggers are honored for insert operations. Basic transformations (delimiter, quotation, null values, etc.) are available. The performance gain is quite impressive, probably because of less object instantiation and a much simpler protocol between client and server backend.
Does anyone have experience with this API, good or bad? Is it production-ready? Are there any pitfalls one has to be aware of? BTW: the fact that it is a proprietary API is a non-issue for me.
The COPY API has been present in the PostgreSQL C library for at least six years. It is very stable.
See: http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
and http://www.postgresql.org/docs/9.0/interactive/sql-copy.html
The JDBC implementation should have the same properties, but I haven't used it myself.
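For reference, a minimal usage sketch of the JDBC side (the table and CSV payload are illustrative; this assumes the pgjdbc driver on the classpath and a direct, unpooled connection):

import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class CopyExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "pass")) {
            // Unwrap the driver-specific connection to reach the Copy API.
            CopyManager copyManager = conn.unwrap(PGConnection.class).getCopyAPI();
            String csv = "1,Alice\n2,Bob\n";
            // Stream the CSV payload straight into the table.
            long rows = copyManager.copyIn(
                    "COPY my_table (id, name) FROM STDIN WITH CSV",
                    new StringReader(csv));
            System.out.println("Inserted " + rows + " rows");
        }
    }
}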
PS. I think there is a misunderstanding when you call this "proprietary". Both the protocol specification and the server/client/driver source code are free (as in freedom).