How to cache a query response with Postgres?

I have a database that syncs completely every 2 hours. All data is dropped and populated from the main data source.
I have some queries coming from a client app that have the same response for the current two-hour dataset. So if 100 clients run their apps, I have to run the same query 100 times, once for each of them, even though the results don't differ.
How do I avoid running this real query against my database every time, but just keep its response somewhere and return it instead?
I think I could run this query after each sync, save the result to its own table, and then return rows from that table.
What are other options, probably provided by Postgres itself?

You should use something like Redis to store the result of your query in memory. It comes with clients for many languages. You can invalidate the cached result whenever it's time to.
There are other in-memory caches like memcached that are easy to install and use.
Note that none of these are specific to Postgres.
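For instance, here is a minimal cache-aside sketch using the Jedis client for Redis. The key name, the two-hour TTL, and the runQueryAgainstPostgres() helper are all made up for illustration:

import redis.clients.jedis.Jedis;

public class QueryCache {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getReport() {
        String cached = jedis.get("report:current");
        if (cached != null) {
            return cached; // served from Redis, no Postgres query needed
        }
        String result = runQueryAgainstPostgres();
        jedis.setex("report:current", 7200, result); // expire after 2 hours
        return result;
    }

    public void invalidate() {
        jedis.del("report:current"); // call this right after each 2-hour sync
    }

    private String runQueryAgainstPostgres() {
        // placeholder: run the real query and serialize its response
        return "...";
    }
}

Deleting the key right after each sync, rather than relying on the TTL alone, keeps the cache consistent with the freshly loaded data.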

Related

How To Sum CloudKit Record Values

Let's say you have a CloudKit record type called Steps, and each step record has a value and a date. If you have 500 records, is there any way to have CloudKit pass back the sum of all the values (e.g. 5,490)?
I haven't found a way besides running a CKQueryOperation for all the steps and then adding them up manually. My concern is that if I ended up with a lot of step records, say 5,000, CKQueryOperation might start to limit my results.
No, this isn't possible with CloudKit. You'd have to use something that allows server functions, like Parse.
But this is a problem that is easily solved with your local device cache. If you sync your Step records with a local database (Realm, CoreData, SQLite, etc.) you can simply query all your steps in your database and add them up. This is the approach I would take.
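For what it's worth, once the records are in a local store, the sum itself is a one-line query. A sketch using the SQLite JDBC driver with an invented steps(value, date) table (on iOS you would do the equivalent through Core Data, Realm, or SQLite directly):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StepTotal {
    public static void main(String[] args) throws SQLException {
        // steps.db and the steps table are hypothetical
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:steps.db");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT SUM(value) FROM steps")) {
            if (rs.next()) {
                System.out.println("total steps: " + rs.getLong(1));
            }
        }
    }
}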

Insert MongoDB document with an objectId that existed in the past

I have a bunch of collections (12) and I need to rename many of their fields. I can't do it live in the database; all I can do is download and re-upload a dump of it.
So I've downloaded the collections with mongodump, manipulated the data, and I'm planning to use mongorestore to push it back to the database.
I'm wondering what will happen with the ObjectIds. I know that an ObjectId is unique throughout the database, so I'm thinking about deleting all the old data right before running mongorestore. Is that OK, or will I still have problems with the IDs?
You can specify any value you want for the _id field; you can even use a string instead of an ObjectId.
If you have a production app, you need to perform the upgrade and migrate the data through the application itself, step by step.
If you have a single-process, single-threaded application, or can run your app that way, that is the simplest case. Otherwise you will need a synchronization service.
Be careful with async/await, promises, and other asynchronous code: it reads data at one point in time and continues processing that data at a later point, and you need to keep that in mind during the migration.
You need to:
modify the service so it can handle both data formats
create migration code that goes through all the data and converts it (see the sketch below)
modify the service to accept only the new data format once the migration is done
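As a sketch of that second step: when the renames are mechanical, the driver can migrate a whole collection in one pass with $rename, and the existing _id values are left untouched. The connection string, collection, and field names below are invented:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

public class RenameMigration {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> users =
                    client.getDatabase("mydb").getCollection("users");
            // rename the field on every document that still uses the old name
            users.updateMany(Filters.exists("user_name"),
                             Updates.rename("user_name", "username"));
        }
    }
}

Because $rename leaves each document's _id untouched, the documents keep their original ObjectIds through the migration.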

Should I filter data in PostgreSQL or server backend?

I am working on a project which uses graphql and PostgreSQL where we want to select data from the database with a value after a certain date. It is currently selecting all data from the database and then filtering it on the server:
.filter(({time}) => moment(time).isAfter(startTime))
However, I would have thought it would be best to do this filtering in the database query, since the full dataset is never used.
Is there any benefit to doing it on the server rather than in the database query?
Barring some unusual edge case (such as other parts of your backend code really needing all the data for some reason), it would definitely be more efficient to filter everything on the Postgres side, in the SQL that is used to fetch the data in the first place.
This is true for several reasons:
Assuming the table is properly indexed, the filtering will be able to occur much faster within the database.
The unneeded data will not need to be serialized and sent over the wire to the backend, only to then be discarded by the backend's own filtering.
The memory footprint should be reduced on both the Postgres and server end due to needing to process only a portion of the results.
I've not worked with GraphQL myself, but from doing a bit of poking around through its docs, it appears GraphQL often uses other mechanisms in different layers (outside of the database) to try to improve performance.
It would be worth seeing what the actual SQL is that your GraphQL query is generating (that may be possible via a function in GraphQL; it could also be done by enabling certain log settings on the Postgres server and correlating the log output to the query). That may lead to further optimization possibilities if you want to keep things purely GraphQL.
Jumping down to a raw query seems like it would be a good possibility though. Certainly that is something that is often done with ORMs like Django's and ActiveRecord.
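To make the database-side version concrete, here is a minimal parameterized-query sketch, shown with plain JDBC for illustration; the readings table and its columns are invented, and whatever client the GraphQL layer uses would issue an equivalent WHERE clause:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;

public class FilteredFetch {
    public static void main(String[] args) throws SQLException {
        Instant startTime = Instant.now().minusSeconds(86400); // example cutoff
        String sql = "SELECT id, time, value FROM readings WHERE time > ?";
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/mydb", "user", "secret");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setTimestamp(1, Timestamp.from(startTime));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getTimestamp("time") + " " + rs.getDouble("value"));
                }
            }
        }
    }
}

With an index on the time column, Postgres can answer this with an index range scan instead of shipping the whole table to the backend.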

Multiple updates performance improvement

I have built an application with Spring Boot and JPA to migrate a Jira postgres database.
Basically, I have 5000 users that I need to migrate. Each user means 67 update queries in different tables.
Each query uses the LOWER function to compare ignoring case.
Some pseudo-code:
for (User user : users) {
    for (String query : queries) {
        jdbcTemplate.execute(query.replace("CURRENTUSERNAME", user.getName())); // getName() is illustrative
    }
}
I ignore any errors, so if a single query fails, I still go on and execute the other 66.
I am running this in 10 separate threads, and each user takes roughly 120 seconds to migrate (20 threads resulted in database deadlocks).
At this pace, it's gonna take more than a day, which is not acceptable (I am running this in a test environment before doing in production).
The queries look like this:
UPDATE table SET column = 'NEWUSERNAME' where LOWER(column) = LOWER('CURRENTUSERNAME');
Is there anything I can do to try and optimize this migration?
UPDATE:
I changed my approach. First, I select every element matching CURRENTUSERNAME and get its ID. Then I create the UPDATE queries using the ID in the WHERE clause.
Other than that, it is still taking a long time (4+ hours) to execute.
I am running millions of UPDATEs, one at a time. I know jdbcTemplate has a bulk method, but if a single UPDATE fails, I believe it rolls back every successful update too. Also, I am not aware of the performance improvement it would have, if any.
So, to update the question, given that I have millions of UPDATE queries to run, what would be the best way execute them? (bulk, multi threading, something else)
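For reference, the bulk method mentioned above is JdbcTemplate.batchUpdate. A minimal sketch of the ID-based variant (the table, column, and helper names are illustrative):

import java.util.ArrayList;
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

public class BulkRename {
    private final JdbcTemplate jdbcTemplate;

    public BulkRename(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate; // assumed to be configured by Spring
    }

    public void renameUser(String newUsername, List<Long> ids) {
        List<Object[]> batchArgs = new ArrayList<>();
        for (Long id : ids) {
            batchArgs.add(new Object[] { newUsername, id });
        }
        // one round trip per batch instead of one statement per row
        jdbcTemplate.batchUpdate(
                "UPDATE some_table SET some_column = ? WHERE id = ?", batchArgs);
    }
}

Whether earlier statements survive a failure mid-batch depends on how the transaction is demarcated rather than on batchUpdate itself, so running each batch in its own transaction keeps partial progress.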

What happens to my DataSet in case of unexpected failure

I know this has been asked here before, but my question is slightly different. When the DataSet was designed with the disconnected principle in mind, what feature was provided to handle unexpected termination of the application, say a power failure, a Windows hang, or a system exception leading to a restart? Say the user has entered some 100 rows and they are modified in the DataSet alone. Usually the DataSet is written back to the database when the application closes or at timed intervals.
In the old days, programming with VB 6.0, all interaction took place directly with the database, so each successful transaction committed itself automatically. How can that be done using DataSets?
DataSets are never for direct access to the database; they are a disconnected model only. There is no intent that they be able to recover from machine failures.
If you want to work live against the database, you need to use DataReaders and issue DbCommands directly for changes. This will of course increase the load on your database server.
You have to balance the two for most applications. If you know a user just entered vital data as a new row, execute an insert command to the database, and put a copy in your local cached DataSet. Then your local queries can run against the disconnected data, and inserts are stored immediately.
A DataSet can be serialized very easily, so you could implement your own regular backup to disk by using serialization of the DataSet to the filesystem. This will give you some protection, but you will have to write your own code to check for any data that your application may have saved to disk previously and so on...
You could also ignore DataSets and use SqlDataReaders and SqlCommands for the same sort of 'direct access to the database' you are describing.