Postgres work_mem tuning

I have two Postgres instances in Azure.
Every hour I see:
1 temporary file created in one Postgres instance (about 8 MB max size)
4 temporary files created in the other Postgres instance (about 11 MB max size)
I thought this was frequent enough to justify increasing work_mem, so I raised it to 8 MB and 12 MB respectively on the two instances, but temporary files were still being created.
This time each instance had a single temporary file of 16 MB. This behavior confuses me; I expected temporary file creation to stop.
I tried to refer to: https://pganalyze.com/docs/log-insights/server/S7
Are a few temporary files every hour not a big deal?
Should I not tune work_mem?
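
For reference, this is how I have been checking temporary file activity from SQL (a minimal sketch; 'mydb' is a placeholder for the actual database name, and log_temp_files is a server parameter on Azure):

-- Cumulative temp files and bytes per database since stats were last reset
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE datname = 'mydb';

-- Current work_mem for this session
SHOW work_mem;

-- Log every temporary file created, regardless of size, to see which queries spill
SET log_temp_files = 0;

From what I understand, work_mem is a per-operation limit, so a single sort or hash that needs more memory than work_mem will still spill to disk however high I raise it, and EXPLAIN (ANALYZE) on a suspect query would show it as something like "Sort Method: external merge Disk: ...kB".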

Related

Nominatim - Postgres DB growing very fast in size when doing the daily updates

I have done a full import of the planet from the OSM website and scheduled the updates to run daily in a cronjob.
But I have noticed that the disk usage is growing very fast on a daily basis.
Running df -h, I noticed that the disk usage grows by about 1 GB every day. I'm not sure whether this command rounds up, but even so this growth seems huge.
I have 1 TB free, but at this rate the disk would be full in about three years.
I have inspected the folders under /var/lib/postgresql/<version>/<cluster>, and the ones that contribute to this size increase are pg_wal and base/16390.
The folder base/16390 has many files of 1 GB each, and pg_wal has about forty files of 16 MB each.
I don't know which files are safe to remove, or whether there are settings in postgresql.conf that would prevent this huge daily increase.
I also don't know whether this has to do with some backups or logs that Postgres keeps by default, but I would like to reduce those to a minimum as well.
Any help on this would be appreciated.
Thanks in advance.
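
For reference, these are the checks I ran from inside Postgres to see where the space goes (a minimal sketch; 16390 is the directory name I mentioned above):

-- Which database does base/16390 belong to?
SELECT oid, datname FROM pg_database WHERE oid = 16390;

-- Ten largest relations (tables plus their indexes);
-- run this while connected to that database
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;

From what I've read, the files in pg_wal should never be removed by hand: Postgres recycles them at checkpoints and caps their total at roughly max_wal_size, so forty-odd 16 MB segments sound normal and the real growth would be in base/16390.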

Postgres db data deletion not reducing the table size

I have a Postgres DB whose size keeps increasing: roughly 70 GB of data and 305 GB of indexes. We planned a cleanup of older data, but even after deleting 14 months of data and keeping only the last 6 months, neither the data size nor the index size seems to be freeing up.
Please suggest anything I might be missing, as I am new to Postgres.
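
From what I've read so far, a plain DELETE only marks rows as dead, and a regular VACUUM makes that space reusable inside the files without returning it to the operating system; only a rewrite such as VACUUM FULL gives it back. A minimal sketch of what I'm considering (big_table is a placeholder table name); is this the right direction?

-- How many dead tuples are waiting to be cleaned up?
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Rewrite the table and its indexes, returning the space to the OS
-- (takes an ACCESS EXCLUSIVE lock on the table for the duration)
VACUUM FULL big_table;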

OrientDB disk utilization

I have been working with OrientDB and stored about 120 million records in it; the size on disk was 24 GB. I then deleted all the records by running the following commands in the console:
Delete from E unsafe
Delete from V unsafe
When I checked the DB size on disk afterwards, it was still 24 GB. Is there anything extra I need to do to free the disk space?
In OrientDB, when you delete a record the disk space remains allocated. The only way to free it is to export and then re-import the DB.
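
A minimal console sketch of that cycle (paths and credentials are placeholders; EXPORT DATABASE gzips its output, hence the .gz on import):

CONNECT plocal:/databases/mydb admin admin
EXPORT DATABASE /tmp/mydb.export
DISCONNECT
CREATE DATABASE plocal:/databases/mydb_new admin admin plocal
IMPORT DATABASE /tmp/mydb.export.gz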

How much disk space for MongoDB

I am about to set up MongoDB on AWS EC2 (Amazon Linux HVM 64-bit) and implement RAID 10.
I am expecting a couple of million records for a video-on-demand system.
I could not find any good advice on how much disk space I should use for that instance.
The dilemma is that I can't spend too much on EBS volumes right now, but if I have to add a bigger volume in less than a year and take the DB down to move the data over, that is a problem.
For the initial stage I was thinking 16 GB (available after the RAID 10 implementation) on a t2.medium, with a plan to upgrade to m4.medium and add replica sets later.
Any thoughts on this?
The math is pretty simple:
Space required = bytes per record x number of records
If you have an average of 145 bytes per record and expect 5 million records, you can work with 1 GB of storage.
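Worked through with those example numbers:
145 bytes/record x 5,000,000 records = 725,000,000 bytes, or roughly 0.7 GB, so 1 GB already leaves some headroom.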
EBS storage is pretty cheap: 1 GB of SSD is $0.10 per month in us-east-1, so you could allocate 5 GB for only $0.50 per month.
Also, RAID 10 is RAID 0 and RAID 1 combined. Read over this Server Fault question regarding RAID 0 and RAID 1 configurations with EBS:
https://serverfault.com/questions/253908/is-raid-1-overkill-on-amazon-ebs-drives-in-terms-of-reliability

How is the mongod working set partitioned among databases

I am wondering how mongod splits the available memory among databases. I have multiple databases of varying size running in one mongod, and I would like to know how my working sets are going to be partitioned.
Let's assume I collect data every day that will be accessed the day after (so each day my users query only the previous day and never look further back). My problem is that my datasets, and therefore my working sets, vary a lot in size. That is, with the following setup:
db1 - size 100 (45 %)
db2 - size 100 (45 %)
db3 - size 10 ( 5 %)
db4 - size 10 ( 5 %)
Now I wonder how the individual databases' working sets would be partitioned in memory: 45 / 45 / 5 / 5?
In my case, db1 got loaded all at once last night, and it feels like the partition is no longer 45 / 45 / 5 / 5 but 88 / 10 / 1 / 1 (meaning db1's working set is taking over the memory; the values are arbitrary).
If that is the case, is there a way to ensure that each database keeps its space in memory?
In MongoDB <= 2.6 with the mmap-based storage engine, the operating system decides what data is and isn't in memory, based on MongoDB's access patterns in the memory-mapped files. MongoDB doesn't control what is in or out of memory except through how it accesses the data (and so neither can you control it through MongoDB).
To keep memory dedicated to a database on a system under memory pressure, you'd need to keep accessing that database (you might say you'd need to keep it "warmed up").
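
For example, on the mmap engine you could schedule a periodic warm-up from the mongo shell (a sketch; db3 and the collection name "events" are placeholders, and the touch command only applies to the mmap engine):

use db3
db.runCommand({ touch: "events", data: true, index: true })

This loads the collection's documents and index entries back into memory, but it only helps for as long as nothing else pushes them out again.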