I'm in the development stage of an application and it turns out I have to use PostgreSQL. The README file contains the following instructions...
On a MacBook Pro with 2GB of RAM, the author's sysctl.conf contains:
kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.maxprocperuid=512
kern.maxproc=2048
Note that (kern.sysv.shmall * 4096) should be greater than or equal to
kern.sysv.shmmax. kern.sysv.shmmax must also be a multiple of 4096.
I'm guessing that for development I wouldn't have a problem leaving the default settings on my Mac; however, there are occasions where I run Python scripts to do data science and I would like to take advantage of all the resources available (RAM). What would be the correct configuration for 16GB of RAM?
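For illustration, here is how the arithmetic from the note above could work out for a larger machine. The 4 GB shared-memory segment below is purely my assumption, not a figure from the README:
kern.sysv.shmmax=4294967296 # 4 GB, which is a multiple of 4096
kern.sysv.shmall=1048576 # 4294967296 / 4096, so shmall * 4096 equals shmmax
The remaining kern.sysv.* and kern.maxproc* values from the README could stay as they are; PostgreSQL's own shared_buffers setting would still need to be raised in postgresql.conf for it to actually use a larger segment.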
We have a data warehouse server running on Debian Linux. We are using PostgreSQL, Jenkins and Python.
For the past few days, Jenkins and Postgres have been consuming a lot of memory. I tried to find and check all the ways from Google, but the issue is still there.
Can anyone give me a lead on how to reduce this memory consumption? It would be very helpful.
Below is the output from free -m:
total used free shared buff/cache available
Mem: 63805 9152 429 16780 54223 37166
Swap: 0 0 0
Below are the postgresql.conf file, the system configuration, and the results from htop (attached as screenshots).
Please don't post text as images. It is hard to read and process.
I don't see your problem.
Your machine has 64 GB of RAM: 16 GB is used for PostgreSQL shared memory as you configured, 9 GB is private memory used by processes, and 37 GB is still available (the available column).
Linux uses available memory for the file system cache, which boosts PostgreSQL performance. The low value for free just means that the cache is in use.
For Jenkins, run it with these Java options:
JAVA_OPTS=-Xms200m -Xmx300m -XX:PermSize=68m -XX:MaxPermSize=100m
For Postgres, start it with this option:
-c shared_buffers=256MB
These values are the ones I use on a small homelab with 8 GB of memory; you may want to increase them to match your hardware.
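If you'd rather make these settings persistent than pass them on the command line each time, they can go in the Jenkins defaults file and in postgresql.conf. The exact paths and variable names below depend on the distribution and how the services were installed, so treat them as assumptions to verify:
# /etc/default/jenkins on a Debian-style package install
JAVA_ARGS="-Xms200m -Xmx300m"
# postgresql.conf (its location is shown by: psql -c "SHOW config_file;")
shared_buffers = 256MB
Also note that -XX:PermSize and -XX:MaxPermSize only apply to Java 7 and earlier; on Java 8 and later the JVM ignores them, since PermGen was replaced by Metaspace.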
I have installed Nominatim on a server dedicated just to OSM data, with the following configuration: CentOS 7 operating system, 2x Intel Xeon L5420 CPUs @ 2.50GHz (8 CPU cores in total), 16 GB of RAM, and 2x 2TB SATA hard drives.
I've configured PostgreSQL based on the recommendations on the Nominatim install wiki (http://wiki.openstreetmap.org/wiki/Nominatim/Installation#PostgreSQL_Tuning), taking into account that my machine only has 16 GB instead of the 32 GB recommended for those configs. I've used the following settings:
shared_buffers = 1GB # recommended for a 32GB machine was 2 GB
maintenance_work_mem = 4GB # recommended for a 32GB machine was 8 GB
work_mem = 20MB # recommended for a 32GB machine was 50 MB
effective_cache_size = 10GB # recommended for a 32GB machine was 24 GB
synchronous_commit = off
checkpoint_segments = 100
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
fsync = off
full_page_writes = off
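One way to confirm the server actually picked these settings up after a restart (assuming psql connects with your defaults) is:
psql -c "SHOW shared_buffers;"
psql -c "SHOW maintenance_work_mem;"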
First, I tried importing a small country extract (Luxembourg), setting a cache size of 6000 and using the setup.php file from utils; it was imported successfully in under 1 hour.
Secondly, I deleted the Luxembourg data and, for another test, imported the country extract of Great Britain using a cache size of 8000; it imported successfully as well, in around 2-3 hours.
Today I decided to try to import the whole planet.pbf file, so I deleted the PostgreSQL database, downloaded a pbf of the planet from one of the official mirror sites, and ran the setup with a cache size of 10000 (see the invocation below). Beforehand, I read up on some benchmarks to get a vague idea of how much time and space this operation would take.
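For reference, the invocation looks roughly like this; the flag names are from the legacy Nominatim setup.php as I remember them, so double-check them against ./utils/setup.php --help:
./utils/setup.php --osm-file planet.pbf --all --osm2pgsql-cache 10000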
When the import started, I was very surprised. The import of the nodes went at a whopping 1095.6k/s; in the benchmark I analyzed (a machine with 32GB of RAM), it was only 311.7k/s.
But when the import of the nodes finished and the import of the ways started, the speed dropped significantly. It was importing the ways at 0.16k/s (although it was slowly rising: it started from 0.05k/s and in 4 hours rose to the above-mentioned value).
I stopped the import and tried to tweak the settings. I allocated a higher cache size first (12000), but with no success: the nodes imported at a very high speed, but the ways remained at 0.10-0.13k/s. I then tried allocating a new swap file (the original was 8GB; I allocated another 32GB as a swap file), but that didn't change anything either. Lastly, I edited setup.php, changed --number-processes from 1 to 6, and included the --slim keyword where osm2pgsql is started from there, but nothing changed.
Right now I am out of ideas. Is this speed decrease normal? Should I upgrade my machine to the recommended memory? I thought that 16GB of RAM would be enough for the planet pbf; I was aware that it could take more time on this machine than on one with 32 GB, but this seems like a lot. If the whole planet import took no more than 12-15 days I would be OK with that, but as things look now, with these settings the import would take around 2 months, and that is just too much, considering an error could occur anywhere and I would have to start the whole import process again.
Any ideas what could be causing this problem, or what other tweaks I could try to speed up the import process?
Thanks
I had a similar performance problem using SATA drives; when I replaced the SATA drives with SSD drives, the ways import sped up from 0.02k/s to 8.29k/s. Now I have a very slow relations import, which is running at a 0.01/s rate, so I believe memory is also an important factor for a full planet import, but I have not tested it again.
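If you want to confirm that the disk, rather than memory, is the bottleneck during the ways/relations phases, iostat from the sysstat package shows per-device utilization while the import runs (the package name below is for CentOS 7 and is my assumption; it may differ on other distributions):
sudo yum install sysstat
iostat -x 5 # watch the %util and await columns for the drive holding the database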
I am working on a batch job which imports data from a legacy database, transforms the data into 3NF and inserts the resulting data into another database (the target database). The batch job is written with Spring Batch.
While I was developing the steps of the job, I wrote unit tests to test the functionality of each step. Now I am finished with the development of the steps and want to test the system in a kind of testing environment before rolling the batch job out to production. Therefore, I imported the legacy database locally onto a MySQL server and also created a local version of the target database. These MySQL servers are deployed on my MacBook Pro with a 256 GB SSD. I have already run the job a few times with little bugfixes, but now it came to my mind that SSDs are more sensitive to write cycles than a standard HDD. Hence, I checked the mysqld process in Activity Monitor and noticed that 424.64 GB have been written to my SSD in the last three days.
How much influence (lifetime, write cycles) will this number of written GB have on my SSD? Would you recommend deploying the database on a normal HDD instead of using my SSD? Or do you think that I am falsely alarmed?
I would recommend you deploy the database to a normal HDD, because the NAND flash on your SSD does have a maximum erase threshold. In other words, you are wearing down your SSD. Although SSDs have features to ensure that the NAND flash wears down evenly, you are definitely wearing it down much faster than normal usage would.
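If you want to put a number on the wear instead of guessing, smartmontools can read the drive's SMART wear indicators. This is just a suggestion of mine; on macOS it can be installed via Homebrew:
brew install smartmontools
sudo smartctl -a /dev/disk0 # wear-related attribute names vary by vendor, e.g. "Percentage Used" on NVMe drives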
I was looking for an answer but didn't find one.
I'm trying to create a new VM to develop a web application. What would be the optimal processor settings?
I have an i7 (6th gen) with hyperthreading.
Host OS: Windows 10. Guest OS: CentOS.
Off topic: should the RAM I give to the VM be 50% of my memory? Would that be OK? (I have 16GB of RAM.)
Thanks!
This is referred to as 'right-sizing' a VM, and it is dependent on the application workload that will run inside it. Ideally, you want to give the VM the minimum amount of resources the app requires to run correctly. 'Correctly' is subjective, based upon your expectations.
Inside your VM (CentOS) you can run top to see how much memory and CPU % is being used. You can also install htop, which you may find friendlier than top.
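On CentOS 7, htop usually comes from the EPEL repository, so installing it looks roughly like this (assuming yum and internet access on the guest):
sudo yum install epel-release
sudo yum install htop
htop # interactive per-process view of CPU and memory usage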
RAM
If you see a low % of RAM being used, you can probably reduce what you're giving the VM. If you are seeing any swap memory used (paging to disk), you may want to increase the RAM. Start with 2GB and see how the app behaves.
CPU
You may want to start with no more than 2 vCPUs, check top to see how utilized the application is under load, and then make an assessment about more/fewer vCPUs.
The way a hosted hypervisor (VMware Workstation) handles guest CPU usage is through a CPU scheduler. When you give a VM x vCPUs, the VM needs to wait until that many cores are free on the CPU to do 'work'. The more vCPUs you give it, the more difficult (slower) it is to schedule. It's more complicated than this, but I'm trying to keep it high level; see a CPU scheduling deep dive for the details.
As mentioned here, MongoDB has a data size limitation of 2GB on a 32-bit machine with a single mongod instance. But I wonder: a 32-bit machine has 4GB of addressable space in theory, so couldn't mongod use this 4GB instead of 2GB for virtual memory? Why is the answer 2GB, not 4GB?
4GB of addressable space is not the same as the memory space available for memory-mapped files opened by user applications. Some of the addressable space is reserved for the O/S kernel and memory-mapped devices such as video cards.
For example, 32-bit Windows limits user mode (and thus memory-mapped files) to ~2GB of RAM and total system RAM to ~3.5GB.
For more reading, see:
Coding Horror: Dude, Where's My 4 Gigabytes of RAM?
MSDN: Managing Memory-Mapped Files
MSDN: Memory-Mapped Files
The majority of modern desktop and server environments started moving to 64-bit almost a decade ago (see the 64-bit operating system timeline on Wikipedia), so this isn't a limit that practically affects deployment.
You would only want to use 32-bit MongoDB in a development environment with limited data.
32-bit MongoDB processes are limited to about 2 GB of data. This has come as a surprise to a lot of people who are used to not having to worry about that. The reason for this is that the MongoDB storage engine uses memory-mapped files for performance.
By not supporting more than 2gb on 32-bit, we’ve been able to keep our code much simpler and cleaner. This greatly reduces the number of bugs, and reduces the time that we need to release a 1.0 product.
http://blog.mongodb.org/post/137788967/32-bit-limitations