Very slow loading of LinkedGeoData in PostgreSQL

I have installed and tuned my PostgreSQL database and downloaded the LinkedGeoData files from here. I then ran lgd-createdb -h localhost -d databasename -U user -W password -f bremen-latest.osm.pbf (12 MB), and the same for saarland-latest.osm.pbf (21.6 MB); both worked fine and finished in under 15 minutes. But when I tried to load a larger file, Mecklenburg-Vorpommern-latest.osm.pbf (54 MB), it did not go well: the system executes the command, but I have been waiting for a result since yesterday.
The relevant values in my postgresql.conf are:
shared_buffers = 2GB
effective_cache_size = 4GB
checkpoint_segments = 256
checkpoint_completion_target = 0.9
autovacuum = off
work_mem = 256MB
maintenance_work_mem = 256MB
My PostgreSQL version is 9.1 on a Debian machine.
How can I solve this issue? Thanks in advance.

I am the developer of the lgd-createdb script, and I just tried to reproduce the problem using PostgreSQL 9.3 (via Ubuntu 14.04) on a notebook with a quad-core i7, an SSD and 8GB RAM - and for me the Mecklenburg-Vorpommern-latest.osm.pbf file was loaded in less than 10 minutes.
My settings were:
shared_buffers = 2GB
temp_buffers = 64MB
work_mem = 64MB
maintenance_work_mem = 256MB
checkpoint_segments = 64
checkpoint_completion_target = 0.9
checkpoint_warning = 30s
effective_cache_size = 2GB
so quite similar to yours.
I even created a new version of the LGD script (not in the repo yet), where osmosis is configured to first load the data into the "snapshot" schema and afterwards convert it to the "simple" schema. Osmosis is optimized for the former schema, and indeed on a single run (using the CompactTempFile option) it was slightly faster (8min snapshot vs 8:30min simple).
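For reference, a plain osmosis invocation for that first (snapshot) step could look roughly like the sketch below; the option names are from osmosis' pgsnapshot writer as I recall them, and the exact options the LGD script passes may differ, so treat this as illustrative only:
# illustrative only - database name and credentials are placeholders, and the snapshot schema must already exist
osmosis --read-pbf file=Mecklenburg-Vorpommern-latest.osm.pbf \
        --write-pgsql host=localhost database=databasename user=user password=password \
        nodeLocationStoreType=CompactTempFile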
Do you have SSDs? The latter loading strategy might be significantly faster on non-SSDs (although it shouldn't be hours for a 50MB file).
Maybe a system load indicator such as htop or indicator-multiload could help you reveal resource problems (such as running out of RAM or high disk I/O by another process).
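If the import looks stuck, it is also worth checking on the database side whether the loading session is still executing statements or waiting on a lock; pg_stat_activity is a standard view, so this is a generic check rather than anything LGD-specific (on 9.1 the columns are procpid/current_query, from 9.2 onwards they are pid/query plus a state column):
-- PostgreSQL 9.1
SELECT procpid, waiting, current_query FROM pg_stat_activity;
-- PostgreSQL 9.2 and later
-- SELECT pid, state, waiting, query FROM pg_stat_activity;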

Related

Handle more than 5000 connections without PGBouncer in PostgreSQL

We are using PostgreSQL version 14.
Our PC configuration in production:
Windows Server 2016
32 GB RAM
8 TB hard disk
8-core CPU
In my postgresql.conf file:
shared_buffers = 8GB
work_mem = 1GB
maintenance_work_mem = 1GB
max_connections = 1000
I want to handle 5000 connections at a time. Somebody suggested I go with PgBouncer.
But we would initially like to start with PostgreSQL without PgBouncer.
I need to know whether my configuration is OK for 5000 connections, or whether we need to increase RAM or anything else.
This is our first PostgreSQL implementation, so please suggest how to start with PostgreSQL without PgBouncer.
Thank you.
Note:
In SQL Server, if we set -1 it will handle a larger number of connections. Is there any similar configuration available in PostgreSQL?
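For reference, PostgreSQL has no "-1 means unlimited" setting; max_connections is an explicit limit, and changing it requires a server restart. You can compare the configured limit with the number of connections actually in use like this:
SHOW max_connections;
SELECT count(*) FROM pg_stat_activity;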

Increase max_connections in PostgreSQL

My server config:
CPU: 16 cores
RAM: 64 GB
Storage: 2 TB
OS: CentOS, 64-bit
I have DB and java application on the same server.
My postgres config file has the following:
max_connections = 9999
shared_buffers = 6GB
However, when I check the database via SHOW max_connections; it shows only 500.
How can I increase the max_connections value?
Either you forgot to remove the comment (#) at the beginning of the postgresql.conf line, or you didn't restart PostgreSQL.
But a setting of 500 is already much too high, unless you have some 100 cores in the machine and an I/O system to match. Use a connection pool.
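A minimal sketch of the fix described above, assuming a systemd-based CentOS install where the service unit is simply called postgresql (on some installs it carries a version suffix):
# in postgresql.conf - make sure the line has no leading '#':
max_connections = 500          # or whatever value you settle on; see the advice above about keeping this low
# max_connections is only read at server start, so a reload is not enough:
sudo systemctl restart postgresql
# then verify from psql:
#   SHOW max_connections;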

The configuration of postgresql.conf for PostgreSQL

I have a server with 32GB RAM on a Windows platform. Of that, 8GB is assigned to the JVM. 500 is the maximum number of Postgres connections. The database size is around 24GB. The Postgres version is 9.2.3.
What would be the best configuration in postgresql.conf?
I am a newbie with Postgres databases. I appreciate your help.
This is the current configuration:
max_connections = 500
shared_buffers = 1GB
temp_buffers = 512MB
work_mem = 24MB
maintenance_work_mem = 512MB
wal_buffers = 8MB
effective_cache_size = 8GB
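Whatever values you settle on, it helps to confirm what the running server has actually picked up; pg_settings is a standard system view, so a query like this works on 9.2 as well:
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_connections', 'shared_buffers', 'work_mem',
               'maintenance_work_mem', 'effective_cache_size');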

How to increase the max connections in postgres?

I am using a Postgres DB for my product. While doing a batch insert using Slick 3, I am getting the error message:
org.postgresql.util.PSQLException: FATAL: sorry, too many clients already.
My batch insert operation will involve more than a thousand records.
The max connections for my Postgres is 100.
How do I increase the max connections?
Just increasing max_connections is a bad idea. You need to increase shared_buffers and kernel.shmmax as well.
Considerations
max_connections determines the maximum number of concurrent connections to the database server. The default is typically 100 connections.
Before increasing your connection count you might need to scale up your deployment. But before that, you should consider whether you really need an increased connection limit.
Each PostgreSQL connection consumes RAM for managing the connection or the client using it. The more connections you have, the more RAM you will be using that could instead be used to run the database.
A well-written app typically doesn't need a large number of connections. If you have an app that does need a large number of connections, then consider using a tool such as PgBouncer, which can pool connections for you. As each connection consumes RAM, you should look to minimize their use.
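If you do go the pooling route, a minimal PgBouncer configuration could look roughly like the sketch below; the database name, paths and pool sizes are purely illustrative, not a recommendation for any particular workload:
; pgbouncer.ini (illustrative)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000      ; many client connections...
default_pool_size = 20      ; ...funnelled into a small pool of real server connections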
How to increase max connections
1. Increase max_connections and shared_buffers
in /var/lib/pgsql/{version_number}/data/postgresql.conf
change
max_connections = 100
shared_buffers = 24MB
to
max_connections = 300
shared_buffers = 80MB
The shared_buffers configuration parameter determines how much memory is dedicated to PostgreSQL for caching data.
If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system. It's unlikely you'll find that using more than 40% of RAM works better than a smaller amount (like 25%).
Be aware that if your system or PostgreSQL build is 32-bit, it might not be practical to set shared_buffers above 2 to 2.5GB.
Note that on Windows, large values for shared_buffers aren't as effective, and you may find better results keeping it relatively low and using the OS cache more instead. On Windows the useful range is 64MB to 512MB.
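As a quick worked example of the 25% guideline: on a dedicated machine with 8GB of RAM, that comes out to roughly
shared_buffers = 2GB      # about 25% of 8GB, leaving the rest for work_mem and the OS cache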
2. Change kernel.shmmax
You need to increase the kernel's maximum shared memory segment size so that it is slightly larger than shared_buffers.
In the file /etc/sysctl.conf, set the parameter as shown below. It takes effect after a reboot (see the note below for applying it without rebooting); the following line sets the kernel maximum to 96MB:
kernel.shmmax=100663296
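That number is simply 96 * 1024 * 1024 bytes. If you don't want to wait for a reboot, sysctl can apply /etc/sysctl.conf immediately:
# 96MB = 96 * 1024 * 1024 = 100663296 bytes
sudo sysctl -p            # re-reads /etc/sysctl.conf and applies kernel.shmmax right away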
References
Postgres Max Connections And Shared Buffers
Tuning Your PostgreSQL Server
Adding to Winnie's great answer: if you are not able to find the postgresql.conf file location in your setup, you can always ask Postgres itself:
SHOW config_file;
For me, changing max_connections alone did the trick.
EDIT: From #gies0r: In Ubuntu 18.04 it is at
/etc/postgresql/11/main/postgresql.conf
If your postgres instance is hosted by Amazon RDS, Amazon configures the max connections for you based on the amount of memory available.
Their documentation says you get 112 connections per 1 GB of memory (with a limit of 5000 connections no matter how much memory you have), but we found we started getting error messages closer to 80 connections in an instance with only 1 GB of memory. Increasing to 2 GB let us use 110 connections without a problem (and probably more, but that's the most we've tried so far.) We were able to increase the memory of an existing instance from 1 GB to 2 GB in just a few minutes pretty easily.
Here's the link to the relevant Amazon documentation:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Limits.html#RDS_Limits.MaxConnections
Change the max_connections variable in the postgresql.conf file, located in
/var/lib/pgsql/data or /usr/local/pgsql/data/
Locate postgresql.conf with the command below:
locate postgresql.conf
Edit postgresql.conf with the command below:
sudo nano /etc/postgresql/14/main/postgresql.conf
Change
max_connections = 100
shared_buffers = 24MB
to
max_connections = 300
shared_buffers = 80MB

PostgreSQL does not restart after changing max_connections or shared_buffers

I have tested pgtune against my postgresql.conf, so I know what I can change there, but when I try to change max_connections or shared_buffers I can't restart Postgres. I just get an error, but there is nothing in the log specifying what it is. (Not sure where those logs go, but they are not in the regular pg_log dir.)
My settings is:
shared_buffers = 24MB # (pgtune wizard 2013-04-11 = 120MB)
max_connections = 120 # (pgtune wizard 2013-04-11 = 200)
I'm on a 512MB Linode which only runs PostgreSQL. If I change shared_buffers beyond 24MB or max_connections beyond 120, I can't restart Postgres.
I'm running on a Linode xen instance with Ubuntu 12.04.2 LTS:
Ubuntu 12.04.2 LTS (GNU/Linux 3.8.4-x86_64-linode31 x86_64)
Does anyone know whether Postgres itself determines that 24MB and 120 connections is the maximum for my system?
It sounds like you're probably exceeding a very low default limit for shared memory.
This is covered in the manual - see operating system resource limits. For Linux, see kernel.shmmax.
On a side-note, increasing max_connections is often the wrong answer. Most PostgreSQL instances will work best with a relatively small number of actively working connections. It's often best to use connection pooling to queue up work; you'll get better overall throughput with lower resource use. If your application doesn't have a connection pool built-in you can use PgBouncer as an external connection pool.
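A quick way to check whether the shared memory limit is what's biting you, and to find the startup error on a Debian/Ubuntu package install (the log path below assumes the stock 12.04 packages, i.e. PostgreSQL 9.1 and the default "main" cluster, so adjust version and cluster as needed):
# current kernel limits for shared memory
sysctl kernel.shmmax kernel.shmall
# with the Ubuntu/Debian packages, startup errors usually land here rather than in pg_log:
tail -n 50 /var/log/postgresql/postgresql-9.1-main.log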