Two PostgreSQL servers with the same configuration, different performance - postgresql

I have two identical servers, both running PostgreSQL server version 9.0.4 with the same configuration. If I launch a .sql file that performs about 5k inserts, it takes a couple of seconds on the first one and 1 minute and 30 seconds on the second one.
If I set synchronous_commit to off, execution time drops dramatically (as expected) and the two servers perform comparably. But if I set synchronous_commit to on, on one server the insert script's execution time increases by less than a second, while on the other it blows up to the minute and a half I mentioned above.
Any idea about this difference in performance? Am I missing some configuration?
Update: tried a simple disk test: time sh -c "dd if=/dev/zero of=ddfile bs=8k count=200000 && sync"
fast server output:
1638400000 bytes (1.6 GB) copied, 1.73537 seconds, 944 MB/s
real 0m32.009s
user 0m0.018s
sys 0m2.298s
slow server output:
1638400000 bytes (1.6 GB) copied, 4.85727 s, 337 MB/s
real 0m35.045s
user 0m0.019s
sys 0m2.221s
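A per-write sync variant of the same test gets closer to what synchronous_commit actually exercises, since every 8k block is flushed instead of one big sync at the end (just a sketch; the file name and block count are arbitrary):
time dd if=/dev/zero of=ddfile bs=8k count=10000 oflag=dsync
Comparing the two servers on this figure should expose per-commit flush latency more directly than raw sequential throughput.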
Common features (both servers):
SATA, RAID1, controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller, distribution: Linux CentOS. mount -v output:
/dev/md2 on / type ext3 (rw)
proc on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md1 on /boot type ext3 (rw)
fast server: kernel 2.6.18-238.9.1.el5 #1 SMP
Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sda1 3906 4209029 2102562 fd Linux raid autodetect
/dev/sda2 4209030 4739174 265072+ fd Linux raid autodetect
/dev/sda3 4739175 1465144064 730202445 fd Linux raid autodetect
slow server: kernel 2.6.32-71.29.1.el6.x86_64 #1 SMP
Disk /dev/sda: 750.2 GB, 750156374016 bytes
64 heads, 32 sectors/track, 715404 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006ffc4
Device Boot Start End Blocks Id System
/dev/sda1 2048 4194303 2096128 fd Linux raid autodetect
/dev/sda2 4194304 5242879 524288 fd Linux raid autodetect
/dev/sda3 5242880 1465147391 729952256 fd Linux raid autodetect
Could this information be useful for addressing the performance issue?

I suppose your slow server, with its newer kernel, has working write barriers. This is good, as otherwise you can lose data in case of a power failure. But it is of course slower than running with the write cache enabled and without barriers, a.k.a. running with scissors.
You can check whether barriers are enabled using mount -v — look for barrier=1 in the output. You can disable barriers for your filesystem (mount -o remount,barrier=0 /) to speed things up, but then you risk data corruption.
Try to do your 5k inserts in one transaction — Postgres won't have to write to disk for every row inserted. The theoretical limit for the number of transactions per second would be comparable to the disk's rotational speed (a 7200 rpm disk ≈ 7200/60 = 120 tps), as a disk can only write to a given sector once per rotation.
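As a minimal sketch of that (the file and database names are just placeholders):
psql -1 -f inserts.sql mydb
The -1 (--single-transaction) option wraps the whole script in one BEGIN/COMMIT, so the disk is synced once at the end rather than once per INSERT.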

To me this sounds like the "fast" server has a write cache enabled for the hard disk(s), whereas on the slow server the hard disk(s) are really writing the data when PG writes it (by calling fsync).
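One way to check this would be hdparm, assuming SATA disks such as /dev/sda behind the RAID1 (a sketch, not tested on your boxes):
hdparm -W /dev/sda        # show whether the drive's write cache is enabled
hdparm -W0 /dev/sda       # disable it (safer for fsync, slower)
Running the first command on both servers and comparing the output would confirm or rule out this theory.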

Related

High CPU usage of PostgreSQL

I have a PostgreSQL-backed, complex Ruby on Rails application running on an Ubuntu virtual machine. I see the Postgres processes having very high %CPU values while running the "top" command. Periodically the %CPU goes up to 94 and 95.
lscpu
gives the following output:
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Stepping: 4
CPU MHz: 2100.000
BogoMIPS: 4200.00
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 33792K
I have also captured the output of top -n1 and top -c.
I want to know the reason for the high CPU utilization by Postgres.
Any help is appreciated.
Thanks in advance!!
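One thing worth checking while the CPU is pegged (a minimal sketch; it assumes PostgreSQL 9.2 or later, where pg_stat_activity has the state and query columns):
psql -c "SELECT pid, now() - query_start AS runtime, state, query FROM pg_stat_activity WHERE state <> 'idle' ORDER BY runtime DESC;"
Long-running or constantly repeated queries at the top of that list are the usual suspects for sustained high %CPU.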

pg_top output analysis of puppetdb with postgres

I recently started using a tool called pg_top that shows statistics for Postgres; however, since I am not very well versed in the internals of Postgres, I need a bit of clarification on the output.
last pid: 6152; load avg: 19.1, 18.6, 20.4; up 119+20:31:38 13:09:41
41 processes: 5 running, 36 sleeping
CPU states: 52.1% user, 0.0% nice, 0.8% system, 47.1% idle, 0.0% iowait
Memory: 47G used, 16G free, 2524M buffers, 20G cached
DB activity: 151 tps, 0 rollbs/s, 253403 buffer r/s, 86 hit%, 1550639 row r/s,
21 row w/s
DB I/O: 0 reads/s, 0 KB/s, 35 writes/s, 2538 KB/s
DB disk: 233.6 GB total, 195.1 GB free (16% used)
Swap:
My question is about the DB activity line: is 1.5 million rows read per second a lot? If so, what can be done to improve it? I am running PuppetDB 2.3.8 with 6.8 million resources, 2,500 nodes, and Postgres 9.1. All of this runs on a single 24-core box with 64 GB of memory.
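For what it's worth, the 86 hit% figure above can be cross-checked per database with a query like this (a sketch, assuming psql access on the same box; the database name 'puppetdb' is a guess):
psql -c "SELECT datname, blks_hit, blks_read, round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 1) AS hit_pct FROM pg_stat_database WHERE datname = 'puppetdb';"
A very high row-read rate combined with a mediocre hit percentage usually points at sequential scans over large tables, so missing indexes would be the first thing to look for.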

Hugepagesize is not increasing to 1G in VM

I am using a CentOS VM on an ESXi server. I want to increase the huge page size to 1 GB.
I followed this guide:
http://dpdk-guide.gitlab.io/dpdk-guide/setup/hugepages.html
I executed the small script to check whether a size of 1 GB is supported:
[root@localhost ~]# if grep pdpe1gb /proc/cpuinfo >/dev/null 2>&1; then echo "1GB supported."; fi
1GB supported.
[root@localhost ~]#
I added default_hugepagesz=1GB hugepagesz=1G hugepages=4 to /etc/default/grub and regenerated the GRUB configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg
Rebooted the VM.
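The relevant line in /etc/default/grub ends up looking roughly like this (the options before the hugepage settings are only an example of typical CentOS defaults):
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=4"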
But I still see 2048 kB (2 MB) for the huge page size:
[root@localhost ~]# cat /proc/meminfo | grep -i huge
AnonHugePages: 8192 kB
HugePages_Total: 1024
HugePages_Free: 1024
HugePages_Rsvd: 0
HugePages_Surp: 0
**Hugepagesize: 2048 kB**
[root@localhost ~]#
The following are the details of the VM:
[root@localhost ~]# uname -a
Linux localhost.localdomain 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]#
[root@localhost ~]# cat /proc/cpuinfo | grep -i flags
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi ept vpid
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi ept vpid
[root@localhost ~]#
8 GB of memory and 2 CPUs are allocated to the VM.
The CPU flag advertising 1 GB hugepage support, plus guest-OS support and enabling, are not enough to get 1 GB hugepages working in a virtualized environment.
The idea of huge pages, both at the PMD level (2 MB, or 4 MB before PAE and x86_64) and at the PUD level (1 GB), is to map an aligned virtual region of huge size onto a huge region of physical memory (as I understand it, that region must be aligned too). With the additional virtualization layer of a hypervisor there are now three (or four) memory levels: the virtual memory of the app in the guest OS, the memory the guest OS considers physical (actually managed by the virtualization solution: ESXi, Xen, KVM, ...), and the real physical memory. It is reasonable to assume that, for huge pages to be useful, the same huge region size should be used at all three levels (fewer TLB misses, fewer page-table structures needed to describe a lot of memory — grep for "Need bigger than 4KB pages" in Dick Sites's "Datacenter Computers: modern challenges in CPU design", Google, Feb 2015).
So, to use huge pages of some level inside the guest OS, you should already have same-sized huge pages in physical memory (in your host OS) and in your virtualization solution. You can't effectively use huge pages inside the guest when they are not available to your host OS and virtualization software. (Some emulators like qemu or bochs may emulate them, but that will be slow to very slow.) And when you want both 2 MB and 1 GB huge pages: your CPU, host OS, virtualization solution and guest OS must all support them (and the host system must have enough aligned, contiguous physical memory to allocate a 1 GB page; you probably can't split such a page across several sockets in NUMA).
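A quick way to verify the host side, on a Linux/KVM host at least (a sketch; the sysfs path is standard on Linux and does not apply to ESXi itself):
grep -o pdpe1gb /proc/cpuinfo | sort -u     # the CPU can do 1 GB pages
ls /sys/kernel/mm/hugepages/                # sizes the host kernel actually offers, e.g. hugepages-2048kB, hugepages-1048576kB
If hugepages-1048576kB is missing on the host, no amount of guest configuration will produce real 1 GB pages.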
I don't know about ESXi, but here are some links for Red Hat and some(?) Linux virtualization solutions (with libvirtd). In the "Virtualization Tuning and Optimization Guide" manual, hugepages are documented for the host OS in section 8.3.3.3, "Enabling 1 GB huge pages for guests at boot or runtime":
Procedure 8.2. Allocating 1 GB huge pages at boot time
To allocate different sizes of huge pages at boot, use the following kernel command line, specifying the number of huge pages. This example allocates four 1 GB huge pages and 1024 2 MB huge pages: 'default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024'. Change this command line to specify a different number of huge pages to be allocated at boot.
Note: the next two steps must also be completed the first time you allocate 1 GB huge pages at boot time.
Mount the 2 MB and 1 GB huge pages on the host:
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
# mkdir /dev/hugepages2M
# mount -t hugetlbfs -o pagesize=2M none /dev/hugepages2M
Restart libvirtd to enable the use of 1 GB huge pages on guests:
# systemctl restart libvirtd
1 GB huge pages are now available for guests.
For Ubuntu and KVM: "KVM - Using Hugepages"
By increasing the page size, you reduce the page table and reduce the pressure on the TLB cache. ... vm.nr_hugepages = 256 ... Reboot the system (note: this is about physical reboot of host machine and host OS) ... Set up Libvirt to use Huge Pages KVM_HUGEPAGES=1 ... Setting up a guest to use Huge Pages
For Fedora and KVM (old manual about 2MB pages): https://fedoraproject.org/wiki/Features/KVM_Huge_Page_Backed_Memory
ESXi 5 had support for 2 MB pages, which has to be enabled manually: "How to Modify Large Memory Page Settings on ESXi".
For "VMware’s ESX server" of unknown version, from March 2015 paper: BQ Pham, "Using TLB Speculation to Overcome Page Splintering in Virtual Machines", Rutgers University Technical Report DCS-TR-713, March 2015:
Lack of hypervisor support for large pages: Finally, hypervisor vendors can take a few production cycles before fully adopting large pages. For example, VMware’s ESX server currently has no support for 1GB large pages in the hypervisor,
even though guests on x86-64 systems can use them.
Newer paper, no direct conclusion about 1GB pages: https://rucore.libraries.rutgers.edu/rutgers-lib/49279/PDF/1/
We find that large pages are conflicted with lightweight memory management across a range of hypervisors (e.g., ESX, KVM) across architectures (e.g., ARM, x86-64) and container-based technologies.
Old PDF from VMware, "Large Page Performance: ESX Server 3.5 and ESX Server 3i v3.5" (https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/large_pg_performance.pdf) — only 2 MB huge pages are listed as supported:
VMware ESX Server 3.5 and VMware ESX Server 3i v3.5 introduce 2MB large page support to the virtualized
environment. In earlier versions of ESX Server, guest operating system large pages were emulated using small
pages. This meant that, even if the guest operating system was using large pages, it did not get the
performance benefit of reducing TLB misses. The enhanced large page support in ESX Server 3.5 and ESX
Server 3i v3.5 enables 32‐bit virtual machines in PAE mode and 64‐bit virtual machines to make use of large
pages.
Passing the host CPU through to the VM worked for me; that gives the VM the pdpe1gb CPU flag.
I use QEMU + libvirt, with 1G hugepagesz enabled on the host.
Maybe it is helpful: set the CPU feature in the XML describing the VM as follows:
<cpu mode='custom' match='exact' check='partial'>
<model fallback='allow'>Broadwell</model>
<feature policy='force' name='pdpe1gb'/>
</cpu>
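An alternative that matches the "passthrough" approach more literally (again assuming libvirt) is to replace the whole <cpu> element with <cpu mode='host-passthrough'/>, which exposes all host CPU flags, including pdpe1gb, to the guest.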

Postgres gets out of memory errors despite having plenty of free memory

I have a server running Postgres 9.1.15. The server has 2GB of RAM and no swap. Intermittently Postgres will start getting "out of memory" errors on some SELECTs, and will continue doing so until I restart Postgres or some of the clients that are connected to it. What's weird is that when this happens, free still reports over 500MB of free memory.
select version();:
PostgreSQL 9.1.15 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
uname -a:
Linux db 3.2.0-23-virtual #36-Ubuntu SMP Tue Apr 10 22:29:03 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
postgresql.conf (everything else is commented out/default):
max_connections = 100
shared_buffers = 500MB
work_mem = 2000kB
maintenance_work_mem = 128MB
wal_buffers = 16MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 1000MB
default_statistics_target = 100
log_temp_files = 0
I got these values from pgtune (I chose "mixed type of applications") and have been fiddling with them based on what I've read, without making much real progress. At the moment there are 68 connections, which is a typical number (I'm not using pgbouncer or any other connection pooler yet).
/etc/sysctl.conf:
kernel.shmmax=1050451968
kernel.shmall=256458
vm.overcommit_ratio=100
vm.overcommit_memory=2
I first changed overcommit_memory to 2 about a fortnight ago after the OOM killer killed the Postgres server. Prior to that the server had been running fine for a long time. The errors I get now are less catastrophic but much more annoying because they are much more frequent.
I haven't had much luck pinpointing the first event that causes postgres to run "out of memory" - it seems to be different each time. The most recent time it crashed, the first three lines logged were:
2015-04-07 05:32:39 UTC ERROR: out of memory
2015-04-07 05:32:39 UTC DETAIL: Failed on request of size 125.
2015-04-07 05:32:39 UTC CONTEXT: automatic analyze of table "xxx.public.delayed_jobs"
TopMemoryContext: 68688 total in 10 blocks; 4560 free (4 chunks); 64128 used
[... snipped heaps of lines which I can provide if they are useful ...]
---
2015-04-07 05:33:58 UTC ERROR: out of memory
2015-04-07 05:33:58 UTC DETAIL: Failed on request of size 16.
2015-04-07 05:33:58 UTC STATEMENT: SELECT oid, typname, typelem, typdelim, typinput FROM pg_type
2015-04-07 05:33:59 UTC LOG: could not fork new process for connection: Cannot allocate memory
2015-04-07 05:33:59 UTC LOG: could not fork new process for connection: Cannot allocate memory
2015-04-07 05:33:59 UTC LOG: could not fork new process for connection: Cannot allocate memory
TopMemoryContext: 396368 total in 50 blocks; 10160 free (28 chunks); 386208 used
[... snipped heaps of lines which I can provide if they are useful ...]
---
2015-04-07 05:33:59 UTC ERROR: out of memory
2015-04-07 05:33:59 UTC DETAIL: Failed on request of size 1840.
2015-04-07 05:33:59 UTC STATEMENT: SELECT... [nested select with 4 joins, 19 ands, and 2 order bys]
TopMemoryContext: 388176 total in 49 blocks; 17264 free (55 chunks); 370912 used
The crash before that, a few hours earlier, just had three instances of that last query as the first three lines of the crash. That query gets run very often, so I'm not sure if the issues are because of this query, or if it just comes up in the error log because it's a reasonably complex SELECT getting run all the time. That said, here's an EXPLAIN ANALYZE of it: http://explain.depesz.com/s/r00
This is what ulimit -a for the postgres user looks like:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 15956
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 15956
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I'll try to get the exact numbers from free next time there's a crash; in the meantime, this is a brain dump of all the info I have.
Any ideas on where to go from here?
I just ran into this same issue with a ~2.5 GB plain-text SQL file I was trying to restore. I scaled my Digital Ocean server up to 64 GB RAM, created a 10 GB swap file, and tried again. I got an out-of-memory error with 50 GB free, and no swap in use.
I scaled back my server to the small 1 GB instance I was using (requiring a reboot) and figured I'd give it another shot for no other reason than I was frustrated. I started the import and realized I forgot to create my temporary swap file again.
I created it in the middle of the import. psql made it a lot further before crashing. It made it through 5 additional tables.
I think there must be a bug allocating memory in psql.
Can you check whether there's any swap available when the error comes up?
I completely removed the swap on my Linux desktop (just for testing other things...) and I got exactly the same error! I'm pretty sure that this is what is going on with you too.
It is a bit suspicious that you report the same free memory size as your shared_buffers size. Are you sure you are looking at the right values?
The output of the free command at the time of a crash would be useful, as well as the contents of /proc/meminfo.
Beware that setting overcommit_memory to 2 is not very effective if you set overcommit_ratio to 100. It basically limits memory allocation to the size of swap (0 in this case) plus 100% of physical RAM, which doesn't leave any room for shared memory and disk caches.
You should probably set overcommit_ratio to 50.
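As a sketch of that change (the value is the one suggested above; adjust to your box):
sysctl -w vm.overcommit_ratio=50
To make it permanent, update the vm.overcommit_ratio line in /etc/sysctl.conf and run sysctl -p. With overcommit_memory=2, no swap, and a ratio of 50, allocations are capped at roughly half of the 2 GB of RAM, leaving the rest as headroom for shared memory and the page cache.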

strange G-WAN response speed differences

I have just implemented a G-WAN web server and tested my code; however, it is very strange that my server sometimes responds very fast (20 ms) and sometimes takes several seconds (6–7 s) or even times out...
I tried simplifying my code to just return a string to clients, and the problem still occurs...
Besides, I logged the time consumed by my code, and it is never over 1 second, so what is causing the problem?!
I guessed this was caused by network delay, but I tested the network speed of the same server and it is very fast. Any idea? (Could the problem be caused by including some 3rd-party library like MySQL?)
Here is my G-WAN log:
*------------------------------------------------
*G-WAN 4.3.14 64-bit (Mar 14 2013 07:33:12)
* ------------------------------------------------
* Local Time: Mon, 29 Jul 2013 10:09:05 GMT+8
* RAM: (918.46 MiB free + 0 shared + 222.81 MiB buffers) / 1.10 GiB total
* Physical Pages: 918.46 MiB / 1.10 GiB
* DISK: 3.27 GiB free / 6.46 GiB total
* Filesystem Type Size Used Avail Use% Mounted on
* /dev/mapper/vg_centos6-root
* ext4 6.5G 3.2G 3.0G 52% /
* tmpfs tmpfs 1004M 8.2M 995M 1% /dev/shm
* /dev/xvda1 ext4 485M 129M 331M 28% /boot
* 105 processes, including pid:10874 '/opt/gwan/gwan'
* Page-size:4,096 Child-max:65,535 Stream-max:16
* CPU: 1x Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
* 0 id: 0 0
* Cores: possible:0-14 present:0 online:0
* L1d cache: 32K line:64 0
* L1i cache: 32K line:64 0
* L2 cache: 256K line:64 0
* L3 cache: 4096K line:64 0
* NUMA node #1 0
* CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
* Bogomips: 4,256.14
* Hypervisor: XenVMMXenVMM
* using 1 workers 0[1]0
* among 2 threads 0[]1
* 64-bit little-endian (least significant byte first)
* CentOS release 6.3 (Final) (3.5.5-1.) 64-bit
* user: root (uid:0), group: root (uid:0)
* system fd_max: 65,535
* program fd_max: 65,535
* updated fd_max: 500,000
* Available network interfaces (3):
* 127.0.0.1
* 192.168.0.1
* xxx.xxx.xxx.xxx
* memory footprint: 1.39 MiB.
* Host /opt/gwan/0.0.0.0_8080/#0.0.0.0
* loaded index.c 3.46 MiB MD5:afb6c263-791c706a-598cc77b-e0873517
* memory footprint: 3.40 MiB.
If I use -g mode and increase the number of workers up to the number of CPUs of the server, this problem seems to be resolved.
Then it seems to be a CPU detection issue. Please dump the relevant part of your gwan.log file header (CPU detection) so we can have a look.
When G-WAN has to re-compile a servlet using external libraries that must be searched and linked, this may take time (especially if there's only one worker and other requests are pending).
UPDATE: following your gwan.log file dump, here is what's important:
CPU: 1x Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
0 id: 0 0
Cores: possible:0-14 present:0 online:0
CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
Hypervisor: XenVMMXenVMM
using 1 workers 0[1]0
among 2 threads 0[]1
The Intel E5506 is a 4-Core CPU... but the Xen Hypervisor is reporting 1 CPU and 0 Cores (and hyperthreading enabled, which makes no sense without any CPU Core).
Why Xen finds it a priority to corrupt the genuine and correct information about the CPU with complete nonsense is beyond the scope of this discussion.
All I can say is that this is the cause of the issue experienced by 'moriya' (hence the 'fix' with ./gwan -g -w 4 to bypass the wrong information reported by the corrupted Linux kernel /proc and the CPUID instruction).
I can only suggest to avoid using brain-damaged hypervisors which prevent multicore software (like G-WAN) from running correctly by sabotaging the two standard ways to detect CPU topologies: the Linux kernel /proc structure and the CPUID instruction.
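As a concrete form of that workaround (a sketch; the worker count of 4 matches the E5506's physical cores and should be adjusted to the actual CPU):
lscpu                  # see what topology the guest kernel actually reports
./gwan -g -w 4         # then force the worker count manually instead of relying on detection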