strange G-WAN response speed differences - multicore

I have just set up the G-WAN web server and am testing my code on it. Strangely, the server sometimes responds very fast (20 ms), and sometimes takes several seconds (6–7 s) or even times out.
I tried simplifying my code so that it only returns a string to clients, but the problem still occurs.
Besides, I logged the time consumed by my code, and it never exceeds 1 second, so what causes the problem?
I suspected network delay, but I tested the network speed of the same server and it is very fast. Any ideas? (Could the problem be caused by including a third-party library such as MySQL?)
Here is my G-WAN log:
*------------------------------------------------
*G-WAN 4.3.14 64-bit (Mar 14 2013 07:33:12)
* ------------------------------------------------
* Local Time: Mon, 29 Jul 2013 10:09:05 GMT+8
* RAM: (918.46 MiB free + 0 shared + 222.81 MiB buffers) / 1.10 GiB total
* Physical Pages: 918.46 MiB / 1.10 GiB
* DISK: 3.27 GiB free / 6.46 GiB total
* Filesystem Type Size Used Avail Use% Mounted on
* /dev/mapper/vg_centos6-root
* ext4 6.5G 3.2G 3.0G 52% /
* tmpfs tmpfs 1004M 8.2M 995M 1% /dev/shm
* /dev/xvda1 ext4 485M 129M 331M 28% /boot
* 105 processes, including pid:10874 '/opt/gwan/gwan'
* Page-size:4,096 Child-max:65,535 Stream-max:16
* CPU: 1x Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
* 0 id: 0 0
* Cores: possible:0-14 present:0 online:0
* L1d cache: 32K line:64 0
* L1i cache: 32K line:64 0
* L2 cache: 256K line:64 0
* L3 cache: 4096K line:64 0
* NUMA node #1 0
* CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
* Bogomips: 4,256.14
* Hypervisor: XenVMMXenVMM
* using 1 workers 0[1]0
* among 2 threads 0[]1
* 64-bit little-endian (least significant byte first)
* CentOS release 6.3 (Final) (3.5.5-1.) 64-bit
* user: root (uid:0), group: root (uid:0)
* system fd_max: 65,535
* program fd_max: 65,535
* updated fd_max: 500,000
* Available network interfaces (3):
* 127.0.0.1
* 192.168.0.1
* xxx.xxx.xxx.xxx
* memory footprint: 1.39 MiB.
* Host /opt/gwan/0.0.0.0_8080/#0.0.0.0
* loaded index.c 3.46 MiB MD5:afb6c263-791c706a-598cc77b-e0873517
* memory footprint: 3.40 MiB.

If I use -g mode and increase the number of workers to the number of CPUs of the server, the problem seems to be resolved.
Then it seems to be a CPU detection issue. Please dump the relevant part of your gwan.log file header (CPU detection) so we can have a look.
When G-WAN has to re-compile a servlet using external libraries that must be searched and linked, this may take time (especially if there's only one worker and other requests are pending).
UPDATE: following your gwan.log file dump, here is what's important:
CPU: 1x Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
0 id: 0 0
Cores: possible:0-14 present:0 online:0
CPU(s):1, Core(s)/CPU:0, Thread(s)/Core:2
Hypervisor: XenVMMXenVMM
using 1 workers 0[1]0
among 2 threads 0[]1
The Intel E5506 is a 4-Core CPU... but the Xen Hypervisor is reporting 1 CPU and 0 Cores (and hyperthreading enabled, which makes no sense without any CPU Core).
Why Xen finds it a priority to corrupt the genuine and correct information about the CPU with complete nonsense is beyond the scope of this discussion.
All I can say is that this is the cause of the issue experienced by 'moriya' (hence the 'fix' with ./gwan -g -w 4 to bypass the wrong information reported by the corrupted Linux kernel /proc and the CPUID instruction).
I can only suggest avoiding brain-damaged hypervisors which prevent multicore software (like G-WAN) from running correctly by sabotaging the two standard ways to detect CPU topologies: the Linux kernel /proc structure and the CPUID instruction.
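To see what topology the kernel exposes inside a VM, one can compare the number of logical CPUs with the distinct (physical id, core id) pairs listed in /proc/cpuinfo. The following is only a minimal sketch: the function name count_topology and the SAMPLE text are made up for illustration, and this is not how G-WAN itself performs its detection.

```python
import os

# Hypothetical sample in /proc/cpuinfo format: two logical CPUs on one core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0
"""

def count_topology(cpuinfo_text):
    """Return (logical_cpus, distinct_cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores = set()
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Some hypervisors omit these fields; fall back to the processor number.
        cores.add((fields.get("physical id", "?"),
                   fields.get("core id", fields["processor"])))
    return logical, len(cores)

if __name__ == "__main__":
    print("sample:", count_topology(SAMPLE))
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            print("this machine:", count_topology(f.read()))
    print("os.cpu_count():", os.cpu_count())
```

If the numbers reported inside the guest disagree with the physical CPU, setting the worker count explicitly (as with -w above) sidesteps the detection.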

Related

Bitnami Helm chart fails to launch pods

My VM has the configuration below, but when I download bitnami/dokuwiki or any other chart from the Bitnami charts and run the deployment, the pods stay Pending or go into CrashLoopBackOff. Can someone help with this?
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz
Stepping: 7
CPU MHz: 2095.077
BogoMIPS: 4190.15
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 128 KiB
L1i cache: 128 KiB
L2 cache: 4 MiB
L3 cache: 143 MiB
NUMA node0 CPU(s): 0-3
Issue:
I tried applying a PV, but the pods are still not running. I want to get these pods running.

High CPU usage of PostgreSQL

I have a complex PostgreSQL-backed Ruby on Rails application running on an Ubuntu virtual machine. When I run the "top" command, I see that the Postgres processes have very high %CPU values. Periodically the %CPU goes up to 94 or 95.
lscpu
gives the following output
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Stepping: 4
CPU MHz: 2100.000
BogoMIPS: 4200.00
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 33792K
top -n1
top -c
I want to know the reason for the high CPU utilization by Postgres.
Any help is appreciated.
Thanks in Advance!!

How to configure U-Boot for 512 MB RAM

I'm using an imx6slevk board with yocto-bsp.
I use mfgtool to flash the images and it works fine for 1GB DRAM.
Now I'm trying to change DRAM to 512 MB.
I modified the dts file memory node:
memory {
reg = <0x80000000 0x20000000>; //it was 0x40000000
};
I ran the calibration tool and updated 2 registers:
DATA 4 0x021b0848 0x4644484a //changed for 512 mb old value = 0x4241444a
DATA 4 0x021b0850 0x3a363a30 //changed for 512 mb old value = 0x3030312b
However, the U-Boot log still shows 1 GiB DRAM when flashing:
U-Boot 2017.03-imx_v2017.03_4.9.88_2.0.0_ga+gb76bb1b (Sep 24 2019 - 11:04:03 +0530)
CPU: Freescale i.MX6SL rev1.2 996 MHz (running at 792 MHz)
CPU: Commercial temperature grade (0C to 95C) at 48C
Reset cause: POR
Model: Freescale i.MX6 SoloLite EVK Board
Board: MX6SLEVK
DRAM: 1 GiB
How can I change DRAM from 1 GiB to 512 MiB?
The kernel doesn't flash without this.
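As a sanity check on the memory node above, the second cell of reg is the size in bytes, so the old and new values can be converted to MiB. This is just a quick arithmetic sketch; the helper name reg_size_mib is made up for illustration.

```python
MIB = 1024 * 1024

def reg_size_mib(size_cell):
    """Convert the size cell of a device-tree 'reg' property (bytes) to MiB."""
    return size_cell // MIB

print(reg_size_mib(0x20000000))  # new value: 512
print(reg_size_mib(0x40000000))  # old value: 1024 (1 GiB)
```

So the dts edit itself does describe 512 MiB; the "DRAM: 1 GiB" line is printed by U-Boot, not derived from this node.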

Minimum hardware requirements for JIRA Software, Confluence and MySQL?

My company is considering a self-hosted option for a combination of JIRA, Confluence and MySQL running behind an nginx proxy. We are a very small team of 5, and expect extremely mild usage for now. I hardly even expect any concurrent usage at this point.
I am a bit puzzled by the various guidelines posted by Atlassian:
https://confluence.atlassian.com/enterprise/jira-sizing-guide-461504623.html
https://confluence.atlassian.com/adminjiraserver075/jira-applications-installation-requirements-935390824.html
https://confluence.atlassian.com/doc/example-size-and-hardware-specifications-from-customer-survey-76840961.html
https://confluence.atlassian.com/doc/server-hardware-requirements-guide-30736403.html
It seems they don't want to bother providing actual minimum hardware requirements. For example, the same page says both that the "minimum heap size to allocate to Confluence is 1 GB and 1 GB for Synchrony (which is required for collaborative editing)" and that the "minimum hardware recommendation" is 6GB. The leap from 1 required plus 1 optional to 6 recommended minimum is bizarre, to say the least.
I think what I want to know is whether I will be able to fit this setup into a 2GB RAM machine or a 4GB RAM machine (both dual CPU).
OK, I have done a test with the following configuration:
VM with 2 cores capped at ~2.2GHz and 4GB RAM
Ubuntu 16.04 server
Docker and docker-compose
Containers:
nginx
jwilder/docker-gen
jrcs/letsencrypt-nginx-proxy-companion
cptactionhank/atlassian-jira-software
cptactionhank/atlassian-confluence
mysql
This 4GB RAM machine is barely capable of running this setup:
$ free -m
              total        used        free      shared  buff/cache   available
Mem:           3951        3553         107           0         291         157
Swap:           974         725         249
CPU usage was going up to 200% only during initialisation when JIRA and Confluence started with empty home dirs. The following top output is after:
creating a space and a page in Confluence
and a project with ~10 issues in JIRA
and linking JIRA and Confluence together
$ top -o %MEM | head -15
top - 16:14:33 up 6:12, 2 users, load average: 0.15, 0.04, 0.01
Tasks: 132 total, 1 running, 131 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.6 us, 0.5 sy, 0.0 ni, 95.8 id, 1.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 4046364 total, 128808 free, 3638444 used, 279112 buff/cache
KiB Swap: 998396 total, 252956 free, 745440 used. 161144 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6328 bin 20 0 3306232 1.468g 0 S 0.0 38.1 12:03.27 java
6418 bin 20 0 2860000 1.320g 0 S 0.0 34.2 10:56.24 java
7205 bin 20 0 2807088 476592 1724 S 0.0 11.8 1:58.37 java
5752 999 20 0 1815480 99804 4728 S 0.0 2.5 1:11.29 mysqld
1070 root 20 0 621908 28672 8904 S 0.0 0.7 0:30.74 dockerd
1179 root 20 0 623004 7536 2520 S 0.0 0.2 0:16.66 docker-containe
968 root 20 0 291352 6536 1912 S 0.0 0.2 0:00.77 snapd
8310 root 20 0 15388 5064 3056 S 0.0 0.1 0:21.39 docker-gen
Confluence also allocated ~500MB RAM to Synchrony:
$ ps aux --sort -rss | head -4
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
bin 6328 3.3 38.3 3306232 1551120 ? Ssl 10:14 12:12 /usr/lib/jvm/java-1.8-openjdk/bin/java -Djava.util.logging.config.file=/opt/atlassian/confluence...
bin 6418 2.9 34.1 2860000 1382868 ? Ssl 10:14 10:57 /usr/lib/jvm/java-1.8-openjdk/bin/java -Djava.util.logging.config.file=/opt/atlassian/jira/...
bin 7205 0.5 11.7 2807088 476588 ? Sl 10:44 1:59 /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -classpath /opt/atlassian/confluence/temp/... synchrony.core sql
During JIRA and Confluence install stage, MySQL peaked at around 500MB RAM usage, and during normal operation it sits around 100MB.
In my attempts, a 2GB machine was only enough to run either JIRA or Confluence without MySQL.
Conclusion:
It looks like a dual-core machine with 4GB RAM is the absolute minimum required for JIRA+Confluence+MySQL. But keep in mind that such a machine is barely enough for a practically empty project.
I personally was not expecting these applications to be that RAM hungry while empty.
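To put a number on this, the resident sizes (RES column) of the four biggest processes in the top output above can be added up. This is a rough sketch only: the helper name res_to_mib is made up, and RES figures can double-count shared pages, so the total is approximate.

```python
def res_to_mib(res):
    """Convert a top RES value to MiB: '1.468g' is GiB, a bare number is KiB."""
    if res.endswith("g"):
        return float(res[:-1]) * 1024
    return int(res) / 1024

# RES values copied from the top output: Confluence, JIRA, Synchrony, mysqld.
RES_VALUES = ["1.468g", "1.320g", "476592", "99804"]

total_mib = sum(res_to_mib(r) for r in RES_VALUES)
share = total_mib / 3951  # 3951 MiB is the total reported by free -m

print(f"{total_mib:.0f} MiB resident, {share:.0%} of RAM")
```

The three JVMs and MySQL alone account for well over 80% of the 4GB machine, before the kernel, Docker and nginx are counted.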

Two postgresql server with same configuration, different performance

I have two identical servers, both with PostgreSQL server version 9.0.4 installed and the same configuration. If I run a .sql file that performs about 5k inserts, it takes a couple of seconds on the first one and 1 minute and 30 seconds on the second one.
If I set synchronous_commit to off, execution time drops dramatically (as expected), and the performance of the two servers is comparable. But if I set synchronous_commit to on, the insert script's execution time increases by less than one second on one server, while on the other it increases far too much, as I said in the first paragraph.
Any idea about this difference in performance? Am I missing some configuration?
Update: tried a simple disk test: time sh -c "dd if=/dev/zero of=ddfile bs=8k count=200000 && sync"
fast server output:
1638400000 bytes (1.6 GB) copied, 1.73537 seconds, 944 MB/s
real 0m32.009s
user 0m0.018s
sys 0m2.298s
slow server output:
1638400000 bytes (1.6 GB) copied, 4.85727 s, 337 MB/s
real 0m35.045s
user 0m0.019s
sys 0m2.221s
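Note that the MB/s figure printed by dd is measured before sync has finished, so it mostly reflects the page cache. Dividing the byte count by the "real" time gives the sustained rate including the flush. A quick sketch of that arithmetic (the function name effective_mb_per_s is made up for illustration):

```python
def effective_mb_per_s(total_bytes, real_seconds):
    """Sustained throughput in MB/s (10^6 bytes), including the final sync."""
    return total_bytes / real_seconds / 1e6

TOTAL_BYTES = 1638400000  # from the dd output on both servers

fast = effective_mb_per_s(TOTAL_BYTES, 32.009)  # fast server, real 0m32.009s
slow = effective_mb_per_s(TOTAL_BYTES, 35.045)  # slow server, real 0m35.045s

print(f"fast: {fast:.1f} MB/s, slow: {slow:.1f} MB/s")
```

By this measure the two servers are within roughly 10% of each other for sequential writes, which is much less than the difference seen with the insert script.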
Common features (both servers):
SATA, RAID1, controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller, distribution: linux centOS. mount -v output:
/dev/md2 on / type ext3 (rw)
proc on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md1 on /boot type ext3 (rw)
fast server: kernel 2.6.18-238.9.1.el5 #1 SMP
Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sda1 3906 4209029 2102562 fd Linux raid autodetect
/dev/sda2 4209030 4739174 265072+ fd Linux raid autodetect
/dev/sda3 4739175 1465144064 730202445 fd Linux raid autodetect
slow server: kernel 2.6.32-71.29.1.el6.x86_64 #1 SMP
Disk /dev/sda: 750.2 GB, 750156374016 bytes
64 heads, 32 sectors/track, 715404 cylinders, total 1465149168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006ffc4
Device Boot Start End Blocks Id System
/dev/sda1 2048 4194303 2096128 fd Linux raid autodetect
/dev/sda2 4194304 5242879 524288 fd Linux raid autodetect
/dev/sda3 5242880 1465147391 729952256 fd Linux raid autodetect
Could this information be useful for addressing the performance issue?
I suppose your slow server with the newer kernel has working barriers. This is good, as otherwise you can lose data in case of a power failure. But it is of course slower than running with write cache enabled and without barriers, aka running with scissors.
You can check if barriers are enabled using mount -v — search for barrier=1 in output. You can disable barriers for your filesystem (mount -o remount,barrier=0 /) to speed up, but then you risk data corruption.
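The check can be scripted by looking at the option list in each line of the mount -v output. This is a minimal sketch; the function name barrier_option is made up, and the second and third sample lines are hypothetical. When no barrier option is listed, the filesystem default for that kernel applies, so a result of None does not by itself mean barriers are off.

```python
import re

def barrier_option(mount_line):
    """Return '1', '0', or None (not listed) for one line of mount output."""
    match = re.search(r"\(([^)]*)\)\s*$", mount_line)
    if not match:
        return None
    for option in match.group(1).split(","):
        option = option.strip()
        if option.startswith("barrier="):
            return option.split("=", 1)[1]
        if option == "barrier":
            return "1"
        if option == "nobarrier":
            return "0"
    return None

# First line is from the question; the other two are hypothetical examples.
print(barrier_option("/dev/md2 on / type ext3 (rw)"))
print(barrier_option("/dev/md2 on / type ext3 (rw,barrier=1)"))
print(barrier_option("/dev/md2 on / type ext4 (rw,nobarrier)"))
```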
Try to do your 5k inserts in one transaction, so that Postgres won't have to write to disk for every row inserted. The theoretical limit for the number of transactions per second would be comparable to the disk's rotational speed (a 7200 rpm disk ≈ 7200/60 tps = 120 tps), as a disk can only write to a given sector once per rotation.
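The estimate can be worked through for the 5k inserts from the question. This is a back-of-the-envelope sketch: the function names are made up, and it assumes one synchronous commit per rotation with no write cache.

```python
def max_commits_per_second(rpm):
    """Upper bound on synchronous commits: one per disk rotation."""
    return rpm / 60

def min_seconds(commits, rpm):
    """Lower bound on the time needed for the given number of commits."""
    return commits / max_commits_per_second(rpm)

def wrap_in_transaction(statements):
    """Join SQL statements into a single BEGIN ... COMMIT block."""
    return "BEGIN;\n" + "\n".join(statements) + "\nCOMMIT;"

print(max_commits_per_second(7200))   # 120.0
print(min_seconds(5000, 7200))        # about 41.7 seconds, one commit per insert
print(min_seconds(1, 7200))           # a single commit for the whole batch
```

About 42 seconds for 5000 separately committed inserts is the same order of magnitude as the 1 minute 30 seconds seen on the slow server, while a couple of seconds on the fast server is only possible if commits are not really waiting for the platter.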
To me this sounds like the "fast" server has a write cache enabled for the hard disk(s), whereas on the slow server the hard disk(s) are really writing the data when PG writes it (by calling fsync).