PostgreSQL vs Kubernetes performance problem - postgresql

I am running performance tests against PostgreSQL to compare the DB server running on a VSI with the same server running on a Kubernetes worker node. I use pgbench to run these tests.
My DB (which in production is configured in the cluster) has a heavy workload, so I tested on a VSI (on IBM Cloud) with 64 vCPU (dual processor, 2.3 GHz, 32 cores), 128 GB RAM, and a 2 TB disk at 5 IOPS per GB.
Then I tested it on Kubernetes (IBM Cloud) with a worker node of 48 vCPU, 192 GB RAM, and a 2 TB disk at 5 IOPS per GB.
The problem is that performance on Kubernetes is 50% worse than on the VSI, and I didn't expect such a large difference. So I am trying to understand what the bottleneck could be.
The disks in the two tests are similar and their throughput is similar, so the disk should not be the bottleneck.
I used a Service configured as a Network Load Balancer to access the Pod (before that it was an Application Load Balancer, and performance was even worse). However, I also ran a test with Pod-to-Pod communication, deploying pgbench on another worker node and bypassing the Service, but I saw no improvement. So I think this rules out the load balancer as the problem.
Then I thought of reserving more CPU and RAM for the PostgreSQL Pod by adding something like this:
resources:
  requests:
    memory: 128Gi
    cpu: 32
  limits:
    memory: 128Gi
    cpu: 32
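As a side note on what this block implies: when CPU and memory requests equal the limits, Kubernetes places the Pod in the Guaranteed QoS class. The sketch below is a simplified illustration of that rule for a single-container Pod. The function name `qos_class` and the dict layout are assumptions made for this example, and it compares values literally without normalizing units (for example, `1` and `1000m` would be treated as different).

```python
def qos_class(resources: dict) -> str:
    """Simplified QoS classification for a single-container Pod.

    Hypothetical helper for illustration; values are compared as written,
    with no unit normalization.
    """
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    if not requests and not limits:
        return "BestEffort"
    # Guaranteed needs a limit for both cpu and memory, with the request
    # equal to it (an omitted request defaults to the limit).
    guaranteed = all(
        name in limits and requests.get(name, limits[name]) == limits[name]
        for name in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


# The resources block from the question, expressed as a Python dict.
postgres_resources = {
    "requests": {"memory": "128Gi", "cpu": 32},
    "limits": {"memory": "128Gi", "cpu": 32},
}
print(qos_class(postgres_resources))
```

This only shows which class the settings select; it says nothing about whether the class explains the measured difference.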
At this point, I don't know what I can do to improve performance.
Any suggestions? Is it normal for Kubernetes to add this much overhead?

Related

Rightsizing Kubernetes Nodes | How much cost we save when we switch from VMs to containers

We are running 4 different microservices on 4 different EC2 Auto Scaling groups:
service-1 - vcpu:4, RAM:32 GB, VM count:8
service-2 - vcpu:4, RAM:32 GB, VM count:8
service-3 - vcpu:4, RAM:32 GB, VM count:8
service-4 - vcpu:4, RAM:32 GB, VM count:16
We are planning to migrate this workload to EKS (in containers).
We need help deciding on the right node configuration (in EKS) to start with.
We can start with a small machine (vcpu:4, RAM:32 GB), but we will not get any cost saving because each container will need a separate VM.
We can use a large machine (vcpu:16, RAM:128 GB), but when these machines scale out, each added machine will be large and may therefore be underutilized.
Or we can go with a medium machine (vcpu:8, RAM:64 GB).
Apart from this recommendation, we are also evaluating the cost saving of moving to containers.
As per our understanding, every VM comes with the following overhead:
Overhead of running the hypervisor/virtualization
Overhead of running a separate operating system
Note: One large VM and many small VMs cost the same on a public cloud, since cost is based on the number of vCPUs + RAM.
The hypervisor/virtualization cost only applies if we are running on-prem, so there is no need to consider it.
On the 2nd point, how many resources does a typical Linux machine need just to run the OS? If we provision a small machine (vcpu:2, RAM:4 GB), CPU usage is approximately 0.2% and memory consumption (other than user space) is about 500 MB.
So running 5 large instances instead of 40 small ones saves 35 times this CPU and RAM overhead, which does not seem significant.
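The estimate above can be checked with a short sketch. It uses only figures quoted in the question (40 small VMs of 32 GB each, 5 large ones, roughly 500 MB of non-user-space memory per OS). Treating 1 GB as 1000 MB, and all variable and function names, are assumptions made for illustration.

```python
# Figures from the question; names are illustrative.
SMALL_VM_COUNT = 8 + 8 + 8 + 16   # services 1-4 add up to 40 VMs
LARGE_VM_COUNT = 5
RAM_PER_SMALL_VM_GB = 32
OS_OVERHEAD_MB = 500              # per-VM memory outside user space


def os_memory_saved_gb(small: int, large: int, overhead_mb: float) -> float:
    """Memory freed by running fewer OS instances (assumes 1 GB = 1000 MB)."""
    return (small - large) * overhead_mb / 1000


saved_gb = os_memory_saved_gb(SMALL_VM_COUNT, LARGE_VM_COUNT, OS_OVERHEAD_MB)
total_gb = SMALL_VM_COUNT * RAM_PER_SMALL_VM_GB
share = saved_gb / total_gb

print(f"saved {saved_gb:.1f} GB of {total_gb} GB ({share:.2%})")
```

Under these assumptions the saving is 17.5 GB out of 1280 GB, a little over 1% of the provisioned memory.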
You are unlikely to see any cost savings in resources when you move from applications running directly on VMs to containers in EKS.
A Linux container is just an isolated Linux process with specified resource limits; it is no different from a normal process when it comes to resource consumption. EKS still uses virtual machines to provide compute to the cluster, so you will still be running processes on a VM whether or not they are containerized, and from a resource point of view the two are equal. (See this answer for a more detailed comparison of VMs and containers.)
When you add Kubernetes to the mix, you actually add overhead compared to running directly on VMs. The Kubernetes control plane runs on a set of dedicated VMs. In EKS those are fully managed for you, but Amazon charges a small hourly fee for each cluster.
In addition to the dedicated control plane nodes, each worker node in the cluster needs a set of system components to function properly (kubelet, kube-proxy, etc.), and you may also define containers that must run on every node (daemon sets), such as log collectors and security agents.
When it comes to sizing the nodes, you need to find a balance between scaling and cost optimization.
The larger the worker node, the smaller the relative overhead of system pods and daemon sets becomes. In theory, a single worker node large enough to accommodate all your containers would maximize the share of resources consumed by your applications rather than by supporting components on the node.
The smaller the worker nodes, the smaller the horizontal scaling steps can be, which is likely to reduce waste when scaling. Smaller nodes also provide better resilience, since a node failure impacts fewer containers.
I tend to prefer small nodes so that scaling can be handled efficiently. They should be slightly larger than what the largest containers require, so that system pods and daemon sets can also fit.
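To make the trade-off concrete, here is a rough sketch that counts how many identical containers fit on each of the node sizes from the question once some capacity is set aside for system pods and daemon sets. The per-node overhead (0.5 vCPU, 2 GB) and the assumption that each container requests the full 4 vCPU / 32 GB of the old VMs are made up for illustration; real values depend on the cluster and on the measured needs of the services.

```python
import math


def containers_per_node(node_cpu, node_ram_gb, pod_cpu, pod_ram_gb,
                        overhead_cpu=0.5, overhead_ram_gb=2.0):
    """Count identical containers that fit after reserving per-node overhead.

    The overhead defaults are hypothetical placeholders, not measured values.
    """
    usable_cpu = node_cpu - overhead_cpu
    usable_ram = node_ram_gb - overhead_ram_gb
    fit = min(usable_cpu / pod_cpu, usable_ram / pod_ram_gb)
    return max(0, math.floor(fit))


# Node sizes from the question; container size assumed equal to the old VM size.
for cpu, ram in [(4, 32), (8, 64), (16, 128)]:
    n = containers_per_node(cpu, ram, pod_cpu=4, pod_ram_gb=32)
    print(f"{cpu} vCPU / {ram} GB node: {n} container(s), "
          f"{n * 4 / cpu:.0%} of node CPU used by the application")
```

With these numbers a node the same size as the container fits nothing, which is why the node needs to be somewhat larger than the largest container. In practice the container requests would likely be smaller than the old VM size, which changes the picture.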

Apache Druid: My VM hangs when I try to load quickstart data

I'm new to Apache Druid. I used an Azure VM (Standard B2s: 2 vCPUs, 4 GiB memory) to install Apache Druid and then tried to load the quickstart tutorial JSON data (wikiticker-2015-09-12-sampled.json.gz) using the console.
I followed all the instructions in the Druid tutorial on the official site. I tried multiple times, but each time the VM hangs and becomes unresponsive. Am I missing anything, or do I need to make any configuration changes before loading the data so that the task can execute?
Thanks.
Druid comes with several startup configuration profiles for a range of machine sizes.
Single server reference configurations:
Nano-Quickstart: 1 CPU, 4GB RAM
Micro-Quickstart: 4 CPU, 16GB RAM
Small: 8 CPU, 64GB RAM (~i3.2xlarge)
Medium: 16 CPU, 128GB RAM (~i3.4xlarge)
Large: 32 CPU, 256GB RAM (~i3.8xlarge)
X-Large: 64 CPU, 512GB RAM (~i3.16xlarge)
To start the Druid services I was using the micro configuration profile:
./bin/start-micro-quickstart
However, my machine, as mentioned above, is closer to the Nano configuration, so I should have used the command below to start the Druid services:
./bin/start-nano-quickstart
With that, I was able to load and query the data file successfully.
Please check your machine configuration before running the service start command.
Regards,
Udayan
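The check described in this answer can be sketched as a small lookup over the profile list quoted above. The function name `largest_fitting_profile` is made up for this example, and treating the VM's GiB figure as comparable to the GB figures in the list is an assumption.

```python
# Single-server profiles from the list above: (name, CPUs, RAM in GB),
# ordered from smallest to largest.
PROFILES = [
    ("Nano-Quickstart", 1, 4),
    ("Micro-Quickstart", 4, 16),
    ("Small", 8, 64),
    ("Medium", 16, 128),
    ("Large", 32, 256),
    ("X-Large", 64, 512),
]


def largest_fitting_profile(cpus, ram_gb):
    """Return the largest profile whose CPU and RAM needs the machine meets."""
    best = None
    for name, need_cpu, need_ram in PROFILES:
        if cpus >= need_cpu and ram_gb >= need_ram:
            best = name
    return best


# Azure Standard B2s from the question: 2 vCPUs, 4 GiB memory.
print(largest_fitting_profile(2, 4))
```

For the B2s machine this selects Nano-Quickstart, which matches the profile that worked here.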

Ceph rbd write poor performance

We have a Ceph cluster of four nodes, each with 48 GB of memory and running Debian 9.8.
The cluster is connected over 10GbE, with a Cisco SG350XG-24T switch and Intel Corporation Ethernet Controller 10G X550T cards. One port is used for the internal network and one for the external network.
For the tests, each node has only one Intel DC S4600 Series SSD with BlueStore.
I created an RBD pool located on these SSDs, with replication 3.
I created an image in this pool.
The image is mounted on one of the nodes.
With the fio test, I get 600 IOPS on random write and 3,600 IOPS on random read.
With the rbd bench test, I get 6,000 IOPS on random write and 15,000 IOPS on random read.
Can you please tell me where such a big difference in performance comes from?
There are also Proxmox nodes; when they use Ceph RBD for virtual machines, performance is the same as with fio.

Presto on Preemptible GCE instances

I am running an instance group of 20 preemptible GCE instances to read ORC files on Google Cloud Storage. The data is partitioned by hour, with about 2 GB per hour.
What type of instances should I use?
How much of the RAM should be used by the JVM?
I am using an autoscale configuration of 80% CPU and a 10-minute cooldown. Is there a more subtle configuration for Presto?
Is there a solution for server shutdowns caused by lack of resources?
Partial responses will be appreciated as well.
As of PrestoDB version 0.199, there is no Google Cloud Storage connector for Presto, which makes it impossible to query GCS data.
Regarding hardware requirements, I'll cite the Teradata documentation here.
Memory
You should allocate a minimum of 16GB of RAM per node for Presto. But
recommend 64GB for most production workloads.
Network Bandwidth
It is recommended to have 10 Gigabit Ethernet between all the nodes in
the cluster.
Other Recommendations
Presto can be installed on any normally configured Hadoop cluster.
YARN should be configured to account for resources dedicated to
Presto. For example, if a node has 64GB of RAM, perhaps you would
normally allocate 60GB to YARN. If you install Presto on that node and
give Presto 32GB of RAM, then you should subtract 32GB from the 60GB
and let YARN only allocate 28GB per node. An optimized configuration
might choose to have separate Presto and Hadoop nodes. The optimized
configuration allows you to give more memory to Presto, and thus
perform larger join queries, for example.
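The YARN adjustment in the quoted recommendation is simple subtraction. A minimal sketch, with a hypothetical function name and the figures from the quote, looks like this:

```python
def yarn_memory_after_presto(yarn_gb: int, presto_gb: int) -> int:
    """Memory left for YARN on a node once Presto's share is carved out.

    Hypothetical helper mirroring the arithmetic in the quoted text.
    """
    if presto_gb > yarn_gb:
        raise ValueError("Presto allocation exceeds the memory given to YARN")
    return yarn_gb - presto_gb


# Example from the quote: 60 GB normally given to YARN, 32 GB given to Presto.
print(yarn_memory_after_presto(60, 32))
```

This applies only when Presto shares nodes with Hadoop; on dedicated Presto nodes there is no YARN allocation to adjust.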

Should I use SSD or HDD as local disks for kubernetes cluster?

Is it worth using an SSD as the boot disk? I'm not planning to access local disks from within pods.
Also, GCP creates a 100 GB disk by default. If I use a 20 GB disk, will it cripple the cluster, or is it OK to use smaller disks?
Why one or the other? Kubernetes (Google Container Engine) is mainly memory and CPU intensive unless your applications need high throughput on the disks. If you want to save money, you can label the nodes by disk type and use node affinity to control which pods go where, so you can have a few nodes with SSDs and target them with the affinity rules.
I would always recommend SSD considering the small difference in price and the large difference in performance, even if it only speeds up the deployment/upgrade of containers.
Reducing the disk size to what is required for running your pods should save you more. I cannot give a general recommendation for disk size, since it depends on the OS you are using, how many pods will end up on each node, and how big each pod is going to be. To give an example: when I run CoreOS-based images with staging deployments for nginx, PHP and some application servers, I can reduce the disk size to 10 GB with ample free room (both for master and worker nodes). At the other extreme, if I run self-contained Go application containers with no storage needs, each pod will only require a few MB of space.