Network throughput from AWS to on-premises site-to-site VPN - bandwidth

I have an issue with network throughput from AWS to on-premises over a site-to-site VPN. When I check throughput from an AWS EC2 instance with iperf3 I get high throughput, but the connection from the backend server (on AWS EC2) to the DB server (on-premises) suffers from latency.
The output of iperf3:
iperf3 -c 10.0.32.201 -p 5001
Connecting to host 10.0.32.201, port 5001
[ 4] local 10.242.162.61 port 56224 connected to 10.0.32.201 port 5001
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 13.3 MBytes 111 Mbits/sec 0 5.34 MBytes
[ 4] 1.00-2.00 sec 31.2 MBytes 262 Mbits/sec 111 3.74 MBytes
[ 4] 2.00-3.00 sec 25.0 MBytes 210 Mbits/sec 234 1.89 MBytes
[ 4] 3.00-4.00 sec 17.5 MBytes 147 Mbits/sec 391 1.40 MBytes
[ 4] 4.00-5.00 sec 20.0 MBytes 168 Mbits/sec 0 1.48 MBytes
But when I try to check throughput with dd, the result is very low:
ssh root@10.0.32.201 'dd if=/dev/zero bs=5GB count=3 2>/dev/null' | dd of=/dev/null status=progress
Password:
6428484096 bytes (6,4 GB) copied, 303,729862 s, 21,2 MB/s
What is the difference between these two measurements?
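For comparison, a minimal sketch that measures the same direction as the dd test and the ssh transport on its own (host and port as in the commands above; the other flags are only illustrative):
# Single TCP stream, reversed (-R) so data flows the same way as in the dd test
iperf3 -c 10.0.32.201 -p 5001 -R -t 30
# Same test with several parallel streams, to see how much a single connection
# is limited by latency and TCP window size on the VPN path
iperf3 -c 10.0.32.201 -p 5001 -P 4 -t 30
# Throughput of the ssh pipe by itself, no disks involved on either end
ssh root@10.0.32.201 'dd if=/dev/zero bs=1M count=2000 2>/dev/null' | dd of=/dev/null bs=1M status=progress
If the single reversed stream is far slower than the parallel run, the limit is most likely per-connection (latency times window size) rather than the VPN's total bandwidth.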

Related

ceph df max available miscalculation

My Ceph cluster shows the following weird behavior in the ceph df output:
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 817 TiB 399 TiB 418 TiB 418 TiB 51.21
ssd 1.4 TiB 1.2 TiB 22 GiB 174 GiB 12.17
TOTAL 818 TiB 400 TiB 418 TiB 419 TiB 51.15
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
pool1 45 300 21 TiB 6.95M 65 TiB 20.23 85 TiB
pool2 50 50 72 GiB 289.15k 357 GiB 0.14 85 TiB
pool3 53 64 2.9 TiB 754.06k 8.6 TiB 3.24 85 TiB
erasurepool_data 57 1024 138 TiB 50.81M 241 TiB 48.49 154 TiB
erasurepool_metadata 58 8 9.1 GiB 1.68M 27 GiB 2.46 362 GiB
device_health_metrics 59 1 22 MiB 163 66 MiB 0 85 TiB
.rgw.root 60 8 5.6 KiB 17 3.5 MiB 0 85 TiB
.rgw.log 61 8 70 MiB 2.56k 254 MiB 0 85 TiB
.rgw.control 62 8 0 B 8 0 B 0 85 TiB
.rgw.meta 63 8 7.6 MiB 52 32 MiB 0 85 TiB
.rgw.buckets.index 64 8 11 GiB 1.69k 34 GiB 3.01 362 GiB
.rgw.buckets.data 65 512 23 TiB 33.87M 72 TiB 21.94 85 TiB
As seen above, available storage is 399 TiB, yet MAX AVAIL in the pool list shows 85 TiB. I use 3 replicas for each replicated pool and 3+2 erasure coding for erasurepool_data.
As far as I know, the MAX AVAIL column shows the maximum available capacity according to the replica size, so it should come out to 85*3 = 255 TiB. Meanwhile the cluster shows almost 400 TiB available.
Which to trust?
Is this only a bug?
Turns out the max available space is calculated according to the fullest OSDs in the cluster and has nothing to do with the total free space in the cluster. From what I've found, this kind of fluctuation mainly happens on small clusters.
The MAX AVAIL column represents the amount of data that can be used before the first OSD becomes full. It takes into account the projected distribution of data across disks from the CRUSH map and uses the 'first OSD to fill up' as the target, so it does not seem to be a bug. If MAX AVAIL is not what you expect it to be, look at the data distribution using ceph osd tree and make sure you have a uniform distribution.
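As an illustration, the distribution that MAX AVAIL is derived from can be checked with standard commands (a sketch; nothing here is specific to the cluster above):
# Per-OSD utilisation: MAX AVAIL is projected from the OSD that would fill up
# first, so look at how wide the spread of the %USE column is
ceph osd df tree
# CRUSH layout, to confirm how data is expected to be distributed
ceph osd tree
# If the balancer module is enabled, its status shows whether it is evening things out
ceph balancer status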
You can also check some helpful posts here that explain some of the miscalculations:
Using available space in a Ceph pool
ceph-displayed-size-calculation
max-avail-in-ceph-df-command-is-incorrec
As you have erasure coding involved, please check this SO post:
ceph-df-octopus-shows-used-is-7-times-higher-than-stored-in-erasure-coded-pool
When you add the erasure-coded pool, i.e. erasurepool_data at 154 TiB, you get 255 + 154 ≈ 409 TiB, which is roughly the ~400 TiB shown as available.

ceph pgs marked as inactive and undersized+peered

I installed a rook.io Ceph storage cluster. Before installation, I cleaned up the previous installation as described here: https://rook.io/docs/rook/v1.7/ceph-teardown.html
The new cluster was provisioned correctly; however, Ceph is not healthy immediately after provisioning and stays stuck:
data:
pools: 1 pools, 128 pgs
objects: 0 objects, 0 B
usage: 20 MiB used, 15 TiB / 15 TiB avail
pgs: 100.000% pgs not active
128 undersized+peered
[root@rook-ceph-tools-74df559676-scmzg /]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 3.63869 1.00000 3.6 TiB 5.0 MiB 144 KiB 0 B 4.8 MiB 3.6 TiB 0 0.98 0 up
1 hdd 3.63869 1.00000 3.6 TiB 5.4 MiB 144 KiB 0 B 5.2 MiB 3.6 TiB 0 1.07 128 up
2 hdd 3.63869 1.00000 3.6 TiB 5.0 MiB 144 KiB 0 B 4.8 MiB 3.6 TiB 0 0.98 0 up
3 hdd 3.63869 1.00000 3.6 TiB 4.9 MiB 144 KiB 0 B 4.8 MiB 3.6 TiB 0 0.97 0 up
TOTAL 15 TiB 20 MiB 576 KiB 0 B 20 MiB 15 TiB 0
MIN/MAX VAR: 0.97/1.07 STDDEV: 0
[root@rook-ceph-tools-74df559676-scmzg /]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 14.55475 root default
-3 14.55475 host storage1-kube-domain-tld
0 hdd 3.63869 osd.0 up 1.00000 1.00000
1 hdd 3.63869 osd.1 up 1.00000 1.00000
2 hdd 3.63869 osd.2 up 1.00000 1.00000
3 hdd 3.63869 osd.3 up 1.00000 1.00000
Is there anyone who can explain what went wrong and how to fix the issue?
The problem is that all OSDs are running on the same host and the failure domain is set to host. Switching the failure domain to osd fixes the issue, as sketched below. The default failure domain can be changed as per https://stackoverflow.com/a/63472905/3146709
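A minimal sketch of that change using plain Ceph commands from the rook-ceph-tools pod (the rule name and pool name below are placeholders; with Rook the same thing can usually be declared as failureDomain: osd in the pool CRD):
# Create a replicated CRUSH rule whose failure domain is the OSD instead of the host
ceph osd crush rule create-replicated replicated_osd default osd
# Point the existing pool at the new rule so its PGs can find 3 OSDs on a single host
ceph osd pool set replicapool crush_rule replicated_osd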

ceph df (octopus) shows USED is 7 times higher than STORED in erasure coded pool

The pool default.rgw.buckets.data has 501 GiB stored, but USED shows 3.5 TiB.
root#ceph-01:~# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 196 TiB 193 TiB 3.5 TiB 3.6 TiB 1.85
TOTAL 196 TiB 193 TiB 3.5 TiB 3.6 TiB 1.85
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
device_health_metrics 1 1 19 KiB 12 56 KiB 0 61 TiB
.rgw.root 2 32 2.6 KiB 6 1.1 MiB 0 61 TiB
default.rgw.log 3 32 168 KiB 210 13 MiB 0 61 TiB
default.rgw.control 4 32 0 B 8 0 B 0 61 TiB
default.rgw.meta 5 8 4.8 KiB 11 1.9 MiB 0 61 TiB
default.rgw.buckets.index 6 8 1.6 GiB 211 4.7 GiB 0 61 TiB
default.rgw.buckets.data 10 128 501 GiB 5.36M 3.5 TiB 1.90 110 TiB
The default.rgw.buckets.data pool is using erasure coding:
root#ceph-01:~# ceph osd erasure-code-profile get EC_RGW_HOST
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=6
m=4
plugin=jerasure
technique=reed_sol_van
w=8
If anyone could help explain why it's using up 7 times more space, it would help a lot.
Versioning is disabled. ceph version 15.2.13 (octopus stable).
This is related to bluestore_min_alloc_size_hdd=64K (default on Octopus).
With erasure coding, data is broken up into smaller chunks, each of which takes at least 64K on disk.
One option would be to lower bluestore_min_alloc_size_hdd to 4K, which is good if your use case requires storing millions of tiny (16K) objects. In my case, I'm storing hundreds of millions of 3-4M photos, so I decided to skip erasure coding, stay on bluestore_min_alloc_size_hdd=64K, and switch to replicated 3 (min 2), which is much safer and faster in the long run.
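For reference, a sketch of how to inspect and change the setting (the option is applied when an OSD is created, so existing OSDs have to be destroyed and re-provisioned for a new value to take effect):
# Current configured value for HDD OSDs
ceph config get osd bluestore_min_alloc_size_hdd
# Use 4 KiB for OSDs created from now on; already-deployed OSDs keep the
# value they were built with until they are recreated
ceph config set osd bluestore_min_alloc_size_hdd 4096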
Here is the reply from Josh Baergen on the mailing list:
Hey Arkadiy,
If the OSDs are on HDDs and were created with the default
bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then in
effect data will be allocated from the pool in 640KiB chunks (64KiB *
(k+m)). 5.36M objects taking up 501GiB is an average object size of 98KiB
which results in a ratio of 6.53:1 allocated:stored, which is pretty close
to the 7:1 observed.
If my assumption about your configuration is correct, then the only way to
fix this is to adjust bluestore_min_alloc_size_hdd and recreate all your
OSDs, which will take a while...
Josh
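The arithmetic from the reply can be reproduced with the numbers from the ceph df output above (a back-of-the-envelope check only):
awk 'BEGIN {
  k = 6; m = 4; min_alloc_kib = 64;              # EC profile and the Octopus HDD default
  objects = 5.36e6; stored_kib = 501 * 1024 * 1024;
  avg_kib = stored_kib / objects;                # about 98 KiB per object
  printf "average object size : %.0f KiB\n", avg_kib;
  printf "allocated per object: %d KiB\n", min_alloc_kib * (k + m);
  printf "allocated:stored    : %.2f:1\n", min_alloc_kib * (k + m) / avg_kib;
}'
This prints a ratio of about 6.5:1, close to the roughly 7:1 (3.5 TiB used versus 501 GiB stored) that ceph df shows.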

ceph raw used is more than sum of used in all pools (ceph df detail)

First of all, sorry for my poor English.
In my Ceph cluster, when I run the ceph df detail command it shows the following result:
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 62 TiB 52 TiB 10 TiB 10 TiB 16.47
ssd 8.7 TiB 8.4 TiB 370 GiB 377 GiB 4.22
TOTAL 71 TiB 60 TiB 11 TiB 11 TiB 14.96
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL QUOTA OBJECTS QUOTA BYTES DIRTY USED COMPR UNDER COMPR
rbd-kubernetes 36 288 GiB 71.56k 865 GiB 1.73 16 TiB N/A N/A 71.56k 0 B 0 B
rbd-cache 41 2.4 GiB 208.09k 7.2 GiB 0.09 2.6 TiB N/A N/A 205.39k 0 B 0 B
cephfs-metadata 51 529 MiB 221 1.6 GiB 0 16 TiB N/A N/A 221 0 B 0 B
cephfs-data 52 1.0 GiB 424 3.1 GiB 0 16 TiB N/A N/A 424 0 B 0 B
So I have a question about this result.
As you can see, the sum of my pools' used storage is less than 1 TiB, but in the RAW STORAGE section the space used on the HDDs is 10 TiB and it is growing every day. I think this is unusual and something is wrong with this Ceph cluster.
Also, FYI, the output of ceph osd dump | grep replicated is:
pool 36 'rbd-kubernetes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 244 pg_num_target 64 pgp_num_target 64 last_change 1376476 lfor 2193/2193/2193 flags hashpspool,selfmanaged_snaps,creating tiers 41 read_tier 41 write_tier 41 stripe_width 0 application rbd
pool 41 'rbd-cache' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 1376476 lfor 2193/2193/2193 flags hashpspool,incomplete_clones,selfmanaged_snaps,creating tier_of 36 cache_mode writeback target_bytes 1000000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 decay_rate 0 search_last_n 0 min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
pool 51 'cephfs-metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 31675 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 52 'cephfs-data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 742334 flags hashpspool,selfmanaged_snaps stripe_width 0 application cephfs
Ceph Version ceph -v
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
Ceph OSD versions: ceph tell osd.* version returns the following for all OSDs:
osd.0: {
"version": "ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)"
}
Ceph status ceph -s
cluster:
id: 6a86aee0-3171-4824-98f3-2b5761b09feb
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-sn-03,ceph-sn-02,ceph-sn-01 (age 37h)
mgr: ceph-sn-01(active, since 4d), standbys: ceph-sn-03, ceph-sn-02
mds: cephfs-shared:1 {0=ceph-sn-02=up:active} 2 up:standby
osd: 63 osds: 63 up (since 41h), 63 in (since 41h)
task status:
scrub status:
mds.ceph-sn-02: idle
data:
pools: 4 pools, 384 pgs
objects: 280.29k objects, 293 GiB
usage: 11 TiB used, 60 TiB / 71 TiB avail
pgs: 384 active+clean
According to the provided data, you should evaluate the following considerations and scenarios:
The replication size is inclusive: once min_size is achieved in a write operation, you receive a completion message. That means you should expect raw storage consumption between a minimum of min_size and a maximum of the replication size times the stored data.
Ceph stores metadata and logs for housekeeping purposes, obviously consuming storage.
If you run a benchmark via rados bench or a similar tool with the --no-cleanup parameter, the benchmark objects are stored permanently within the cluster and consume storage (see the sketch after this list).
All the scenarios mentioned above are just a few of the possibilities.
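As a hedged sketch of how to see where the raw space is going (the pool name in the cleanup command is a placeholder, and the cleanup step only applies if rados bench was run with --no-cleanup):
# Per-pool object counts and raw usage as reported by the OSDs
rados df
# Per-OSD raw usage, to spot space that is not attributable to any pool
ceph osd df
# Remove objects left behind by an earlier "rados bench ... --no-cleanup" run
rados -p <pool-name> cleanup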

Does Google Kubernetes Engine support custom node images and/or 10Gbps networking?

We've been setting up a number of private GCP GKE clusters which work quite well. Each currently has a single node pool of 2 ContainerOS nodes.
We also have a non-K8s Compute Engine in the network that is a FreeBSD NFS server and is configured for 10Gbps networking.
When we log in to the K8s nodes, it appears that they do not support 10Gbps networking out of the box. We suspect this, because "large-receive-offload" seems to be turned off in the network interface(s).
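The offload settings in question can be listed with ethtool on a node (a minimal check, assuming the primary interface is eth0):
# Show the offload features; "large-receive-offload: off" is what raised the suspicion above
ethtool -k eth0 | grep -E 'large-receive-offload|generic-receive-offload|tcp-segmentation-offload'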
We have created persistent storage claims inside the Kubernetes clusters for shares from this fileserver, and we would like them to support the 10Gbps networking but worry that it is limited to 1Gbps by default.
Google only seems to offer a few options for the image of its node pools (either ContainerOS or Ubuntu). This is limited both through the GCP console and through the cluster creation command.
My question is:
Is it at all possible to support 10Gbps networking somehow in GCP GKE clusters?
Any help would be much appreciated.
Is it at all possible to support 10Gbps networking somehow in GCP GKE clusters?
Yes, GKE natively supports 10GE connections out-of-the-box, just like Compute Engine Instances, but it does not support custom node images.
A good way to test your speed limits is using iperf3.
I created a GKE cluster with default settings to test the connectivity speed.
I also created a Compute Engine VM named debian9-client-us, which hosts the iperf3 server for our test, as shown below.
First we set up our VM with iperf3 server running:
❯ gcloud compute ssh debian9-client-us --zone "us-central1-a"
user#debian9-client-us:~$ iperf3 -s -p 7777
-----------------------------------------------------------
Server listening on 7777
-----------------------------------------------------------
Then we move to our GKE cluster to run the test from a pod:
❯ k get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-1-pool-1-4776b3eb-16t7 Ready <none> 16m v1.15.7-gke.23
gke-cluster-1-pool-1-4776b3eb-mp84 Ready <none> 16m v1.15.7-gke.23
❯ kubectl run -i --tty --image ubuntu test-shell -- /bin/bash
root@test-shell-845c969686-6h4nl:/# apt update && apt install iperf3 -y
root@test-shell-845c969686-6h4nl:/# iperf3 -c 10.128.0.5 -p 7777
Connecting to host 10.128.0.5, port 7777
[ 4] local 10.8.0.6 port 60946 connected to 10.128.0.5 port 7777
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 661 MBytes 5.54 Gbits/sec 5273 346 KBytes
[ 4] 1.00-2.00 sec 1.01 GBytes 8.66 Gbits/sec 8159 290 KBytes
[ 4] 2.00-3.00 sec 1.08 GBytes 9.31 Gbits/sec 6381 158 KBytes
[ 4] 3.00-4.00 sec 1.00 GBytes 8.62 Gbits/sec 9662 148 KBytes
[ 4] 4.00-5.00 sec 1.08 GBytes 9.27 Gbits/sec 8892 286 KBytes
[ 4] 5.00-6.00 sec 1.11 GBytes 9.51 Gbits/sec 6136 532 KBytes
[ 4] 6.00-7.00 sec 1.09 GBytes 9.32 Gbits/sec 7150 755 KBytes
[ 4] 7.00-8.00 sec 883 MBytes 7.40 Gbits/sec 6973 177 KBytes
[ 4] 8.00-9.00 sec 1.04 GBytes 8.90 Gbits/sec 9104 212 KBytes
[ 4] 9.00-10.00 sec 1.08 GBytes 9.29 Gbits/sec 4993 594 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 9.99 GBytes 8.58 Gbits/sec 72723 sender
[ 4] 0.00-10.00 sec 9.99 GBytes 8.58 Gbits/sec receiver
iperf Done.
The average transfer rate was 8.58 Gbits/sec in this test, showing that the cluster node is, by default, running with 10 Gbps networking.
If I can help you further, just let me know in the comments.