The command I use:
sudo bin/ycsb run cassandra-cql -P workloads/workloadf -P cassandra.dat -s > outputs/runf-2.dat 2> outputs/log-runf-2.txt
And:
tail -f outputs/log-runf-2.txt
But the log gets stuck at:
2021-08-16 17:36:38:255 7850 sec: 65373659 operations; 6141.2 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=61414, Max=1200127, Min=265, Avg=4892.3, 90=1040, 99=168191, 99.9=544255, 99.99=799743] [READ-MODIFY-WRITE: Count=30620, Max=986111, Min=472, Avg=5108.82, 90=1557, 99=153087, 99.9=481279, 99.99=762367] [UPDATE: Count=30634, Max=63551, Min=166, Avg=484.96, 90=550, 99=737, 99.9=14887, 99.99=44991]
2021-08-16 17:36:48:255 7860 sec: 65435896 operations; 6223.7 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=62232, Max=1267711, Min=300, Avg=4969.52, 90=1029, 99=150399, 99.9=586239, 99.99=946175] [READ-MODIFY-WRITE: Count=31050, Max=1258495, Min=541, Avg=5495.38, 90=1544, 99=156543, 99.9=572927, 99.99=950271] [UPDATE: Count=31044, Max=94847, Min=152, Avg=486.59, 90=544, 99=730, 99.9=16095, 99.99=54623]
2021-08-16 17:36:58:255 7870 sec: 65501151 operations; 6525.5 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=65256, Max=2203647, Min=233, Avg=4650.72, 90=1016, 99=151935, 99.9=532479, 99.99=888319] [READ-MODIFY-WRITE: Count=32585, Max=2203647, Min=368, Avg=5292.65, 90=1524, 99=151295, 99.9=549375, 99.99=909823] [UPDATE: Count=32576, Max=87935, Min=132, Avg=485.37, 90=542, 99=726, 99.9=15343, 99.99=55423]
2021-08-16 17:37:08:255 7880 sec: 65559502 operations; 5835.1 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=58354, Max=1277951, Min=313, Avg=5203.44, 90=1037, 99=176767, 99.9=634367, 99.99=939519] [READ-MODIFY-WRITE: Count=29259, Max=1217535, Min=563, Avg=5589.22, 90=1547, 99=172031, 99.9=627711, 99.99=916479] [UPDATE: Count=29247, Max=76863, Min=183, Avg=480.89, 90=545, 99=733, 99.9=17087, 99.99=57279]
2021-08-16 17:37:18:255 7890 sec: 65614920 operations; 5541.8 current ops/sec; est completion in 1 hour 8 minutes [READ: Count=55415, Max=1049599, Min=199, Avg=5494.55, 90=1047, 99=192383, 99.9=639999, 99.99=934911] [READ-MODIFY-WRITE: Count=27552, Max=1030143, Min=326, Avg=5864.25, 90=1571, 99=184319, 99.9=578047, 99.99=915455] [UPDATE: Count=27567, Max=773631, Min=113, Avg=551.14, 90=553, 99=742, 99.9=26751, 99.99=111295]
It didn't show any error or warning; it simply stopped printing to the log.
I checked for the ycsb process:
ps auwx | grep ycsb
The result:
ran 93177 0.0 0.0 13144 1048 pts/2 S+ 18:10 0:00 grep --color=auto ycsb
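To see how far the run got before the output stopped, the last status line in the log can be parsed. This is a minimal sketch that assumes the status-line layout shown in the excerpt above; the function name `last_status` and the sample string are illustrative only and not part of YCSB.

```python
import re

# Matches the start of a YCSB status line as it appears in the log excerpt above:
# "<date> <time>:<millis> <elapsed> sec: <total> operations; <rate> current ops/sec"
STATUS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}):\d{3} "
    r"(\d+) sec: (\d+) operations; ([\d.]+) current ops/sec"
)


def last_status(lines):
    """Return (timestamp, elapsed_sec, total_ops, ops_per_sec) for the
    last status line found, or None if no line matches."""
    last = None
    for line in lines:
        m = STATUS_RE.match(line)
        if m:
            last = (m.group(1), int(m.group(2)), int(m.group(3)), float(m.group(4)))
    return last


# Abbreviated copy of the final line from the excerpt above.
sample = (
    "2021-08-16 17:37:18:255 7890 sec: 65614920 operations; "
    "5541.8 current ops/sec; est completion in 1 hour 8 minutes"
)
result = last_status([sample])
```

On the real log, an open file object can be passed instead of the list, e.g. `last_status(open("outputs/log-runf-2.txt"))`. Comparing the returned timestamp with the wall clock shows how long the log has been silent.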
I am new to Cassandra and YCSB, and I am trying to use YCSB to benchmark a 3-node Cassandra cluster built with docker-compose.
YCSB's load phase completed in 4 hours without any errors or issues, but in the run phase I see a "READ-FAILED" error after about 2.5 hours of running (at the 9212-second mark). I have run the same test a couple of times and hit the same issue each time, and I am not sure why.
.
.
2021-05-27 22:22:53:019 9208 sec: 8625003 operations; 661 current ops/sec; est completion in 6 days 1 hour [READ: Count=133, Max=89599, Min=311, Avg=5145.44, 90=10551, 99=78783, 99.9=89599, 99.99=89599] [READ-MODIFY-WRITE: Count=69, Max=26751, Min=707, Avg=4425.57, 90=11583, 99=18271, 99.9=26751, 99.99=26751] [INSERT: Count=450, Max=1432, Min=216, Avg=537.25, 90=818, 99=1128, 99.9=1432, 99.99=1432] [UPDATE: Count=145, Max=1471, Min=184, Avg=472.85, 90=733, 99=1284, 99.9=1471, 99.99=1471]
2021-05-27 22:22:54:019 9209 sec: 8625668 operations; 665 current ops/sec; est completion in 6 days 1 hour [READ: Count=127, Max=66367, Min=334, Avg=4931.35, 90=12767, 99=36127, 99.9=66367, 99.99=66367] [READ-MODIFY-WRITE: Count=64, Max=36543, Min=709, Avg=4670.2, 90=13511, 99=34143, 99.9=36543, 99.99=36543] [INSERT: Count=458, Max=2303, Min=237, Avg=589.22, 90=869, 99=1195, 99.9=2303, 99.99=2303] [UPDATE: Count=144, Max=1190, Min=218, Avg=501.5, 90=759, 99=1186, 99.9=1190, 99.99=1190]
2021-05-27 22:22:55:019 9210 sec: 8626279 operations; 611 current ops/sec; est completion in 6 days 1 hour [READ: Count=110, Max=98495, Min=399, Avg=6190.99, 90=12063, 99=38431, 99.9=98495, 99.99=98495] [READ-MODIFY-WRITE: Count=55, Max=100095, Min=692, Avg=8793.56, 90=15983, 99=39999, 99.9=100095, 99.99=100095] [INSERT: Count=441, Max=1659, Min=241, Avg=624.24, 90=969, 99=1327, 99.9=1659, 99.99=1659] [UPDATE: Count=119, Max=1395, Min=187, Avg=571.55, 90=909, 99=1310, 99.9=1395, 99.99=1395]
2021-05-27 22:22:56:019 9211 sec: 8626842 operations; 563 current ops/sec; est completion in 6 days 1 hour [READ: Count=118, Max=97215, Min=318, Avg=5499.74, 90=10463, 99=93055, 99.9=97215, 99.99=97215] [READ-MODIFY-WRITE: Count=45, Max=98495, Min=742, Avg=5810.96, 90=8807, 99=98495, 99.9=98495, 99.99=98495] [INSERT: Count=385, Max=1252, Min=239, Avg=616.27, 90=924, 99=1163, 99.9=1252, 99.99=1252] [UPDATE: Count=101, Max=1327, Min=195, Avg=580.12, 90=904, 99=1097, 99.9=1327, 99.99=1327]
2021-05-27 22:22:57:019 9212 sec: 8627010 operations; 168 current ops/sec; est completion in 6 days 1 hour [READ: Count=33, Max=90367, Min=732, Avg=12685.67, 90=35679, 99=90367, 99.9=90367, 99.99=90367] [READ-MODIFY-WRITE: Count=18, Max=93183, Min=1121, Avg=17020.33, 90=36895, 99=93183, 99.9=93183, 99.99=93183] [INSERT: Count=120, Max=109951, Min=325, Avg=2155.85, 90=3283, 99=7943, 99.9=109951, 99.99=109951] [UPDATE: Count=35, Max=11567, Min=302, Avg=1142.29, 90=2081, 99=11567, 99.9=11567, 99.99=11567] [READ-FAILED: Count=1, Max=23615, Min=23600, Avg=23608, 90=23615, 99=23615, 99.9=23615, 99.99=23615]
2021-05-27 22:22:58:019 9213 sec: 8627523 operations; 513 current ops/sec; est completion in 6 days 1 hour [READ: Count=87, Max=97151, Min=417, Avg=8968.98, 90=14639, 99=67967, 99.9=97151, 99.99=97151] [READ-MODIFY-WRITE: Count=44, Max=62303, Min=654, Avg=7554.91, 90=14047, 99=62303, 99.9=62303, 99.99=62303] [INSERT: Count=378, Max=1220, Min=240, Avg=467.85, 90=686, 99=1030, 99.9=1220, 99.99=1220] [UPDATE: Count=97, Max=1017, Min=217, Avg=411.89, 90=649, 99=861, 99.9=1017, 99.99=1017] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:22:59:019 9214 sec: 8628119 operations; 596 current ops/sec; est completion in 6 days 1 hour [READ: Count=115, Max=112063, Min=334, Avg=6460.7, 90=12127, 99=90943, 99.9=112063, 99.99=112063] [READ-MODIFY-WRITE: Count=58, Max=91711, Min=788, Avg=6967.95, 90=13015, 99=60575, 99.9=91711, 99.99=91711] [INSERT: Count=423, Max=1359, Min=234, Avg=473.31, 90=708, 99=895, 99.9=1359, 99.99=1359] [UPDATE: Count=108, Max=1033, Min=210, Avg=429.63, 90=637, 99=1031, 99.9=1033, 99.99=1033] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:00:019 9215 sec: 8628679 operations; 560 current ops/sec; est completion in 6 days 1 hour [READ: Count=117, Max=115071, Min=327, Avg=6498.37, 90=16143, 99=64863, 99.9=115071, 99.99=115071] [READ-MODIFY-WRITE: Count=66, Max=65599, Min=607, Avg=6775.21, 90=17151, 99=48191, 99.9=65599, 99.99=65599] [INSERT: Count=391, Max=1137, Min=218, Avg=466.95, 90=711, 99=1021, 99.9=1137, 99.99=1137] [UPDATE: Count=118, Max=1338, Min=191, Avg=438.92, 90=674, 99=1012, 99.9=1338, 99.99=1338] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:01:019 9216 sec: 8629411 operations; 732 current ops/sec; est completion in 6 days 1 hour [READ: Count=139, Max=94143, Min=390, Avg=5108.03, 90=10015, 99=59999, 99.9=94143, 99.99=94143] [READ-MODIFY-WRITE: Count=71, Max=95039, Min=597, Avg=5881.15, 90=8959, 99=41823, 99.9=95039, 99.99=95039] [INSERT: Count=524, Max=1256, Min=200, Avg=443.07, 90=639, 99=1023, 99.9=1218, 99.99=1256] [UPDATE: Count=142, Max=988, Min=174, Avg=404.29, 90=659, 99=926, 99.9=988, 99.99=988] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:02:019 9217 sec: 8629929 operations; 518 current ops/sec; est completion in 6 days 1 hour [READ: Count=116, Max=103615, Min=362, Avg=6558.6, 90=12535, 99=89599, 99.9=103615, 99.99=103615] [READ-MODIFY-WRITE: Count=55, Max=103999, Min=619, Avg=7671.18, 90=15127, 99=19727, 99.9=103999, 99.99=103999] [INSERT: Count=344, Max=960, Min=233, Avg=481.37, 90=683, 99=892, 99.9=960, 99.99=960] [UPDATE: Count=111, Max=818, Min=189, Avg=402.95, 90=596, 99=779, 99.9=818, 99.99=818] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
.
.
.
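To locate every interval in which a failure was actually recorded, the status lines can be scanned for `*-FAILED` groups with a non-zero count. This is a minimal sketch assuming the bracketed `[NAME: Count=N, ...]` layout shown in the excerpt above; `failed_intervals` and the abbreviated sample lines are illustrative, not part of YCSB. In the excerpt, the later `READ-FAILED` entries with `Count=0`, `Min=9223372036854775807` (the largest signed 64-bit value) and `Avg=NaN` appear to be empty intervals rather than new failures, so the sketch skips them.

```python
import re

# Elapsed seconds of a status line, e.g. "9212 sec: 8627010 operations;"
ELAPSED_RE = re.compile(r"(\d+) sec: \d+ operations;")
# One bracketed measurement group, e.g. "[READ-FAILED: Count=1, Max=..."
GROUP_RE = re.compile(r"\[([A-Z-]+): Count=(\d+),")


def failed_intervals(lines):
    """Return a list of (elapsed_sec, group_name, count) for every
    *-FAILED group whose count is greater than zero."""
    hits = []
    for line in lines:
        m = ELAPSED_RE.search(line)
        if not m:
            continue  # not a status line
        elapsed = int(m.group(1))
        for name, count in GROUP_RE.findall(line):
            if name.endswith("-FAILED") and int(count) > 0:
                hits.append((elapsed, name, int(count)))
    return hits


# Abbreviated copies of two status lines from the excerpt above.
sample_lines = [
    "2021-05-27 22:22:57:019 9212 sec: 8627010 operations; 168 current ops/sec; "
    "[READ: Count=33, Max=90367] [READ-FAILED: Count=1, Max=23615, Min=23600]",
    "2021-05-27 22:22:58:019 9213 sec: 8627523 operations; 513 current ops/sec; "
    "[READ: Count=87, Max=97151] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807]",
]
failures = failed_intervals(sample_lines)
```

The elapsed-second values it returns can then be matched against the timestamps in the Cassandra logs for the same moment.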
However, when I run the same benchmark on MongoDB it works fine and I do not see any errors.
Please let me know if any settings or parameters need to be changed in the Cassandra yml deployment or when running YCSB against the Cassandra cluster.
If you need any more logs, please let me know and I will upload them. So far I have uploaded 2 log files (on GitHub): one with the docker and Cassandra logs and one with the YCSB execution.
Any help is appreciated.
[ycsb_logs.txt] https://github.com/neekhraashish/logs/blob/main/ycsb_logs.txt
[docker_cassandra_logs.txt] https://github.com/neekhraashish/logs/blob/main/docker_cassandra_logs.txt
Thanks
Looking at the Cassandra logs, the cluster is not in a healthy state. A few things are noticeable:
The commit log sync warnings: these indicate that the underlying IO is not keeping up with the commit logs being written to disk.
Dropped mutations: a lot of operations are being dropped between the nodes. This then comes back in the form of synchronous read repairs when a digest mismatch is noticed on reading, and these read repairs are also often failing.
Some more details on how you have the storage / IO provisioned would be useful.