Cassandra workload fails to insert data - cassandra-3.0

When I run "workloada" from the seed node:
#./bin/ycsb.sh load cassandra-cql -p hosts="X.X.X.X" -P /root/ycsb/ycsb-0.17.0/workloads/workloada -threads 64 -p operationcount=1000000 -p recordcount=1000000 -s > /root/ycsb/ycsb-0.17.0/workload_A64T_VSSBB_load.csv
It throws the results below after the run:
Error inserting, not retrying any more. number of attempts: 1
Insertion Retry Limit: 0
2022-11-23 08:13:24:257 2 sec: 0 operations; est completion in 106751991167300 days 15 hours [CLEANUP: Count=64, Max=2234367, Min=0, Avg=34896.25, 90=1, 99=4, 99.9=2234367, 99.99=2234367] [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=64, Max=23679, Min=16240, Avg=19395.94, 90=22223, 99=23471, 99.9=23679, 99.99=23679]
What can cause this error?
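One thing worth ruling out first: the cassandra-cql binding does not create its own schema, and every insert fails on the first attempt if the expected keyspace and table are missing. A minimal sketch of creating them with cqlsh (the keyspace name ycsb and table usertable are the binding's documented defaults; the replication settings here are only an example for this cluster):
# create the schema the cassandra-cql binding expects before running the load
cqlsh X.X.X.X -e "CREATE KEYSPACE IF NOT EXISTS ycsb WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};"
cqlsh X.X.X.X -e "CREATE TABLE IF NOT EXISTS ycsb.usertable (y_id varchar PRIMARY KEY, field0 varchar, field1 varchar, field2 varchar, field3 varchar, field4 varchar, field5 varchar, field6 varchar, field7 varchar, field8 varchar, field9 varchar);"
If the schema already exists, re-running the load with stderr captured as well (2> load_errors.txt) usually reveals the driver exception behind each INSERT-FAILED.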

Related

YCSB gets stuck when working with Cassandra

The command I use:
sudo bin/ycsb run cassandra-cql -P workloads/workloadf -P cassandra.dat -s > outputs/runf-2.dat 2> outputs/log-runf-2.txt
And:
tail -f outputs/log-runf-2.txt
But the log gets stuck at:
2021-08-16 17:36:38:255 7850 sec: 65373659 operations; 6141.2 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=61414, Max=1200127, Min=265, Avg=4892.3, 90=1040, 99=168191, 99.9=544255, 99.99=799743] [READ-MODIFY-WRITE: Count=30620, Max=986111, Min=472, Avg=5108.82, 90=1557, 99=153087, 99.9=481279, 99.99=762367] [UPDATE: Count=30634, Max=63551, Min=166, Avg=484.96, 90=550, 99=737, 99.9=14887, 99.99=44991]
2021-08-16 17:36:48:255 7860 sec: 65435896 operations; 6223.7 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=62232, Max=1267711, Min=300, Avg=4969.52, 90=1029, 99=150399, 99.9=586239, 99.99=946175] [READ-MODIFY-WRITE: Count=31050, Max=1258495, Min=541, Avg=5495.38, 90=1544, 99=156543, 99.9=572927, 99.99=950271] [UPDATE: Count=31044, Max=94847, Min=152, Avg=486.59, 90=544, 99=730, 99.9=16095, 99.99=54623]
2021-08-16 17:36:58:255 7870 sec: 65501151 operations; 6525.5 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=65256, Max=2203647, Min=233, Avg=4650.72, 90=1016, 99=151935, 99.9=532479, 99.99=888319] [READ-MODIFY-WRITE: Count=32585, Max=2203647, Min=368, Avg=5292.65, 90=1524, 99=151295, 99.9=549375, 99.99=909823] [UPDATE: Count=32576, Max=87935, Min=132, Avg=485.37, 90=542, 99=726, 99.9=15343, 99.99=55423]
2021-08-16 17:37:08:255 7880 sec: 65559502 operations; 5835.1 current ops/sec; est completion in 1 hour 9 minutes [READ: Count=58354, Max=1277951, Min=313, Avg=5203.44, 90=1037, 99=176767, 99.9=634367, 99.99=939519] [READ-MODIFY-WRITE: Count=29259, Max=1217535, Min=563, Avg=5589.22, 90=1547, 99=172031, 99.9=627711, 99.99=916479] [UPDATE: Count=29247, Max=76863, Min=183, Avg=480.89, 90=545, 99=733, 99.9=17087, 99.99=57279]
2021-08-16 17:37:18:255 7890 sec: 65614920 operations; 5541.8 current ops/sec; est completion in 1 hour 8 minutes [READ: Count=55415, Max=1049599, Min=199, Avg=5494.55, 90=1047, 99=192383, 99.9=639999, 99.99=934911] [READ-MODIFY-WRITE: Count=27552, Max=1030143, Min=326, Avg=5864.25, 90=1571, 99=184319, 99.9=578047, 99.99=915455] [UPDATE: Count=27567, Max=773631, Min=113, Avg=551.14, 90=553, 99=742, 99.9=26751, 99.99=111295]
It didn't show any error or warning, but it stopped printing log output.
I checked the YCSB process:
ps auwx | grep ycsb
The result:
ran 93177 0.0 0.0 13144 1048 pts/2 S+ 18:10 0:00 grep --color=auto ycsb
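When the client disappears like this without logging an error, one possibility is that the JVM was killed from outside, for example by the kernel OOM killer. A quick check on the load-generator host (a sketch, not specific to this setup):
# did the kernel kill a java process around the time the log stopped?
dmesg -T | grep -iE 'out of memory|killed process'
# on systemd hosts the kernel messages are also in the journal
journalctl -k | grep -iE 'out of memory|killed process'
If nothing was killed, the tail of outputs/runf-2.dat and the Cassandra system.log around 17:37 are the next places to look for a hung connection or an unresponsive node.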

Observing READ-FAILED after 2.5 hrs when running YCSB on Cassandra

I am new to Cassandra and YCSB and am trying to benchmark a 3-node Cassandra cluster (built with docker-compose) using YCSB.
YCSB's load phase completed in 4 hours without any errors, but in the run phase I am seeing a "READ-FAILED" error after about 2.5 hours (at second 9212). I tried running the same test a couple of times and hit the same issue; I am not sure why.
.
.
2021-05-27 22:22:53:019 9208 sec: 8625003 operations; 661 current ops/sec; est completion in 6 days 1 hour [READ: Count=133, Max=89599, Min=311, Avg=5145.44, 90=10551, 99=78783, 99.9=89599, 99.99=89599] [READ-MODIFY-WRITE: Count=69, Max=26751, Min=707, Avg=4425.57, 90=11583, 99=18271, 99.9=26751, 99.99=26751] [INSERT: Count=450, Max=1432, Min=216, Avg=537.25, 90=818, 99=1128, 99.9=1432, 99.99=1432] [UPDATE: Count=145, Max=1471, Min=184, Avg=472.85, 90=733, 99=1284, 99.9=1471, 99.99=1471]
2021-05-27 22:22:54:019 9209 sec: 8625668 operations; 665 current ops/sec; est completion in 6 days 1 hour [READ: Count=127, Max=66367, Min=334, Avg=4931.35, 90=12767, 99=36127, 99.9=66367, 99.99=66367] [READ-MODIFY-WRITE: Count=64, Max=36543, Min=709, Avg=4670.2, 90=13511, 99=34143, 99.9=36543, 99.99=36543] [INSERT: Count=458, Max=2303, Min=237, Avg=589.22, 90=869, 99=1195, 99.9=2303, 99.99=2303] [UPDATE: Count=144, Max=1190, Min=218, Avg=501.5, 90=759, 99=1186, 99.9=1190, 99.99=1190]
2021-05-27 22:22:55:019 9210 sec: 8626279 operations; 611 current ops/sec; est completion in 6 days 1 hour [READ: Count=110, Max=98495, Min=399, Avg=6190.99, 90=12063, 99=38431, 99.9=98495, 99.99=98495] [READ-MODIFY-WRITE: Count=55, Max=100095, Min=692, Avg=8793.56, 90=15983, 99=39999, 99.9=100095, 99.99=100095] [INSERT: Count=441, Max=1659, Min=241, Avg=624.24, 90=969, 99=1327, 99.9=1659, 99.99=1659] [UPDATE: Count=119, Max=1395, Min=187, Avg=571.55, 90=909, 99=1310, 99.9=1395, 99.99=1395]
2021-05-27 22:22:56:019 9211 sec: 8626842 operations; 563 current ops/sec; est completion in 6 days 1 hour [READ: Count=118, Max=97215, Min=318, Avg=5499.74, 90=10463, 99=93055, 99.9=97215, 99.99=97215] [READ-MODIFY-WRITE: Count=45, Max=98495, Min=742, Avg=5810.96, 90=8807, 99=98495, 99.9=98495, 99.99=98495] [INSERT: Count=385, Max=1252, Min=239, Avg=616.27, 90=924, 99=1163, 99.9=1252, 99.99=1252] [UPDATE: Count=101, Max=1327, Min=195, Avg=580.12, 90=904, 99=1097, 99.9=1327, 99.99=1327]
2021-05-27 22:22:57:019 9212 sec: 8627010 operations; 168 current ops/sec; est completion in 6 days 1 hour [READ: Count=33, Max=90367, Min=732, Avg=12685.67, 90=35679, 99=90367, 99.9=90367, 99.99=90367] [READ-MODIFY-WRITE: Count=18, Max=93183, Min=1121, Avg=17020.33, 90=36895, 99=93183, 99.9=93183, 99.99=93183] [INSERT: Count=120, Max=109951, Min=325, Avg=2155.85, 90=3283, 99=7943, 99.9=109951, 99.99=109951] [UPDATE: Count=35, Max=11567, Min=302, Avg=1142.29, 90=2081, 99=11567, 99.9=11567, 99.99=11567] [READ-FAILED: Count=1, Max=23615, Min=23600, Avg=23608, 90=23615, 99=23615, 99.9=23615, 99.99=23615]
2021-05-27 22:22:58:019 9213 sec: 8627523 operations; 513 current ops/sec; est completion in 6 days 1 hour [READ: Count=87, Max=97151, Min=417, Avg=8968.98, 90=14639, 99=67967, 99.9=97151, 99.99=97151] [READ-MODIFY-WRITE: Count=44, Max=62303, Min=654, Avg=7554.91, 90=14047, 99=62303, 99.9=62303, 99.99=62303] [INSERT: Count=378, Max=1220, Min=240, Avg=467.85, 90=686, 99=1030, 99.9=1220, 99.99=1220] [UPDATE: Count=97, Max=1017, Min=217, Avg=411.89, 90=649, 99=861, 99.9=1017, 99.99=1017] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:22:59:019 9214 sec: 8628119 operations; 596 current ops/sec; est completion in 6 days 1 hour [READ: Count=115, Max=112063, Min=334, Avg=6460.7, 90=12127, 99=90943, 99.9=112063, 99.99=112063] [READ-MODIFY-WRITE: Count=58, Max=91711, Min=788, Avg=6967.95, 90=13015, 99=60575, 99.9=91711, 99.99=91711] [INSERT: Count=423, Max=1359, Min=234, Avg=473.31, 90=708, 99=895, 99.9=1359, 99.99=1359] [UPDATE: Count=108, Max=1033, Min=210, Avg=429.63, 90=637, 99=1031, 99.9=1033, 99.99=1033] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:00:019 9215 sec: 8628679 operations; 560 current ops/sec; est completion in 6 days 1 hour [READ: Count=117, Max=115071, Min=327, Avg=6498.37, 90=16143, 99=64863, 99.9=115071, 99.99=115071] [READ-MODIFY-WRITE: Count=66, Max=65599, Min=607, Avg=6775.21, 90=17151, 99=48191, 99.9=65599, 99.99=65599] [INSERT: Count=391, Max=1137, Min=218, Avg=466.95, 90=711, 99=1021, 99.9=1137, 99.99=1137] [UPDATE: Count=118, Max=1338, Min=191, Avg=438.92, 90=674, 99=1012, 99.9=1338, 99.99=1338] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:01:019 9216 sec: 8629411 operations; 732 current ops/sec; est completion in 6 days 1 hour [READ: Count=139, Max=94143, Min=390, Avg=5108.03, 90=10015, 99=59999, 99.9=94143, 99.99=94143] [READ-MODIFY-WRITE: Count=71, Max=95039, Min=597, Avg=5881.15, 90=8959, 99=41823, 99.9=95039, 99.99=95039] [INSERT: Count=524, Max=1256, Min=200, Avg=443.07, 90=639, 99=1023, 99.9=1218, 99.99=1256] [UPDATE: Count=142, Max=988, Min=174, Avg=404.29, 90=659, 99=926, 99.9=988, 99.99=988] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
2021-05-27 22:23:02:019 9217 sec: 8629929 operations; 518 current ops/sec; est completion in 6 days 1 hour [READ: Count=116, Max=103615, Min=362, Avg=6558.6, 90=12535, 99=89599, 99.9=103615, 99.99=103615] [READ-MODIFY-WRITE: Count=55, Max=103999, Min=619, Avg=7671.18, 90=15127, 99=19727, 99.9=103999, 99.99=103999] [INSERT: Count=344, Max=960, Min=233, Avg=481.37, 90=683, 99=892, 99.9=960, 99.99=960] [UPDATE: Count=111, Max=818, Min=189, Avg=402.95, 90=596, 99=779, 99.9=818, 99.99=818] [READ-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0]
.
.
.
However, when I run the same benchmark against MongoDB it works fine and I see no errors.
Please let me know if any settings or parameters need to be changed in the Cassandra YAML deployment or in the way YCSB is run against the Cassandra cluster.
If you need any more logs, please let me know and I will upload them on request. Currently, I have uploaded two log files (on GitHub): one with the Docker and Cassandra logs and one with the YCSB execution output.
Any help is appreciated.
[ycsb_logs.txt] https://github.com/neekhraashish/logs/blob/main/ycsb_logs.txt
[docker_cassandra_logs.txt] https://github.com/neekhraashish/logs/blob/main/docker_cassandra_logs.txt
Thanks
Looking at the Cassandra logs, the cluster is not in a healthy state - a few things are noticeable:
The commit log sync warnings - these indicate that the underlying I/O is not keeping up with the commit log writes to disk.
Dropped mutations - a lot of operations are being dropped between the nodes. This comes back in the form of synchronous read repairs when the digest mismatch is noticed on read, and those read repairs are also often failing.
Some more detail on how the storage / I/O is provisioned would be useful.
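To put numbers on the dropped mutations mentioned above, nodetool on each node reports the dropped-message counters directly; a sketch for a docker-compose setup (the container name is a placeholder):
# dropped MUTATION / READ message counters since the node started
docker exec -it <cassandra-node-1> nodetool tpstats
# overall ring health while the run is going
docker exec -it <cassandra-node-1> nodetool status
Rising dropped MUTATION counts during the run would confirm the I/O pressure that the commit log sync warnings already point to.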

Nominatim - pgsql returned with error code (3)

When I try to import a new OSM file into my Nominatim server with setup.php (cache of 10666 = 2/3 of 16G RAM):
cd /srv/nominatim/build/ && ./utils/setup.php --osm-file /srv/nominatim/Nominatim-3.4.1/data/france-latest.osm --all --osm2pgsql-cache 10666
I get these errors:
Done 9254119 in 55677 # 166.210800 per second - Rank 30 ETA (seconds): 8.447104
Done 9254226 in 55679 # 166.206757 per second - Rank 30 ETA (seconds): 7.803534
Done 9254333 in 55680 # 166.205688 per second - Rank 30 ETA (seconds): 7.159803
Done 9254458 in 55681 # 166.204956 per second - Rank 30 ETA (seconds): 6.407751
Done 9254596 in 55682 # 166.204453 per second - Rank 30 ETA (seconds): 5.577468
Done 9254744 in 55683 # 166.204117 per second - Rank 30 ETA (seconds): 4.687008
Done 9254913 in 55684 # 166.204163 per second - Rank 30 ETA (seconds): 3.670185
Done 9255092 in 55685 # 166.204407 per second - Rank 30 ETA (seconds): 2.593192
Done 9255256 in 55686 # 166.204361 per second - Rank 30 ETA (seconds): 1.606456
Done 9255349 in 55688 # 166.200058 per second - Rank 30 ETA (seconds): 1.046931
Done 9255508 in 55689 # 166.199936 per second - Rank 30 ETA (seconds): 0.090253
Done 9255523 in 55689 # 166.200195 per second - ETA (seconds): 0.000000
Done 9255523 in 55689 # 166.200195 per second - FINISHED
2020-09-10 09:07:48 == Index postcodes
2020-09-10 09:09:32 == Create Search indices
ERROR: out of memory
DETAIL: Failed on request of size 10737418200.
ERROR: pgsql returned with error code (3)
string(34) "pgsql returned with error code (3)"
I've changed the --osm2pgsql-cache parameter many times with different values: same errors :(
My configuration:
~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/ploop41196p1 296G 167G 117G 59% /
:~$ free -h
total used free shared buff/cache available
Mem: 16G 56M 14G 1.2G 1.5G 14G
Swap: 256M 256M
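The allocation that fails is roughly 10 GB and happens inside PostgreSQL while it builds the search indices, so --osm2pgsql-cache has no effect on this step. It is worth checking whether maintenance_work_mem (which PostgreSQL uses for index builds) is set close to 10 GB, which cannot fit in 16 GB alongside everything else; a hedged sketch for checking and lowering it (the 2GB figure is only an example):
# check the value PostgreSQL uses for index builds
sudo -u postgres psql -c "SHOW maintenance_work_mem;"
# lower it and reload the configuration
sudo -u postgres psql -c "ALTER SYSTEM SET maintenance_work_mem = '2GB';"
sudo -u postgres psql -c "SELECT pg_reload_conf();"
After changing it, the remaining steps can usually be rerun individually via setup.php's step flags rather than repeating the whole import.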

Not running RabbitMQ on Linux, cannot find the file asn1.app

I have installed RabbitMQ successfully on CentOS before. However, on this other CentOS machine it fails to start.
My Erlang comes from this repo:
[rabbitmq-erlang]
name=rabbitmq-erlang
baseurl=https://dl.bintray.com/rabbitmq/rpm/erlang/20/el/7
gpgcheck=1
gpgkey=https://dl.bintray.com/rabbitmq/Keys/rabbitmq-release-signing-key.asc
repo_gpgcheck=0
enabled=1
This is my erl_crash.dump:
erl_crash_dump:0.5
Sat Jun 23 09:17:30 2018
Slogan: init terminating in do_boot ({error,{no such file or directory,asn1.app}})
System version: Erlang/OTP 20 [erts-9.3.3] [source] [64-bit] [smp:24:24] [ds:24:24:10] [async-threads:384] [hipe] [kernel-poll:true]
Compiled: Tue Jun 19 22:25:03 2018
Taints: erl_tracer,zlib
Atoms: 14794
Calling Thread: scheduler:2
=scheduler:1
Scheduler Sleep Info Flags: SLEEPING | TSE_SLEEPING | WAITING
Scheduler Sleep Info Aux Work:
Current Port:
Run Queue Max Length: 0
Run Queue High Length: 0
Run Queue Normal Length: 0
Run Queue Low Length: 0
Run Queue Port Length: 0
Run Queue Flags: OUT_OF_WORK | HALFTIME_OUT_OF_WORK
Current Process:
=scheduler:2
Scheduler Sleep Info Flags:
Scheduler Sleep Info Aux Work: THR_PRGR_LATER_OP
Current Port:
Run Queue Max Length: 0
Run Queue High Length: 0
Run Queue Normal Length: 0
Run Queue Low Length: 0
Run Queue Port Length: 0
Run Queue Flags: OUT_OF_WORK | HALFTIME_OUT_OF_WORK | NONEMPTY | EXEC
Current Process: <0.0.0>
Current Process State: Running
Current Process Internal State: ACT_PRIO_NORMAL | USR_PRIO_NORMAL | PRQ_PRIO_NORMAL | ACTIVE | RUNNING | TRAP_EXIT | ON_HEAP_MSGQ
Current Process Program counter: 0x00007fbd81fa59c0 (init:boot_loop/2 + 64)
Current Process CP: 0x0000000000000000 (invalid)
How can I diagnose this problem? Thank you.
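The slogan says the boot script cannot find asn1.app, so the first step is to confirm whether the asn1 OTP application is installed at all; a small check (assuming the erl on PATH is the same Erlang RabbitMQ uses):
# {error,bad_name} here means the asn1 library is missing from this Erlang install
erl -noshell -eval 'io:format("~p~n", [code:lib_dir(asn1)]), halt().'
# list which Erlang packages were actually installed from the repo
rpm -qa 'erlang*'
If asn1 turns out to be missing, installing it (some repos split OTP into sub-packages such as erlang-asn1) or reinstalling a complete Erlang build should let RabbitMQ boot, since its TLS stack depends on asn1.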

Simple distributed index: precached 0 indexes

I have two simple indexes:
First, 01.conf:
searchd
{
listen = 9301
listen = 9401:mysql41
pid_file = /var/run/sphinxsearch/searchd01.pid
log = /var/log/sphinxsearch/searchd01.log
query_log = /var/log/sphinxsearch/query01.log
binlog_path = /var/lib/sphinxsearch/data/test/01
}
source base
{
type = mysql
sql_host = localhost
sql_db = test
sql_user = root
sql_pass = toor
sql_query_pre = SET NAMES utf8
sql_attr_uint = group_id
}
source test : base
{
sql_query = \
SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
FROM documents WHERE id % 2 = 0
}
index test
{
source = test
path = /var/lib/sphinxsearch/data/test/01
}
The second looks like the first but with "02" instead of "01" in the filename and inside.
And the distributed index in 00.conf:
searchd
{
listen = 9305
listen = 9405:mysql41
pid_file = /var/run/sphinxsearch/searchd00.pid
log = /var/log/sphinxsearch/searchd00.log
query_log = /var/log/sphinxsearch/query00.log
binlog_path = /var/lib/sphinxsearch/data/test
}
index test
{
type = distributed
agent = 127.0.0.1:9301:test
agent = 127.0.0.1:9302:test
}
And I try to use the distributed index:
sudo searchd --config /etc/sphinxsearch/d/00.conf --stop
sudo searchd --config /etc/sphinxsearch/d/01.conf --stop
sudo searchd --config /etc/sphinxsearch/d/02.conf --stop
sudo searchd --config /etc/sphinxsearch/d/01.conf
sudo searchd --config /etc/sphinxsearch/d/02.conf
sudo indexer --all --rotate --config /etc/sphinxsearch/d/01.conf
sudo indexer --all --rotate --config /etc/sphinxsearch/d/02.conf
sudo searchd --config /etc/sphinxsearch/d/00.conf
Unfortunately I get the following output:
...
using config file '/etc/sphinxsearch/d/00.conf'...
listening on all interfaces, port=9305
listening on all interfaces, port=9405
precached 0 indexes in 0.000 sec
Why?
And when I try to search something with the distributed index (port 9305):
no enabled local indexes to search.
The MySQL-sourced indexes work perfectly if I use them on ports 9301 and 9302 respectively, but searching the distributed index returns nothing.
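As a sanity check, each agent can also be queried directly through its SphinxQL listener, bypassing sphinxapi.php entirely (9401 comes from 01.conf above; 9402 is assumed for 02.conf):
mysql -h 127.0.0.1 -P 9401 -e "SELECT * FROM test WHERE MATCH('one|two');"
mysql -h 127.0.0.1 -P 9402 -e "SELECT * FROM test WHERE MATCH('one|two');"
If both agents return rows here, the problem is on the distributed daemon or the client side rather than in the local indexes.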
UPDATE
# tail /var/log/sphinxsearch/searchd00.log
[Thu Sep 29 23:43:04.599 2016] [ 2353] binlog: finished replaying /var/lib/sphinxsearch/data/test/binlog.001; 0.0 MB in 0.000 sec
[Thu Sep 29 23:43:04.599 2016] [ 2353] binlog: finished replaying total 4 in 0.000 sec
[Thu Sep 29 23:43:04.599 2016] [ 2353] accepting connections
[Thu Sep 29 23:43:24.336 2016] [ 2353] caught SIGTERM, shutting down
[Thu Sep 29 23:43:24.472 2016] [ 2353] shutdown complete
[Thu Sep 29 23:43:24.473 2016] [ 2352] watchdog: main process 2353 exited cleanly (exit code 0), shutting down
[Thu Sep 29 23:43:24.634 2016] [ 2404] watchdog: main process 2405 forked ok
[Thu Sep 29 23:43:24.635 2016] [ 2405] listening on all interfaces, port=9305
[Thu Sep 29 23:43:24.635 2016] [ 2405] listening on all interfaces, port=9405
[Thu Sep 29 23:43:24.636 2016] [ 2405] accepting connections
UPDATE2
Hmm... It seems the problem is in how the data is queried from Sphinx. I also renamed the distributed index to test1. The following works well:
# mysql -h 127.0.0.1 -P 9405
mysql> select * from test1 where match ('one|two');
+------+----------+
| id | group_id |
+------+----------+
| 1 | 1 |
| 2 | 1 |
+------+----------+
2 rows in set (0,00 sec)
I think the problem was in the old version of sphinxapi.php that I used.
precached 0 indexes in 0.000 sec
Well, that by itself is normal. There are no local indexes to 'precache'. A distributed index has no index files to 'load' or (pre)cache.
... but searchd should still be running at the end of that. I think searchd should start up ok.
Try also checking
/var/log/sphinxsearch/searchd00.log
which might have some more detail.
Although I suppose it's possible Sphinx will not start up without any real indexes (i.e. you can't have JUST a distributed index), so you could just add a fake index to that config; a sketch follows below.
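A minimal sketch of such a stub, appended to 00.conf purely so searchd has one local index to precache (the RT index name, path and fields below are placeholders; nothing queries it):
cat >> /etc/sphinxsearch/d/00.conf <<'EOF'
# stub local index so the daemon has something real to load
index stub
{
    type         = rt
    path         = /var/lib/sphinxsearch/data/test/stub
    rt_field     = title
    rt_attr_uint = group_id
}
EOF
If that guess is right, searchd started with 00.conf should then report one precached index while searches still go through the distributed test index.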